PRM128 Development of a Translation Evidence Tracking Tool to Comply With the FDA Review Requirements for the Translation of PRO Instruments


VALUE IN HEALTH 15 (2012) A277–A575

1Necker Hospital, Paris, France, 2René Descartes University, Paris, France, 3Pierre Fabre, Boulogne Billancourt, France, 4CREES PFSA, Boulogne, France

OBJECTIVES: Atopic dermatitis is one of the most common chronic inflammatory skin diseases. It has an estimated prevalence of between 5 and 30% in children, and is steadily increasing in industrialized countries. Pruritus, xerosis and inflammation are the main symptoms. METHODS: The ABS questionnaire (Atopy Burden Score) consists of 19 items, structured around 5 components (Everyday Life, Leisure, Family Life, Budget, Work and Privacy). It was distributed to a random sample of families consulting at the Necker Hospital. The ABS was accompanied by SF12 and PGWBI to obtain internal and external validation, and by the PO-SCORAD to assess the level of severity. RESULTS: 58 were considered evaluable. Internal validity was measured by Cronbach’s alpha, which is equal to 0.81, reflecting a good homogeneity of the 19 questionnaire items. The mean PGWBI score is 51.82 ± 14.28. The scores are significantly different depending on the severity of the atopy. Families’ quality of life, measured using the SF12, revealed no deterioration in the physical component (52.83 ± 7.08). The ABS score is correlated with the scores of the questionnaires used, thus confirming external validity. The mean score calculated from the ABS is 48.17 ± 18.36. The score increases with the severity of the atopy. A statistically significant difference is observed between the three severity groups, i.e. mild, moderate and severe, with scores of 30.63 ± 10.89, 42.55 ± 15.72 and 62.62 ± 13.59 respectively. Each of the five components is also correlated with the severity of atopy, which opens the door for more detailed analyses of sensitivity to change. CONCLUSIONS: The internal and external validity of our questionnaire were confirmed. ABS is correlated with the severity of atopy. This is currently being done as part of a program aimed at evaluating the therapeutic education and treatment of children in hydrotherapy centers.
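As an illustration of the internal-consistency statistic reported in this abstract, the sketch below computes Cronbach's alpha using the standard textbook formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name and the toy responses are hypothetical and are not the ABS data; the abstract does not say which software or variant the authors used.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Standard formula: k/(k-1) * (1 - sum(item variances) / variance(total scores)).
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Variance of each item across respondents
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Variance of each respondent's summed score
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical toy data: 4 respondents answering 3 items (not ABS data)
toy_responses = [
    [2, 3, 3],
    [4, 4, 5],
    [1, 2, 1],
    [5, 4, 4],
]
alpha = cronbach_alpha(toy_responses)
print(round(alpha, 2))
```

Values close to 1 indicate that the items vary together; by common rules of thumb, a value around 0.8 is usually read as good homogeneity, which is how the abstract interprets its result.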
Following cultural and linguistic validation, the ABS is now available in US English, Spanish, German and Italian.

PRM124 TRANSLATION OF THE ALCOHOL TIMELINE FOLLOWBACK (TLFB) IN 4 LANGUAGES Trevin A1, Carter Sobell L2, Sobell M2 1MAPI Institute, Lyon, France, 2Center for Psychological Studies, Nova Southeastern University, Fort Lauderdale, FL, USA

OBJECTIVES: The Alcohol Timeline Followback (TLFB), developed in English (US), is a drinking assessment method that obtains estimates of daily drinking. Using a calendar, people provide retrospective estimates of their daily drinking over a specified time period that can vary up to 12 months from the interview date. Several memory aids can be used to enhance recall, such as 24 key dates and country-specific holidays for calendar year 2012 serving as anchors for reporting drinking, or use of standard drink conversion. The objective of this study is to present the translation of the TLFB into Czech, Slovak, Italian, and Hungarian. METHODS: The following translation method was used: concept definition, forward translation into the target languages, backward translation and test on five individuals (cognitive interviews) in each country. RESULTS: Difficulties encountered during the process were twofold: 1) ensuring that the standard drink conversions were correct, i.e., reflecting US standards, and culturally appropriate, and 2) adapting the 24 US key anchor dates to each country. For the standard drink conversion, the main change was the use of the metric system instead of the US customary units, i.e., use of liter and sub-units instead of ounces. In addition, quantities had to be adapted to fit cultural uses. For instance, in Slovakia and the Czech Republic, “one 12 oz can/bottle of beer” was adapted to “one beer in a can / bottle of 0.5 liter.” Thirteen anchor dates in the US calendar had to be deleted in all countries, e.g., Martin Luther King Day (01/16/2012) or President’s Day (02/20/2012), as culturally inappropriate. Other key dates were added, e.g., Labor Day (05/01/2012) or Easter Monday (04/09/2012) in all four countries. Patients were key in discussing changes or proposing solutions. CONCLUSIONS: The multistep process proved crucial to ensure cultural relevance and cross-cultural equivalencies across different languages.

PRM125 WHOSE VALUES IN HEALTH? A COMPARISON OF ADULT AND ADOLESCENT VALUES FOR THE CHU9D AND AQOL-6D Ratcliffe J Flinders University, Daw Park, South Australia, Australia

OBJECTIVES: The Child Health Utility-9D (CHU9D) and Assessment of Quality of Life-6D (AQOL-6D) currently represent the only two generic preference-based instruments developed for application in the economic evaluation of new health technologies with both adult and adolescent specific scoring algorithms attached to them. The main objective of this study was to compare and contrast the application of adult and adolescent scoring algorithms for the CHU9D and AQOL-6D in valuing the health of a community-based sample of adolescents. METHODS: A web-based survey including the CHU9D and the AQOL-6D was developed for administration to adolescents residing in Australia, aged 11-17 years (n = 500). Individual responses to both instruments were converted to health state utility values by applying [1] adult and [2] adolescent scoring algorithms pertaining to each instrument. RESULTS: Both the AQOL-6D and CHU9D discriminated well according to health status and long-standing illness regardless of the scoring algorithm employed. However, important discrepancies were found in that employment of the adolescent algorithm resulted in consistently lower mean health state values for the CHU9D but consistently higher mean health state values for the AQOL-6D relative to employment of their respective adult algorithms, and these differences were found to be statistically significant for both instruments (p < 0.05). CONCLUSIONS: The findings from this study concur with an expanding evidence base highlighting discrepancies in adult and adolescent values for identical health states. The differences in adolescent and adult values were more profound for the CHU9D, particularly in relation to mental health impairment states, and may be significant enough to strongly affect the findings of cost effectiveness studies and ultimately health care policy. There are important differences between both the CHU9D and AQOL-6D descriptive systems and the methods of valuation utilized for each instrument which may impact on the health state utility values generated by each scoring algorithm.

PRM126 CHOICE-BASED VALUATION OF THE SF-12V1 Craig BM1, Watson V2, Busschbach JJV3 1Moffitt Cancer Center, Tampa, FL, USA, 2University of Aberdeen, Aberdeen, UK, 3Erasmus University Medical Center, Rotterdam, The Netherlands

OBJECTIVES: To value SF-12v1 outcomes from the perspective of US adults. METHODS: The paired comparisons of the SF-12v1 items were incorporated into a larger, online survey of US health preferences using an invitation-only panel of US respondents. This pivoted discrete choice experiment (DCE) collected 28,080 responses on 408 pairs from 936 respondents. The quantal response model includes an additive multi-attribute regression within a stacked logit distribution to allow for excess kurtosis due to satisficing. RESULTS: On a quality-adjusted life year scale, 30 out of 31 decrements were significantly non-zero at the 0.05 significance level. The only insignificant decrement, LE2, represented the decrease in energy from “You have a lot of energy all of the time” (i.e., manic) to “You have a lot of energy most of the time.” The value of pits was 0.1532 QALY (95% CI 0.1022, 0.1970) and was significantly better than dead (0). CONCLUSIONS: This is the first national valuation study of the SF-12v1 descriptive system. Instead of reducing domains of complex descriptive systems to single items (e.g., SF-36v1), DCE allows for choice-based valuation of all outcomes, regardless of the number of items.

PRM127 MEASURING QUALITY OF LIFE IN MENTAL HEALTH: ISSUES AND CHALLENGES Lewis L, Taylor M, Roberts S University of York, York, UK

OBJECTIVES: To undertake a detailed review of the existing use of utility scores in models evaluating the cost-effectiveness of mental health conditions. The study also aimed to identify the key issues and challenges that are faced by decision makers attempting to evaluate the benefits of treatments for mental health. METHODS: A detailed review was undertaken to identify a wide range of studies that used modelling techniques to estimate the cost-effectiveness of interventions for different mental health conditions. The review included studies of treatments for bi-polar disorder, schizophrenia, depression, anxiety, dementia, eating disorders. The review determined whether each model contained utility data and how those utility data were derived and reported. Quality grades were assigned to each study based on the appropriateness of use of the utility data. RESULTS: Nearly all cost-effectiveness models in mental health contained utility-based outcomes, such as quality-adjusted life years. However, the quality of data used to generate those outcomes varied considerably, and many studies contained poor data or data used in an inappropriate manner. In addition to the expected limitations of instruments used to derive quality of life scores, common misuses of data included: 1) inappropriate timing of elicitation (for instance, applying quality of life scores at diagnosis to the whole duration of the model); 2) failing to account for comorbidities and confounding factors; 3) assumptions around missing data; and 4) failing to account for the patient’s history when defining their current health state. CONCLUSIONS: The use of utility data in mental health models varies widely, and most models cannot be considered to provide reliable and robust data. For models to be useful to decision makers, it is recommended that a consistent approach toward measuring quality of life in patients with mental health conditions should be used where possible. 
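The first misuse listed above (applying a quality of life score measured at diagnosis to the whole duration of the model) can be illustrated with a minimal sketch. It assumes the usual definition of QALYs as utility multiplied by time, summed over model cycles, with no discounting. All utility values, the five-year horizon and the function name are hypothetical and are not taken from any reviewed study.

```python
def total_qalys(utilities_per_cycle, cycle_years=1.0):
    """Sum of utility x time over model cycles (undiscounted)."""
    return sum(u * cycle_years for u in utilities_per_cycle)


# Hypothetical 5-year model for one treated patient.
# Misuse: the utility elicited at diagnosis is held constant throughout.
utility_at_diagnosis = [0.60] * 5

# Alternative: utilities re-elicited in each cycle as the patient responds.
utility_re_elicited = [0.60, 0.70, 0.75, 0.75, 0.75]

qalys_constant = total_qalys(utility_at_diagnosis)
qalys_varying = total_qalys(utility_re_elicited)

print(round(qalys_constant, 2), round(qalys_varying, 2))
```

In this invented case the constant-utility assumption understates the QALYs accrued, so a cost-effectiveness ratio built on it would look less favourable than one built on re-elicited values. With a deteriorating patient the bias would run the other way.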
PRM128 DEVELOPMENT OF A TRANSLATION EVIDENCE TRACKING TOOL TO COMPLY WITH THE FDA REVIEW REQUIREMENTS FOR THE TRANSLATION OF PRO INSTRUMENTS Mear I1, Anfray C1, Conway K2 1 MAPI Institute, Lyon, France, 2MAPI Research Trust, Lyon, France

OBJECTIVES: In its guidance on the use of PRO measures, the FDA specifies the areas to be addressed in PRO documents provided for review. Regarding language translation, four areas are listed: (A) Process used to translate and culturally adapt the instrument for populations that will use them in the trial; (B) Description of patient testing, language- or culture-specific concerns, and rationale for decisions made to create new versions; (C) Copies of translated or adapted versions; and (D) Evidence that content validity and other measurement properties are comparable between the original and new instruments. The objective of this study is to present the development of an evidence tracking tool to organize the evidence generated during the translation of a PRO instrument to comply with the FDA review requirements. METHODS: 1) Review of the process used to translate PRO instruments, and of the evidence provided during the translation process; 2) Organization of the evidence according to the four areas listed in the FDA guidance (excluding measurement properties); and 3) Development of a standardized tool to present this information. RESULTS: The translation evidence tracking tool is a table divided into three parts. Part 1 (translation background information) gathers the information requested in areas A and C. Part 2 (translation report) provides the evidence required in area B. Part 3 (content validity) concerns the comparability of content validity. For this part, it was assumed that comparability will depend on: (1) A clear definition and understanding of the concepts to be translated; and (2) The involvement of trained professionals. Hyperlinks are provided in each part of the table to lead to the evidence documents required for each area listed by the FDA.
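The three-part table described in the results can be pictured as a small data structure. The sketch below is only one possible representation, not the authors' tool: the class names, field names and file paths are hypothetical, and mapping Part 3 to area D is an assumption based on the statement that Part 3 concerns the comparability of content validity.

```python
from dataclasses import dataclass, field


@dataclass
class EvidenceItem:
    description: str
    fda_area: str  # one of "A", "B", "C", "D" as listed in the abstract
    link: str      # hyperlink or path to the evidence document (hypothetical)


@dataclass
class TrackingPart:
    title: str
    fda_areas: tuple
    items: list = field(default_factory=list)


def build_tracking_tool():
    """Return the three parts of the table, each tied to the areas it covers."""
    return [
        TrackingPart("Part 1: Translation background information", ("A", "C")),
        TrackingPart("Part 2: Translation report", ("B",)),
        # Assumption: Part 3 corresponds to area D (content validity only)
        TrackingPart("Part 3: Content validity", ("D",)),
    ]


def add_evidence(parts, item):
    """File an evidence item under the part that covers its FDA area."""
    for part in parts:
        if item.fda_area in part.fda_areas:
            part.items.append(item)
            return part
    raise ValueError(f"unknown FDA area: {item.fda_area}")


def missing_areas(parts):
    """List the FDA areas that have no linked evidence yet."""
    covered = {item.fda_area for part in parts for item in part.items}
    return sorted(set("ABCD") - covered)


tool = build_tracking_tool()
add_evidence(tool, EvidenceItem(
    "Cognitive interview report",           # hypothetical document
    "B",
    "evidence/cognitive_interviews.pdf",    # hypothetical path
))
print(missing_areas(tool))
```

A check such as `missing_areas` shows how a table of this kind could flag gaps in a dossier before submission.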


CONCLUSIONS: The translation evidence tracking tool is a unique device enabling researchers to submit a consistent and comprehensive translation dossier for FDA review.

PRM129 FINAL REPORTS FOR TRANSLATION AND LINGUISTIC VALIDATION OF CLINICAL OUTCOME ASSESSMENTS Zarzar K1, Chulis C2, Simon M3 1TransPerfect, Research Triangle Park, NC, USA, 2TransPerfect, New York, NY, USA, 3TransPerfect, Paris, France

OBJECTIVES: As the use of existing translated clinical outcome assessments (COAs) across studies is common and continues to increase, appropriate documentation of the translation and linguistic validation processes utilized to create target language versions is essential. METHODS: A review of the contents of final reports and certifications for previously completed COA translations was conducted. Additional discussions with sponsors and CROs regarding final reports, and the prevalence of final report deliveries from linguistic validation providers across the industry were completed. RESULTS: Final reports for linguistic validation are not provided consistently across the industry. While some companies provide a final report for every project, upon further discussion with sponsors within the industry it was revealed that some companies do not automatically generate reports or provide them to sponsors upon completion of the linguistic validation process. Based upon regulatory expectations for review of translated COAs, it is recommended to utilize a final report which summarizes the overall linguistic validation project. This report should document the process used in detail, the reasoning for linguistic decisions at each stage, evidence of cognitive interviewing, cognitive interviewing population, demographic information of the respondents, a summary of the findings, the final formatted version of the questionnaire, and the relevant certification. CONCLUSIONS: Final reports for COA language versions provide valuable information needed to make critical decisions regarding the use of existing translations. Linguistic validation reports should be structured as a complete package addressing each item regulatory authorities require for review of translations, so the sponsor may easily include this in their submission packages. Final reports should be provided for every COA language version to document the translation and linguistic validation process completed. 
PRM130 TRANSLATABILITY OF RESPONSE SETS USED IN PATIENT REPORTED OUTCOMES AND BEST PRACTICES FOR DEVELOPMENT Gawlicki MC1, McKown S2, Talbert M2, Brandt BA1 1 Corporate Translations, Inc., East Hartford, CT, USA, 2Corporate Translations, Inc., Chicago, IL, USA

OBJECTIVES: To determine the best response sets for use in Patient Reported Outcomes (PRO) instruments intended for translation and subsequent data collection in multinational clinical trials, and to make recommendations for response options to avoid. As sound response sets are essential for data collection, a high degree of translatability is vital. METHODS: Twelve response sets from previously translated PRO instruments were analyzed. Additionally, linguists provided their feedback on translatability of the response sets. Observed response sets included measures of patient treatment satisfaction, improvement, discomfort level, frequency of symptoms or adverse effects and agreement level with statements about treatment or condition. RESULTS: A response set constructed of “Strongly agree, agree, uncertain, disagree, strongly disagree” was conceptually equivalent to the source in 94% of observed back-translations. In comparison, response options including the phrase “...of the time” were conceptually equivalent in 42% of back-translations. The concept of “fair” was conceptually equivalent in 64% of back-translations and was found to have negative interpretations in some languages, such as “Weniger gut” (“not so good”) in German, and positive in others, such as “Elfogadható” (“acceptable”) in Hungarian. Some response options within response sets may be indistinguishable when translated, such as “a little bit” and “somewhat.” Furthermore, “bit” is a non-existent concept in languages other than English. CONCLUSIONS: Response sets in PROs should achieve equivalence across languages, with no overlap between options. The aforementioned agreement set is easily localized, and is thus recommended. Use of verbose terminology and concepts which are non-existent in other languages should also be avoided. A translatability assessment of response sets is recommended during instrument development. Where appropriate, the use of numerical rating scales or visual analogue scales is recommended. Such response sets avoid concept overlap, and numerical results avoid translatability issues.

PRM131 HOW MIGHT EXPERIENCE-BASED UTILITY MEASURES INFLUENCE REIMBURSEMENT DECISIONS? Marsh K, Browne C United BioSource Corporation, London, UK

OBJECTIVES: The measurement of the ‘quality’ part of the QALY used to assess health interventions is conventionally based upon the general public’s preferences for different health states. Research from behavioural economics tells us that the elicitation of preferences to inform this analysis introduces a number of biases into the estimation of the utility generated by technologies. In response to this, various authors have considered the theoretical possibility of replacing the preference-based approaches with experience-based estimates of health-related utility gains. This paper compares the ICERs for a range of interventions estimated based on both preference-based and experience-based valuation techniques. METHODS: Tariffs associated with changes in health states were extracted from published data. Tariffs based on four methods were identified: TTO, standard gamble, life satisfaction, and day affect. These data were combined with estimates of the cost and effect of a range of interventions to model the impact on ICERs of these alternative tariffs, how this varied with intervention type and tariff method, and the challenges of adopting alternative tariffs. RESULTS: Experience-based utility measures generate lower ICERs for interventions that generated improvements in the social and mental dimensions of health outcomes, and higher ICERs for those that target the pain dimensions of health outcomes. This trend is not, however, replicated across different experience-based utility measures. CONCLUSIONS: Experience-based utility represents an alternative to the preference-based utility measures conventionally employed within health economics. If optimal use is to be made of health care budgets, further work is required to understand why different utility measures produce different ICERs, and further debate is required to inform the appropriate basis for reimbursement decisions.

PRM132 USING CLINICAL OUTCOME ASSESSMENTS AND ECONOMIC DATA TO FACILITATE PATIENT ACCESS IN RHEUMATOID ARTHRITIS Gater A1, Kitchen H1, Heron L1, Hansen BB2, Højbjerre L2, Strandberg-Larsen M2 1Adelphi Values, Bollington, Cheshire, UK, 2Novo Nordisk A/S, Søborg, Denmark

OBJECTIVES: Rheumatoid Arthritis (RA) is a chronic autoimmune disease, affecting over 2.9 million adults in Europe. A challenge for new RA therapies is demonstrating added value not only in efficacy and tolerability but also in terms of overall benefit for patients, carers, health care systems and society. The aims of this study were to document current unmet needs in RA in terms of patient-reported and economic burden and how such concepts may be assessed to capture the value of RA treatments. METHODS: Articles were identified via searches in MEDLINE, EMBASE, EconLit, HEED, CRD databases and PSYCINFO using pre-defined search terms and limits. PRO measures were identified via the reviewed literature and the Patient-Reported Outcome and Quality of Life Database (PROQOLID) and were assessed in the context of FDA guidelines. RESULTS: The literature search revealed 2,517 abstracts, of which 123 articles were reviewed in full. RA symptoms significantly impact patients’ physical functioning, daily activities, and ability to work. Financial stability of patients and caregivers is affected, and direct and indirect costs are incurred by health care systems. A conceptual model summarising patient-reported and economic burden of RA was developed to identify key measurement concepts which can demonstrate efficacy, tolerability, and wider impact of RA treatments. Relevant endpoint measures were identified; beyond traditional symptom reports, the assessment of health-related quality of life, sleep, fatigue, daily activities, resource use and work productivity can provide data for regulatory, reimbursement, patient, and physician decision making. CONCLUSIONS: A patient-focused approach is increasingly advocated for underpinning individual and societal health care decisions. PRO measures, therefore, are essential for monitoring treatment outcomes among RA patients both in clinical trials and clinical practice. Where adequate instruments are included in clinical trials, there is an opportunity to demonstrate the overall efficacy, tolerability, and value of a treatment to key stakeholders including regulatory and HTA bodies.

PRM133 REVIEWING TRANSLATABILITY PRIOR TO TRANSLATION AND LINGUISTIC VALIDATION OF PROS Anderson H1, Gordon-Stables R2, Wild D3 1Oxford Outcomes, An ICON plc Company, Oxford, UK, 2Oxford Outcomes Ltd., Oxford, Oxon, UK, 3Oxford Outcomes, An ICON plc Company, Oxford, Oxon, UK

OBJECTIVES: When developing a new PRO for use in multinational studies it is recommended that a translatability assurance step is conducted before finalization and its subsequent translation and linguistic validation. A review of translatability and subsequent adaptation can reduce the chances of encountering difficulties related to concepts, idiomatic expressions, response scales, or syntax once translation has started. This step helps identify potential issues and develop solutions that side-step potential problems in the later translation process. The aim of this study was to assess the nature of issues that translatability highlights. METHODS: Translatability is assessed by native speakers representing several language groups (e.g. Europe, Africa, Asia, and Latin America). The linguists are asked to identify possible grammatical, lexical and cultural issues. Their recommendations are compiled and discussed with the developer of the measure, which is then adapted in accordance with these findings. The results of translatability checks were reviewed to find common issues that the process illuminates. RESULTS: Changes made to a PRO as a direct result of a translatability step can be varied but all work towards clarity of concepts and expression. Common changes are: 1) Complex sentences are segmented for clarity; 2) Ambiguity is removed; 3) Constructs common in English but not in other languages are changed; 4) Colloquialisms are simplified; 5) Elaborations are added to clarify concepts; and 6) Unnecessary noun repetition is removed. CONCLUSIONS: The translatability assurance step can increase cross-cultural equivalence between original and translated documents, and tackle potential translation difficulties uniformly before translation begins. It is recommended as a step in all PRO development projects.

PRM134 REGRESSION METHODS FOR HEALTH-RELATED QUALITY OF LIFE DATA IN LONGITUDINAL SETTINGS: ARE MORE ADVANCED TECHNIQUES REALLY PERFORMING BETTER? Hunger M, Döring A, Holle R Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Neuherberg, Germany

OBJECTIVES: The statistical analysis of health utilities and health-related quality of life (HRQL) scores poses various challenges due to the distributional properties such data commonly exhibit. These include skewness and heteroscedasticity