Quality assurance in clinical trials

Critical Reviews in Oncology/Hematology 47 (2003) 213-235
www.elsevier.com/locate/critrevonc

P.B. Ottevanger a,*, P. Therasse b, C. van de Velde c, J. Bernier d, H. van Krieken e, R. Grol f, P. De Mulder a

a Department of Internal Medicine, Division of Medical Oncology, 550, University Hospital Nijmegen, Geert Grooteplein 8, PO 9101, 6500HB Nijmegen, The Netherlands
b EORTC data centre, Brussels, Belgium
c Department of Surgery, Leiden University Medical Centre, Leiden, The Netherlands
d Cantonal Department of Radiation Oncology, San Giovanni Hospital, Bellinzona, Switzerland
e Department of Pathology, University Medical Centre Nijmegen, Nijmegen, The Netherlands
f Centre for Quality of Care Research, University Medical Centre Nijmegen, Nijmegen, The Netherlands

Accepted 5 February 2003

Contents

1. Introduction
2. Protocol
   2.1. Protocol design
      2.1.1. Background and objectives
      2.1.2. Patient selection criteria
      2.1.3. Trial design and scheme
      2.1.4. Therapeutic regimen, toxicity, dose modifications
      2.1.5. Required clinical evaluations, laboratory tests and follow-up
      2.1.6. Endpoints and criteria for evaluation
      2.1.7. Patient registration and randomisation procedure
      2.1.8. Statistical considerations
         2.1.8.1. Sample size
         2.1.8.2. Intention to treat analysis
         2.1.8.3. Hypothesis testing
   2.2. Quality assurance of protocol design
   2.3. Conclusions
3. Quality assurance by specialities involved in oncology
   3.1. Surgery
      3.1.1. Quality of surgery
      3.1.2. Quality assurance of surgery
         3.1.2.1. Operative notes
         3.1.2.2. Pathology reports
         3.1.2.3. Outcome
         3.1.2.4. Audit
         3.1.2.5. Instruction and education
      3.1.3. Conclusions
   3.2. Pathology
      3.2.1. Quality of pathology
      3.2.2. Quality assurance of pathology
         3.2.2.1. Slide review for diagnosis
         3.2.2.2. Review of pathology reports
         3.2.2.3. Education and training
      3.2.3. Conclusions
   3.3. Radiotherapy
      3.3.1. Investigation of equipment and human resources
      3.3.2. Dosimetry
      3.3.3. Evaluation of treatment plan and actual treatment
         3.3.3.1. Review
         3.3.3.2. Dummy run
         3.3.3.3. Phantom
      3.3.4. Conclusions
   3.4. Chemotherapy
      3.4.1. Investigation of equipment, preparation and administration of chemotherapy
      3.4.2. Dosing and dose intensity audits
      3.4.3. Systemic therapy checklist
      3.4.4. Conclusions
4. Data monitoring
   4.1. Conclusions
5. Discussion
6. Conclusions
Reviewers
References
Biographies

* Corresponding author. Tel.: +31-243610353; fax: +31-243540788. E-mail address: [email protected] (P.B. Ottevanger).
1040-8428/03/$ - see front matter © 2003 Elsevier Science Ireland Ltd. All rights reserved. doi:10.1016/S1040-8428(03)00028-3

Abstract

The literature was searched via electronic databases using the keywords quality, quality control and quality assurance in combination with clinical trials, surgery, pathology, radiotherapy, chemotherapy and data management. From this literature, a comprehensive review is given of what quality assurance means, the various methods used for quality assurance in different aspects of clinical trials, and the impact of this quality assurance on outcome and everyday practice. © 2003 Elsevier Science Ireland Ltd. All rights reserved.

Keywords: Quality assurance; Clinical trials; Surgery; Radiotherapy; Chemotherapy; Outcome research

1. Introduction

This review is concerned with quality assurance in clinical trials. To become better acquainted with quality, quality assurance and clinical trials, we will first introduce some definitions and point out the scope of this review. The National Library of Medicine defined a clinical trial in 1980 as: a pre-planned, usually controlled, trial of the safety, efficacy, or optimum dosage schedule of one or more diagnostic, therapeutic, or prophylactic drugs or techniques in humans selected according to predetermined criteria of eligibility and observed for predefined evidence of favourable and unfavourable effects. This definition does not preclude studies without control groups, such as phase II trials. Large randomised co-operative studies are essential to define state-of-the-art treatment, supplying the best quality of care for a given disease in a given situation in everyday practice. It is therefore very important that these studies deliver solid, reproducible and applicable answers to relevant clinical questions. This requires high quality research.

Quality is difficult to define. Its different meanings include: a degree of excellence, conformance with requirements, the totality of product or service features that satisfy given needs, fitness for use, freedom from defects, etc. The definition of quality depends on the beholder, resulting in many different definitions and in quality meaning different things to different people. More generally, quality concerns the comparison between what should have been and what has been achieved. The quality of a product is actually composed of three parameters: the quality of design, the quality of conformance and the quality of use. The quality of design is the extent to which the design reflects the product or service the customer needs. For the designers of a trial this means they should pose questions that are relevant, and design diagnostic or therapeutic options that are feasible both in the trial and in daily practice. Quality of conformance is the extent to which the product or service conforms to the design standard. It is the task of the designers and executors of the trial to check and do their utmost to conform to the protocol. The quality of use is the extent to which the user is able to secure continuity of use of the product


or service. Ultimately, the diagnostic or therapeutic modules investigated must be both effective and feasible to carry out in daily practice. Products that fail, are difficult to maintain, are costly to use or otherwise leave the customer dissatisfied are products of poor quality. This means that criteria or standards should be set for all three parameters of quality. Once these criteria are set, the evaluation of quality and the activities to sustain and/or improve it should be defined.

Quality control, in simple terms, is a process for maintaining standards: it prevents undesirable changes or variance in the quality of the product. This is done by checking variance at appropriate points and diagnosing its cause. Variance can be due to systematic deviations (the same deviations occur repeatedly) or random deviations (deviations that occur accidentally). Especially for systematic variance, a plan of action to correct the variance should be made and executed. For accidental variance this is often more difficult, but nevertheless important. It should be realised that in clinical trials the differences in effect between the compared treatments or diagnostic tests are usually estimated to be relatively small. For the many patients treated in daily practice, however, these small differences may be beneficial. Deviations from the design or protocol will usually dilute the differences in effect, thereby possibly obscuring beneficial effects, which also leads to a waste of effort and money. On the other hand, deviations can potentially harm the patient.

Multi-centre clinical trials are complex products, since many different contributors can be identified, each with his own speciality, treating patients for only a part of the study protocol, each treating a relatively small number of patients per study protocol, and each often having a different cultural and geographical background.
This often means that routines should be harmonised on the one hand, while on the other hand it is difficult to acquire routine in practices one is not familiar with, because of the small numbers treated per investigator. The ISO definition states that quality assurance is all those planned and systematic actions necessary to provide adequate confidence that a product or service will satisfy given requirements for quality [1]. Assurance of quality can be gained either by testing the product or service against the prescribed standard, to establish its capability to meet it, or by assessing the organisation that supplies the product against standards, to establish its capability to produce products of a certain standard. Auditing, planning, analysis, inspection and testing are techniques that can be used for quality assurance. Since quality has a price, an assessment of the costs and benefits of quality is needed, realising that cost and quality should be in balance according to the beholder. The beholders or customers of a clinical trial are diverse: participating specialists, participating patients,


patients in daily practice with a certain disease condition for whom the trial was developed, physicians treating patients in daily practice and, ultimately, society. Each group has its own practical and ethical issues regarding high quality clinical trials. The participating specialist needs a clear description of what he should do, how he should do it and when it should be done, and he needs to be confident that the treatment is sufficiently safe for his patients. The patient who consents to participate in the trial should be confident that he is treated safely, with the highest probability of being optimally diagnosed and cured. For the patients for whom the trial tries to answer a question, it is of the utmost importance that the answers generated by the trial are trustworthy, and that treatment and diagnosis are sufficiently safe, in order for them to receive optimal care. The treating physicians in daily practice should be able to trust the results of the trial, to carry out the diagnostic modules or treatment, and to extrapolate the results to their patients in order to advise or give their patients the best available care. Society is usually the financial investor in these expensive trials, meaning that the costs and benefits should satisfy the community. This emphasises the need for relevant and answerable questions that justify the time and expense needed to carry out the study. Society also protects its members from experimental treatment by imposing regulations and laws on clinical research (FDA, EU). Also, the chance that fraud, or personal or financial gain by researchers or involved companies, influences the results of clinical trials should be guarded against and minimised.

Historically, interest in the quality of clinical trials has increased rapidly since the 1970s, as the importance of multi-centre randomised trials in oncology was recognised and their number increased [2-5]. Some members of large co-operative trial groups have therefore published requirements for the conduct of good clinical trials [6-9]. Although some of these requirements are universal, others depend on the type of study that is planned. They pay attention to concrete aspects such as the protocol, the performance of investigators and their institutions, technical equipment and its use, data management, information transfer and the publication of results. All these activities have the purpose, as stated before, of obtaining reliable and meaningful answers to the posed questions and of demonstrating the validity of the results to others. This review tries to assess from the literature the subjects on which, and the methods with which, quality assurance in clinical trials is carried out. It will focus on quality assurance of protocol design, surgery, pathology, radiotherapy and medical oncology, and on data acquisition and data management, mainly in large randomised trials. Furthermore, an extrapolation of the


Table 1
Contents of a protocol as described by the EORTC

Title page
Background and introduction
Objectives of the trial
Patient selection criteria
Trial design and scheme
Therapeutic regimen, toxicity, dose modifications
Required clinical evaluations, laboratory tests and follow-up
Criteria for evaluation, endpoints
Patient registration and randomisation procedure
Forms and procedures to collect data
Reporting of adverse events
Statistical considerations
Quality of life assessment
Cost evaluation assessment
Data monitoring committee
Quality control
Informed consent
Administrative responsibilities
Publication policy
List of participants and expected yearly accrual
References
Appendices (as appropriate):
   TNM classification
   Performance status scale
   Body surface area scale
   Surgical details
   Radiotherapy details
   Toxicity grading scales
   Adverse drug reactions
   Flow sheet or checklist of required investigations
   Drug storage/supply
   Case report forms
   Investigator assurance statement
   Informed consent statement
   Patient information sheet
   Pathology review

implications of quality assurance in clinical trials to daily practice will be made.

2. Protocol

2.1. Protocol design

The most important document for any clinical trial is the protocol, which contains a description of the rationale, objectives and logistics of the study. The study protocol may be considered a written agreement between the investigator, the patient and the scientific community, and serves as a document to assist communication between those working in the trial. The success or failure of a trial may depend entirely on how well the protocol is written, since a poorly designed, ambiguous or incompletely documented protocol will result in a trial that cannot answer the questions of interest. Since cancer treatment predominantly requires a multidisciplinary approach, protocol design also needs to be a combined effort of all involved specialities (surgeons, medical oncologists, radiologists, radiotherapists, pathologists) in combination with a medical statistician, data managers and information technologists [10]. A comprehensive description of a good quality protocol and its required contents is given in DeVita [11]. The EORTC, in one of its former handbooks of clinical trials, advises a more elaborate format, as described in Table 1 [9].

2.1.1. Background and objectives

The background and objectives of the study are necessary to understand and motivate the study design and required procedures, and should therefore be a prerequisite for a study protocol. Careful attention should be paid to prevent the objectives being obscured by compromises arising from the differing professional, proprietary or financial interests of co-investigators. A randomised trial, for example, should be conducted only if there is substantial uncertainty about the relative value of one treatment versus the other. Studies in which the intervention and control are thought to be non-equivalent violate this uncertainty or equipoise principle and are therefore unethical. Djulbegovic et al. evaluated adherence to this uncertainty principle in 136 published randomised trials focusing on multiple myeloma, and related adherence to source of funding. They reported a significant difference in the favouring of new therapies over standard treatments between trials sponsored by for-profit versus non-profit organisations (74% versus 26% and 47% versus 53%, respectively; P = 0.004). The for-profit organisations violated the uncertainty principle more often than the non-profit organisations: they more often compared the experimental treatment with placebo, no therapy or a sub-standard treatment [12].
Since this uncertainty principle for randomised trials is important for the patient as well as for society, scrutiny of the scientific merits of a study proposal is needed. Many co-operative groups nowadays request from their participating investigators a statement of no conflict of interests, in order to avoid bias in trial generation and conclusions and to eliminate doubt about the validity of the results.

2.1.2. Patient selection criteria

The eligibility criteria for patients suitable for a study are defined for both safety and scientific reasons: to avoid excessive toxicity, to increase the homogeneity of the patient population and to decrease inter-patient biological variability. It is important to bear in mind that restriction of eligibility has several disadvantages: it


possibly makes extrapolation of results to the general population hazardous. This is especially true for phase III studies, in which different treatment regimens are compared [13]. It also increases the complexity of the trial, because the physician who registers the patient must verify all the criteria, some of which may be difficult to determine or may be ambiguously defined. As a consequence, quality assurance, for example with audit procedures, will be needed to verify that the criteria were indeed met. It also makes the extrapolation of results and the implementation of the treatment in general practice more difficult. Finally, complexity leads to increased costs [14]. There are situations favouring restriction, for example when there are good reasons to believe that a certain treatment will be beneficial only to a subgroup of patients. Restriction will, however, burden patient accrual, and this may be detrimental to obtaining the number of patients needed to answer the question with sufficient statistical power.

2.1.3. Trial design and scheme

Trial design and scheme depend on the goal of the study. Most phase III trials are comparative in nature: the experience of a group of patients receiving a new treatment is compared either to a group of patients receiving a standard treatment or to an untreated group. The usual assumptions are that a new treatment modality is either more effective, or equally effective but less toxic, associated with a better quality of life, less expensive or more efficient. However, sufficient uncertainty about the outcome should exist. The endpoint is usually survival or, in case of advanced disease, progression free survival. For phase III trials, a large and simple randomised design is the most desirable [15]. Depending on the difference expected, the number of patients needed can range from 500 to 3000, and multi-centre trials are often needed to include enough patients within an acceptable time frame [16,17]. Furthermore, these multi-centre trials are more likely to be applicable to a broader population, ideally the patients treated in everyday practice, and finally the chance that fraud will influence study results is reduced [18]. Most phase II trials, on the other hand, are developed to determine objective tumour responses in certain tumour types, to explore the spectrum and frequency of toxic effects, in particular cumulative effects, and to find out how to manage toxicity. Ultimately these studies lead to recommendations on whether or not it is desirable to continue further clinical development of the drug. For phase II trials the required sample size is much lower, and especially for an early phase II trial, randomisation is not necessary. In general, 25-40 patients form a convenient sample size.


2.1.4. Therapeutic regimen, toxicity, dose modifications

The protocol should clearly and specifically define all procedures critical to the study, in order to minimise practice variation between co-investigators and institutions and to accurately translate the results to everyday practice once the experimental treatment is proven effective. This means that for trials investigating adjuvant treatments, an accurate description of the surgical methods, as well as of the radiotherapy and chemotherapy, is mandatory [19]. Stopping rules and dose modification rules should be defined for the presence or absence of toxicity. Toxicity should be scored uniformly using a validated toxicity scoring system, such as the CTC or NCI-CTC version 2, WHO, or the Radiation Therapy Oncology Group (RTOG)/EORTC acute radiation morbidity scoring system. For radiotherapy trials and adjuvant trials, late toxicity scoring may be very important. This has led the RTOG and EORTC to develop a late effects scoring system for radiotherapy, the LENT/SOMA scoring system.

2.1.5. Required clinical evaluations, laboratory tests and follow-up

Only the necessary and relevant tests and diagnostic procedures should be carried out. The more evaluations are requested, the more effort and cost will rise, because the data managers must register every item and these subsequently need to be reviewed. Secondly, study compliance will decline as complexity rises. Since the results of the study ultimately need to be translated into everyday practice, this too requires careful motivation of what needs to be evaluated and what does not.

2.1.6. Endpoints and criteria for evaluation

The endpoints depend on the type of study, the disease and the disease phase. For adjuvant studies, where therapy is given to patients treated with a potentially curative primary therapy but for whom the risk of recurrence is substantial, the goal is generally to compare the duration of survival or disease free survival in two or more treatment groups.
Advanced disease trials include all patients in whom local treatment is no longer curative. There are two types of advanced disease: locally advanced, where the disease is still confined to the region of the primary tumour, and distant recurrent or metastatic disease, where it has spread to distant sites. The primary endpoint for phase III trials in locally advanced and metastatic disease is usually disease free survival or progression free survival, while the secondary endpoint is usually overall survival. For phase II trials, response rate and toxicity are the usual primary endpoints.


The primary outcome measures should ideally be easy to observe, free of measurement or ascertainment errors, capable of being observed independently of treatment assignment, and clinically relevant. They should be chosen before the start of data collection. It can be stated that only survival is an unambiguous endpoint. Nevertheless, the utmost should be done to minimise errors that increase variation due to differing measurement frequencies, measurement techniques and clinical judgement. It is very important to keep in mind that studies aiming only at intermediate or surrogate endpoints, such as response as a surrogate for survival time, will often decrease the clinical relevance of the answer [20]. When response is chosen as an endpoint, its estimation requires very accurate use of internationally validated and accepted response criteria, with validated (reproducible) and unequivocal measuring techniques [21]. It requires standardisation of laboratory techniques and exactly described imaging procedures (for example, slice thickness and use of contrast in CT) and measuring techniques. Last but not least, the estimation of response should be blinded to treatment whenever possible.

2.1.7. Patient registration and randomisation procedure

Preferably, patients are registered at a central data office. The registering person must ascertain that all eligibility criteria are met at the time of registration, in order to reduce the number of patients who are ultimately considered ineligible. In case of randomisation, the central data office should perform it. A computerised randomisation procedure should be used, in order to exclude any bias such as influence of the treating physician, the involved study co-ordinator, the patient or unknown prognostic factors. The randomisation scheme depends on the size of the study, the number of sites and the stratification parameters.

2.1.8. Statistical considerations

For a reliable answer to hypothesis testing, some issues need to be emphasised:

2.1.8.1. Sample size. An adequate sample size is very important to obtain a reliable answer. Many clinical trials that are reported as negative are really non-informative because their sample size was too small to detect medically important treatment effects [22]. A study designed to detect with 90% statistical power a 15% reduction in the annual risk of an event needs 1603 events to be observed. The number of patients needed then equals the required number of events divided by the expected event rate over the duration of the trial [23]. To help readers of trial results distinguish between truly negative and undetermined results, it is advisable to use confidence intervals [24].

2.1.8.2. Intention to treat analysis. Intention to treat analysis means that all patients allocated or randomised to a treatment are analysed as having had that treatment, even if they for any reason received an alternative treatment or no treatment at all. It is necessary to decide in advance on an intention to treat analysis, in order to avoid selection bias from excluded patients.

2.1.8.3. Hypothesis testing. The protocol should specify only a very few hypotheses to be tested, concerning all the registered patients. Exploratory analyses such as subset analyses should be used only to generate hypotheses for future trials [23]. It is also important to test only mature data, because trials that are repeatedly analysed in the course of accrual and follow-up have an increased probability of finding a significant difference (P < 0.05) by chance [25]. Interim analyses should preferably be done only for highly toxic treatment regimens, pivotal trials for drug registration, and very large trials involving more than 1000 patients. This also means that if interim analyses are planned, the statistical approach should require more extreme P values than P = 0.05 to declare statistical significance.

2.2. Quality assurance of protocol design

One of the important procedures for quality assurance of protocol design is the co-operation of different disciplines, each contributing separate experience, in preparing the protocol. A proposal for a new clinical trial is usually developed by such a co-operative group, which generally appoints a study co-ordinator and a writing committee to prepare the protocol according to existing guidelines of good clinical research.
This protocol design needs to be developed or reviewed in co-operation with data managers, statisticians and computer technologists, who are usually organised in a central statistical office or data centre. The protocol proposal is then presented to the co-operative group for discussion; if, after some revisions, the co-operative group accepts the protocol, an external review board, for example a protocol review committee or the scientific board of the research organisation, should review it. When the protocol is ready to be activated, the local ethical committee and/or the protocol review board of the participating institution is the last checkpoint before the protocol is open for patient entry [9,26]. Most knowledge on protocol quality from non-involved reviewers outside the study groups comes from papers describing the results derived from these study protocols, and from reviews of large, mainly randomised


trials, for example when planning a meta-analysis. Some of these reviews, even of recently published trials, have shown major flaws in trial design and documentation of study results [27-32]. For quality control, these reviewers evaluated randomisation procedures, response assessment and statistical design. The quality of the randomisation was assessed, for example, by evaluating the procedure used, the distribution of major prognostic factors among the treatment arms and the compliance with randomisation. The estimation of response was evaluated by assessing the response evaluation criteria, the blinding of response evaluation to treatment and interim results, the follow-up schedules and withdrawals from follow-up. The statistical design was evaluated by assessing the a priori estimation of sample size, the analysis of withdrawals, and the study power (beta). Marsoni et al. reviewed the quality of randomised clinical trials on the treatment of advanced epithelial ovarian cancer [29]. Only 17 of the 38 identified trials were real multi-centre studies. In only 8 trials could a completely blinded randomisation be ascertained. Full information on treatment compliance was available in only 9 trials. Intention to treat analysis was reported for 5 trials. Thiesse et al. evaluated the impact of a response review committee for a multi-centre trial of cytokine therapy for metastatic renal cell carcinoma [32]. They reported a 40% disagreement rate in the reviewed files, with errors in predefined tumour measurements, selection of measurable targets and radiological technical problems, such as varying slice thickness and varying use of contrast. Mariani et al. reviewed phase II trials of systemic anti-neoplastic agents published in 1997 and reported that a statistical design was mentioned in only 19.7% of the published trials [27]. Compliance with the predetermined design was often poor. Studies with a proper statistical design reported more negative results.
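The a priori sample-size estimation that these reviewers checked is easy to reproduce. Section 2.1.8.1 quotes 1603 events for 90% power to detect a 15% reduction in annual event risk; a common approximation for log-rank comparisons is Schoenfeld's formula, which gives a figure in the same range. This sketch is illustrative (the paper does not name the formula it used, and the function name is ours):

```python
from math import log
from statistics import NormalDist

def schoenfeld_events(hazard_ratio, alpha=0.05, power=0.90):
    """Approximate number of events needed for a two-arm log-rank
    comparison with 1:1 allocation (Schoenfeld's approximation)."""
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2)   # critical value for two-sided alpha
    z_beta = z(power)            # critical value for desired power
    return 4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2

# A 15% reduction in the annual risk of an event corresponds to a
# hazard ratio of roughly 0.85.
events = schoenfeld_events(0.85)
print(round(events))  # about 1590, in line with the 1603 quoted in the text
```

As the text notes, the number of patients is then this event count divided by the event rate expected over the trial's duration, which is why low-risk populations need far larger accrual.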

2.3. Conclusions

A clear and accurate protocol is needed for every clinical trial, stating the goal and all the methods used to reach it. This protocol should also be the backbone of the publication and should be used to determine the subjects for quality assurance in the trial. Co-operation between persons from different disciplines, each contributing specific knowledge and experience, is an important form of quality assurance. Quality assurance of the protocol could probably be further improved by review by uninvolved experts. Since protocol development is often confidential, this poses problems that are difficult to surmount; an independent expert review committee bound to confidentiality could be an option.


3. Quality assurance by specialities involved in oncology

3.1. Surgery

Surgery plays a role in many cancer clinical trials. This role can be divided into two parts: first, trials in which the evaluation of a surgical procedure is the primary objective, and second, trials (often adjuvant trials) that evaluate radiotherapy or medical interventions but in which surgery is or was part of the initial treatment plan. Until recently most quality assurance efforts have been directed at the first part, the evaluation of surgical procedures, since in the co-operative groups this is considered the primary domain of the surgeon. An obstacle in this type of surgical trial is that when two surgical procedures are compared, the participating surgeons should be equally conversant with both techniques. The second part, in which surgery is part of the treatment but not the primary research subject, is however as important as the first, because surgery is often the first and most effective procedure to remove the largest tumour load and to assess important prognostic factors such as tumour stage. Surgeons should therefore be involved in all co-operative cancer clinical trials and carefully describe the optimal surgical procedure(s) to be applied.

3.1.1. Quality of surgery

Because of the non-quantitative and often variable nature of surgical procedures, it is difficult to define specifically what quality of surgery is. Surgery is often still regarded as a craft [33], and one of the most important barriers to quality is that standards of operative principles are frequently lacking [34–36]. Due to anatomic variation and the per-operative need for ad hoc decisions, it is difficult to define criteria for staging and therapy [10]. However, minimal requirements for procedures and techniques and maximal acceptable variations should be set, and a careful description of how the procedures should be carried out is needed.
This assessment of minimal requirements is important not only for the surgeons in co-operative groups but also for those who operate on patients in general practice and may later offer their patients trials in the field of radiotherapy or chemotherapy. Much work in this field remains to be done [33,34,37–39].

3.1.2. Quality assurance of surgery

3.1.2.1. Operative notes. After consensus on how procedures should be carried out, the next step is the quality control of these procedures. In many studies this is done using the documentation in the operative notes. However, the quality of the documentation in operative notes is questionable. Jacobs et al. tried to assess the quality of surgery in a prospective randomised trial for patients with advanced operable head and neck cancer by review of operative notes [40]. They asked the following questions: was the objective of the operation curative or palliative, was the objective achieved, was the rationale for a specific technical approach mentioned and was the clearance of the neoplasm adequate? These four proposed measures could be evaluated in only 78% of all treated patients. Forty-seven percent of the operative notes did not state the objective; the rationale for the technical approach was not mentioned or unclear in 51%. Achievement of objectives was classified as satisfactory in 99%. Bijker et al. also used operative notes to assess compliance with predefined surgical procedures in a trial on breast-conserving treatment of ductal carcinoma in situ (DCIS) [41]. They reported that for 30% of the patients a procedure that was not recommended was used. For quality assurance, Christiaens proposed a standardised surgical report for breast-conserving surgery to the EORTC breast surgeons [42]. The use of this standard report in an EORTC pilot study revealed that considerable treatment differences remained, for example in overall surgery time, resection of overlying skin, length of the resection scar, use of clips, type of incision in the axilla, use of drains, orientation of the axillary specimen and macroscopic tumour-free margins. She concluded that better documentation of surgical procedures is needed. Birk made a similar proposal for better and standardised documentation of surgery and pathology in pancreatic cancer surgery [34].

3.1.2.2. Pathology reports. A classical quality control method in surgical clinical trials is the comparison of operative notes and pathology reports. They may be used for estimation of protocol adherence, staging [40,43], estimation of the adequacy of the resection [40], eligibility [5], and outcome after treatment [44,45]. Jacobs et al. reported differences between the treating surgeon and the review committee in the staging of head and neck tumours considered T4 in 40% of the patients; for N1 status this difference was even 60%. Interestingly, there was no evidence of a learning curve or influence of caseload in this study. Review of the adequacy of tumour clearance was also reported by Jacobs et al.: in 17% of the patients the tumour was resected with microscopic residual disease in the resection margins, and in 24% the margins were barely free of tumour. Balch et al. evaluated eligibility for adjuvant immunotherapy in melanoma patients by the use of operative and pathology notes [2]. They reported that 20 of 136 patients were judged ineligible due to insufficient surgery, such as biopsy of a metastatic lymph node only, partial lymph node dissection, too few nodes removed or examined, or no operative note being available [5].

3.1.2.3. Outcome. One of the most important goals of quality assurance should be the evaluation of the effects of treatment on outcome. This has the disadvantage that it can be assessed only after the trial has been finished. Nevertheless these data remain important to improve the quality of future trials. Outcome can be classified into short-term and long-term outcome. Short-term outcome parameters are per- and post-operative complications, such as bleeding, infections, hospitalisation days, the need for re-operation and 30-day mortality. Long-term outcome in surgical oncology concerns long-term morbidity, local recurrence and survival. Information on short-term outcome from randomised clinical trials is sparse, and its importance for the patient and for the cost-effectiveness of therapies is underestimated. Examples of short-term outcome are reported in the TME study of the Dutch Colorectal Cancer Group [46] and the study reported by Holm et al. on preoperative radiotherapy in rectal cancer [47]: both reported on the anastomotic leakage rate (13 and 12%, respectively) and 30-day mortality (3 and 1–2%, respectively). Long-term morbidity, such as lymph oedema after lymph node dissections and nerve damage resulting in sexual and urinary dysfunction after rectal, prostate and bladder surgery and abdominoperineal resections, is an important determinant of the quality of life of cancer patients. Nevertheless these outcomes are frequently neither mentioned nor analysed to assess risk determinants in order to improve the quality of care. The ultimate indicators of surgical quality are local recurrence rate and survival.
Examples of this type of quality control are the European Osteosarcoma Intergroup, which is currently investigating the difference in local recurrence between institutions [48]; the Dutch Colorectal Cancer Group, which related surgical procedures and pathology review of margins and intactness of the tumour to recurrence [46]; and the EORTC, which reported on the importance of adequate surgery for survival in gastric cancer [45].

3.1.2.4. Audit. Another type of quality control is the on-site audit of the procedure. It is expensive, but it seems an effective method. Schraffordt Koops audited on site the methods of hyperthermic limb perfusion with melphalan for melanoma [49]. Although the procedures were described in detail in the protocol, he discovered four frequently occurring violations of the protocol: the operative technique was not followed, the cytostatic melphalan, although unstable, was not dissolved immediately before injection, and the blood temperature and the pump flow during perfusion were not adequate. These violations and misinterpretations of the protocol were reduced by one on-site visit. Reynolds et al. reported on an

institutional audit procedure for axillary dissection, showing that in 38% of cases a second surgeon found that the primary surgeon had performed an incomplete axillary dissection [50]. They also provided evidence that by continuing this audit the performance of all surgeons involved, and hence the quality of treatment, increased.

3.1.2.5. Instruction and education. To attain a certain quality level, the first and very important step in quality assurance is to inform all participants who are to execute the surgical procedures. The study of Schraffordt Koops is a case in point. The co-operative groups usually consist of many surgeons, but in general only a hard core of interested and motivated surgeons regularly attend group meetings and directly contribute to the scientific activities. Many surgeons are hence not completely familiar with the protocol and the required procedures, which has a negative impact on the level of commitment. Merely offering them the protocol with a description of the procedures is in general insufficient, especially when new or complex operative techniques are introduced. When surgeons who have a basic interest in research already have difficulties in learning new techniques, the teaching of new techniques to surgeons in general practice will be even more difficult. This means that transfer of information and education are crucial parts of quality assurance. Some research has been done to evaluate different educational tools in surgery. Instruction is often necessary on different levels: through booklets, educational sessions and workshops, and at the dissection table by experts. Haasse advocated the availability of directly visible file cards in the operating room, which serve as a sort of checklist for review of the necessary observations that need to be described, the procedures, and the specimen collection [10]. Instruction by experts, if interaction is possible, is one of the best, but also most expensive and time-consuming, methods of teaching new techniques.
The method has been used, for example, in sentinel lymph node mapping in breast cancer and total mesorectal excision (TME) for rectal cancer [46,51,52]. The quality level attained, compared with predefined standards, may then be assessed with learning curves before participation in trials is allowed. Educational training by an expert, by videotapes and by interactive computer-based systems has been compared in students. Since videotapes lack interaction, this method is the least effective one. Especially for the implementation of new techniques in general practice, computer-based education could be effective and cost saving [53].

3.1.3. Conclusions

From these reports on quality assurance of surgery in trials we can learn four things:


First, quality assurance of surgery consists mainly of quality control with reviews and audits, although the focus on education and training is increasing. Second, education, training and quality control are expensive for the mainly unsponsored surgical trials, which impedes the development of these kinds of trials. Third, outcome is increasingly identified as an indicator of surgical quality and not only of the biological behaviour of the tumour, leading to increased attention to quality assurance aspects of trials involving surgery. Fourth, the differences in treatment identified within a group of surgeons dedicated to research, who meet regularly and would be expected to be willing to change procedures, raise concern about differences, and hence about the quality of treatment, in everyday practice. More and more we learn from daily practice that the implementation of new techniques is difficult and slow, especially when these techniques differ more from the older techniques or are more complex [54]. This probably implies that active, and again costly, programmes of education, training and audit are needed to implement new research findings in daily practice.

3.2. Pathology

The eligibility of the patients in many randomised trials depends on the pathological diagnosis. Therapy allocation may also be guided by pathological subclassification or staging. In the early years of multi-institutional cancer trials this led to the installation of pathology review committees. However, review is only one aspect of pathology quality assurance.

3.2.1. Quality of pathology

The quality of pathology involves many aspects. It starts with the way the tumour material is submitted and the accompanying information provided by the surgeon to the pathologist. Next, the processing of the tissue is important: for example, is the interval between resection and fixation standardised? Is the fixation procedure the same?
Is the way the tumour is cut uniform (radial versus parallel cuts)? Are the sections of a standard thickness? Are the same stains and staining techniques used? Is the immuno-histochemistry technique used validated and standardised? There are only a few reports on quality assurance of these aspects in clinical trials, nor are strict processing methods for the pathologist found in protocols [2]. Thunissen et al. reported (not in the framework of a clinical trial) on the hazards of mitosis counting when uniform processing and examination techniques are not used and different microscopes are employed [55]. Bunt et al. reported, in a randomised trial on gastric surgery, on the difference in nodal retrieval when a routine method was used, compared with a fat-clearance technique and retrieval of

nodes immediately postoperatively by the surgeon instead of the pathologist [39]. Bijker et al. reported on the histopathological work-up in an EORTC randomised clinical trial investigating the role of radiotherapy in DCIS [41]. They reported that the recommended inking of the specimen was performed in 25% of the patients, and the recommended sectioning and radiography were reported in a disappointing 13% of the lesions detected by mammography only. Strict morphological criteria should be universally applied, and standard examination of the tumour, its extension into adjacent structures, and the lymph nodes needs to be defined, as emphasised by Hermanek et al. in their report on the residual tumour classification and by Nagtegaal et al. for the TME study [56,57]. Then, after examination of the tumour by the pathologist, a standard report of the findings needs to be produced [58]. With regard to reporting, Bijker et al. found large variation in the way tumours were measured: macroscopic size was mentioned in 28%, microscopic size in 17% and a conclusive size in 23%. Margin status was reported with large variability, and in only 5% was an exact distance of the DCIS from the nearest margin given [41]. Unfortunately, too many different staging definitions are still in use, which leads to confusion when comparing the outcome of different studies [34].

3.2.2. Quality assurance of pathology

3.2.2.1. Slide review for diagnosis. As mentioned before, the main method of quality assurance in pathology is the review or audit. Most co-operative oncology trial groups have pathology review committees, which mainly assess the correctness of the diagnosis for eligibility purposes. One of the prerequisites of this eligibility review is that it should be fast and preferably completed before a patient is entered in a study. Fisher et al. suggested an advisory function of review pathologists for this purpose [58]. The method by which the review is organised varies.
In the Eastern Co-operative Oncology Group (ECOG), for example, in some trials a group of designated pathologists reviewed the cases and the final diagnosis was based on consensus, while in other trials one experienced pathologist was designated to that specific trial. For the lymphoma studies yet another method was used, with a second review and, in case of discordance of diagnosis between the first and the second review, a planned discussion in the central pathology panel [59,60]. In the National Surgical Adjuvant Breast and Bowel Project (NSABP) trials a particular protocol is evaluated by a group of pathologists, or by an individual, with demonstrated expertise concerning the pathological parameters. This expertise is gained through one-on-one training sessions, and a consistency of findings of at least 90% is required [58].

However, many studies have documented a lack of consensus among pathologists for a range of specimen types and have shown that even the same pathologist can produce different reports when examining the same specimen on different occasions, so that review by more than one pathologist is to be recommended [61–65]. Estimates of how much pathology review contributes to high-quality data, and at what cost this can be achieved, vary. The ECOG reviewed their pathology audit practice for lymphoma studies and solid tumours. Hodgkin disease studies running between 1976 and 1984 reported disagreement percentages between the contributing pathologists and the ECOG review hematopathologists of 22.2–29.2%, and between the contributing pathologists and the secondary reviewers from the repository centre of 28.2–30.6%. The disagreement percentages between the ECOG reviewers and the repository centre, however, ranged from 7.2 to 11.1%. For non-Hodgkin lymphoma protocols the percentage of disagreement between the contributing pathologists and, respectively, the ECOG reviewers and the repository centre ranged from 17.2 to 73.1% and from 16.7 to 73.1%, and between the ECOG reviewers and the repository centre from 2.9 to 25.7%. The percentage of pathology disagreements leading to exclusions from protocols for Hodgkin disease and non-Hodgkin lymphoma was 7.2 and 10.1% after ECOG review, and another 0.4 and 2.1% after the second repository review, respectively. They concluded that a secondary review did not contribute substantially to improving the quality of trials on Hodgkin disease or non-Hodgkin lymphoma [60]. Gilchrist et al. used an ECOG database containing over 14 000 cases accrued from 1978 to 1986 in 90 trials on solid tumours and reported on cases judged ineligible on slide review because no cancer, the wrong type of cancer or the wrong stage of cancer was diagnosed.
Ineligibility rates varied between protocols and with tumour type: 0–1.3% for breast cancer trials, 0% for melanomas, 0–12.3% for gastrointestinal tumours, 2–16.7% for lung cancer trials, 4.4–16.9% for genitourinary cancers and 1.5–12.6% for soft tissue sarcomas. Using computer simulation models, they estimated the potential loss of precision in detecting treatment differences if pathological ineligibility was not identified. They concluded that for many trials conducted by the ECOG very little efficiency would have been lost if central pathology review had not been conducted; these included particularly lung cancer and metastatic breast cancer trials. In trials in which the exclusion rate was at least 10%, or those in which the patients with inappropriate histopathology might have a disease with a quite different natural history, the loss of efficiency might be large. For these cancers slide review still seemed prudent. For the trials in which the exclusion rate was between 5 and 10%, firm conclusions

could not be made, and they advised that the need for review should be examined on a case-by-case basis [59]. This is in contrast with the opinion of the NSABP pathologists, who require at least one paraffin block and review of slides for all patients. This group considered patients whose slides were not available to be ineligible (accounting for 5–10% of exclusions) and, in the NSABP B18 protocol, found disagreement between institutional pathologists and headquarters pathologists on histological tumour type and on nuclear and histological grade in 35% of the cases [58]. The EORTC compared the assessment of tumour stage and grade of stage Ta and T1 bladder tumours by the local pathologists and the review committee [66]. Of the Ta tumours, 10.3% were upstaged by the review pathologists; of the T1 tumours, 52.5% were downstaged to Ta and 4.7% were upstaged to T2, while of the T2 or greater tumours, 13% were downstaged to Ta and 19.4% were downstaged to T1. The review committee reclassified 42.3% of the tumours from grade 1 to grade 2 and 4.5% from grade 1 to grade 3, while of the grade 2 tumours, 19.4% were downgraded to grade 1 and 20.3% were upgraded to grade 3. Of the grade 3 tumours, 38.8% were downgraded. Combining T stage and grade, there was agreement between local and review pathologists in 56.9% of the Ta grade 1 tumours and 50% of the T1 grade 3 tumours. The review pathologists reclassified 10.6% of all T1 grade 3 tumours to invasive disease greater than T1. They related prognosis to both local and review pathology and concluded that there was no important difference in prognosis for T stage in the aforementioned categories; for grade, however, the prognosis from the review committee was slightly more predictive of progression-free survival and time to progression than that from the local pathologist.
They concluded that for low and intermediate risk T1 grade 1 or 2 disease pathology review is unnecessary, since treatment is not essentially different; for the more aggressive tumours, however, therapy may be related to stage, and for these patients review of tumour stage and grade may be necessary, especially when the variance in diagnosis is large. Some groups advocate automated review as a method that can effectively reduce costs; however, experience to date is mainly described for cytology review [67,68].

3.2.2.2. Review of pathology reports. Another method of quality assurance is the use of a special-purpose standardised pathology form. Together with the hospital pathology report, these forms increase the completeness and accuracy of the data [57,58]. The NSABP always uses these forms for their trials. Nagtegaal et al. audited the use of a special pathology case record form together with the hospital pathology report for the rectal operation specimen. Although 86.5% of the data were correct, only one third of the forms were

complete and correct. The most prominent missing item was the number of lymph nodes removed, while the most important error concerned the description of the circumferential margin, which is prognostic for recurrence. They reported that as many as three consecutive audit rounds were needed to obtain high-quality data [57].

3.2.2.3. Education and training. Education is another major quality assurance activity. The most intensive method is one-on-one training of institutional pathologists; however, this is a very costly and elaborate method for the large multi-centre trials performed today. A cheaper alternative is the organisation of workshops for the institutional pathologists, together with written materials, manuals and reminders. However, reports on the effects of these educational methods in trials are scarce, and the goals are not always reached. In the TME study the circumferential margin was wrong or missing in 18 and 12% of the cases, respectively, although its examination was trained and its importance was regularly emphasised [46].

3.2.3. Conclusions

It may be concluded that until recently quality assurance of pathology in clinical trials has mainly focussed on eligibility review through slide review. However, this type of review is very time consuming and expensive, making it important to assess the influence of a change in diagnosis on prognosis and treatment allocation in order to decide whether this type of review is warranted. In general, a change in diagnosis after review in less than 5% of the study population is considered acceptable and slide review is then not necessary, while at rates above 10% review may be considered for the reasons mentioned above. Quality assurance of the structure and process of pathology in clinical trials is often not described, and the impact of their variance is unknown. More knowledge of these subjects through clinical research is needed.
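The trade-off behind these thresholds, that a small fraction of misdiagnosed patients dilutes a treatment effect only modestly, is the kind of question the ECOG addressed with computer simulation. The exact models are not given in the text; as a minimal Monte Carlo sketch of the idea (the function names, rates and the assumption that ineligible patients derive no treatment benefit are ours):

```python
import random
from math import sqrt
from statistics import NormalDist

def simulated_power(p_control, p_treated, n_per_arm,
                    ineligible_rate, n_sim=2000, seed=1):
    """Monte Carlo power of a two-arm trial analysed with a two-sided
    z-test on response proportions, when a fraction of pathologically
    ineligible patients responds at the control rate in both arms."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(0.975)  # alpha = 0.05, two-sided

    def responders(p_eligible):
        count = 0
        for _ in range(n_per_arm):
            # ineligible patients do not benefit from the experimental arm
            p = p_control if rng.random() < ineligible_rate else p_eligible
            count += rng.random() < p
        return count

    significant = 0
    for _ in range(n_sim):
        k1, k2 = responders(p_control), responders(p_treated)
        pooled = (k1 + k2) / (2 * n_per_arm)
        se = sqrt(2 * pooled * (1 - pooled) / n_per_arm)
        if se > 0 and abs(k2 - k1) / n_per_arm > z_crit * se:
            significant += 1
    return significant / n_sim

# power without ineligible patients versus with a 15% ineligibility rate
clean = simulated_power(0.30, 0.45, 160, 0.00)
diluted = simulated_power(0.30, 0.45, 160, 0.15)
```

Under such a model, low ineligibility rates cost little power, which is consistent with the conclusion that slide review adds little when the exclusion rate stays below a few percent.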
For quality assurance purposes it is advisable to increase the role of the pathologist in protocol design and quality control. The digitalisation of pathology that is now starting will improve the possibilities for fast exchange of information and for computerised (automated) review. The importance of education and training of pathologists is underestimated in many trials.

3.3. Radiotherapy

Many of the early reports on quality assurance in cancer clinical trials concerned radiotherapy. Among the major co-operative groups, the Cancer and Leukemia Group B (CALGB) and the EORTC were those that reported on quality assurance procedures and systems for radiotherapy as early as the 1980s.

They developed methods to define and control the quality of various aspects of radiotherapy. The main goal of most quality assurance measures in radiotherapy trials was to reduce variability and uncertainties related, first, to the mechanical and radiation physics parameters of the treatment unit, including the calibration of the radiation beams; second, to the different steps of treatment preparation, including patient data, recording and reporting in clinical charts, and treatment dose planning; and third, to the actual daily patient irradiation. The need for accurate treatment in radiotherapy is high, since insufficient or inhomogeneous coverage of the planned target volume (PTV) results in a higher risk of relapse or progression of the disease, while unintended irradiation of healthy tissues may result in short- or long-term toxicity. This has become even more important with the advent of novel techniques such as three-dimensional radiation therapy and intensity-modulated radiation therapy.

3.3.1. Investigation of equipment and human resources

The first activities reported by the EORTC concerned the investigation of the workload and equipment of 17 radiotherapy departments in the period from 1982 to 1984 [69]. They used questionnaires for the 17 participating centres and, most importantly, performed on-site visits. This survey was extended and repeated in 1990, using questionnaires mailed to the by then 50 participating centres, and published in 1996 by Bernier et al. [70]. Both studies assessed department infrastructure, the patient workload for medical and technical staff, and the treatment equipment. Using figures such as the available types of equipment; the number of patients treated per treatment unit, radiotherapist, physicist and technician; and interviews on perceived comfort in treating patients, they assessed variations and set definitions for the quality of radiotherapy departments. Horiot et al.
[69] defined a cumulative workload and staff index by adding the four previously cited ratios, and considered indices below 1200 to indicate the best quality; Bernier et al. [70] later proposed an average tentative profile of a radiotherapy department contributing to EORTC studies and treating 900–1200 patients per year. Both methods served as quality assessment tools: for profiling new radiotherapy centres participating in EORTC studies, for reassessment of previously visited centres and for comparison between centres. Bernier et al. furthermore showed marked differences in the education, working and responsibility definitions of radiation physicists, radiation dosimetrists and radiographers, but not of radiation oncologists, for whom a European curriculum already existed. This emphasised the need for a better definition of a European curriculum for the staff categories mentioned above, with increased uniformity in training, contributing to quality assurance.
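The cumulative workload-and-staff index described above is simply the sum of four patient-load ratios, with values below 1200 taken to indicate the best quality. A minimal sketch under that description (the function name and the example figures are ours, not from the EORTC surveys):

```python
def workload_staff_index(per_unit, per_radiotherapist,
                         per_physicist, per_technician):
    """Cumulative workload-and-staff index as described in the text:
    the sum of the numbers of patients treated per treatment unit,
    per radiotherapist, per physicist and per technician. An index
    below 1200 was considered to indicate the best quality."""
    return per_unit + per_radiotherapist + per_physicist + per_technician

# hypothetical department: 450 patients per unit, 250 per radiotherapist,
# 300 per physicist and 150 per technician
idx = workload_staff_index(450, 250, 300, 150)  # 1150, below the 1200 cut-off
```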

3.3.2. Dosimetry

Apart from equipment type and utilisation, the quality of the equipment can be tested and compared by dosimetry. The CALGB reported as early as 1980 on the quality of dosimetry in the CALGB 7611/7612 studies on radiotherapy as central nervous system prophylaxis for acute lymphocytic leukemia [2,71]. For these studies the Radiological Physics Center (RPC), a group of physicists responsible to the American Association of Physicists in Medicine, performed standardised reviews of each institution's physics program, such as the dosimetry and isodose curves of each apparatus. In 1986 the EORTC reported in detail on the dosimetric intercomparison that formed part of their radiotherapy quality assurance program [72]. They used a model for the determination of the absorbed dose in water at certain points, and measurements of the absorbed dose distribution at one depth perpendicular to the beam axis. The quality criteria defined by the RPC in the USA for the absorbed dose were used: the absorbed dose at the reference points in water should agree within 3% with that stated by the institutions for all beams, and the tumour dose prescription should agree within 5% [73]. The reason for requiring such a high level of accuracy in the absorbed dose determination is that most tumours and mammalian normal tissues have steep sigmoid-shaped dose-response curves with little separation between the two curves. For the flatness and symmetry of the beam at different depths in a phantom, the quality criteria set by the International Electrotechnical Commission were used [74]. In analogy with Glicksman, they defined minor and major deviations as between one and two times, and more than twice, the acceptable level of variance, respectively. Fifteen percent of the gamma beams had minor deviations, while no major deviations were seen. For photon beams, minor and major deviations were seen in 25 and 5%, respectively, while for electron beams these percentages were 19 and 10%, respectively.
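The minor/major convention above (a minor deviation lies between one and two times the acceptable level of variance, a major deviation beyond twice it) can be sketched as a small classifier. This is only an illustration of the stated rule, not the EORTC or RPC software; the 3% reference-point tolerance is used as the default, and the dose figures in the example are invented:

```python
def classify_deviation(measured, stated, tolerance=0.03):
    """Classify a dosimetric deviation: within tolerance is acceptable,
    between one and two times the tolerance is a minor deviation, and
    more than twice the tolerance is a major deviation."""
    relative = abs(measured - stated) / stated
    if relative <= tolerance:
        return "acceptable"
    if relative <= 2 * tolerance:
        return "minor deviation"
    return "major deviation"

# a measured dose of 2.10 Gy where 2.00 Gy was stated is 5% off,
# which against the 3% tolerance counts as a minor deviation
status = classify_deviation(2.10, 2.00, tolerance=0.03)
```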
This study resulted in recommendations from the EORTC radiotherapy group for all participating centres, such as the use of specific dosimetric protocols; the use of special water ionisation chambers; and the correction of the loss of ions in the ionisation chamber according to set methods, with cylindrical chambers for photon beams and plane chambers for electron beams. Furthermore, regular re-calibration and constancy control of the local reference ionisation chamber, more accurate determination of wedge filter and shadow tray factors, and daily correction for air temperature and pressure were recommended. They also proposed thermoluminescent (TL) dosimetric reference centres geographically distributed across Europe. Some years later, however, a more convenient and cost-saving method of dosimetry using capsules containing dosimeters that can be sent by mail, the so-called mailed

thermoluminescent dosimetry, was introduced. Hanson et al. reported that regular checks with mailed TL dosimetry, with rapid feedback on performance, increased uniformity within and between institutions. This method was also refined to check brachytherapy and for use in vivo [75].

3.3.3. Evaluation of treatment plan and actual treatment

3.3.3.1. Review. Glicksman et al. were the first to describe evaluation of the treatment plan [2]. For the CALGB they developed a quality assurance program within the radiotherapy committee that required the data for all patients entered in radiotherapy arms of protocols to be submitted to the committee within 3 days after radiotherapy was started. The data used for review were: a complete description of the radiotherapy treatment plan, simulator (localisation) films and verification treatment films, the isodose plan and, after completion of treatment, the department treatment sheets with the dose and volume delivered. Deviations of 5–10% and of more than 10% were considered minor and major protocol deviations, respectively. Overall, each institution's performance was reviewed in terms of cases entered in radiotherapy arms of protocols, data received and evaluated, and the appropriateness of the treatment. Feedback was used as a method to improve the proportion of evaluable and adequate data. Hunig et al. reported on the first evaluation of this program [71]. At the start of this quality assurance program in a childhood acute lymphocytic leukaemia protocol in 1974, 37% of the entered patients were evaluable and 26% were treated appropriately. These percentages rose to 53 and 35%, respectively, in the last year of the study. In the next study, which started in 1976, evaluability rates rose from 63 to 73% and appropriateness rates from 37 to 61%. They concluded that the program resulted in a significant improvement in protocol adherence, as expressed by an increased number of evaluable and appropriately treated patients.
The same method was used and comparable findings were reported by Wallner et al. for the RTOG in 1989 [76]. They suggested that, in the case of good institutional performance, the review could be limited to a sample of the entered cases. Hafermann et al. also reported in 1988 on review of the treatment plan for a trial of radiotherapy for localised prostatic cancer through review of simulation and portal films, mainly assessing compliance in terms of the spatial localisation of the pelvic fields and the prostatic volume [77], as did Martenson et al. for radiotherapy of the rectum [78]. The review by Hafermann et al. led to adaptation of the protocol requirements and extra educational sessions: because of a high rate of non-compliance with the field definitions (12%, while another 10% was not evaluable), a new technique for field localisation was developed and communicated to the participants in 2 workshops and a written


communication, leading to a 50% reduction in the rate of inadequate localisations.

The RTOG reported in 1991 on the importance of total elapsed treatment time and fractionation relative to the protocol prescription, besides the earlier mentioned treatment borders and dose [79]. They found a prolonged treatment time and fraction violations, due to patients' normal tissue reactions, machine down-time, holidays and patient refusal, in almost 10% of the entered and reviewed patients, and demonstrated for this group significantly poorer local control and absolute survival compared with those who had an acceptable treatment time: 13% versus 27% and 13% versus 26%, respectively. Duhmke et al. reported after review an inverse relationship between prospectively defined protocol violations and treatment success and survival for extended field radiotherapy in early stage Hodgkin disease, again stressing the value of early control, with a chance of correcting violations in time, preferably before the actual start of a patient's treatment [80].

However, the review of all data is time consuming and costly. Horiot et al. suggested for this reason a random selection of cases per institute [69], while Martin et al. developed a satisfactory system with a statistically based case sampling procedure depending on protocol compliance rates [81]. The radiotherapy reviewers of the NSABP phase III breast cancer trials suggested stopping the evaluation of all portal and simulation films for trials in which radiotherapy was not the main issue and for radiotherapy schedules that were easy to accomplish, because very few patients were judged to have inadequate fields, resulting in an inadequate cost-benefit ratio [82].

3.3.3.2. Dummy run. Another way to check the treatment plan is a dummy run procedure. The main objectives of a dummy run are to tackle systematic errors and to evaluate the compliance of a given centre with the protocol guidelines.
It is useful for quality control of specific radiation techniques and is usually applied in the initial phase of the study, before institutions actually enter patients in the trial. Various aspects of a dummy run can be evaluated, such as the treatment planning facilities in the participating centre and the dose specification procedure, the clarity of the protocol prescriptions, and the potential differences in treatment techniques and resulting dose heterogeneity. These parameters were extensively assessed by the EORTC, for instance when it used a dummy run procedure for the assessment of the role of a booster dose in breast conserving therapy. Three transverse sections of a patient were sent to the participating institutions with a request to make a three-plane treatment plan according to the protocol descriptions. This led to recommendations concerning the choice of a specification point in case half

beam techniques are applied, the choice of the wedge angles and the use of lung density correction in treatment planning, and showed that, despite differences in treatment techniques, comparable data concerning dose and dose heterogeneity can be obtained and good accordance in dose homogeneity can be reached [83].

Valley et al. investigated a dummy run for a routine case of an advanced lung tumour in 10 Swiss radiation institutions [84]. They found large variations in different aspects of radiotherapy. First, there were variations in the equipment: X-rays between 6 and 45 MV from accelerators and betatrons, gamma radiation from cobalt units, and the use of filters and compensators. Furthermore, there were variations in the techniques used: opposing fields versus rotating fields, and variation in field directions and field dimensions. Dose calculations and dose prescriptions also varied: on-axis calculation with or without heterogeneity correction versus computer calculation, variation in the definition of tumour and target volume, point prescription versus isodose prescription, total dose to the tumour (varying from 50 to 71.8 Gy) and dose per fraction (varying from 1.8 to 3 Gy). This led to the conclusion that standardisation, especially of dose prescription, is urgently needed and that the search for consensus on standard management should be promoted.

The procedure was also used for the EORTC radiotherapy trials 22863 for prostatic cancer and 22931 for head and neck carcinomas. For the prostatic cancer trial the variation was less pronounced [85]: 7 of the 9 participating institutions prescribed a total dose in the pelvic field of 50 Gy, as described in the protocol, and the other 2 institutions 50.4 Gy, with fractions varying between 1.8 and 2.0 Gy, although in this study too treatment units varied between 6 and 25 MV accelerators and cobalt units, and both isocentric and point prescription were used.
A major deviation from the protocol was seen for treatment technique in one institution, which used two opposed fields instead of the stated four-field box irradiation, while minor deviations were seen for beam weighting and the inconsistent use of shielding blocks. Less satisfactory results were reported for the head and neck cancer trial [86]. Dose reporting was not according to the recommendations of the ICRU in 7 of 10 institutions, and large discrepancies in the delineation of the PTVs were reported, with a ratio between maximum and minimum PTV exceeding a factor of 9 and the overlap of PTVs between 6 centres varying between 15 and 65% for one PTV and between 9 and 52% for another. This means that, although great efforts have been made to ensure the quality of treatment equipment, treatment planning and delivery (including immobilisation techniques), the human factor so far remains crucial in defining the target volume. This had already been investigated by Leunens et al. and was more or less confirmed by Ketting et al. for three dimensional

target volume planning [87,88]. Leunens et al. assessed the interobserver variability between 12 volunteering physicians in the two dimensional delineation of the tumour and target volume on the lateral orthogonal localisation radiograph of 5 brain tumours. For the five test cases, the tumour surface on which all radiation oncologists agreed on the lateral orthogonal radiograph represented only 35-73% of the corresponding mean tumour surface. Large differences existed between physicians, both in the estimation of the size of the tumour and in the target volume. She suggested that the sources of the large interobserver variation might be related to the methodology used for tumour and target volume delineation, to the training and experience of the physicians involved, and, probably most importantly, to the subjective interpretation of these volumes. Different imaging techniques such as MRI may decrease this variation, as may the computerised transfer of digital diagnostic imaging data to treatment units [88].

3.3.3.3. Phantom. The use of a phantom made it possible to refine quality control of the treatment plan and, in addition, allowed evaluation of the actual treatment. As early as 1968, Worsnop reported on the use of a lung tumour phantom with a thermoluminescent dosimeter for comparison between institutions [89]. Later reports on the use of a phantom came from the EORTC and the RTOG [90,91]. As part of their quality assurance program, the EORTC radiotherapy co-operative group also carried out an intercomparison on an Alderson phantom of a primary tumour of a tonsil with a homolateral subdigastric lymph node. The head, neck and supraclavicular parts of the phantom were used, and the anatomic structures relevant for the treatment were also shown in slices. Institutions were asked to use their ordinary treatment technique for tonsillar tumour patients, using parallel-opposed fields according to the running protocols.
They were asked to calculate the absorbed dose in specific slices and to state the absorbed dose at 17 measuring points in the phantom. The irradiation was performed according to the treatment schedule and the computer calculation made by the institution. In this study, the following problems, mainly related to treatment planning, were encountered: insufficient margins around the tumour were found in 35% of the institutions, while too large margins around the ipsilateral lymph node were found in 25% of the institutions. This reflects underestimation of the primary and nodal target volumes, of patients' motion, of absorbed dose gradients and of the inaccuracy of machines. Different technique choices resulted in too much deviation from the expected optimal dose distribution in the various target volumes for the demonstrable tumour and microscopic disease.

Twenty-two percent of the measuring points in the phantom had an absorbed dose beyond the acceptable level of variance, with a larger spread in the primary target points than in the primary tumour points. An undesirably high dose to the contralateral normal tissue was sometimes given. Inhomogeneity of the dose distribution to the homolateral neck could result either in a high risk of complications such as fibrosis or in a risk of failure in certain regions, while underestimation of the risk of contralateral microscopic spread in both the subdigastric and the upper posterior neck was also observed. From this study, recommendations were made for future trials that include radiotherapy, for example: more detailed information on margins of safety around tumour and positive nodes, extension of target volumes, dose distribution outside the reference dose to the primary tumour, and more accurate specification of the reference, maximum and minimum doses in the target volumes.

3.3.4. Conclusions
In radiotherapy trials, participating institutions are frequently evaluated on the eligibility of their patients and on protocol compliance. Quality assurance of radiotherapy is mainly focussed on equipment evaluation, treatment planning and actual treatment, by dosimetry, a dummy run and the use of a phantom, respectively. With the use of a phantom, treatment planning as well as the actual treatment can be audited and the protocol can be tested in advance. Depending on the results of this procedure, modifications and amendments can be made before the actual start of the trial. To be successful, quality assurance must be exhaustive and should not neglect any of the planning and delivery phases, including a thorough assessment of equipment accuracy and staff resources, which often represent a weak link in the quality assurance chain. A program should therefore look not only for systematic deviations but also for random errors in the planning and delivery of radiotherapy. Regular quality control rounds are often necessary to assess the impact of the feedback [75]. Unfortunately, the impact of quality assurance on the relation between the actually delivered treatment, the protocol-prescribed treatment and outcome remains relatively underinvestigated. Recently, Bentzen et al. made an effort to assess the impact of dosimetry quality assurance programs in the EORTC using radiobiological modelling [92]. He showed that the impact of small deviations was unexpectedly high, both on tumour control probability, which significantly decreased in case of underdosage, and on normal tissue complication probability, which markedly increased in patients who were overdosed. These kinds of studies are needed, together with direct outcome studies, to assess the costs and benefits of quality assurance.

Table 2
Review of chemotherapy practice in EORTC institutions

                                    Steward et al. [94]        Favelli et al. [95]
Study phase                         Phase II                   Phase III
Number of institutions              15                         11
Prepared in laminar flow cabinet    15                         9
Chemotherapy prepared by            Pharmacist 6, Nurse 7,     Pharmacist 2, Nurse 8,
                                    Physician 2                Physician 1
Chemotherapy administered by        Nurse 10, Physician 5      Nurse 6, Physician 2,
                                                               Nurse/physician 3
Checks on chemotherapy              13/15                      9/11
Rounding of chemotherapy dose       Up 15                      Up 3, Down 1, Up/down 3,
                                                               Exact 1, Unknown 3

3.4. Chemotherapy

Information on quality assurance of chemotherapy in trials is sparse.

3.4.1. Investigation of equipment, preparation and administration of chemotherapy
One of the first activities in quality assurance for chemotherapy trials was reported by the EORTC in 1991 [93]. It concerned the assessment of chemotherapy practices in order to define optimal quality. Two studies were assessed and the results were presented in two separate papers: one was a phase II study from the soft tissue and bone sarcoma group on the use of GM-CSF with doxorubicin and ifosfamide [91] and one was a randomised phase III study from the gynaecological group comparing cisplatin, bleomycin, vindesine and mitomycin versus cisplatin alone for disseminated carcinoma of the cervix [93-95]. Using mailed questionnaires, they investigated technical aspects of chemotherapy prescription, rounding procedures for dosages, local facilities for preparation, and the procedures for preparation and administration. A protocol-specific part evaluated the guidelines in the protocol on the use of the products, such as reconstitution, sequence, timing and mode of administration of the chemotherapy, and details on anti-emetic policy and the recording of side effects. On-site visits completed the information with a review of facilities such as the pharmacy, ward, outpatient department and data management office; checks of the calculated and administered dose per cycle, including comparison with the protocol dose; recalculation of the protocol dose based on the height and weight of the patient at onset and adaptation to the weight in subsequent cycles; the reasons for reduction of treatment, if any; and the intervals between cycles.

In Table 2, the chemotherapy practices in the various institutions of the EORTC as described by Steward et al. [94] and Favelli et al. [95] are summarised. Between institutions, large variations existed in equipment, chemotherapy preparation, administration, checks and rounding of chemotherapy. Steward et al.
[94] reported that the chemotherapy was prepared by pharmacists in 6 centres, by nurses in 7 and by physicians in 2 centres. All but two centres had systems in which two separate individuals checked the correct dose; in one centre no checks were performed, and in one centre three individuals checked it. The median interval between chemotherapy preparation and administration was 60 min (range 2 min to 30 h). In all instances, doses were rounded up to the nearest easily measurable value rather than rounded down. All centres used similar anti-emetic policies: a steroid with either metoclopramide or a 5-HT3 antagonist. Specialist nurses administered the chemotherapy in 10 centres and physicians in the remaining five. Favelli

et al. described for the 11 centres they audited that chemotherapy was prepared by pharmacists in 2 centres, by nurses in 8 centres and by doctors in 1 centre. Chemotherapy was not checked in two centres. Three centres rounded the dose up, one rounded it down, three had a variable policy and three an unknown policy; only one centre prescribed the exact dose. Considerable drug-dependent deviations in the timing of chemotherapy administration, ranging from 15 min to 12 h, were observed in 2 of 7 centres. The sequencing of drugs was variable for some combinations, and compliance with the switching of protocol-described chemotherapy regimens was poor in four centres.

3.4.2. Dosing and dose intensity audits
Steward et al. [94] also assessed dose and dose intensity. The mean dose of the given chemotherapy over all courses was 102% of the planned dose, with a range of 94-110%. The mean dose of GM-CSF given was 100%, but the range was 14-114%, which was attributed to incorrect interpretation of the instructions by patients, who administered this drug themselves. The overall median interval between the chemotherapy courses was as required by the therapy protocol; however, course delays of up to 34 days occurred in 30% of the courses, and in 66% of the cases these were due to organisational problems. Local clinicians were always unaware of such problems prior to the site visits and rearranged their practice with repeat admissions. Toxicity (29%) and patient request (5%) were the other reasons for delays. Favelli et al. [95] found median percentages of the intended dose for vindesine, cisplatin, bleomycin and mitomycin C and single-dose cisplatin of 98 (range 49-120)%, 100 (range 92-112)%, 100 (range 22-160)%, 99 (range 0-195)% and 100 (range 85-105)%, respectively. They also reported on the incidence and causes of changes in treatment intervals: a delay occurred in 48 of the 176 treatment cycles evaluated in 55 patients.
Twenty-five of these delays were due to toxicity or intercurrent disease, 10 resulted from patients' requests and 13 were classified as avoidable; for 6 cycles the treatment interval was shortened. Overall, they considered 19 of the 54 altered intervals (35%) avoidable. The CALGB reported in 1993 on chemotherapy practice audits in their group [96]. They defined major deviations as, for example, omission of one drug in a three to five drug regimen, a dose change of more than 15%, a series of minor dosing deviations that in total are considered major, lengthening of the interval between treatment cycles through error, and use of the wrong drug. They found major protocol deviations in drug dosing, mainly underdosing, in 8.9 and 14.8% of the patients entered by main (university) centres and affiliate centres, respectively. The percentage did not change much over a period of 9 years with 3 consecutive

audit cycles. A retrospective analysis of adjuvant chemotherapy with CMF by Bonadonna et al. reported 14% major deviations from the protocol guidelines for chemotherapy dosing, concerning delay and reduction in the number of courses [97]. Lise et al. reported for the EORTC trial of adjuvant chemotherapy for gastric cancer that 12% of the patients randomised to chemotherapy never received it and only 48.4% received the intended 6 or 7 courses [45]. The reasons for not starting chemotherapy or not completing the 7 courses were unknown in 15% of the patients; patient refusal was responsible for another 15%. Recently, Benedetti-Panici et al. [98] reported on a randomised trial of neoadjuvant chemotherapy followed by surgery versus radiotherapy for locally advanced cervical cancer. The only protocol requirements were that a total cisplatin dose of at least 240 mg/m2 should be given, with a maximum of two additional drugs, administered over a period of 6-8 weeks. A number of different schedules were administered, varying from 1 to 3 drugs, with the intended cisplatin dose per course varying from 40 to 80 mg/m2 and schedules varying from weekly to 3-weekly courses. With this variable neoadjuvant chemotherapy, 11 patients (5%) had more than 2 weeks of delay and 1 patient had more than 20% cisplatin dose reduction in the absence of toxicity; another 5% of the patients discontinued chemotherapy. A delay of 1-2 weeks was present in 15% of the patients and dose reduction in 3%, all due to treatment toxicity, without specification of the schedules involved. Although the protocol left room for variation in chemotherapy schedules, it is probably better to avoid this kind of institutional compromise and to choose the one schedule that is thought best on the basis of evidence.
This would probably have increased the chance of a more solid conclusion on the beneficial effects of neoadjuvant chemotherapy and its consequences for toxicity, which in this study was compared with the toxicity of radiotherapy.

3.4.3. Systemic therapy checklist
Both EORTC studies on quality control of chemotherapy [24,94] reported that a disappointing 20% (range between institutions 0.7-45.6%) of the data could not be assessed in audits because information on the case record forms could not be found in the hospital file. These data mainly concerned treatment dose, time and toxicity [99]. This, together with the fact that site visits, although effective in increasing quality, are very expensive and time consuming, led to the development of a systemic therapy checklist, a card for the hospital record to improve recording and data acquisition. On this checklist, variables related to eligibility, drug doses and their administration, biochemical and haematological parameters, toxicity of treatment and response need to be filled out. Evaluation of this instrument to increase the quality of data resulted in an increase of correctly reported data from 68% before the introduction of the checklist, and from 86% in the 5 hospitals that did not use it, to 98% in the 6 hospitals that did. Two of the 5 hospitals that did not want to use the checklist nevertheless had 94 and 99% correct data; however, in both centres only one physician was involved in the treatment of the patients on trial and in the checking of the case record forms. In the other 3 institutions not using the checklist, correct data were found in 65, 76 and 83%, respectively. The organisation of these institutions was more complex, with different physicians treating patients on trial, probably resulting in a lower sense of involvement. Owing to the checklist, the amount of missing data in the hospital files in particular decreased dramatically, from 28 to 0.6% [99]. Another advantage of the checklist was that on-site data checking became much more efficient, as the median number of cycles that could be checked increased from 3.5 to 6.5 cycles/h.

3.4.4. Conclusions
Few trials have focussed on chemotherapy practices and their influence on outcome. Audits in different co-operative groups resulted in similar conclusions on practice variation and deviation ranges. Trials that assess the value and toxicity of chemotherapy should choose, on the best evidence, the most effective and least toxic treatment and not a variety of different schedules. The introduction of the systemic therapy checklist in EORTC centres improved the documentation of chemotherapy and the efficiency and quality of data management in the hospitals that used it.
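The dose and interval audits described above reduce, in essence, to comparing delivered quantities and cycle dates against the protocol plan. A minimal sketch of the two calculations; the function names and data shapes are our own illustration under stated assumptions, not an actual audit tool:

```python
from datetime import date

def percent_of_intended_dose(delivered_mg: list[float], planned_mg: list[float]) -> float:
    """Total delivered dose over all courses as a percentage of the planned total."""
    if not planned_mg or len(delivered_mg) != len(planned_mg):
        raise ValueError("course lists must be non-empty and of equal length")
    return 100.0 * sum(delivered_mg) / sum(planned_mg)

def cycle_delays(start_dates: list[date], planned_interval_days: int) -> list[int]:
    """Delay in days of each cycle relative to the protocol interval (negative = early)."""
    return [
        (later - earlier).days - planned_interval_days
        for earlier, later in zip(start_dates, start_dates[1:])
    ]
```

For a 3-weekly schedule, a cycle started 35 days after the previous one would show a delay of 14 days, the kind of interval alteration the Favelli audit tabulated.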

4. Data monitoring

With the development of co-operative groups executing large multi-centre trials, data monitoring became of the utmost importance, and data monitoring centres were created to guarantee the uniformity, completeness and correctness of clinical trial data [2,4,96,100,101]. The first step in data monitoring is the transfer of data from the medical charts to the case report form (CRF), usually done by local data managers. The EORTC checked this procedure in 15 of its institutions for 3 different protocols [102]. The quality of the data was coded with an A code, related to the type and relevance of the error, and a B code, related to the cause of the incorrect data. In the different institutions, 0.4 to 14.5% of the data on the CRF could not be found in the patients' files. This was due in part to the absence of a central file in some institutions and to poorly written communication between the departments in those institutions, while in other

hospitals local data management fell short and study information was written directly on the CRF by the treating physician but not in the patient files. The percentage of missing data on the CRF was low, 0.3-2.9%, and was partly caused by missing information in the medical chart. The median rate of incorrect data was 2.8% (range 0.5-7%), mainly caused by incorrect transfer of information. Although no striking differences were found in the correctness of data between institutions with local data managers and those without, the best results were found in the institutions with well-trained and experienced data managers. These data managers had created a structure to generate good quality data, for example through prospective reminders, protocol summaries and notes in the medical charts detailing what should be done. The results in the institutions without data managers were mainly influenced by the quality of the patient file and the time that elapsed between treatment and completion of the CRF by the treating physician. Also important for this procedure is the quality of the CRF itself, with unambiguously defined items in a logical, clear format. CRF quality can best be evaluated by piloting, and common forms should be used whenever possible to reduce coding errors [102,103].

The Danish Breast Cancer Co-operative Group evaluated the end results of 2 of their protocols, DBCG 82B and DBCG 82C, on adjuvant systemic treatment of pre- and postmenopausal breast cancer patients, respectively. They performed an audit of the off-study data of the breast cancer patients included in these studies by an on-site review of the patients' records [104]. They observed incorrect data in 16.2% of the cases that went off-study due to recurrence. In 12% of the patients, non-identical locations were demonstrated. In 5% of the patients a major difference in the site of recurrence was found, of whom 2% were upstaged from local to distant metastases and 3% were downstaged from distant to local recurrence.
A time difference of more than 30 days was found in 9% of the patients. The major parameter in the statistical analysis of the two protocols, however, was not significantly influenced by the validation.

The second step is the entry of data from the CRF into the centralised database of the data centre. Usually the data managers of the central office do this in a duplicate entry procedure, although discussion exists as to how cost-effective this procedure is and whether other procedures, such as exploratory data analysis and adaptive double data entry, are more effective [105-107]. Errors are then checked by a computer-based relational validation routine. Missing, inconsistent, illogical, out-of-range and discrepant entries are marked and reviewed by the data manager of the central data office, who, if needed, immediately notifies the participating institution for corrections. The information is also mailed to

the study co-ordinator, who is responsible for the overall direction of the protocol. After review by the study co-ordinator, the statistician reviews the data a second time. The rationale of regular data monitoring is first to provide a feedback mechanism allowing for the correction of deficiencies in both data collection and protocol compliance, leading to quality assurance of the protocol, mainly assessed by eligibility, treatment compliance and evaluability, as outlined before under the headings of the different disciplines involved in cancer trials. For this purpose it is important that the feedback is fast. Many co-operative groups have reported an increase in the quality of data through this mechanism [2,26,108]. A second rationale is the detection of fraud, which can be suspected if the pattern of variability of the data in one hospital differs from that in the others [109]. A third rationale is the monitoring of problems, such as adverse drug reactions or unexpectedly large differences in results between treatment arms. Many co-operative groups have instituted a data monitoring review committee for this purpose [110]. These committees are usually composed of both internal and external reviewers and are in charge of assessing the progress of phase III trials and making recommendations concerning their early termination [14]. Especially for the decision on early stopping, it is very important that stopping rules are laid down in the protocol before a trial starts. Some prefer a completely independent review committee, because when the study co-ordinator is involved in the decision making, a conflict of interest could arise that might compromise the safety of the patients and the integrity of the trial [111].

4.1. Conclusions
Data management is a very important aspect of the quality assurance of trials. The quality assurance of data is organised at different levels, each addressing specific aspects of the trial.
These range from the transfer of data from the records onto a CRF to a global statistical evaluation of the entered data to detect extremes and data inconsistencies. Data management through the monitoring of study results is also used for quality control of treatment results, in order to protect patients from being treated with a suboptimal or toxic therapy, which can lead to premature closure of the study.

5. Discussion

From the reviewed literature it has become clear that quality assurance of clinical trials is needed at all levels,

from their planning by the research group and the writing of a protocol by the writing committee (for example to check the methodology, the protocol and all the administrative and regulatory issues), to their execution, data monitoring, analysis and publication. In order to organise this quality assurance efficiently, a complex co-ordinating organisation is needed. The co-operative group should also pay attention to the interactions between different specialists treating the same patient and to the interactions of different specialists in different institutes or regions with their own cultural backgrounds. Especially the cultural diversity of co-operative groups may lead to undesirable concessions in the development and execution of a protocol, thereby compromising the quality of the study and its results.

As described in the introduction, quality assurance comprises all those actions necessary to provide adequate confidence that a product or service will satisfy given requirements for quality, either by testing the product or service against the prescribed standard to establish its capability to meet it, or by assessing the organisation which supplies the products to establish its capability to produce products of a certain standard. Most published quality assurance activities in the co-operative groups focussed on the first part of this definition, with an emphasis on standardisation, checking and audits. The goal of the audits in clinical trials was mainly the improvement of data quality, the improvement of the performance of investigators or their sites, and the detection of fraud. An audit is, however, time consuming and costly. Estimates of the costs vary between $800 and $1500 per site visit [94,95], excluding the salaries lost for the time taken from regular duties. Per site visit, approximately 4-10 patients may be checked.
Similar figures are reported for the audit costs per patient [112], depending on the number of patients audited and the number of, and distances between, the sites that need to be visited in a trial. In too many studies that report audits for quality assurance, many data are reviewed without a model or a statistical assessment of the influence of these data on outcome; a critical assessment of quality assurance costs is also lacking. To increase the efficiency of audits in future trials, and for the sake of all patients treated in daily practice, co-operative groups should be challenged to find useful indicators of protocol adherence and good quality treatment, and to try to assess which deviations in treatment result in impaired outcome, and how and at what cost they could be corrected. With databases of many patients with comparable diseases, this must be feasible. For some studies these issues were worked out; these papers were covered in this review, with some finding no influence of their audit on outcome [59], while others did report a possible influence of audit on outcome [60,66]. Another way to decrease the costs of


audits is the use of informatics, such as telepathology and teleradiology. These techniques are especially useful for pathologists and radiologists in assessing eligibility criteria and in response evaluation. Data management could also be simplified by electronic CRFs that may be incorporated in the (future) computerised patient file. The fast processing of data and the easy exchange of data (albeit that attention should be paid to securing privacy) also make it easier to intervene in studies and correct deviations at an early stage.

Regarding the second part of the definition of quality assurance, assessing the organisation for its capability of delivering the required product, not much information is available in the scientific literature. Only very recently, on the occasion of its 40th anniversary, did the EORTC publish information on how its organisation performs good quality clinical trials [113,114]. One measure to improve quality assurance might be accreditation of the co-operative groups, by an independent organisation, for their production of clinical trials. This could reveal the strengths and weaknesses of the co-operative groups that could be addressed in order to improve the quality of the clinical trials. Assessment of the organisational structure and investment in the education and training of co-investigators are examples to be considered. Another way to improve performance is to increase information exchange and interaction between different co-operative groups. By comparing their structure, process and outcome, opportunities are created to choose the best organisational structure. Also, by co-operating, these groups will be able to define universal standards of both diagnosis and treatment for patients in trials and in daily practice. For everyday practice, parallels can be drawn: the difficulties encountered in the implementation and quality assurance of trials will also recur when trial results are to be implemented in daily practice.
Until recently this aspect was underestimated in clinical research. Merely publishing trial results in peer-reviewed journals is often not enough to change the practice behaviour of physicians. In our opinion, the methods used to implement a trial in a co-operative group deserve more attention and should be an integral part of the trial. Their effects on quality assurance should be measured and reported in order to gain effective instruments for translating and implementing research findings into daily practice. Furthermore, it seems worthwhile to compare treatment results in daily practice (phase IV studies) with the results reported in clinical trials, as a final quality test.

P.B. Ottevanger et al. / Critical Reviews in Oncology/Hematology 47 (2003) 213–235

6. Conclusions

Quality assurance of clinical trials is complex and requires an organisation that is able to act adequately and quickly at the different levels, from the development of the protocol to the publication of the study results. Close contact is needed with the developers of the protocol, the investigators and their institutions, the data managers and statisticians who evaluate study results, and the writers and publishers of the study results. The organisation should also be able to cope with different cultural backgrounds and with the international legal aspects of trials. It should play a key role in the development of standards for the diagnosis and treatment of cancer. Since quality assurance is expensive, there is a need for a thorough evaluation of the effectiveness of quality assurance measures, including cost-benefit analyses. From this review, quality control with audits appeared to be the most frequently used quality assurance measure. Cost-effective ways of education and training should be explored. More attention should be paid to the results of quality assurance in relation to outcome. The feasibility of similar quality assurance measures for the diagnosis and treatment of patients in daily practice should also be explored.

Reviewers

Steven Hirschfeld, M.D., Ph.D., Medical Officer, US Public Health Service, Board Certified in Pediatric Hematology/Oncology, Food and Drug Administration, HFD-150, 1451 Rockville Pike, Rockville, MD 20852, USA.

Prof. Dr. Allan T. van Oosterom, U.Z. Gasthuisberg, Department of Oncology, Herestraat 49, B-3000 Leuven, Belgium.

Prof. Jean-Claude Horiot, Directeur, Centre de Lutte contre le Cancer G.F. Leclerc, 1, rue Marion, BP 77980, F-21079 Dijon, France.

References

[1] Hoyle D. ISO 9000 quality systems handbook. Oxford: Butterworth-Heinemann Ltd; 1994.
[2] Glicksman AS, Reinstein LE, Brotman R, McShan D. Quality assurance programs in clinical trials. Cancer Treat Rep 1980;64(2–3):425–33.
[3] Wirtschafter D, Carpenter JT, Mesel E. A consultant-extender system for breast cancer adjuvant chemotherapy. Ann Intern Med 1979;90(3):396–401.
[4] Sylvester RJ, Pinedo HM, De Pauw M, Staquet MJ, Buyse ME, Renard J, et al. Quality of institutional participation in multicenter clinical trials. N Engl J Med 1981;305(15):852–5.
[5] Balch CM, Durant JR, Bartolucci AA. The impact of surgical quality control in multi-institutional group trials involving adjuvant cancer treatments. Ann Surg 1983;198(2):164–7.
[6] Buyse ME, Staquet MJ, Sylvester RJ. Cancer clinical trials: methods and practice. Oxford: Oxford University Press; 1984.

[7] National Cancer Institute. Investigators' handbook. Cancer evaluation program. Bethesda: National Cancer Institute, Division of Cancer Treatment; 1986.
[8] Meinert CL. Clinical trials: design, conduct and analysis. New York: Oxford University Press; 1986. p. 166–76.
[9] EORTC. A practical guide to EORTC studies; 1996.
[10] Haase GM. The implications of surgical quality assurance in cancer clinical trials. Cancer 1994;74(Suppl 9):2630–7.
[11] Simon RM. Clinical trials in cancer. In: DeVita VT, Hellman S, Rosenberg S, editors. Cancer: principles and practice of oncology. 6th ed. Philadelphia: J.B. Lippincott Company; 1997. p. 521–45.
[12] Djulbegovic B, Lacevic M, Cantor A, Fields KK, Bennett CL, Adams JR, et al. The uncertainty principle and industry-sponsored research. The Lancet 2000;356(9230):635–8.
[13] Begg CB, Engstrom PF. Eligibility and extrapolation in cancer clinical trials. J Clin Oncol 1987;5(6):962–8.
[14] George SL. Reducing patient eligibility criteria in cancer clinical trials. J Clin Oncol 1996;14(4):1364–70.
[15] Peto R, Collins R, Gray R. Large-scale randomized evidence: large, simple trials and overviews of trials. J Clin Epidemiol 1995;48(1):23–40.
[16] Armitage P. Controversies and achievements in clinical trials. Control Clin Trials 1984;5(1):67–72.
[17] Hawkins BS. Evaluating the benefit of clinical trials to future patients. Control Clin Trials 1984;5(1):13–32.
[18] Weiss RB, Gill GG, Hudis CA. An on-site audit of the South African trial of high-dose chemotherapy for metastatic breast cancer and associated publications. J Clin Oncol 2001;19(11):2771–7.
[19] Overgaard M, Hansen PS, Overgaard J, Rose C, Andersson M, Bach F, et al. Postoperative radiotherapy in high-risk premenopausal women with breast cancer who receive adjuvant chemotherapy. Danish Breast Cancer Cooperative Group 82b Trial. N Engl J Med 1997;337(14):949–55.
[20] Torri V, Simon R, Russek-Cohen E, Midthune D, Friedman M. Statistical model to determine the relationship of response and survival in patients with advanced ovarian cancer treated with chemotherapy. J Natl Cancer Inst 1992;84(6):407–14.
[21] Therasse P, Arbuck SG, Eisenhauer EA, Wanders J, Kaplan RS, Rubinstein L, et al. New guidelines to evaluate the response to treatment in solid tumors. European Organization for Research and Treatment of Cancer, National Cancer Institute of the United States, National Cancer Institute of Canada. J Natl Cancer Inst 2000;92(3):205–16.
[22] Freiman JA, Chalmers TC, Smith H Jr, Kuebler RR. The importance of beta, the type II error and sample size in the design and interpretation of the randomized control trial. Survey of 71 'negative' trials. N Engl J Med 1978;299(13):690–4.
[23] Simon R. Randomized clinical trials in oncology. Principles and obstacles. Cancer 1994;74(Suppl 9):2614–9.
[24] Simon R. Confidence intervals for reporting results of clinical trials. Ann Intern Med 1986;105(3):429–35.
[25] Fleming TR, Watelet LF. Approaches to monitoring clinical trials. J Natl Cancer Inst 1989;81(3):188–93.
[26] Pollock BH. Quality assurance for interventions in clinical trials. Multicenter data monitoring, data management, and analysis. Cancer 1994;74(Suppl 9):2647–52.
[27] Mariani L, Marubini E. Content and quality of currently published phase II cancer trials. J Clin Oncol 2000;18(2):429–36.
[28] Nicolucci A, Grilli R, Alexanian AA, Apolone G, Torri V, Liberati A. Quality, evolution, and clinical implications of randomized, controlled trials on the treatment of lung cancer. A lost opportunity for meta-analysis. JAMA 1989;262(15):2101–7.

[29] Marsoni S, Torri W, Taiana A, Gambino A, Grilli R, Liati P, et al. Critical review of the quality and development of randomized clinical trials (RCTs) and their influence on the treatment of advanced epithelial ovarian cancer. Ann Oncol 1990;1(5):343–50.
[30] Liberati A, Himel HN, Chalmers TC. A quality assessment of randomized control trials of primary treatment of breast cancer. J Clin Oncol 1986;4(6):942–51.
[31] Glasziou P, Irwig L. The quality and interpretation of mammographic screening trials for women ages 40–49. J Natl Cancer Inst Monogr 1997;(22):73–7.
[32] Thiesse P, Ollivier L, Stefano-Louineau D, Negrier S, Savary J, Pignard K, et al. Response rate accuracy in oncology trials: reasons for interobserver variability. Groupe Français d'Immunothérapie of the Fédération Nationale des Centres de Lutte Contre le Cancer. J Clin Oncol 1997;15(12):3507–14.
[33] Holmes EC. General principles of surgery quality control. Chest 1994;106(Suppl 6):334S–6S.
[34] Birk D, Bassi C, Beger HG. Need for a standard report and future directions in pancreatic resections for cancer. Dig Surg 1999;16(4):276–80.
[35] Douglass HO Jr. Adjuvant therapy of gastric cancer: have we made any progress? Ann Oncol 1994;5(Suppl 3):49–57.
[36] Axillary dissection. The Steering Committee on Clinical Practice Guidelines for the Care and Treatment of Breast Cancer. Canadian Association of Radiation Oncologists. CMAJ 1998;158(Suppl 3):S22–S26.
[37] Kranenbarg EK, van de Velde CJ. Importance of organizing surgical trials in oncology. Jpn J Clin Oncol 1999;29(4):185–6.
[38] Jacobs JR, Pajak TF, Weymuller E, Sessions D, Schuller DE. Development of surgical quality-control mechanisms in large-scale prospective trials: Head and Neck Intergroup report. Head Neck 1991;13(1):28–32.
[39] Bunt AM, Hermans J, van de Velde CJ, Sasako M, Hoefsloot FA, Fleuren G, et al. Lymph node retrieval in a randomized trial on western-type versus Japanese-type surgery in gastric cancer. J Clin Oncol 1996;14(8):2289–94.
[40] Jacobs JR, Pajak TF, Snow JB, Lowry LD, Kramer S. Surgical quality control in head and neck cancer. Study 73-03 of the Radiation Therapy Oncology Group. Arch Otolaryngol Head Neck Surg 1989;115(4):489–93.
[41] Bijker N, Rutgers EJ, Peterse JL, Fentiman IS, Julien JP, Duchateau L, et al. Variations in diagnostic and therapeutic procedures in a multicentre, randomized clinical trial (EORTC 10853) investigating breast-conserving treatment for DCIS. Eur J Surg Oncol 2001;27(2):135–40.
[42] Christiaens MR. Documentation of the surgical procedure: a tool for quality assessment for breast conservative treatment. Anticancer Res 1996;16(6C):3955–8.
[43] Omura GA. Misstaging ovarian cancer. Obstet Gynecol 1986;67(1):150–1.
[44] Holm T, Johansson H, Cedermark B, Ekelund G, Rutqvist LE. Influence of hospital- and surgeon-related factors on outcome after treatment of rectal cancer with or without preoperative radiotherapy. Br J Surg 1997;84(5):657–63.
[45] Lise M, Nitti D, Marchet A, Sahmoud T, Buyse M, Duez N, et al. Final results of a phase III clinical trial of adjuvant chemotherapy with the modified fluorouracil, doxorubicin, and mitomycin regimen in resectable gastric cancer. J Clin Oncol 1995;13(11):2757–63.
[46] Kapiteijn E, Kranenbarg EK, Steup WH, Taat CW, Rutten HJ, Wiggers T, et al. Total mesorectal excision (TME) with or without preoperative radiotherapy in the treatment of primary rectal cancer. Prospective randomised trial with standard operative and histopathological techniques. Dutch ColoRectal Cancer Group. Eur J Surg 1999;165(5):410–20.


[47] Holm T, Rutqvist LE, Johansson H, Cedermark B. Abdominoperineal resection and anterior resection in the treatment of rectal cancer: results in relation to adjuvant preoperative radiotherapy. Br J Surg 1995;82(9):1213–6.
[48] Weeden S, Grimer RJ, Cannon SR, Taminiau AH, Uscinska BM. The effect of local recurrence on survival in resected osteosarcoma. Eur J Cancer 2001;37(1):39–46.
[49] Schraffordt Koops H. Surgical quality control in an international randomized clinical trial. Eur J Surg Oncol 1992;18(6):525–9.
[50] Reynolds JV, Mercer P, McDermott EW, Cross S, Stokes M, Murphy D, et al. Audit of complete axillary dissection in early breast cancer. Eur J Cancer 1994;30A(2):148–9.
[51] Bass SS, Cox CE, Reintgen DS. Learning curves and certification for breast cancer lymphatic mapping. Surg Oncol Clin N Am 1999;8(3):497–509.
[52] Krag D, Weaver D, Ashikaga T, Moffat F, Klimberg VS, Shriver C, et al. The sentinel node in breast cancer: a multicenter validation study. N Engl J Med 1998;339(14):941–6.
[53] Summers AN, Rinehart GC, Simpson D, Redlich PN. Acquisition of surgical skills: a randomized trial of didactic, videotape, and computer-based training. Surgery 1999;126(2):330–6.
[54] Kanouse DE, Jacoby I. When does information change practitioners' behavior? Int J Technol Assess Health Care 1988;4(1):27–33.
[55] Thunnissen FB, Ambergen AW, Koss M, Travis WD, O'Leary TJ, Ellis IO. Mitotic counting in surgical pathology: sampling bias, heterogeneity and statistical uncertainty. Histopathology 2001;39(1):1–8.
[56] Hermanek P, Wittekind C. The pathologist and the residual tumor (R) classification. Pathol Res Pract 1994;190(2):115–23.
[57] Nagtegaal ID, Kranenbarg EK, Hermans J, van de Velde CJ, van Krieken JH. Pathology data in the central databases of multicenter randomized trials need to be based on pathology reports and controlled by trained quality managers. J Clin Oncol 2000;18(8):1771–9.
[58] Fisher ER, Costantino J. Quality assurance of pathology in clinical trials. The National Surgical Adjuvant Breast and Bowel Project experience. Cancer 1994;74(Suppl 9):2638–41.
[59] Gilchrist KW, Harrington DP, Wolf BC, Neiman RS. Statistical and empirical evaluation of histopathologic reviews for quality assurance in the Eastern Cooperative Oncology Group. Cancer 1988;62(5):861–8.
[60] Wolf BC, Gilchrist KW, Mann RB, Neiman RS. Evaluation of pathology review of malignant lymphomas and Hodgkin's disease in cooperative clinical trials. The Eastern Cooperative Oncology Group experience. Cancer 1988;62(7):1301–5.
[61] Cocker J, Fox H, Langley FA. Consistency in the histological diagnosis of epithelial abnormalities of the cervix uteri. J Clin Pathol 1968;21(1):67–70.
[62] Buckley CH, Butler EB, Fox H. Cervical intraepithelial neoplasia. J Clin Pathol 1982;35(1):1–13.
[63] Morson BC. Histopathology reporting in large-bowel cancer. Br Med J (Clin Res Ed) 1981;283(6305):1493–4.
[64] Farmer ER, Gonin R, Hanna MP. Discordance in the histopathologic diagnosis of melanoma and melanocytic nevi between expert pathologists. Hum Pathol 1996;27(6):528–31.
[65] Howat AJ, Beck S, Fox H, Harris SC, Hill AS, Nicholson CM, et al. Can histopathologists reliably diagnose molar pregnancy? J Clin Pathol 1993;46(7):599–602.
[66] Van der Meijden A, Sylvester R, Collette L, Bono A, Ten Kate F. The role and impact of pathology review on stage and grade assessment of stages Ta and T1 bladder tumors: a combined analysis of 5 European Organization for Research and Treatment of Cancer trials. J Urol 2000;164(5):1533–7.


[67] Linder J. Automation of the Papanicolaou smear: a technology assessment perspective. Arch Pathol Lab Med 1997;121(3):282–6.
[68] Patten SF Jr, Lee JS, Nelson AC. NeoPath, Inc. NeoPath AutoPap 300 Automatic Pap Screener System. Acta Cytol 1996;40(1):45–52.
[69] Horiot JC, Johansson KA, Gonzalez DG, van der Schueren E, van den Bogaert W, Notter G. Quality assurance control in the EORTC cooperative group of radiotherapy. 1. Assessment of radiotherapy staff and equipment. European Organization for Research and Treatment of Cancer. Radiother Oncol 1986;6(4):275–84.
[70] Bernier J, Horiot JC, Bartelink H, Johansson KA, Cionini L, Gonzalez D, et al. Profile of radiotherapy departments contributing to the cooperative group of radiotherapy of the European Organization for Research and Treatment of Cancer. Int J Radiat Oncol Biol Phys 1996;34(4):953–60.
[71] Hunig R, Landmann C, Roth J, Reinstein LE, Glicksman AS. Quality control of radiotherapy in acute lymphocytic leukemia protocol treatment: experience with 610 cases. Eur J Cancer Clin Oncol 1983;19(11):1585–91.
[72] Johansson KA, Horiot JC, Van Dam J, Lepinoy D, Sentenac I, Sernbo G. Quality assurance control in the EORTC cooperative group of radiotherapy. 2. Dosimetric intercomparison. Radiother Oncol 1986;7(3):269–79.
[73] Golden R, Cundiff JH, Grant WH III, Shalek RJ. A review of the activities of the AAPM Radiological Physics Center in interinstitutional trials involving radiation therapy. Cancer 1972;29(6):1468–72.
[74] International Electrotechnical Commission. Medical electron accelerators in the range 1–50 MeV. Performance tolerance; 1981. Report No.: subcommittee 62C-18.
[75] Hansson U, Johansson KA, Horiot JC, Bernier J. Mailed TL dosimetry programme for machine output check and clinical application in the EORTC Radiotherapy Group. Radiother Oncol 1993;29(2):85–90.
[76] Wallner PE, Lustig RA, Pajak TF, Robinson G, Davis LW, Perez CA, et al. Impact of initial quality control review on study outcome in lung and head/neck cancer studies: review of the Radiation Therapy Oncology Group experience. Int J Radiat Oncol Biol Phys 1989;17(4):893–900.
[77] Hafermann MD, Gibbons RP, Murphy GP. Quality control of radiation therapy in multi-institutional randomized clinical trial for localized prostate cancer. Urology 1988;31(2):119–24.
[78] Martenson JA Jr, Urias R, Smalley SR, Coia LR, Tepper JE, Rotman M, et al. Radiation therapy quality control in a clinical trial of adjuvant postoperative treatment for rectal cancer. Int J Radiat Oncol Biol Phys 1995;32(1):51–5.
[79] Pajak TF, Laramore GE, Marcial VA, Fazekas JT, Cooper J, Rubin P, et al. Elapsed treatment days: a critical item for radiotherapy quality control review in head and neck trials: RTOG report. Int J Radiat Oncol Biol Phys 1991;20(1):13–20.
[80] Duhmke E, Diehl V, Loeffler M, Mueller RP, Ruehl U, Willich N, et al. Randomized trial with early-stage Hodgkin's disease testing 30 Gy vs. 40 Gy extended field radiotherapy alone. Int J Radiat Oncol Biol Phys 1996;36(2):305–10.
[81] Martin LA, Krall JM, Curran WJ, Leibel SA, Cox JD. Influence of a sampling review process for radiation oncology quality assurance in cooperative group clinical trials: results of the Radiation Therapy Oncology Group (RTOG) analysis. Radiother Oncol 1995;36(1):9–14.
[82] Deutsch M, Bryant J, Bass G. Radiotherapy review on National Surgical Adjuvant Breast and Bowel Project (NSABP) phase III breast cancer clinical trials: is there a need for submission of portal/simulation films? Am J Clin Oncol 1999;22(6):606–8.
[83] van Tienhoven G, van Bree NA, Mijnheer BJ, Bartelink H. Quality assurance of the EORTC trial 22881/10882: assessment of the role of the booster dose in breast conserving therapy: the dummy run. EORTC Radiotherapy Cooperative Group. Radiother Oncol 1991;22(4):290–8.
[84] Valley JF, Mirimanoff RO. Comparison of treatment techniques for lung cancer. Radiother Oncol 1993;28(2):168–73.
[85] Dusserre A, Garavaglia G, Giraud JY, Bolla M. Quality assurance of the EORTC radiotherapy trial 22863 for prostatic cancer: the dummy run. Radiother Oncol 1995;36(3):229–34.
[86] Valley JF, Bernier J, Tercier PA, Fogliata-Cozzi A, Rosset A, Garavaglia G, et al. Quality assurance of the EORTC radiotherapy trial 22931 for head and neck carcinomas: the dummy run. Radiother Oncol 1998;47(1):37–44.
[87] Leunens G, Menten J, Weltens C, Verstraete J, van der Schueren E. Quality assessment of medical decision making in radiation oncology: variability in target volume delineation for brain tumours. Radiother Oncol 1993;29(2):169–75.
[88] Ketting CH, Austin-Seymour M, Kalet I, Unger J, Hummel S, Jacky J. Consistency of three-dimensional planning target volumes across physicians and institutions. Int J Radiat Oncol Biol Phys 1997;37(2):445–53.
[89] Worsnop BR. Phantom thermoluminescent dosimeter comparison for a cooperative radiotherapy trial. Radiology 1968;91(3):545–53.
[90] Johansson KA, Horiot JC, van der Schueren E. Quality assurance control in the EORTC cooperative group of radiotherapy. 3. Intercomparison in an anatomical phantom. Radiother Oncol 1987;9(4):289–98.
[91] Paliwal BR, Ritter MA, McNutt TR, Mackie TR, Thomadsen BR, Purdy JA, et al. A solid water pelvic and prostate phantom for imaging, volume rendering, treatment planning, and dosimetry for an RTOG multi-institutional, 3-D dose escalation study. Radiation Therapy Oncology Group. Int J Radiat Oncol Biol Phys 1998;42(1):205–11.
[92] Bentzen SM, Bernier J, Davis JB, Horiot JC, Garavaglia G, Chavaudra J, et al. Clinical impact of dosimetry quality assurance programmes assessed by radiobiological modelling of data from the thermoluminescent dosimetry study of the European Organization for Research and Treatment of Cancer. Eur J Cancer 2000;36(5):615–20.
[93] Vantongelen K, Steward W, Blackledge G, Verweij J, Van Oosterom A. EORTC joint ventures in quality control: treatment-related variables and data acquisition in chemotherapy trials. Eur J Cancer 1991;27(2):201–7.
[94] Steward WP, Vantongelen K, Verweij J, Thomas D, van Oosterom AT. Chemotherapy administration and data collection in an EORTC collaborative group: can we trust the results? Eur J Cancer 1993;29A(7):943–7.
[95] Favalli G, Vermorken JB, Vantongelen K, Renard J, van Oosterom AT, Pecorelli S. Quality control in multicentric clinical trials. An experience of the EORTC gynecological cancer cooperative group. Eur J Cancer 2000;36(9):1125–33.
[96] Weiss RB, Vogelzang NJ, Peterson BA, Panasci LC, Carpenter JT, Gavigan M, et al. A successful system of scientific data audits for clinical trials. A report from the Cancer and Leukemia Group B. JAMA 1993;270(4):459–64.
[97] Bonadonna G, Valagussa P, Moliterni A, Zambetti M, Brambilla C. Adjuvant cyclophosphamide, methotrexate, and fluorouracil in node-positive breast cancer: the results of 20 years of follow-up. N Engl J Med 1995;332(14):901–6.
[98] Benedetti-Panici P, Greggi S, Colombo A, Amoroso M, Smaniotto D, Giannarelli D, et al. Neoadjuvant chemotherapy and radical surgery versus exclusive radiotherapy in locally advanced squamous cell cervical cancer: results from the Italian multicenter randomized study. J Clin Oncol 2002;20(1):179–88.
[99] Verweij J, Nielsen OS, Therasse P, van Oosterom AT. The use of a systemic therapy checklist improves the quality of data acquisition and recording in multicentre trials. A study of the EORTC soft tissue and bone sarcoma group. Eur J Cancer 2002;33(7):1045–9.
[100] Knatterud GL, Rockhold FW, George SL, Barton FB, Davis CE, Fairweather WR, et al. Guidelines for quality assurance in multicenter trials: a position paper. Control Clin Trials 1998;19(5):477–93.
[101] Begg CB, Carbone PP, Elson PJ, Zelen M. Participation of community hospitals in clinical trials: analysis of five years of experience in the Eastern Cooperative Oncology Group. N Engl J Med 1982;306(18):1076–80.
[102] Vantongelen K, Rotmensz N, van der Schueren E. Quality control of validity of data collected in clinical trials. EORTC Study Group on Data Management (SGDM). Eur J Cancer Clin Oncol 1989;25(8):1241–7.
[103] Hosking JD, Newhouse MM, Bagniewska A, Hawkins BS. Data collection and transcription. Control Clin Trials 1995;16(1):66S–103S.
[104] Hansen PS, Andersen E, Andersen KW, Mouridsen HT. Quality control of end results in a Danish adjuvant breast cancer multicenter study. Acta Oncol 1997;36(7):711–4.
[105] Gibson D, Harvey AJ, Everett V, Parmar MK. Is double data entry necessary? The CHART trials. CHART Steering Committee. Continuous, hyperfractionated, accelerated radiotherapy. Control Clin Trials 1994;15(6):482–8.
[106] Day BS, Fayers BS, Harvey BS. Double data entry: what value, what price? Control Clin Trials 1998;19(1):15–24.
[107] Kleinman K. Adaptive double data entry: a probabilistic tool for choosing which forms to reenter. Control Clin Trials 2001;22(1):2–12.
[108] Schaake-Koning C, Kirkpatrick A, Kroger R, van Zandwijk N, Bartelink H. The need for immediate monitoring of treatment parameters and uniform assessment of patient data in clinical trials. A quality control study of the EORTC Radiotherapy and Lung Cancer Cooperative Groups. Eur J Cancer 1991;27(5):615–9.
[109] Christian MC, McCabe MS, Korn EL, Abrams JS, Kaplan RS, Friedman MA. The National Cancer Institute audit of the National Surgical Adjuvant Breast and Bowel Project Protocol B-06.
[110] George SL. A survey of monitoring practices in cancer clinical trials. Stat Med 1993;12(5–6):435–50.
[111] Smith MA, Ungerleider RS, Korn EL, Rubinstein L, Simon R. Role of independent data-monitoring committees in randomized clinical trials sponsored by the National Cancer Institute. J Clin Oncol 1997;15(7):2736–43.
[112] Califf RM, Karnash SL, Woodlief LH. Developing systems for cost-effective auditing of clinical trials. Control Clin Trials 1997;18(6):651–60.
[113] Meunier F, van Oosterom AT. 40 years of the EORTC: the evolution towards a unique network to develop new standards of cancer care. European Organisation for Research and Treatment of Cancer. Eur J Cancer 2002;38(Suppl 4):S3–S13.
[114] Lardot C, Steward W, Van Glabbeke M, Armand JP. Scientific review of EORTC trials: the functioning of the new treatment committee and protocol review committee. Eur J Cancer 2002;38(Suppl 4):S24–30.

Biographies

P.B. Ottevanger is a medical oncologist, working as a senior staff member in the Department of Medical Oncology, University Medical Centre Nijmegen. She is also affiliated to the Centre for Research in Quality of Care of the University Medical Centre Nijmegen. Recent publications on quality of care:

Quality of adjuvant chemotherapy in primary breast cancer in a non-trial setting, a Comprehensive Cancer Centre study. P.B. Ottevanger, C.A. Verhagen, L.V. Beex. Eur J Cancer 1999;35(3):386–91.

Effects of quality of treatment on prognosis in primary breast cancer patients treated in daily practice. P.B. Ottevanger, P.H.M. De Mulder, R.P.T.M. Grol, H. Van Lier, L.V.A.M. Beex, the breast cancer study group in the Comprehensive Cancer Centre East of the Netherlands, in press.

Quality of adjuvant chemotherapy in primary breast cancer, a Comprehensive Cancer Center study. P.B. Ottevanger, L.V.A.M. Beex, P.G.M. Peer, A. v.d. Linden, L.J. Schouten and the breast cancer study group of the Comprehensive Cancer Center East Netherlands. Eur J Cancer 1994;30A(Suppl 2):68. (Rhône-Poulenc Rorer Oncology Award.)

Quality of chemotherapy for malignant epithelial ovarian carcinoma in a non-trial setting. P. Ottevanger, L. Beex, R. Grol, P. De Mulder. Abstract, ECCO 1999, Vienna. Eur J Cancer 1999;35(Suppl 4):S238–9.

Reassessment of adherence to a guideline for primary breast cancer, a Comprehensive Cancer Centre study. P. Ottevanger, P. De Mulder, L. Beex, A. Ruhl, R. Grol. Abstract, Eur J Cancer 2001;37(Suppl 6):S150.