Importance of sample-related effects for commutability testing according to the EP14 protocol

Clinica Chimica Acta 411 (2010) 1378–1379

Clinica Chimica Acta journal homepage: www.elsevier.com/locate/clinchim

Letter to the Editor

Dear Editor,

The EP14 guideline from the Clinical and Laboratory Standards Institute recommends that commutability testing of materials with 2 methods be done by measuring the materials in parallel with at least 20 native samples, in triplicate, preferably within 1 run [1]. The parameters influencing the statistical power in the detection of matrix effects have been described before [2]. We hypothesized that replication may not lead to the desired reduction in analytical variation because of the dominance of sample-related effects, and tested this hypothesis in a commutability study for serum total calcium.

We measured 30 serum samples and 3 proficiency testing (PT) materials with the Modular P o-cresolphthalein complexone assay (Roche Diagnostics GmbH) and by indirect potentiometry on the DXC880I analyzer (Beckman Coulter) (for details on the samples, see the Data Supplement to this article). The samples were measured in random order, 5 times within one run. The most representative singlicate and replicate (n = 2, 3 and 4) results were selected on the basis of the SD of the differences (for a description of the selection procedure, see the Data Supplement). The within-run CV (CVWR) was calculated by pooling the SDs of the 5 replicates for the 30 samples. The relationship between the results (Roche = x; Beckman Coulter = y) was determined by Deming regression analysis, and the residuals were calculated as the deviation of the measured y-value from that predicted by the regression. We use the term ‘observed’ SD of the regression residuals (SDRR) for the spread of the residuals, and ‘expected’ SDRR for the value calculated from the within-run SDs (SDWR) as $\sqrt{SD_{WR,x}^{2} + SD_{WR,y}^{2}}$. We applied the EP14 statistics for testing commutability but presented the results in a difference plot instead of a scatter plot [1]. The mean calcium concentration in the samples was 2.22 mmol/L. Both assays had a CVWR of 0.54%.
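The calculations described above can be sketched in a few lines of Python. This is a minimal sketch, not the software used in the study: the function names are hypothetical, the data in the usage lines are made up, the Deming fit assumes an error-variance ratio of 1 (plausible because both assays had the same CVWR, but an assumption nonetheless), and the observed SDRR is taken simply as the sample SD of the residuals.

```python
import math
from statistics import mean, stdev, variance


def pooled_sd(replicates_per_sample):
    """Pooled within-run SD: root of the mean per-sample variance.

    Assumes every sample has the same number of replicates (at least 2).
    """
    return math.sqrt(mean(variance(r) for r in replicates_per_sample))


def deming_fit(x, y, delta=1.0):
    """Deming regression of y on x.

    delta is the assumed ratio of the y-error variance to the x-error
    variance; delta = 1 gives orthogonal regression.
    """
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    diff = syy - delta * sxx
    slope = (diff + math.sqrt(diff ** 2 + 4 * delta * sxy ** 2)) / (2 * sxy)
    intercept = my - slope * mx
    return slope, intercept


def residuals(x, y, slope, intercept):
    """Deviation of each measured y from the y predicted by the regression."""
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def expected_sdrr(sd_wr_x, sd_wr_y, n=1):
    """Expected SDRR from within-run SDs only, for the mean of n replicates."""
    return math.sqrt((sd_wr_x ** 2 + sd_wr_y ** 2) / n)


# Usage with made-up calcium results in mmol/L (x = method 1, y = method 2).
x = [2.00, 2.10, 2.20, 2.30, 2.40]
y = [2.02, 2.09, 2.21, 2.31, 2.39]
slope, intercept = deming_fit(x, y)
observed_sdrr = stdev(residuals(x, y, slope, intercept))
```

Comparing `observed_sdrr` with `expected_sdrr(...)` for the same number of replicates is the comparison on which the rest of this letter rests.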
A CVWR of 0.54% at a mean concentration of 2.22 mmol/L gives an expected SDRR for singlicate measurements of 0.017 mmol/L (= √2 × 0.012 mmol/L). For the mean of n replicates it is reduced by a factor of √n, which comes to 0.0076 mmol/L for n = 5. The observed SDRR, however, decreased only from 0.0191 mmol/L (n = 1) to 0.0137 mmol/L (n = 5), and did not decrease significantly any further from n = 3 onwards. This corresponds to the presence of sample-related effects in the order of 0.011 mmol/L (∼65% of the imprecision component); for example, the expected SDRR (including sample-related effects) for 5 replicates would be $\sqrt{0.0076^{2} + 0.011^{2}} = 0.0134$ mmol/L (observed: 0.0137 mmol/L). On the other hand, 2 by 2 averaging of the results for samples with consecutive concentrations reduces the observed SDRR considerably for n ≥ 3 replicates. It becomes nearly the same as the expected SDRR from replication: for n = 3, the expected and observed SDRR become 0.0098 and 0.0095 mmol/L, respectively; for n = 5, both become 0.0076 mmol/L.

The influence of averaging the results for consecutive samples on the prediction intervals calculated according to the EP14 protocol is also documented in the graphical presentation of the data from our commutability study (Fig. 1A and B). The outer (short dashed line) and inner (long dashed line) prediction intervals are constructed from the replicates of the 30 non-averaged samples and of the samples averaged 2 by 2, respectively. Averaging (remaining sample size = 15) narrows the inner prediction interval; however, increasing the number of replicates from 3 to 5 does not significantly reduce the prediction intervals (compare A with B). Note that further averaging per 3 samples widens the prediction intervals again because of the low sample size (data not shown). Overall, the graph shows that the low and high PT samples are not commutable when tested with the Roche and Beckman Coulter assays.

Fig. 1. Difference plot with EP14 prediction intervals for 3 replicates (A) and 5 replicates (B). The crosses represent the differences of results for samples with consecutive concentrations averaged 2 by 2 (remaining sample size = 15). The outer (short dashed line) and inner (long dashed line) prediction intervals are constructed from the replicates of the 30 non-averaged samples and of the samples averaged 2 by 2, respectively. The closed triangles indicate the differences of the 3 materials investigated for commutability.

0009-8981/$ – see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.cca.2010.04.024

The data confirm our hypothesis, i.e. that the beneficial effect of replicate measurements in commutability testing according to the EP14 protocol is limited because sample-related effects become dominant. This effect may be particularly pronounced with highly precise assays, such as those used here for measurement of serum calcium. So far, our conclusions are concordant with those of Long [2]. His study of the parameters influencing the power of detection of matrix effects indeed showed that the natural variation of samples (= sample-related effects) may be more influential than other sources of error (the number of samples and the analytical variation of the test). However, whereas he discussed the beneficial effect on the power of increasing the number of samples and replicates, he did not propose a solution for the uncontrollable source of error caused by the patient samples. In this regard our study goes beyond the observation of the dominance of sample-related effects, in that we propose a solution: pooling of data from several samples. However, since a sample size ≥20 is desirable for the calculation of the regression prediction intervals, the drawback of our solution is that it requires at least 40 samples to obtain the same reliability of the prediction intervals as described in the EP14 document. Likewise, pooling of data per 3 samples would require 60 samples. Since this increases the measurement burden, we tried an alternative consisting of physical pooling of several samples. However, upon pooling 5 samples, we found that this practice risks introducing matrix effects. Some pools became turbid, and sometimes we were unable to recover the concentration predicted from the samples constituting the pool (data not shown).
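The figures quoted in this letter can be retraced with simple variance arithmetic. The sketch below assumes that imprecision and sample-related effects are independent and add as variances, which matches the combined value of 0.0134 mmol/L given earlier. Back-calculating the sample-related component from the observed and expected SDRR is one plausible route to a value near 0.011 mmol/L and is an assumption here, as are the function names.

```python
import math

# Values taken from the letter, in mmol/L.
SD_EXPECTED_SINGLE = 0.017  # expected SDRR for singlicate measurements
SD_SAMPLE = 0.011           # approximate size of the sample-related effects


def expected_sdrr(n):
    """Expected SDRR from imprecision alone for the mean of n replicates."""
    return SD_EXPECTED_SINGLE / math.sqrt(n)


def combined_sdrr(n, sd_sample=SD_SAMPLE):
    """Expected SDRR when sample-related effects add as an independent variance."""
    return math.sqrt(expected_sdrr(n) ** 2 + sd_sample ** 2)


def sample_component(observed_sdrr, n):
    """Assumed back-calculation of the sample-related SD from an observed SDRR."""
    return math.sqrt(observed_sdrr ** 2 - expected_sdrr(n) ** 2)


def samples_needed(pool_size, min_points=20):
    """Samples required so that data pooling still leaves min_points data points."""
    return pool_size * min_points


print(round(expected_sdrr(3), 4))             # 0.0098
print(round(expected_sdrr(5), 4))             # 0.0076
print(round(combined_sdrr(5), 4))             # 0.0134
print(round(sample_component(0.0137, 5), 3))  # 0.011
print(samples_needed(2), samples_needed(3))   # 40 60
```

Because the sample-related term does not shrink with n, `combined_sdrr(n)` levels off near 0.011 mmol/L however many replicates are run, whereas pooling data from several samples increases the number of samples needed in direct proportion to the pool size.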
Last but not least, we want to draw attention to the fact that, beyond the assessment of commutability, sample-related effects are important in all experimental protocols that intend to reduce analytical variability by replication of measurements, for example, when assessing the comparability of results between different analyzers within one health care system [3]. In such situations, it is generally recommended to investigate the advantages of pooling data from several individual samples versus increasing the number of replicates on single samples. Also, sample-related effects should be considered in total error and uncertainty calculations.

Supplementary materials related to this article can be found online at doi:10.1016/j.cca.2010.04.024.

Acknowledgments

The authors are indebted to V. Stove, Department of Clinical Chemistry, Ghent University Hospital (Belgium) and J. Van Nueten and R. Peeters, Analis (Namur, Belgium) for performing the measurements with the Roche and Beckman Coulter assays, respectively.

References

[1] Clinical and Laboratory Standards Institute (CLSI). Evaluation of matrix effects. Approved guideline EP14-A2. Wayne, Pennsylvania: CLSI; 2005. ISBN 1-56238-561-5.
[2] Long T. Statistical power in the detection of matrix effects. Arch Pathol Lab Med 1993;117:387–92.
[3] Clinical and Laboratory Standards Institute (CLSI). Verification of comparability of patient results within one health care system. Approved guideline C54-A. Wayne, Pennsylvania: CLSI; 2008. ISBN 1-56238-671-9.

Dietmar Stöckl
Hedwig C.M. Stepman
Sofie K. Van Houcke
Linda M. Thienpont⁎
Laboratory for Analytical Chemistry, Faculty of Pharmaceutical Sciences, Ghent University, Harelbekestraat 72, Ghent, Belgium
⁎Corresponding author. Tel.: +32 9 264 81 29; fax: +32 9 264 81 98.
E-mail address: [email protected] (L.M. Thienpont).

26 March 2010