Measurement 44 (2011) 1461–1467
Measurement journal homepage: www.elsevier.com/locate/measurement
Analysis of interlaboratory comparisons affected by correlations of the reference standards and drift of the travelling standards
Mercede Bergoglio ⇑, Andrea Malengo, Domenico Mari
Istituto Nazionale di Ricerca Metrologica – INRIM, Strada delle Cacce 91, Torino, Italy
Article info
Article history: Received 19 November 2010; received in revised form 5 April 2011; accepted 24 May 2011; available online 7 June 2011
Keywords: Comparison analysis; Drift evaluation; Pressure; Reference value

Abstract
The paper describes a method for analysing the results of interlaboratory comparisons among accredited laboratories. Such comparisons are used to achieve or to maintain accreditation under the ISO/IEC 17025 standard. The approach permits an evaluation of the reference value and of the degrees of equivalence when the participating laboratories are mutually dependent and the travelling standards are not stable during the comparison. As an example, the approach was applied to the SIT (Servizio di Taratura in Italia) interlaboratory comparisons carried out for pressure in the range from 10 MPa to 100 MPa (gage mode and liquid medium) and in the range from 0.15 MPa to 7 MPa (gage mode and gas medium).
© 2011 Elsevier Ltd. All rights reserved.
1. Introduction

In the framework of the CIPM MRA (Mutual Recognition Arrangement of the International Committee for Weights and Measures), key comparisons and supplementary comparisons play a fundamental role in establishing the basis for the international mutual recognition of the services provided by the National Metrology Institutes (NMIs) [1]. Interlaboratory comparisons are also widely used in the framework of the ILAC and EA mutual recognition arrangements (ILAC: International Laboratory Accreditation Cooperation), where they are performed by accredited laboratories in order to achieve or to maintain accreditation under the ISO/IEC 17025 standard. The major output of a measurement comparison is the reference value, which is determined by a statistical analysis of the measured values and the associated uncertainties stated by the participating laboratories. On the basis of this result the degree of equivalence, which expresses the level of agreement with the reference value, is determined for each laboratory.
⇑ Corresponding author (M. Bergoglio). Tel.: +39 0113919920; fax: +39 0113919926.
0263-2241/$ - see front matter © 2011 Elsevier Ltd. All rights reserved. doi:10.1016/j.measurement.2011.05.012
In recent years, efforts have been made to develop statistical procedures for evaluating the reference value objectively, so that it is a "good" value for the artefact being compared. As a general rule, a procedure has to be developed in accordance with the law of propagation of uncertainty, applying a suitable estimator to the model describing the reference value of the comparison. The guide [2] of the BIPM Director's Advisory Group on Uncertainties gives a general approach for evaluating the reference value of key comparisons. It includes two procedures, which can be used when the travelling standard is stable during the comparison and all of the laboratories' measurements are mutually independent. The first procedure (A) can be applied when each laboratory assigns a Gaussian probability density function (PDF) to the measurand. It uses the weighted mean as estimator, and the consistency of the data with the model is checked by a chi-square test. The second procedure (B) is used instead when not all of the laboratories' PDFs are Gaussian, or when the consistency test of the first procedure fails and the discrepant measurements cannot be corrected. In this case a more robust estimator, such as the median, can be chosen, and the algorithm is based on Monte Carlo simulation.
The recommendations of this guide are not mandatory, but they provide guidance that can be applied to most comparisons if appropriate modifications are made. The applicability of the procedures is limited by two conditions: the stability of the travelling standards and the mutual independence of the measurements. During a comparison all laboratories should measure identical standards, but travelling standards are often not perfectly stable, so the instabilities arising from the ageing of the standard and from its circulation have to be taken into account. The independence condition holds only when the reference value is determined from the results of laboratories that maintain an independent reference standard, as in a key comparison. In an interlaboratory comparison among accredited laboratories, on the other hand, the measurements of the laboratories are usually correlated. These considerations prompted us to make appropriate modifications to both procedures A and B of [2]. As an example, the modified approach was applied to the SIT (Servizio di Taratura in Italia) interlaboratory comparisons carried out for pressure in the range from 10 MPa to 100 MPa (gage mode and liquid medium) and in the range from 0.15 MPa to 7 MPa (gage mode and gas medium).
2. Evaluation of the drift of the travelling standard

In interlaboratory comparisons, one or more travelling standards are measured by all the participating laboratories, each using its own reference standard. Since these measurements cannot be taken at the same place and at the same time, the travelling standard is circulated among the participating laboratories, and the value of the travelling standard received by each laboratory can be affected by instabilities arising from transport and from the passage of time. Drift in the travelling standard directly affects the quality of the interlaboratory comparison, so this variability should be carefully evaluated. To monitor the stability of the travelling standard, the pilot laboratory performs repeated measurements over the duration of the comparison. From these measurements a correction compensating for the drift can be evaluated. This correction and its standard uncertainty, which depend on both the stability of the travelling standard and the long-term stability of the measurements, are included in the model describing the measurement process of the interlaboratory comparison.

Assume that, to monitor the stability of the travelling standard, the pilot laboratory, denoted L1 (which is also the linking laboratory), has performed $n$ measurements $m_{L1,k}$ ($k = 1, \ldots, n$): $m_{L1,1}$ at the beginning, $m_{L1,n}$ at the end, and the others at intervals during the comparison. These measurements delimit $n - 1$ loops, indexed by $j = 1, \ldots, n - 1$: $m_{L1,j}$ is the beginning value of the $j$th loop and $m_{L1,j+1}$ is its end value. From these measurements the mean value $\bar{m}_{L1}$ can be evaluated:

$$\bar{m}_{L1} = \frac{1}{n} \sum_{k=1}^{n} m_{L1,k} \tag{1}$$

A first evaluation of the stability of the travelling standard is based on visual inspection of the consistency between the measurements $m_{L1,k}$ and the mean value $\bar{m}_{L1}$. Sometimes the data show a linear drift; such cases have been treated with different models [3–5]. In most cases, however, the travelling standard shows non-linear drifts, which are probably caused by mechanical changes arising from the circulation. In these cases the uncertainty due to the instability could be estimated from the standard deviation of the measurements of the linking laboratory and included in the uncertainty budget of each participating laboratory. This approach, however, could mask real interlaboratory differences within loops with no drift, or suggest interlaboratory differences that are in fact caused by the instability of the standard. To evaluate the drift of the transfer standard better, two quantities can be evaluated:

– the drift within each loop, $\mathrm{drift}_j^{\mathrm{in}}$. The difference $|m_{L1,j} - m_{L1,j+1}|$ is assumed to be the maximum possible change due to drift within the $j$th loop, and the travelling standard is judged unstable if $|m_{L1,j} - m_{L1,j+1}| \geq 2u(m_{L1,j} - m_{L1,j+1})$. A contribution $\mathrm{drift}_j^{\mathrm{in}}$ due to the drift is then taken into account for the measurements of the laboratories belonging to the $j$th loop. Estimating the value of $\mathrm{drift}_j^{\mathrm{in}}$ to be zero, and neglecting the repeatability and reproducibility uncertainty terms of the pilot laboratory, the stability condition for the travelling standard is fulfilled if

$$u(\mathrm{drift}_j^{\mathrm{in}}) = \frac{|m_{L1,j} - m_{L1,j+1}|}{2} \tag{2}$$
– the drift $\mathrm{drift}_j$ of each loop $j$ with respect to $\bar{m}_{L1}$. This value is estimated by the difference between the mean value of the loop, $\bar{m}_{L1,j}$, and the mean value $\bar{m}_{L1}$:

$$\mathrm{drift}_j = \bar{m}_{L1,j} - \bar{m}_{L1} \tag{3}$$

where

$$\bar{m}_{L1,j} = \frac{m_{L1,j} + m_{L1,j+1}}{2} \tag{4}$$
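As an illustration, Eqs. (1)–(4) can be sketched in a few lines of Python with NumPy. This is a minimal sketch rather than the authors' implementation: the function name `evaluate_drift` and the pilot readings are hypothetical, chosen only to show how the loop quantities follow from the $n$ pilot measurements.

```python
import numpy as np

def evaluate_drift(m_pilot):
    """Apply Eqs. (1)-(4) to the n pilot-laboratory measurements m_L1,k."""
    m = np.asarray(m_pilot, dtype=float)
    mean_all = m.mean()                          # Eq. (1): mean of all n values
    loop_means = 0.5 * (m[:-1] + m[1:])          # Eq. (4): mean value of each loop
    drift = loop_means - mean_all                # Eq. (3): drift of each loop
    u_drift_in = 0.5 * np.abs(m[:-1] - m[1:])    # Eq. (2): in-loop drift uncertainty
    return mean_all, loop_means, drift, u_drift_in

# Hypothetical pilot readings (arbitrary units): n = 4 measurements, 3 loops
m_pilot = [10.0, 10.2, 10.1, 9.7]
mean_all, loop_means, drift, u_drift_in = evaluate_drift(m_pilot)
```

With $n$ pilot measurements the function returns $n - 1$ values for each loop quantity, one per loop.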
The uncertainties $u(\bar{m}_{L1})$, $u(\mathrm{drift}_j)$ and $u(\bar{m}_{L1,j})$ are evaluated from the uncertainties $u(m_{L1,k})$, taking into account the estimated covariance $u(m_{L1,r}, m_{L1,s})$ ($r, s = 1, \ldots, n$; $r \neq s$) associated with $m_{L1,r}$ and $m_{L1,s}$. Since the measurements $m_{L1,k}$ are all traceable to the same primary standard of laboratory L1, they are not independent: they share a common component due to the uncertainty $u_{L1}(\mathrm{std})$ of the primary standard. As a result, the estimated covariance is $u(m_{L1,r}, m_{L1,s}) = u_{L1}^2(\mathrm{std})$, and the application of the law of propagation of uncertainty [6] gives

$$u(\bar{m}_{L1,j}) = \sqrt{\frac{u^2(m_{L1,j})}{4} + \frac{u^2(m_{L1,j+1})}{4} + 2 \cdot \frac{1}{2} \cdot \frac{1}{2}\, u_{L1}^2(\mathrm{std})} = \sqrt{\frac{u^2(m_{L1,j}) + u^2(m_{L1,j+1})}{4} + \frac{u_{L1}^2(\mathrm{std})}{2}} \tag{5}$$

$$u(\bar{m}_{L1}) = \sqrt{\frac{1}{n^2} \sum_{k} u^2(m_{L1,k}) + n(n-1)\, \frac{1}{n}\, \frac{1}{n}\, u_{L1}^2(\mathrm{std})} = \sqrt{\frac{1}{n^2} \sum_{k} u^2(m_{L1,k}) + \frac{n-1}{n}\, u_{L1}^2(\mathrm{std})} \tag{6}$$

and
$$u(\mathrm{drift}_j) = \sqrt{\left(\frac{n-2}{2n}\right)^2 \left[u^2(m_{L1,j}) + u^2(m_{L1,j+1})\right] + \frac{1}{n^2} \sum_{k \neq j,\; k \neq j+1} u^2(m_{L1,k}) - \frac{n-2}{2n}\, u_{L1}^2(\mathrm{std})} \tag{7}$$

As will be shown in the following section, the comparison is analysed by linking the measurements of the other laboratories to the mean value $\bar{m}_{L1}$. Since the purpose of a comparison is to estimate the degrees of equivalence, which involve differences between measurements that have been corrected for the drift, the consistency of these results does not need to be analysed.

3. The model

Assume that there are $f$ participating laboratories $L_i$ (including the linking laboratory), divided into loops as described above, and that a laboratory belonging to loop $j$ assigns a value $m_{Li,j}$ to the travelling standard. Taking into account the drift of the travelling standard as evaluated by laboratory L1, the measurement of each participating laboratory can be linked to the mean value $\bar{m}_{L1}$. The equation describing the measurement $m_{Li}$ of each laboratory is then

$$m_{Li} = m_{Li,j} + \mathrm{drift}_j^{\mathrm{in}} - \mathrm{drift}_j, \qquad i = 2, \ldots, f \tag{8}$$

whereas for the linking laboratory L1 the value claimed for the comparison is $m_{L1} = \bar{m}_{L1}$. Since the corrections for the drift of the travelling standard can be assumed to be independent of each other and of the measurements $m_{Li,j}$, the uncertainty of $m_{Li}$ is calculated from

$$u(m_{Li}) = \sqrt{u^2(m_{Li,j}) + u^2(\mathrm{drift}_j^{\mathrm{in}}) + u^2(\mathrm{drift}_j)}, \qquad i = 2, \ldots, f \tag{9}$$

where $u^2(m_{Li,j})$ is the variance of the measurement as reported by laboratory $L_i$; for the linking laboratory, $u(m_{L1}) = u(\bar{m}_{L1})$.

4. Evaluation of the correlations

To evaluate the reference value of the comparison, the correlations between the laboratories' measurements have to be evaluated, in addition to the laboratory results and the corrections for the drift of the travelling standard with their associated uncertainties. The correlations between the measurements of the participating laboratories are due to common uncertainty components arising from systematic effects, and follow from these considerations:

– The correlation among the laboratories depends on the common reference standard to which they are traceable. In general, a laboratory is independent of the other laboratories only when it has its own reference standard or is traceable to a laboratory not participating in the comparison.
– The measurements $m_{Li,j}$ belonging to the same loop $j$ have the same correction $\mathrm{drift}_j$ with the same uncertainty $u(\mathrm{drift}_j)$.

As a result, the covariance terms $u(m_{Lr}, m_{Ls})$ between the measurements of two different laboratories $r$ and $s$ are evaluated according to the following cases:

(a) The laboratories are in the same loop $j$ and are traceable to the same reference standard having uncertainty $u(\mathrm{std})$:

$$u(m_{Lr}, m_{Ls}) = u^2(\mathrm{drift}_j) + u^2(\mathrm{std})$$

(b) The laboratories are in the same loop $j$ and are independent:

$$u(m_{Lr}, m_{Ls}) = u^2(\mathrm{drift}_j)$$

(c) The laboratories are in different loops and are traceable to the same reference standard having uncertainty $u(\mathrm{std})$:

$$u(m_{Lr}, m_{Ls}) = u^2(\mathrm{std})$$

(d) The laboratories are in different loops and are independent:

$$u(m_{Lr}, m_{Ls}) = 0$$

The covariance terms can be arranged in the covariance matrix $w$, which is given by

$$w = \begin{bmatrix} u(m_{L1}, m_{L1}) & \cdots & u(m_{L1}, m_{Lf}) \\ \vdots & \ddots & \vdots \\ u(m_{Lf}, m_{L1}) & \cdots & u(m_{Lf}, m_{Lf}) \end{bmatrix} \tag{10}$$

where the terms $u(m_{Lr}, m_{Ls})$ with $r = s$ on the main diagonal are the variances, i.e. $u(m_{Lr}, m_{Lr}) = u^2(m_{Lr})$.
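The chain from Eq. (7) to the covariance matrix of Eq. (10) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the labels "A" and "B" for the reference standards and all numerical values are hypothetical. The sketch also assumes that the linking laboratory belongs to no loop (`None`), so it is correlated with the other laboratories only through a shared reference standard.

```python
import numpy as np

def u_drift(u_pilot, j, u_std):
    """Eq. (7): standard uncertainty of drift_j (j is a 0-based loop index)."""
    u2 = np.square(np.asarray(u_pilot, dtype=float))
    n = u2.size
    c = (n - 2) / (2 * n)
    ends = u2[j] + u2[j + 1]          # the two measurements delimiting loop j
    others = u2.sum() - ends          # all remaining pilot measurements
    return np.sqrt(c**2 * ends + others / n**2 - c * u_std**2)

def linked_value(m_lab, u_lab, drift_j, u_drift_j, u_drift_in_j):
    """Eqs. (8) and (9), with drift_in estimated as zero."""
    m = m_lab - drift_j
    u = np.sqrt(u_lab**2 + u_drift_in_j**2 + u_drift_j**2)
    return m, u

def covariance_matrix(u_m, loop, standard, u_std, u_drift_by_loop):
    """Eq. (10), with off-diagonal terms filled according to cases (a)-(d)."""
    f = len(u_m)
    w = np.diag(np.square(np.asarray(u_m, dtype=float)))
    for r in range(f):
        for s in range(r + 1, f):
            cov = 0.0
            if loop[r] is not None and loop[r] == loop[s]:
                cov += u_drift_by_loop[loop[r]] ** 2   # shared drift correction
            if standard[r] == standard[s]:
                cov += u_std[standard[r]] ** 2         # shared reference standard
            w[r, s] = w[s, r] = cov
    return w

# Hypothetical example: linking laboratory plus three participants
m_i, u_i = linked_value(100.5, 0.3, drift_j=0.2, u_drift_j=1.2, u_drift_in_j=0.4)
w = covariance_matrix(
    u_m=[0.2, 1.0, 1.0, 2.0],
    loop=[None, 1, 1, 2],
    standard=["A", "A", "B", "A"],
    u_std={"A": 0.1, "B": 0.3},
    u_drift_by_loop={1: 0.2, 2: 0.2},
)
```

The two `if` branches add up to the four cases: both true gives (a), only the loop test gives (b), only the standard test gives (c), and neither gives (d).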
5. Evaluation of the reference value

As previously indicated, the reference value of the comparison was calculated following the two procedures A and B proposed in [2]. These procedures have been modified so that they can be applied even when the participating laboratories are correlated and the measurement model (Eq. (8)) takes the drift of the travelling standard into account. In procedure A, which is the preferred procedure, the reference value is calculated as a weighted mean, and a chi-square test is then performed to check the consistency of the data. When the test does not fail, this method provides a statistically supported reference value with the smallest uncertainty. When the chi-square test fails because of discrepant measurements, procedure B is adopted. This procedure is based on the Monte Carlo method, in which the distribution of the reference value is determined by propagating simulated PDFs representing the results of each participating laboratory. It uses a robust estimator, such as the median, which is less sensitive to outliers than the weighted mean and does not require the exclusion of data. The modifications of these procedures are described in the following.

5.1. Evaluation of the reference value as weighted mean (procedure A modified)

The weighted mean value $m_w$ is determined by using the general Gauss–Markov theorem, in matrix notation [8]:

$$m_w = (C^T w^{-1} C)^{-1} C^T w^{-1} M \tag{11}$$

and the associated uncertainty is given by

$$u(m_w) = \sqrt{(C^T w^{-1} C)^{-1}} \tag{12}$$

where $M = (m_{L1}, m_{L2}, \ldots, m_{Lf})^T$ is the column vector of the measurements and $C$, with $C^T = (1, 1, \ldots, 1)$, is the $f \times 1$ design matrix (the symbol $T$ denotes the transpose). The consistency of the results is checked by the chi-square test

$$\chi^2_{obs} = (M - m_w)^T w^{-1} (M - m_w) \tag{13}$$

with $\nu = f - 1$ degrees of freedom, where $m_w$ is subtracted from each element of $M$. The consistency check is considered as failing if $\Pr\{\chi^2(\nu) > \chi^2_{obs}\} < 0.05$. If the chi-square test does not fail, the value $m_w$ is accepted as the reference value $m_{ref}$ of the comparison.

5.2. Evaluation of the reference value with the median (procedure B modified)

As mentioned, the Monte Carlo method is used to evaluate the reference value, as described in [2]. The Monte Carlo method is a numerical approach based on random numbers and statistical simulation. In general, if a mathematical model $Y = f(X_1, \ldots, X_f)$ expresses the dependence of the output quantity $Y$ on the input quantities $X_i$, the Monte Carlo method propagates the PDFs of the input quantities $X_i$ through the model, providing the PDF of the output quantity $Y$. The appropriate PDFs (rectangular, Gaussian, multivariate Gaussian, Student's t, etc.) are assigned on the basis of the information available about those quantities [7]. The algorithm consists of sampling at random, from the PDFs of the quantities $X_i$, $Q$ vectors $x_r = (x_{r1}, \ldots, x_{rf})^T$ ($r = 1, \ldots, Q$), where $Q$ is the number of Monte Carlo trials. Appropriate sampling algorithms are available in the literature and are implemented in some software packages. For each vector $x_r$ the corresponding $y_r = f(x_r)$ is then evaluated, and these $Q$ values form the vector $y = (y_1, \ldots, y_Q)$. This vector describes the PDF of $Y$, from which characteristic parameters of the distribution, such as the mean value and the standard deviation, can be estimated. As in [2] we use the median estimator

$$Y_r = \mathrm{med}_r = \mathrm{median}(x_{r1}, \ldots, x_{rf}) \tag{14}$$

but, unlike that procedure, the correlations are taken into account. For this purpose the input quantities $X_i$ are estimated by the measurements $m_{Li}$ performed by the laboratories, and they are considered to have a joint multivariate Gaussian PDF with covariance matrix $w$ (Eq. (10)). The procedure [7] consists of forming $Q$ vectors $z_r = (z_1, \ldots, z_f)^T$ whose elements are sampled from the standard Gaussian distribution $N(0, 1)$, and transforming them into samples of the multivariate Gaussian PDF $N(M, w)$, where $M = (m_{L1}, \ldots, m_{Lf})^T$ is the vector of the measurements and $w$ is the covariance matrix: $x_r = M + R^T z_r$, where $R$ is the upper triangular matrix given by the Cholesky decomposition $w = R^T R$. The $Q$ values $\mathrm{med}_r$ form a vector $MED = (\mathrm{med}_1, \ldots, \mathrm{med}_Q)$, which describes the PDF of the median. The mean of the values in $MED$ is taken as the reference value $m_{ref}$ of the comparison, and their standard deviation is the associated standard uncertainty $u(m_{ref})$. Since the median typically gives rise to an asymmetric distribution, the shortest coverage interval at the 95% level of confidence is used for $m_{ref}$ [7].

6. Degrees of equivalence

The degree of equivalence [1] of each participant with respect to the reference value is determined from the difference $\Delta m_{Li} = m_{Li} - m_{ref}$, and its associated uncertainty is given by

$$u(\Delta m_{Li}) = \sqrt{u^2(m_{Li}) - u^2(m_{ref})} \tag{15}$$

when the reference value is evaluated as the weighted mean. When the reference value is defined through the median, the associated uncertainty is calculated by forming, for each laboratory $L_i$, a vector $r$:

$$r = (\text{row } L_i \text{ of } P_{i,MC}) - MED \tag{16}$$

where $P_{i,MC} = (x_1, \ldots, x_Q)$ is an $f \times Q$ matrix. This vector $r$ describes the PDF of $\Delta m_{Li}$, from which the standard deviation is taken as the standard uncertainty
associated with $\Delta m_{Li}$, and the shortest 95% coverage interval can be constructed.

To evaluate the degree of equivalence between two laboratories, denoted Ld and Lg, the difference $\Delta m_{Ld,Lg}$ is calculated:

$$\Delta m_{Ld,Lg} = m_{Ld} - m_{Lg}$$

Denoting by $u(m_{Ld}, m_{Lg})$ the covariance between $m_{Ld}$ and $m_{Lg}$ (Eq. (10)), the uncertainty of the difference $\Delta m_{Ld,Lg}$ is given by

$$u(\Delta m_{Ld,Lg}) = \sqrt{u^2(m_{Ld}) + u^2(m_{Lg}) - 2u(m_{Ld}, m_{Lg})} \tag{17}$$

7. Pressure comparison example

The method presented here was applied to two SIT interlaboratory comparisons in pressure metrology: from 10 MPa to 100 MPa in gage mode with liquid medium, and from 0.15 MPa to 7 MPa in gage mode with gas medium.

7.1. General remarks on the comparison

Both comparisons were piloted by INRIM (the only NMI participating in the comparisons). Nineteen laboratories participated in liquid medium and 16 laboratories in gas medium (INRIM included). The transfer standards were precision digital quartz transducers with full-scale ranges of 100 MPa and 10 MPa. The comparison in liquid medium lasted 20 months, while the comparison in gas medium was concluded within 1 year. The stability of the transfer standards, evaluated by using Eq. (2), was similar for both gages: about one part in $10^4$ at low pressure, and a few parts in $10^6$ at full scale (100 MPa and 7 MPa respectively). Following the protocol, three complete calibration cycles were performed. At each pressure value the participating laboratories had to provide the pressure measured by their reference standard, $p_{ref,Li}$, and the corresponding reading of the gage, $p_{g,Li}$, appropriately corrected for the zero reading. In both comparisons most laboratories were traceable to INRIM, some were traceable to other participating laboratories which were in turn traceable to INRIM, and some were traceable to other NMIs. Both comparisons were divided into five loops, so that INRIM performed six measurements on the travelling standard.

7.2. Results

In the following, as an example, the results obtained at two pressure points, 10 MPa and 20 MPa (increasing pressure), in liquid medium are considered. Table 1 shows the INRIM mean values (Eq. (1)) for the points at 10 MPa and 20 MPa. On the basis of these results, and taking into account the uncertainty of the INRIM reference standard ($u_{L1}(\mathrm{std})$ = 0.11 kPa at 10 MPa and $u_{L1}(\mathrm{std})$ = 0.21 kPa at 20 MPa), the drifts $\mathrm{drift}_j^{\mathrm{in}}$ and $\mathrm{drift}_j$ and their associated uncertainties were evaluated. These values, which are also presented in Table 1, were used to correct the results of the participating laboratories (Tables 2 and 3). The largest drift was detected in the third loop; nevertheless, the drifts in the first two loops at 10 MPa are also not negligible.

Table 4 shows, as an example, the covariance matrix obtained for the pressure point at 10 MPa. Most of the participating laboratories (16) are traceable to INRIM (no. 1), and only three laboratories (nos. 6, 11 and 20) are traceable to other NMIs (PTB and EIM). Laboratories no. 11 and no. 20, both traceable to PTB, are correlated with each other, whereas laboratory no. 6, which is traceable to EIM, is independent in this context. Further correlations arise because four laboratories are traceable to SIT laboratories participating in the comparison (which are in turn traceable to INRIM): laboratories nos. 8, 9 and 10 are traceable to laboratory no. 4, and laboratory no. 7 is traceable to laboratory no. 18. The covariance terms due to the drift of the travelling standard are negligible compared with the uncertainties claimed by the participants.

The weighted means (modified procedure A) and their associated uncertainties were evaluated by applying Eq. (11); the results are shown in Table 5. The consistency test passed for the point at 20 MPa but failed at 10 MPa. For this point, since it was not possible to remove or correct the discrepant measurements, the modified procedure B was applied; the reference value obtained is shown in Table 6. Tables 7 and 8 show the degrees of equivalence with respect to the reference value for the points at 10 MPa and 20 MPa respectively. Since in this comparison the INRIM uncertainty is much lower than those of the other participating laboratories, the estimated reference values and their associated uncertainties are close to the INRIM results.

Table 1
INRIM mean values and drift for each loop.

Pressure    mean value of L1/kPa    u(mean value of L1)/kPa
10 MPa      10001.36                0.18
20 MPa      20001.69                0.25

            10 MPa                                          20 MPa
Loop        drift/kPa   u(drift)/kPa   u(drift_in)/kPa      drift/kPa   u(drift)/kPa   u(drift_in)/kPa
1           1.90        0.20           0.36                 1.73        0.18           0.12
2           1.69        0.22           0.58                 1.74        0.22           0.11
3           0.23        0.21           1.34                 0.15        0.21           1.78
4           1.71        0.16           0.13                 1.78        0.15           0.15
5           1.67        0.16           0.17                 1.58        0.16           0.05

Table 2
Calibration results of the participating laboratories at 10 MPa, $\Delta m_{Li,j} = m_{Li,j} - 10{,}000$ kPa. The laboratories are listed in the order of the five loops; the column spans of the loop headings (1 to 5) are not reproduced.

Lab     Δm_Li,j/kPa     u(m_Li,j)/kPa
2       0.22            1.6
3       0.29            1.2
4       1.5             1.1
5       1.2             2.9
6       0.13            5.0
7       3.4             5.0
8       1.9             10
9       1.8             2.9
10      0.33            2.1
11      2.6             1.2
12      2.5             3.4
13      1.6             1.0
14      0.08            0.69
15      1.15            0.57
16      1.26            0.61
17      0.61            0.58
18      1.0             0.50
19      3.6             0.63
20      3.4             0.52

Table 3
Calibration results of the participating laboratories at 20 MPa, $\Delta m_{Li,j} = m_{Li,j} - 20{,}000$ kPa. The laboratories are listed in the order of the five loops; the column spans of the loop headings (1 to 5) are not reproduced.

Lab     Δm_Li,j/kPa     u(m_Li,j)/kPa
2       0.3             3.2
3       0.5             2.3
4       1.6             1.4
5       1.9             3.3
6       0               10
7       5               10
8       2               20
9       2.6             5.3
10      0.0             4.3
11      3.3             1.7
12      4.5             3.4
13      2.0             2.0
14      0.67            0.81
15      2.44            0.84
16      2.13            0.91
17      1.2             1.0
18      1.58            0.67
19      4.2             1.1
20      4.46            0.79

Table 4
Covariance matrix for the pressure point at 10 MPa, in kPa to be multiplied by 108. The grey areas show the groups of the laboratories in the same loop. (Matrix entries not reproduced.)

Table 5
Reference value evaluated with the modified procedure A (estimator: weighted mean).

Pressure    Reference value/kPa      Uncertainty/kPa       Consistency check                          χ² test
10 MPa      m_w = 10,001.10          u(m_w) = 0.16         Pr{χ²(19) > χ²_obs = 62.7} < 0.05          fail
20 MPa      m_ref = 20,001.66        u(m_ref) = 0.23       Pr{χ²(19) > χ²_obs = 30.03} not < 0.05     pass
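The modified procedure A of Section 5.1, whose results are reported in Table 5, and the degrees of equivalence of Eqs. (15) and (17) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names and the two-laboratory data are hypothetical. The tail probability of the chi-square distribution is computed here with a plain series expansion of the incomplete gamma function, which is a choice of this sketch (a statistics library could be used instead).

```python
import math
import numpy as np

def weighted_mean(m, w):
    """Eqs. (11)-(13) for measurements m with full covariance matrix w."""
    m = np.asarray(m, dtype=float)
    w = np.asarray(w, dtype=float)
    ones = np.ones(m.size)                            # design matrix C
    var = 1.0 / (ones @ np.linalg.solve(w, ones))     # (C^T w^-1 C)^-1
    m_w = var * (ones @ np.linalg.solve(w, m))        # Eq. (11)
    res = m - m_w
    chi2_obs = res @ np.linalg.solve(w, res)          # Eq. (13)
    return m_w, math.sqrt(var), chi2_obs              # sqrt(var) is Eq. (12)

def chi2_tail(x, dof):
    """Pr{chi2(dof) > x}, from the series of the regularized incomplete gamma."""
    a, z = dof / 2.0, x / 2.0
    if z <= 0.0:
        return 1.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    log_p = a * math.log(z) - z - math.lgamma(a) + math.log(total)
    return 1.0 - math.exp(log_p)

def degrees_of_equivalence(m, w, m_w, u_mw):
    """Differences from the weighted mean and their uncertainties, Eq. (15)."""
    delta = np.asarray(m, dtype=float) - m_w
    u_delta = np.sqrt(np.diag(w) - u_mw**2)
    return delta, u_delta

def pair_uncertainty(w, d, g):
    """Eq. (17): uncertainty of the difference between laboratories d and g."""
    return math.sqrt(w[d, d] + w[g, g] - 2.0 * w[d, g])

# Hypothetical two-laboratory example with correlated results
m = [0.0, 2.0]
w = np.array([[1.0, 0.5], [0.5, 1.0]])
m_w, u_mw, chi2_obs = weighted_mean(m, w)
consistent = chi2_tail(chi2_obs, dof=len(m) - 1) >= 0.05
delta, u_delta = degrees_of_equivalence(m, w, m_w, u_mw)
```

Applied to the observed values in Table 5 with 19 degrees of freedom, `chi2_tail` gives a probability below 0.05 for 62.7 and slightly above 0.05 for 30.03, in line with the fail and pass outcomes reported there.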
Table 6
Reference value evaluated with the modified procedure B (estimator: median), pressure 10 MPa.

m_ref/kPa       u(m_ref)/kPa    Lower limit/kPa     Upper limit/kPa
10,000.89       0.56            9999.72 (2.4%)      10001.85 (97.4%)
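The modified procedure B of Section 5.2, which produced the values in Table 6, can be sketched in the same way. This is an illustrative sketch under stated assumptions, not the authors' code: the function names, the three-laboratory data, the number of trials and the random seed are all hypothetical. Note that `numpy.linalg.cholesky` returns the lower triangular factor, so its transpose is used as the upper triangular matrix $R$ with $w = R^T R$.

```python
import numpy as np

def sample_laboratories(m, w, trials, rng):
    """Draw `trials` vectors x_r = M + R^T z_r from N(M, w)."""
    m = np.asarray(m, dtype=float)
    upper = np.linalg.cholesky(np.asarray(w, dtype=float)).T   # R, with w = R^T R
    z = rng.standard_normal((trials, m.size))                  # rows are z_r^T
    return m + z @ upper                                       # rows are x_r^T

def shortest_interval(samples, p=0.95):
    """Shortest interval containing a fraction p of the sorted samples."""
    s = np.sort(samples)
    k = int(np.ceil(p * s.size))               # number of samples to enclose
    widths = s[k - 1:] - s[: s.size - k + 1]   # width of every candidate window
    i = int(np.argmin(widths))
    return s[i], s[i + k - 1]

# Hypothetical three-laboratory example; labs 1 and 2 are correlated
m = [9.0, 10.0, 11.0]
w = np.array([[0.04, 0.01, 0.00],
              [0.01, 0.04, 0.00],
              [0.00, 0.00, 0.09]])
rng = np.random.default_rng(12345)
x = sample_laboratories(m, w, trials=100_000, rng=rng)

med = np.median(x, axis=1)            # Eq. (14): one median per trial (vector MED)
m_ref = med.mean()                    # reference value
u_ref = med.std(ddof=1)               # its standard uncertainty
low, high = shortest_interval(med)    # shortest 95% coverage interval

r = x - med[:, None]                  # Eq. (16): column i describes Delta m_Li
u_doe = r.std(axis=0, ddof=1)         # standard uncertainty of each Delta m_Li
doe = np.asarray(m) - m_ref           # Delta m_Li = m_Li - m_ref
```

Here the trials are stored as rows, so the array `x` is the transpose of the $f \times Q$ matrix $P_{i,MC}$ of Eq. (16).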
Table 7
Degree of equivalence at 10 MPa.

Lab     Δm_Li/kPa       U(Δm_Li)/kPa
1       0.5             1.1
2       1.2             3.3
3       0.7             2.6
4       2.5             2.6
5       2.2             5.9
6       0.7             9.9
7       4               10
8       3               20
9       0.7             6.2
10      1.5             5.0
11      1.5             3.7
12      0.0             6.6
13      1.0             2.1
14      2.7             1.8
15      1.4             1.6
16      1.3             1.6
17      2.0             1.6
18      1.5             1.5
19      1.1             1.6
20      0.8             1.5

Table 8
Degree of equivalence at 20 MPa.

Lab     Δm_Li/kPa       U(Δm_Li)/kPa
1       0.02            0.14
2       0.2             6.3
3       0.4             4.5
4       1.6             2.7
5       2.0             6.6
6       0.2             20
7       5.0             20
8       3.7             40
9       0.8             11
10      1.8             9.6
11      1.5             5.4
12      1.0             6.8
13      1.5             4.0
14      2.8             1.6
15      0.8             1.7
16      1.1             1.8
17      2.1             2.1
18      1.7             1.3
19      1.0             2.1
20      1.2             1.6

8. Conclusions

A method for analysing interlaboratory comparisons has been presented. The approach makes it possible to evaluate the reference value and the degrees of equivalence when the participating laboratories are mutually dependent and the travelling standards are not stable during the comparison. The drifts of the travelling standard, their associated uncertainties and the consequent correlations are evaluated from the measurements carried out by a linking laboratory and are taken into account in a statistical model. The procedure is based on the method proposed by Cox [2], which was adapted to take into account the correlations arising from the comparison model.

References

[1] Guideline for CIPM Comparison, Appendix F to MRA, March 1999 (http://www.bipm.org).
[2] M.G. Cox, The evaluation of key comparison data, Metrologia 39 (2002) 589–595.
[3] N.F. Zhang, H.K. Liu, N. Sedransk, W.E. Strawderman, Statistical analysis of key comparisons with linear trends, Metrologia 41 (2004) 231–237.
[4] C. Elster, W. Wöger, M.G. Cox, Analysis of key comparison data: unstable travelling standards, Measurement Techniques 48 (9) (2005) 883–893.
[5] A.V. Stepanov, Comparing algorithms for evaluating key comparisons with linear drift in the reference standard, Measurement Techniques 50 (10) (2007) 1019–1027.
[6] BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP, OIML, Evaluation of Measurement Data – Guide to Expression of Uncertainty in Measurement, JCGM 100, 2008.
[7] BIPM, IEC, IFCC, ILAC, ISO, IUPAC, IUPAP, OIML, Evaluation of Measurement Data – Supplement 1 to the "Guide to Expression of Uncertainty in Measurement" – Propagation of Distributions using a Monte Carlo Method, JCGM 101, 2008.
[8] J. Beck, K. Arnold, Parameter Estimation in Engineering and Science, John Wiley and Sons, 1977.