Polymer Testing 7 (1987) 239-267
Precision in Polymer Testing: An Important World-Wide Issue
A. G. Veith
Uniroyal-Goodrich Tire Co., Research and Development Center, Brecksville, Ohio 44141, USA
(Received 14 February 1987; accepted 11 March 1987)
SUMMARY
There is no question about the importance of testing in the modern technologically oriented world. We strongly rely on it. However, the quality of testing, especially on a laboratory-to-laboratory basis, is often marginal or poor. This poor test quality, or lack of agreement in test results, often upsets the commercial, scientific and regulatory efforts of society. Test method precision is a 'figure of merit' for various tests and the issue of obtaining valid precision parameters and clearly expressing the precision results is currently being addressed by standardization organizations such as ASTM and ISO. This paper gives an overview of the action on precision assessment currently being conducted in ASTM Committee D11 and ISO/TC-45, both of these being devoted to Rubber and Rubber Product testing. Definitions for precision, i.e. repeatability and reproducibility, are given in addition to details on the organization of interlaboratory programs needed to acquire the basic precision data. A precision example using the Mooney viscosity test is presented as well as a discussion of the uses of precision parameters and the problems currently being encountered in this work.
Polymer Testing 0142-9418/87/$03.50 © Elsevier Applied Science Publishers Ltd, England, 1987. Printed in Northern Ireland
1 INTRODUCTION
As society comes to rely more and more on a complex technological base, the role of testing becomes ever more important. The results of testing are an essential part of the communication between various
levels of the commercial, technological and scientific community: scientists, engineers, industrial buyers and sellers, regulators and the regulated members of society. In the USA it is estimated by the National Bureau of Standards that testing and measuring of all kinds represents 6% of the gross national product (Ref. 1). In 1977, the United States Federal Government alone spent 690 million dollars on the collection of data, 43% of this being generated by or for the environmental agencies (Ref. 2). Technical decisions are based on test results; the quality of the test results directly influences the quality of the decisions in the conduct of scientific and commercial transactions as well as governmental regulation activities. If laboratories or testing stations cannot achieve a certain specified level of test result agreement, all of the above transactions seriously suffer. Very frequently the quality of data cannot be determined by simple review. Specific interlaboratory test programs, together with an in-depth analysis of the test system to discover potential sources of discrepancy, are required to determine quality.
As international trade and commerce increase, those standardization organizations that develop test method and specification standards have made policy decisions that all test method standards shall have, as part of the standard, a section on the typical precision that the user of the standard can expect to obtain. The American Society for Testing and Materials, ASTM, made such a decision in 1976. The International Organization for Standardization, ISO, based in Geneva, Switzerland, has taken an active role in test method precision and approximately 30 Technical Committees (TC) of ISO have also made policy decisions to include typical precision results in (test method) standards. Test method precision is obtained by interlaboratory test programs.
These are organized in a definite and structured manner and procedures for data analysis and test method expression have been developed. Some degree of interlaboratory testing has been conducted since laboratories and technical trade first came into widespread use. The first comprehensive interlaboratory tests were part of the action to introduce standardization into technology. In 1884 the American Society of Mechanical Engineers began conducting interlab testing and the results were extremely disheartening: a mass of poor and conflicting data. In this time period the American Society of Civil Engineers and the American Institute of Mining Engineers were undertaking similar actions on interlaboratory testing and the development of industry-wide standards and test methods. In Europe at about the same time, similar activity was taking place in Germany. A voluntary standardization organization was created as the result of an international conference held in Munich in 1884, devoted mainly to mechanical engineering and its allied fields (Ref. 3).
This review will concentrate on current activity in the oldest segment of the polymer industry, namely rubber manufacturing, and the testing that is an integral part of this industry. The pace of activity in precision assessment has increased over the past 15 years as its importance became apparent. The review will deal mainly with work done in North America in ASTM Committee D11 (Rubber), but recent work in ISO Technical Committee 45 (Rubber, Rubber Products), TC45, will also be reviewed. International standard harmonization has been achieved in these two rubber testing committees; the terminology, calculation algorithms and mode of precision expression are the same for both. The guidelines and procedures are in two newly developed standards: ASTM D4483 'Standard Practice for Rubber--Determining Precision for Test Method Standards' and ISO Technical Report 9272 'Rubber and Rubber Products--Determination of Precision for Test Method Standards' (see Refs 4 and 5).
2 THE MEASUREMENT PROCESS
Prior to discussing details or case histories of precision assessment, a brief general background on the measurement process is needed. First, a measurement system must exist. This comprises: (1) a sample selection procedure; (2) a controlled operation in using the apparatus, reagents, etc., which includes instrument calibration and maintenance as well as operator training; (3) established procedures for custody of samples and recording and reporting of test results; and (4) of great importance, the control of laboratory bias, which is explained below.
The two measurement concepts of accuracy and precision are often used interchangeably; they should not be. Accuracy describes how well measured values agree with a reference or 'true' value: high accuracy implies good (close) agreement. Precision describes how well measured values agree with each other: high precision implies good (close) agreement. The difference between the measured mean value and the reference value is the bias. A large bias implies an inaccurate measurement process, which may or may not be precise. As emphasized below, a large systematic between-lab bias, with its value unique to each lab, is the root cause of poor interlab precision. Eliminating these large biases is a difficult job. Even when all specified test conditions are met, repeated measurements of any property in a given laboratory, or in a number of
laboratories, will not yield identical results. A single measurement y may be partitioned into two major components

$$y = m + d \tag{1}$$

where m represents the 'true' or reference mean value of the property in question and d is a disturbance or error, which may be positive or negative and have sub-components. It may be defined as follows

$$d = B_o + B_s + e \tag{2}$$

where $B_o$ is a positive or negative random component of variation existing between laboratories, $B_s$ is a systematic or bias 'between-laboratory' component, the value of which is fixed for any given laboratory, and e is a random 'within-laboratory' component or error associated with every measurement. Of the combination $B_o$ and $B_s$, it is the $B_s$ term, the fixed deviation for any given laboratory, that is the major source of disagreement among laboratories.
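The decomposition in eqns (1) and (2) can be illustrated with a small simulation. The following is only a sketch under stated assumptions: all three components are taken as normally distributed, both between-laboratory terms are drawn once per laboratory, and the function names and the standard deviations used are hypothetical, chosen for illustration rather than taken from any interlaboratory program.

```python
import math
import random
import statistics


def simulate_itp(m, sd_bo, sd_bs, sd_e, labs, reps, seed=0):
    """Return {lab: [y, ...]} with y = m + B_o + B_s + e, as in eqns (1)-(2).

    Assumption: B_o and B_s are each drawn once per laboratory, while e is
    drawn afresh for every measurement.
    """
    rng = random.Random(seed)
    data = {}
    for lab in range(1, labs + 1):
        b_o = rng.gauss(0.0, sd_bo)   # random between-laboratory component
        b_s = rng.gauss(0.0, sd_bs)   # systematic bias, fixed for this lab
        data[lab] = [m + b_o + b_s + rng.gauss(0.0, sd_e) for _ in range(reps)]
    return data


def spread(data):
    """Pooled within-lab standard deviation and std. dev. of the lab means."""
    variances = [statistics.variance(v) for v in data.values()]
    lab_means = [statistics.mean(v) for v in data.values()]
    within = math.sqrt(statistics.mean(variances))
    between = statistics.stdev(lab_means)
    return within, between


# Hypothetical settings: the fixed lab bias is made the largest component.
data = simulate_itp(m=50.0, sd_bo=0.3, sd_bs=2.0, sd_e=0.5, labs=200, reps=2)
within, between = spread(data)
print(f"within-lab sd = {within:.2f}, sd of lab means = {between:.2f}")
```

With a bias term of this relative size, the scatter of the laboratory means is several times the within-laboratory scatter, which is the situation the text describes as the main source of disagreement among laboratories.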
3 PRECISION ASSESSMENT
Historically the first ASTM Technical Committees that undertook precision testing were those dealing with the more classical analytical test methods (percentage of chlorine, heat loss, ash, etc.). For such tests, precision testing is in principle fairly straightforward. Samples are drawn from a large specially prepared homogeneous lot of material and sent to all participating laboratories. At each participating laboratory, test samples are drawn from the distributed sample material and replicate (repeated) tests are nominally conducted in a side-by-side manner, with two, three, or another specified number of replicates. These replicates can often be run within an hour or at most 2-3 h. The estimates of within-laboratory precision for this type of testing are therefore expressed in terms of a short time period, 1-2 h.
There are counterparts of this type of 'classical analytical' test method approach within ASTM D11 and ISO/TC45. However, there are also many 'non-analytical-type' test methods that do not fall under this fairly simple test protocol. An example is ASTM D865, 'Rubber--Deterioration by Heating in Air (Test Tube Enclosure)', an aging-test method. Results from this test frequently are of the 'percentage retained' variety, i.e. percentage retained tensile strength, elongation or modulus. To make these measurements the following operations are required.
(i) Compounding, mixing and sheet curing are required to produce compounds for test.
(ii) Original stress-strain property tests are conducted on the test pieces, nominally dumb-bell specimens cut from sheets for the various compounds.
(iii) Additional samples of these compounds in dumb-bell specimen form are subjected to accelerated aging for specified time-temperature periods.
(iv) 'Aged' stress-strain property tests are conducted on these specimens after the aging.
(v) The 'original' versus 'aged' property response for the various compounds is nominally expressed as 'percentage property retained'.
For such a test what is to be replicated to assess within- and/or between-laboratory precision? Original stress-strain property tests? Aged property tests? The aging itself? All of these must be replicated: the entire test procedure must be replicated. Obviously this cannot be done in the classical 'side-by-side, within one hour' replication procedure. Thus a Standard Practice document for precision testing must give guidance for testing with this more complex type of test.
One of the major problems in 'precision and interlaboratory' testing is that of effective communication: how to use special words, terms and concepts (a nomenclature system) in a way that all can understand. This issue is addressed in a comprehensive way in D4483 and in ISO/TR9272. An additional feature of both of these is the ability to handle the wide range of complex test method standards or, in modern statistical terminology, a broad range of 'systems of causes': the sources of the variation in test results that hampers technical decisions and that precision parameters measure and assess. As previously discussed, complex standards must be addressed as well as the simpler 'analytical-type' test method standards. This is accommodated in D4483 and TR9272 by defining two types of precision, called Type 1 and Type 2.
Type 1. A precision that focuses on the test machine or apparatus and the test operations that start with a source of nominally identical, prepared test specimens or portions of material. It applies strictly to the operation of the test machine. In many tests, especially simple ones, it is the precision that is most appropriate.
Type 2. A precision which includes all steps in a complex required sequence of operations that leads to a final or terminal measurement of the test property. In addition to the D865 example cited above, testing
to determine if a certain material (carbon black or synthetic rubber) meets 'in rubber' specification limits is also a more comprehensive and complex test. Mixing, processing, curing and test specimen preparation steps must be completed within both the producers' and the consumers' laboratories. With the new emphasis on SPC/SQC, this type of precision takes on increased importance.
Precision, whether within- or between-laboratory (two aspects of the 'system of causes'), is basically expressed by a standard deviation. It has been customary over the past two decades to calculate, from interlaboratory (precision) data, a special parameter for within-laboratory variation and a special parameter for among- or between-laboratory variation. The within-laboratory variation is called a 'repeatability' and the between-laboratory variation is called a 'reproducibility'. These have been assigned special symbols agreed upon in both ASTM D11 and TC45. Repeatability, r, is given by

$$r = 2.83\,S_r \tag{3}$$

Reproducibility, R, is given by

$$R = 2.83\,S_R \tag{4}$$

where S_r and S_R are respectively the within- and between-laboratory standard deviations for 'test results' for any given test method. A brief background on the statistical concepts of interlaboratory testing and the derivation of these formulas is given in Appendix B.
Before defining r and R in words, a test result must be defined. A test result is the reported value for any given test. It is frequently the average or median of a number of determinations, i.e. separate measurements of the property to be measured. The repeatability, r, is an interval, established by an interlaboratory program, within which duplicate test results (in the same laboratory) will fall (i.e. their difference will be less than the interval r) in 95 out of 100 cases, a 95% confidence interval, for a correct and normal operation of the test procedure. The reproducibility, R, is an interval, established by an interlaboratory program, within which duplicate test results obtained in two laboratories will fall, again in 95 out of 100 cases, in a correct and normal operation of the test procedure.
A comprehensive understanding of the concepts of precision requires some knowledge of how precision data are obtained and of the analysis procedures for the data so obtained. For those less familiar with the procedures for obtaining and analyzing interlaboratory data, two appendices provide this background information. Appendix A gives some brief details on how interlaboratory programs are conducted and Appendix B, as already mentioned, gives the rationale for the analysis of the interlaboratory data.
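As a worked illustration of eqns (3) and (4), the sketch below converts a pair of standard deviations into r and R and then into the relative (percentage) forms used later in the paper. The input figures are the EPDM entries of Table 4; the function names are hypothetical. The remark that the factor 2.83 is numerically close to 2√2 is offered only as an observation, since the derivation itself is deferred to Appendix B.

```python
import math

FACTOR = 2.83  # multiplier appearing in eqns (3) and (4)


def repeatability(s_r):
    """r = 2.83 * S_r, eqn (3)."""
    return FACTOR * s_r


def reproducibility(s_big_r):
    """R = 2.83 * S_R, eqn (4)."""
    return FACTOR * s_big_r


def relative(limit, mean_level):
    """Express r or R as a percentage of the mean level, i.e. (r) or (R)."""
    return 100.0 * limit / mean_level


# EPDM entries from Table 4: S_r = 0.580, S_R = 1.618, mean level 68.0.
r = repeatability(0.580)
R = reproducibility(1.618)
print(round(r, 3), round(R, 3))                                    # 1.641 4.579
print(round(relative(r, 68.0), 2), round(relative(R, 68.0), 2))    # 2.41 6.73

# Observation only: 2 * sqrt(2) = 2.828..., which rounds to the factor used.
print(round(2 * math.sqrt(2), 2))                                  # 2.83
```

The four printed precision values agree with the EPDM row of Table 4 to the number of decimals given there.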
3.1 An example: Mooney viscosity
A very frequently used test in the rubber industry is the measurement of rubber or compound viscosity by means of the Mooney viscometer. This test method is described in ASTM D1646 and an analogous test in ISO 289. The following example describes an ASTM D11 interlaboratory test program (ITP) conducted in North America. The interlaboratory effort consisted of a Type 1 precision program with seven materials (rubbers) tested by way of two replicate (single) test measurements, each conducted on a separate day one week apart. For this test a single measurement of viscosity is defined as a test result. A total of 11 laboratories participated. The materials are described in some detail in Table 1. Table 2 lists, in Mooney torque units, the duplicate test results in each cell. The material averages are given at the bottom of each column. Table 3 lists the cell averages. There are some suspect values that disagree rather sharply with the material average. Examining such data in tabular form is extremely difficult. However, the results can be put into a more readily comprehensible form by way of special graphs. In these, each daily measured Mooney viscosity, i.e. day 1-day 2, is plotted versus the laboratory number, 1-11. The plots for all seven rubbers are given in Figs 1 and 2. For SBR 1500 (no. 1) all laboratories obtained fairly good replicate (day 1-day 2) test results except laboratory 2. Laboratory 10, while getting two repeatable test results, is quite low (on average) compared with the other ten labs. This type of plot permits quick visual appraisal both of the repeatability and of the reproducibility, the lab-to-lab agreement.

TABLE 1 Materials and Test Conditions Used in Mooney Viscosity ITP

Material no.  Material description                                    Test temperature (°C)  Other details
1             SBR 1500                                                100                    ML(1 + 4)
2             SBR 1712                                                100                    ML(1 + 4)
3             EPDM                                                    125                    ML(1 + 4)
4(a)          IIR (NBS-SRM 388j)(a)                                   100                    ML(1 + 8)
5             Compounded blend 1500/1505                              100                    ML(1 + 4)
6             SBR black masterbatch (1712, 65 N339, 50 HA oil)        100                    ML(1 + 4)
7             NR                                                      100                    ML(1 + 4)

(a) This IIR (butyl rubber) is a Standard Reference Material No. 388 Lot j, as furnished by the National Bureau of Standards. Measurements are made on unmassed samples.

TABLE 2 The ITP Test Results(a): Mooney Test

            Level (material)
Laboratory  1           2           3           4           5           6           7
1           46.0, 47.0  51.0, 51.0  68.0, 67.0  69.0, 69.0  68.0, 69.0  76.0, 76.0  99.0, 101.0
2           46.8, 50.4  49.2, 50.0  68.4, 69.6  68.3, 68.3  68.9, 69.6  75.8, 75.2  98.0, 100.0
3           46.9, 46.9  48.8, 49.9  68.1, 67.8  70.0, 70.3  69.0, 69.1  72.3, 74.2  100.0, 99.5
4           47.0, 46.0  51.0, 51.0  66.0, 66.0  68.0, 68.5  70.0, 70.0  69.0, 70.0  97.5, 98.0
5           45.6, 46.5  50.4, 49.9  65.1, 65.8  68.1, 68.6  68.3, 67.5  72.6, 73.6  98.7, 99.6
6           48.5, 47.0  51.0, 49.5  67.0, 66.0  68.0, 68.0  68.5, 67.0  79.0, 75.5  98.0, 95.0
7           46.2, 46.3  50.3, 50.1  68.0, 68.5  68.5, 68.5  68.7, 68.1  76.0, 77.1  100.2, 100.4
8           48.2, 48.9  52.4, 52.3  69.0, 70.0  69.5, 69.0  69.2, 70.2  80.4, 82.3  99.0, 99.1
9           46.0, 46.4  50.8, 50.8  69.0, 69.7  69.5, 69.4  68.9, 69.3  71.8, 72.4  98.9, 99.4
10          42.0, 42.5  51.0, 51.0  70.0, 71.0  69.0, 68.5  71.0, 70.5  76.0, 76.0  104.0, 103.0
11          46.0, 45.4  48.1, 48.3  70.0, 66.7  69.0, 68.6  68.3, 67.0  63.6, 61.6  93.0, 91.2
Average     46.48       50.36       68.03       68.80       68.91       73.93       98.72

(a) Tabulated values are in (Mooney) torque units.

TABLE 3 Cell Averages(a) for Mooney Viscosity ITP
            Level (material)
Laboratory  1      2      3      4      5      6      7       Average
1           46.5   51.0   67.5   69.0   68.5   76.0   100.0   68.36
2           48.6   49.6   69.0   68.3   69.25  75.5   99.0    68.46
3           46.9   49.35  67.95  70.15  69.05  73.25  99.75   68.06
4           46.5   51.0   66.0   68.25  70.0   69.5   97.5    66.96
5           46.05  50.15  65.45  68.35  67.9   73.1   99.15   67.16
6           47.75  50.25  66.5   68.0   67.75  77.25  96.5    68.71
7           46.25  50.2   68.25  68.5   68.4   76.55  100.3   68.35
8           48.55  52.35  69.5   69.25  69.7   81.35  99.05   69.96
9           46.2   50.8   69.35  69.45  69.1   72.1   99.15   68.02
10          42.25  51.0   70.5   68.75  70.75  76.0   103.5   68.96
11          45.7   48.2   68.35  68.8   67.65  62.6   92.1    64.77
Average     46.48  50.36  68.03  68.8   68.91  73.93  98.73   67.88

(a) Tabulated values are in (Mooney) torque units.
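The step from the duplicate results of Table 2 to cell averages and a within-laboratory standard deviation can be sketched as follows for material 1 (SBR 1500). This is a minimal sketch, not the D4483 calculation itself: the function names are hypothetical, the pooled value is taken as the root mean of the cell variances, and Cochran's statistic is taken in its usual form (largest cell variance divided by the sum of all cell variances). The critical value against which that statistic would be judged is not reproduced here.

```python
import math

# Duplicate (day 1, day 2) results for material 1, SBR 1500, labs 1-11 (Table 2).
SBR_1500 = {
    1: (46.0, 47.0), 2: (46.8, 50.4), 3: (46.9, 46.9), 4: (47.0, 46.0),
    5: (45.6, 46.5), 6: (48.5, 47.0), 7: (46.2, 46.3), 8: (48.2, 48.9),
    9: (46.0, 46.4), 10: (42.0, 42.5), 11: (46.0, 45.4),
}


def cell_average(pair):
    return sum(pair) / 2.0


def cell_variance(pair):
    """Sample variance of a duplicate: (difference squared) / 2."""
    a, b = pair
    return (a - b) ** 2 / 2.0


def pooled_within_sd(cells):
    """Square root of the mean cell variance (assumed pooling rule)."""
    variances = [cell_variance(p) for p in cells.values()]
    return math.sqrt(sum(variances) / len(variances))


def cochran_statistic(cells):
    """Largest cell variance as a fraction of the sum of cell variances."""
    variances = [cell_variance(p) for p in cells.values()]
    return max(variances) / sum(variances)


print(round(cell_average(SBR_1500[2]), 2))       # 48.6, as in Table 3
print(round(pooled_within_sd(SBR_1500), 3))      # 0.936 with all 11 labs
print(round(cochran_statistic(SBR_1500), 3))     # 0.672, dominated by lab 2

without_lab_2 = {lab: p for lab, p in SBR_1500.items() if lab != 2}
print(round(pooled_within_sd(without_lab_2), 3)) # 0.563
```

Laboratory 2 alone contributes about two-thirds of the total within-cell variance for this material. Leaving that cell out gives 0.563, the same figure as the within-laboratory standard deviation listed for SBR 1500 in Table 4; whether the original analysis actually excluded the cell in this way is an assumption made here.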
[Figure 1: four panels, (#1) SBR 1500, (#2) SBR 1712 (oil-extended rubber), (#3) EPDM and (#4) IIR (388j), each showing the day 1 and day 2 Mooney viscosity plotted against lab number, 1 to 11.]
Fig. 1. Materials 1 to 4, day 1-day 2 viscosity versus laboratory number.

[Figure 2: three panels, (#5) blend SBR 1500/1505, (#6) SBR-BMB and (#7) NR, each showing the day 1 and day 2 Mooney viscosity plotted against lab number, 1 to 11.]
Fig. 2. Materials 5 to 7, day 1-day 2 viscosity versus laboratory number.
For SBR 1712 (no. 2), the within-lab agreement is in general good with laboratory 11 slightly low; there is good overall agreement, however. For EPDM (no. 3) there is good repeatability for all except laboratory 11. Overall or general reproducibility, however, is not really good. For IIR (butyl rubber; no. 4) we see excellent agreement for both repeatability and reproducibility with only laboratory 3 slightly high. This particular rubber is NBS Standard Reference Material (SRM) 388j. It is frequently used on a world-wide industry basis to confirm that mechanical calibration operations on a Mooney viscometer have been properly conducted. If the NBS-established Mooney viscosity is obtained, the viscometer is declared to be in calibration. For the SBR blend (no. 5) relatively good agreement is found for repeatability and reproducibility. The plot for SBR-BMB (black masterbatch; no. 6) shows that reproducibility is extremely poor.
One of the problems that immediately comes up when viewing such data is how to identify maverick, or outlier, test results. D4483 addresses this issue with two statistical tests: Cochran's test for outlier cell variances and Dixon's test for outlier cell averages. A Dixon's analysis (see Refs 4 and 5 for details) shows the laboratory 10 result for SBR 1500 (no. 1) to be an outlier value. For NR (no. 7) there is fairly large variation both within and between laboratories. Laboratory 11 is statistically significant in a Dixon's test, as a low-side outlier. Although laboratory 10 is high, the deviation of the laboratory 10 average from the overall laboratory 1-9 average is not sufficiently large to declare it an outlier at the 95% confidence level in a Dixon's analysis. For SBR-BMB (no. 6) laboratory 11 is quite low; however, its low value must be contrasted with the general level of variation (lab-to-lab) for laboratories 1-10. This lab-to-lab variation is large and no.
11 is not statistically significant (95%) in a Dixon's test, when contrasted with the general high variance among all laboratories. The extreme variability of this SBR-BMB rubber is due not to lack of homogeneity but to certain milling steps in each laboratory that are necessary to prepare a test specimen. Although test method D1646 gives details on how this is to be done, the instructions are not sufficiently detailed and an investigation is currently being conducted in ASTM Committee D11 to resolve this issue.
In Table 4 the precision parameters are shown. The materials are listed in order of their average or mean level. Although SBR-BMB is shown in the table it is not used to form average or pooled values across all levels because of the wide lab-to-lab variation, and it is also excluded on the basis of the previous remarks. The 'within' and 'between' standard deviations (Sr, SR) are shown for each level. From these we calculate the repeatability, r, and the reproducibility, R. Using the mean value, relative r and R are calculated, i.e. (r) and (R), which express repeatability and reproducibility on a percentage basis.
Figure 3 shows that there is reasonably good linear correlation between both r and R versus the mean Mooney viscosity (MV). Such a linear dependence of r or R on MV, with essentially zero intercepts, indicates that r and R expressed on a relative or percentage basis should be essentially independent of Mooney viscosity. The data of Table 4 plotted as shown here support this conclusion. The zero-slope broken lines in Fig. 4 are located at the average or pooled values for (r) and (R). Figure 4 shows the essential independence of (r) and (R) (for all the clear non-pigmented rubbers except IIR) of Mooney viscosity, as expected on the basis of the relationships demonstrated in Fig. 3.

TABLE 4 ASTM Test Method D1646 Type 1 Precision(a): Mooney Viscosity

                                 Within laboratories(c)    Between laboratories(c)
Material          Mean level     Sr      r       (r)       SR        R         (R)
(SBR 1500)        46.5(b)        0.563   1.592   3.42      1.130     3.198     6.78
(SBR 1712)        50.4           0.449   1.272   2.52      1.129     3.194     6.34
EPDM              68.0           0.580   1.641   2.41      1.618     4.579     6.73
IIR (388j)        68.8           0.240   0.673   0.98      0.653     1.848     2.69
Blend 1500/1505   68.9           0.597   1.691   2.45      1.074     3.040     4.41
SBR-BMB           73.9           1.115   3.155   4.27      4.930     13.94     18.86
NR                98.7           1.036   2.932   2.97      1.938     5.490     5.52
Pooled or
average values    67.9           0.716   2.025   2.98      1.323(d)  3.750(d)  5.61(d)

(a) This is short-term precision (days) with p = 11, q = 7, n = 2.
(b) Mooney torque units.
(c) Symbols are defined as follows: Sr = within-laboratory standard deviation; r = repeatability (in measurement units); (r) = repeatability (%); SR = between-laboratory standard deviation; R = reproducibility (in measurement units); (R) = reproducibility (%).
(d) Values excluding material no. 6 (SBR-BMB).

[Figure 3: plot of r and R against Mooney viscosity, 0 to 100.]
Fig. 3. Repeatability, r, and reproducibility, R, versus mean Mooney viscosity (IIR is excluded).

[Figure 4: plot of (r) and (R) against Mooney viscosity, 0 to 120, with zero-slope broken lines at the pooled values.]
Fig. 4. Relative repeatability, (r), and reproducibility, (R), versus mean Mooney viscosity (IIR is excluded).
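The pooled entries of Table 4 can be approximated from the per-material standard deviations. The sketch below assumes that pooling means taking the root mean square of the individual standard deviations, which is not stated in the text but reproduces the tabulated pooled Sr and SR to three decimals; the variable and function names are hypothetical.

```python
import math

# Per-material values from Table 4, in the order the materials are listed.
MATERIALS = ["SBR 1500", "SBR 1712", "EPDM", "IIR", "Blend", "SBR-BMB", "NR"]
MEAN_LEVEL = [46.5, 50.4, 68.0, 68.8, 68.9, 73.9, 98.7]
S_WITHIN = [0.563, 0.449, 0.580, 0.240, 0.597, 1.115, 1.036]    # Sr
S_BETWEEN = [1.130, 1.129, 1.618, 0.653, 1.074, 4.930, 1.938]   # SR


def pooled_sd(values):
    """Root mean square of standard deviations (assumed pooling rule)."""
    return math.sqrt(sum(v * v for v in values) / len(values))


# Within-laboratory: all seven materials.
sr_pooled = pooled_sd(S_WITHIN)
r_pooled = 2.83 * sr_pooled
r_relative = 100.0 * r_pooled / (sum(MEAN_LEVEL) / len(MEAN_LEVEL))

# Between-laboratory: SBR-BMB left out, as in footnote (d) of Table 4.
keep = [i for i, name in enumerate(MATERIALS) if name != "SBR-BMB"]
sR_pooled = pooled_sd([S_BETWEEN[i] for i in keep])

print(round(sr_pooled, 3), round(r_pooled, 3), round(r_relative, 2))  # 0.716 2.025 2.98
print(round(sR_pooled, 3))                                             # 1.323
```

Under this assumption the within-laboratory pooled values match Table 4 with all seven materials included, while the between-laboratory value matches only when SBR-BMB is left out, in line with footnote (d).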
4 USING PRECISION RESULTS
There are two main uses of precision results. The normal or most readily recognized use of precision addresses the comparison of two test results, to determine if the two results are acceptable. The definition of 'acceptable' is somewhat vague and depends upon circumstances and the user of the data. The implication is that if two test results are within the limits of either the repeatability (within-laboratory comparison) or reproducibility (between-laboratory comparison) then they are suitable for averaging. This emphasis on acceptability comes from the 'analytical bench chemist' mind-set of running side-by-side tests, and it has as its origin the comparison of two determinations, i.e. two individual measurement values that are to be averaged to get a test result. While the precision of determinations is important in testing, it is the precision of test results, with day-to-day variation, that is most important. Both ASTM D4483 and ISO TR9272 recommend standardized wording to express precision, based on test results, not determinations. Examples of this for the Mooney viscosity precision are given below using the relative precision (the pooled values), which includes the highly precise testing for IIR but excludes the (very-poor-precision) BMB.
Repeatability. The repeatability (r) of Mooney viscosity measurement has been established as 2.98%. Two test results that differ by more than 2.98% (expressed as a percentage of their average viscosity) must be considered suspect, that is, as having arisen from different sample populations. Such a decision dictates that appropriate action be taken.
Reproducibility. The reproducibility of Mooney viscosity measurement has been established as 5.74%. Two test results performed in separate laboratories that differ by more than 5.74% (expressed as a percentage of their average value) must be considered as suspect, that is, as having come from different sample populations. Such a decision dictates that appropriate action be taken.
These two precision statements strictly apply to clear, non-pigmented rubbers. The 'analytical bench chemist' comparison of two values, either determinations or test results, can be more readily visualized as applying to repeatability, i.e. within a laboratory, than to two values obtained in two different laboratories.
In the second use of precision, perhaps a more important one, there is no implication that the two values compared are within repeatability or reproducibility limits (and thus are to be averaged). In this case the two values, most often test results, are compared using the repeatability or reproducibility 'yardstick' again, but the expectation is that they are not really from the same population, i.e. they are not from identical material tests. In this application perhaps there were intentional changes or modifications in the two tested materials or samples and the aim is to demonstrate whether statistically significant changes have been made. Specification testing also falls in this second use-category for precision. Does material A meet its specification limits? If the difference between a test result and the specification limit is of the appropriate sign (i.e. positive or negative depending on the limit being a 'max.' or a 'min.') and if the numerical difference is less than the typical or appropriate repeatability or reproducibility (depending upon the testing circumstances) the material may be declared to be 'in spec.'. The testing community has yet to realize the full importance of this aspect of the use of precision results. This situation should change, however, as experience is gained with the use of precision results.
There is a great temptation in the conduct of ITPs to select only a few materials (levels) to keep testing costs down. An extreme in this attitude is to select only one material and assume that the results expressed as relative (r) and (R) from it can be applied to all materials in an across-the-board fashion. Consider the situation if this had been done in the Mooney ITP, the most widely available rubber (IIR butyl rubber) had been selected and precision values for r, R, (r) and (R) had been obtained with it. Let us apply the precision values obtained for IIR to the testing of a BMB rubber similar to no. 6 of the ITP (Table 5). The repeatability of the BMB rubber is roughly four times higher than the 'established values'; the reproducibility is roughly seven times higher. Using the 'established' very good precision to make technical decisions about repeatability and reproducibility agreements for BMB rubbers (without the knowledge that BMB testing is so highly variable) would lead to totally erroneous conclusions.
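The way a relative repeatability limit is applied to a pair of test results, and the effect of borrowing a limit from the wrong material, can be sketched as follows. The function names are hypothetical; the duplicate results are those of laboratory 3 for SBR-BMB in Table 2, and the two limits are the (r) values listed in Table 4 for IIR and for SBR-BMB.

```python
def relative_difference(a, b):
    """Difference between two test results as a percentage of their average."""
    return 100.0 * abs(a - b) / ((a + b) / 2.0)


def is_suspect(a, b, limit_pct):
    """True when the two results differ by more than the stated (r) or (R)."""
    return relative_difference(a, b) > limit_pct


# Laboratory 3, SBR-BMB duplicates from Table 2.
day_1, day_2 = 72.3, 74.2
print(round(relative_difference(day_1, day_2), 2))   # 2.59

# Judged against the 'established' IIR repeatability of 0.98%.
print(is_suspect(day_1, day_2, limit_pct=0.98))      # True

# Judged against the repeatability found for SBR-BMB itself, 4.27%.
print(is_suspect(day_1, day_2, limit_pct=4.27))      # False
```

The same pair of results is flagged or accepted depending only on which material's precision is taken as the yardstick, which is the kind of erroneous conclusion the text warns against.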
Another inclination in precision testing, especially when several materials are tested and when there are two or more measured test result parameters, is to average or pool the results of all materials and give, in the test method precision clause or section, only the average values for r, R, (r) and (R). Table 4 shows the danger in this also. While it is true that using the pooled (r) or (R) for the clear non-pigmented rubbers (except IIR) is permissible, using the pooled (r) and (R) when applied to BMB rubbers is erroneous. The BMB-to-pooled (r) and (R) ratios are respectively 1.4 and 3.4. Such discrepancies would also lead to incorrect technical decisions.

TABLE 5 Precision Values for IIR and BMB Rubber

Material                        r      (r)    R      (R)
Established values (i.e. IIR)   0.67   0.98   1.85   2.69
BMB rubber                      3.16   4.27   13.9   18.9
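The ratios quoted in this section follow directly from Tables 4 and 5 and can be checked with a few lines of arithmetic. The dictionary layout below is only an illustrative choice.

```python
# Table 5 entries.
IIR = {"r": 0.67, "(r)": 0.98, "R": 1.85, "(R)": 2.69}
BMB = {"r": 3.16, "(r)": 4.27, "R": 13.9, "(R)": 18.9}

bmb_to_iir = {key: round(BMB[key] / IIR[key], 1) for key in IIR}
print(bmb_to_iir)   # {'r': 4.7, '(r)': 4.4, 'R': 7.5, '(R)': 7.0}

# BMB relative precision against the pooled values of Table 4.
pooled = {"(r)": 2.98, "(R)": 5.61}
bmb_table_4 = {"(r)": 4.27, "(R)": 18.86}
bmb_to_pooled = {key: round(bmb_table_4[key] / pooled[key], 1) for key in pooled}
print(bmb_to_pooled)  # {'(r)': 1.4, '(R)': 3.4}
```

The first set of ratios corresponds to the 'roughly four times' and 'roughly seven times' statements for repeatability and reproducibility; the second reproduces the 1.4 and 3.4 quoted for the pooled values.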
5 TYPE
1 VERSUS
TYPE 2 PRECISION
A Type 1 precision (r or R) does not involve the compounding, mixing and processing of (standard) materials in each participating laboratory. Such an operation would contribute an additional variance or variability component to r and R. A Type 2 precision does involve such operations in each participating laboratory and therefore the variation associated with this is included in the (measured) r or R. A Type 2 precision applies to many specifications and/or acceptance tests for such raw materials as rubbers, reinforcing fillers (carbon black) and other compounding materials used in a variety of rubber compounds when tests require compounded properties for evaluation.

Precision testing conducted within ASTM a number of years ago on Test Method D2084 provides some information on this issue. This test involves the measurement of cure rate parameters with the oscillating disk curemeter. Appendix C gives a few words of explanation and a typical cure curve with a description of measured test parameters. Table 6 gives Type 1 and Type 2 values for r and R for minimum and maximum torque, scorch time, and the 50% and 90% cure times. With the exception of one anomalous value (R value for ML Type 2), the ratio of a Type 2 to a Type 1 precision (either r or R) is close to or greater than one, as expected. The average Type 2/Type 1 ratios (all five parameters) give 1.29 for repeatability and 1.58 for reproducibility. Thus the compounding-mixing operations contribute a greater variance component to between-lab precision (an almost 60%-greater R) than to within-lab comparisons.

This clearly illustrates the need to give careful thought to the types of precision to be estimated for any test method standard. Although a Type 2 precision interlab program is more involved and requires more attention to details, it provides the only type of precision that can be used in many interlaboratory comparisons of test data or test results, especially for specification testing.

TABLE 6
Comparison of Type 1 and Type 2 Precision:^a ASTM D2084--ODC Standard

For Repeatability, r

Parameter^b      Range of values    r, Type 1    r, Type 2    Ratio of r values, Type 2/Type 1
ML (N m)         0.71-0.97          0.0447       0.0608       1.36
MHF (N m)        2.84-3.89          0.0512       0.0991       1.94
ts (min)         2.3-5.3            0.710        0.71         1.00
tc(50) (min)     3.0-3.8            0.424        0.509        1.20
tc(90) (min)     6.4-14.0           0.566        0.538        0.95
Average                                                       1.29

For Reproducibility, R

Parameter^b      Range of values    R, Type 1    R, Type 2    Ratio of R values, Type 2/Type 1
ML (N m)         0.76-0.98          0.742        0.217        0.29
MHF (N m)        3.2-4.8            0.563        0.742        1.32
ts (min)         3.9-5.1            0.783        1.10         1.41
tc(50) (min)     6.8-8.8            1.05         2.60         2.47
tc(90) (min)     14.1-15.9          1.79         4.28         2.39
Average                                                       1.58

^a Type 1 precision: 11 laboratories, four materials, two days. Type 2 precision: 12 laboratories, two materials, two days. (For both programs, 1 test measurement = 1 test result.)
^b ML = minimum torque (prior to start of cure). MHF = maximum torque at full cure. ts = scorch time (1 dN m of torque increase). tc(50), tc(90) = time to 50% and 90% of full cure respectively. See Appendix C for explanation of D2084 test.
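The Type 2/Type 1 ratios and their averages quoted from Table 6 can be rechecked with a few lines of code. The following is a minimal sketch, not part of the original analysis; the dictionary and function names are illustrative, and the numbers are the r and R values as read from Table 6.

```python
# (Type 1, Type 2) precision values as read from Table 6 (ASTM D2084).
REPEATABILITY = {
    "ML": (0.0447, 0.0608),
    "MHF": (0.0512, 0.0991),
    "ts": (0.710, 0.71),
    "tc(50)": (0.424, 0.509),
    "tc(90)": (0.566, 0.538),
}
REPRODUCIBILITY = {
    "ML": (0.742, 0.217),
    "MHF": (0.563, 0.742),
    "ts": (0.783, 1.10),
    "tc(50)": (1.05, 2.60),
    "tc(90)": (1.79, 4.28),
}


def type2_to_type1_ratios(values):
    """Ratio of Type 2 to Type 1 precision for each test parameter."""
    return {name: type2 / type1 for name, (type1, type2) in values.items()}


def mean_ratio(values):
    """Unweighted average of the Type 2/Type 1 ratios over all parameters."""
    ratios = type2_to_type1_ratios(values)
    return sum(ratios.values()) / len(ratios)


for label, table in (("r", REPEATABILITY), ("R", REPRODUCIBILITY)):
    print(label, round(mean_ratio(table), 2))
```

The averages are simple unweighted means over the five parameters, which is how the quoted values of 1.29 and 1.58 appear to have been obtained.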
6 INTERNATIONAL INTERLABORATORY TESTING: ISO/TC-45
A Working Group (WG15) in TC45 was organized in 1981 to address the issue of test method precision. The first international Interlaboratory Test Program within WG15 and TC45 conducted along the guidelines of ISO TR9272 was a program for the oscillating disk curemeter. This test is described in ISO Standard 3417, which is essentially identical to ASTM D2084 discussed in Section 5. The test program for curemeter testing was organized in late 1984 by the USA delegation. A total of 50 laboratories located in Argentina, Australia, Austria, Canada, France, West Germany, India, Indonesia, Italy, Malaysia, Mexico, The Netherlands, Singapore, Spain, Sri Lanka, Sweden, the UK, the USA and the USSR participated. Four compounds with a range of curing characteristics were mixed in one laboratory, sealed in metal foil packets (against the effects of moisture) and sent to all laboratories in the program. Tests were conducted in a three-week period in early 1985 at 150 and 160 °C with test machines that conform to ISO3417 (and D2084 also).

TABLE 7
Summary of ISO/TC45 International Interlaboratory Test for Type 1 Precision of Oscillating Disk Curemeter Test (ISO3417)

                                          At 150 °C                              At 160 °C
Test parameter                 Mean value  (r)    (R)    (R)/(r) ratio   Mean value  (r)    (R)    (R)/(r) ratio
Minimum torque (ML) (N m)      0.92        12.7   50.4   4.0             0.92        13.3   81.4   6.1
Maximum torque (MHF) (N m)     3.23        5.83   23.9   4.1             3.11        5.51   24.4   4.4
Scorch time (min)              7.2         12.8   42.1   3.3             4.32        12.1   40.8   3.4
50% Cure time (min)            13.3        7.84   19.6   2.5             7.83        8.10   22.5   2.8
90% Cure time (min)            23.8        8.45   22.7   2.7             13.5        9.11   24.6   2.7

Table 7 gives the summarized results of the testing and analysis. The relative values, (r) and (R), are given, which are equivalent to a coefficient of variation on a percentage basis. The results show that the most variable test parameters are the minimum torque and the scorch time, for which (r) is approximately 12-13%; (R) is about 40% for scorch time and as high as 81% (at 160 °C) for minimum torque. Both of these occur in the early part of the cure period or cycle. For maximum torque and the 50 and 90% cure times, (r) is in the range 5-9% and (R) is in the range 20-25%. The ratios of (R)/(r) give an indication of the relative decrease in precision for between-laboratory testing versus within-laboratory testing at the two test temperatures. With the exception of minimum torque, the ratios are roughly independent of temperature, 160 vs 150 °C. These ratios also clearly show the 'torque' measurements to be more variable between laboratories compared with the 'time' measurements.

Figures 5, 6, 7 and 8 are histograms for one compound (SBR1502, 45 HAF black, TMTD cure) at 160 °C. They show how the results of the 50 laboratories are distributed for minimum torque, scorch time, maximum torque and 90% cure time. For the two 'torque' measurements the distribution is essentially normal. For the two 'time' measurements the histograms give an indication of low-side skewness for scorch time and high-side skewness for the 90% cure time. The large number of participating laboratories permits the presentation of such results (histograms). With the more normal-size ITP (8-10 laboratories) histograms are not possible due to the limited data base. The results illustrated in the four histograms do not include outliers or 'maverick' data values. These had been excluded by a Dixon's analysis prior to constructing the histograms. The histograms therefore show the expected or typical distribution of results on a world-wide basis for this test in its current state of development or technological sophistication.

Fig. 5. Histogram of minimum torque (N m) at 160 °C; ordinate, relative frequency.
Fig. 6. Histogram of scorch time (min) at 160 °C; ordinate, relative frequency.
Fig. 7. Histogram of maximum torque (N m) at 160 °C; ordinate, relative frequency.
Fig. 8. Histogram of 90% cure time (min) at 160 °C; ordinate, relative frequency.

International (ISO) testing faces the same general issues as domestic (North American) testing: the importance of Type 1 versus Type 2 testing, and the urgent need for a common agreed-upon terminology that is in harmony or agreement with USA (ASTM) domestic terminology. All of these have been addressed within ISO/TC45(WG15).

7 IMPROVING BETWEEN-LABORATORY TEST AGREEMENT
What can be done to improve the between-laboratory agreement of test results? The realization that laboratory agreement is often poor, especially in such vital societal areas as public health and medical testing and in environmental areas, has given birth to the concept of laboratory accreditation. The concept is good but the act of putting in place the necessary organization is difficult. The current situation may be described as explained by Hess (Ref. 6): 'Scores of accreditation schemes are currently functioning. But with two exceptions, all are narrowly focused; and each is completely independent of the rest.' Accreditation is usually granted to a laboratory following an inspection of its facilities and after successful proficiency testing of standard samples or materials, furnished by the accreditation agency. However, the accreditation problem can only be solved if important and interested parties agree to act decisively and support a meaningful national accreditation system within any country. International accreditation is even farther away.

Another approach to a reduction in between-laboratory variation is two-fold. Firstly, a comprehensive in-depth investigation of test methods (as described in standards) is needed to determine their weak points and their lack of clarity in the description of test protocols. A special type of statistically oriented testing called 'ruggedness testing' can help substantially in this work. Secondly, after a method is carefully investigated, often the only way in which better agreement can be obtained among a number of laboratories is an actual 'on-site' inspection of test equipment. Is the equipment in calibration? Are all components of a more complex test system also calibrated and in a good state of maintenance? Such 'on-site' inspections are possible in a large corporation where intracompany laboratories at different locations can be subjected to inspection by company personnel.
Obviously, intercompany laboratories do not fall under this category and other 'on-site corrective action' schemes must be developed if better industry and world-wide agreement is to be achieved at some future date. Another approach to reducing the systematic bias component between laboratories is the use of standard materials or Industry Reference Materials (IRM). Comprehensive and elaborate testing establishes 'assigned' or reference test value results for these IRMs. If two or more IRMs are available for any test method, each individual laboratory can establish a correlation or correction curve for a particular test. This curve specifies the correction to nominally measured test values and in effect allows for all laboratories to report values that should closely agree. The remaining interlaboratory component of variation is the random between-lab component previously discussed. This cannot be totally eliminated in normal testing.
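The IRM correction curve described above can be sketched as a straight-line fit of assigned values against the values a laboratory actually measures. This is only an illustration under stated assumptions: the text does not specify the form of the curve, so a least-squares line is assumed here, and the function names and the IRM numbers in the usage lines are hypothetical.

```python
def fit_correction(measured, assigned):
    """Least-squares line: assigned = intercept + slope * measured.

    `measured` holds one laboratory's results on the IRMs and `assigned`
    the reference values of the same IRMs. A straight line is an
    assumption; with exactly two IRMs it passes through both points.
    """
    n = len(measured)
    if n < 2 or n != len(assigned):
        raise ValueError("need at least two IRMs with paired values")
    mean_x = sum(measured) / n
    mean_y = sum(assigned) / n
    sxx = sum((x - mean_x) ** 2 for x in measured)
    if sxx == 0:
        raise ValueError("IRMs must cover different property levels")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, assigned))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def correct(value, intercept, slope):
    """Apply the laboratory's correction line to a nominal test value."""
    return intercept + slope * value


# Hypothetical example: two IRMs with assigned values 50.0 and 80.0,
# which this laboratory measures as 52.0 and 83.0.
intercept, slope = fit_correction([52.0, 83.0], [50.0, 80.0])
print(round(correct(60.0, intercept, slope), 2))
```

Such a correction removes only the systematic bias captured by the IRMs; the random between-lab component remains, as noted above.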
APPENDIX A--ORGANIZING AN INTERLABORATORY PRECISION PROGRAM
Task group
A task group of qualified people should be organized to conduct the program: a chairman, a statistical expert, and members well experienced with the standard in question. The panel chairman should ensure that all instructions of the program are clearly communicated to all laboratories in the program. A supervisor in each laboratory should be chosen.

Type of precision
The following initial decisions are required.
(a) The type of precision to be obtained (Type 1 or Type 2).
(b) The time period of the repeatability and reproducibility estimate: short (minutes, hours, or days) or long (weeks or months). Define the time period.
Laboratories and materials
The number of laboratories should be determined. The number of materials, each comprising a different level of the measured property, should be selected. At least 10 participating laboratories are recommended. Practical considerations often require that fewer than 10 laboratories participate. However, an interlaboratory study that involves fewer than six participating laboratories may not lead to reliable estimates of the reproducibility of the test method.

The number and type of materials to be included will depend on the range of the property and how precision varies over that range, the different types of materials to which the test method is applied, the difficulty (expense) in performing the tests, and the commercial or legal need for obtaining a reliable estimate of precision. An interlaboratory study should include at least three materials; for development of broadly applicable precision statements, five or more materials should be included. The term 'materials' is used in a broad generic sense, to mean raw or natural substances, manufactured products, etc.

For each level of material, an adequate quantity (sample) of items, specimens or homogeneous material should be available for subdivision and distribution by random allocation to the participating laboratories. At each level, p separate closed containers (the number of laboratories) should be used where there is any danger of the material deteriorating upon exposure to air or humidity. In the case of unstable materials, special instructions on storage and treatment should be prescribed.
Organization of the tests
The interlaboratory test plan is shown in Fig. 9 in Appendix B, a table that indicates the laboratories, materials, and replicates. With q levels and n replicates, each of the p participating laboratories has to carry out qn tests. The number of replicates, n, must be specified. Each replicate may be one test result or one determination according to the requirements of the test method standard. Normally, n is 2. A larger number may be specified if necessary.
Instructions to operators
The operators should receive no instructions other than those contained in the standard test method; these should suffice. Prior to testing, the operators should be asked to comment on the standard, and state whether the instructions contained in it are sufficiently clear.
APPENDIX B--REPEATABILITY, REPRODUCIBILITY PRECISION DEVELOPMENT
The analysis of interlaboratory testing data is sometimes approached in the conventional two-way analysis of variance (ANOVA) manner. The sources of variation in the classical sense are: laboratories, materials, lab × material interaction and replication within laboratories. Mandel (Ref. 7) has shown that this conventional two-way analysis does not yield a true estimate of the two main variance or standard deviation estimates within and between laboratories. The conventional two-way ANOVA is invalid because true precision frequently varies from level to level (or average to average) among the (M) materials in the test program. The conventional analysis provides average precision estimates (over all these M levels) and this can be misleading. Therefore, an analysis method is needed that avoids this problem.

The correct analysis approach for interlaboratory precision is a 'level-by-level' analysis. A 'between-within' analysis is performed for each level or material. This is equivalent to a conventional 'one-way' classification ANOVA. Thus any measured value $(y_{ik})_j$ is given by the model

$$(y_{ik})_j = m_j + (L_i)_j + (e_{ik})_j \tag{B1}$$

where subscript j indicates level j, $y_{ik}$ is the kth replicate value obtained in laboratory i, $m_j$ is the overall mean for level j, $L_i$ is the 'effect' or bias of laboratory i (this term is equivalent to the combined Bo + Bs of eqn (2) discussed earlier) and $e_{ik}$ is the within-lab deviation associated with the kth replicate in laboratory i.

On a practical basis the following development gives the rationale for analysis of an Interlaboratory Test Program (ITP) for precision. The development is based on a two-way (matrix) table that represents any ITP. The table in Fig. 9 consists of 'cells', nominally with two replicates or test results per cell. This minimum number of replicates per cell (two) gives a 1-DF (degrees of freedom) estimate of within-cell standard deviation.

                         Original test results^a
Laboratory     Level 1     2     ...     j                     ...     q
1
2
...
i                                        y_ij1, ..., y_ijk
...
p

^a The following notation is used:
(a) Laboratories: there are p in total, L_i (i = 1, 2, ..., p).
(b) Materials or levels: there are q in total, M_j (j = 1, 2, ..., q).
(c) Replicates: there are n in total in each cell or L_i M_j combination. There is normally an equal number of n values (usually 2) in each cell.
(d) y_ijk is a single test result value. Example--Cell (i, j) contains n_ij results y_ijk (k = 1, 2, ..., n_ij).

Fig. 9. Two-way ITP data table layout (ASTM D4483, Ref. 4).
Underlying assumptions

Within-laboratory variability
The cell standard deviation is a measure of the within-laboratory variability of each individual laboratory. All laboratories are assumed to have essentially the same degree of variability for any material when following the specified repeatability conditions. Therefore, the laboratory cell variances can be pooled by averaging the squares of the cell standard deviations. The square root of this average or pooled within-laboratory variance is the repeatability standard deviation $S_r$ for any material or level.

Between-laboratory variability: variability of laboratory means
The test results obtained on a particular material at any particular laboratory are considered to be part of a population having a normal distribution with a standard deviation equal to the repeatability standard deviation and a mean (or bias), which is different for each laboratory. The laboratory means are also assumed to vary according to a normal distribution, whose grand mean is estimated by the average of all ITP test results for a given material, and whose standard deviation is designated by $S_L$. For the ITP calculations, $S_L$ is estimated from the standard deviation of the cell averages, $S_{\bar{x}}$, and the repeatability standard deviation, $S_r$, according to:

$$S_L^2 = S_{\bar{x}}^2 - S_r^2/n \tag{B2}$$

Here, $S_r^2$ is the pooled within-cell variance for material or level M, and n is the number of test results per cell. The term $S_{\bar{x}}^2$ is the observed variance of cell averages for material M; $S_r^2/n$ is subtracted from $S_{\bar{x}}^2$ because the latter contains a 1/nth part of $S_r^2$. Sampling variations may cause $S_L^2$ to be calculated at less than zero; in this unlikely event, $S_L$ is set equal to zero.

Reproducibility conditions and $S_R$
The variance among individual test results obtained in different laboratories is the sum of the within-laboratory variance and the between-laboratory variance of the laboratory means. Thus, the reproducibility variance is given by eqn (B3).

$$S_R^2 = S_r^2 + S_L^2 \tag{B3}$$

Substituting eqn (B2) into eqn (B3) produces

$$S_R^2 = S_r^2 + S_{\bar{x}}^2 - S_r^2/n \tag{B4}$$

Simplifying and taking the square root gives

$$S_R = \sqrt{S_{\bar{x}}^2 + S_r^2 (n - 1)/n} \tag{B5}$$

When $S_R$ is calculated to be less than $S_r$, $S_L$ is estimated as less than zero and $S_R$ is set equal to $S_r$.
Repeatability conditions and $S_r$
The value of $S_r$ depends on the repeatability conditions under which the test results are obtained. Changes in the repeatability conditions which result in larger values of $S_r$ will decrease the value of $S_L$, as calculated from eqn (B2). Since $S_R^2$ is the sum of $S_r^2$ and $S_L^2$, a change in $S_r$ arising from changes in repeatability conditions has little effect on the value of $S_R$.

A very simple example as given in Table 8 will indicate how this development is applied to test data.

TABLE 8
Simple Level-by-Level Analysis (Level 1)

Laboratory    x-data    Cell average    Standard deviation S    Variance S^2
1             41, 42    41.5            0.707                   0.500
2             42, 44    43.0            1.414                   2.00
3             43, 42    42.5            0.707                   0.500
4             46, 48    47.0            1.414                   2.00
Average                 43.5                                    1.250 (pooled, S_rep^2)

S_rep = S_r = sqrt(1.250) = 1.118. The variance of the four cell averages is S_xbar^2 = 5.833 (standard deviation 2.415), so that S_R = sqrt(S_xbar^2 + S_r^2 - S_r^2/2) = sqrt(5.833 + 1.250 - 1.250/2) = 2.541. Level 1 = Material 1.

Since repeatability and reproducibility are differences between (mean) values they are calculated on the basis of a t-distribution. Thus in general terms

$$|\bar{x}_1 - \bar{x}_2| = t\, S_{(x)} \sqrt{1/k_1 + 1/k_2} \tag{B6}$$

where
$\bar{x}_1$ = mean of test (result) 1
$\bar{x}_2$ = mean of test (result) 2
$S_{(x)}$ = pooled standard deviation of 'individual values' of the population under consideration ('system of causes')
$k_1$ = number of values (replicates) used to calculate $\bar{x}_1$
$k_2$ = number of values (replicates) used to calculate $\bar{x}_2$

If $k_1 = k_2 = k$,

$$|\bar{x}_1 - \bar{x}_2| = t\, S_{(x)} \sqrt{2/k} = t \left( S_{(x)}/\sqrt{k} \right) \sqrt{2} \tag{B7}$$

Since

$$S_{(\bar{x})} = S_{(x)}/\sqrt{k} \tag{B8}$$

with $\bar{x}$ = mean of k (x values), then

$$|\bar{x}_1 - \bar{x}_2| = t\, S_{(\bar{x})} \sqrt{2} \tag{B9}$$

Since r and R are equivalent to critical differences, then for repeatability r

$$r = |\bar{x}_1 - \bar{x}_2| = t \sqrt{2}\, S_r \tag{B10}$$

$$r = 2.0 \times 1.414\, S_r \tag{B11}$$

$$r = 2.828\, S_r \approx 2.83\, S_r \tag{B12}$$

And for reproducibility R

$$R = |\bar{x}_1 - \bar{x}_2| = t \sqrt{2}\, S_R \tag{B13}$$

$$R = 2.83\, S_R \tag{B14}$$

For a calculation of r, $\bar{x}_1$ and $\bar{x}_2$ represent within-laboratory test results and $S_r$ is the pooled standard deviation of these test results. For a similar calculation of R, $\bar{x}_1$ and $\bar{x}_2$ represent between-laboratory test results and $S_R$ is the pooled standard deviation for those results. The nominal t-value of 2.0 for a 95% confidence level is assumed to be a good representative value when the error estimate DF is 25-30, as is the case for most ITPs.
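The level-by-level development of eqns (B2) to (B5), together with the critical differences of eqns (B10) to (B14), can be sketched in a few lines for a single material. This is a minimal illustration, assuming equal numbers of replicates per cell and the nominal t = 2.0; the function name and the returned dictionary keys are illustrative only. Note that it is the variance of the cell averages that enters eqn (B4), not their standard deviation; for the Table 8 data that variance is 5.833 (standard deviation 2.415).

```python
from math import sqrt
from statistics import mean, variance


def level_precision(cells, t=2.0):
    """Level-by-level precision for one material.

    `cells` holds one list of replicate results per laboratory, all of the
    same length n >= 2. Returns Sr, SL, SR and the critical differences
    r = t*sqrt(2)*Sr and R = t*sqrt(2)*SR.
    """
    n = len(cells[0])
    if n < 2 or any(len(cell) != n for cell in cells):
        raise ValueError("every cell needs the same number (>= 2) of replicates")
    s_r2 = mean(variance(cell) for cell in cells)      # pooled cell variance
    s_xbar2 = variance([mean(cell) for cell in cells])  # variance of cell averages
    s_l2 = max(s_xbar2 - s_r2 / n, 0.0)                 # eqn (B2), floored at zero
    s_r_big2 = s_r2 + s_l2                              # eqn (B3)
    factor = t * sqrt(2)
    return {
        "Sr": sqrt(s_r2),
        "SL": sqrt(s_l2),
        "SR": sqrt(s_r_big2),
        "r": factor * sqrt(s_r2),
        "R": factor * sqrt(s_r_big2),
    }


# x-data for the four laboratories of Table 8 (Level 1).
table8 = [[41, 42], [42, 44], [43, 42], [46, 48]]
for name, value in level_precision(table8).items():
    print(name, round(value, 3))
```

Flooring $S_L^2$ at zero implements the rule that $S_R$ is set equal to $S_r$ whenever the between-laboratory variance is estimated as negative.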
APPENDIX C--THE OSCILLATING DISK CUREMETER TEST
The test is best described by the Summary (ASTM D2084).

Summary of test
'In this test method, a specimen of compounded rubber is contained in a sealed test cavity under a positive pressure and maintained at an elevated temperature. A biconical disk, embedded in the test specimen, is oscillated about the shaft axis through a small arc. This action exerts a shear strain on the test specimen and the force at maximum amplitude (torque) required to oscillate the disk is proportional to the stiffness (shear modulus) of the rubber. This torque is recorded autographically as a function of time.
The stiffness of the rubber specimen increases during vulcanization. A test is completed when the recorded torque either rises to an equilibrium or maximum value, or when a predetermined time has elapsed. The time required to obtain a curve is a function of the test temperature and the characteristics of the rubber compound (see Fig. 10). The following measurements may be taken from the curve of torque versus time.
(a) Minimum torque, ML, a measure of the stiffness and viscosity of the unvulcanized compound.
(b) Time to incipient cure (scorch time), ts2, a measure of processing safety.
(c) Time to a percentage of full cure, tc(x), an inverse measure of cure rate, based on time to develop some percentage, x, of the highest torque. The most commonly used cure time is for 90% cure, tc(90), the time needed for the torque to increase to 90% of its final value.
(d) Maximum, plateau, or highest torque, MHF, MHR or MH, a measure of shear modulus or stiffness of the fully cured vulcanizate.'

Fig. 10. Cure curves (torque, dN m or lb-in., versus time, min). Left, cure to equilibrium torque; center, cure to a maximum torque with reversion; right, cure to no equilibrium in maximum torque (ASTM D2084).
Significance and use
'This test method may be used to determine the rate of cure and some properties of the ultimate vulcanizate. It is also useful for comparing and evaluating various raw materials used in vulcanization for specifications acceptance testing. The method may be employed in developmental work to evaluate new rubbers.'
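The way the cure parameters are read from a torque-time curve can be sketched as follows. This is an illustration under stated assumptions, not the procedure of ASTM D2084 itself: tc(x) is taken as the first time the torque reaches ML + (x/100)(MH - ML), the scorch time as the first time the torque has risen a fixed amount above ML, and values between recorded points are found by linear interpolation. The function names and the curve data are hypothetical.

```python
def first_crossing(times, torques, level, start=0):
    """First time after index `start` at which the rising torque reaches `level`."""
    for i in range(start + 1, len(times)):
        low, high = torques[i - 1], torques[i]
        if low < level <= high:
            fraction = (level - low) / (high - low)
            return times[i - 1] + fraction * (times[i] - times[i - 1])
    return None  # level never reached within the recorded curve


def cure_parameters(times, torques, x=90.0, scorch_rise=1.0):
    """ML, highest torque, scorch time and tc(x) from one cure curve.

    `scorch_rise` is the torque increase above ML that defines scorch
    (same units as `torques`); `x` is the percentage of full cure.
    """
    i_min = min(range(len(torques)), key=torques.__getitem__)
    ml = torques[i_min]
    mh = max(torques[i_min:])
    return {
        "ML": ml,
        "MH": mh,
        "ts": first_crossing(times, torques, ml + scorch_rise, i_min),
        "tc": first_crossing(times, torques, ml + x / 100.0 * (mh - ml), i_min),
    }


# Hypothetical curve: time in min, torque in dN m.
times = [0, 1, 2, 3, 4, 5, 6, 7, 8]
torques = [3.0, 2.0, 2.0, 4.0, 10.0, 16.0, 21.0, 22.0, 22.0]
print(cure_parameters(times, torques))
```

For a reverting or non-plateau curve (center and right panels of Fig. 10) the highest torque found this way corresponds to MHR or MH rather than MHF.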
REFERENCES
1. Hunter, J. S. (Nov. 1980). Science, 210(21), 869.
2. Miller, S. S. (1978). Env. Sci. Technol., 12, 18.
3. Cropper, W. V. (April 1986). ASTM Standardization News, 48.
4. ASTM D4483 (1985). Standard Practice for Rubber--Determining Precision for Test Method Standards. Obtainable from: ASTM, 1916 Race St, Philadelphia, PA 19103, USA.
5. ISO/TR9272 (1986). Rubber and Rubber Products--Determination of Precision for Test Method Standards. Obtainable from: International Standards Organization, Central Secretariat, Case Postale 56, CH-1211 Geneva 20, Switzerland.
6. Hess, E. H. (Jan. 1986). ASTM Standardization News.
7. Mandel, J. (Mar. 1977). ASTM Standardization News, 17.