PLAN OF THE FIRST PART
What is new in this book?
Message to the hurried reader
Chapter 1 * Basic definitions and notations
Chapter 2 * Summary of the logical structure of this set of theories
WHAT IS NEW IN THIS BOOK?
UNIFICATION OF THE FRENCH AND ENGLISH DEFINITIONS AND NOTATIONS
Since the publication of our 1982 text, Working Group No 9 of the ISO TC-183 committee (Sampling of copper, lead and zinc ores and concentrates) has convened several times under the chairmanship of Dr R J HOLMES (CSIRO, Clayton, VIC, Australia) and has studied various proposals made by J-M PUJADE-RENAUD, leader of the French Delegation, specifically
* unification of the English and French definitions and notations,
* use of the results of our sampling theory and estimation of the sampling variance by variographic analysis (to be defined later).
It seems that, for the first time ever, a standards committee is ready to take into account the existence of a sampling theory. Though the French proposals are not accepted yet (international standardization is a slow process), the situation is not so hopeless as it seemed to be a few years ago. We decided to follow J-M Pujade-Renaud's clever suggestion to unify the French and English definitions and notations. These will be presented in chapter 1 and integrated in our future developments. We would like to seize this opportunity to acknowledge our friend PUJADE-RENAUD's tenacity and to wish Dr R J HOLMES and his Committee good luck. If they succeed, they are likely to make history.
CHARACTERIZATION OF THE VARIOUS FORMS OF HETEROGENEITY
In our previous publications, heterogeneity was regarded only as a sampling error generator. Its properties were presented as part of the sampling theory. In this book the second part (chapters 3 to 5) analyses the concept of heterogeneity and presents the various forms it can take without any reference to sampling. This amounts to acknowledging the obvious. Heterogeneity does exist in itself and has consequences in domains other than sampling (e.g. process control and monitoring). It is therefore worth discussing separately from sampling.
In the following parts (chapters 6 to 23), sampling is treated as a selection process applied to a material. The heterogeneity of the material to be sampled may be characterized
* either as a scalar (a variance), when the lot to be sampled can be regarded as a population of units whose order is irrelevant,
* or by a function (a variogram), when the lot to be sampled can be regarded as a time series of units whose order is relevant.
GENERALIZATION OF THE HETEROGENEITY CARRIED BY A UNIT
In 1979 we introduced the concept of heterogeneity carried by the units which make up a population. In this book, as well as in the 1988 French book, we generalize this concept to the units making up a time series, which results in a considerable practical simplification. In our former approach, the variance of the selection error associated with the mathematical model of sampling from a moving stream involved no fewer than 15 parameters, five for each of the three describers (describing parameters) of unit $U_m$, i.e.
$M_m$: mass of solids in $U_m$,
$A_m$: mass of critical component A in $U_m$,
$a_m$: critical content of $U_m$, defined as the ratio $a_m \equiv A_m / M_m$.
We had never been satisfied by this complicated demonstration. The reasoning presented here seems much more satisfactory. It is based on describing each unit $U_m$, whether it belongs to a population or to a time series, by a single describer $h_m$, defined as "the heterogeneity carried by $U_m$". It describes all properties of $U_m$ which are relevant to its sampling. We no longer use the variograms of $M_m$, $A_m$ and $a_m$ but simply the variogram of $h_m$. This single describer of $U_m$ is defined (section 4.2.1.) as
$$h_m = \frac{a_m - a_L}{a_L} \cdot \frac{M_m}{M_{m^*}}$$
with
$a_L$: average critical content of lot L (weighted mean of the $a_m$),
$M_L$: mass of lot L,
$N_U$: number of units in L,
$M_{m^*}$: mass of the average unit $U_{m^*}$, with $M_{m^*} = M_L / N_U$.
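As an illustration only, the definition of $h_m$ can be evaluated numerically. The sketch below is not taken from the book: the function name `heterogeneity_carried` and the three-unit data set are hypothetical, and each unit is described simply by its mass $M_m$ and the mass $A_m$ of critical component it contains. Because $a_L$ is the mass-weighted mean of the $a_m$, the values of $h_m$ over a lot should sum to zero, which provides a convenient check.

```python
def heterogeneity_carried(masses, critical_masses):
    """Return the list of h_m for the units of a lot.

    masses[m]          -- M_m, mass of unit U_m
    critical_masses[m] -- A_m, mass of critical component in U_m
    (Hypothetical helper, written for illustration.)
    """
    if len(masses) != len(critical_masses) or not masses:
        raise ValueError("need two non-empty sequences of equal length")

    n_units = len(masses)              # N_U
    lot_mass = sum(masses)             # M_L
    lot_critical = sum(critical_masses)
    a_lot = lot_critical / lot_mass    # a_L, mass-weighted mean content
    mean_unit_mass = lot_mass / n_units  # M_m* = M_L / N_U

    h = []
    for m_m, a_mass in zip(masses, critical_masses):
        a_m = a_mass / m_m             # critical content of the unit
        h.append((a_m - a_lot) / a_lot * m_m / mean_unit_mass)
    return h


if __name__ == "__main__":
    # Hypothetical lot of three units (masses in arbitrary units).
    h_values = heterogeneity_carried([10.0, 20.0, 30.0], [1.0, 4.0, 3.0])
    print(h_values)
    print("sum of h_m:", sum(h_values))
```

The same function applies whether the units form a population or a time series; only the later treatment (variance or variogram of the $h_m$) differs.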
When breaking up the total error associated with the one-dimensional mathematical model, formerly called CE and now called the integration error IE, the quality fluctuations (content $a_m$) and the quantity fluctuations (mass $M_m$) are no longer separated but are incorporated into the definition of $h_m$. In our 1979/82 books, we broke up CE into a sum of two terms
$$\mathrm{CE} = \mathrm{QE} + \mathrm{WE}$$
with
QE: quality fluctuation error, estimated by introducing the hypothesis $M_m = M_L / N_U = \text{constant}$, irrespective of the subscript m,
WE: quantity fluctuation or weighting error, defined as the difference between CE and QE.
In a second step we broke up QE into a sum of three terms
$$\mathrm{QE} \equiv \mathrm{QE}_1 + \mathrm{QE}_2 + \mathrm{QE}_3$$
with
$\mathrm{QE}_1$: short-range quality fluctuation error,
$\mathrm{QE}_2$: long-range (non-periodic) quality fluctuation error,
$\mathrm{QE}_3$: (long-range) periodic quality fluctuation error.
These corresponded to the three components of the deviation $(a_m - a_L)$
$$(a_m - a_L) \equiv a_{m1} + a_{m2} + a_{m3}$$
In the same way, the study of all examples at our disposal showed that it was always possible to break up $h_m$ into a sum of three terms
$$h_m \equiv h_{m1} + h_{m2} + h_{m3}$$
with
$h_{m1}$: short-range heterogeneity fluctuations,
$h_{m2}$: long-range (non-periodic) heterogeneity fluctuations,
$h_{m3}$: (long-range) periodic heterogeneity fluctuations.
In the new approach presented here, we directly write
$$\mathrm{IE} \equiv \mathrm{IE}_1 + \mathrm{IE}_2 + \mathrm{IE}_3$$
with
$\mathrm{IE}_1$: short-range heterogeneity fluctuation error,
$\mathrm{IE}_2$: long-range (non-periodic) heterogeneity fluctuation error,
$\mathrm{IE}_3$: (long-range) periodic heterogeneity fluctuation error.
These are generated by the corresponding components of $h_m$.
The study of hundreds of experimental variograms and of a number of simulations showed that, as a very general rule, the parameter $h_m$ and its three components inherit the variographic properties of the content $a_m$ and its three components, the fluctuations of $M_m$ being practically irrelevant. This observation confirms the minor role of the weighting error WE already underlined in our previous texts. This point is developed in the fifth part (chapters 13 to 16).
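For readers who wish to experiment before the variogram is formally introduced, the following minimal sketch computes an experimental variogram of a series of $h_m$ values. It assumes the conventional estimator $v(j) = \frac{1}{2(N_U - j)} \sum_m (h_{m+j} - h_m)^2$ with equally spaced units; the book's own definition is given in later chapters and may differ in detail. The function name `experimental_variogram` and the test series are hypothetical.

```python
def experimental_variogram(h, max_lag):
    """Return {j: v(j)} for lags j = 1..max_lag.

    h       -- sequence of h_m values, in chronological order
    max_lag -- largest lag (in number of units) to evaluate
    Assumes equally spaced units and the conventional estimator
    v(j) = sum((h[m+j] - h[m])**2) / (2 * (N - j)).
    """
    n = len(h)
    if max_lag < 1 or max_lag >= n:
        raise ValueError("max_lag must satisfy 1 <= max_lag < len(h)")

    variogram = {}
    for lag in range(1, max_lag + 1):
        pairs = n - lag  # number of pairs separated by this lag
        total = sum((h[m + lag] - h[m]) ** 2 for m in range(pairs))
        variogram[lag] = total / (2.0 * pairs)
    return variogram


if __name__ == "__main__":
    # Hypothetical, strictly alternating series: a period of two units.
    series = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
    print(experimental_variogram(series, max_lag=3))
```

Applying the same function to the series of $a_m$ and to the series of $h_m$ for a given lot is one simple way to compare their variographic behaviour.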
EMPHASIS PUT ON SAMPLING CORRECTNESS
The sampling of a certain lot L is said to be correct when
* it gives all units belonging to the lot a uniform probability of being selected, and
* the integrity of increments and sample is respected.
The sampling of L is said to be incorrect when at least one of these conditions is not fulfilled. The emphasis on correctness is not new. It has been developed in all our books since 1975. But the achievement of sampling correctness is of paramount importance and, despite our efforts, it is too often overlooked. For these reasons, we decided to increase our emphasis on this very important point by deriving the conditions of sampling correctness in the fourth part (chapters 9 to 12), before the development of the mathematical models in the fifth and sixth parts (chapters 13 to 21). Furthermore, the results of these models are devoid of any validity if the conditions of correctness are not adequately fulfilled. We have extended the concept of sampling correctness to include maintenance of the sample integrity. The preparation errors PE that were treated separately in our former publications are also treated in the fourth part of this book.
STRUCTURE AND CIRCUMSTANCES * CORRECTNESS AND ACCURACY
For the first time, we distinguish the properties relevant to the structure of the sampling process from those relevant to the circumstances, e.g. the properties of the material. This requires a few definitions and explanations. A given property or a given statement is said to be
* Structural, when it follows logically from conditions which we are in a position to control and that, when fulfilled, remain time-stable. These conditions do not depend upon the properties of the material to be sampled.
* Circumstantial, when, on the contrary, it depends upon the circumstances, i.e. on conditions which we are not in a position to control, such as the properties of the material to be sampled.
These definitions may seem abstract. To illustrate their use, we shall apply them to the concepts of correctness (section 1.7.1.) and accuracy (section 1.7.2.). Sampling correctness is, for a given maximum particle size, a structural property of the sampling equipment. It does not depend upon the circumstances. We will show that sampling correctness always leads to accuracy, irrespective of the circumstances. When accuracy results from sampling correctness, but only then, is accuracy a structural property.
But the situation is not symmetrical and the reverse is not true. A deviation from correctness does not always lead to bias. An incorrect sampling system may very well deliver positively biased samples today, negatively biased samples tomorrow and practically unbiased samples, i.e. accurate samples, some time next week. When sampling is incorrect, accuracy is circumstantial. When accuracy is observed with an incorrect sampling system, it cannot be relied upon because there is no reason for it to be time-stable.
Anticipating future conclusions, structural correctness is a necessary and sufficient condition of structural, reliable accuracy. On the other hand, accuracy is a necessary and sufficient condition of commercial equity (with a few exceptions that are irrelevant in this context). We must therefore conclude that structural correctness of the sampling and auxiliary equipment, which alone implies structural accuracy, is a necessary and sufficient condition of commercial equity and, more generally, of sampling reliability.
This indisputable conclusion is of great importance in commercial sampling. It shows that the bias tests proposed by certain standards have essentially no value according to simple logic. One has absolutely no right to generalize the conclusions from a circumstantial observation of accuracy. This point will be developed in chapter 30. Our main purpose, in this section, is to illustrate new and subtle definitions.
POINT-BY-POINT COMPUTATION OF AUXILIARY FUNCTIONS, ERROR-GENERATING FUNCTIONS AND SAMPLING VARIANCES
To estimate the sampling variance, we have found it convenient to introduce what we call the auxiliary and error-generating functions, derived from the variogram of $h_m$ and linking the sampling variance to the latter. So far, geostatisticians (see for instance David, 1977/88) as well as ourselves (1979/82) have, more or less successfully, fitted the points of sample variograms to a predetermined model which, in chronostatistics, was the parabolic model (with the linear and flat models as particular cases). From this model we calculated the first and second integrals (called the auxiliary functions of the variogram) as well as combinations of these, which we now call the error-generating functions, to which the sampling variances are proportional. These functions are defined in chapter 5. In the current text, we present variogram modelling, but only to emphasize the interest of a new method which we recommend. This is point-by-point computation of the auxiliary and error-generating functions. As far as we can judge, this is an original contribution to the practical use of variograms for calculating sampling error variances.
Fitting variogram models has repeatedly proved hazardous, especially with variograms containing one or several periodic components with non-strictly uniform shape, period and amplitude, but also with non-periodic variograms containing an important residual component. The error then committed in the estimation of the sampling variance is related to the inadequacy of the model and can be so important as to deprive the estimation of any practical value.
NOTE FOR THE ATTENTION OF GEOSTATISTICIANS. Although periodic variograms are practically unknown in geostatistics, important residual components are very often observed in geovariograms and this new method should interest geostatisticians as well as chronostatisticians and users of this sampling theory. In geostatistics, it could be generalized to functions other than the variogram.
The point-by-point computation of auxiliary functions, error-generating functions and sampling variances is much more satisfactory than fitting models to variograms. It suppresses the hazards of variogram modelling and is easy to computerize. The corresponding technique (section 5.8) is based on an estimate and two hypotheses,
* the estimate is the variogram intercept $v(0)$, which is also required with variogram modelling techniques, irrespective of the model,
* the hypotheses are (1) the unknown variogram contains all experimental information, (2) the variogram is linear between consecutive points.
We cannot imagine simpler hypotheses nor hypotheses which agree more closely with the experimental results.
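Under the second hypothesis (a variogram that is linear between consecutive points), integrals of the variogram can be accumulated segment by segment in closed form. The sketch below is an illustration under stated assumptions, not the procedure of section 5.8. It assumes unit lag spacing, takes the first auxiliary function as the running mean $w(j) = \frac{1}{j}\int_0^j v(j')\,dj'$ and the second as $w'(j) = \frac{2}{j^2}\int_0^j j'\,w(j')\,dj'$, and uses a hypothetical function name `auxiliary_functions`. The normalizations actually used in the book are those defined in chapter 5.

```python
def auxiliary_functions(v):
    """Point-by-point first and second integrals of a variogram.

    v[k] is the variogram at lag k (unit lag spacing); v[0] is the
    estimated intercept v(0). The variogram is assumed to be linear
    between consecutive points, so each segment integrates exactly.

    Returns (w, w2) where, by assumption,
      w[k]  = (1/k)      * integral_0^k v(t) dt
      w2[k] = (2/k**2)   * integral_0^k t * w(t) dt
    and w[0] = w2[0] = v[0] (limit as the lag tends to zero).
    """
    n = len(v)
    if n < 2:
        raise ValueError("need v(0) and at least one experimental point")

    first = [0.0] * n   # S(k) = integral of v from 0 to k
    second = [0.0] * n  # T(k) = integral of S from 0 to k
    for k in range(1, n):
        # Trapezoid rule is exact for a linear segment of v.
        first[k] = first[k - 1] + 0.5 * (v[k - 1] + v[k])
        # On the segment, S is quadratic; its exact integral is
        # S(k-1) + v(k-1)/3 + v(k)/6.
        second[k] = second[k - 1] + first[k - 1] + v[k - 1] / 3.0 + v[k] / 6.0

    w = [v[0]] + [first[k] / k for k in range(1, n)]
    # t * w(t) = S(t), hence integral_0^k t * w(t) dt = T(k).
    w2 = [v[0]] + [2.0 * second[k] / k ** 2 for k in range(1, n)]
    return w, w2


if __name__ == "__main__":
    # Hypothetical variogram values: intercept followed by three lags.
    w, w2 = auxiliary_functions([0.2, 0.5, 0.7, 0.8])
    print("w :", w)
    print("w':", w2)
```

No model is fitted at any stage: the only inputs are the estimated intercept and the experimental points themselves, which is the practical appeal of the point-by-point approach.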
ESTIMATION OF A MASS OR A VOLUME BY PROPORTIONAL SAMPLING
As far as we can judge, this is also an original contribution. It should improve this branch of metrology at industrial or pilot scale and provide a practically foolproof checking system. Weighing can be a very precise measurement, but only when it is carried out by batch weighing. Volume measurement is seldom precise, even with batches. Continuous measurement of masses or volumes at industrial or pilot scale is definitely not as accurate and reproducible as it should be, and certainly not as claimed by manufacturers. This statement specifically covers all types of belt-scales, volumeters, electromagnetic flowmeters, gamma-densimeters and the like, whose accuracy, even when they are frequently calibrated, is highly questionable. The reader should know that sampling offers an accurate and cheap solution.
The principle of proportional sampling was presented in a journal paper (1981) that may easily have escaped attention. Since then we have successfully implemented and tested this technique on various occasions. In a series of tests designed to assess the reliability of proportional sampling, several defects of conventional weighing or volume-measuring systems (strain-gauges, water-meter), which were supposed to provide accurate results, were actually detected thanks to the accuracy and reproducibility of proportional sampling.
Proportional sampling merely consists in what we would call a hyper-correct sampling system (correct plus safety factors). It provides a highly representative sample which can be used for quality estimation (all kinds of analyses and assays) and whose quantitative attributes (mass and/or volume) are proportional to those of the lot submitted to sampling, with a sampling ratio which is accurately known and time-stable. By dividing the sample mass (which can be easily and accurately measured on conventional, mechanical scales) by the sampling ratio (which can be easily computed and remains time-stable) we obtain an unbiased and reproducible estimate of the lot mass or volume. This original and interesting technique is presented in chapter 29.
THEORY OF ONE-DIMENSIONAL HOMOGENIZING OR "BED-BLENDING"
Bed-blending is the industrial name for what theory regards as one-dimensional homogenizing. It has a huge industrial potential which, as far as we know, only the cement industry has fully acknowledged to date. We are convinced it has a big future tied to the development of process control and monitoring. We developed this theory at the request of a French manufacturer of bed-blending equipment who entertained serious (and healthy) doubts as to the validity of the formulas in use to compute the performances of bed-blending facilities.
This theory, which is directly derived from sampling theory, has been presented only in two magazine articles (1981 and 1982). Its development, and the industrial experiment carried out to check its validity, are the subject of chapter 35.
MESSAGE TO THE HURRIED READER
The reader is perfectly entitled to be in a hurry. He can also choose to ignore our work. Most people do. But if he is willing to spend a few hours, perhaps a few minutes only, to get acquainted with the theories of heterogeneity, sampling and homogenizing, we would suggest reading the following chapters, which present these theories and their results in a condensed way. This would be much better than total ignorance of the subject.
Chapter 2 reduces these theories to their logical backbone. Everyone should read it, if only to discover that there are numerous forms of heterogeneity; to understand how each form of heterogeneity generates a specific component of sampling error; or to learn what can be expected from a bed-blending system.
Chapter 24 summarizes the properties of all components of the total sampling error. The reader should refer to this chapter to learn how some of these components can be completely cancelled at no significant extra cost, how others can be controlled in practice and how sampling accuracy, reproducibility, representativeness and commercial equity can be achieved by the use of correct sampling devices.
Chapter 25 gives an overview of practical sampling problems and explains that a certain number of these are unsolvable, at least for an acceptable cost. For those sampling problems which happen to be solvable, it outlines the possible solutions.