Journal of Archaeological Science 1982, 9, 287-298
Sampling Seeds

Marijke van der Veen(a) and Nick Fieller(b)

Problems of sampling carbonized plant material are discussed. Firstly, the problem of actually selecting a sample in the laboratory is considered, and some experiments which investigate various procedures are described. Secondly, the statistical aspects of determining optimal sample sizes are considered. Formulae are given for calculating optimal sample sizes and confidence intervals. Upper bounds, which are independent of the total population size, are provided for the sample size required to achieve any desired accuracy.

Keywords: RANDOM SAMPLING, SEEDS, CARBONIZED PLANT REMAINS, SAMPLE SIZE DETERMINATION, SAMPLE SELECTION, CONFIDENCE INTERVALS, PALAEOBOTANY, ARCHAEOBOTANY.
Introduction

Although archaeologists have realized that their conclusions about past human life were based upon small and sometimes inadequate samples, it was not until the late 1960s that sampling became an issue of major concern. It has become increasingly necessary to work out a detailed and adequate sampling strategy, since the total survey of a region, the total excavation of a site and the total analysis of an assemblage have become virtually impossible (Mueller, 1975).

Palaeobotanists have not, until now, taken much part in the general discussion on sampling strategies. Until recently, palaeobotany was mainly concerned with qualitative statements about the kinds of plants that were exploited and with their phylogenetic history. Slowly, however, the emphasis is changing towards a more economic approach, in which the need for quantitative evidence is felt more strongly.

A major development in the history of palaeobotany was certainly the improvement of the retrieval methods of plant remains from archaeological sites. Various “seed machines” have been developed and, although they differ in cost, efficiency and processing times (Cherry, 1978; Keeley, 1978), they all have in common an enormous increase in the amount of soil being processed for plant remains, and consequently an enormous increase in the amount of material to be analysed. This is already beginning to cause problems in laboratories, as it becomes impossible, both in terms of time and money, to analyse all the samples.

(a) Department of Prehistory and Archaeology, University of Sheffield, Sheffield, S10 2TN, England.
(b) Department of Probability and Statistics, University of Sheffield, Sheffield, S3 7RH, England.

0305-4403/82/020287+12 $03.00/0 © 1982 Academic Press Inc. (London) Limited
It is extremely important that in the future a research design for analysing plant remains be developed and incorporated in the excavation strategy, so as to prevent laboratories from being flooded by unnecessary samples. On the other hand, excavations are unrepeatable, so the quantity of samples taken on site should not be reduced to an absurd minimum. Consequently, it is worthwhile trying to develop a method of speeding up the laboratory analysis.

The recent literature on sampling in archaeology is mainly concerned with the sampling of a region or a site (Redman, 1974; Mueller, 1975; Cherry et al., 1978). Very little work has been done on the problems involved in sampling assemblages. In this article, a first attempt is made to develop a standard procedure for sampling carbonized plant remains, the objective being to analyse no more material than is necessary to achieve a specified accuracy and to be able to assess whether the material brought back from the site is sufficient to claim reliable results.

Two distinct problems are dealt with in this paper. Firstly, there is the problem of actually selecting a sample from a bag of carbonized plant remains and, secondly, there is the problem of determining the size of the subsample needed to ensure acceptable reliability of the results. This latter problem emphasizes the importance of specifying clearly the objective of the sampling experiment.

Sampling Procedures
In the literature, we can distinguish three fundamental procedures for selecting samples of carbonized plant material, namely “grab sampling”, “cumulative sampling” and “random sampling”. In many cases, however, the method used is not stated precisely. For example, Buurman (1979) says that all samples were sorted completely, except for one, of which only a quarter was analysed. Hopf & Schubart (1965) say that they examined 23.0 g of a total sample weight of 50.8 g. Dennell (1974) states only that “1000 grains from each deposit were examined”. Again, Van Zeist (1968) lists several samples which have only been partly analysed. Often no indication is given of either why a subsample was taken or, more importantly, how the appropriate size of the subsample was calculated and how the subsample was physically selected from the total material. We feel that these are serious deficiencies. Without such information it is not possible to evaluate the reliability of any results obtained.

It may be that the method of sampling used in some studies such as these is what has come to be known as “grab sampling” or “haphazard selection” (Redman, 1974; Mueller, 1975). A subsample taken in this way cannot be regarded as being truly representative of the whole, nor can it be termed a “random sample” in the conventional statistical sense outlined below. Although probably the most frequently used of the three procedures, it is the least satisfactory. There is no standard way of taking the subsample nor any guide for its size; consequently there is no way of knowing how “representative” the sample is of the whole nor how accurately calculations made on the sample apply to the total.

The second procedure of sampling carbonized plant material is what we shall term “cumulative sampling”. It was used by Fasham & Monk (1978) and Green (1979) in a slightly different context.
The procedure is to divide the bag of material to be examined randomly into subsamples of a conveniently small size and arrange these in a random order. Each subsample is examined in turn and the presence, relative frequencies (and whatever other features are of current interest) of the different species are recorded. As the subsamples accumulate, the running totals or proportions are calculated and plotted on a graph. For the first few subsamples the graph will fluctuate wildly but will eventually “settle down”, so that examination of further subsamples hardly alters the current
estimate of species abundance or relative frequency of any particular species. Typically, this “levelling off” of the graph will occur well before all the small subsamples have been examined. At this stage sorting can stop and the remaining subsamples can be left unsorted. The attraction of this method is that it has a built-in device for determining the size of the sample required. The disadvantages are that, firstly, it leaves open the question of the size of the subsamples. If they are too small then the organizational problems become severe; if they are too large then more material is sorted than is necessary (the last few samples sorted in this procedure are in effect wasted since they only provide the information that the graph has stabilized). Secondly, it is not at all clear to what extent the results on sample size are applicable to other samples (Cherry, 1978). That is, a different total sample size may be needed on later occasions, even when sampling similar material. Thirdly, although the method automatically provides a sample of adequate size to ensure reasonable accuracy of estimation, it is often preferable to have forewarning of the likely size of the sample, particularly if this is unacceptably large. It is better to modify overambitious requirements of accuracy than to expend fruitless effort in sorting large numbers of subsamples. The third procedure is probabilistic or random sampling. A simple “random sample” is one selected so that all potential samples of that size have an equal chance of being selected. Samples chosen randomly should avoid any bias introduced by conscious or subconscious human choice. Further, the accuracy of estimates based on random samples can be quantitatively assessed by calculation of the standard errors of the estimates, and hence of confidence intervals for the features of interest. 
This facility is also available when using the cumulative sampling procedure, but the distinction is that with random sampling it is possible to calculate in advance how large a sample is required to achieve a given accuracy. Formulae for calculating confidence intervals and optimal sample sizes are given below.

It should be emphasized here that accuracy can only be assessed probabilistically. It is only possible to make statements of accuracy such as “this confidence interval is ninety-five per cent certain to contain the true value of the relative frequency of this species.” It is never possible to claim from a random sample that a population parameter is definitely within a fixed interval. By the same token, it is only possible to calculate in advance the size of sample which will give a reasonable chance of attaining the required accuracy. Of course, it is always possible that a sample, however “randomly chosen” it was, will not be representative of the whole population. The essential objective of statistical sampling methods is to minimize such a risk.

Experiments

Purpose
Since we want to apply random sampling procedures to bags of carbonized plant material, it is necessary to give some thought as to what techniques can be used to obtain samples which may be considered to be “random samples” in the statistical sense outlined above. We compared three candidate sampling methods to see whether any one of them was noticeably deficient. In particular, did any one of them produce more unrepresentative samples than could reasonably be expected by pure chance, and further, did any of the methods produce samples which followed the general characteristics of “random samples”?

Methods
The three methods we compared were the “spoon”, “riffle-box” and “grid” methods. These three methods are certainly not the only ones that might be devised for laboratory
work, but they illustrate well the three principal categories of practical sampling methods; the first is a “rough and ready” method that makes no specific attempt at achieving randomness objectively, the second relies on a mechanical device to select a random sample and the third is more formal and requires the use of random sampling numbers. The spoon method is simply to take a spoonful of carbonized material, after having mixed the material “thoroughly”. The number of spoonfuls taken is determined by the size of the sample required. This method might be expected to be unreliable, since its accuracy would heavily depend on the homogeneity of the contents of the bag. If the bag has been stored for a long time, layering of the seeds will have occurred (small seeds go to the bottom). If not thoroughly stirred, arbitrarily selected spoonfuls might be unrepresentative. Also, if the material is composed of particles and fragments of widely varying sizes, the larger sized particles are more likely to escape from the delving spoon. The second method entails the use of a riffle-box. Riffle-boxes are sample dividers, commonly used by soil scientists to divide a soil sample into two equal and representative portions. The riffle-box consists of two rectangular metal containers. On top of these is a metal lid with slots leading alternately to the two containers. The particular riffle used for this experiment had twelve slots, each of width 13 mm, allowing a maximum particle size of about 10 mm. This size was appropriate to the material we were sorting; a larger size would be needed for material more heterogeneous in size. The material to be sampled is poured through the riffle and divided into two portions. One of these is chosen arbitrarily and riffled again, and so on until the required sample size is obtained. The third method (the “grid-method”) is as follows. The sample is spread over a large sheet of paper on which a grid of squares is drawn. 
Random numbers are used to select squares at random, and the material on each selected square is removed and put towards the subsample. Sufficient squares are selected to achieve a subsample of the desired size. For this experiment 7 cm squares were used, but smaller or larger ones could be appropriate with smaller or larger amounts of material.

Data
For the experiments we used a bag of carbonized material to which modern seeds had been added, sacrificing possible archaeological realism for the sake of the ability to identify and count the seeds quickly, and hence the ability to perform a large number of separate sampling selections. The differences between modern and carbonized seeds in terms of behaviour in the sampling experiments should be negligible, and so the results obtained should be applicable to completely archaeological material.

We used five different sample compositions, each with a variety of numbers of species. Each of the three methods was applied five times on each of the five compositions. In each case the required subsample was taken, then sorted and the numbers of seeds of each species in the subsample counted. The material was then returned to the bag. The five species involved were Triticum aestivum (wheat), Lens culinaris (lentil), Hordeum hexastichum (barley), Vicia sativa (vetch) and Solanum nigrum (black nightshade) (denoted by W, L, B, V and N respectively). Their measurements were roughly 8 x 3, 4 x 4, 13 x 4, 5 x 3.5 and 1.5 x 1 mm. The numbers of seeds in the five compositions are given in Table 1. The sampling fractions were chosen to give “good” chances of “acceptable” estimation errors, following the procedures outlined below (p. 131).

Table 1. Compositions of experiments

Composition     W     L     B     V     N    Total   Sampling fraction (%)
1              25    75     0     0     0     100    37.5
2              25    75   150     0     0     250    25
3             200    75   150    10     0     435    25
4             755    75   150    20     0    1000    10
5             755    75   150    20    25    1025    10

Results

As our working hypothesis, we assumed that the three methods did indeed produce random samples in the probabilistic sense, that is, that all samples of the same size were equiprobable. The data were tested to see to what extent they cast doubt on this assumption: did any of the three methods produce samples (either for one particular type of seed or for one particular composition) which, on the basis of the working hypothesis, were unlikely to have occurred? Did any of the methods produce samples where the observed proportions were improbably greater or smaller than those expected?

For each of the seventy-five experiments, the probability, Pr, of obtaining the proportion actually observed was calculated. (We use the terminology Pr and Pr-value in this context, rather than the more conventional P and P-value, to avoid confusion with the observed proportion (p) in the sample.) Very small values of Pr (near to zero) indicate that the observed proportion is improbably far from that expected (i.e. the actual proportion in the bag); large values indicate that the observed proportion is compatible with that expected.

The first stage in the analysis was to compute the Pr-values for each species in each experiment. These Pr-values were then combined, using “Fisher’s Method” for combining significance levels, to produce overall measures for each of the three sampling methods when applied to each of the species. Fisher’s method of combining Pr-values gives weight to the very extreme results, so that the overall measure highlights a sampling method which produces too many very extreme results. That is, the overall measure reflects the number and extent of the very inaccurate results obtained when using the method.

The results of these calculations, performed for wheat, lentils and barley, are given in Table 2. The calculations for vetch and nightshade are not given here as they figured in only a minority of the experiments and the small proportions make the approximations involved in Fisher’s method less accurate. The full and detailed results of all the experiments are given in Van der Veen (1980).
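As an illustration of the combination step, Fisher's method can be sketched in a few lines of Python. This is a minimal sketch that assumes the individual Pr-values are independent; the function name `fisher_combined` and the example values are hypothetical, and the code is not the program used for the experiments. It relies on the standard result that, under the working hypothesis, -2 Σ ln Pr follows a chi-squared distribution with 2k degrees of freedom (k being the number of values combined), whose upper tail has a closed form when the degrees of freedom are even.

```python
import math

def fisher_combined(p_values):
    """Combine independent P-values by Fisher's method.

    X = -2 * sum(ln p_i) is chi-squared with 2k degrees of freedom.
    For even degrees of freedom the upper tail probability is
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)**i / i!.
    """
    k = len(p_values)
    if k == 0 or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("need at least one P-value in (0, 1]")
    half_x = -sum(math.log(p) for p in p_values)  # this is X/2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i          # (X/2)**i / i!
        total += term
    return math.exp(-half_x) * total

# Hypothetical values, for illustration only.
print(round(fisher_combined([0.5, 0.5]), 3))  # 0.597
# A single very small value pulls the combined measure down sharply.
print(fisher_combined([0.0002, 0.6, 0.7]) < fisher_combined([0.2, 0.6, 0.7]))
```

The second line of the usage shows the property noted above: one very extreme result dominates the combined measure, which is why the method highlights a sampling procedure that produces occasional gross inaccuracies.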
The spoon-method produced several samples that were more extreme than would usually be expected in such experiments. This was particularly noticeable in the cases of wheat in composition 3 (Pr = 0.04), lentils in composition 5 (Pr = 0.04), and especially barley in composition 3 (Pr = 0.0002). Measured over all five compositions, the spoon-method produced the most extreme results, whatever the species, the most extreme being for barley (Pr = 0.001).

The riffle-method consistently produced samples which were entirely compatible with the working hypothesis of randomness. No significant deviation from randomness was detected.

The grid-method also produced very few extreme results. In fact, it could almost be said to be suspiciously accurate, since for purely random samples we would expect rather more variation than that observed.

Table 2. Overall Pr-values of sampling methods

Composition   Method    Wheat     Lentils   Barley
1             Spoon     0.65      0.65      -
              Riffle    0.82      0.82      -
              Grid      0.88      0.88      -
2             Spoon     0.88      0.61      0.27
              Riffle    0.29      0.24      0.052
              Grid      0.69      0.93      0.67
3             Spoon     0.04*     0.09      0.0002**
              Riffle    0.27      0.09      0.69
              Grid      0.88      0.12      0.70
4             Spoon     0.15      0.44      0.13
              Riffle    0.83      0.87      0.68
              Grid      0.32      0.12      0.67
5             Spoon     0.66      0.04*     0.40
              Riffle    0.90      0.30      0.86
              Grid      0.74      0.49      0.97
Overall       Spoon     0.36      0.14      0.001*
              Riffle    0.82      0.40      0.54
              Grid      0.93      0.47      0.96

* and ** indicate extreme and very extreme results.

Conclusion

The results of these experiments give some indication of the likely reliability of the three methods in practical applications with carbonized material. The samples produced by the spoon-method showed clear deviations from randomness. The number of grossly inaccurate estimates of the proportion of seeds in the bag obtained in the 25 samples taken by this method was many more than could be attributed to mere chance. Whether this reflected poor technique in wielding the spoon or whether it shows inherent deficiencies in the method is, of course, open to argument. We feel that it does at least provide reasonable cause for doubt as to the adequacy of the spoon-method, and that less suspect methods are to be preferred.

The riffle-method showed the most consistently reliable results throughout the experiment. No grossly inaccurate estimates were obtained using this method in the 25 trials, and the results, judged in total, were quite compatible with the hypothesis of randomness. The method is easy and quick to operate, little more difficult and time-consuming than the spoon-method. Of course, the cost of even a moderate-sized riffle-box is considerably more than that of a spoon, but it seems a worthwhile investment.

The results obtained by the grid-method are a little more difficult to explain. Ostensibly they were even more “accurate” than those obtained by the previous method. However, in a sense, they were “too good to be true”. The very high overall Pr-values (0.93 and 0.96 for wheat and barley respectively) indicate that rather fewer moderately extreme results occurred than would be expected by chance. We are not quite sure how to explain the surprisingly high accuracy of the grid-method. It is, however, probably related to the fact that people will, subconsciously, always try to spread the material evenly over the paper, resulting in an even distribution of the seeds over the paper (assuming that the content of the bag was thoroughly mixed). This obviously needs to be tested further, and until then it would be unsafe to assume similar accuracy would be obtained in general. It should be noted that this method was considerably more time-consuming than either of the other two, and so it is not really a serious competitor.
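The judgement that a method is “too good to be true” rests on how much scatter purely random sampling should produce. The following is a minimal simulation sketch of that expected scatter. The bag composition, sample size, number of repetitions and function names are illustrative assumptions, not the experimental data, and simple random sampling without replacement (the hypergeometric model) is assumed.

```python
import math
import random
import statistics

def simulate_proportions(n_target, n_total, sample_size, repeats, seed=0):
    """Draw simple random samples (without replacement) from a bag and
    return the observed proportion of the target species in each."""
    rng = random.Random(seed)
    bag = [1] * n_target + [0] * (n_total - n_target)
    return [sum(rng.sample(bag, sample_size)) / sample_size
            for _ in range(repeats)]

def theoretical_sd(n_target, n_total, sample_size):
    """Standard deviation of the sample proportion under simple random
    sampling without replacement (hypergeometric model)."""
    P = n_target / n_total
    fpc = (n_total - sample_size) / (n_total - 1)  # finite population correction
    return math.sqrt(P * (1 - P) / sample_size * fpc)

# Hypothetical bag: 200 seeds of the species of interest among 435,
# sampled 2000 times at roughly a 25% sampling fraction.
props = simulate_proportions(200, 435, 109, repeats=2000)
print(statistics.mean(props), statistics.stdev(props))
print(theoretical_sd(200, 435, 109))
```

A method whose repeated samples scatter markedly less than the theoretical value is not behaving like simple random sampling, even when its estimates happen to be accurate.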
Choice of Sample Size

Objective
When a palaeobotanist is presented with many large bags of carbonized material, containing thousands of seeds, the desirability of sampling is readily apparent. However, when only a small amount of material is brought back from the site to the laboratory, it is important to assess whether sampling is desirable and indeed whether the amount of material to hand in the laboratory (the largest possible “sample”) is sufficient to provide worthwhile information on the composition of the material at the site.

In this section we consider the problem of determining in advance the size of the random sample that needs to be taken for a given accuracy, and we discuss how this reflects on the quantity of material to be brought back from the site. The objective is to analyse no more material than necessary.

Method
We argued above for the use of random sampling methods rather than any other, and present below various formulae for calculating the required random sample sizes and confidence intervals under differing circumstances. These and their derivations are available in many statistical texts (e.g. Barnett, 1974; Cochran, 1963), but they are collated here for easy reference.

The required sample size (n) depends upon four variables. The first is the number of seeds (N) in the “target population”. In some cases, the target population (about which inferences are to be made) will be of finite size, for example the seeds in a storage jar. In such cases the formulae (1), (2) and (3) below are appropriate. However, typically the target population will be the totality of seeds in a specific context or deposit, and thus N will be very large and, in terms of the practical use of the formulae, effectively infinite. In these more common cases the alternative formulae (1a), (2a) and (3a) are appropriate.

The other three variables involved in the equations are the proportion in which the particular species occurs (P), the accuracy in absolute terms that is required (d), and the chance of achieving that accuracy, denoted by (1-α). Both d and (1-α) are at the choice of the experimenter. Specification of the desired accuracy and how “reasonably sure” one should be of achieving this accuracy naturally depends upon the importance of the analysis. Commonly specified chances of obtaining the desired accuracy are 90, 95 and 98%, the higher values being appropriate when it is particularly important to achieve the required accuracy. Of course, specifying this chance as only 90% does not mean that nine out of ten estimates will be within the desired limits and the tenth will be totally inaccurate, but rather that nine out of ten on average are guaranteed to be within the limits and the tenth may or may not be.
If on the tenth occasion the estimate is not quite to the required accuracy, it will usually not be too far away. With regard to the desired accuracy, we feel that estimating a percentage content to within 5% (in absolute terms) should be adequate in most common applications, and to within 2% only for the more exacting experiments. These are the values we have chosen to tabulate (Table 4); equivalent results for other values can readily be obtained from the formulae.

Effective use of these formulae requires use of reasonably good estimates of the total number of seeds in the target population (N) and the actual proportion (P) of the particular species in this population. This is a manifestation of the so-called “sampling paradox”. Usually, however, the experimenter has some experience of similar or comparable material and so knows at least the order of magnitude of the number and proportion involved. Even if this is not available, then useful guides can still be given to the sample
size, which are guaranteed to provide at least the required accuracy of estimation, but which may possibly be rather larger than necessary and hence wasteful of resources. If the total number of seeds (N) is known to be large (say a few thousand or more) then the dependence on the actual number is minimal and the “large sample” versions of the formulae (i.e. 1a, 2a and 3a) may be used. The first two of these provide upper bounds on the sample size required to achieve a desired level of accuracy. The third provides a conservative confidence interval for the proportion of the species in the population.

We turn now to the problem of providing an estimate of the true proportion of the species in the material. If no relevant experience is available there are two possible courses of action. The first is to proceed as if the proportion were 50% and use formulae (2) or (2a), the sample size so calculated being necessarily larger than that for the true proportion (see e.g. Cochran, 1963). The second is to take a pilot sample of size determined by formulae (1) or (1a), making the best available guess at the true proportion and “topping up” if the calculated confidence interval for the true proportion is wider than stipulated. In some cases the context of the sample can function as a guide for the preliminary estimation, see for example Dennell (1974, 1976), though it must be recognized that the situation is not always as clear cut as he describes. In many cases a sample is not directly derived from one specific event, but rather from an on-going attritional process, resulting in a greatly mixed and highly variable composition.

Table 4 gives some examples of the required sample sizes for a variety of values of the total number of seeds, the true proportion, the desired accuracy and the chance of meeting that accuracy.
The table illustrates that the largest sample size is required when the true proportion is near to 50%, and that as the total number of seeds in the target population increases, the actual number of seeds that needs examination to achieve a given degree of accuracy tends to a finite limit. This is an important feature to which we shall return later.

It should be pointed out that the formulae given below apply to the problem of estimating the relative abundance of a single species in the material. Typically, however, the experimenter is interested in the proportions of several species simultaneously. To guarantee that the abundance of each of several species is estimated to a desired accuracy would require an unduly large sample and would result in a high degree of “overkill”; most proportions would be estimated unnecessarily precisely. A compromise is therefore to determine the sample size for that species whose relative abundance is thought to be closest to 50%, this being the proportion requiring the largest sample size for a given accuracy. Formulae (2) and (2a) below are appropriate here. Most of the other species will be measured to within the desired accuracy, but the experimenter will have to accept that there is a substantial chance that one or two species may not be.

The formulae given below provide the actual number of seeds that should be examined. However, all physical procedures of sampling carbonized material produce samples of known size by weight or volume rather than by number, and so a simple calculation is required to convert the calculated number into a proportion by weight or volume.

Formulae and numerical examples
In this section we provide formulae for calculating the number of seeds that need to be examined to achieve a specified degree of accuracy under various circumstances and also formulae for providing a confidence interval for the true proportion of the species in the target population. These formulae rely on Normal approximations and should be adequate for values of P of 10% or more. The notation used in the formulae is as follows :-
n, the required number of seeds in the subsample,
N, the total number of seeds in the target population,
P, the proportion of the particular species in the target population,
p, the observed proportion in the subsample,
d, the required accuracy or tolerance,
1-α, the chance of obtaining that required accuracy,
Z_α, the two-sided α percentage point of the normal distribution (some values of Z_α are given in Table 3).

Table 3. Values of Z_α

1-α    90%     95%     98%     99%     99.8%   99.9%
Z_α    1.645   1.960   2.326   2.576   3.090   3.291

The required sample size, n, is given approximately by

\[ n = \frac{N}{1 + (N-1)\,(d/Z_\alpha)^2 / \{P(1-P)\}} . \tag{1} \]
This formula should be used when the total number, N, of seeds in the target population is known to be of fairly moderate size (e.g. in a storage jar) and when a reasonable estimate of its value can be given. When N is very large, the limiting value of n is given by

\[ n = P(1-P)\,(Z_\alpha/d)^2 . \tag{1a} \]
This formula should be used when the number of seeds in the target population is known to be very large. In particular, it is appropriate when making inferences about the composition of material in a specific context or deposit and not just about the material in the bags brought back to the laboratory. When P = 50%, formula (1) becomes

\[ n = \frac{N}{1 + 4(N-1)\,(d/Z_\alpha)^2} . \tag{2} \]
This formula provides an upper bound on the sample size and is of use when the true proportion P is unknown. Its use is appropriate when the total number of seeds N is known to be of moderate size. The “large sample” version of this formula is

\[ n = (Z_\alpha/2d)^2 . \tag{2a} \]
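The sample-size formulae can be sketched in code as follows. This is a minimal sketch, not a definitive implementation: the function names are hypothetical, and formula (1) is taken here in the form n = N / {1 + (N-1)(d/Z_α)² / [P(1-P)]}, which reduces to (2) when P = 50% and tends to (1a) as N grows. The functions return unrounded values; how to round is left to the user.

```python
import math

def sample_size(P, d, z, N=None):
    """Approximate number of seeds to examine.

    P : anticipated proportion of the species (0 < P < 1)
    d : required accuracy in absolute terms (e.g. 0.05 for 5%)
    z : two-sided percentage point Z_alpha (e.g. 1.960 for 95%)
    N : target population size; None means effectively infinite
    """
    if not 0.0 < P < 1.0 or d <= 0 or z <= 0:
        raise ValueError("require 0 < P < 1, d > 0 and z > 0")
    if N is None:
        return P * (1 - P) * (z / d) ** 2            # formula (1a)
    # formula (1), in the form assumed in the text above
    return N / (1 + (N - 1) * (d / z) ** 2 / (P * (1 - P)))

def upper_bound(d, z, N=None):
    """Conservative size when P is unknown: formulae (2) and (2a)."""
    return sample_size(0.5, d, z, N)

# 5% accuracy with a 95% chance, population of 100 seeds, P = 50%
print(math.ceil(sample_size(0.5, 0.05, 1.960, N=100)))  # 80
# Same requirements for an effectively infinite population
print(round(upper_bound(0.05, 1.960)))                   # 384
```

Passing `N=None` selects the “large sample” versions, so the same call pattern serves both a storage jar of known size and an entire context or deposit.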
The formulae above provide guidance for the size of sample that needs to be taken in order to achieve specified levels of accuracy. When the sample has been taken, and the proportion, p, of seeds of the particular species has been observed, it is clearly important to assess the reliability of p as an estimate of P, the true proportion in the target population, by calculating a confidence interval for P. When it is required only to estimate the proportion in a population of moderate size (e.g. a storage jar or perhaps just the material returned to the laboratory), a (1-α) confidence interval for P is given by

\[ p \pm Z_\alpha \sqrt{\frac{(1 - n/N)\,p(1-p)}{n-1}} . \tag{3} \]
Here n denotes the size of the sample actually taken. In the more usual case of estimating the proportion of seeds of a species in a specific context (and not just in the material in the laboratory) the appropriate formula is the “large sample” version

\[ p \pm Z_\alpha \sqrt{\frac{p(1-p)}{n-1}} . \tag{3a} \]
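The confidence-interval calculation can be sketched in the same way. This is a minimal sketch under stated assumptions: the function name and the example numbers are hypothetical, the interval is taken as p ± Z_α √{(1 - n/N) p(1-p)/(n-1)} with the factor (1 - n/N) dropped for an effectively infinite population, and clipping the limits to the range 0 to 1 is a convenience added here rather than part of the formulae.

```python
import math

def confidence_interval(p, n, z, N=None):
    """(1 - alpha) confidence interval for the true proportion P.

    p : observed proportion in the sample
    n : number of seeds actually examined (at least 2)
    z : two-sided percentage point Z_alpha (e.g. 1.960 for 95%)
    N : target population size; None means effectively infinite
    """
    if n < 2 or not 0.0 <= p <= 1.0:
        raise ValueError("require n >= 2 and 0 <= p <= 1")
    if N is not None and n > N:
        raise ValueError("sample cannot exceed the population")
    variance = p * (1 - p) / (n - 1)          # large-sample form, as in (3a)
    if N is not None:
        variance *= 1 - n / N                 # finite-population factor of (3)
    half_width = z * math.sqrt(variance)
    # Clipping to [0, 1] is an addition made for this sketch.
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical example: 30% of 385 examined seeds belong to the species.
low, high = confidence_interval(0.30, 385, 1.960)
print(round(low, 3), round(high, 3))
```

If the calculated interval is wider than the accuracy stipulated in advance, the sample can be “topped up” and the interval recalculated with the larger n.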
To illustrate formulae (1), (1a), (2) and (2a), Table 4 below gives the values of n under various specified levels of accuracy and for various values of P, the true proportion in the population.

Table 4. Optimal sample sizes n for values of N, d, P and α (% = n as a percentage of N)

                          N = 100      N = 500      N = 1000     N = ∞
d (%)  1-α (%)  P (%)     n     %      n     %      n     %      n
5      95       50        80    80     218   44     278   28     384
                20        72    72     166   33     198   20     246
                10        59    59     109   22     122   12     138
2      95       50        97    97     415   83     707   71     2401
                20        94    94     378   76     607   61     1537
                10        90    90     318   64     465   47     864
5      98       50        85    85     261   52     352   35     541
                20        78    78     205   41     258   26     346
                10        67    67     141   28     164   16     195
2      98       50        98    98     436   87     772   77     3381
                20        96    96     407   81     684   69     2164
                10        93    93     355   71     550   55     1217
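As a numerical check, the final (large-population) column of Table 4 can be recomputed from formula (1a) with the Z_α values of Table 3. The short sketch below does this. The names are hypothetical, and rounding to the nearest whole seed is an assumption about how the printed values were obtained; the finite-N columns are not checked here.

```python
Z = {95: 1.960, 98: 2.326}  # Z_alpha for a 95% and a 98% chance (Table 3)

def large_sample_n(P, d, z):
    """Formula (1a): limiting sample size for a very large population."""
    return P * (1 - P) * (z / d) ** 2

# (d, chance in %) -> printed final-column entries for P = 50, 20 and 10%
table4_last_column = {
    (0.05, 95): [384, 246, 138],
    (0.02, 95): [2401, 1537, 864],
    (0.05, 98): [541, 346, 195],
    (0.02, 98): [3381, 2164, 1217],
}

def recomputed_last_column(d, level):
    """Recompute one block of the final column, rounded to whole seeds."""
    return [round(large_sample_n(P, d, Z[level])) for P in (0.5, 0.2, 0.1)]

for (d, level), printed in table4_last_column.items():
    print(d, level, recomputed_last_column(d, level), printed)
```

The agreement shows directly how the requirement grows: tightening the accuracy from 5% to 2% multiplies the number of seeds by (5/2)² = 6.25, whatever the proportion.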
Discussion
It is clear that the required sample size depends upon four variables: the total number of seeds in the target population, the proportion in which the particular species occurs, the accuracy or tolerance that is required and the chance of obtaining that accuracy. These last two are at the choice of the experimenter and the appropriate choice depends entirely on the background to the analysis. The fact that there is as yet no general agreement on the degree of accuracy required when sampling carbonized plant remains is a reflection of the rarity of application of statistical sampling methods in this field. We suggest that a sampling scheme that is designed to provide a 95 or 98 % chance of estimating a percentage content of a species to within 5 % (in absolute terms) should be adequate in the majority of common applications. Exceptionally, the required accuracy could be increased to, say, 2%, but increasing it much further would require prohibitively large samples. The formulae and examples presented above illustrate just how critically the required sample size depends upon the size, N, of the target population. In particular, the required sample size varies greatly with N when N is fairly small (a few hundred or less) but gradually tends to an upper limit as N increases to larger values (a few thousand or more). Before beginning a sampling experiment, therefore, it is of crucial importance to clarify what is regarded as the target population. In particular, it is unlikely that we want to know only about the material returned to the laboratory. In most cases the ultimate target population is the totality of seeds that were once present on the site. However,
samples of carbonized plant remains cannot generally be considered as random samples of this total population (Renfrew, 1973; Dennell, 1974, 1976) and so are not necessarily directly representative of it. It is only by sampling many different contexts that it is possible to obtain an accurate reflection of its composition. Thus, at the sampling stage in the laboratory, it is reasonable to regard as the target population the total number of seeds in a specific context or deposit. In most cases this number will be very large, typically several thousand. This means that the upper bounds provided by formulae (1a) and (2a), and illustrated in the final column of Table 4 headed “∞”, are appropriate in such situations. These values of n give the maximum number of seeds that have to be examined to achieve a given degree of accuracy. We suggest that these are also the minimum numbers of seeds that should be brought back from the site to the laboratory.
There are occasions when a deposit or context contains only a small number of seeds, for example a storage jar or posthole. If the number is very small there is of course no need to resort to random sampling, but in the intermediate cases of a few hundred seeds, say, formulae (1) and (2) are appropriate. However, even in such cases one could argue that these contexts should be seen as samples from a larger context, for example the total number of storage jars or the total surface refuse present when the posthole was filled in, in which case N would be considered as being very large.
It is an attractive idea to use a standard number of seeds for all analyses, as is the common practice in pollen studies. It is clear that such a standard number could be provided, but only when there is general agreement on the acceptable level of accuracy of estimation required.
Such a number would necessarily be conservative, that is, it would have to accommodate the most extreme case of an effectively infinite value of N and a true proportion of 50%, so that formula (2a) is appropriate.
Much of our discussion of the physical selection of random samples applies specifically to the laboratory stage of the analysis. The inferences made will have wider relevance to the entire context only if the material brought back to the laboratory is “representative” of the whole. The problem of selecting representative or even truly random samples of material from a deposit in the field is outside the scope of the present discussion. However, it should be emphasized that the quantity of material returned should at least be sufficient to provide the laboratory with samples of adequate size.
Our discussion here has been orientated towards the problems involved in analysing plant remains and specifically carbonized seeds. However, the statistical theory and numerical results apply to any sampling problem. The only distinction that might arise is that in some cases it is not feasible to bring back large quantities of items from the site. In such cases, careful thought must be given to the objective of the analysis. With very small samples it may not be possible to achieve an acceptable degree of accuracy when estimating the proportion of items of a particular type in the population. However, sampling from large quantities of, say, sherds or flint artifacts is essentially the same as sampling seeds, and the formulae and tables given above apply equally well in these cases.
Acknowledgement
We are grateful to a referee for helpful comments on an earlier draft.
References
Barnett, V. (1974). Elements of Sampling Theory. Glasgow: English Universities Press.
Buurman, J. (1979). Cereals in circles-crop processing activities in Bronze Age Bovenkarspel. Archaeo-Physika 8, 21-37.
Cherry, J. F., Gamble, C. & Shennan, S. (Eds) (1978). Sampling in Contemporary British Archaeology. British Archaeological Reports 50.
Cherry, J. F. (1978). Questions of efficiency and integration in assemblage sampling. In (J. F. Cherry, C. Gamble & S. Shennan, Eds) Sampling in Contemporary British Archaeology. British Archaeological Reports 50, pp. 293-320.
Cochran, W. G. (1963). Sampling Techniques (2nd Edn). New York: Wiley.
Dennell, R. W. (1974). Botanical evidence for prehistoric crop processing activities. Journal of Archaeological Science 1, 275-284.
Dennell, R. W. (1976). The economic importance of plant resources represented on archaeological sites. Journal of Archaeological Science 3, 229-247.
Fasham, P. J. & Monk, M. A. (1978). Sampling for plant remains from Iron Age pits: some results and implications. In (J. F. Cherry, C. Gamble & S. Shennan, Eds) Sampling in Contemporary British Archaeology. British Archaeological Reports 50, pp. 363-371.
Green, F. J. (1979). Collection and interpretation of botanical information from Medieval urban excavations in Southern England. Archaeo-Physika 8, 39-55.
Hopf, M. & Schubart, H. (1965). Getreidefunde aus der Coveta de l'Or (Prov. Alicante). Madrider Mitteilungen 6, 20-38.
Keeley, H. C. M. (1978). The cost-effectiveness of certain methods of recovering macroscopic organic remains from archaeological deposits. Journal of Archaeological Science 5, 179-183.
Mueller, J. W. (Ed.) (1975). Sampling in Archaeology. Tucson: University of Arizona Press.
Redman, C. L. (1974). Archaeological Sampling Strategies. Addison-Wesley Module in Anthropology 55.
Renfrew, J. M. (1973). Palaeoethnobotany. London: Methuen.
Veen, M. van der (1980). Sampling seeds. M.A. thesis, Department of Prehistory and Archaeology, University of Sheffield.
Zeist, W. van (1968). Prehistoric and early historic food plants in the Netherlands. Palaeohistoria 14, 41-173.