ELSEVIER
REVIEW OF PALAEOBOTANY AND PALYNOLOGY Review of Palaeobotany and Palynology 96 (1997) 12 I - 144
Multivariate, autocorrelation and spectral analyses of a pollen profile from Scotland and evidence for periodicity M.A. Oliver ~, R. Webster b, K.J. Edwards ~, G. Whittington d a Department ofSoilScience, The University of Reading, Whiteknights, Reading, RG6 6DW,, UK b Rothamsted Experimental Station, Harpenden, Herts AL5 2JQ, UK ° Department of Archaeology and Prehistory, The University of Sheffield, Sheffield, SI 4ET, UK d School of Geography and Geology, The University of St Andrews, St Andrews, Fife K Y16 9ST, UK Received 19 December 1995; accepted 4 June 1996
Abstract
A vertical core through a Holocene peat in Scotland has been divided optimally into segments by multivariate and geostatistical analyses of the pollen of twenty important interrelated taxa. Pollen counts were converted to principal components (from the correlation matrix), and they were strongly correlated with depth. Their variograms were used to explore the stratigraphical structure in the core. The overall form of the variograms was monotonically increasing, but with a wave of length approximately 32 cm superimposed on it. Power spectra of the principal components, computed after filtering to remove long-range trend, had strong spectral peaks, with frequencies corresponding to the 32 cm found in the variograms which suggest that the pollen assemblage changes cyclically. The data were transformed to canonical variates, and the core was segmented using these variates by (a) minimizing the within-segment variation on average (global optimization), and (b) by maximizing the Mahalanobis distances between adjacent segments (local optimization). The results of the two methods were similar. When matched to the data they showed that distinct types of pollen assemblage, from woodland and more open grassland, alternate in a quasi-cyclic way. When converted to time the period was approximately 800 years.
Keywords: climate change; multivariate statistics; pollen; Scotland; spectral analysis; variogram
1. Introduction
The pollen counts in any one sedimentary sequence are almost always interrelated, both in terms of the correlations between taxa and their spatial or stratigraphical positions. These relations can vary through the sequence and tend to be obscured by the large number of taxa and the erratic behaviour of the counts. Standard multivariate methods have been used widely to identify meaningful pattern in the vertical distributions (e.g. 0034-6667/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved PII 0034-6667(96)00041-3
Prentice, 1980; Gordon, 1982; Birks and Gordon, 1985). These include numerical classification, relational classification, ordination by principal components and correspondence analysis, and segmentation analyses. The purpose has been twofold: to reveal multivariate structures in the data as a means of inferring underlying relations between species, and to subdivide profiles into zones with different assemblages of pollen that are meaningful stratigraphically and ecologically. The good sense of the latter hinges on there being positions in the
122
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
core where change in at least some species is greater than elsewhere; in other words, that there are fairly distinct spatially coherent classes. Walker and Wilson (1978) suggest another approach for splitting sequences, taking into account variations of within and between individual pollen curves. Lotter et al. (1992) and Birks and Line (1994) describe an optimal form of this approach to partitioning using data that have been transformed to principal components. Pollen sequences may be divided into zones by one or other of two basic techniques: either by contiguity constrained multivariate classification or by searching for boundaries (see Birks and Gordon, 1985). The first focuses on the classes themselves represented in the zones and tries to create zones that are homogeneous. It depends on the existence of spatial dependence or autocorrelation in the data. The second looks for change and divides a sequence where change is greatest, i.e. at the boundaries. Nevertheless, their aims are similar, i.e. to detect significant changes in pollen content. Both approaches rely on a vertical ordering of the pollen counts with spatial correlation between sampling points in addition to the correlation between species. They assume implicitly that spatial correlation exists; their success depends on that assumption's being correct. Yet the correlation has generally not been taken into account explicitly, and so the best use is not being made of the available information. Any segmentation of a pollen profile in which the data are spatially related should embody that correlation explicitly. This should lead to a more rational choice of method and of the parameters of the computing algorithms, and better understanding of the variation present. Our aim, therefore, has been to incorporate spatial correlation formally into the analysis. The techniques to achieve this and the theory on which they are based are not novel. They have been developed largely for processing signals (e.g. Jenkins and Watts, 1968), but they have also been used in exploration geology (Journel and Huijbregts, 1978), petroleum engineering (Merriam, 1964), sedimentology (Schwarzacher, 1975), and soil science (Webster, 1973, 1978; Oliver and Webster, 1987). All start with the covariance function or its comple-
ment, the variogram. The variogram may be used as it stands, or a mathematical model may be fitted to it to describe the spatial structure in the data and to infer that of the underlying process. It has proved valuable for providing insight into the variation. The autocorrelation function can be transformed to its spectrum in the frequency domain for detecting cyclical behaviour. Both the spatial model and the spectrum may be interpreted and used as guides to further analysis. Periodic variation has been detected in other stratigraphical studies. Aaby (1976) examined the composition and degree of humification of Danish raised bogs and found what seemed to be systematic changes in measured components approximately every 260 years, with a period of twice that in some instances. Only climatic changes seem likely to have influenced the vegetation of the bogs on such time scales. Barber et al. (1994), in their study of the macrofossils in an ombrotrophic mire in northern England, found that the assemblages corresponded more or less synchronously to changes in climate. The autocorrelation function was sinusoidal with a period of around 800 years. They considered that this matched fairly well with Aaby's 520-year cycle for peat humification (reported as 512 calendar years in Barber et al., 1994). Green (1981) detected periodicity between charcoal (indicating fire) and the reappearance of pollen taxa (signifying vegetation influenced by fire) using time-series analysis on Canadian data. Wijmstra et al. (1984) used this approach to describe periodicity in the percentage curves of pollen taxa from a Dutch site, and they related the density of pollen to sunspot cycles of different lengths. We have used some of these techniques to investigate pollen from a core through peat at a site in Fife, Scotland. Whittington et al. (1991) had already analysed the pollen data from the site by conventional palynological methods, but the complexity of the original data with the many taxa made it difficult to discern the overall patterns intuitively. To improve on this we computed the spatial autocorrelation, estimated its parameters and then incorporated them explicitly into the analysis to optimize the segmentation of the core by minimizing the variance within segments.
klLA. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
2. The data
123
The corrected dates form the basis of temporal interpretation.
The pollen profile studied by Whittington et al. (1991) was from the deepest part of a peat-filled kettle hole at Pickletillem near the coast of Fife in eastern Scotland (Fig. 1). The site (National Grid Reference NO 453235) is designated as a Site of Special Scientific Interest. The Holocene section of the core, that is from 58 cm to 670 cm below the unconsolidated surface deposits, was sampled at 4 cm intervals, and the pollen counts from them were used as data. The full set of data comprises i01 taxa and 154 sampling levels. A surface date of AD 1986, a date of AD 1786 for the historical rise of Pinus pollen (reflecting modern estate planting) and eight radiocarbon dates formed the basis of core chronology. Radiocarbon dates were corrected to calendar years by dendrochronology using the computer program CALm 3.0.3 (Stuiver and Reimer, 1993). Intermediate dates were estimated to the nearest ten years by linear interpolation (Fig. 2). For comparison the dates were interpolated using splines (see Odgaard, 1994), but there was little difference in the results from those of the linear interpolation.
3. Principal component analysis (PCA) Detecting a coherent pattern in the complete set of data was difficult: there were too many variables (taxa). Some kind of reduction was needed, and as a first step at reducing the dimensionality we computed principal components from the correlation matrix, treating the percentages of the counts of the taxa as variates. Working from the correlations effectively standardized the variates to unit variance, and so that they carried equal weight; this seemed reasonable in view of the large differences between the percentages of the total pollen accounted for by the individual taxa. The scatter diagram of the scores of PC 1 against PC 2 showed their relation to be linear, and no further transformation was needed. Results of analysing unstandardized data were dominated by those taxa with the largest counts and were unhelpful. The counts of poorly represented taxa generally contribute little to the interpretation of pollen
-
/
_-
z~
35
"
--<2
^
'
"
~u
': c--;-: ~ - ' -
Pickletillem Inn
-
-
-
it
!
"\
-oe
Study ~
~
. . . . .
<
_ _-_-.~" - o-~D i/";" 7" i"~ - - ~
Main Postglacial Shoreline
~
,,
~ao-
'\
,
I// "~/,~
"~
',, f
t/
"
I
C o n t o u r s at 5 m i n t e r v a l s
0
Fig. 1. Map showinglocation of pollen profile at Pickletillem,Fife, Scotland.
1000 m ]
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
124 700 -
600 -
500-
400-
£ 3
300-
200-
100-
0 0
I 2000
I 4000
I 6000
1 8000
I 100(30
I 12000
Age / years BP
Fig. 2. Time-depth graph for the Holocene section of the Pickletillem profile based on the dendrochronological correction program of Stuiver and Reimer (1993).
assemblages (Gordon, 1982), and a component analysis can often be more meaningful without the taxa with consistently small counts. Gordon (1982) recommends a minimum of 2% or 5% pollen at any level in a profile. Following these thresholds we reduced the number of taxa from the full 101 to 20 and 14 variates, respectively.
Table 1 The percentage variance accounted for by the first five eigenvalues of the correlation matrices for different numbers of taxa
Number of taxa
Percentage of variance 1
II
III
IV
V
11.7 31.0 34.6
7.1 19.8 20.4
6.1 12.3 10.2
4.9 6.7 8.2
4.4 5.7 6.6
3.1. Results
101 (total) 20 (2% in any one level) 14 (5% in any one level)
Table 1 lists the latent roots of the correlation matrix as percentages of the total variance accounted for by the leading five principal components for the three analyses. With the total 101 taxa the first five principal components accounted for 34% of the variance, whereas with 20 taxa (2% threshold) they accounted for 75.5% of the variance, and with 14 taxa (5% threshold) it was 80%. The 5% threshold seemed to confer little advantage against the possible loss of information by exclud-
ing 6 taxa. Table 2 lists the 20 taxa retained, with nomenclature following the principles described in Bennett et al. (1994) and the catalogue of Bennett (1994). Fig. 3 shows the correlations between the original variates and unrotated components 1 and 2 plotted within circles of unit radius. The taxa appear to fall into distinct groups, though with a gradient between the deciduous trees on the left
M.A. Ofiver et aL / Review of Palaeobotany and Palynology 96 (1997) 121-144
125
Table 2 Correlations between the original taxa and the first five principal components Taxa
Unrotated principal components
1
Betula Pinus sylvestris Ulmus Quercus Alnus glutinosa Corylus avellana-type Salix Juniperus communis Calluna vulgaris Ericales undiff.
2
- 0.2221 0.6706 - 0.4786 - 0.2275 - 0.0101 - 0.6727 - 0.1216 0.0810 0.5843 0.7847 0.9089 0.6993 0.8214 0.8186
Poaceae undiff.
Cyperaceae Cerealia-type undiff. Brassicaceae
Plantago lanceolata Rumex acetosa
- 0.8679 -0.3051 0.0966 0.6169 0.5849 0.0177 - 0.8374 - 0.7554 0.1014 0.0099 -0.2035 0.1282 0.0295 0.0546 0.3225 -0.1951 0.3969 0,2975 - 0.7503 0,2822
0.7886 0.7832 0,1906 0,1441 0.1548 0.1101
Ranunculaceae undiff.
Ranunculus acris-type Filipendula Crithmum
3
4
0.1532 -0.2426 - 0.4043 0.1418 0.3913 - 0.4427 0.2577 0.4043 -0.2218 -0.2976 0.0340 0.1333 -0.2542 - 0.2763 0.159l -0.0682 0.7099 0.6362 0,4208 0,4553
0.0592 -0.0687 - 0.3081 - 0.5394 - 0.5299 0.4864 0.0201 - 0.2417 0.0931 -0.0207 -0.1098 0.1463 -0.0037 0.0670 0.1025 -0.1752 0.3325 0.3331 -- 0+1858 0.1308
The figures in bold represent the taxa that correlate strongly with a principal component.
AI
Alnus glutinosa
Be
Betu/a
Br
Brassicacea
Cal Calluna vulgaris Cer C e r e a l i a - type undiff. Co Corylus a vellana - type Cr ~,=l:RaD a Cri
Co,
* UI
:l: Cyp e,_, $ * Br
%. P / Rx * ~
Crithmum maritimum
Cyp Cyperaceae
\ \ t
p? ~oa/
Er
Erica/es undiff,
Fi
Fi/ipendula
Ju
Juniperus communis
Pi
Pinus sylvestris
PI
P/antago /anceolata
Poa Poaceae undiff,
Qu
Quercus
Ran Ranunculaceae undiff Ran a R a n u n c u l u s a c r i s - type Rx
Rumex acetosa
Sx
Salix
UI
Ulmus
Fig. 3. Correlations between the original pollen taxa and the first two unrotated principal components.
5 0.0245 0.4947 0.3270 0.0808 - 0.1872 0.0431 - 0.0043 - 0.0832 -0.6415 0.0656 0.0220 0.0628 0.1366 - 0.2912 -0.0287 0.3169 0.1200 0,2644 - 0,1737 0,0710
126
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121 144
and the herbaceous taxa to the right along axis 1. The cluster at the bottom of the circle (Betula, Salix, Juniperus communis, Filipendula) typifies the early Holocene pollen assemblages. The largest correlations (in absolute value) are printed in bold (Table 2). It is clear that herbaceous taxa have the largest eigenvector loadings on component 1, and that woodland types load most heavily on component 2. The first principal axis reflects a contrast between periods of woodland dominance (especially Corylus, Ulmus, Quercus and Alnus) and those where herbs were a distinctive feature of open and cleared landscapes (e.g. Poaceae, Cyperaceae, Cerealia and Brassicaceae). The second axis contrasts early Holocene taxa (e.g. Betula, Salix and Filipendula) with those of later periods (e.g. Quercus, Alnus and Plantago lanceolata). These two components explain about 31% and 19.8%, respectively, of the variation. It is more difficult to interpret components 3, 4 and 5. Ranunculaceae undiff, and Ranunculus acris-type load most heavily on component 3, but the associated species that load fairly heavily comprise
Ulmus, Corylus avellana-type, Juniperus communis, Filipendula and Crithmum maritimum. Component 4 contrasts Alnus and Quercus with Corylus, and component 5 Pinus sylvestris with Calluna vulgaris. Rotation by Varimax hardly changed the configuration in five dimensions; the correlations and the interpretation remain the same. The component scores were plotted against depth in the profile in Fig. 4. The graphs show that there is both long-range variation and shortrange fluctuation in all of the principal components. Component 1 also mirrors the total herbaceous pollen of which Poaceae is the main taxon, i.e. it mirrors the major stratigraphical pattern of the early to mid to late Holocene. The graph of the scores of component 1 (Fig. 4a) matches Poaceae undiff, in the pollen diagram (Fig. 5) and shows that this taxon strongly controls the variation in the first component. The large scores at the bottom of the profile show the dominance of herbs in the early Holocene, whereas the large scores at the top arise from land cleared of woodland. The graph for the scores of component 2 (Fig. 4b), when inverted, has a similar distribution to that of Betula pollen in Fig. 5, and to a lesser extent to
that of Salix. The small scores lower in the profile show an inverse relation to the large counts for Betula, the fairly large counts of Salix, and the presence of Juniperus communis and Filipendula. Component 3 is more difficult to relate to the pollen diagram. Spikes on the graph between 100 cm and 200 cm (Fig. 4c) correspond somewhat to Ranunculaceae undiff, and Ranunculus acristype, a small peak at 300 cm relates to Alnus, and the peak at 600cm to Corylus avellana-type (Fig. 5). Component 4 scores peak between 200 and 300 cm (Fig. 4d) and seem to relate to peaks in Quercus and Alnus glutinosa. The overall form of this trace corresponds somewhat to the vertical distribution of Corylus avellana-type (Fig. 5). The long-range pattern in this graph is distinctly wavy. Comparing the graph of component 5 scores against depth (Fig. 4e) with Fig. 5 suggests that there is a strong inverse relation with Pinus and Calluna, especially at the top of the profile. The peak at about 300 cm relates to Alnus. The graphs of the component scores reflect strongly the pattern of certain species, suggesting, together with the results of the correlation analysis, that these axes can be interpreted biologically.
4. One-dimensional canonical variate analysis Another way of reducing the dimensionality of the data is to take canonical variates. Canonical variate analysis preserves the discrimination between classes, or between the segments of a column in this instance, which might be lost by taking only the leading principal components. The first few canonical roots maximize the ratio of between-group to within-group variance. This is the information required for identifying the most likely boundary positions. In general canonical variate analysis depends on there being an existing classification of the data. However, Hawkins and Merriam (1974) showed that for a regular onedimensional series, where fairly distinct homogeneous classes are separated from one another by more abrupt changes, i.e. boundaries, it is possible to transform the original data to canonical variates without having to group the sampling points
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
o)
127
b)
I0, o80.6" 0+-
02"
-O2"
-04-
-06"
-08-
-'0"
---,
f
,
i
,
i
r
,
,
,
,
,
,
d) OBOe0~02-
-(?2-
-O.4 -
06-08-
-~0
I
I
I
,
x
,
I
,
i
~
e)
,
,
,
r
,
20O
25,O
ZOO
3.5O
40O
,
4~
,
~
i
55C
',
60¢
Depth / cm
I0-
06-
0.4"
02-
0.0"
0204"
06 °
OBI0
go go 2~ 2;0 ~
~o ~o ,;o ~o ~
Depth
~o ,;o
/ cm
Fig. 4. Principal component scores plotted against sampling position for components 1-5 (a e, respectively).
i
6tic
128
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
1 1 2 2 3 3 4 4 5 5 6 6 7~
Fig. 5. The original pollen diagram zonation from Pickletillem (Whittington et al., 1991) showing the 20 taxa used in the present study, zones (solid lines), and subzones (dashed lines). The taxa are expressed as percentages of total land pollen.
beforehand (the theory of this is given in the Appendix). We computed the canonical variates from the pollen data of the 20 taxa at the 154 sampling depths. Table 3 lists the eigenvalues of the relevant matrix, TS -1 (see Appendix for its definition), and shows that a very large proportion of the variance, 40.5%, is accounted for by the first canonical variate (CV 1), and a further 21.7% by CV 2. If there are more than two groups present in the data then the canonical axes cannot be interpreted clearly because of the way that the multivariate space is transformed. To help interpret them we computed the correlations between the original variates and the first five canonical variates as for the principal components. They are also listed in Table 3, with the largest correlations in bold. The first canonical variate contrasts early Holocene assemblages (Betula, Salix, Juniperus, Rumex and Poaceae) with later Holocene ones (Alnus, Quercus and Plantago lanceolata). The second canonical axis contrasts early and mid Holocene assemblages (especially Corylus) with late Holocene ones (Poaceae, Rumex). The canonical variate scores were plotted against depth as for the principal components (Fig. 6). The steps in the plot of CV 1 suggest that one might identify boundaries between segments from the
graph of this variate alone (Fig. 6a). The graph of CV 2 (Fig. 6b) has large slopes at both ends of the core, with an oscillation about a weak trend in the middle. The scores for CV 3 (Fig. 6c) are locally erratic, but the long-range variation appears to be sinuous. The graph for CV 4 (Fig. 6d) is similar to that for CV 3 in that the long-range variation is wavy, whilst CV 5 (Fig. 6e) shows marked changes near to the top of the profile. The graphs of the first three principal components (Fig. 4a-c) and those of the first three CVs (Fig. 6a-c) show little relation, except that they are to some extent reversed. The graphs of scores of PC 4 (Fig. 4d) and CV 4 (Fig. 6d) are fairly similar to one another, however, while those of PC 5 (Fig. 4e) and CV 5 (Fig. 6e) mirror one another.
5. Autocorrelation and spectral analysis Although canonical variates can be derived as above, the procedure is automatic and it provides little insight into the stratigraphical pattern. Good use of the canonical variates or the leading principal components for identifying boundaries needs a preliminary spatial analysis. We did this using the variogram and the equivalent covariance function with depth, and then transformed the covariances
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121 144
129
Table 3 Percentage variance accounted for by the first five canonical variates, and the correlations between them and the original taxa Canonical variates
Percentage variance
1 40.47
Taxa
Correlations between CVs and original variates 1 2 3
Betula Pinus sylvestris Ulmus Quercus Alnus glutinosa Corylus avellana-type Salix Juniperus communis Calluna v,ulgaris Ericales undiff. Poaceaea undiff. Cyperacea Cerealia-type undiff. Brassicaceae
Plantago lanceolata Rumex acetosa Ranunculaceae undiff.
Ranunculus acris-type Filipendula Crithmum
- 0.8231 -0.2996 0.1352 0.7973 0.7400 -0.1750 - 0.7406 -0.5741 0.1116 -0.0023 -0.1784 0.1007 0.0002 0.0314 0.2771 -0.2064 0.3137 0.1763 0.5642 0.2194
2 21.70
3 11.61
- 0.0657 0.6506 - 0.4210 - 0.0312 0.1781 -0.8698 0.0938 0.3397 0.3032 0.6265 0.8745 0.6017 0.6633 0.5720 0.6436 0.7888 0.1955 0.1403 0.3811 0.1078
0.4309 -0.4097 0.1175 0.3430 0.3580 -0.3075 0.4599 0.5288 -0.3132 -0.5103 -0.2360 -0.3069 -0.5665 - 0.4913 -0.3742 -0.3000 0.1089 0.0933 0.5007 0.0962
4 7.34
5 5.34
4
5
- 0.2115 -0.3100 - 0.8168 - 0.1725 0.1175 0.0893 0.0294 0.1703 0.1671 -0.0935 0.093l 0.0817 -0.1363 - 0.0167 0.1783 -0.1014 0.3408 0.1938 0.2000 0.2240
0.1507 -0.2038 - 0.0336 - 0.1859 0.0054 -0.2194 - 0.0215 -0.0568 0.7964 0.1740 0.1632 0.0294 0.1213 0.4654 0.1812 -0.1579 -0.1008 -0.137l 0.0515 -0.0655
The figures in bold are for taxa that correlate strongly with a canonical variate.
to power spectra. Provided the variable, say Z, is second-order stationary, i.e. stationary in the mean and in the variance, the covariance function and variogram are equivalent: 7(h) = C(O) - C(h) = C(0){ 1 - p(h)}
( 1)
where ?,(h) is the variogram function, C(h) is the covariance function, p(h) is the correlogram. The experminental variogram is readily computed from the standard formula: 1
- -
n-h
~(h)= 2(n-h) j--~l
{z(j)-z(j+h)} 2
(2)
where ~(h) is the experimental semivariance, and z ( j ) and z ( j + h ) are the observed values at j and j + h , respectively, separated by h, which is an
integer here. Provided there are sufficient data-Webster and Oliver (1992) and earlier time-series analysts (Jenkins and Watts, 1968) recommend at
least 100 sampling points--~(h) estimates ?(h) reasonably precisely. By incrementing h an ordered set of semivariances is obtained. This set is the experimental variogram, which can be displayed as a graph. These discrete values represent a continuous function, which can be envisaged by fitting a mathematical model to them. The experimental variograms were computed by Eq. (2) from the scores of the leading principal components and canonical variates. Their graphs are shown in Figs. 7 and 8. The variograms of the first three principal components do not have the simple bounded form expected for transition phenomena: on the contrary, they appear to increase indefinitely. The experimental values also fluctuate to varying degrees about these increasing curves in an apparently cyclic fashion. A model that describes reasonably well the variograms of components 1, 2 and 3 is a sine wave
130
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
o)
b)
~)
d)
10-
5"
~
0-
5
~
'd==
i
I
I
I
'
i
1
I
-J
J
"I 20
T-
i
i
r
J
J
i
J
~
i
i
~)
Depth / crn
20-
~5
Ioo
rso
200
750
50o
350
Depth /
400
45c
50o
55o
6oo
650
cm
Fig. 6. Canonical variate scores plotted against sampling position for CVs 1-5 (a-e, respectively),
I
l
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 ( l 997) 121-144
superimposed on a power function of the kind
131
nugget variance is:
y(h) = mh ~, 0 < e < 2. The full model, including the
7(h) = Co + a sin( 2nh/2)
nugget variance, is:
+ b cos(2rch/2) + c{ 1 - exp ( - h/r)}
"y(h)= co + a sin(2rch/2)
for 0 < h
+ b cos (2zch/2) + mh ~ ?(0)=0
for 0 < h 7(0)=0
(4)
where r is the distance parameter of the exponential model and c is its a priori variance. From this an acceptable working range of a = 3 r is generally derived. The variograms of PCs 4 and 5 have ranges of 126 cm and 21.6 cm, respectively. The variograms of the canonical variates (Fig. 8) provide additional insight into the structure of the variation down the profile. That of CV 1 is clearly unbounded and concave upwards, which indicates strong trend in the data. The power function model was out of bounds because the degree of concavity exceeded the limit for an authorized variogram function. Canonical variate 4 was fitted best in the least squares sense by a power function with a sine wave superimposed. The variograms of CV 2 and CV 3 were fitted best by an exponential model on which a sine curve is superimposed. The periodicity expressed by these variograms supports the interpretation of those for the principal components. The wavelength of CV 3 is 34.4 cm, which is very similar to the average of that of four of the principal
(3)
where the parameters a and b determine the amplitude and phase of the sine curve, 2 is its wavelength, and Co is the nugget variance, i.e. the intercept on the ordinate. The last embraces spatial variation unresolved by the sampling and any errors in separating the pollen and counting it. The models were fitted by weighted least-squares approximation to each experimental variogram in turn using the program M L P (Ross, 1987). Fig. 7a-c shows the resulting functions by solid lines, and Table 4 lists their coefficients. The lack of bounds on the variograms of the leading components is almost certainly the result of the trends, which are evident in Fig. 4. The variograms of components 4 and 5 are bounded, but there is still evidence of periodic fluctuation. A model that describes these variograms reasonably well is a sine wave superimposed on an exponential function. The full model with
Table 4 Models and coefficients fitted to variograms of component scores a Component
l 2 3
4 5
Parameter Variance
Nugget
Gradient
Exponent
Sine
Cosine
Wavelength
0.040518 0.025838 0.016114
0.00000 0.00000 0.00303
0.00255 0.00749 0.00045
0.45079 0.23236 0.72914
0.000696 0.000056 -0.000446
0.000304 0.000186 -0.000501
33.70 28.95 31.01
Variance
Nugget
Sill
Distance
Sine
Cosine
Wavelength
0.008704 0.007969
0.00060 0.00000
0.00890 0.00580
10.47 1.82
0.0001853 -0.000409
0.000080 0.000231
54.91 33.41
The function fitted to components 1, 2, 3, and 5 is a sine curve superimposed on a power function of the form ~'(h)= rnh~ where 0 < c~< 2 and rn is the gradient or linear parameter. The coefficients headed 'sine' and 'cosine' together determine the amplitude and phase of the curve. C o m p o n e n t 4 has fitted to it a sine curve superimposed on an exponential, ? , ( h ) = c { 1 - e x p ( - h / a ) } , where c is the sill of the exponential term and a is the distance parameter.
132
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
o)
b)
c)
~)
2C3
0025-
0C2-
c ~ oo15-
00~-
0005 -
O0
0 O3"
0025
002,
._~
O.015'
~
001-
0005-
O0
i
1
i O i
m
T
e)
Depth
/ crn
0025-
0.02 -
.~_ 0015-
15
>
001-
• - -
semivoriances model
O005"
O0
,
,
,
,
,
,
,
20
40
60
80
100
120
140
Depth
/ cm
Fig. 7, Variograms of principal components 1-5 (a-e, respectively); the symbols represent the experimental semivariances and the solid lines the fitted models.
M.A. Oliver et aL / Review of Palaeobotany and Palynology 96 (1997) 121-144
o)
133
b)
25m o 20-
o o o
o
o
,m ,j** -
-
r
i
f
|
c)
d)
x J-
,,u-
b
>
20
40
e)
(KI
80
Depth
100
126
140
IO3
/ cm
c
._o 1',
•
scmivorionce5
model
u
i
20
40
i
60
)
80 100 Depth / cm
I
)
r
120
140
160
Fig. 8. Variograms of canonical variates 1-5 (a-e, respectively): the symbols represent the experimental semivariances and the solid lines the fitted models.
134
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
component variograms. The average wavelength for the other three canonical variates is 64.6 cm, which is about twice. The average range of spatial dependence of the bounded functions fitted to CV 2 (91.7 cm), CV 3 (109.9 cm) and CV 5 (25.5 cm) is 75.7 cm. This last is close to the average range of 74 cm for the variograms of PCs 4 and 5. 5.1. Power spectra
The variograms of the first three principal components were dominated by monotonic increases in variance with increasing h, whereas those of components 4 and 5 were bounded. Superimposed on these increases is an apparent periodic fluctuation but of small amplitude. All variograms appeared to fluctuate periodically. To determine whether the periodicity was real rather than an artifact of the analysis the variation was examined in the frequency domain by computing the power spectra. The formulae for this are given in the Appendix. We used windows of widths 80, 60 and 40 cm, and Fig. 9 shows the results from all three for each of the principal components. All the spectra have most of their power at the smallest frequency: in other words, they are dominated by the long-range non-periodic fluctuation. The results for the 80 cm window revealed the most detail. There are discernible peaks at 0.135 cycles (equivalent to a wavelength of 30 cm) for PC 1, a weak one at 0.125 cycles (32cm) for PC2, peaks at 0.365 cycles (11 cm) for PC3 and PC4, and at 0.335 cycles (12 cm) for PC5 (Table 5). 5.2. Filtering
To separate the cyclic short-range fluctuation from the long-range variation we filtered the series with a simple moving average of length 9 sampling intervals. The results for the principal components are shown in Fig. 10, in which the solid lines are the moving averages and the dotted lines are the residuals plotted against sampling position. Fig. 11 shows the spectra of these residuals, and Table 5 gives their dimensions. The spectrum of PC 1 has a single prominent peak at about 0.14 cycles, equivalent to a wavelength of about 29 cm
Table 5 Spectra for the first five unfiltered and filtered principal components Component Unfiltered Cycle
Filtered
Wavelength Cycle Wavelength (cm)
(cm)
1 2 3
0.137 0.075 0.090
30 53 44
4
0.365
11
5
0.170 0.335
24 12
0.140 0.900 0.110 0.185 0.360 0.150 0.365 0.115 0.155
29 44 36 21 I1 27 11 35 26
in the lag domain (Fig. lla). Component 2 (Fig. llb) has its largest spectral peak at 0.09 cycles, equivalent to a wavelength of 44 cm, and a smaller peak at 0.16 cycles (26 cm). Component 3 (Fig. llc) has three peaks: at 0.11 cycles (36 cm), at 0.185 cycles (21 cm), and at 0.36 cycles (11 cm). The spectrum of component 4 (Fig. lld) has two peaks, one at 0.15 cycles (27 cm) and the other at 0.365 cycles (11 cm). Component 5 (Fig. lle) has two spectral peaks at 0.115 cycles (35cm) and 0.155 cycles (26 cm). The average wavelength suggested by taking the largest peak for each component is 32.4 cm which corresponds well with the average wavelength of the variograms for PCs 1, 2, and 3 of 31.7 cm. The results of the spectral and variogram analyses are consistent. The pollen assemblage in the profile evidently fluctuates periodically with a wavelength of about 32 cm. The graphs of the moving averages (Fig. 10) are for the most part approximately horizontal. However, there are several sections of most of them that are relatively steep. These represent change, and their positions accord from one component to another in seven places down the profile, showing that a significant part of the pollen assemblage in the core changes in unison. Major changes appear to occur at sampling positions 76cm, 124cm, 160cm, 288 cm, 360cm, 476cm and 640 cm. The moving averages of the canonical variates (not shown) also suggest that there are distinct changes in the profile that are worth
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
exploring further. Again, the sharpest changes accord closely from one canonical variate to another, and significant changes appear to occur at 76 cm, 124 cm, 160 cm, 280 cm, 360 cm, 476 cm, 528 cm and 640 cm down the profile. The graph of the first unfiltered canonical variate (Fig. 6a) also suggests change at positions 76 cm, 124 cm, 152 cm, 174 cm, 232 cm, 280 cm, 332 cm, 360 cm, 476 cm, 528 cm and 640 cm (Table 6).
6. Optimal segmentation The graph of CV 1, Fig. 6a, and those of the moving averages of the principal components and canonical variates suggest that it is worth segmenting the core. With the total number of variates reduced to a manageable few, the core can now be segmentated optimally in the sense of minimizing the variance within the segments. It may be done either by attempting to minimize the variation within the segments on average over the whole profile (global optimization) or by maximizing the ratio of between-segment variation to within-segment variation across each potential boundary (local optimization). We deal with these in turn, describing first the analysis briefly and then the results.
135
optimize division into classes, but does it locally. The method is summarized in the Appendix. In it the multivariate sequence is viewed through a window of some predefined width. The window is split at its mid point, and the effect of creating two multivariate classes in this way is assessed by the ratio of the residual variation within the classes to that between them. Ideally it should contain only one boundary. If the window is so wide that it includes two or more boundaries then the method will perform poorly. If, on the other hand, the window is very narrow then small local fluctuations will cause numerous peaks in the graph of D2, and the important boundaries are likely to be indistinguishable. One outstanding matter remains to apply SMW sensibly, namely choosing the width of the window. Webster computed the autocorrelation for the principal components, found a common approximate range, and was thus able to choose widths of windows to identify the segment boundaries. Intuitively the most suitable width seemed to be half to two thirds of the spatial scale (range), and this proved to be so in practice. We used the average range of spatial dependence of the bounded variograms of CVs 3 and 4, and PCs 4 and 5 of about 70 cm to select a window width of 40cm (10 sections). Window widths of 48cm, 56cm, and 72cm were also used for comparison.
6.1. Maximum level variance (MLV) 6.3. Results Hawkins and Merriam (1973, 1974) solved the problem of global optimization. They transformed the original data to canonical variates (see Appendix) and then computed a segmentation that minimized the within-segment variation. We applied the technique first using the leading five CVs and with subdivisions of the profile from three to sixteen segments. We finally chose to stop the partitioning at twelve boundaries. Attempts to segment more finely added boundaries only at the bottom of the core, and did nothing to aid understanding. 6.2. Split moving window ( S M W ) The second method of segmentation, due to Webster (1973, 1978), applies the same logic to
The significance of the boundaries identified by MLV depends on the amount of variation they explain. They are also listed in Table 6, and those boundaries accounting for more than 3% of the variance are in bold. The boundaries correspond well with those picked out from Fig. 6a. For SMW there were spikes for all four window widths, but those for the 40 cm window were the clearest. This supports the choice of a window width based on a proportion of the average range of spatial dependence. Fig. 12a and b show the graphs of D 2 against depth for the canonical variates and principal components, respectively, for a window of 40 cm. For the canonical variates the most prominent spikes in DE are at 84 cm, 108 cm, 292cm, 364cm, 476cm, 524cm, 568 cm, and
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
136
b)
a)
T
O.3"
4
-~ 025"
I
-~ 0.2"
q
o ~ 0.15.
J
I
011
q
0051
J
00
i
i
i
i
i
,
c)
i
,
|
d)
-•
025-
o=-
00
]
I
I
I
I
I
I
e)
Frequency
O.3-
~ 0.25-~
02.-
~z. 0.15-
.........
0.1-
0000 •
0.OS
i
i
01
0.~5
. ~;~
l 0~,
i
0.3
,
,
,
o..~
0.4
o.¢S
wi ndow6800 1 window window4 0
0-~
Freq~mcy Fig. 9. Spectra of principal components 1-5 (a-e, respectively), for window widths of 80, 60 and 40 cm.
•(~IOA!13~ISOJ ' o - e ) ~-[ SlUouodtuoo led!~u!ad Joj o ~ o a ~ ~u!aoVJ lu!od-zu!u ~ Joj slenpmoa pue pu~J1 oqJ~ "0! "~!~ uJo i
a
...
i
i
i
/ qlda(~ l
i
m
J
1
,
or-
-go -,~o-~,o -go
Islonp!sa~ pupal i
-zo -vo -9o -go - -
uJO / o~;9
oo~
og~ ,
oo<; J
ogt i
(x)t 1
*
oo£ l
o~;z i
i
oo~ l
or,a
Ot
0
ql.dao ~
~
oot
-8"0-
-9"0~,o
~~- 5
~
~
.oo .~o • to
-9o
ot
(P i
,
J
i
.
J
~.~
,
(3
A
ot-
-ga
.q,a
.oo
-go -go o~
(o
(q PPI - I CI (L661) 96 ~(gOlOUKlOdpuv ~fuvloqoaPlvd fo ,l~a!aa~t/ 70 la da~l!lO V Ll~
L~I
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
138
o)
b)
c)
d)
C7-
06>.,
~0
d.-
0~-
O0
O~
: ~ O5
i0.3 -
O2-
01-
O0
,
i
i
oo
OO5
O~
O2
O~S
e)
O25
O3
O35
O4
045
C-~
FrequenCy
-e~
~
o,-
~e3
...
-
- ........
window
80
window
60
window
40
02-
01-
OG O0
"
i
,
i
,
005
01
015
02
~
,
,
1
i
0~5
Q.3
O35
0.4
045
05
Frequency
Fig. 11. Spectra of principal components 1-5 (a e, respectively), for window widths of 80, 60 and 40 cm after filtering out the trend.
M.A. Oliver et al. / Review o f Palaeobotany and Palynology 96 (1997) 121-144
139
Table 6 Results of segmentation analyses using Split Moving Window (SMW) and Maximum Level Variance (MLV) for the first five unfiltered principal components and canonical variates Boundary positions (cm) CV I
80
125 150 175 232 280 330 360
SMW-CVs Mahalanobis D z
84
SMW-PCs Euclidean D 2
80
108 124 176 232 292 328 364 440
475 530
476 524
640
568 648
MLV-CVs (% variance) Mahalanobis D 2 78 (17.2%) 90 (1.2%) 106 (2.7%) 130 (4.6%)
Whittington et al.
78 104
176
170 (0.8%}
288 324 360 444 460 476 528 550 564 640
282 (1.0%)
280
362 (6.9%) 442 (0.9%)
356 452
474 (35.7%) 520 566 (3.6%) 638 (17.6%) 662 (0.9%)
640
The figures in bold signify the most prominent boundaries and where the percentage variance exceeds 3% for MLV.
648 cm. There are six subsidiary spikes, which are listed in Table 6. For the principal components the spikes in Dr~ are generally sharper, except at the ends of the transect. The largest peaks are at 80 cm, 324cm, 360cm, 476cm, 528 cm, 564cm and 640 cm. There is very close correlation between the results for these two analyses given in Table 6. The same number of significant boundaries was identified in each case, i.e. thirteen, and ten of these are the same. The boundaries identified by MLV correspond closely with both sets of results from split moving window. Nine boundaries correspond for the three analyses. An additional boundary coincides with the results from SMW when canonical variates are used. Since the results are so consistent it seems probable that the boundaries identified express significant changes in the pollen assemblages of the series. They also confirm the interpretation of CV 1 (Fig. 6a), and the steps in this graph correspond well with the boundaries identified by the segmentation (Table 6).
The results of these analyses were compared with the original subdivision made of the profile by Whittington et al. (1991). The latter identified seven boundaries (Table 6 and Fig. 5), with three major zones and four subzones. Six of these boundaries correspond fairly closely with the major boundaries identified by SMW using canonical variates (Fig. 13a and Table 6). The optimal segmentation (Fig. 12 and Table 6) subdivided the profile more finely than that attempted by Whittington et al. (Fig. 4). This might be because these methods take better account of the correlation among the variables, making them more sensitive to the more subtle changes or simply because Whittington et al. were 'lumpers' rather than 'splitters'. The boundaries from SMW for the canonical variates are shown on the pollen diagram (Fig. 13). The major boundaries coincide with clear differences in the pollen assemblages. At 648 cm Betula and herbs decrease, and Corylus-type increases; at 524 cm Betula, Poaceae and Cyperaceae increase,
140
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
a)
100 9080-
70~:T 60(/3 C3 50E O
40-5 Ld 30-
20lO-
i loo
1so
2oo
2so
soo
sso
Depth
b)
4o0
/
4so
soo
sso
600
650
45o
soo
s50
6o0
650
cm
300
250
~Cr 200U3 C3 150O C 0 -6
-~ 100-
50-
100
150
200
25O
soo
sso
Depth
400
/
cm
Fig. 12. (a) Squared Euclidean distance from split moving window (SMW) plotted against sampling position for the first five canonical variates. (b) Mahalanobis D2 from SMW plotted against sampling position for the first five principal components.
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
141
/
,~
4*
50 . . . . . . . . . . . . . . .
0"~" o ~
,,~
,,O~
,#
1504
200
!
30Oi ~ ' ~
~o~
)
,
4ooi~
i I i
_ ~L Z
,
I
+-
~I
-i
-
....... ¸
:-i ....
......
+~r-?
500~
~ o ~ ~ 700
2O
~ + 40
6O
20
2(7
20
20
40
60
80
20
40
60
20
80
20
40
60
80 lO0
Fig. 13. Boundaries identified by SMW for the first five canonical variates superimposed on the pollen diagram for the 20 taxa retained from Pickletillem with the vertical axis in centimetres. The solid lines represent major boundaries and the dashed lines minor ones.
¢
/
/
0-, .........
5004 E~ 20004 ~
-
- '~ f
~
3500
,~oool
~oo!
5000~ 5500 ~ 6000-
10000~ I0500J 110004 ~-~
P
! i ~
~
h
%
)
20 40 60 80
~-~'
~d
~o
40
eo Bo ~oo
Fig. 14. Boundaries identified by SMW for the first five canonical variates superimposed on the pollen diagram for Pickletillem with the vertical axis in calibrated years. The solid lines represent major boundaries and the dashed lines minor ones.
and Corylus-type decreases; at 476 cm Quercus increases and Betula decreases; at 364 cm decreases in Pinus and Corylus-type are pressaged, while Alnus begins to increase to its maximum for the profile. At 292 cm Corylus-type and herbaceous taxa increase, and Betula, Ulmus and Quercus
decrease. At 108 cm the deciduous arboreal taxa, Betula, Quercus, Alnus and Corylus-type, begin to oscillate in abundance, with peaks in heath and herbaceous taxa. At 84 cm the deciduous taxa and Calluna decline, and Pinus, Poaceae, Cerealia-type, Cyperaceae and other herbs increase.
142
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
The minor boundaries identifed by SMW also correspond with changes in the pollen assemblages. For example, those at 568 cm and 476 cm would demarcate subzones in a conventional zonation. The increase in Ulmus at 568 cm was omitted in the original scheme on ecological grounds (the rational limit was taken as the boundary at 640 cm). The increase in Quercus at 476 cm probably merited a boundary in the original scheme. The boundaries between 124 cm and 232 cm lie within a zone of recurrent fluctuations in the profile and were not delimited separately in the original zonation. Nevertheless, SMW has clearly identified plausible boundaries (Fig. 13), largely where tree pollen declines and herbaceous pollen increases. The vertical axis of the pollen profile has also been represented in dendrochronologically corrected calendar years (Fig. 14). Table 7 gives the corresponding ages of each boundary and shows the considerable variation in the estimated time intervals between them of between 380 and 1710 years.
Table 7 Boundary positions identifiedby SMW with canonicalvariates, the segment lengths, the correspondinguncalibrated years, the calibrated years BP and the length in time of each section of the profile SMW-CVs Segment Uncalibrated T i m e length years years (cm) BP 84
380
460
1133
1250
1599
1780
3116
3490
4469
5040
5100
5840
5870
6680
6500
7350
7420
8180
7858
8560
8180
9030
8480
9450
9320
10,280
24 108
790
16 124
530
52 176
1710
56 232
1550
60 292
800
36 328
840
36 364
670
8O 444
83O
32 476 524
Both the variograms and the spectra show that some aspects of the pollen assemblage vary periodically down the profile. This was not apparent in the pollen diagram (Fig. 5). The period of about 32 cm matches the spacing between the boundaries which is 31.3 cm on average between the closer boundaries (Table 7), and 62.6cm between the more widely spaced ones. There is the same number of large and small segments (Table 7). The period of 32.4 cm derived from the variograms of the principal components corresponds to an average age of about 360 years for the whole profile, from Fig. 2. In addition, the sections down the core appear to represent a common length of time (Fig. 8) of approximately 800 years. The periodicity is more evident in time than in depth. The graph of time against depth (Fig. 2) is reasonably linear between ca. 4870 and 10,310 calendar years BP, and in the absence of other dating the relation must be assumed linear between 4870 and 160 BP. The interpretation is not straightforward because equal lengths of core do not represent equal lengths
380
48
7. Discussion and conclusions
470
44 568
420
80 648
Time (calibrated years)
830
The figures in bold signifythe most prominent boundaries. of time. Nevertheless, the correspondence is good enough to give two periods: one of 800 years matching that estimated by Barber et al. (1994) in northern England, and the other of 360 years like that found by Wijmstra et al. (1984) If the periodicity in pollen counts of important taxa is accepted then one must ask whether this arises from cyclic sedimentation, from periodicity in the climate, or some other factor, either separately or in combination, that affected the vegetation. The periods seem too long to be explained by the ages of the woodland stands, and a model of competitive interaction might explain better the consistency in the composition of the vegetational communities. Whether the balances between different species were shifted by changes in the
M.A. Oliveret al. / Review of Palaeobotany and Palynology 96 (1997) 121-144 c l i m a t e o r o t h e r a t t r i b u t e s o f t h e e n v i r o n m e n t is u n k n o w n . W h a t e v e r the cause, there seems to be a n u n d e r l y i n g p a t t e r n t h a t is w o r t h i n v e s t i g a t i n g f u r t h e r . It w o u l d b e v a l u a b l e t o b e a b l e t o c o m p a r e these results with similar analyses of pollen data
143
The columns of Y are the canonical variates. They are uncorrelated over the whole core and have variances 21,22,-.,2p. They are ranked in order of their variances. Those with the largest )~ maximize the ratio of the between-segment to within-segment variance and discriminate best between the segments, and these were retained for further analysis.
of comparable or greater resolution to determine w h e t h e r p e r i o d i c i t y is a c o m m o n
feature of such
profiles.
Spectral transformation o f autocorrelation Spectra can be obtained from the spatial covariances or correlations, the complements of the semivariances, Eq.(1). Here the non-ergodic correlations were computed using the formula for a regular one-dimensional series:
Acknowledgements
1
We thank Professor H.J.B. Birks for his comments on an earlier version of our paper.
Appendix One-dimensional canonical variate analysis This form of canonical analysis is based on the notion that a transect can be divided by g - 1 boundaries into g relatively homogeneous segments with a common variance-covariance matrix. The analysis and its underlying rationale are as follows. Let the counts be held in a n x p matrix, say Z, where n is the number of sampling points, here 154 levels in the core, and p is the number of variates, here taxa. These are centred by subtracting the column means to give a matrix, C, with elements cj~=zj~-2k, as for principal component analysis. The Sums of Squares and Products (SSP) matrix, say T, is computed for the whole sequence by: T = ~ c~cj
(5)
where cj is the row vector of counts for the jth section, and T signifies the transpose. An analogous SSP matrix of successive differences between sampling points is also computed by:
n-h
Y,
{z(j)-2~ }{z(j+h}-2, }
(91
(n--h)sls2 j="'~
where z(j) and z(j+h) are the values at positions j and j+h separated by the lag h. The quantities 21 and 22 are the means of Z from positions 1 to n - h and from) + h to n, respectively, and sl and s2 are the corresponding standard deviations. Using these means rather than the overall mean and variance, as is usual in computing the autocorrelation, removes some of the long-range effects. The autocorrelation coefficients thus obtained are then used to compute the spectrum by:
--
/~(f)= 2zr
1
fi(k)w(kIcos (~fk)
(i0)
for frequency, f, in the range 0 to ½cycle. The quantity Lis the maximum lag from which the transform is computed: it is the width of a window in the spatial domain. The shape of the window is determined by the smoothing function, w(k), for which we chose the simple Bartlett window. This is defined in the lag domain by k w ( k ) = l - - for 0>k_>L L =0
fork>L
(ll)
M a x i m u m Level Variance
1 S = 2 3~'=2 (cJ--eJ-1)T(cJ--CJ-I)
(6)
This matrix is assumed to be a satisfactory estimate of a withinsegment SSP matrix. The matrices T and S can be used to obtain the transformation to canonical variates. The eigenvalues, 2,i = 1,2,...,p, of matrix TS- 1 and the associated eigenvectors, a~, are found to satisfy: aT(T--2,S)=0
(7) 1
1
n
n
The vectors are scaled so that -aXiSai= l and -aXiTai=,~i. The centred counts, C, are transformed to the matrix Y by ¥=CA
6(h)
(8)
Hawkins and Merriam (1973, 1974) assumed that a full onedimensional sequence can be divided into g relatively homogeneous segments with a common variance-covariance matrix, W, and multinormal distribution. They further assumed that Ill(n-1)IS is a satisfactory estimate of W. They then aimed to minimize the quadratic form:
k=l j=u k 1+1
where zj is the vector of values in the jth section, and ~, is the vector of means of the kth segment, the ends of which are the sections uk-~ and uk. With many variates this would have been formidable. But by transforming the zj to the canonical variates, Yi, the S in Eq. (12) becomes the identity matrix, l, and Q is
144
M.A. Oliver et al. / Review of Palaeobotany and Palynology 96 (1997) 121-144
computed simply as: Q= ~
~.
(yj-fik)r~v, --ilk)
(13)
k = l j=u~ l + t
Hawkins and Merriam then went on to compute Q for all relevant partitions of the sequence by an ingenious piece of dynamic programming (Hawkins, 1975), and thereby to identify the partition for which Q is least.
Split Moving Window The method of split moving window aims to segment a onedimensional sequence by maximizing the difference between adjacent segments. Webster (1973) originally chose squared Mahalanobis distance as the criterion: D~ = (ft,-fft)rW '(f, -:~)
(14)
where ~,~ and ~'t are the mean vectors of the upper and lower halves, respectively, of the window, and W is the pooled withinhalves variance-covariance matrix. The window is moved down the sequence of pollen data one sampling interval at a time, and D~ is calculated at each position and plotted against it. Peaks indicate where the contrast is greatest and hence the boundaries between segments. A limitation of the technique is that matrix W must not be overdefined; i.e. there must be fewer variates than there are sampling points in the window. So again, if many variates have been recorded then the data must be reduced. This could be done by principal component analysis, but, as above, there is no guarantee that the components would discriminate as well as the original data, as Webster (1978) discovered. The canonical variates retain that power, and so these are used as 'data', but now a squared Euclidean distance is computed from them as criterion: O~ = (L --Y,)T(L --Y,)
(15)
References Aaby, B., 1976, Cyclic climatic variations in climate over the past 5,500 yr reflected in raised bogs. Nature, 263: 281-284. Barber, K.E., Chambers, F.M., Maddy, D., Stoneman, R. and Brew, J.S., 1994. A sensitive high-resolution record of the late Holocene climatic change from a raised bog in northern England. Holocene, 4: 198-205. Bennett, K.D., 1994. Annotated Catalogue of Pollen and Pteridophyte Spore Types of the British Isles. Dept. Plant Sei., Univ. Cambridge, Cambridge. Bennett, K.D., Whittington, G. and Edwards, K.J,, 1994. Recent plant nomenclatural changes and pollen morphology in the British Isles. Quat. Newsl,, 73: 1-6. Birks, H.J.B., and Gordon, A.D., 1985. Numerical Methods in Quaternary Pollen Analysis. Academic, London.
Birks, H.J.B. and Line, J.M., 1994. Sequence splitting of pollen accumulation rates from the Holocene and Devensian lateglacial of Scotland. Diss. Bot., 234: 145-160. Gordon, A.D., 1982, Numerical methods in Quaternary palaeoecology, V. Simultaneous graphical representation of the levels and taxa in a pollen diagram. Rev. Palaeobot. Palynol., 37: 155-183. Green, D.G., 1981. Time series and postglacial forest ecology. Quat. Res., 15:265-277 Hawkins, D.M., 1975. Fortran IV program to segment multivariate sequences of data. Comput. Geosci., 1: 339-351. Hawkins, D.M. and Merriam, D.F., 1973. Optimal zonation of digitized sequential data. Math. Geol., 5: 389-395. Hawkins, D.M. and Merriam, D.F., 1974. Zonation of multivariate sequences of digitized geologic data. Math. Geol., 6: 263-269. Jenkins, G.M. and Watts, D.G., 1968. Spectral Analysis and its Applications. Holden-Day, San Francisco, CA. Journel, A.G. and Huijbregts, C.J., 1978. Mining Geostatistics. Academic, London. Lotter, A.F., Either, U., Birks, H.J.B, and Siegenthaler, U., 1992. Late-glacial oscillations as recorded in lake sediments. J. Quat. Sci., 7: 187-204. Merriam D.F. (Editor), 1964. Symposium on Cyclic Sedimentation. Kans. Geol. Surv. Bull., 169(I/II). Odgaard, B.V., 1994 The Holocene vegetation history of northern West Jutland, Denmark. Opera Bot., 123: 1-171. Oliver, M.A. and Webster, R., 1987. The elucidation of soil pattern in the Wyre Forest of the West Midlands, England, II. Spatial distribution. J. Soil Sci., 38: 293-307. Prentice, I.C., 1980. Multidimensional scaling as a research tool in Quaternary palynology: a review of theory and methods. Rev. Palaeobot. Palynol., 31: 71-104. Ross, G.J.S., 1987. Maximum Likelihood Program. Numerical Algorithms Group, Oxford. Schwarzacher, W., 1975. Sedimentation Models and Quantitative Stratigraphy. Elsevier, Amsterdam. Stuiver, M. and Reimer, P.J., 1993. Extended 14C data base and revised CALIB 3.0 14C age calibration program. Radiocarbon, 35: 215-230. Walker, D. and Wilson, S.R,, 1978. A statistical alternative to the zoning of pollen diagrams. J. Biogeogr., 5: 1-21. Webster, R., 1973. Automatic soil boundary location from transect data. Math. Geol., 5: 27-37. Webster, R., 1978. Optimally partitioning soil transects. J. Soil Sci., 29: 388-402. Webster, R. and 0liver, M.A., 1992. Sample adequately to estimate variograms of soil properties. J. Soil Sci., 43: 177-192. Whittington, G., Edwards, K.J. and Caseldine, C.J., 1991. Lateand post-glacial pollen-analytical and environmental data from a near-coastal site in north-east Fife, Scotland. Rev. Palaeobot. Palynol., 68: 65-85~ Wijmstra, T.J., Hoekstra, S., De Vries, B.J. and Van Der Hammen, T., 1984. A preliminary study of periodicities in percentage curves dated by pollen density. Acta Bot. Need., 33: 547-557.