Journal of Environmental Management (1996) 47, 37–60
Land Classification for Strategic Ecological Survey
R. G. H. Bunce∗, C. J. Barr∗, R. T. Clarke†, D. C. Howard∗ and A. M. J. Lane∗
∗Institute of Terrestrial Ecology, Merlewood Research Station, Grange-over-Sands, Cumbria LA11 6JU, U.K.
†Institute of Terrestrial Ecology, Furzebrook Research Station, Wareham, Dorset BH20 5AS, U.K.
Received 31 March 1995; accepted 5 September 1995
Traditionally, ecological survey relies upon the intuitive interpretation of habitat patterns in the field. The statistical stratification developed for regional survey by the Institute of Terrestrial Ecology (ITE) is designed to minimise personal judgement in sample site location. The environmental strata within the classification are recognisable and give confidence to users from a variety of backgrounds. The methodology originated in the 1960s when multivariate statistics were developed and applied to ecological data. Initially, Great Britain (GB) was classified from environmental data recorded from 1200 out of the 240 000 km2 of the National Grid. Logistic discrimination and discriminant functions were used to assign the remaining squares to original classes and to reassign the original squares. The final classes differed slightly from the initial divisions because the allocations use different data sets and different techniques. Field surveys of ecological parameters have been used to provide independent data for testing the classification, to characterise the classes and to provide national estimates of land coverage. The statistical rationale behind the methodology is described and the relevance of the experience gained during the development is discussed in relation to future work. Finally, applications of the classification are described demonstrating its use as a framework in a variety of ecological studies. 1996 Academic Press Limited
Keywords: land classification, stratification, statistical assumptions, field survey.
1. Introduction
The traditional way of creating sampling strata in ecology has been to make intuitive divisions of habitat types in the field, and then to describe each type from samples located subjectively. Such approaches, e.g. in phytosociology (Ellenberg, 1978) and soil science (Beckett, 1971), are primarily intended for mapping and include no independent tests to validate the strata. Further, because of their subjective nature, valid statistical estimates of extent cannot be made even though the categories are usually mapped and can be measured; no estimate of variability can be assigned to the measurements. In
addition, those areas which cannot easily be assigned to a group may be ignored, especially where they fall between the standard definitions. At the landscape scale, units are usually defined subjectively (e.g. Hills, 1974), or the system of classification is constructed from experience, e.g. the Agricultural Land Classification (MAFF, 1966). Landscape evaluation has presented similar, if not more difficult, problems for quantitative analysis, and there has been much discussion about the validity of attaching aesthetic values (Tandy, 1967). Fully quantitative procedures were developed in the 1970s (Robinson et al., 1976), but these have not been widely used because intuitive methods are much easier to apply. In theory, the environmental classification of landscape types should be separated from the process of aesthetic evaluation (Liddle, 1976) but, in practice, the two approaches often overlap.
This paper presents the statistical rationale behind the objective system of classification developed by ITE (Bunce et al., 1975). The system is designed to provide a set of general integrated strata for assessing ecologically related parameters in Great Britain. The historical background to the development is described before considering the statistical features of the system and its potential for producing useful classifications in other countries or regions.
2. Historical background
The ITE system had its origin in the expanding use of multivariate analysis for ecological data in the 1960s and 70s. With advances in computing technology and the development of more efficient algorithms, it became possible to analyse progressively larger data sets with relative ease. Such methods were widely accepted in vegetation science (Greig-Smith, 1964), but were less readily accepted in other disciplines such as soil science (Beckett, 1971), although there was much discussion of their applicability (e.g. Ivimey-Cook et al., 1966).
In the early 1970s, ITE (Bunce et al., 1975) applied a multivariate classification technique to environmental parameters derived from published maps, rather than to ecological field data. A series of sample units were classified into groups according to a set of environmental parameters that were interrelated in complex ways. The procedure formalised and mimicked objectively the recognition processes previously used to identify landscape types. Extremes of landform were easily identified, but it was more difficult to classify more uniform landscapes which had no single dominant feature. For example, the Cairngorms, as the highest mountain plateau in GB, were recognised as a landscape unit, but it was difficult to classify the southern uplands of Scotland which had no instantly recognisable boundaries. The method developed by ITE divided the whole land surface into non-overlapping spatial sampling units, which were classified into groups according to their measured environmental features. These groups could then be used as strata from which to draw samples of ecological parameters as required. As categories, they formed a suitable basis for research in landscape ecology. The interrelationships between ecological composition, spatial arrangement and man’s influence could be examined across sample areas while expressing them in a national context (Bunce, 1984). The effectiveness of this methodology depends upon the strength of the relationship between the features being surveyed, the basic environmental data, and the method of classification used to form the strata. For a survey of Cumbria (Bunce and Smith, 1978), financial restrictions required the use of cheap and readily available data. Sources included published cartographic data, such as the 1:63 360 (1″ to the mile) Ordnance Survey maps and geological maps at 1:250 000 scale. The first phase recorded as many attributes as possible from the
maps so as to identify strong elements and to eliminate bias. Using a grid, data were recorded from 11% of the 1 km squares in Cumbria, and multivariate analysis was used to classify the squares into 16 classes. These classes showed well-defined patterns of distribution within the county relating to known geomorphological features, but also produced interesting patterns not readily apparent from direct observation. In the second phase, a random sample of squares from each land class was visited and details of the vegetation recorded as a test of the validity of the land classes. The high correlations between the land classes and their vegetation composition confirmed their suitability for use as strata on which to base other environmental surveys. The exercise was similar to that of Fourt et al. (1971), except that reciprocal averaging (Hill, 1973) was used to analyse data from vegetation samples, and the data were correlated with the comparable analysis of the environmental data. The correlation (R²) between the first ordination axes of the two analyses was 0·745 (P < 0·001). Reciprocal averaging has now been replaced by DECORANA (Hill, 1979b). The demonstration of such correlations and of the potential for applying the classes to land use management and planning led to a survey of the whole of GB.
3. The ITE land classification of GB
The initial classification in 1977 was based on a sample of 1212 × 1 km squares at the intersections of a 15 × 15 km grid placed over GB. Four types of environmental parameters were recorded for each 1 km square:
(i) climate;
(ii) topography;
(iii) human geography;
(iv) geology and drift.
A total of 40 environmental variables was recorded, details of which are given by Bunce et al. (in press). Some variables were pre-selected to prevent distortion of the final classification; for example, three representative climate variables were selected from a total of 12 using Principal Component Analysis. In order to use Indicator Species Analysis (ISA), a divisive polythetic classification method (Hill et al., 1975), the data had to be in binary form (i.e. presence/absence); where the data were recorded as continuous variables, they were ranked and split evenly into four “pseudo” attributes, producing 198 binary attributes. Other features, such as geological formations, were also recorded, giving an overall total of 281 attributes. The data were classified using ISA which was originally designed to analyse botanical records sampled in quadrats which contain many zero values; the environmental variables converted to pseudo attributes had a similar form even though many features were continuous. TWINSPAN (Hill, 1979a), which superseded ISA, uses effectively the same ordination and classification algorithm. The classification was stopped after five levels of binary division had produced 32 groups of 1 km squares, termed “land classes”. The 32 land classes were then used as strata to select sites for a field survey of GB. Estimates of the relative size, and hence total area, of each land class in GB were improved by classifying (or keying out) a further 4800 squares. The squares were classified using the key produced by ISA. The 76 “key” indicator attributes were recorded for a regular grid of four squares around the original 1212 squares. The ISA key was also used in studies which required squares to be classified outside the sampling grid, e.g. by O’Connor (1985) for the sites of the Common Bird Census. The Planning Department of the Highland
Regional Council in Scotland later classified all the squares in its region (over 25 000), and showed the advantages of complete coverage for strategic planning purposes where more exact geographical definition is required, as well as exact figures for the extent of each land class (Bunce et al., 1986). It was neither practical nor desirable to classify all 240 000 squares in GB using the key produced by ISA. It would have taken tens of man-years to record manually the key attributes for every square, and, even then, there were potential errors in using a procedural dichotomous key. It was necessary to use automated data capture methods to reduce the weighting of data by further multivariate techniques. However, the data and techniques prevented the same level of detail being recorded for all variables and the results did not perfectly match the original ISA. The procedure that was eventually adopted is described in section five. The original 32 classes derived from the classification of the 1212 squares were used as strata for the stratified random selection of eight 1 km squares from each class (i.e. 256 in total) for a field survey of vegetation, soils and land cover in 1977/78. A summary of the information collected is given by Bunce and Heal (1984). These data were used to assess the effectiveness of the whole field sampling scheme at a national level and show that, for the main land cover categories in Britain, the figures were comparable to those produced by independent official sources. In 1984, the same 256 squares were resurveyed, but the sample size was increased by a further four 1 km squares from each class to give a total of 384 squares. In the second survey, more detailed information on land cover was collected (Barr et al., 1986). It is also possible to estimate the amount and distribution of features within regions (or sub-populations) of GB using the land class composition. 
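As an illustration of how land class composition can be turned into a regional figure, the following minimal Python sketch weights the mean cover per 1 km square in each land class by the number of squares of that class in the region. The class means may be taken either from all surveyed squares in GB or only from those falling inside the region. The function names, the survey records and the regional class counts are hypothetical and are not taken from the ITE surveys.

```python
from collections import defaultdict


def class_means(samples):
    """Mean cover (km2 per 1 km square) for each land class.

    `samples` is an iterable of (land_class, cover) pairs.
    """
    totals, counts = defaultdict(float), defaultdict(int)
    for land_class, cover in samples:
        totals[land_class] += cover
        counts[land_class] += 1
    return {c: totals[c] / counts[c] for c in totals}


def regional_estimate(region_class_counts, means):
    """Sum over classes of (squares of that class in the region) x (class mean)."""
    missing = [c for c in region_class_counts if c not in means]
    if missing:
        raise ValueError(f"no surveyed squares for land classes {missing}")
    return sum(n * means[c] for c, n in region_class_counts.items())


# Hypothetical survey records: (land class, inside region?, woodland km2 per square)
survey = [
    (1, True, 0.10), (1, False, 0.20), (1, False, 0.30),
    (2, True, 0.40), (2, True, 0.60), (2, False, 0.20),
]
# Hypothetical number of 1 km squares of each land class within the region
region_class_counts = {1: 100, 2: 50}

# Class means from every surveyed square in GB
means_all = class_means((c, y) for c, _, y in survey)
# Class means from only those squares that fall inside the region
means_local = class_means((c, y) for c, inside, y in survey if inside)

estimate_all = regional_estimate(region_class_counts, means_all)
estimate_local = regional_estimate(region_class_counts, means_local)
print(f"all GB samples: {estimate_all:.1f} km2; regional samples only: {estimate_local:.1f} km2")
```

The two figures differ whenever the class means inside the region depart from the GB means, and the regional-only variant fails outright if a land class present in the region contains no surveyed squares.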
There are two methods of producing estimates of cover for regions from features surveyed at a national scale: either (1) the samples can be filtered and only those occurring in the region used to produce the land class means, or (2) the total sample can be used to estimate the mean for each land class in the region. Method (2) involves the assumption that, for each land class, the true regional mean is the same as the GB mean. The regions are unlikely to be random samples from the whole GB population because they are spatially coherent sub-populations. If the average cover of a particular land feature within a land class is different in the region than over GB as a whole, then the resulting method (2) estimate may be biased. However, a large region may contain the complete population of some land classes as the land classes are also geographically coherent, and the populations of squares in the other land classes may be too small and hence have little influence on the regional estimates. Often any bias in method (2) will be outweighed by the increase in precision of feature estimates based on using all the available surveyed squares, rather than just those (perhaps few) which fall within the current region of interest. Between-region differences in land class means should be assessed to test the validity of the method (2) assumption. Correlation analyses have been used to test the links between the original environmental classification and the various sets of survey data. For example, the mean value of the first reciprocal averaging ordination axis of the first ISA division (“first mean axis score”) in the land classification has been calculated for each land class as a measure of the overall environmental gradient behind the classification. The gradient of land cover was assessed using the principal components from the matrix of mean land cover values of the areas in the sample squares for each of the 32 land classes. 
Similarly, component values were extracted from the mean proportions of each soil group in each of the 32 classes. The correlation coefficients from the mean axis scores
from the land classification were 0·82 with the first component of the land use data, and 0·92 with the first component of the soils data (P < 0·001, 30 df, in both cases). A further test was provided by Furse et al. (1990), who sampled invertebrate populations from streams using the methodology described in Wright et al. (1989); they found a correlation of 0·95 between the mean first axis scores from DECORANA for the land classes and the equivalent for the freshwater invertebrate species composition.
4. Statistical rationale
The primary objective of the stratification was to describe the relationships between different elements of the British landscape, and then to partition the variation within the GB land surface into classes which, although arbitrary, are “relatively” homogeneous. The following assumptions were involved.
(i) There is a “natural” structure to the environmental factors which underlie the GB countryside that is reflected in the inter-correlations between the variables. These features can be measured and analysed to formalise the relationships. However, the structure has fuzzy, not crisp, boundaries because of the continuous nature of the variation.
(ii) These physical environmental parameters are correlated with the composition of elements of the countryside, such as land use and habitats, directly and indirectly.
(iii) An appropriate classification technique applied to measured map parameters can define strata and identify the main environmental trends and structure of the GB environment.
(iv) The land features of interest may be predicted with statistical measures of accuracy depending upon their degree of correlation with the strata, and hence the strata may be used to improve estimates of the areal cover of the features.
In general, stratification provides a more efficient strategy than simple random sampling (see Cochran, 1977), although Kish (1965) has pointed out that there may be only small or moderate gains in proportional sampling of many attributes in single surveys. A random sample would not allow different types of landscape to be described and would limit the expression of heterogeneity in subsequent models. The efficiency of any stratified sampling scheme is defined as the ratio of the error variances obtained from simple random sampling (SRS) to those resulting from the stratified scheme, where equivalent effort is used. The efficiency of a stratification depends on the classification on which it is based. In the literature, a great deal of effort has been devoted to the comparison of classification methods. Most methodological studies involving classification techniques (e.g. Howard, 1977; Podani, 1989) use standard data sets which have a known structure, so that the efficiency of the classification can be compared with a rather simple version of “the truth”. The value of the classification goes beyond the simple increase in efficiency to include the ability to extend the data and sub-divide the population. No particular classification technique has been proved better than the others in all situations (Everitt, 1980; Kent and Ballard, 1988), and effectiveness depends to a great degree upon the matching of the algorithm to the type of data involved. For the purpose of stratification, the method of classification is arbitrary because, provided it is unrelated, the result cannot be predetermined. In the present case, the experience of previous studies (Bunce et al., 1975; Bunce and Smith, 1978) has shown
that ISA is appropriate for the type of data involved, probably because there are many zeros and the information is therefore similar to the botanical data for which ISA was originally designed. Any inefficiency in the classification method will be reflected in higher error terms attached to subsequent estimates of the land features. Efficiency should be determined using comparisons with data collected by different methods. In multiple-feature surveys, without detailed prior knowledge of the variability involved, it is best to allocate sampling effort so that strata are sampled in proportion to their size. In the worst case, stratified sampling will only be as efficient as simple random sampling, but will normally improve accuracy significantly (Cochran, 1977). However, if the individual strata are of interest, then it is necessary to have a minimum number of samples in each class to produce acceptable estimates of their features. The optimum strategy for the allocation of samples to classes therefore depends upon the objective of the study, and this should be defined carefully at the outset. A test of the efficiency of the ITE land classification was carried out on the county of Devon by the Aberdeen Centre for Land Use (Brandon et al., 1989). It demonstrated significant increases in efficiency over SRS, and showed that the ITE national classification was more effective than a specific classification for Devon in terms of predicting land cover. “Neyman’s allocation” (Neyman, 1934) states that the optimal stratification scheme should minimise the variance of the population estimation per unit cost (in terms of money and/or time). Performing field survey in different landscapes involves variable costs; for example, upland areas may take more time to reach and walk over, but intricate complex patterns of land features in lowland areas may take longer to map and record. 
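The trade-off between stratum variability and survey cost can be sketched with the textbook optimum allocation for stratified sampling (Cochran, 1977), in which the sample size in stratum h is proportional to N_h S_h / sqrt(c_h), where N_h is the number of squares in the stratum, S_h the within-stratum standard deviation and c_h the cost of surveying one square. With equal costs this reduces to Neyman allocation. The Python below is a minimal illustration only: the function name and the stratum sizes, standard deviations and costs are invented and do not come from the ITE surveys.

```python
import math


def optimum_allocation(sizes, sds, costs, n_total):
    """Split n_total samples across strata in proportion to N_h * S_h / sqrt(c_h).

    Returns fractional sample sizes; rounding is left to the user.
    """
    weights = [N * s / math.sqrt(c) for N, s, c in zip(sizes, sds, costs)]
    total = sum(weights)
    return [n_total * w / total for w in weights]


# Two hypothetical strata: a small, variable, costly upland class and a
# large, more uniform, cheaper lowland class.
sizes = [1000, 3000]   # number of 1 km squares in each stratum
sds = [2.0, 1.0]       # within-stratum standard deviation of the feature
costs = [4.0, 1.0]     # relative cost of surveying one square

alloc_cost = optimum_allocation(sizes, sds, costs, n_total=100)
alloc_equal = optimum_allocation(sizes, sds, [1.0, 1.0], n_total=100)
print("with unequal costs:", alloc_cost)
print("with equal costs (Neyman allocation):", alloc_equal)
```

In this invented example the higher cost of the first stratum shifts effort away from it relative to the equal-cost allocation, even though it is the more variable of the two.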
If it is known from previous studies that a particular feature does not occur in a certain stratum, then the best strategy for that feature is not to sample the stratum. However, when recording a variety of land features, especially on the first survey, such information is not available and all classes need to be covered; expert opinion must be treated with caution as alterations may merely reinforce existing prejudice. Altogether, the relative within-stratum variability in each of the parameters and the costs of sampling different strata are likely to lead to differing, and perhaps conflicting, optimum allocations of sampling effort in multi-purpose surveys. However, the initial survey can provide estimates of the variability of features and define the most efficient class sample sizes for different features. Where unusual, but important, environmental types form a small part of the total population, then SRS, and even proportional stratified sampling, are unlikely to provide an adequate sample unless the sample size is increased. If the rare feature is correlated with the parameters used to form the stratification, then the classification will still be more effective than SRS. The original ITE GB field survey selected eight sample squares per land class, regardless of class size, because the primary objective was to characterise the strata in terms of land cover features, not simply to provide estimates of GB populations. The land classification of GB produced spatially cohesive land classes, as shown in Figure 1, which were more likely to produce a dispersed sample than SRS. The dispersion of samples was further assisted by the use of a sampling grid. Moss (1985) identified spatial cohesion as an advantage in both interpretation and presentation, especially where the strata are recognisable geographically. The initial sample-based classification showed a number of geographical outliers which were reduced when every square in GB was classified. 
Spatial autocorrelation, i.e. the increased probability of neighbouring sites being similar, as discussed by Sæbø (1983), is an important factor associated with geographic cohesion within types.
Figure 1. The distribution of 1 km squares in GB among the 32 land classes of the ITE Land Classification.
4.1. An alternative method of deriving population estimates from field samples would be to use multiple regression. Having selected and surveyed a sample of 1 km field squares, multiple regression type analyses could be used to relate the amount of each land feature recorded in the field survey to the environmental attributes associated with each square. In this case, classes would not be required because each square would be treated individually. If the environmental attributes were then available for (all) other squares, the regression relationships could be used to predict their land features. However, a separate regression model would be needed for every ecological parameter. The equations would also have to include complex interaction terms, because a type of land cover may only occur in certain joint combinations of the environmental variables. Robertson (1991) has shown such an approach is feasible for an individual game species. Furse et al. (1990) are also investigating this using the basic variables to derive regression equations for freshwater invertebrates. The advantages of economy in computing made the development of a complete classification achievable. An advantage of the land classification is that the land classes may be recognised and used with little or no statistical training, whereas statistical knowledge is required for regression relationships using such complex data. In addition, if a long-term aim is to estimate change in land use, then regression equations would have to be re-calibrated after each field survey and the driving environmental variables which best predict each feature may even have changed. The 32 land classes are, in fact, equivalent to 32 binary (0/1) dummy environmental regression variables in multiple regressions for each separate land feature. The regression coefficient for a land class dummy variate would represent the deviation of the mean amount of a land feature in that class from the population mean. 4.2. 
There are three main approaches to the classification of multivariate data: agglomerative, divisive and non-hierarchical. The first ITE land classification was produced in 1977 when the commonly used classification techniques were either polythetic agglomerative methods (such as Similarity Analysis and Information Analysis) or polythetic divisive techniques. Howard (1977) identified the problems of applying numerical methods to ecological and environmental data, and concluded that the technique used should be determined by the type of problem. Non-hierarchical iterative relocation techniques are described by Moss (1985). As described above, ISA (now TWINSPAN) was used to produce the ITE land classes, partly because experience had shown it to be a suitable technique, but also because the algorithms for agglomerative methods were not able, at that time, to handle the size of data set involved. With recent developments in classification methods and computer power, it is now possible to apply a range of classification techniques to investigate the multivariate structure of the data (e.g. Podani, 1989). However, these classifications should be tested against independent field data in order to assess their efficiency. Many reviews are available on classification techniques; for example, Dale (1988) describes the wide range of classification methods and emphasises that the choice of method is arbitrary and best gauged by individual performance. Whilst classification procedures other than TWINSPAN may now provide a better stratification, any possible improvements are likely to be small in relation to the variability in the field data.
4.3. The selection of data used for constructing the classification was limited primarily by availability, but also by the ease and cost of acquisition at a consistent geographical resolution over GB. The classification for Cumbria (Bunce and Smith, 1978) included topography, geological attributes and human artefacts, but not climate. The relationship between the types of data is of major importance in that the analytical techniques identify factors which are highly correlated with each other. The classification depends upon the strength of the inter-correlations between attributes, together with the number of attributes representing each aspect or gradient of the variation. If the attributes measured are mostly climatic, or related to climate, then the classification will be predominantly based on climatic factors. Ideally, each type of data (e.g. geology) would be investigated individually and its effect within the classification determined by exploratory analysis. An alternative strategy is to rely upon the classification technique being sufficiently robust to produce groups of samples that are not overweighted towards individual variables. In the ITE classification, 12 climate variables were originally recorded but were so highly correlated that they dominated the structure of a preliminary classification. Accordingly, Principal Component Analysis was carried out on the climatic data and the three main variables which expressed most of the variability were used in the classification. The stability of the classification depends upon the strength of the principal gradient in the study area and upon the balance between the attributes. In a study of the Culm area of Devon (Dartington, 1986), the analysis was dominated initially by climatic factors which were correlated with altitude. However, excluding the climatic data had little effect on the classification as altitude defined the principal gradient for most climatic features. 
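The reduction of a set of highly inter-correlated variables to a few representatives can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the procedure actually used for the ITE climate data: the principal components of the correlation matrix are found by power iteration with deflation, and the rule of keeping, for each leading component, the variable with the largest absolute loading is an assumption made here. The data are synthetic (six variables driven by three hidden factors) and all function names are hypothetical.

```python
import math
import random


def correlation_matrix(rows):
    """Correlation matrix of the columns of `rows` (a list of equal-length lists)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]


def leading_components(matrix, k, iterations=500):
    """Top-k (eigenvalue, eigenvector) pairs of a symmetric matrix.

    Uses power iteration, removing each component found (deflation).
    """
    p = len(matrix)
    work = [row[:] for row in matrix]
    found = []
    for _component in range(k):
        vec = [float(j + 1) for j in range(p)]  # arbitrary non-zero start
        for _step in range(iterations):
            nxt = [sum(work[a][b] * vec[b] for b in range(p)) for a in range(p)]
            norm = math.sqrt(sum(x * x for x in nxt))
            vec = [x / norm for x in nxt]
        eigenvalue = sum(vec[a] * sum(work[a][b] * vec[b] for b in range(p))
                         for a in range(p))
        found.append((eigenvalue, vec))
        for a in range(p):
            for b in range(p):
                work[a][b] -= eigenvalue * vec[a] * vec[b]
    return found


def pick_representatives(components):
    """For each component, keep the not-yet-chosen variable with the largest |loading|."""
    chosen = []
    for _eigenvalue, vec in components:
        ranked = sorted(range(len(vec)), key=lambda j: abs(vec[j]), reverse=True)
        chosen.append(next(j for j in ranked if j not in chosen))
    return chosen


# Synthetic data: variables 0-2 follow hidden factor a, 3-4 follow b, 5 is c.
rng = random.Random(0)
rows = []
for _ in range(200):
    a, b, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
    rows.append([a + rng.gauss(0, 0.1), a + rng.gauss(0, 0.1), a + rng.gauss(0, 0.1),
                 b + rng.gauss(0, 0.1), b + rng.gauss(0, 0.1), c])

components = leading_components(correlation_matrix(rows), k=3)
chosen = pick_representatives(components)
print("representative variables (column indices):", chosen)
```

Because one variable is retained per component, a block of near-duplicate variables contributes a single representative and can no longer dominate a subsequent classification through weight of numbers.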
In the GB classification, altitude and climate variables were numerically dominated by the 120 geological attributes but, because of the high degree of inter-correlation of the former, they played the major part in the formation of the principal gradients, and subsequent divisions, in the classification. There are two further aspects of data balance which are important.
(i) Jones et al. (1985) investigated the effect of converting continuous variables into discrete pseudo-attributes, and demonstrated high correlations between the analyses, providing an appropriate analytical technique was used. The high correlation was due not only to the dominance of major gradients, but also to the importance of the qualitative, rather than quantitative, aspects of the data.
(ii) Some attributes are not sufficiently common to become class indicators. In the original ITE classification, continuous variables were converted into four pseudo-attributes by dividing the full range of their values into four equal sub-ranges. It is possible to use the frequency distribution of the continuous variates to define the splits so as to ensure better representation in each attribute class. Elena-Rossello et al. (1984) investigated various methods for dividing variables, and concluded that minor improvements could be made to the classification (assessed by using an independent variable as a test) by making the separation more even.
It may be concluded from the above discussion that manipulative analyses can improve the classification to some degree but that the major structure is unlikely to be affected. Even very simple ordination methods identify the principal gradient successfully (Bunce,
1965), but the allocation to classes also depends on the positioning of the cutpoint. This is usually at the centre of multivariate space, where samples are densest. Consequently, a small difference in the point of division on the gradient inevitably reallocates many samples. 4.4. The selection of 32 as the final number of land classes was based on two criteria. (i) Size of end groups. The minimum number of 1 km squares within each of the final groups produced by the ISA was important to obtain an adequate estimate of land features for each land class. Using the continuously variable GB environment, the ISA divisions of the 1212 squares into 32 groups produced a range of 18 squares in the smallest group and 78 in the largest. Although much consideration has been given to the stopping rules for further sub-divisions in the classification (e.g. in relation to the variance of different groups), the most widely used procedure is to accept an ad hoc minimum size of group guided, to some extent, by usability. Within ISA, smaller groups generally have a higher information content than larger groups and also tend to be ecologically diverse (Bunce, 1982). It is not therefore advisable for all groups to be virtually the same size because this would over-ride the natural structure of the data. (ii) Interpretability of divisions. Sufficient groups were needed to produce an adequately detailed sub-division of the GB land surface, whilst recognising the limited resources available for field survey of each of the resultant strata. The existing 32 classes provided a convenient number for a variety of purposes. If a smaller number of classes was required, groups could be combined within the hierarchy of the classification (Harvey et al., 1986). Studies of the Scottish Highland Region (Highland Regional Council, 1984) and Devon (Brandon et al., 1989) have investigated the effect of increasing the number of groups by subdividing the ITE land classes in the relevant regions. 
For Devon, both the extended ITE classification and the standard ITE system performed better than a new classification. Aberdeen Centre for Land Use concluded that extending the ITE system was the optimum strategy, if sufficient field samples were available. The Highland Region study showed that subdivision of the classes was appropriate for planning purposes.
4.5. The optimum size of the sampling units depends upon the objective of the study (Greig-Smith, 1964; Lambert, 1972). The choice of scale is therefore a compromise between sampling many small units and sampling fewer larger units. The former has theoretical advantages in that it is self-evident that sampling 40×12 km squares is likely to cover more variation than 2×10 km squares. However, each sample unit costs a certain amount of time to survey, irrespective of its size, because of the time spent travelling to and locating the site. For the same total area sampled, it therefore costs more to sample many small units, although they will probably give more precise estimates. The smaller the sample unit, the greater the problem of precise relocation, and hence the introduction of errors in repeat surveys to estimate land use change. The 1 km square is a convenient compromise sampling unit for many purposes; it integrates with the
Table 1. Comparison of ITE land cover estimates (in km² × 10³) for 1984, based on a stratified field survey sample using the ITE classification, with the Monitoring Landscape Change (MLC) project (Hunting, 1984) results for equivalent land cover types: area in England and Wales (EW) and coefficient of variation (CV)

                              ITE 1984             MLC
Land cover                    EW       CV          EW        CV
Woodland                      10·01    12·82        9·10 a    6·80
Semi-natural vegetation       11·92    14·38       13·90 b   10·20
Farmed land                   62·76     3·79       65·90 c    1·40
Water/wetland                  2·64    21·63        1·10     16·70
Urban                         12·59    14·08       10·00      4·90

a With 1·2% ecological scrub.
b Reduced by 0·6% for scrub and increased by 5·3% to cover rough grassland and neglected grassland.
c Increased by 0·6% for scrub and reduced by 5·3% to cover rough grassland and neglected grassland.
National Grid and Ordnance Survey maps and is widely used for planning studies at the strategic scale (Brandt et al., 1994). However, for other purposes, larger units may be used; for example, ITE’s Biological Records Centre uses 10 km squares. Sæbø (1983) advocates variable-sized point samples, depending upon the scale of spatial variability. More recently, Elena-Rossello (1989) has developed this concept in relation to the land classification technique in Spain and shown how large uniform areas can be analysed at a coarse resolution, whereas more complex and variable landscapes can be dissected to a smaller scale. Units of 1 km2 can be readily combined to show the land characteristics of larger areas, such as counties or river catchments. By using contouring techniques, the sample of 1 km squares can estimate a response surface and the scale of resolution for interpretation can be varied (e.g. Horrill et al., 1988). 4.6. For many policy purposes it is necessary to know the degree of confidence that can be placed upon estimates derived from samples, and this is usually achieved by calculating error terms. For the ITE Land Classification system the relevant equations for stratified random sampling are given by Cochran (1977). In the Monitoring Landscape Change (MLC) report (Huntings Surveys and Consultants, 1986), error terms are presented as percentage errors or coefficients of variation to reflect the effect of overall coverage. Table 1 shows the ITE figures for England and Wales in comparison with the MLC figures. The latter are derived from a larger sample (707 as opposed to 256), but many of the strata used contained only small numbers of samples. Table 2 estimates the potential reduction in coefficient of variation with increasing sample size and different allocations of samples in land classes. A major problem with land cover data is the high proportion of zero values, which increase relative variability and make population estimates less precise. 
Zeros can arise where the “a priori” probability of occurrence of the feature in the square is zero or negligible, e.g. coastal features in inland classes. However, many zeros representing the
Table 2. Potential reduction in coefficient of variation (CV) produced by increasing sample size (N) with different stratified sampling strategies. Two land cover types are presented from the 1990 survey. “Observed sample” is weighting using the existing sample, “Proportional” is assuming sample proportional to stratum area and “Optimum” reflects both the stratum and variability of values

(a) Wheat
Sample allocation     N = 384    N = 508    N = 700    N = 1000
Observed                7·34       6·38       5·44       4·55
Proportional            5·24       4·56       3·89       3·25
Optimum                 4·43       3·49       3·28       2·74

(b) Saltmarsh
Sample allocation     N = 384    N = 508    N = 700    N = 1000
Observed               38·60      33·96      28·59      23·92
Proportional           51·00      44·35      37·78      31·61
Optimum                11·01       9·58       8·16       6·82
tail of distributions arise because a land feature or use simply does not currently occur in a square, even though it is environmentally suitable; for example, hills suitable for heather moorland may be covered by forest. The more land use and land feature subdivisions that are recorded in a survey, the smaller the probability that any particular land feature will occur in any particular square. If an arable field previously of barley is now sown with wheat, there is a significant change in the area of barley, but no change in the area of “cereal crops”. If a square is, for example, covered in forest, other areal features must be absent. Such data limit the increase in efficiency of stratification for detailed surveys. Although classification procedures can be effective, large reductions in standard error relative to random samples should not always be expected, especially for individual landscape features. Further work, perhaps combining probability of occurrence with quantity, might produce more efficient estimates (Pennington, 1983). However, the current coefficients of variation are acceptable because many other factors in the rural environment, e.g. owner attitude, are either difficult to quantify or are more variable.

There are two general strategic approaches to the measurement of change: (i) to measure the same areas on successive occasions, and (ii) to carry out separate random samples and detect change from differences between the population means. In the former procedure, any changes detected are known to have taken place, and the error terms attached to them will relate to the consistency in the size of the change concerned. In the second procedure, the variability within the sample populations on each occasion may be so high that valid change cannot be detected. ITE, in the Countryside Surveys, sampled the same squares in repeated surveys but also increased the sample size on successive occasions.
Different estimates for change can be produced from the surveys, depending upon the style of comparison (direct or population). When the population estimates are used (256 squares in 1978 and 384 squares in 1984), the standard error of the change depends on the correlation between the land cover values in the two surveys. A positive correlation means that squares are more likely to have similar values in the two surveys. A
comparison of revisited squares (sample size 256) will give a more precise estimate of change when the correlation between the values is high (>0·85). In that case, it is best to omit the squares visited only on the second occasion. An example of the different estimates is the change in cover of wheat between 1978 and 1984. The two estimates of change differ depending upon the number of samples used from the 1984 survey. If only the squares visited on both occasions (eight per land class) are used, the estimated change is an increase of 732 000 ha, with a coefficient of variation of 27·5%. However, if all the information from the 1984 survey is used, the increase is estimated to be 886 000 ha, with two estimates for the coefficients of variation. If the samples are handled as two distinct populations, the standard error is 27·8%, compared with 21·8% if the covariance for repeated samples is included. The relationship between the difference in size of the coefficients of variation depends on the size of the change. The 1 km squares which were surveyed in 1978 were revisited in 1984, except for three squares where permission to survey was refused. These data provide the most accurate assessment of change in land use between the dates because the same measurements were made at the same locations in both years. Had the aim been only to minimise the error terms for the estimate at any one time, partial replacement theory suggests that it might have been better only to sample a subset of the same squares each time. Partial replacement is widely used in tree growth surveys (Ware and Cunia, 1962). However, it requires an estimate of the regression relationship between the cover of a land use parameter on the first (x) and second (y) survey dates. The variance of the estimator of change will be smaller when there is limited or no change because the correlation between the two dates is high. 
To use partial replacement in a stratified scheme, either the relationship must be estimated for each stratum, or a single relationship must be assumed which describes the rate of change in the area of that particular land use in every stratum. It is not therefore advisable to use partial replacement sampling for estimating land use change. The 50% increase in sample size in all strata from eight in 1978 to 12 in 1984 is expected, on average, to yield an 18% reduction in standard error.
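The error calculations discussed in this section can be illustrated with a short Python sketch. It assumes the textbook stratified random sampling estimator of the kind given by Cochran (1977), ignores the finite population correction, and uses made-up stratum sizes and values. The function names are illustrative only and are not taken from any ITE software.

```python
import math
from statistics import mean, variance


def stratified_mean_and_se(stratum_sizes, samples):
    """Stratified estimate of the population mean and its standard error.

    stratum_sizes: number of 1 km squares in each stratum (N_h).
    samples: one list of sampled values per stratum.
    The finite population correction is ignored (an assumption).
    """
    total = sum(stratum_sizes)
    est = 0.0
    var = 0.0
    for size, values in zip(stratum_sizes, samples):
        weight = size / total  # stratum weight W_h = N_h / N
        est += weight * mean(values)
        var += weight ** 2 * variance(values) / len(values)
    return est, math.sqrt(var)


def se_of_change(se1, se2, correlation=0.0):
    """Standard error of the difference between two survey estimates.

    correlation = 0 treats the two surveys as independent samples; a
    positive value represents revisited squares and lowers the result.
    """
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2.0 * correlation * se1 * se2)


def se_ratio(n_old, n_new):
    """Ratio of standard errors when the sample size per stratum changes,
    assuming the within-stratum variances stay the same."""
    return math.sqrt(n_old / n_new)


# Hypothetical data: two strata with eight sampled squares each.
sizes = [1000, 3000]
data = [[0, 0, 5, 10, 0, 20, 0, 5], [30, 40, 35, 50, 45, 40, 30, 50]]
est, se = stratified_mean_and_se(sizes, data)
print(f"mean = {est:.2f}, SE = {se:.2f}, CV = {100 * se / est:.1f}%")

# Going from eight to 12 squares per stratum.
print(f"SE reduction = {100 * (1 - se_ratio(8, 12)):.1f}%")
```

Under these assumptions the ratio sqrt(8/12) is about 0·82, which corresponds to the reduction of roughly 18% quoted in the text, and a positive correlation between surveys gives a smaller standard error of change than treating the two samples as independent.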
5. Classification of all 1 km squares in Great Britain The first requirement for a classification of all squares came during the development of a rural information system for the Planning Department of the Highland Regional Council, who identified a need for complete coverage to increase geographical definition and to provide exact figures for the number of squares in each class. This need was further emphasised when several projects (e.g. Tranter et al., 1988; Cresswell et al., 1990) required the allocation of a large number of new squares to land classes in order to utilise existing data bases outside the range of the original classified squares. If the same methods of data collection were used as for the original classification, it would take over 20 man-years simply to collect the reduced dataset to allow new squares to be keyed out! Modern automated methods of data capture were therefore used to produce slightly different, more easily recorded, attributes which would simulate the original attributes as closely as possible. The objective was to reproduce the original classification as closely as possible, because of the following factors. (a) The original classification had been used in many studies for over 13 years, representing a range of applications; continuity was therefore important. In
Table 3. Cross-classification table showing how the original 1212 squares were re-classified at the 8 group level using the logistic discrimination procedure

Initial classification                    Final classification
(land classes)            1      2      3      4      5      6      7      8
1 (1–4)                 200     23      7      5      1      0      0      0
2 (5–8)                  33    119      0      9      2      0      0      4
3 (9–12)                  6      0    133      6      3      0      7      0
4 (13–16)                 4     11     14     62      5      0      7      2
5 (17–20)                 0      3      0      4     99     12     10      2
6 (21–24)                 0      0      0      0      5    144      5      2
7 (25–28)                 0      0      9      5      5      5    119      8
8 (29–32)                 0      1      0      3      1      1      5    101
addition, the experience gained from the existing classification showed that it expressed the required level of detail and that users had gained confidence in its application. (b) Extensive databases had been built up based on the original classification. (c) The data from the successive field surveys were based on the original land classes and the relative distribution of these sample squares between land classes therefore needed to be retained as far as possible. (d) An independent exercise had shown that an expanded version of the original classification was more efficient than a new, unsupervised classification (Brandon et al., 1989). (e) The initial classification produced the dissection of GB on the basis of manually recorded variables measured in detail in individual squares; a new, independent classification based on the simpler automated data set would be unlikely to make such useful divisions as were achieved in the original classification.
The data set eventually contained information on coastal features, altitude, climate, geology, drift and other features such as villages and roads (Bunce et al., in press). Although it was modelled as closely as possible on the key parameters identified in the original study, it differed in the level of detail for some items. These differences became apparent when a variety of classification techniques were applied to the original series of 1212 squares. The differences in the data, although small, accumulated in effect through the five levels of the classification. A high proportion of squares changed from their class in the initial classification although most moved to “nearby” classes in the classification hierarchy, as shown in Table 3. A number of multivariate approaches were applied to the data set but most did not satisfy the above criteria. Various discrimination techniques were tried using the 1212 squares as the training data set but either there was a low correspondence with the original or, in most cases, there were serious misclassifications. The most effective method of allocating new squares to the existing classification was found to be a combination of two techniques. First, logistic discrimination (LGD) was used to allocate each square to one of eight aggregations of land classes at the third level of the initial hierarchy, namely, in terms of the 32 final groups, groups 1–4, 5–8, up to group 29–32. This procedure allocated 81% of the 1212 original squares to the correct group. Then 4-group linear discriminant functions (LDF), derived separately
for each of these eight groups, were used to further allocate each square to one of the 32 groups. Though this combined procedure only allocated 62% of the original 1212 squares to the correct land class, most misclassifications were to “nearby”, and hence relatively similar, classes in the hierarchy of the ISA classification.

The rationale for using LGD is that the multivariate normality and equal within-class variability assumptions of ordinary linear discriminant functions are far from satisfied for the ITE land classification attributes. In particular, many of the attributes measured have all or mostly zero values in many of the land classes, with a few very variable non-zero values, leading to highly skewed statistical distributions. An attribute’s values are much less variable (namely zero variance) in land classes where it does not occur than in those where it is most widespread. If the standard deviation within groups tends to increase with the group mean, then transformation to logarithms can be used to equalise the within-group variability. Such transformations will also help reduce the skewness in the distributions, preventing occasional very high values from over-influencing the estimate of the average within-group variability. In trials with LDF discriminant analysis, log transformations did give some minor improvements in predictive ability but they did not cope well with the problem of zeros. The ITE attributes included the presence/absence of six coastal features and whether or not the land was on an island or the mainland. The underlying geology of each 1 km square was also classified into one of 11 non-ordered categories. Such binary variables severely violate all the LDF assumptions.

Logistic discrimination is best described by the case of two groups.
In multiple linear regression, if the dependent Y variate is binary or binomial, then it is better, and now general practice, to transform Y to its logit z, where:

$$\operatorname{logit}(Y) = z = \log\left(\frac{Y}{1-Y}\right)$$

and fit the generalised linear model:

$$\operatorname{logit}(Y) = z = b_0 + b_1 X_1 + \dots + b_p X_p$$

by maximum likelihood, assuming a binomial distribution for Y. This is called logistic regression. It is generally superior to ordinary least squares regression on binary Y when the X variables are binary, categorical and/or quantitative but non-normal, or a mixture of types. The logit scale more closely resembles the scale on which the effect of variates is more likely to be linear and independent of other variables. The important point is that the X variables are not assumed to be normally distributed.

Two-group LGD can be analysed by logistic regression techniques (treating Y=1 for cases in group 1, Y=0 for group 2). Given a new observation X = (X_1, . . . , X_p), the model implies that, if P_k = Prob(observation X from group k),

$$\log\left(\frac{P_1}{P_2}\right) = \log\left(\frac{Y}{1-Y}\right) = z = b_0 + b_1 X_1 + \dots + b_p X_p$$

and hence

$$P_1 = \frac{\exp(z)}{1+\exp(z)}, \qquad P_2 = \frac{1}{1+\exp(z)}$$

The general consensus from two-group comparisons and simulations is that logistic
discrimination is preferable to LDF for non-normal variates and unequal variability (Krzanowski, 1988). The above arguments have been extended to G>2 groups. For group k, k=1, . . . , G−1, we have:

$$\log\left(\frac{P_k}{P_G}\right) = z_k = b_{0k} + b_{1k} X_1 + \dots + b_{pk} X_p$$

and hence

$$P_k = P_G \exp(z_k)$$

where

$$P_G = \frac{1}{1 + \sum_{k=1}^{G-1} \exp(z_k)}$$
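As an illustration of the formulae above, the following Python sketch converts the linear scores z_k into group probabilities, taking group G as the reference group. The coefficient values and function names are hypothetical and chosen only for illustration; fitting the coefficients by maximum likelihood is not shown, and allocating an observation to its most probable group is assumed here as the classification rule.

```python
import math


def group_probabilities(x, coefs):
    """Return [P_1, ..., P_G] for one observation.

    x: attribute values X_1..X_p.
    coefs: one list [b_0k, b_1k, ..., b_pk] for each non-reference
           group k = 1..G-1; group G is the reference (z_G = 0).
    """
    z = [b[0] + sum(bi * xi for bi, xi in zip(b[1:], x)) for b in coefs]
    # Subtract the largest score (including the reference score 0)
    # before exponentiating to avoid overflow; ratios are unchanged.
    shift = max(z + [0.0])
    exp_z = [math.exp(zk - shift) for zk in z]
    ref = math.exp(-shift)
    denom = ref + sum(exp_z)
    return [e / denom for e in exp_z] + [ref / denom]


def allocate(x, coefs):
    """Allocate the observation to the most probable group (numbered from 1)."""
    probs = group_probabilities(x, coefs)
    return probs.index(max(probs)) + 1


# Hypothetical coefficients for G = 3 groups and p = 2 attributes.
coefs = [[0.5, 1.0, -2.0],   # group 1 versus group 3
         [-1.0, 0.2, 0.8]]   # group 2 versus group 3
probs = group_probabilities([1.0, 0.0], coefs)
print(probs, allocate([1.0, 0.0], coefs))
```

With a single non-reference group the function reduces to the two-group case, P_1 = exp(z)/(1+exp(z)).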
Because the p(G−1) parameters (b_ik) must be estimated iteratively, and because there is no standard program available for multi-group LGD, it has not often been used. However, the CATMOD procedure in the SAS statistical package can be used to estimate the parameters (b_ik), although it is computer intensive, from which the P_k can be estimated and each square classified. At the eight super-group level (1–4, 5–8, . . . , 29–32), LGD classified 81% of the original 1212 squares correctly, compared to 78% using LDFs with all the variates and added transformations of variables.

The second step in the classification allocates each square in a super-group to one of the four classes involved (e.g. super-group 5–8 to class 5, 6, 7 or 8). At this step, it was decided to revert to the usual LDF approach to discrimination, because, with fewer observations per group yet just as many variables, the algorithm for logistic discrimination had difficulties converging in its estimation of the many non-linear parameters. In addition, the most severe dichotomies and LDF assumption violations in the data were removed by the first-step classification into the eight super-classes. The differences between the four land classes in each super-class were much less distinct. LDF analysis was used iteratively to select only those variables which contributed significantly to the four-group discriminations.

All squares in GB, including the 1212 squares used to form the original classification, were then classified by this two-stage procedure to give a complete classification of the whole population. Some of the 1212 squares changed classes, but usually to an environmentally similar “nearby” class, and it was judged that all squares should be classified by exactly the same procedure. Several tests were carried out to compare this revised classification with the original classes. There was a reduction in geographical range of most classes over the original.
This intensification of the spatial integrity of the land classes was reflected in lower within-class standard deviations for location variables, such as distance to south and west coast. For the field survey’s data, the revised classification changed the stratified sampling scheme from eight squares for every land class to a more nearly proportional sampling scheme with more sample squares in the larger land classes. In most cases, this led to lower error terms for GB population estimates of land features and land use. Though re-stratification of sample survey data is best done when the initial sampling is proportional to stratum sizes, and hence overall spread likely to approximate to random, it is reasonable in our case to re-classify 1 km squares because the population
[Figure 2 diagram. Land classes, in the order shown: 2, 3, 4, 9, 11, 12, 14, 25, 26, 1, 5, 6, 7, 8, 10, 13, 15, 16, 27, 17, 18, 19, 20, 28, 31, 21, 22, 23, 24, 29, 30, 32. Aggregation levels: Division 2, Division 3 and Division 4. Group labels: Arable, Pastural, Marginal upland, Upland.]
Figure 2. Aggregations of land class at three levels determined by the ecological characteristics of the classes.
is a continuum of environmental variation and the land classes are not distinct but overlap environmentally (as described in Jongman et al., 1987). 6. Aggregation of classes Whilst the 32 classes have proved the most useful level for field survey at the national scale, groups of classes are often required for interpretive purposes, e.g. for policy issues (Barr et al., 1993). Initially, the possibility of linking classes by the underlying environmental data was considered, but this procedure did not reflect the association with land cover adequately. Whilst a variety of different groupings may be required for contrasting objectives, each of them would require different combinations. An overall aggregation has been made as shown in Figure 2. The division into four groups was made on dominant land cover and the links between them according to experience, although the overall ranking was determined by the first axis of Principal Component Analyses of the land cover data recorded in 1978, as described in Section 2 above. These aggregations reflect the ecological characteristics of the classes and the most widely used relationships between the classes. There is no reason, however, why they could not be recombined for other objectives. For an overall picture of the principal patterns of variation within the British environment, it is useful to present a summary of the separation into the four major groups, as shown in Figure 3 and whose characteristics are summarised in Tables 4–7. The names given to the four landscape types are a necessary simplification and do not reflect the full variation that occurs in the aggregated land classes. Thus, the arable landscape type is composed of land classes dominated by arable land, but does not contain all the arable land in GB. Further, the same aggregated class does contain some pastural land and other land cover types which are not arable. 
However, results from CS1990 by landscape type provide a convenient summary of information for “agroecological zones” within the country. The arable landscapes (34% of GB) are concentrated in East Anglia and the east
Figure 3. Spatial distribution of land classes from different landscape types as examples. The landscape types represent aggregations of land classes and show (a) arable, (b) pastural, (c) marginal upland and (d) upland regions.
Table 4. Relative distribution of mapped elements amongst the four landscape types (% of mapped element in each landscape type)

Mapped element          Arable   Pastural   Marginal   Upland
Water—sea and tidal        9        34         16        41
Water—inland              18        12         14        56
Woodland                  29        18         26        28
Built up—towns            51        46          3         +
Built up—villages         48        40          8         5
Motorways                 40        57          3         +
A-roads                   44        41          9         6
B-roads                   45        38         11         6
Minor roads               45        41         11         3
Canals                    43        53          3         +
Railways                  41        49          6         4
Rivers                    27        29         18        26
Open countryside          34        29         16        21
Table 5. The average and maximum altitude (m) for the different landscape types. The figures are based on the mean altitude per 1 km square drawn from a 100-point matrix based over each square

Landscape type      Mean altitude (m)   Maximum altitude (m)
Arable                     76                  280
Pastural                   87                  340
Marginal upland           244                  985
Upland                    313                 1225
Table 6. Climate in the landscape types, describing the average hours of bright sunshine per day in July, the mean minimum temperature (in °C) in January and the average number of days with snow falling in each year (source: 1941–70 Air Ministry data)

Landscape type      Sun (hrs)   Temp (°C)   Snow (days)
Arable                 5·8         0·7         26·0
Pastural               5·7         1·4         22·1
Marginal upland        4·9         0·9         36·9
Upland                 4·4         0·3         48·0
Midlands but also in the central valley and eastern lowlands of Scotland. They are present but less widespread in north-eastern England, the Midlands and south-east Scotland. They are at low altitude, having low winter temperatures, high sunshine hours and below-average snow lie. The geology is dominated by calcareous rocks, clays and other sedimentary types. Characteristic map features include built-up areas and main roads. The land is dominated by cereals and other arable crops, as well as intensively managed grassland.
Table 7. Geological characteristics of the four landscape types (% of the squares in each landscape type in which each rock type is dominant)

Rock formation                                                   Arable   Pastural   Marginal upland   Upland
Quaternary, Tertiary and Cretaceous clays                           8        2              0             0
Oolitic and friable limestones                                      9        3              3             +
Mesozoic mudstones and Lias                                        18       28              2             1
Jurassic clay                                                       4        2              0             0
Cretaceous clay                                                    13        6              0             0
Devonian sandstones                                                 7       13             14             4
Chalk                                                              23        3              +             0
Massive limestones                                                  4        5              8             4
Carboniferous and non-calcareous shales, grits and sandstones       5       19             14             4
Basic and intermediate igneous rock and basic metamorphic           2        2              8             9
Acid igneous and metamorphic rock                                   4        3             21            60
Silurian and Ordovician                                             3       11             26             8
Metamorphic slates and phyllite                                     +        +              1             4
Metamorphic limestones                                              +        +              +             1
Cambrian grits and sandstones                                       +        2              2             5
The pastural landscapes (20% of GB) are distributed widely in south-west England, west Wales, the west Midlands and north-west England, and also in north-east England and scattered through the lowlands of Scotland and coastal areas throughout GB. They are at low altitude, having moderate winter temperatures, high sunshine hours and little snow lie. The geology is variable but is dominated by sedimentary and metamorphic rocks. All map features occur widely in this type, but especially coastal features, built-up areas and main roads. The land cover is dominated by managed grasslands.

The marginal upland landscapes (16% of GB) occur on the fringes of the uplands in all areas of north and west Britain, especially in Wales, at medium altitude, having low winter temperatures, medium sunshine hours and average snow lie. The geology is dominated by metamorphic rocks, with some igneous rocks present. Characteristic map features include minor roads and woodlands. The land cover is diverse with mixtures of low-intensity agriculture, forestry, semi-natural vegetation and limited areas of crops.

The upland landscapes are mainly in Scotland and northern England, at high altitude, having very low winter temperatures, low sunshine hours and above-average snow lie. The geology is dominated by igneous and metamorphic rocks. Characteristic map features are inland water, woodland and open countryside, with few buildings or roads. The land is usually unsuitable for mechanised farming and is mostly used for sheep farming, with semi-natural vegetation as its main land cover.

Working descriptions of the 32 individual classes can be obtained from the senior author.

7. Discussion
One of the central objectives of the classification procedure is to enable representative samples to be drawn from a defined population. Many environmental studies require,
and would greatly benefit from, such a procedure but it is still surprising how rarely this is carried out. With the limited resources available for ecological work, it becomes even more important to ensure that samples are derived objectively. The argument here is that if few case studies are possible, the risk cannot be taken of an objective sample being aberrant. However, an appropriate framework can be developed to eliminate sites which deviate from a series of defined characteristics. Even a single site can be selected as being at the centre of a defined scatter of points either along an axis or in multi-dimensional space. The number of samples taken can be determined either statistically from an initial sample or, more usually, according to the resources available. Provided a minimum number of samples is taken, increasing the sample size affects mean values for the majority of variables to a progressively smaller degree, although it reduces the error term. Whilst sampling at low intensities has been shown to be adequate for estimating national figures for woodlands and crops (Bunce et al., 1984), it may be inadequate for providing accurate definitions of spatial distribution patterns. This problem was overcome, as described above, by using relatively simple data for the complete classification. However, the degree to which the data for the classification can be simplified does depend upon the objectives of the given study. It is therefore necessary, if time permits, to build in tests within the programme to ensure that the data base is sufficiently flexible to screen alternative classifications. Modern methods of data capture through Geographical Information Systems have greatly enhanced the feasibility of such procedures. It is, however, necessary to examine through interpolation the balance of such data sets as there can be a bias towards variables that are easily recorded and yet merely reflect the same basic information, e.g. 
temperature measures in climate. People who are familiar with multivariate procedures are used to constructing working classifications for individual studies. In many branches of environmental science, however, e.g. geology and agriculture, immutable classifications are constructed which stand for long periods of time. Whilst the objectives of the two extremes are different, there is little doubt that increased flexibility in accepting short-term classifications would be useful. Most of the present paper concerns strategic survey, but site selection often involves intensive studies of ecosystems for which process models will subsequently be developed. Sites for such studies are usually selected in isolation using criteria of convenience only, but there are major advantages in incorporating them as representatives of known strata so that they can be related to the entire population. In particular, individual parameters identified as critical within a system can then be examined at a broad scale, as their relationships are understood. Bunce (1986, 1988) described such a procedure for examining the nutritional consequences of forestry and pollution studies, respectively. Another type of modelling constructs the relationship between the classes as a basis either for predicting a combination of features not specified in the initial classification, e.g. timber production, or for indicating outcomes of different scenarios, e.g. economic changes in the Common Agricultural Policy. These various types of approach involve techniques such as linear programming, Markov models, or simply individual expertise as described by Bunce et al. (1984). The potential for extending such approaches using the classification is demonstrated by Harvey et al. (1986), and their value in formalising relationships could be greatly increased in order to present the consequences of different policies. 
The structure of the classification enables the co-ordination of many data sets that previously, of necessity, stood alone. As planning departments need access to such
information on the rural environment in order to determine appropriate policies for the countryside, the transfer of such data has become a priority. The development of computer-held information systems on the one hand and decision support systems on the other can utilise such a structure. Work is currently in progress at ITE to develop such a module for GB using Microsoft Windows, with potential for other regions. Finally, the recent development of landscape ecology has demonstrated the increasing realisation that, with the modern pressures on the countryside, it is essential to take an integrated view of the environment. The system described in the present paper provides an objective framework for such integration, incorporating not only quantitative environmental data, but also the possibility of formalising more intuitive aspects of the landscape, such as its visual appeal.
References
Barr, C. J., Benefield, C. B., Bunce, R. G. H., Ridsdale, H. and Whittaker, M. (1986). Landscape changes in Britain. Institute of Terrestrial Ecology.
Barr, C. J., Bunce, R. G. H., Clarke, R. T., Fuller, R. M., Furse, M. T., Gillespie, M. K., Groom, G. B., Hallam, C. J., Hornung, M., Howard, D. C. and Ness, M. J. (1993). Countryside Survey 1990: main report. Countryside 1990, vol. 2. London: Department of the Environment.
Beckett, P. H. T. (1971). The cost-effectiveness of soil survey. Outlook on Agriculture 6, 191–198.
Brandon, O., Voyle, A., Dias, W., Bisset, T., Short, C., Bunce, R. G. H., Barr, C. J., Howard, D. C., Jones, M., Evans, S. and Buckland, S. (1989). Environmental issues and agricultural land use options. Department of the Environment, Ministry of Agriculture, Fisheries and Food, Countryside Commission and Nature Conservancy Council, Aberdeen Centre for Land Use.
Brandt, J. and Agger, P. (eds) (1984). Methodology in landscape ecological research and planning. Proceedings of 1st International Seminar of the International Association of Landscape Ecology, Vol. 1–5, Roskilde University.
Bunce, R. G. H. (1965). The ecology of rock ledge vegetation in Snowdonia. PhD thesis.
Bunce, R. G. H. (1982). A field key for classifying British woodland vegetation. Part I. Cambridge: Institute of Terrestrial Ecology.
Bunce, R. G. H. (1984). The use of simple data in the production of strategic sampling systems. In 1st International Seminar of the International Association of Landscape Ecology. Methodology in landscape ecological research and planning Conference 4, Roskilde University Centre.
Bunce, R. G. H. (1986). The Cumbrian environment—an overview. In Pollution in Cumbria (P. Ineson, ed.), pp. 38–41. Abbots Ripton: Institute of Terrestrial Ecology.
Bunce, R. G. H. (1988). The application of the site classification to the nutritional consequences of forestry in Britain. In Predicting consequences of intensive forest harvesting on long-term productivity by site classification (T. H. Williams and C. A. Gresham, eds), pp. 93–102 (IEA/BE project A3, report no. 6). Georgetown: Baruch Forest Science Institute of Clemson University.
Bunce, R. G. H., Barr, C. J., Clarke, R. T., Howard, D. C. and Lane, A. M. J. (in press). ITE Merlewood Land Classification of Great Britain. Journal of Biogeography.
Bunce, R. G. H., Claridge, C. J., Barr, C. J. and Baldwin, M. B. (1986). An ecological classification of land—its application to planning in the Highland Region, Scotland. In Land and its uses—actual and potential: an environmental appraisal (F. T. Last, M. C. B. Holtz and B. G. Bell, eds), pp. 407–426. London: Plenum.
Bunce, R. G. H. and Heal, O. W. (1984). Landscape evaluation and the impact of changing land-use on the rural environment: the problem and an approach. In Planning and ecology (R. D. Roberts and T. M. Roberts, eds), pp. 164–188. London: Chapman and Hall.
Bunce, R. G. H., Morrell, S. K. and Stel, H. E. (1975). The application of multivariate analysis to regional survey. Journal of Environmental Management 3, 151–165.
Bunce, R. G. H. and Smith, R. S. (1978). An ecological survey of Cumbria. Cumbria County Council & Lake District Special Planning Board.
Cochran, W. G. (1977). Sampling techniques, 3rd edition. New York: Wiley.
Cresswell, P., Harris, S., Bunce, R. G. H. and Jeffries, D. J. (1990). The badger (Meles meles) in Britain: present status and future population changes. In Mammals past, present and future (S. Harris, ed.), London: Linnaean Society.
Dale, M. B. (1988). Knowing when to stop: cluster concept–concept cluster. Coenoses 3, 11–32.
Dartington Institute (1986). The potential for forestry on the Culm measures farms of south-west England. Dartington: Dartington Institute.
Elena-Rossello, E., Bunce, R. G. H. and Barr, C. J. (1984). A study of the effects of the changes in data structure on a preliminary land classification of the Iberian peninsula. Merlewood Research and Development Paper 98, Institute of Terrestrial Ecology.
Elena-Rossello, R. (1989). Biogeoclimatic land classification and mapping in Spain. In GIS in ecology. Merlewood Research and Development Paper 114, Institute of Terrestrial Ecology.
Ellenberg, H. (1978). Vegetation Mitteleuropas mit den Alpen in ökologischer Sicht. Stuttgart: Ulmer.
Everitt, B. S. (1980). Cluster Analysis. London: Heinemann.
Fourt, D. F., Donald, D. G. M., Jeffers, J. N. R. and Binns, B. O. (1971). Corsican Pine (Pinus nigra var. maritima (Ait) Melville) in Southern Britain. A study of growth and site factors. Forestry 44, 189–207.
Furse, M. T., Wright, J. F., Gunn, R. J. M., Clarke, R. T., Johnson, H. A., Blackburn, J. H., Armitage, P. D. and Moss, D. (1990). The implication of land use change on aquatic communities—an exploratory study. Report to the Department of the Environment. Wareham: Institute of Freshwater Ecology.
Greig-Smith, P. (1964). Quantitative plant ecology. London: Butterworth.
Harvey, D. R., Barr, C. J., Bell, M., Bunce, R. G. H., Edwards, D., Errington, A. J., Jollans, J. L., McLintock, J. H., Thompson, A. M. M. and Tranter, R. B. (1986). Countryside implications for England and Wales of possible changes in the Common Agricultural Policy. Reading: Centre for Agricultural Strategy.
Highland Regional Council (1984). HR/ITE land classification system. Planning department information paper 5. Inverness: Highland Regional Council.
Hill, M. O. (1973). Reciprocal averaging: an eigenvector method of ordination. Journal of Ecology 61, 237–249.
Hill, M. O. (1979a). TWINSPAN—A FORTRAN program for arranging multivariate data in an ordered two-way table by classification of the individuals and attributes. Ithaca, New York: Cornell University.
Hill, M. O. (1979b). DECORANA—A FORTRAN program for detrended correspondence analysis and reciprocal averaging. Ithaca, New York: Cornell University.
Hill, M. O., Bunce, R. G. H. and Shaw, M. W. (1975). Indicator species analysis, a divisive polythetic method of classification, and its application to a survey of native pinewoods in Scotland. Journal of Ecology 63, 597–613.
Hills, G. A. (1974). A philosophical approach to landscape planning. Landscape Planning 1, 339–371.
Horrill, A. D., Lowe, V. P. and Howson, G. H. (1988). Chernobyl fallout in Great Britain. Report to the Department of the Environment DoE/RW/88.101. Institute of Terrestrial Ecology.
Howard, P. J. A. (1977). Numerical classification and cluster analysis in ecology: a review. Merlewood Research and Development Paper 77. Institute of Terrestrial Ecology Hunting Technical Surveys.
Howard, P. J. A. (1986). Monitoring landscape change. Report to the Department of the Environment. Borehamwood: Hunting Technical Surveys.
Huntings Surveys and Consultants Ltd. (1986). Monitoring landscape change. Borehamwood: Huntings.
Ivimey-Cook, R. R. and Proctor, M. C. F. (1966). Association analysis and phytosociology. Journal of Ecology 54, 179–192.
Jones, H. E. and Bunce, R. G. H. (1985). A preliminary classification of the climate of Europe from temperature and precipitation records. Journal of Environmental Management 20, 17–29.
Jongman, R. G. H., ter Braak, C. J. F. and van Tongeren, O. F. R. (1987). Data analysis in community and landscape ecology. Wageningen: Pudoc.
Kent, M. and Ballard, J. (1988). Trends and problems in the application of classification and ordination methods in plant ecology. Vegetatio 78, 109–124.
Kish, L. (1965). Survey sampling. New York: Wiley.
Krzanowski, W. J. (1988). Principles of multivariate analysis: a user’s perspective. Oxford: Oxford University Press.
Lambert, J. M. (1972). Theoretical models for large scale vegetation survey. In Mathematical models in ecology (J. N. R. Jeffers, ed.), pp. 78–109. Oxford: Blackwell Scientific.
Liddle, M. J. (1976). An approach to objective collection and analysis of data for comparison of landscape character. Regional Studies 10, 173–181.
MAFF (1966). Agricultural land classification. Agricultural Land Service Technical Report 11. London: Ministry of Agriculture, Fisheries and Food.
Moss, D. (1985). An initial classification of 10-km squares in Great Britain from a land characteristic data bank. Applied Geography 5, 131–150.
Neyman, J. (1934). On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97, 558–606.
O’Connor, R. J. (1985). Long term monitoring of British bird populations. Ornis Fenn. 62, 73–79.
Pennington, M. (1983). Efficient estimates of abundance for fish and plankton survey. Biometrics 39, 281–286.
Podani, J. (1989). Comparison of ordinations and classifications of vegetation data. Vegetatio 83, 111–128.
Robertson, P. A. (1991). Estimating the nesting success and productivity of British pheasants, Phasianus colchicus from nest record schemes. Bird Study (England) 38, 73–79.
Robinson, G., Laurie, I. E., Wager, J. F. and Traill, A. L. (1976). Landscape evaluation. Report to Countryside Commission for England and Wales Manchester Centre for Urban and Regional Research, University of Manchester.
Sæbø, H. V. (1983). Land use and environmental statistics obtained by point sampling. Bulletin of the International Statistical Institute 50, 1317–1341.
Tandy, C. (1967). The isovist method of landscape survey. In Methods of landscape analysis. Landscape Research Group.
Tranter, R. B., Jones, P. J. and Miller, F. A. (1988). Some socio-economic characteristics of farmers and their farm businesses in GB by ecological zones as categorized by the ITE Land Classification System. Reading: Centre for Agricultural Strategy.
Ware, K. D. and Cunia, T. (1962). Continuous forest inventory with partial replacement of samples. Forest Science Monographs.
Wright, J. F., Armitage, P. D., Furse, M. T. and Moss, D. (1989). Prediction of invertebrate communities using stream measurements. Regulated Rivers: Research and Management 4, 147–155.