Agriculture, Ecosystems and Environment 104 (2004) 535–544
Soil quality monitoring in New Zealand: development of an interpretative framework
Linda Lilburne a,∗, Graham Sparling b, Louis Schipper b
a Landcare Research, P.O. Box 69, Lincoln 8152, Canterbury, New Zealand
b Landcare Research, Hamilton, New Zealand
Received 8 April 2003; received in revised form 30 December 2003; accepted 6 January 2004
Abstract Schemes to monitor soil quality must be associated with quality criteria that allow for an objective assessment of what the measured values signify in relation to soil quality. The derivation of an interpretative framework for a broad-scale New Zealand monitoring scheme is described. The basis for the framework is a set of interpreted response curves that were developed by soil quality experts in a workshop process. The curves combine both production and environmental goals, and are specific to particular combinations of land use and soil type. Appropriate target ranges for each soil indicator are derived from these curves. Techniques for aggregating and presenting the results of a comparison of sampled data with the target ranges are discussed. There are advantages in assessing quality by both individual soil property, and by grouping properties into one or more indices. This interpretative framework (target ranges and aggregating techniques) can be applied to the assessment of a single sample, or to summarize the overall quality in a region or country. We anticipate that the target ranges will become better defined as more data become available; at present they are better suited for assessing soil quality at a broad regional scale than for specific on-farm assessment. © 2004 Elsevier B.V. All rights reserved. Keywords: Soil quality indicators; Target ranges; Environmental reporting; Data aggregation; New Zealand
∗ Corresponding author. Tel.: +64-3-325-6700x3828; fax: +64-3-325-2418. E-mail address: [email protected] (L. Lilburne).
0167-8809/$ – see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.agee.2004.01.020
1. Introduction A scheme to monitor soil quality operated in New Zealand from 1995 to 2001. This scheme is described in the accompanying paper (Sparling et al., 2004) and is referred to here as the “500 Soils Project”. The scheme collected data on a range of soil properties from over 500 sites throughout New Zealand for environmental reporting at local and national scales. Assessment of the soil quality status of these samples requires an interpretative framework. In particular, a target range (the range of values considered to be acceptable for productivity or environmental goals) needs to be defined for each soil quality indicator (Arshad and Martin, 2002; Lilburne et al., 2002). Defining justifiable target values has been one of the most contentious areas of soil quality assessment, particularly in the absence of predictable production responses or defined ecological consequences (Sojka and Upchurch, 1999; Sparling et al., 2003). Nonetheless, there seems little point in proposing any soil property as a soil quality measure if we cannot provide for its interpretation. Pierce and Larson (1993) defined soil quality succinctly as “fitness for use”. This is a useful definition where the overall soil quality objective is to ensure
that soil characteristics are well matched to a particular use. Target ranges that support fitness for use need to be sufficiently flexible to cope with natural differences between soils and different land uses, and should be able to meet environmental as well as production goals. In some cases production and environmental goals may be in conflict. Target ranges can be defined from experimental data, statistical metrics, and simulation modeling (Sparling et al., 2003):
1. Experimental data from agronomic studies on a range of soils and crops have defined the soil nutrient concentrations required to attain target yields and nutrient content (e.g. Cornforth and Sinclair, 1984; Clarke et al., 1986; Roberts and Morton, 1999). Crop response to soil fertility and acidity levels has also been well studied. Considerably less experimental information is available on the response to soil physical condition and organic matter, and even fewer data have been collected on environmental responses to soil quality.
2. In the statistical approach, a distribution metric (e.g. lower quartile) of a database of each indicator is selected to represent a minimum acceptable limit or a level of concern (Halvorson et al., 1996). The disadvantages of this approach are: (1) the selected target value is based on an arbitrary metric parameter and not on a relationship of the indicator to actual production or environmental impact, (2) the significance of the target value is dependent on the representativeness of the soils in the database (and their history of use), (3) sufficient data are available for only a few indicators, and (4) it assumes the samples are evenly distributed across the range of quality. If the quartile limits are used to set upper and lower ranges, then half the samples will fall outside the target range, irrespective of whether they meet a quality target set by other criteria.
3. A modeling approach can be used to identify desirable and achievable target values, e.g. for soil organic matter content (Sparling et al., 2003). Using a process model, lower limits were defined by the lowest level that still allowed recovery, within 25 years, to a target value of 80% of the equilibrium carbon content under long-term pasture. While this is a promising approach, suitable simulation models are not available for most of our indicators.
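The final drawback noted for the statistical approach can be illustrated with a minimal sketch. It assumes synthetic, normally distributed indicator values (not data from the NSD or the 500 Soils Project) and a hypothetical helper, `fraction_outside_quartiles`. Whatever the level or spread of the data, about half of the samples fall outside a target range set at the quartiles.

```python
import random
from statistics import quantiles


def fraction_outside_quartiles(values):
    """Fraction of samples outside a target range set at the lower and upper quartiles."""
    lower_q, _median, upper_q = quantiles(values, n=4)
    outside = [v for v in values if v < lower_q or v > upper_q]
    return len(outside) / len(values)


random.seed(1)
# Two synthetic indicator data sets with very different levels (illustrative only).
low_level_soils = [random.gauss(2.0, 0.5) for _ in range(1000)]
high_level_soils = [random.gauss(8.0, 2.0) for _ in range(1000)]

for name, data in [("low", low_level_soils), ("high", high_level_soils)]:
    print(name, round(fraction_outside_quartiles(data), 2))
```

Both data sets report a fraction close to 0.5, which shows that a quartile-based range flags a fixed share of samples regardless of their actual quality.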
A fourth approach (used in this research) is based on expert knowledge to define target ranges. This allows information from the other three approaches (if available) to be synthesized with personal experience, anecdotal evidence, and best guesses based on an understanding of soil processes and relationships. Multiple indicators are often used to provide a more complete measure of soil quality that includes chemical, physical and biological condition. However, multiple indicators are difficult to manage and interpret. Andrews et al. (2002) suggest that a small number of carefully chosen indicators used in a simple index can provide adequate information for quality assessments. When target ranges are defined, soil quality can be further summarized regionally and nationally through “reporting-by-exception”. In this approach, the number of instances that indicators do not fall within a specified acceptable target range is reported as a single number or proportion. The report can also highlight all samples with at least one unsatisfactory indicator (or aggregated indicator), and provide summary data of the total number of acceptable measurements by region, land use, or soil type, as desired. The approach is simplistic, in that it provides a single yes/no answer as to soil quality, whereas the reality is a continuum. However, sooner or later information has to be transformed from a continuum to a binary classification in order that a decision on acceptability or otherwise can be made (Halvorson et al., 1996). This paper reports on the development of an interpretative framework for assessing soil quality in New Zealand. It describes the derivation of provisional target ranges, and discusses methods for grouping, aggregating, and presenting indicator data for both a single sample and a regional or national monitoring scheme.
2. Methods 2.1. Expert workshop 1 Two “expert workshops” were held to establish target values for an agreed set of soil properties. The approach used in the first workshop of 24 New Zealand soil scientists followed the general methodology of Smith (1990), although we convened the group at an intensive 2-day workshop rather than using
anonymous postal questionnaires. This meant there was group visual contact and interaction throughout. Gustafson et al. (1973) note that this interaction can increase overconfidence in the output. A neutral facilitator was used to maintain positive group interaction. The procedure was that once the necessary definitions, soil properties, and categories of soil and land use had been agreed among the group, each individual drafted response curves relating soil quality status to soil property value for each soil and land-use category (Fig. 1). The individual scientists were encouraged to draw non-linear curves as, in common with Andrews et al. (2002), we believed that a non-linear scoring method would be more representative of system function than a simple linear function based on the range of observed values (e.g. Liebig et al., 2001). Curves from each individual were then overlaid with those of the other scientists, discussed, and modified if individuals agreed. After the workshop, each curve was digitized, amalgamated with the rest of the group, and means and errors calculated (Sparling, Tarbotton, Landcare Research, unpublished report LC9900/118, 2000). The participating scientists were asked to complete an anonymous questionnaire at the end of the workshop to obtain their ranking of the usefulness of the exercise and their confidence in the outputs.
Fig. 1. Graph template given to each workshop expert for them to insert the units and draw a response line relating a soil indicator to soil quality (e.g. the hypothetical response curve). Axes: soil property (insert units) against soil quality rating (0–100%).
The workshop definition of soil quality was that used by the Soil Science Society of America (1995): “The capacity of a specific soil to function, within natural or managed ecosystem boundaries, to sustain plant and animal productivity, maintain or enhance water and air quality, and support human health and habitation.”
Response curves were constructed using two sets of criteria to assess the soil quality status. One set of curves was based on production considerations, the second set on environmental considerations:
Production criteria were agricultural productivity (plant dry matter, milk solids, logs for export), maximum economic yield, sustainable production, farm profitability, impact on the rural economy. These were generally considered within a short-term time frame (<5 years). Environmental criteria (including off-site impacts) were risks to air quality (including carbon sequestration), risk to water quality (surface and ground), loss of habitat, amenity, access, loss of diversity of indigenous species, invasions by weeds and pests, contaminant accumulation. These were generally considered over a longer time frame (25 years). No specific values for productivity, profitability, or other criteria were specified; panel members were allowed to define their own values. 2.2. Expert workshop 2 In the second workshop, a six-member subgroup of the original panel reviewed the conclusions of the first workshop to resolve anomalies. These mainly consisted of editing the response curves to remove extreme outlier points, smoothing the amalgamated response curves, and aggregating soil and land-use categories where the curves were similar. Four soil quality categories were defined: significant (adverse) impact; potential impact (and therefore of concern); within the target range; and above-target range. The workshop then focussed on defining boundary points or thresholds along the response curves for each soil quality category. This process helped standardize the interpretations between indicators and across the two criteria. Indicator specific terminology to describe the soil quality rankings was defined. In addition, an upper and lower limit was defined for a target range. The acceptable range in a soil property was defined as being between the threshold of significant impact and the above-target range value.
Several sources of information were used to define the category thresholds. For soil fertility properties the yield response curves were used, as these were reasonably well defined (e.g. Cornforth and Sinclair, 1984; During, 1984; Clarke et al., 1986; Roberts and Morton, 1999). Thresholds for organic resources (total C and N, mineralizable N) were obtained from interquartile ranges of long-term pasture sites, grouped by soil order, using data from New Zealand’s National Soils Database (NSD) and the 500 Soils Project (Sparling et al., 2003). Long-term pastures represent an “optimum” target range for organic matter content, the total C content of New Zealand pasture topsoils being similar to those of long-term indigenous sites (Sparling and Schipper, 2002). Soil bulk density thresholds were defined from quartile values from the NSD and 500 Soils Project, and macroporosity targets from published information on effects of soil compaction on pasture production (Drewry et al., 1999, 2000; Drewry and Paton, 2000; Singleton et al., 2000). Little published information was available for environmental criteria for most indicators in the data set, so thresholds were set according to the expert opinion of the panel. For soil fertility criteria this generally involved assuming a negative environmental impact once the plateau phase on the yield response curve had been exceeded.
2.3. Aggregating techniques 2.3.1. Combining production and environmental targets The production and environmental response curves for each combination of indicator, soil type, and land use were merged into a single response curve. Where the production and environmental response curves showed different trajectories we took the more limiting (conservative) of the two responses. This means that a “more is better” production curve combined with a “less is better” environmental curve resulted in a combined “optimal” curve (Fig. 2).
2.3.2. Combining and grouping several indicators A principal components analysis (PCA) was used to reduce the set of soil properties by identifying which soil properties best explained variability in the data set, and whether they could be grouped (Schipper and Sparling, 2000; Sparling and Schipper, 2002; Sparling et al., 2004).
2.3.3. Reporting results for large data sets: reporting-by-exception The overall proportion $P$ of measured values that fell outside the acceptable target ranges was calculated using the relationship:
$$P = \frac{e}{n} \times 100 \qquad (1)$$
where $e$ is the number of exceptions, and $n$ is the total number of all of the measured values. Another overall measure of soil quality status, $N$, is the percentage of sites with at least one indicator that falls outside the target range:
$$N = \frac{c}{n_s} \times 100 \qquad (2)$$
where $c$ is the count of sites with at least one of the seven indicator values outside the target range, and $n_s$ is the number of sampled sites.
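Eqs. (1) and (2) amount to simple counting. The following is a minimal sketch, assuming each site is a mapping from indicator name to measured value and each target range is a (low, high) pair. The function name, indicator names, target ranges and site values are all hypothetical and chosen only for illustration.

```python
def report_by_exception(sites, targets):
    """Return (P, N) as percentages, following Eqs. (1) and (2)."""
    n = 0  # total number of measured values
    e = 0  # number of values outside their target range (exceptions)
    c = 0  # number of sites with at least one exception
    for site in sites:
        site_exceptions = 0
        for indicator, value in site.items():
            low, high = targets[indicator]
            n += 1
            if not (low <= value <= high):
                site_exceptions += 1
        e += site_exceptions
        if site_exceptions > 0:
            c += 1
    return 100 * e / n, 100 * c / len(sites)


# Hypothetical target ranges and measurements (not the published values).
targets = {"pH": (5.0, 6.6), "olsen_p": (20, 50)}
sites = [
    {"pH": 5.8, "olsen_p": 35},
    {"pH": 4.6, "olsen_p": 60},
    {"pH": 6.0, "olsen_p": 12},
    {"pH": 5.5, "olsen_p": 30},
]
P, N = report_by_exception(sites, targets)
print(P, N)
```

In this example three of the eight measurements are exceptions and two of the four sites have at least one exception, so the two measures can differ considerably for the same data.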
Fig. 2. Production (left graph) and environmental (middle graph) curves were combined using a minimum function. This can result in an optimal curve (right graph).
3. Results and discussion 3.1. Workshop process The group consensus process worked well with the group of soil scientists at the first workshop, and a
large degree of consensus was reached within a comparatively short time. The process and approach appear suitable where broad, generalized soil monitoring criteria are being sought. The scientists assigned only a modest level of reliability to the technical information. Part of the concern arises from the fact that there were comparatively few people with the broad knowledge required to construct response curves for a wide range of soil properties on the various land uses and soil orders. These concerns could be met by having a larger group with a wider skills base. However, the present group felt that precision could be improved by rerunning the exercise with the same group of scientists, provided they were given adequate time to consult data and refine their estimates. There was also a marked learning curve as individuals became familiar with the process and the levels of precision required. Feedback after the workshop indicated that, in general, the participants found it to be very constructive. However, the scientists were concerned about how the information would be used once it was released to a wider audience and whether it would be interpreted correctly. There is no means to control how individuals will use public information once it has been released into a wider domain, but our preference is to make the information widely available for comment and feedback. Consequently, our target values and ranges should be regarded as provisional and are likely to be revised as further information becomes available.
3.2. Response curves In the first workshop response curves were drawn for 13 soil properties: topsoil depth (A horizon), rooting depth, pH, total C, anaerobically mineralizable N, C balance, N balance, C/N ratio, Olsen P, bulk density, earthworm numbers, macroporosity, and aggregate stability by wet sieving.
Drawing separate curves for production and environmental criteria allowed scientists to focus on each in turn, avoiding the pitfalls of trying to balance potentially conflicting criteria (Hess et al., 2000). Curves were specific for each soil property, land use, and soil order (Hewitt, 1998), although in most cases the group agreed that similar soil orders could be grouped. For several indicators, all of the soils were grouped together. Broad land-use classes of pasture, cropping, horticulture, and forestry (exotic and
indigenous) were adopted. We initially had more land-use classes, but during preliminary discussion the group agreed the two proposed classes of intensive and extensive pasture could be combined. This was because, once defined, the target criteria for the two classes were the same: attributes that make for a high quality pasture under intensive use were also applicable to extensive pastures. A single cropping and horticultural class was used. This was a great oversimplification, but in preliminary discussions it became clear that accurately classifying all the diverse horticultural and cropping land uses would need an impractically large number of classes. The reverse strategy was therefore adopted: cropping and horticulture were aggregated into a single class, with the recognition that target ranges would be very broad and generalized. The general pattern of the scientists’ response curves showed that the group was reasonably confident about what comprised an acceptable range for soil properties for production criteria. On the other hand, there was much less confidence about the shape of the response curves outside the “acceptable” range. The group’s curves were also more varied for environmental criteria than for production criteria. The response curves for Olsen P on pasture on recent soils illustrate these tendencies (Fig. 3). There is clearly a need to define more accurately the shape of the response curves for environmental criteria, and to focus on the “risky” rather than the “safe” parts of both the production and environmental response curves. Output from the second workshop consisted of a set of 48 response curves, each with interpreted categories. Production and environmental curves were then combined. Fig. 4 shows the interpreted combined response curve for the Olsen P indicator on all soils under pasture land use. Interpreted response curves were not developed for unlikely or uninterpretable combinations, e.g. total carbon on a peat soil.
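The combination of production and environmental curves by a minimum function (Section 2.3.1, Fig. 2) can be sketched as follows. This is an illustration only: the two curves are hypothetical piecewise-linear shapes, not the workshop curves, and the helper names `interpolate` and `combined_rating` are assumptions made for the example.

```python
def interpolate(curve, x):
    """Piecewise-linear interpolation over (value, rating) points sorted by value."""
    if x <= curve[0][0]:
        return curve[0][1]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return curve[-1][1]


# Hypothetical curves: soil property value against soil quality rating (%).
production = [(0, 0), (30, 100), (200, 100)]     # "more is better", then a plateau
environmental = [(0, 100), (50, 100), (200, 0)]  # "less is better" beyond 50


def combined_rating(x):
    """Take the more limiting (lower) of the two ratings at each property value."""
    return min(interpolate(production, x), interpolate(environmental, x))


for x in (15, 40, 125):
    print(x, combined_rating(x))
```

The combined rating is low at both ends and highest in between, which is the “optimal” curve shape that results when a “more is better” curve meets a “less is better” curve.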
Each interpreted response curve can also be represented as a one-dimensional bar graph (Fig. 5a).
3.3. Aggregation of indicators and sites A principal components analysis using an Oblimin rotation was applied to a 12-property data set to identify the key properties that best explained the variation in soil quality indicator values between sites (Systat, 1996; Sparling and Schipper, 2002). Seven properties explained 88% of the variability in the current data set of soil quality indicators from the 511 sites (Table 1). The PCA results also showed that seven properties could be logically grouped into four components of soil quality. Factor 1 was mainly related to organic matter components and is called “organic resources” (total C, total N and mineralizable N), Factor 2 was physical condition (bulk density and macroporosity), Factor 3 was fertility (Olsen P), and Factor 4 was acidity (pH). Bulk density also contributed to Factor 1 as it was (inversely) correlated to the volumetric measures of total C, total N and mineralizable N. Aggregation of indicators is often achieved through a weighted mean of normalized indicators (Smith, 1990). PCA scores have been used to derive weighting factors (Andrews et al., 2002); however, it is doubtful if these statistical weightings have an ecological relevance. Further, in our case these weights were almost equal so we opted for a simple equal-weighted mean for each factor or soil quality component.
Fig. 3. Production and environmental response curves for Olsen P for pasture on recent soils from the first workshop. The participants’ curves have been summarized into a mean curve in the middle (filled symbol), and the curves either side (open symbol) represent one standard deviation. Two panels, each plotting Olsen P (0–200 µg/g) against soil quality (0–100), one for production and one for the environment.
Fig. 4. Combined production and environmental response curve for Olsen P on all soil orders under pasture. The curve has been interpreted into significant impact, potential impact, within-target range, and above-target categories.
Table 1
Rotated pattern loadings after principal component analysis (PCA) of seven key properties

Soil property            Factor 1   Factor 2   Factor 3   Factor 4
Total C                  0.89a      −0.09      0.13       −0.11
Total N                  0.94       0.05       0.13       0.04
Mineralizable N          0.71       0.17       −0.21      0.12
pH                       0.01       −0.08      0.03       1.00
Olsen P                  0.1        0.06       0.94       0.06
Bulk density             −0.57      0.65       0.17       0.13
Macroporosity            −0.23      −0.87      −0.07      0.01
Variance explained (%)   37         20         15         16

The higher the weighting of a soil property the more influence it has on the relevant PCA factor.
a Weightings in bold type are significant at P < 0.05, n > 500.
Fig. 5. Example bar graphs for mineralizable nitrogen soil indicator (a), and the organic resources soil quality factor (b). The small square represents the quality assessment of a soil sample site based on its measured data.
Indicators must be normalized (transformed to a common index) before they can be averaged. This was achieved by linearly mapping all significant-impact segments in the interpreted response curves to the range 0–0.33, potential-impact segments to 0.33–0.66, within-target range segments to 0.66–1.0, and above-target range to 1.0. Aggregating these normalized values according to the PCA groupings allows the results to be presented as one bar graph for each grouping: organic resources, physical condition, fertility, and acidity (Fig. 5b). A disadvantage of aggregating data by calculating averages is that high scoring properties can “hide” or mask any low scoring properties with which they are combined. This may be appropriate if poor quality in one soil indicator can be compensated for by good quality in another soil indicator. But in the case where all the indicators are limiting (i.e. production or the environment will be compromised with poor quality in any one indicator), a minimum function may be more appropriate, in which the aggregated value is set to the minimum of the normalized component indicators (Smith, 1990). This approach is more conservative in that it decreases the chance of a soil being classed as satisfactory despite one or more indicators being outside the target ranges.
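The normalization and the two aggregation options can be sketched as below. This is a simplified illustration under stated assumptions: it handles only a monotonic “more is better” curve, the threshold arguments and the example scores are hypothetical, and the function names `normalize` and `aggregate` are not taken from Sindi.

```python
from statistics import mean


def normalize(value, lower, potential, target, above):
    """Map an indicator value onto a 0-1 index for a 'more is better' curve.

    lower..potential  -> significant impact  -> 0.00-0.33
    potential..target -> potential impact    -> 0.33-0.66
    target..above     -> within target range -> 0.66-1.00
    above and beyond  -> above target range  -> 1.00
    """
    if value <= lower:
        return 0.0
    segments = [
        (lower, potential, 0.0, 0.33),
        (potential, target, 0.33, 0.66),
        (target, above, 0.66, 1.0),
    ]
    for x0, x1, y0, y1 in segments:
        if value < x1:
            return y0 + (y1 - y0) * (value - x0) / (x1 - x0)
    return 1.0


def aggregate(scores, method="mean"):
    """Combine normalized scores by equal-weighted mean or by minimum."""
    return min(scores) if method == "min" else mean(scores)


# Hypothetical normalized scores for the three organic resources indicators.
organic_resources = [1.0, 0.9, 0.25]
print(aggregate(organic_resources, "mean"))
print(aggregate(organic_resources, "min"))
```

With these example scores the mean (about 0.72) lies in the within-target band, while the minimum (0.25) lies in the significant-impact band, which illustrates the masking effect described above.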
3.4. Interpretative framework The interpreted response curves and aggregating techniques can be used to assess the soil quality of a single sample or a whole sampling program:
3.4.1. Single-sample assessment An interactive Web-based interpretative tool, Soil Indicators (Sindi), was developed to provide an interpretative graphical and textual display of soil quality characteristics in response to a user entering soil sample measurements (Lilburne et al., 2000). The URL of Sindi is http://sindi.landcareresearch.co.nz. Statistical analyses, the combined response curves and PCA results, as well as alternative methods for combining indicators, were all incorporated into Sindi. Two interpretative options were developed: a comparison against the soil quality response curves, and a comparison against statistical means and ranges from the NSD and 500 Soils Project data sets. In the statistical comparison, sample values can be compared with the mean and quartile values from the NSD and displayed graphically by plotting the sample value over a box and whisker plot of the mean, interquartiles and range. There are only sufficient data in the NSD to provide this comparison for the pastoral land use, and for five soil properties (total C, total N, pH, bulk density, macroporosity). The method of display shows whether the sample falls outside the quartile range. No interpretation of the box plot is
provided, but any sample that falls beyond the quartiles is probably worth investigating further using the expert interpretation option. In the expert interpretation, each interpreted response curve is represented as a bar graph, and a soil sample value plotted on top to give a rapid indication of how the sample relates to the recommended target ranges. As there are seven key indicators, seven bar graphs are needed. These can also be viewed in aggregated form as four bar graphs, one for each of the four PCA categories of organic resources, acidity, fertility, and soil physical condition. The user has the choice of viewing either the averaged or the minimum indicator values.
3.4.2. Regional assessment: reporting-by-exception During the course of the sampling program, it became clear that within the data set some soils and land uses would be under or over-represented relative to their actual areal extent (Sparling et al., 2004). This was a consequence of some regions not participating in the scheme, and of other regions intentionally targeting soil and land use combinations of greater concern. For example, cropping soils are often of poorer quality and therefore of interest, but the area of cropped land in a region may be quite small. To counter this bias, counts should be weighted by the land use area to give a more accurate picture ($P_{\mathrm{wgtd}}$) of overall soil quality:
$$P_{\mathrm{wgtd}} = \sum_{L} w_L \frac{e_L}{n_L} \times 100 \qquad (3)$$
where $e_L$ is the number of exceptions under land use $L$, $n_L$ the total number of measured values in land use $L$, and $w_L$ = area under land use $L$/total area under all land uses. An implicit assumption of this formula is that the sampled sites are representative of each land use. This reporting-by-exception approach has been applied to derive a national-scale soil quality rating for New Zealand (Sparling and Schipper, 2004).
3.5. Scale It is important to realize that the interpretative framework has an appropriate scale of application. It has a temporal scale in that short-term or even
medium-term degradation is considered acceptable providing that it is reversible within a given time span, which might vary from a few years (annual crops) to decades (tree harvest cycle). Different time frames can result in different target ranges, i.e., choice of temporal scale is important. In addition to temporal scale, spatial and thematic scales are also important in monitoring soil quality and setting target ranges (Nortcliff, 2002). Targets for national application should be derived from information or knowledge of the entire country. Target ranges designed for regional application may, depending on environmental and management variability, be appreciably narrower than national targets. Thematic scale refers to the level of stratification or classification of each target range, i.e., the soil and land use categories applicable to each target range. This will be dictated by spatial scale and by the level of current scientific knowledge. Very broad categories will usually be required in the case where the scale of application is national or there is limited knowledge. Our framework should only be applied in the context of a national monitoring scheme in which degradation is acceptable providing it can be reversed in the medium-term. 3.6. Future directions We suggest that a consistent set of soil quality targets be developed that are nationally recognized and used for interpretation (Nortcliff, 2002). Our recommendations are provisional because of the uncertainties in the curves, and we anticipate the interpretation and targets will be incrementally changed as better data become available. This would be most readily achieved through a centralized database of soil quality data and easy access via the World Wide Web (Sparling et al., 2004). 
Technically, it is possible to develop a Web-accessible spatial database of soil quality samples, allowing those of interest to be selected and summarized using the by-exception approach, or for individual samples to link to the Sindi assessment. As response curves are updated, or more samples collected, quality assessments could then be easily reassessed. At present this seems unlikely to happen in New Zealand because of limited funding, and the fragmentary way in which environmental data are collected on a regional basis (Lilburne et al., 2002).
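A by-exception summary of the kind such a database could recompute as samples are added is the area-weighted measure of Eq. (3). The following is a minimal sketch, assuming per-land-use counts of exceptions and measurements and per-land-use areas held in dictionaries. The function name and all numbers are hypothetical, chosen to show how over-sampling a small land use inflates the unweighted figure.

```python
def weighted_exception_percent(exceptions, measurements, areas):
    """Area-weighted percentage of exceptions, following Eq. (3)."""
    total_area = sum(areas.values())
    return 100 * sum(
        (areas[land_use] / total_area) * (exceptions[land_use] / measurements[land_use])
        for land_use in areas
    )


# Hypothetical counts: cropping is heavily sampled relative to its area.
exceptions = {"pasture": 10, "cropping": 30}
measurements = {"pasture": 100, "cropping": 100}
areas = {"pasture": 800.0, "cropping": 200.0}  # any consistent area unit

unweighted = 100 * sum(exceptions.values()) / sum(measurements.values())
weighted = weighted_exception_percent(exceptions, measurements, areas)
print(unweighted, weighted)
```

Here the unweighted figure is 20% while the area-weighted figure is about 14%, because the land use with more exceptions covers only a fifth of the area.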
4. Conclusions An interpretative framework has been developed that allows the soil quality of an individual soil sample to be assessed, and the regional or national quality status to be summarized. The basis for the interpretation is a set of soil-and-land-use-specific response curves that combine production with environmental criteria. These curves are the result of two workshops that brought together most of the recognized experts in New Zealand. The curves are specifically designed for use at the national scale and are therefore appropriate for application at this scale only. While there is some uncertainty in the curves, they provide a starting point, and can be revised or refined as new data and knowledge become available. Aggregating techniques are used to make the assessment information more manageable and readily understood. Aggregation of the data is necessary for ease of communication and for regional and national reporting. An interpretative framework such as the one developed here is an essential part of any monitoring project.
Acknowledgements We acknowledge the financial support of the Ministry for the Environment for the two workshops, and the contributions made by the participating soil scientists. Further funding for this research was provided by the New Zealand Foundation for Research, Science and Technology. Trevor Webb is thanked for his comments on this paper. References Andrews, S.S., Karlen, D.L., Mitchell, J.P., 2002. A comparison of soil quality indexing methods for vegetable production systems in Northern California. Agric. Ecosyst. Environ. 90, 25–45. Arshad, M.A., Martin, S., 2002. Identifying critical limits for soil quality indicators in agro-ecosystems. Agric. Ecosyst. Environ. 88, 153–160. Clarke, C.J., Smith, G.S., Prasad, M., Cornforth, I.S., 1986. Fertiliser Recommendations for Horticultural Crops. Ministry of Agriculture and Fisheries, Wellington, New Zealand, 70 pp. Cornforth, I.S., Sinclair, A.G., 1984. Fertilizer and Lime Recommendations for Pastures and Crops in New Zealand. Ministry of Agriculture and Fisheries, Wellington, New Zealand, 76 pp.
Drewry, J.J., Paton, R.J., 2000. Effects of cattle treading and natural amelioration on soil physical properties and pasture under dairy farming in Southland, New Zealand. N. Z. J. Agric. Res. 43, 377–386.
Drewry, J.J., Lowe, J.A.H., Paton, R.J., 1999. Effect of sheep stocking intensity on soil physical properties and dry matter production on a Pallic Soil in Southland. N. Z. J. Agric. Res. 42, 493–499.
Drewry, J.J., Littlejohn, R.P., Paton, R.J., 2000. A survey of soil physical properties on sheep and dairy farms in southern New Zealand. N. Z. J. Agric. Res. 43, 251–258.
During, C., 1984. Fertilisers and Soils in New Zealand Farming. Gov. Printer, Wellington, New Zealand, 356 pp.
Gustafson, D.H., Shukla, R.K., Delbecq, A., Walster, G.W., 1973. A comparative study of differences in subjective likelihood estimates made by individuals, interacting groups, Delphi groups and nominal groups. Organis. Behav. Human Perform. 9, 200–291.
Halvorson, J.J., Smith, J.L., Papendick, R.I., 1996. Integration of multiple soil parameters to evaluate soil quality: a field example. Biol. Fertil. Soils 21, 207–214.
Hess, G.R., Campbell, C.L., Fiscus, D.A., Hellkamp, A.S., McQuaid, B.F., Munster, M.J., Peck, S.L., Shafer, S.R., 2000. A conceptual model and indicators for assessing the ecological condition of agricultural lands. J. Environ. Qual. 29, 728–737.
Hewitt, A.E., 1998. New Zealand Soil Classification. Landcare Research Sci. Ser. 1. Manaaki Whenua Press, Lincoln, Canterbury, New Zealand, 133 pp.
Liebig, M.A., Varvel, G., Doran, J., 2001. A simple performance-based index for assessing multiple agroecosystem functions. Agron. J. 93, 313–318.
Lilburne, L.R., Sparling, G.P., Schipper, L.A., Hewitt, A.E., Gibson, R., 2000. Soil quality indicators on the world wide web. In: Denzer, R., Swayne, D.A., Purvis, M., Schimak, G. (Eds.), Environmental Software Systems: Environmental Information and Decision Support. In: Proceedings of the 3rd International Symposium on Environmental Software Systems (ISESS). Kluwer Academic Publishers, Boston, pp. 131–141.
Lilburne, L.R., Hewitt, A.E., Sparling, G.P., Selvarajah, N., 2002. Soil quality in New Zealand: policy and the science response. J. Environ. Qual. 31, 1768–1773.
Nortcliff, S., 2002. Standardisation of soil quality attributes. Agric. Ecosyst. Environ. 88, 161–168.
Pierce, F.J., Larson, W.E., 1993. Developing criteria to evaluate sustainable land management. In: Kimble, J.M. (Ed.), Proceedings of the 8th International Soil Management Workshop: Utilization of Soil Survey Information for Sustainable Land Use. USDA-SCS, National Soil Survey, Lincoln, NE, pp. 7–14.
Roberts, A.H.C., Morton, J.D. (Eds.), 1999. Fertiliser Use on New Zealand Dairy Farms. New Zealand Fertiliser Manufacturers Association, Auckland, New Zealand, 37 pp.
Schipper, L.A., Sparling, G.P., 2000. Performance of soil condition indicators across taxonomic groups and land uses. Soil Sci. Soc. Am. J. 64, 300–311.
Singleton, P.L., Boyes, M., Addison, B., 2000. Effect of treading by dairy cattle on topsoil physical conditions for six contrasting soil types in Waikato and Northland, New Zealand, with implications for monitoring. N. Z. J. Agric. Res. 43, 559–567.
Smith, D., 1990. A better water quality indexing system for rivers and streams. Water Res. 24, 1237–1244.
Soil Science Society of America, 1995. SSSA Statement on Soil Quality. Agron. News 7.
Sojka, R.E., Upchurch, D.R., 1999. Reservations regarding the soil quality concept. Soil Sci. Soc. Am. J. 63, 1039–1054.
Sparling, G.P., Schipper, L.A., 2002. Soil quality at a national scale in New Zealand. J. Environ. Qual. 31, 1848–1857.
Sparling, G.P., Schipper, L.A., 2004. Soil quality monitoring in New Zealand: trends and issues arising from a broad-scale survey. Agric. Ecosyst. Environ. 104, 545–552.
Sparling, G.P., Parfitt, R.L., Hewitt, A.E., Schipper, L.A., 2003. Three possible approaches to define desirable soil organic matter contents. J. Environ. Qual. 32, 760–766.
Sparling, G.P., Schipper, L.A., Bettjeman, W., Hill, R., 2004. Soil quality monitoring in New Zealand: practical lessons from a six-year trial. Agric. Ecosyst. Environ. 104, 523–534.
Systat, 1996. Systat 6.0 for Windows. SPSS Inc, Chicago, IL.