Spatial covariation of Azotobacter abundance and soil properties: A case study using the wavelet transform

Spatial covariation of Azotobacter abundance and soil properties: A case study using the wavelet transform

ARTICLE IN PRESS Soil Biology & Biochemistry 39 (2007) 295–310 www.elsevier.com/locate/soilbio Spatial covariation of Azotobacter abundance and soil...

536KB Sizes 1 Downloads 36 Views

ARTICLE IN PRESS

Soil Biology & Biochemistry 39 (2007) 295–310 www.elsevier.com/locate/soilbio

Spatial covariation of Azotobacter abundance and soil properties: A case study using the wavelet transform Robert J Barnesa, Samantha Jayne Baxtera,, R.M. Larkb a

Department of Soil Science, The University of Reading, Whiteknights, P.O. Box 233, Reading, Berkshire RG 6 6DW, UK b Rothamsted Research, Harpenden, Hertfordshire, AL5 2JQ, UK Received 21 February 2006; received in revised form 27 July 2006; accepted 2 August 2006 Available online 11 September 2006

Abstract The soil microflora is very heterogeneous in its spatial distribution. The origins of this heterogeneity and its significance for soil function are not well understood. A problem for understanding spatial variation better is the assumption of statistical stationarity that is made in most of the statistical methods used to assess it. These assumptions are made explicit in geostatistical methods that have been increasingly used by soil biologists in recent years. Geostatistical methods are powerful, particularly for local prediction, but they require the assumption that the variability of a property of interest is spatially uniform, which is not always plausible given what is known about the complexity of the soil microflora and the soil environment. We have used the wavelet transform, a relatively new innovation in mathematical analysis, to investigate the spatial variation of abundance of Azotobacter in the soil of a typical agricultural landscape. The wavelet transform entails no assumptions of stationarity and is well suited to the analysis of variables that show intermittent or transient features at different spatial scales. In this study, we computed cross-variograms of Azotobacter abundance with the pH, water content and loss on ignition of the soil. These revealed scale-dependent covariation in all cases. The wavelet transform also showed that the correlation of Azotobacter abundance with all three soil properties depended on spatial scale, the correlation generally increased with spatial scale and was only significantly different from zero at some scales. However, the wavelet analysis also allowed us to show how the correlation changed across the landscape. For example, at one scale Azotobacter abundance was strongly correlated with pH in part of the transect, and not with soil water content, but this was reversed elsewhere on the transect. The results show how scale-dependent variation of potentially limiting environmental factors can induce a complex spatial pattern of abundance in a soil organism. The geostatistical methods that we used here make assumptions that are not consistent with the spatial changes in the covariation of these properties that our wavelet analysis has shown. This suggests that the wavelet transform is a powerful tool for future investigation of the spatial structure and function of soil biota. r 2006 Elsevier Ltd. All rights reserved. Keywords: Azotobacter; Spatial variability; Geostatistics; Wavelet transforms

1. Introduction 1.1. The scope of this paper Soil biologists have used geostatistical methods to analyse the spatial variation both of biological soil variables and of related physical and chemical soil properties, (Rossi et al., 1995). The primary goal of geostatistical analysis is to predict the magnitude of variables at Corresponding author.

E-mail address: [email protected] (S.J. Baxter). 0038-0717/$ - see front matter r 2006 Elsevier Ltd. All rights reserved. doi:10.1016/j.soilbio.2006.08.001

unsampled sites by kriging, but the variograms that we obtain in the analysis show the important spatial scales at which biological processes vary in soil. For this reason soil scientists have often computed variograms in order to gain insight into spatial variability. However, geostatistical analyses require assumptions about the variability, which are particularly implausible for complex biological processes, so we expect that the insight that they give will often be limited. It is likely that kriging predictions are fairly robust to failures of the assumptions; but when they do not hold, the variograms of soil properties will be of doubtful value for the interpretation of spatial variation.

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

296

The wavelet transform is more appropriate than geostatistics to elucidate complex spatial variation and covariation of soil properties. It allows us to relax certain unrealistic assumptions about spatial variation, and we may expect it to give insight into scale-dependent and spatially complex variation, as it has in the study of physical and chemical soil properties (Lark and Webster, 1999, 2001; Redding et al., 2004; Si and Zeleke, 2005). In this paper, we consider the limitations of geostatistical analysis and discuss the wavelet transform in outline. We then use both methods to analyse data on the spatial covariation of abundance of Azotobacter with some physical and chemical soil properties. 1.2. Geostatistical analysis All soil properties vary over different spatial scales, and biological variables are no exception. Soil biologists have used geostatistics to analyse this spatial variation (e.g. Rossi et al., 1995). In geostatistics we assume that an observation of a soil property, z, at a location, x, can be treated as a realization of a random function Z(x). We must assume that this function has two properties. First, its increments over any interval in space are zero on average E½ZðxÞ  Zðx þ hÞ ¼ 0;

for any x; h,

(1)

where h is an interval in space, the lag, and E [ ] denotes the statistical expectation of the term in the brackets. Second, it is assumed that the square of these increments depends only on the lag, regardless of absolute position, and so we may define a function g(h): 2gðhÞ ¼ E½fZðxÞ  Zðx þ hÞg2 ;

for any x.

(2)

The function g(h) is the variogram. These two equations define the assumption of intrinsic stationarity, and the primary concern of this paper is with the plausibility of this assumption. The variogram can be estimated for different lags from sufficient data (typically 100 or more observations) at locations selected to allow us to make comparisons over different intervals. A lag may be defined on distance alone or on distance and direction. Usually we define a lag-bin, a tolerance interval of distances and bearing and associate pair-comparisons within a bin with the central lag. When this is done the variogram can be estimated as proposed by Matheron (1962) g^ ðhÞ ¼

Nh 1 X fzðxi Þ  zðxi þ hÞg2 , 2N h i¼1

(3)

where Nh pair-comparisons fall in the lag-bin centred on h. Soil biologists have used this expression to estimate the variogram of different biological variables that have been sampled at different scales. For example, Ritz et al. (2004) computed the variograms of microbial biomass, PLFA concentrations, colony-forming units (CFUs) and a community-based physiological profile variable determined on

Biologs microplates. These were for lags from a few tens of centimetres up to about 6 m. If a variable has no spatial dependence (i.e. two close points can be as different as two distant ones) then the variogram is flat (pure nugget in geostatistical terms). This means that our sampling is insufficiently fine to resolve any spatial dependence. The variogram typically rises with increasing lag distance up to some lag, the range, beyond which it is flat. This means that there is spatial dependence between two sites separated by a lag less than the range, beyond this lag our data are mutually independent. The value of the variogram at the range is called the sill variance. The empirical variogram, estimated with Eq. (3), may often appear to have a positive intercept at lag zero (although the variogram is zero at lag zero, by definition). This intercept, the nugget variance, represents variation that is not resolved into spatially dependent components because our sampling scheme is too coarse. The sill and range are often used to characterize spatial variation. The nugget variance, as a proportion of the sill, indicates the importance of very fine-scale processes, but note that its meaning is conditional on the sampling scheme. Ritz et al. (2004) found that their variables exhibited contrasting patterns of variation. The range for biomass carbon was at a lag distance of about 3 m, while the PFLA concentrations had ranges from around 70 cm to over 6 m. They went on to use these variograms to interpolate the variables by kriging at unsampled sites on a grid, and so to map them. Kriging interpolation uses the variogram to determine local estimates that are optimal linear combinations of the data, optimal in the sense that the predictions have the minimum expected squared error. Other workers have computed variograms for biological variables. For example, Stark et al. (2004) estimated the variograms at lags of tens of centimetres for microbial biomass and enzyme activity, and found differing patterns of spatial variation. Nunan et al. (2002, 2003) estimated variograms from data obtained from both bulk soil and thin sections on microbial distribution. They considered lags down to micrometres, and were able to show how the distribution of microbes is spatially structured. Go¨rres et al. (1998) compared variograms of nematode density and carbon mineralization in forest and arable soils at different times of year. The spatial variation in the forest soils showed strong seasonal dependence. Franklin et al. (2001) used geostatistics to study fine-scale variation (lags from 1 to 35 cm) in the abundance and composition of bacteria in salt marsh sediments. Grundmann and Debouzie (2000) computed variograms of abundance of serotypes of ammonia and nitrite-oxidizing bacteria in arable soil for lags between 1 and 20 mm. Certain serotypes showed spatially dependent variation over a few mm. These studies, among others, illustrate how soil biologists have used geostatistics, often not primarily to interpolate and map variables but rather to characterize and interpret spatial variability.

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

Most geostatistical analyses in soil biology have been univariate, but we can also analyse the joint spatial variation of two variables in geostatistics. If two properties, u and v, are assumed to be realizations of random functions Zu(x) and Zv(x), respectively, then we require, under the assumption of intrinsic stationarity, that a cross-variogram can be defined as a function of the lag only, gu,v(h), where 2gu;v ðhÞ ¼ E½fZu ðxÞ  Zu ðx þ hÞgfZ v ðxÞ  Z v ðx þ hÞg;

for any x.

(4) Rossi et al. (1995) recommended that soil biologists use cross-variograms to study joint variation, but they have not been widely used. Smith et al. (1994) reported crosscorrelograms (related to the cross-variogram but entailing stricter stationarity assumptions) for lags up to 1.5 m to describe the joint variation of carbon and nitrogen mineralization and microbial biomass with plant density. Ritz et al. (2004) report cross-variograms for biological variables and soil chemical properties, but give little interpretation. Since they did not model the cross-variograms with the corresponding variograms for single properties (auto-variograms), and leave the former in their original units, there is little scope for interpretation of the results that they presented since the cross-variogram alone tells us little about the strength and nothing about the scale-dependence of any correlation between the two variables. Cross-variograms and auto-variograms can reveal scaledependence in the correlation of two variables. By scaledependence we mean that the components of variation of the two variables at one scale are more strongly correlated than the corresponding components at another scale. This may be of considerable biological interest. For example, the rate of some microbially mediated process (such as nitrous oxide emission from the soil) may vary over short distances in response to variations in the soil carbon content, while over longer distances large differences in pH between contrasting parent materials might mask the effect of other variables. Such scale-dependence has been reported in geostatistical analysis of metal concentrations in the soil (Goovaerts and Webster, 1994), but we are not aware of comparable studies of biological variables. It should be noted that scale-dependence cannot be inferred from the cross-variogram alone. The cross-variogram may change with lag when the correlations between the spatial components of the variables are identical at all scales. We present some geostatistical measures of scale-dependent correlation in Section 2. Geostatistics has given some insight into variation of biological properties of the soil. In particular it indicates the important spatial scales, and can provide guidance on sampling. However, there are severe limitations on geostatistical analysis for the elucidation of variation. While much geostatistics in soil biology has focussed on computing and interpreting variograms, it should be remembered that geostatistics is mainly concerned with optimal local prediction. The variogram gives some information on scales, but different spatial processes can

297

have similar variograms, so the variogram is a blunt tool if we want insight into spatial variability. More seriously, it must be remembered that auto- and cross-variograms require that we assume intrinsic stationarity. A failure of the first-order component of this assumption—Eq. (1)— can be dealt with by modelling the trend (Lark et al., in press), but a failure of the assumption in Eq. (2) is more serious. This assumption requires that the expected variance between two points separated by some lag h is the same everywhere. Thus the variance and the autocorrelation are both constant. This assumption will often be implausible. As we traverse a landscape we expect to pass through contrasting microenvironments within which the variability of soil properties will be different. For example, the redox potential of the soil is likely to be fairly uniform within a well-drained cultivated soil, but it may vary markedly within a low-lying, wetter region of a field. Similarly, the variation of soil texture within a patch of glacial drift may be much larger than within soil formed from clay through weathering of the country rock. This heterogeneity of the variation in soil material is likely to propagate to non-uniform variability of biological processes. Indeed, since many biological processes in soil depend on soil physico-chemical properties in complex, non-linear ways, it is likely that their variation will change even more markedly across the landscape. Under such non-uniformity of the variation, Eq. (2) is implausible, and so we must interpret the variogram with caution. Intuitively we can see that the variogram would be an averaging of very different patterns of spatial dependence across a region, and would represent the actual spatial dependence nowhere. In the multivariate case the assumption in Eq. (4) would also fail, but there may be other considerations. While the variance of two variables might change in space, the correlation might also change, and this would also render the intrinsic hypothesis implausible. Again, this might often be an issue in soil biology. For example, nitrous oxide emission from soil might be correlated with organic content in wet soils, when substrate availability is the limiting factor, but not in drier soils. The impact of failures of the assumption of stationarity in the variance on kriging predictions has not been extensively studied. It is likely that kriging estimates are not very sensitive to any but gross failures of the assumptions, since the kriging estimate is always an unbiased estimate of the local mean (Webster and Oliver, 2001). However, the interpretation of spatial (co)variation, as expressed by the variogram or cross-variogram does depend strongly on stationarity assumptions. For example, weak scale-dependent correlation might be inferred between two variables from the auto- and cross-variograms under assumptions of stationarity, but the variograms might conceal a more complex situation, incompatible with assumptions of stationarity, where the variables are strongly correlated over some of the landscape, but are uncorrelated elsewhere.

ARTICLE IN PRESS 298

R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

Wavelet transforms have been used in soil science for spatial analysis where stationarity of (co)variance is implausible (Lark, 2005; Si and Farrell, 2004). In the next section, we present the wavelet transform in outline. 1.3. The wavelet transform We do not give a full account of the wavelet transform here, but we outline the key ideas in this section and describe our specific methods later. A more complete account for soil scientists is given by Lark and Webster (1999). The transform is described for data that can be treated as some function of location on a one-dimensional transect, f(x). Most of the methods can be extended to two or more dimensions (Lark and Webster, 2004). Note that we can conduct a wavelet transform on around 100 data, or more, the minimum sample size for a geostatistical analysis (Webster and Oliver, 1992), but the wavelet methods that we discuss here require that the data are sampled regularly in space, which is not necessary for geostatistics. Wavelets are basis functions; i.e., they are basic mathematical building blocks that can be combined, with appropriate coefficients, to reconstitute any data set of finite variance. The wavelet has a compact support—it takes non-zero values only within a narrow window—so a particular wavelet coefficient only describes local soil variation, within the support of the corresponding wavelet. It is because any particular wavelet coefficient only describes variation within a local window, rather than across the whole data set, that the wavelet transform is suited to describe variation that cannot plausibly be regarded as the outcome of a stationary process. In order to describe a spatial data set it is necessary to translate the wavelet to a series of locations in space so that all locations are described by local coefficients. In addition, the basic wavelet function (mother wavelet) can be dilated in increments, i.e., expanded to take non-zero values over a wider interval, to describe spatial variation at a coarser spatial scale. A complete discrete wavelet transform is obtained by taking a basic wavelet function and dilating it in increments (so as to describe increasingly coarse spatial scales), and by translating each of these dilations in space (so as to describe the variation at different locations). Having described the basic principle, we now give some notation. The discrete wavelet function, cj,n(x), denotes the jth dilation and nth translation of the mother wavelet function c(x). The scale parameter, that controls the dilation, is x02j, where x0 is the basic sampling interval (although for brevity we refer to the scale parameter for the jth dilation as scale 2j in the following discussion). The translations are done in steps of length x02j. Thus, if we have sampled a variable on a transect at intervals of 1 m, then our finest scale is 1  21 ¼ 2 m, and the wavelet is translated in steps of 2 m. At a coarser scale, say j ¼ 4, the scale parameter is 1  24 ¼ 16 m, and the wavelet is translated in steps of 16 m. An analogy has been made between this process and

microscopy. When we examine the data at a fine scale (examining as slide at large magnification) we move in small increments (we scan the slide in small steps). When we examine the data at a coarse scale (examining as slide at a small magnification) we move in large increments (we scan the slide in large steps). The corresponding discrete wavelet coefficient that corresponds to cj,n(x), Dj,n, is obtained as the inner product of the wavelet with the data: Dj;n ¼ hcj;n ðxÞ; f ðxÞi.

(5)

The discrete wavelet coefficients can be used to reconstruct the data by expanding all the wavelets in the basis (i.e. all the dilations and translations cj,n(x) that we have used) with their coefficients Dj,n. If we expand only those wavelets that correspond to a particular scale parameter, 2j, i.e. cj,n(x) for n ¼ 1, 2, y, then we obtain an additive component of the data corresponding to this scale of generalization. This is called a detail component. A set of detail components for scale parameters 21, 22, y, 2m constitutes a multiresolution analysis (MRA) of the data along with the smooth component for scale 2m (i.e. the residual when we subtract the detail components). A MRA allows us to decompose our data into components at different scales of generalization, and so to separate shortrange fluctuations from long-range patterns. This is done without assuming stationarity. Lark and Webster (1999) illustrated MRA for some data on the clay content of the soil across a 3-km transect in central England. The analysis showed the effects of differences between contrasting outcrops at coarser scale, and the variation at this scale showed a marked change about half-way along the transect where there was a marked pattern of Liassic outcrops that alternate between clay-bearing and coarser-textured rock. Components at finer scales showed similar changes in variability, inconsistent with an assumption that they are realizations of a process that follow Eq. (2). The mean square of the wavelet coefficients for some scale parameter 2j constitutes the wavelet variance of the data at this scale (Percival, 1995), and with orthogonal wavelets, the wavelet variance is an additive and scaledependent component of the variance of the data. The square of any wavelet coefficient Dj,n (normalized by the number of coefficients at scale 2j) is therefore an additive component of the wavelet variance at scale 2j and so of the variance of the data as a whole. We may therefore use the wavelet coefficients to analyse the variation of our data by scale (pooling the contributions from all translations at a particular scale, and so assuming stationarity) or by scale and location (which does not require us to assume stationarity). This is discussed in more depth by Lark and Webster (2001) and Lark (2005). Whitcher et al. (2000a) proposed the wavelet covariance as a scale-specific measure of the covariance of two variables. The wavelet covariance is the mean product of the corresponding wavelet coefficients for two variables at some scale. Again, the analysis of products of wavelet

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

coefficients for two variables allows us to analyse the covariance of the variables by scale (assuming stationarity) or by scale and location (not assuming stationarity). We discuss the practicalities of these analyses in Secton 2. It is important to recognize that geostatistical and wavelet analyses are closely related, under the assumption of stationarity, via the spectral density function of the variable (Cressie, 1993). The wavelet variance for a particular scale can be computed directly from the spectral density function if this is known (and the spectral density function may be estimated using wavelets, as described by Percival and Walden, 2000). At the same time, as Cressie (1993) describes, we may compute the spectral density function for a variable if we know its variogram. Under stationarity wavelet analysis and geostatistical analysis must therefore give us comparable results. However, when the assumption of stationarity in the variance is not plausible, wavelet methods have the advantage that we can examine the local components of the wavelet variance at any scale and show precisely how the stationarity in the variance breaks down. As we describe below, this process can be formalized so that a wavelet analysis can provide post hoc evidence against the assumption of stationarity and show that the variograms of the data are of questionable interpretative value. In summary, the widely used geostatistical techniques, while very powerful, entail assumptions about the (co)variability of soil properties that are not always plausible. In particular, when we consider biological soil variables the assumption that they are realizations of underlying spatial processes of uniform variability and correlations is implausible. However, if this assumption cannot be made, then while variograms and related statistics can be computed, they cannot be interpreted with any confidence. Wavelet transforms give information on spatial variation which is consistent with the geostatistical analysis under the assumption of stationarity, but which allows us to relax

299

this assumption and investigate its plausibility. We would hypothesize that the covariation of a biological variable such as the abundance of a particular species, with potentially limiting factors, will not be spatially uniform (and hence not suitable for geostatistical analysis). This hypothesis can be tested with the wavelet methods introduced above. In this paper, we present a case study involving the geostatistical and wavelet analyses of the joint variation of abundance of Azotobacter CFUs and some physicochemical soil properties sampled on a transect. These results allow us to address the previously stated hypothesis. In the following section, we describe the sampling methods and the specific geostatistical and wavelet methods that were used to analyse the data. 2. Methods 2.1. The site The study site was at Sonning Farm, Berkshire; which belongs to The University of Reading (UK National Grid Reference: SU 762 754). The site comprised an alluvial flood plain next to the River Thames and a flood plain gravel terrace about 1 m above the alluvial flat. Kay (1936) identified four soil series on the part of the farm covered by our sampling transect. These are listed with their modern equivalents in Table 1. In summary they are: 1. Rowland series: formed on a bank of sand raised above the surrounding ground, which increases drainage. The soil is a brown acid coarse sand and stony throughout. The parent material is valley gravel with 85% sand in the top soil and a small organic matter content (approximately 2%). 2. Broadmoor series: a wet alluvial soil that floods regularly every winter unless the season is exceptionally dry.

Table 1 Description of the prevailing land-use at the time of sampling, and soil series mapped at points along the transect Location (m)

0–32 34–152 154–178 180–208 210 212 214–266 268–296 298 300–436 438–472 574–516 a

Land use

Rotational grass Rotational grass and clover Winter wheat Spring barley Winter wheat Permanent grass strip Woodland Woodland Stream Permanent pasture Permanent pasture Permanent pasture

Soil series Original identificationa

Modern correlativeb

Rowland series Rowland series Rowland series Rowland series Rowland series Rowland series Rowland series Broadmoor series

Sonning series Sonning series Sonning series Sonning series Sonning series Sonning series Sonning series Broadmoor series

Broadmoor series Broad series Usher series

Broadmoor series Thames series Usher series

The original identification was by Kay (1936). The modern correlatives are the series used in the regional soil survey of Jarvis (1968), and are those that Jarvis (1968) deemed equivalent to the series of Kay (1936). These particular series remain unchanged in the current soil classification of England and Wales (Clayden and Hollis, 1984). b

ARTICLE IN PRESS 300

R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

The water table is close to the surface and the organic matter of the soil is large, nearly 25% in the top soil. The soil is a peaty silty clay, alkaline in reaction and contains numerous calcareous shells. Mottling is abundant in the top layers. 3. Broad series: floods for short periods in very wet winters. The soil is an organic loamy clay. The organic content in the top soil is less than in the Broadmoor series (15%) and the soil is less alkaline than the Broadmoor series. 4. Usher series: is a loamy soil, the driest and most freely drained of the alluvial soils. 2.2. Sampling procedures The soil was sampled at 256 sites at 2-m intervals on a linear transect. The transect was disrupted by a stream near its centre where two samples could not be taken. The total length of the transect was therefore 516 m. The transect spanned three distinct land uses: an experimental arable area, woodland and pasture. Table 1 describes the land use across the transect in more detail with the corresponding soil series. The first 153 m of the transect were on rotational grass, drilled 11 weeks prior to sampling (to 94 m) and 1-year old elsewhere. The ground vegetation of the woodland was predominantly Urtica dioica L. and Rubus fruticosus L. agg. Most of the pasture area was grazed by cattle, but along the final 78 m (on the Broad and the Usher series) the grass was cut for silage. The soil samples were collected on 13–14 June 2004. At each site a soil sample of approximately 1 kg was taken to a depth of 10 cm. The soil was then placed into a sterile plastic sample bag, which was then sealed and labelled. The samples were transported with a minimum of delay to a cold room, and were kept at 4 1C until they were analysed. 2.3. Laboratory analysis Each sample of soil was thoroughly mixed in the bag with a sterile spatula. A 10-g subsample of the material was then weighed out, and a suspension was made by dispersing it into 90 ml of sterile de-ionized water to give a 10-fold dilution. After shaking, 0.1 ml of the suspension was pipetted aseptically onto a pre-labelled sterile Petri dish containing 15–20 ml of Azotobacter medium (Deutsche Sammlung von Mikroorganismen und Zellkulturen GmbH (DSMZ)). The sample was then spread evenly across the petri dish using a sterile glass spreader. This was repeated on another petri dish with a second subsample to give two replicates for each soil sample. The dishes were incubated at 26 1C in the dark. After 72 h they were examined and the number of CFUs of Azotobacter spp. was recorded. Sixty dishes were chosen at random and a Gram stain method was used to confirm that the colonies were Azotobacter spp. The number of colonies on the two replicate petri dishes for each sample site was then standardized to units of colonies per gram of field-moist soil. This average value

over both replicates was used in all further analyses. In the rest of the paper, we call these values ‘‘CFUs’’ for brevity. Soil water content (g water 100 g1 dry soil) was determined by the weight loss of 10-g samples after drying in an oven at 105 1C for 24 h. The soil organic matter content (g 100 g1 dry soil) of this same material was then determined by loss on ignition in a furnace at 500 1C (Rowell, 1994). Soil pH (10 g soil in 25 ml water) was also determined. 2.4. Multivariate geostatistical analysis The cross-variogram, defined in Eq. (4), can be estimated from data on variables u and v by a generalization of Eq. (3) g^ u;v ðhÞ ¼

Nh 1 X fzu ðxi Þ  zu ðxi þ hÞgfzv ðxi Þ  zv ðxi þ hÞg 2N h i¼1

(6) which is equivalent to Eq. (3) for u ¼ v, which estimates the auto-variogram for a variable. The estimator in Eq. (6) was used to estimate auto-variograms for all variables, and the cross-variograms for Azotobacter CFUs with each of the other soil properties. By itself the cross-variogram does not tell us whether the relationship between two variables is scale-dependent, but we can identify scale-dependence from the cross- and autovariograms if we rescale the estimates of the crossvariogram at each lag to take values between 1 and 1, by analogy with the ordinary correlation coefficient. This is the codispersion coefficient (see Goovaerts and Webster, 1994), which is a function of lag: g^ u;v ðhÞ r^ u;v ðhÞ ¼ pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi . g^ u ðhÞ^gv ðhÞ

(7)

When the two variables have scale-invariant correlation then the codispersion coefficient does not change with lag, otherwise the codispersion coefficient may increase or decrease with the separation distance. It is also possible to extract information on scale-dependent covariation by modelling the auto- and cross-variograms as continuous functions of the lag in a linear model of coregionalization (LMCR). Under this model, for Nw variables, a matrix of the auto- and cross-variograms is expressed as a function of a common set of variogram models: 3 2 g1;1 ðhÞ g1;2 ðhÞ . . . g1;N w ðhÞ 7 6 6 g2;1 ðhÞ g2;2 ðhÞ . . . g2;N w ðhÞ 7 7 6 7 6 7 6   ...  7 6 GðhÞ  6 7 7 6   ...  7 6 7 6 7 6   ...  5 4 gN w ;1 ðhÞ gN w ;2 ðhÞ . . . gN w ;N w ðhÞ ¼

L X l¼0

gl ðhÞS l ,

ð8Þ

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

where gl(h) is a basic variogram model, with a distinct range of spatial dependence (except for g0(h) which is a nugget term, and Sl is the corresponding variancecovariance matrix of the Nw variables, a coregionalization matrix. We may standardize this matrix to a correlation matrix, so that any entry is in the range [1,1]. For any pair of variables this gives a structural correlation, the correlation between the components of variation associated with the particular spatial scale described by the basic variogram function gl(h). The structural correlation is a useful statistic (e.g. Goovaerts and Webster, 1994), but its validity depends on the appropriateness of the LMCR as a model of the joint variation of the variables. The auto- and cross-variograms that are estimated with Eq. (6) may not always be sufficiently similar in form to be described by a joint set of variograms such that the coregionalization matrices are all positive definite, and if the fit of the model is poor then the structural correlations may be misleading. In this study, codispersion coefficients were computed from the auto- and cross-variogram estimates, but LMCR could not be fitted satisfactorily to the estimates, so structural correlations are not computed. Since these data were collected on a 1-dimensional transect all lags are scalars (distances). 2.5. The adapted maximal overlap discrete wavelet transform (AMODWT) and associated statistics The AMODWT, proposed by Lark and Webster (2001), was used for the analyses reported in this paper. Percival and Guttorp (1994) proposed that wavelet variances are estimated from the maximum overlap wavelet coefficients, rather than from the discrete wavelet coefficients. The wavelet coefficients defined in Eq. (5) for successive translations of a scaled wavelet are equivalent to subsampling every 2jth value from a filtering of the data with the wavelet function. The maximum overlap discrete wavelet transform (MODWT) coefficients are the full set that we would obtain without subsampling. They are not orthogonal to each other, but they do retain more information from the data than do the DWT coefficients (which are a subset of the MODWT ones). The AMODWT of Lark and Webster (2001) is essentially the same as that of Percival and Guttorp (1994) but uses adapted wavelet functions near the beginning and end of the transect. u We denote the AMODWT coefficient by d j;n for variable j u at location n and for scale parameter 2 . If there are Nj, such coefficients then the AMODWT variance is estimated as j   1 X u 2 dj;n . ¼ j 2 N j n¼1

N

s 2j;u

(9)

Percival and Guttorp (1994) and Percival (1995) show how confidence limits for these variances may be estimated. Lark and Webster (2001) computed AMODWT covariances for variables u and v, measured at the same

301

locations, as N

j 1 X u v u;v d j;n dj;n . C j ¼ j 2 N j n¼1

(10)

A wavelet correlation may be computed as r u;v j ¼

C u;v j . suj svj

(11)

Confidence limits for this estimate may be computed using Fisher’s z transform:     1 h r u;v r u;v ¼ tanh . (12) j j The transformed estimate of the correlation is approximately normal with a sample variance of 1/(n3) where the correlation is derived from n independent observations. Lark and Webster (2001) described a Monte Carlo method to compute an effective value of n given that the wavelet coefficients are non-orthogonal so this assumption of independence does not hold. The wavelet correlation is a useful measure of scaledependent correlation between variables. It corresponds to the structural correlation described in the previous section, but without requiring that an LMCR is fitted to the data. However, since it is obtained by summing functions of the wavelet coefficients over all locations its interpretation entails assumptions of stationarity (i.e. that the correlation between the variables at some scale is the same everywhere). We may exploit the localization of the coefficients to look for changes in covariance. This is done using a development of a method of Whitcher et al. (2000b) proposed by Lark and Webster (2001). For a particular scale 2j, with Nj AMODWT coefficients, we compute Sk, a normalized cumulative sum of products of the wavelet coefficients: Pk  u  v n¼1 d j;n d j;n Sk ¼ PN j 1 u v 8 k ¼ 1; 2; . . . ; N j  1. (13) d d n¼1

j;n j;n

Under an assumption of stationarity of the covariance we expect this quantity to increase linearly to 1.0 as k ¼ 1, 2,y,Nj1. If there is a change in the covariance at some location then a plot of Sk will appear piece-wise linear with the break point at the change. We may search for such breaks of slope by finding the value of k that gives rise to the value Bu,v: "(  ) max k u;v u;v  Sk B  max , 1pkpnj  1 nj  1 (  )# max k  1 S u;v . ð14Þ k  nj  1 1pkpnj  1 The position k that gives rise to Bu,v is a candidate change point.

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

302

2.5.1. Multiresolution analyses, wavelet correlations and change detection As noted in Section 2.2, the transect was sampled at 2-m intervals but there was a small irregularity in the sampling since the soil could not be sampled at 298 and 300 m. For this analysis we disregarded this gap in the sampling, which could lead to some local inflation of the wavelet variances at the smaller scales. It is unlikely to affect the changedetection procedure described previously because the method generally identifies pronounced differences in variability between segments of a transect rather than local anomalies, unless these are very large. The extremal phase wavelet of Daubechies (1988) with two vanishing moments was used in this work because it has a very compact support, and so it is particularly suited to the investigation of non-stationary variation. It is not very smooth, but this is not usually a concern in the analysis of soil data, which generally show substantial short-range variation. The AMODWT coefficients for the 256 data could be computed for 6 dilations of the wavelet, i.e., for scale parameters 2j  2 m, j ¼ 1, 2, y, 6:—4, 8, y, 128 m. Multiresolution analyses: A MRA was computed for each variable using the shift-averaging procedure of Coifman and Donoho (1995) described by Lark and Webster (1999). To do this a MRA was computed within

a moving window, which is shifted in increments. The resulting components are averaged. The moving window was 256 m long, so the coarsest scale that could be investigated with the adapted wavelets is 64 m (one quarter the length of the window). This procedure is commonly used because it is shift-invariant (unlike a simple MRA based on the DWT). Detail components were obtained for scale parameters 4, 8, 16, y 64 m, and a smooth component for 64 m. Wavelet correlations: Wavelet variances were computed for all the variables at the scales studied with the AMODWT and 95% confidence intervals were obtained for these estimates, following Percival and Guttorp (1994) and Percival (1995). Wavelet correlations between the transformed Azotobacter CFUs and each soil property and their 95% confidence intervals were then computed, with the expressions in Eqs. (9)–(12). Next the procedure described previously in this section was used to identify locations where the covariance of Azotobacter CFUs and each soil property appears to change. This was done for each spatial scale for which AMODWT coefficients were calculated. For any soil property, and a particular scale parameter 2j, Sk was computed initially for k ¼ 1, 2, y , 255 where k is the index of each datum in order as sampled, and a candidate change point was then identified using Eq. (14). The next step was

4 Azotobacter log10 CFUs

Azotobacter CFUs

2000 1500 1000 500 0 0

100

200

300

400

2

0

-2

500

0

100

0

100

200

300

400

500

50 150

Loss on ignition / g 100 g-1 dry soil

Soil water content / g 100 g-1 dry soil

200

100 50 0

30 20 10 0

0

Soil pH

40

100

200

300

400

500

200

300

400

500

Distance from origin of transect (m)

8 7 6 5 4 3 2 1 0 0

100

200

300

400

500

Distance from origin of transect (m)

Fig. 1. Values of soil properties plotted against distance from the origin of the transect. Note that adjacent points are joined by straight lines.

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

303

Table 2 Summary statistics of data on soil properties measured along the transect Loss on ignition (g 100 g1 dry soil)

Soil pH

Soil water content (g 100 g1 dry soil)

Azotobacter (CFUs g1 soil)

Azotobacter log10 (CFUs g1 soil)

Mean Median Minium Maximum Variance Skewnews

13.5 10.7 1.7 43.5 141.5 0.77

6.34 6.52 3.59 7.77 0.72 0.96

39.0 20.9 3.8 181.7 1608.5 1.0

166 0 0 2035 135917 3.0

0.10 2.0 2.0 3.31 4.27 0.31

Correlations Loss of ignition pH Water content Azotobacter log Azotobacter

1.0 0.34 0.95 0.64 0.57

1.0 0.46 0.44 0.74

1.0 0.65 0.64

1.0 0.65

1.0

3000

7

(a)

(c)

6

2500

Variance

5

2000

4 1500 3 1000

2

500

1

0

0 0

1.2

60

120

0

180

250

(b)

60

120

180

(d)

Variance

200 0.8

150 100

0.4

50 0

0 0

60

120

180

Lag distance (m)

0

60

120

180

Lag distance (m)

Fig. 2. Auto-variograms of (a) log10 Azotobacter CFUs, (b) soil pH, (c) water content and (d) loss on ignition.

to compute the wavelet correlations between the variables for each of the two segments divided by the change point, and their confidence limits. Following Lark et al. (2004) and Lark (2005) the significance of the difference between the wavelet correlations either side of the change point was tested against critical values computed by a Monte Carlo simulation. To test a change point in a sequence of n data at scale parameter 2j, 5000 realizations were generated of n pairs of values with stationary covariance (the null hypothesis). The candidate change point for scale parameter 2j on each realization was determined as above, and the absolute difference between the wavelet correlations on either side of

the change point was computed. The empirical distribution of these differences was then treated as an approximation to the sampling distribution of the difference in wavelet correlation on either side of a candidate change point under a null hypothesis of stationary variation, conditional on 2j and n. If the correlations on either side of the change point differed significantly (po0.05) then the procedure was repeated for each of the two segments defined by the change point, otherwise the search for change points stopped. Reiteration of this procedure divided the transect into segments within which the wavelet correlation (for a particular scale) may be regarded as uniform.

ARTICLE IN PRESS 304

R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

3. Results

1 0.8

3.2. Geostatistical analyses Fig. 2 shows the autovariograms for the four variables of interest, and Fig. 3 shows the co-dispersion coefficients of the transformed Azotobacter CFUs with the other three variables. The auto-variograms show distinct features. Both water content and loss on ignition show spatial dependence over longer distances than do the other two variables, although all variograms appear to be bounded. All auto-variograms show evidence of some local smooth-

0.6 0.4 0.2 0

(a) -0.2 0

60

120

180

60

120

180

120

180

1

Codispersion

0.8 0.6 0.4 0.2 0

(b) -0.2 0

1 0.8 Codispersion

The properties measured at each position are shown in Fig. 1. The number of Azotobacter CFUs was intermittent. There were no CFUs detected in 133 of the sample sites (52%), and the numbers at sites where the species was present varied widely from a few to 42000 CFUs g1 soil. Azotobacter was rare in the arable area except for two small patches, which corresponded to the largest pH values. Azotobacter CFUs were numerous in the grassland over the Broadmoor series and declined rapidly as this gives way to the Usher soil series. The soil pH was small in the woodland but larger in the arable area (because of liming) and the grassland (because of flooding). The flood water is alkaline because the Thames cuts through the Chalk hills at Goring and deposits calcareous material in the form of shells and small river snails; this will also account for some of the carbon lost on ignition. Water content and loss on ignition showed similar patterns. Values were small in the arable area, increased in the woodland and declined over the Broad and Usher soil series. This reflects the well-drained sandy textured soil in the arable area and the lower lying alluvial soils under woodland and grassland. Table 2 gives the summary statistics and correlations of these properties. The skewness of Azotobacter CFUs was large so these data were transformed to common logarithms for further analysis. Before transformation a constant of 0.01 was added to all the data because some values were zero. This transformation markedly reduced the skew of the data. The transformed data are plotted in Fig. 1. There was a moderate positive correlation between all soil properties and Azotobacter CFUs. Correlations are also shown for log10 Azotobacter CFUs, and it is with these transformed data that our further results are concerned. Transformation of strongly skewed data improves the efficiency of statistical estimation and reduces the leverage of observations in the upper tails of the marginal distributions. The correlations with pH were larger for the transformed Azotobacter CFUs. The positive correlation of CFUs with pH is expected on biological grounds, since bacterial activity is usually inhibited by lower pH values in the soil.

Codispersion

3.1. Basic data, statistics and transformation

0.6 0.4 0.2 0

(c) -0.2 0

60 Lag distance (m)

Fig. 3. Codispersion coefficients for log10 Azotobacter CFUs with (a) soil pH, (b) water content and (c) loss on ignition.

ness in the variation, i.e., a slight upward concave form at short lags. This is most pronounced for water content and for loss on ignition. It was not possible to model the auto- and crossvariograms satisfactorily with a consistent LMCR, so we have not attempted to compute structural correlations. However, the co-dispersion coefficients do show that there is pronounced scale-dependence in the correlation between the soil properties and transformed Azotobacter CFUs. The correlation increases with scale in all cases, indicating that much of the shortest-range variation in Azotobacter CFUs must be due to soil factors other than those measured here, but that a substantial amount of the variation over longer distances might be explained in terms of our selected soil properties. However, this interpretation

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

depends entirely on the assumptions of intrinsic stationarity given in Eqs. (1), (2) and (4). The results of the wavelet analysis will indicate how plausible these assumptions are.

strongest evidence, overall, for a relationship between these variables was seen at the intermediate scales of 16 and 32 m. There was a similar pattern for the wavelet correlations between log10 Azotobacter CFUs and soil water content. The largest correlation was at the third dilation (16 m) and only at 8, 16 and 32 m were the overall correlations significantly different from zero. Fig. 6 shows the MRA of each variable. The smooth component at 64 m is at the bottom of each graph, and arranged above this in order of decreasing values of the scale parameter are the detail components. Summing these components reconstitutes the original data. Note that, for clarity, there is an offset between each component. A bar on each graph indicates the scale in units of the original variable. The MRA shows the intermittency of much of the variation in the transformed Azotobacter CFUs, particularly at the three finest scales. This variation appears to be non-stationary. Note that some, but not all, of the intermittent variation around 300 m on the transect might reflect the gap in the sampling here. The non-stationarity of the variation in soil water content and loss on ignition is marked, with little variation up to about 200 m on the transect, and substantial variation thereafter. In such circumstances a geostatistical analysis of the variation would be unlikely to give much insight. Table 3 shows the results of the tests for changes in wavelet correlation. None of these changes are near 300 m where the sampling was interrupted. At the coarsest scale (128 m) there are no change points, and only pH is significantly correlated with transformed Azotobacter CFUs. The multiresolution analyses show that the smooth

3.3. Wavelet analyses Fig. 4 shows the wavelet variances for each variable plotted against the exponent of the scale parameter (exponent j denotes scale parameter 2j  2 m). Note that for each variable the coarser scale parameters had the largest variance components. There was considerable short-range variation in the transformed Azotobacter CFUs, the variance for the finest scale was larger than that for intermediate scales, and a similar pattern was seen for loss on ignition. Fig. 5 shows the wavelet correlations between the log10 Azotobacter CFUs and each soil variable. For soil pH the overall wavelet correlations were only significantly different from zero at the two coarsest scales (64 and 128 m). These correlations were large (0.72 and 0.87, respectively). It is not unexpected that Azotobacter CFUs were larger in the soils of greater pH. What is clear is that the observed short-range variations in pH do not seem to be important. This could be, in part, because this short-range variation included significant measurement error, but it might also reflect the effect of other limiting factors at these scales. The finest scale correlations between loss on ignition and log10 Azotobacter CFUs were small. At the third and fourth dilation of the wavelet the correlations were larger and significantly different from zero. The largest correlations were at the two coarsest scales, but given the relatively small effective sample size at these scales the correlations were not significantly different from zero. The

Variance

100

10

(a)

10

1

1

0.1

0.1

(c)

0.01 1

0 10000

Variance

305

2

3

5

4

6

0 10000

(b)

1

2

3

2

3

4

5

6

(d)

100

100

1

1 0

1

2

3

4

Scale exponent

5

6

0

1

4

5

6

Scale exponent

Fig. 4. Wavelet variances, with 95% confidence intervals, for (a) log10 Azotobacter CFUs, (b) soil water content, (c) soil pH, (d) loss on ignition. The 95% confidence intervals are shown by horizontal bars.

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

306 1.0

Wavelet correlation

(a)

0.5

0.0

-0.5

-1.0 0

Wavelet correlation

1.0

1

2

3

4

5

6

1

2

3

4

5

6

(b)

0.5

0.0

-0.5

correlated across the transect, and here the main feature was the common peak around 175 m. At scale 64 m there were significant correlations between Azotobacter CFUs and soil water content and loss on ignition, but only from around 200 m to the end of the transect. Similarly at scale 32 m in the first 200 m or so of the transect, CFUs were significantly correlated only with soil pH. In summary, at the coarsest resolved scales soil pH was the critical factor that determined Azotobacter abundance across the transect as a whole. At scale 32 m, variations in soil pH were correlated with transformed Azotobacter CFUs only in the first part of the transect where the soil was drier and losses on ignition were smaller. The overall (stationary) wavelet correlations can disguise relationships between variables that are local; for example, between CFUs and soil water. The wavelet correlation at scale 64 m was moderate (0.5). However, Table 3 shows that there was a strong correlation (0.76) from 256 m onward, and a very weak correlation at positions before 256 m on the transect. There were strong correlations between the transformed Azotobacter CFUs and both soil water content and loss on ignition at scales 16, 32 and 64 m for the second half of the transect. At scale 32 m, there was a switch from a strong correlation of transformed Azotobacter CFUs with pH to around 200 m on the transect and similarly a strong correlation with soil water and loss on ignition at points to the end of the transect.

-1.0 0

4. Discussion

1.0

Wavelet correlation

(c) 0.5

0.0

-0.5

-1.0 0

1

2

3

4

5

6

Scale exponent

Fig. 5. Wavelet correlations of Azotobacter CFUs with (a) soil pH, (b) water content and (c) loss on ignition. The 95% confidence intervals are shown by horizontal bars.

components of these variables at the 64 m scale (which include the 128 m detail component that cannot be resolved by the shift-averaging procedure) have common features, i.e., the peaks near positions 60 and 175 m (Fig. 6). This shows that the intermittent appearance of Azotobacter in the first half of the transect was probably due to the presence of small patch of soil with elevated pH. At 64 m, pH and transformed Azotobacter CFUs are uniformly

We have quantified the spatial covariation of abundance of Azotobacter CFUs with some physico-chemical soil properties. We accept that the variables that we have measured have limitations. For example, CFUs are known to represent only a subset of extant bacteria in soil. Similarly, loss on ignition is a measure of organic matter content of the soil, and is less easily interpreted than organic carbon content, and we would prefer a measure of the water content of the soil at a fixed water potential. However, this was designed to be an exploratory study to show the value of wavelet analysis for studying the spatial variation of biological properties of the soil. Wavelet analysis requires substantial data sets, and so we used relatively cheap and rapid analyses to determine these properties at many points. We have analysed the data using standard geostatistical methods (which assume stationarity) which have been used in the past for biological soil variables (Ritz et al., 2004; Rossi et al., 1995). We have also used the wavelet transform to describe the spatial variation. Under the assumption of stationarity in the variance the results of the geostatistical and wavelet analyses are necessarily consistent. However, as we discuss below, the wavelet analyses show that the correlation between the variables changes in space, which is inconsistent with the assumption of stationarity.

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

(a) 0

(c) 100

200

300

400

500

0

(b) 0

307

100

200

300

400

500

200

300

400

500

(d) 100

200

300

400

500

Distance from origin of transect (m)

0

100

Distance from origin of transect (m)

Fig. 6. Multiresolution analyses of (a) log10 Azotobacter CFUs, (b) soil water content, (c) soil pH, (d) loss on ignition. The lowest component on each graph is the smooth component at the 64 m scale. Detail components of increasingly fine scale are stacked above this. These are displaced by an arbitrary amount for clarity. The vertical scale bars are (i) 2 units of log10 Azotobacter CFUs, (ii) 50 g water 100 g–1 dry soil, (iii) 1 soil pH unit, (iv) 10 g loss on ignition 100 g–1 dry soil. See text for further details.

The multiresolution analyses (Fig. 6) show how the variation of each property changes between contrasting soil types and land uses. They also indicate that the variation was intermittent at the finer scales, which is inconsistent with the assumption of stationarity. While we may compute experimental variograms of these data (Fig. 2), these combine measurements of the variability at particular lag distances from all parts of the transect, under the assumption of stationarity. The multiresolution analyses cast doubt on the plausibility of this assumption, particularly for soil water content and loss on ignition which vary little before 200 m and substantially thereafter. The experimental variograms do not give a true description anywhere on the transect. We have investigated the spatial covariation of Azotobacter with these soil properties, using geostatistics and wavelet transforms. Codispersion coefficients, a

geostatistical method, were computed as well as wavelet correlations at several spatial scales and locations. These analyses of the transect data lead us to the following key observations. 1. The relationship between Azotobacter CFUs and soil properties depended on scale. The wavelet correlations generally increased with the wavelet scale parameter (no significant correlations occurred at the finest scale). The codispersion coefficients also show that the correlation between Azotobacter CFUs and soil properties increased with scale (Fig. 3). 2. The relationship between Azotobacter CFUs and soil properties might also be local, as is seen from the wavelet correlations with soil water content and loss on ignition at scales up to 64 m. These were significant only in the second half of the transect.

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

308

Table 3 Sections of the transect deemed uniform with respect to the correlation between log-transformed Azotobacter CFUs and soil properties under the changedetection procedure Soil property

Scale exponent

Segment (m)a

Wavelet correlationb

Upper 95% limit

Lower 95% limit

pH

1 2 3 4

All All All 0–224 226–516 All All

0.10 0.09 0.16 0.64 0.20 0.72 0.87

0.24 0.28 0.39 0.87 0.29 0.90 0.98

0.05 0.11 0.09 0.18 0.60 0.34 0.36

All 0–262 246–284 286–516 All 0–238 240–516 0–210 212–516 All

0.03 0.05 0.68 0.29 0.27 0.08 0.73 0.47 0.85 0.55

0.12 0.32 0.99 0.00 0.49 0.46 0.90 0.53 0.97 0.91

0.18 0.23 0.67 0.54 0.03 0.58 0.39 0.92 0.42 0.31

All All 0–254 256–516 0–236 238–516 0–204 206–516 All

0.10 0.24 0.03 0.77 0.12 0.73 0.46 0.76 0.52

0.24 0.41 0.33 0.88 0.43 0.90 0.54 0.94 0.91

0.05 0.04 0.37 0.58 0.60 0.39 0.92 0.21 0.34

5 6 Loss on ignition

1 2

3 4 5 6 Water content

1 2 3 4 5 6

a

‘‘All’’ indicates that no change points were detected. Correlations in bold are significant at 95%.

b

3. The soil pH appears to be the principal factor correlated with Azotobacter CFUs at the coarsest two scales; it is the only soil property that was significantly correlated with CFUs at the coarsest scale. The soil pH explained most of the variation at intermediate scales in the first (arable) half of the transect. Soil pH has often been shown to be a critical environmental factor that influences the occurrence and numbers of Azotobacter (Channal et al., 1989; Paul and Newton, 1961; Kole et al., 1988; Martyniuk and Martyniuk, 2003). 4. Significant local correlations between Azotobacter CFUs and soil water content or loss on ignition did not occur in the arable soils, but only in the pasture and woodland soils where the values of both these variables were larger than in the arable soil. It appears that pH limited Azotobacter CFUs in the arable area. As the pH was not limiting in the second half of the transect, the amount of soil water and the loss on ignition (where the values of both these variables were larger) appeared to determine in part the number of CFUs at intermediate scales. These findings, particularly 2–4, support our original hypothesis that biological variables, such as abundance of particular organisms, and the soil properties that might affect these biological variables, will show complex covariation in space, of a form inconsistent with the

assumptions of intrinsic stationarity that underly geostatistics. This study has highlighted the importance of accounting for spatial scale to identify soil factors that explain variations in biological soil variables. In this case, we have found convincing, and biologically plausible, correlations between broader trends in soil properties and the Azotobacter CFUs. The short-range variation in CFUs, which the wavelet variances show to be significant, is not explained by the soil properties considered here. In part this may be due to sampling effects. Despite the effort to homogenize the soil sample, it is inevitable that there will be differences between the soil extracted for the microbial study and that analysed for physical and chemical properties. Soil properties other than those that we measured might also be important at the finest scales, such as those associated with macro- and micro-structure, and biological factors such as the effects of competitors. The wavelet analysis also shows that correlations are not spatially uniform, but change between different environments (the geostatistical analysis could not show this). This is not surprising biologically, but most methods of spatial analysis (and a fortiori all non-spatial methods) cannot account for such non-stationarity. The spatial variation of soil physical and chemical properties means that the most limiting factors on biological processes are likely to change

ARTICLE IN PRESS R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

in space, and this requires an analysis which can identify local relationships. Wavelets are ideally suited to this purpose. These results show how the wavelet transform can give insight into the spatial covariation of soil properties that geostatistical analysis can not. It is beyond the scope of this paper to consider whether this has implications for geostatistical prediction (kriging). This problem has not been widely addressed. We note that there are some developments of standard geostatistics that attempt to account for non-stationarity in the variance, albeit requiring somewhat arbitrary decisions on how to subdivide or window the data (Haas, 1995; Fuentes, 2001). To conclude, the wavelet transform and associated analyses can be used to study complex and spatially nonstationary relationships between soil variables. Given that the variables that we have measured have limitations we recommend the use of more refined variables to investigate their spatial variation using the wavelet transform. We suggest that such analyses are necessary if our understanding of the nature and origins of biological variability and ecological complexity in soil systems is to advance. Acknowledgements We are grateful to our colleagues, Professor Margaret Oliver, Dr Ben Marchant and Dr Tom Bishop, for helpful comments on this paper. RML’s contribution to this work was funded by the UK Biotechnology and Biological Sciences Research Council (BBSRC) through its core grant to Rothamsted Research. References Channal, H.T., Alagawadi, A.R., Bharamagoudar, T.D., Udupa, S.G., Patil, P.L., Mannikeri, I.M., 1989. Azotobacter population as influenced by soil properties in some soils of North Karnataka. Current Science 58, 70–71. Clayden, B., Hollis, J.M., 1984. Criteria for differentiating soil series. Soil Survey Technical monograph No. 17. Soil Survey of England and Wales, Harpenden. Coifman, R.R., Donoho, D.L., 1995. Translation-invariant denoising. In: Antoniadis, A., Oppenheim, G. (Eds.), Lecture Notes in Statistics, Number 103: Wavelets and Statistics 125–150. Springer, New York. Cressie, N.A.C., 1993. Statistics for Spatial Data. Wiley, New York. Daubechies, I., 1988. Orthonormal bases of compactly supported wavelets. Communications in Pure and Applied Mathematics 41, 909–996. Franklin, R.B., Blum, L.K., McComb, A.C., Mills, A.L., 2001. A geostatistical analysis of small-scale spatial variability in bacterial abundance and community structure in salt marsh creek bank sediments. FEMS Microbiology Ecology 42, 71–80. Fuentes, M., 2001. A high frequency kriging approach for non-stationary environmental processes. Environmetrics 12, 469–483. Go¨rres, J.H., Dichiaro, M.J., Lyons, J.B., Amador, J.A., 1998. Spatial and temporal patterns of soil biological activity in a forest and an old field. Soil Biology & Biochemistry 30, 219–230. Goovaerts, P., Webster, R., 1994. Scale-dependent correlation between topsoil copper and cobalt concentrations in Scotland. European Journal of Soil Science 45, 79–95.

309

Grundmann, G.L., Debouzie, D., 2000. Geostatistical analysis of the  distribution of NH+ 4 and NO2 -oxidizing bacteria and serotypes at the millimeter scale along a soil transect. FEMS Microbiology Ecology 34, 57–62. Haas, T.C., 1995. Local prediction of a spatio-temporal process with an application to wet sulfate deposition. Journal of the American Statistical Association 90, 1189–1199. Jarvis, R.A., 1968. Soils of the Reading district. Memoirs of the Soil Survey of Great Britain: England & Wales 13 [Sheet 268], Soil Survey of England and Wales, Harpenden. Kay, F.F., 1936. A Soil Survey of The University Farm, Sonning, Berkshire. University of Reading, Faculty of Agriculture and Horticulture. Kole, M.M., Page, W.J., Altosaar, I., 1988. Distribution of Azotobacter in Eastern Canadian soils and in association with plant rhizopheres. Canadian Journal of Microbiology 34, 815–817. Lark, R.M., 2005. Analysis of complex soil variation using wavelets. In: Grunwald, S. (Ed.), Environmental Soil-Landscape Modeling—Geographic Information Technologies and Pedometrics. CRC Press, New York pp. 343–371. Lark, R.M., Webster, R., 1999. Analysis and elucidation of soil variation using wavelets. European Journal of Soil Science 50, 185–206. Lark, R.M., Webster, R., 2001. Changes in variance and correlation of soil properties with scale and location: analysis using an adapted maximal overlap discrete wavelet transform. European Journal of Soil Science 52, 547–562. Lark, R.M., Webster, R., 2004. Analysing soil variation in two dimensions with the discrete wavelet transform. European Journal of Soil Science 55, 777–797. Lark, R.M., Cullis, B.R., Welham, S.J., In press. On spatial prediction of soil properties in the presence of a spatial trend: the empirical best linear unbiased predictor (E-BLUP) with REML. European Journal of Soil Science, in press. Lark, R.M., Milne, A.E., Addiscott, T.M., Goulding, K.T., Webster, P., OFlaherty, S.O., 2004. Scale- and location-dependent correlation of nitrous oxide emissions with soil properties: an analysis using wavelets. European Journal of Soil Science 55, 601–610. Martyniuk, S., Martyniuk, M., 2003. Occurrence of Azotobacter spp. in some Polish soils. Polish Journal of Environmental Studies 12, 371–374. Matheron, G., 1962. Traite´ de Ge´ostatistique Applique´, Tome 1. Memoires du Bureau de Recherches Ge´ologiques et Minie´res, Paris. Nunan, N., Wu, K., Young, I.M., Crawford, J.W., Ritz, K., 2002. In situ spatial patterns of soil bacterial populations, mapped at multiple scales, in an arable soil. Microbial Ecology 44, 296–305. Nunan, N., Wu, K., Young, I.M., Crawford, J.W., Ritz, K., 2003. Spatial distribution of bacterial communities and their relationships with the micro-architecture of soil. FEMS Microbiology Ecology 44, 203–215. Paul, E.A., Newton, J.D., 1961. Studies of aerobic non-symbiotic nitrogen-fixing bacteria. Canadian Journal of Microbiology 7, 7–13. Percival, D.P., 1995. On estimation of the wavelet variance. Biometrika 82, 619–631. Percival, D.P., Guttorp, P., 1994. Long-memory process, the Allan variance and wavelets. In: Foufoula-Georgiou, E., Kumar, P. (Eds.), Wavelets in Geophysics. Academic Press, New York pp. 325–344. Percival, D.B., Walden, A.T., 2000. Wavelet Methods for Time Series Analysis. Cambridge University Press, Cambridge. Redding, T.E., Hope, G.D., Schmidt, M.G., Fortin, M.J., 2004. Analytical methods for defining stand-clearcut edge effects demonstrated for N mineralization. Canadian Journal of Forest Research 34, 1018–1024. Ritz, K., McNicol, W., Nunan, N., Grayston, S., Millard, P., Atkinson, D., Gollotte, A., Habeshaw, D., Boag, B., Clegg, C.D., Griffiths, B.S., Wheatley, R.E., Glover, L.A., McCaig, A.E., Prosser, J.I., 2004. Spatial structure in soil chemical and microbiological properties in an upland grassland. FEMS Microbiology Ecology 49, 191–205.

ARTICLE IN PRESS 310

R.J. Barnes et al. / Soil Biology & Biochemistry 39 (2007) 295–310

Rossi, J.P., Lavelle, P., Tondoh, J.E., 1995. Statistical tool for soil biology.10. Geostatistical analysis. European Journal of Soil Biology 31, 173–181. Rowell, D.L., 1994. Soil Science: Methods and Applications. Longman Scientific, Harlow, UK. Si, B.C., Farrell, R.E., 2004. Scale-dependent relationship between wheat yield and topographic indices: a wavelet approach. Soil Science Society of America Journal 68, 577–587. Si, B.C., Zeleke, T.B., 2005. Wavelet coherency analysis to relate saturated hydraulic properties to soil physical properties. Water Resources Research 41, Article No. W11424. Smith, J.L., Halvorson, J.L., Bolton, H., 1994. Spatial relationships of soil microbial biomass and C and N mineralization in a semi-arid shrubsteppe ecosystem. Soil Biology & Biochemistry 26, 1151–1159.

Stark, C.H.E., Condron, L.M., Stewart, A., Di, H.J., O’Callaghan, M., 2004. Small-scale spatial variability of selected soil biological properties. Soil Biology & Biochemistry 36, 601–608. Webster, R., Oliver, M.A., 1992. Sample adequately to estimate variograms of soil properties. Journal of Soil Science 43, 177–192. Webster, R., Oliver, M.A., 2001. Geostatistics for Environmental Scientists. Wiley, Chichester. Whitcher, B.J., Guttorp, P., Percival, D.B., 2000a. Wavelet analysis of covariance with application to atmospheric time series. Journal of Geophysical Research—Atmospheres 105 (D11), 14941–14962. Whitcher, B., Guttorp, P., Percival, D.B., 2000b. Multiscale detection and location of multiple variance changes in the presence of long memory. Journal of Statistical Computation and Simulation 68, 65–88.