ELSEVIER
Journal of Geochemical Exploration 63 (1998) 189–201
Leveling geochemical data between map sheets

Bahram Daneshfar a,*, Eion Cameron a,b

a Ottawa-Carleton Geoscience Center, Dept. of Earth Sciences, University of Ottawa, Ottawa, Ont. K1N 6N5, Canada
b Geografix, 865 Spruce Ridge Road, Carp, Ontario, K0A 1L0, Canada

Received 11 December 1997; revised version received 27 April 1998; accepted 4 May 1998
Abstract

Geochemical surveys are frequently assembled into larger, regional compilations. In some cases a boundary shift in the values for one or more elements may be observed at the join of adjacent surveys. This indicates that data for the affected elements are not consistent between the surveys. Where the same sampling medium has been used, the shift may be due to different crews/organizations, who varied in their sampling techniques. However, most commonly the shift is due to imperfect calibration of the analytical method used for samples from the different surveys. For example, there may have been a lack of proper analytical standardization between survey programs. To carry out leveling, bands are established on either side of the boundary between two surveys that show a shift. It is desirable that the bands have a close match in terms of geology and physiography. A quantitative method is presented to estimate the optimum width for these bands. Quantiles of the data within each band are calculated. The quantile pairs are plotted in X–Y space and a line is fitted to express the relationship between the pairs of quantiles. The equation of this line is used to correct the shift between the two surveys. This method is tested on data for Mo in stream sediments, and pH of stream water, from two National Geochemical Reconnaissance Surveys in British Columbia.

© 1998 Elsevier Science B.V. All rights reserved.

Keywords: geochemical surveys; leveling; molybdenum; pH
* Corresponding author. E-mail: [email protected]
© 1998 Elsevier Science B.V. All rights reserved. 0375-6742/98/$19.00 PII: S0375-6742(98)00015-6

1. Introduction

In regional compilations of geochemical data, where the surveys have been carried out at different times and/or by different organizations, a boundary shift may be observed at the join between the surveys. Even where the same sampling medium has been used, there may be variations in technique employed by different sampling crews. However, the major source of differences is usually in the analytical methods. The exact same procedures may not have been used during sample preparation and analysis, or the analytical instruments or settings may not have been identical. Even where the analytical methods were the same, there may have been an absence of standardization, or imperfect standardization between laboratories. Procedures to minimize such problems have been discussed by Darnley et al. (1995).

In this paper a method is described that adjusts the data between surveys that show a ‘shift’ along their boundary. The method is based on a suggestion for parametric leveling by R.G. Garrett and N. Gustavsson (in Darnley et al., 1995). Bands are selected on either side of the shifted border(s) and quantiles of the data for a given element are calculated for
B. Daneshfar, E. Cameron / Journal of Geochemical Exploration 63 (1998) 189–201
each band. A line is then fitted to the quantile pairs, which provides an equation relating the data for the element between surveys. This equation may then be used to adjust or ‘correct’ the data of the two surveys so that the boundary shift is reduced or eliminated. This procedure may then be sequentially applied to all surveys within the compiled area where such problems exist.
2. Methodology

Assume that there is a shift in the value of a certain element at the border of two surveys, A and B (Fig. 1). As a first step, data in the bands ‘a’ and ‘b’ of maps A and B are extracted and quantiles of the data for each band are calculated. Calculated quantiles could be 0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95. The quantile is the value corresponding to a given fraction of the data and is an alternative expression for percentile when the result is expressed in fractions. For example, the 0.5 quantile is the 50th percentile or the median.

Fig. 1. Adjoining surveys (A and B) show a shift in the value of a given element at their boundary, although the geological units are similar on each side of the boundary. Quantiles of the data from bands ‘a’ and ‘b’ are used to correct for the boundary shift.

Fig. 2 shows alternative scenarios for the relationship of quantiles between bands ‘a’ and ‘b’. In most cases the data will be plotted on logarithmic scales in order to meet homogeneity of variance considerations important in the analysis of trace element data. In Fig. 2a there is a 1:1 relationship between quantiles, indicating that the data are fully consistent between the two surveys. If such data extended over 2 or more orders of magnitude and were plotted without logarithmic transformation, the points would fan out at higher concentrations. Fig. 2b indicates that the data from band ‘b’ are consistently high by a constant amount. The distributions shown in Fig. 2c require a multiplier to correct the data from one survey to the other, while Fig. 2d shows a situation where both a multiplier and a constant are required. In situations like Fig. 2e, leveling is not possible because there is no quantifiable relationship between the quantile pairs.

In all three scenarios requiring a correction, a regression equation is calculated between quantiles for bands ‘a’ and ‘b’. In this paper band ‘a’, plotted along the y axis, is taken to be the ‘true’ set of data, and band ‘b’ and the remaining area of sheet B are corrected to be consistent with band ‘a’ and sheet A. It is assumed that those features that might affect the geochemical distributions, such as rock type or the physiographic environment, are broadly similar in the bands being compared; for example, that the survey boundary is not a geological boundary.

The choice of which sheet or data set to consider ‘true’ is a matter of judgment. In general, the data that have the best analytical standardization and quality control should be selected. If these do not provide a clear criterion, data sets that are consistent over the largest part of the total area should be chosen.

Should initial X–Y quantile plots exhibit non-linearity, transformations can be used to improve the linearity. If a logarithmic rule or a Poisson counting process is involved, a logarithmic or square root transformation may be adequate. If a logarithmic transformation is required, then all calculations need to be completed in logarithms, with a final conversion back to the original scale.

In calculating a leveling line that best fits the X–Y distribution of quantile pairs, the precision of quantiles as estimators should be considered, since this is a function of the number of samples used to calculate individual quantiles. The central quantile (0.5) is estimated with greatest precision, and the precision then falls off toward the outer quantiles (Wilks, 1962). It may thus be prudent to use a weighted regression so that the outer quantiles, which have the greater error, have less influence in estimating the regression coefficients. In some cases it may also be desirable to compute confidence intervals around the regression line.
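The quantile-pair leveling described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`quantile`, `weighted_line`, `leveling_line`, `level`) are hypothetical, quantiles are estimated by linear interpolation between order statistics, and the weights are assumed to be normal-distribution ordinates scaled to 1 at the median, which is only one possible way to down-weight the outer quantiles.

```python
from statistics import NormalDist

# Quantile fractions suggested in the text.
PROBS = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]


def quantile(values, p):
    """Estimate the p-quantile by linear interpolation between order statistics."""
    xs = sorted(values)
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def weighted_line(x, y, w):
    """Weighted least-squares fit of y = slope * x + intercept."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def leveling_line(band_a, band_b, probs=PROBS):
    """Fit band 'a' quantiles (y, taken as 'true') against band 'b' quantiles (x)."""
    nd = NormalDist()
    # Assumed weights: normal ordinate at each quantile, scaled to 1 at the median.
    w = [nd.pdf(nd.inv_cdf(p)) / nd.pdf(0.0) for p in probs]
    qa = [quantile(band_a, p) for p in probs]
    qb = [quantile(band_b, p) for p in probs]
    return weighted_line(qb, qa, w)


def level(values_b, slope, intercept):
    """Apply the leveling line to every value of sheet B."""
    return [slope * v + intercept for v in values_b]
```

For data that need a logarithmic transformation, the same functions would be applied to the log-transformed values and the leveled result converted back to the original scale.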
Fig. 2. Various scenarios that may occur when quantiles are plotted for pairs of bands at the boundary between surveys. (a) No shift between two maps, (b) shift by a positive constant, (c) shift by a multiplier, (d) shift by a constant and multiplier, (e) no quantifiable relation between quantile pairs. (After Darnley et al., 1995).
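The scenarios of Fig. 2 can be summarized with a single straight-line model. As an illustrative notation (the symbols $m$ and $c$ are introduced here and are not used in the original text), let $q_a$ and $q_b$ be corresponding quantiles in bands ‘a’ and ‘b’:

$$
q_a = m\,q_b + c, \qquad
\begin{array}{lll}
\text{(a)} & m = 1, & c = 0 \\
\text{(b)} & m = 1, & c \neq 0 \\
\text{(c)} & m \neq 1, & c = 0 \\
\text{(d)} & m \neq 1, & c \neq 0
\end{array}
$$

In case (e) no line of this form describes the quantile pairs. If the line is fitted to log10-transformed data, $\log_{10} q_a = m \log_{10} q_b + c$, the correction of a value $x$ from sheet B on the original scale becomes $x_{\text{leveled}} = 10^{c}\,x^{m}$.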
2.1. Selection of bands

One of the most important considerations is that the geology be reasonably similar for bands that are being compared. This affects the selection of the length and width of the bands. Fig. 3 shows four map sheets with dotted lines as their boundaries. Boundaries between numbered geological units are shown as solid lines. In this figure the central map sheet is considered to have a shift for element x relative to the other three sheets. For the bands shown, (a-W, b-W) are the best pair, since the areas of different geological units are approximately equal. In contrast, bands (a-N, b-N) are poor, because the geological units are different on either side of the border. Bands (a-E, b-E) provide an intermediate case in terms of suitability. In the case of the E band pairs, the length of the bands may be decreased (Fig. 3) to ensure that they are more representative. Choice of an appropriate width for the bands is also important. If the width is too small, there will not be enough samples; if it is too wide, the areas included may be too diverse in geology or physiography.
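As a rough illustration of how samples might be assigned to bands of a chosen width and length, the sketch below assumes a straight north–south boundary at easting `boundary_x` and samples stored as `(x_km, y_km, value)` tuples. These names and the coordinate layout are hypothetical; real survey boundaries would normally be handled in a GIS, and the optional `y_range` argument simply stands in for shortening a band to improve geological similarity.

```python
def select_bands(samples, boundary_x, width_km, y_range=None):
    """Split samples into band 'a' (west of the boundary) and band 'b' (east).

    samples: iterable of (x_km, y_km, value) tuples (hypothetical layout).
    y_range: optional (y_min, y_max) limiting the length of the bands.
    """
    band_a, band_b = [], []
    for x, y, value in samples:
        if y_range is not None and not (y_range[0] <= y <= y_range[1]):
            continue  # outside the chosen band length
        offset = x - boundary_x
        if -width_km <= offset < 0:
            band_a.append(value)
        elif 0 <= offset <= width_km:
            band_b.append(value)
    return band_a, band_b


# Made-up sample points, for illustration only.
samples = [(-12.0, 5.0, 7.1), (-4.0, 8.0, 7.3), (3.0, 2.0, 6.8),
           (9.0, 40.0, 6.5), (18.0, 6.0, 6.9)]
band_a, band_b = select_bands(samples, boundary_x=0.0, width_km=10.0,
                              y_range=(0.0, 20.0))
print(band_a, band_b)  # [7.3] [6.8]
```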
In the following example for leveling, a quantitative method is tested and applied to estimate a width needed for bands ‘a’ and ‘b’. Then maps with a boundary shift are leveled.

2.2. Data sets

The data sets used here are Mo in stream sediments and moss mat samples and the pH of stream water taken from National Geochemical Reconnaissance (NGR) Surveys in British Columbia. The NGR stream sediment and water surveys in British Columbia have been carried out jointly by the Geological Survey of Canada (GSC) and the British Columbia Geological Survey (BCGS). For the first several years the survey program was managed by the GSC, with sample collection, sample preparation and chemical analysis carried out under contract by private companies. Quality control was the responsibility of the manager (Friske and Hornbrook, 1991). From 1987 the manager changed from GSC to BCGS. The initial survey sheets managed by BCGS were given NGR Open File numbers. Latterly, the
Fig. 3. Geological considerations in selecting the length of bands ‘a’ and ‘b’, assuming that the map sheet in the middle has a boundary shift relative to the surrounding sheets.
surveys were given the designation BCRGS (British Columbia Regional Geochemical Survey). Table 1 shows the managers and contractors involved in the survey sheets (Fig. 4) discussed in this paper.
For map sheets 865, 867 and all others managed by the Geological Survey of Canada, Mo was determined by reacting a 0.5 g sample with 1.5 ml concentrated HNO3 and allowing it to stand at room
Table 1
Organizational summary for the geochemical map sheets considered in this paper

| Map sheet | Year | Sample collection | Sample preparation | Sample analysis (sediments; water) |
|---|---|---|---|---|
| 2038 a, 2039 a, 2040 a | 1988 | McElhanney Engineering Services Ltd. | Kamloops Research and Assay Ltd. | Chemex Laboratories; Barringer Magenta Ltd. |
| BCRGS 34 a | 1991 | McElhanney Engineering Services Ltd. | Rossbacher Laboratory Ltd. | Barringer Magenta Ltd. |
| 2182 a, 2183 a, 2184 a | 1989 | MPH Consulting Ltd. | Rossbacher Laboratory Ltd. | Barringer Magenta Ltd. |
| 865 and 867 b | 1981 | ROOI Enterprises Ltd. | Kamloops Research Assay and Laboratory Ltd. | Chemex Labs Ltd.; Bondar-Clegg and Co. Ltd. |

a Managed by the British Columbia Geological Survey.
b Managed by the Geological Survey of Canada.
Fig. 4. Map sheets of the National Geochemical Reconnaissance (NGR) with their Open File numbers.
temperature overnight (Lynch, 1992). Then, sample test tubes were placed in a hot water bath at room temperature, heated to 90°C and allowed to digest at 90°C for 30 minutes. Another 0.5 ml of concentrated HCl was added and the digestion continued at 90°C for an additional 90 minutes with gentle mixing every 30 minutes. After removing the test tubes from the water bath and cooling to room temperature, 8 ml of 1250 µg/ml Al solution was added, and the sample solution was diluted to 10 ml and mixed well using a vortex mixer. Solids in the test tube were allowed to settle for two hours. Molybdenum was determined by atomic absorption spectrophotometry (AAS) using a single-element molybdenum hollow cathode lamp and a laminar flow nitrous oxide–acetylene burner. During the analysis a standard solution was analyzed after every tenth sample. Different standard solutions were used throughout the run so that the concentration range of a batch of unknown samples was represented by replicate standard readings.

For map sheets 2038, 2039, 2040, 2182, 2183, 2184, and BCRGS 34, Mo was determined in stream sediment samples as follows. A 3 ml aliquot of HNO3 was added to a 0.5 g sample and allowed to sit overnight. Then 1 ml HCl was added and the sample was left at 90°C in a water bath for 2 hours, then cooled. Aluminum solution was added and the whole left for 2 hours. The resulting solution was aspirated into an atomic absorption spectrophotometer (AAS) using an air–acetylene burner and standard solutions for calibration, with 1 ppm as the detection limit. The quality control procedures for these map sheets were also similar to those indicated above.

For all surveys, water samples were collected into 250 ml polyethylene bottles and were sent to the contractor’s laboratory. The pH was measured in the laboratory: an aliquot was poured into a separate container, and the pH was determined using a glass–calomel electrode system and reported to the nearest 0.1 pH unit. Since the procedure for sampling and pH measurement for all surveys is similar, the major reason
for the existing shifts in pH may be slightly different instrumental calibrations.
3. Two examples of leveling pH values

The first example is for shifts in the pH of stream water between the group of NGR sheets 2182, 2183 and 2184, which were sampled and pH measured in the same year (1989), and between the adjacent sheets BCRGS 34 and NGR 867, which were sampled in 1991 and 1981 respectively. Fig. 5a shows that there is a sharp change in pH value at the boundary between sheets 2039 and 34 and between 867 and 2039.

To estimate the appropriate width of bands that should be constructed at the boundary between 2039/34 and 867/2039, bands of variable width (exploratory bands) were created within sheets 34 and 2039. Those within sheet 34 are 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50 km wide and are shown in Fig. 6. Since these bands are taken from one map sheet, quantiles should plot close to a line where Y = X.

Two methods were used to assess whether the quantile plots were close to the line Y = X. In the first, a parameter is defined that can estimate the range of optimal widths based on the differences between the quantile pairs. The parameter D, as defined here, serves only as a guide to estimate the range of widths for leveling bands. If the distribution of data in each band of the pair is identical, then the quantile pairs will have the same value. The value D is calculated as:

$$D = \sum_i w_i \left[ (q_i)_e - (q_i)_{e'} \right]^2$$

where $w_i$ is the weight assigned to the ith quantile, $(q_i)_e$ is the ith quantile in band e, and $(q_i)_{e'}$ is the ith quantile in band e′. Weights are introduced to favor quantile pairs at or near the median, based on the ordinates of the normal distribution (Krumbein and Graybill, 1965). The ordinate of a normal distribution for the 0.5 quantile (median) is 0.399, and the others were scaled relative to 0.399 (Table 2).
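The weights of Table 2 and the parameter D can be reproduced with the Python standard library. The sketch below is illustrative only: the function names `normal_ordinate_weights` and `d_parameter` are hypothetical, and the weights are obtained by evaluating the standard normal density at the normal deviate of each quantile fraction and dividing by the density at the median.

```python
from statistics import NormalDist

PROBS = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]


def normal_ordinate_weights(probs):
    """Ordinates of the standard normal distribution, scaled to 1 at the median."""
    nd = NormalDist()
    peak = nd.pdf(0.0)  # about 0.399, the ordinate at the 0.5 quantile
    return [nd.pdf(nd.inv_cdf(p)) / peak for p in probs]


def d_parameter(quantiles_e, quantiles_e2, weights):
    """Weighted sum of squared differences between paired quantiles."""
    return sum(w * (qe - qe2) ** 2
               for w, qe, qe2 in zip(weights, quantiles_e, quantiles_e2))


weights = normal_ordinate_weights(PROBS)
print([round(w, 2) for w in weights])
# [0.26, 0.44, 0.7, 0.87, 0.97, 1.0, 0.97, 0.87, 0.7, 0.44, 0.26]
```

Identical quantile sets give D = 0, and a difference confined to an outer quantile contributes less to D than the same difference at the median.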
Fig. 5. Leveling pH of water samples for map sheets 2038, 2039, 2040 and 2182, 2183, 2184: (a) before leveling, (b) after leveling, (c) plutonic units in the area.
Variations of D with width of exploratory bands within sheets 34 and 867 are shown in Fig. 7. A quadratic polynomial curve is fitted to these D values. The range of widths consistent with low values for D is considered most appropriate as the initial estimate. A study of Fig. 7 shows that 10 to 30 km can be an initial estimate for the width of the leveling bands, because in this range both sheets have low values for D. Among the widths that give minimum D values, the smaller widths are preferred because, as the dimensions of the bands increase, the geology becomes more dissimilar.

In the second method, to assess the optimal width more precisely, regression lines were fitted to the quantile plots for each pair of exploratory bands and 95% confidence limits were drawn on either side of the regression line. Using this approach, only exploratory bands of 5, 10 and 15 km width had confidence limits that covered the Y = X line (Fig. 8); the bands of 20, 25 and 50 km width, for example, did not meet this criterion. The statistical estimate of quantiles improves as the number of samples used in their calculation increases, which affects the estimate of confidence limits. Note in Fig. 8 that the width of the confidence limits for the 15 km bands is less than that for either the 5 or 10 km bands. On this basis, 15 km bands can be considered optimal for this map sheet. Larger widths increase the number of samples used to estimate quantiles, but the increased widths introduce differences in geology between the band pairs.

Similar analysis within sheet 867 also showed the 15 km width to be optimal. Thus in constructing bands to assess the boundary shifts at 34/2039 and 867/2039, widths of 15 km were used and sheets 34 and 867 were considered to have the ‘true’ values, i.e., they are the ‘a’ bands.
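The second method can be sketched as a check of whether the line Y = X falls inside the 95% confidence limits of a fitted regression line. The text does not state how the limits were computed, so this sketch assumes the usual pointwise confidence band for the mean response of an ordinary least-squares line; the function name `covers_identity` is hypothetical, and the Student t value is hard-coded for 11 quantile pairs (9 degrees of freedom), so another value would be needed for a different number of quantiles.

```python
import math

# Two-sided 95% Student t value for 11 - 2 = 9 degrees of freedom.
T_975_DF9 = 2.262


def covers_identity(x, y, t_value=T_975_DF9):
    """Return True if Y = X lies inside the 95% confidence band of the
    least-squares line at every plotted quantile pair."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # residual standard error
    for xi in x:
        fitted = intercept + slope * xi
        half_width = t_value * s * math.sqrt(1.0 / n + (xi - mx) ** 2 / sxx)
        if abs(fitted - xi) > half_width:
            return False  # Y = X falls outside the band at this quantile
    return True
```

A width would then be retained only if `covers_identity` returns True for the quantile pairs of its two exploratory bands.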
These calculations show that the pH of samples within sheet 2039 can be corrected by:

$$\mathrm{pH}_{34} = 0.77\,\mathrm{pH}_{2039} + 1.87$$

$$\mathrm{pH}_{867} = 0.89\,\mathrm{pH}_{2039} + 1.14$$

These values are similar and suggest that the pH within sheets 34 and 867 are similar, which is also apparent in the lack of shifts in the area occupied by sheets 34, 774 and 867 (Fig. 5). To assess this similarity, confidence limits were plotted for each
Fig. 6. Exploratory bands placed inside map sheet BCRGS 34.
Fig. 7. Variations of D with width (km) of exploratory bands inside sheets 867 and BCRGS 34.
of these regression lines (Fig. 9). As Fig. 9 shows, each of these lines is located within the confidence limits of the other line, and the two could be considered indistinguishable at the 95% confidence level. Thus
Table 2
Ordinates of the normal distribution and weights assigned to quantiles

| Quantiles | Ordinate of the normal distribution | Weight |
|---|---|---|
| 0.50 | 0.399 | 1 |
| 0.40 and 0.60 | 0.386 | 0.97 |
| 0.30 and 0.70 | 0.348 | 0.87 |
| 0.20 and 0.80 | 0.280 | 0.70 |
| 0.10 and 0.90 | 0.175 | 0.44 |
| 0.05 and 0.95 | 0.103 | 0.26 |
a final stage was to combine the bands flanking 34 and 867 with 2039 and estimate the corrections using both data sets simultaneously:

$$\mathrm{pH}_{34,867} = 0.84\,\mathrm{pH}_{2039} + 1.93$$

It was also considered that, since sheets 2038, 2039 and 2040 were sampled and measured in the same year, this correction should be applied to all three sheets.

Sheets 2182, 2183 and 2184 were sampled and measured in the same year. Bands of 15 km width between sheets 867 and 2184 were used for estimating quantile pairs and produced the equation:

$$\mathrm{pH}_{867} = 0.94\,\mathrm{pH}_{2184} + 1.13$$

This correction was then applied to sheets 2182, 2183 and 2184.
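Applying the two pH corrections sheet by sheet could look like the following sketch. The coefficients are those quoted above; the dictionary `PH_CORRECTIONS`, the function `level_ph` and the use of string sheet identifiers are hypothetical choices made for illustration, and sheets taken as ‘true’ are simply passed through unchanged.

```python
# Leveling coefficients (slope, intercept) quoted in the text, keyed by map sheet.
PH_CORRECTIONS = {
    "2038": (0.84, 1.93),
    "2039": (0.84, 1.93),
    "2040": (0.84, 1.93),
    "2182": (0.94, 1.13),
    "2183": (0.94, 1.13),
    "2184": (0.94, 1.13),
}


def level_ph(sheet, ph):
    """Return the leveled pH; sheets without a correction are returned unchanged."""
    slope, intercept = PH_CORRECTIONS.get(sheet, (1.0, 0.0))
    return slope * ph + intercept
```

For example, a measured pH of 7.0 would be leveled to about 7.81 on sheet 2039 and to about 7.71 on sheet 2184, while a value from sheet 867 would be left as measured.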
Fig. 8. Regression lines and their 95% confidence limits for 5, 10, 15 and 50 km exploratory bands inside BCRGS 34.
Fig. 9. Similarity of regression models derived for leveling between map sheets 2039/34 and between 2039/867.
When all of the above corrections are made, the map is replotted (Fig. 5b). The boundary shifts that were apparent in Fig. 5a are no longer present. The resulting trends in pH follow the geology of the
region, with the lowest pH values corresponding to the Coast Plutonic Complex and additional smaller areas of felsic rock in the Insular Belt of Vancouver Island (Fig. 5c).
4. An example of leveling Mo values

Fig. 10a shows the same area first discussed for leveling pH in southwestern British Columbia. Here a shift exists for Mo in stream sediments and moss mat samples between map sheets 2182, 2183, 2184 and all the adjacent sheets. As Table 1 shows, map sheets 2182, 2183 and 2184 were sampled and analyzed in the same year (1989) by the same company and laboratory, both of which differ from those for the adjacent sheets.

For leveling Mo values, a log10 transformation was applied prior to the regression modeling. A set of exploratory bands of width 5, 10, 15, 20, 25, 30, 35, 40, 45, and 50 km was selected from inside sheet 2039. Fig. 11 shows the regression lines and their confidence limits for the 5, 10, 15 and 50 km exploratory bands. For all exploratory bands, Y = X
Fig. 10. (a) Original values of Mo in map sheets 2182, 2183, and 2184. (b) Leveled map of Mo.
Fig. 11. Regression lines and their 95% confidence limits for 5, 10, 15 and 50 km exploratory bands inside 2039.
lies inside the confidence limits of the regression lines. For the 10 and 30 km bands the estimated quantiles of log10(Mo) were equal. Therefore their regression lines did not have any confidence bounds, and the D value for the 10 km and 30 km bands is 0, which is the minimum possible value. Here, 10 km was judged the best estimate for the width of bands ‘a’ and ‘b’ that would result in a successful leveling. Only the northern borders of 2183 and 2184 were selected for leveling with the southern border of 2039 and 867 using 10 km bands.

To check whether there is any significant difference if bands ‘a’ and ‘b’ are selected only between 2183 and 2039, or only between 2184 and 867, the following analysis was undertaken. The regression model appropriate for correcting the Mo of samples within sheet 2183 to sheet 2039 equivalent values was estimated as:

$$\log(\mathrm{Mo})_{2039} = 0.62\,\log(\mathrm{Mo})_{2183} - 0.15$$

The correction between sheets 2184 and 867 is:

$$\log(\mathrm{Mo})_{867} = 0.57\,\log(\mathrm{Mo})_{2184} - 0.08$$

These values are similar and suggest that the Mo measurement levels within sheets 2039 and 867 are similar, which is also apparent in the lack of shifts in the area occupied by sheets 2038, 2039, 2040 and 867 (Fig. 10). To check the similarity of these two regression models, their confidence limits were plotted (Fig. 12); each regression line is located within the confidence limits of the other, so these lines could be considered indistinguishable at the 95% confidence level. Here a final stage was to combine the bands ‘a’ and ‘b’ flanking 2184/867 and 2183/2039:

$$\log(\mathrm{Mo})_{2039,867} = 0.97\,\log(\mathrm{Mo})_{2183,2184} - 0.24$$
It was also considered that, since sheets 2182, 2183 and 2184 were sampled and analyzed in the same year, this correction should be applied to all three sheets. After leveling by this last equation, all log(Mo) values were transformed from the log scale back to the original scale, and all leveled Mo values less than 1 were set to 1, which is the minimum reported value for Mo.
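The sequence of log transformation, leveling, back-transformation and flooring at the minimum reported value can be sketched as follows. The function name `level_mo` is hypothetical. The intercept is taken here as −0.24: this sign is an inference, adopted because leveled values can fall below 1 only if the intercept is negative, given that the lowest reported input value is 1 (log10 = 0).

```python
import math

# Combined leveling line in log10 units; the negative sign of the intercept is inferred.
SLOPE, INTERCEPT = 0.97, -0.24
MIN_REPORTED_MO = 1.0  # minimum reported Mo value


def level_mo(mo):
    """Level one Mo value from sheets 2182-2184 and return it on the original scale."""
    leveled_log = SLOPE * math.log10(mo) + INTERCEPT
    # Back-transform, then set values below the minimum reported value to that minimum.
    return max(MIN_REPORTED_MO, 10.0 ** leveled_log)
```

Under these assumptions an original value of 10 would be leveled to roughly 5.4, and an original value of 1 would remain at 1 after flooring.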
Fig. 12. Similarity of regression lines derived for leveling between map sheets 2183/2039 and between 2184/867.
Using these leveled values, together with the original values of Mo for the adjacent balanced map sheets, the map of Fig. 10b was created; it does not show the boundary shifts that were apparent in Fig. 10a. To make this map, the same imaging levels as in Fig. 10a were applied.
5. Conclusion

To select the most appropriate length of the bands needed for leveling, the geological situation must first be considered. To estimate the best range of widths for leveling bands, a series of exploratory bands with different widths should be selected. These bands must be inside a balanced area with geology similar to that of the border with a boundary shift. For each pair of these exploratory bands with identical widths, quantiles of the data must be estimated. D is defined here as a parameter with which to check the difference between quantile pairs for exploratory bands. This parameter can be used as an initial estimator of the range in which the width of the leveling bands should be selected. It must be followed by fitting a regression line to the estimated quantile pairs and plotting confidence limits of this line. The location of the Y = X line relative to these confidence limits can help in the decision as to which width or range of widths is best. The width whose confidence limits include the Y = X line, which has the narrowest confidence limits and also has a low value for D, would be an appropriate choice for the width of bands ‘a’ and ‘b’.

After the band width has been estimated, quantiles of the data inside these bands must be estimated. The regression line that fits these quantile pairs is the leveling line. In calculating this leveling line, it is preferable to assign more weight to the central quantiles close to the median. In this way it is possible to level the shifted values to the adjacent sheets and make a balanced map without a boundary shift at the contacts of map sheets.
Acknowledgements

We wish to acknowledge Dr. Robert Garrett of the Geological Survey of Canada for his general guidance at the beginning of this study. We also thank Martin McCurdy and John Lynch of the Geological Survey of Canada, Mr. Wayne Jackaman in the British Columbia branch of the Geological Survey and Dr. Ray Lett in the Analytical Sciences Unit of the Geological Survey branch in British Columbia for supplying information about National Geochemical Reconnaissance Surveys and the analytical procedures. We appreciate the comments on the manuscript by Drs. Eric Grunsky and Robert Garrett.
References

Darnley, A.G., Bjorkund, A., Bolviken, B., Gustavsson, N., Koval, P.V., Plant, J.A., Steenfelt, A., Tauchid, M., Xuejing, Xie, Garrett, R.G., Hall, G.E.M., 1995. A Global Geochemical Database for Environmental and Resource Management, Publication 19. UNESCO, Paris, 122 pp.
Friske, P.W.B., Hornbrook, E.H.W., 1991. Canada’s National Geochemical Reconnaissance Programme. Trans. Inst. Min. Metall. Sect. B Appl. Earth Sci. 100, B47–B56.
Krumbein, W.C., Graybill, F.G., 1965. An Introduction to Statistical Models in Geology. McGraw-Hill, New York, 475 pp.
Lynch, J.J., 1992. Requests for proposals of analysis of water and stream sediment samples. National Geochemical Reconnaissance (NGR) Surveys Geological Survey of Canada and BC Regional Geochemical Survey.
Wilks, S.S., 1962. Mathematical Statistics. Wiley, New York, 644 pp.