0743-9547/91 $3.00 + 0.00 Pergamon Press plc
Journal of Southeast Asian Earth Sciences. Vol. 5, Nos 1-4, pp. 317-320, 1991
Printed in Great Britain
Data processing of surface chemical exploration
YANG WENKUAN
P.O. Box 12, Jiangling County, Hubei Province, People’s Republic of China
Abstract: This paper deals with the sampling network and data processing of surface chemical exploration for oil and gas. The author proposes a new method of multivariate analysis, the ellipsoid method, for data processing. It is suggested that the equilateral triangle network may greatly reduce the exploration cost, and that the “three-point method” may reduce the random fluctuation of the observed values.
INTRODUCTION
The surface chemical exploration for oil and gas is based on the assumption that small molecules in oil-gas pools may reach the crust surface by diffusion, and that hydrocarbons and oil-field water may migrate to the surface through passageways such as faults. This assumption means that the study of these allochthonous matters in groundwater, unconsolidated sediment and rock outcrops enables us to obtain some information concerning the location of the pools. The main method of chemical exploration is to determine the areas with anomalous index values. From the assumption and method mentioned above, the following conclusions can be drawn directly:
(1) If the overlying sediments are very dense or very thick, the diffusion resulting from the concentration gradient will be very weak. Under this condition, if there is no passageway near the pool, it will be very difficult to find an obvious abundance-anomalous area on the surface. That is to say, the better the conservation condition of the pool, the more difficult its discovery will be.
(2) If the allochthonous matters from the pool cannot be well conserved at the surface, no anomalous area can form.
(3) Abundance contours of a surface halo resulting from pure diffusion are generally smoother than the outline of the pool itself. Therefore, any abundance contour taken as the outline of the anomalous area always differs from the true outline of the pool. In fact, most of the anomalies are caused by migration instead of diffusion, so anomaly outlines are mainly controlled by faults instead of pool boundaries.
(4) If the mature source-rocks are exposed on the surface owing to tectonic movement, then pseudoanomalies (non-pool-generated anomalies) irrelevant to the pools may be found.
(5) The geochemical indices for surface exploration must be able to reflect the distribution of hydrocarbons. An index that is polysemous or easily affected by surface factors (such as air temperature, rainfall, topography, mineral composition of outcrop, biochemical process, artificial pollution, etc.) cannot provide any reliable information.
(6) There is a risk of omitting pool-generated anomalies if the minimum distance between sampling points (i.e. the side-line length of the lattice of the exploration network) is too large. But closely spaced sampling points will raise the exploration costs.
(7) “Anomaly” is relative to “background”. The background value of a geochemical index in one exploration region may differ from that in another region. Even in a region without any petroleum accumulation, it is inevitable that the observed index values at some sampling points are greater than the mean value of the whole region. In this case, some non-pool-generated anomalous regions will inevitably be delineated. On the other hand, if the sampling region lies entirely within the region over an undiscovered pool, the delineated anomalous area will be much smaller than the pool.
For the reasons given above, it is incorrect to put an “equals” sign between an anomalous area and a pool. But it is also incorrect to deny the practical effect of surface chemical exploration, which can, after all, provide some information about the pools. Furthermore, if some improvements are made in index selection, network layout, data processing and result explanation, the exploration quality will be improved and the exploration cost per unit area may be reduced. I shall focus my discussion on network layout and data processing below.
LAYOUT OF NETWORK
The exploration network of sampling points should be able to control small anomalous regions. If small anomalous regions are under control, then the larger anomalous regions will certainly be under control. By “under control”, I mean that there is at least one sampling point within the anomalous region. Both experience and calculation show that most small anomalous regions are roughly circular or elliptical. Each anomalous region has a maximum inscribed circle (Fig. 1). If this theoretical circle (the “special circle”) is under control, then there is at least one sampling point in the anomalous region.
Fig. 1. Equilateral triangle network. Spots represent the sampling points, dashed lines show the exploration network, vertical lines indicate the oil-gas pool, horizontal lines indicate the special circle. A hexagon whose apices are the seats of central virtual points shows the average controlling area per sampling point.
The traditional networks are square, rectangular or irregular, but the best one should be an equilateral triangle network. Suppose that there are a certain number of pool-generated anomalous regions in a given exploration region with a total area of $S$, and that the special circle of the minimum pool-generated anomalous region that we want to control has a radius $R$ and an area $A = \pi R^2$. If we use an equilateral triangle network for controlling this special circle, then the side-line length of the lattice is $1.732R$, the area of the lattice is $1.299R^2$, the average controlling area per sampling point is $2.598R^2$, the number of points is $0.385S/R^2$, and the total length of the explorer’s route is $0.667S/R$. That is to say, the number of points of our method is merely 76.98% of that of a square network, and the average controlling area per point increases by 29.90%. Thus it is possible that the cost decreases by about one-fourth. The side-line length $L$ of an equilateral triangle depends upon the exploration requirement. The relationship between side-line length $L$ and special-circle area $A$ is:
$$L^2 = \frac{3A}{\pi}, \qquad A = \tfrac{1}{3}\pi L^2.$$
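The network figures quoted in this section can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original paper: the function names are hypothetical, and the square-network side length of $\sqrt{2}R$ is an assumption inferred from the 76.98% ratio given in the text (it is the side for which the circumscribed circle of each square cell has radius $R$).

```python
import math

def triangular_network(R, S):
    """Equilateral-triangle network controlling a special circle of radius R
    over an exploration region of total area S."""
    side = math.sqrt(3) * R                   # side-line length, 1.732 R
    cell_area = math.sqrt(3) / 4 * side ** 2  # one triangular lattice, 1.299 R^2
    area_per_point = 2 * cell_area            # two triangles per point, 2.598 R^2
    n_points = S / area_per_point             # 0.385 S / R^2
    route = n_points * side                   # 0.667 S / R (one side walked per point)
    return {"side": side, "cell_area": cell_area,
            "area_per_point": area_per_point,
            "n_points": n_points, "route": route}

def square_network(R, S):
    """Square network for comparison.
    Assumption: side = sqrt(2) * R, so each cell's circumscribed circle has radius R."""
    side = math.sqrt(2) * R
    area_per_point = side ** 2                # 2 R^2
    return {"side": side, "area_per_point": area_per_point,
            "n_points": S / area_per_point}

tri = triangular_network(R=1.0, S=100.0)
sq = square_network(R=1.0, S=100.0)
print("point-count ratio:", tri["n_points"] / sq["n_points"])
print("area-per-point gain:", tri["area_per_point"] / sq["area_per_point"] - 1)
```

Under these assumptions the point-count ratio is $4/(3\sqrt{3})$ and the gain in controlling area per point is $3\sqrt{3}/4 - 1$, which agree with the 76.98% and 29.90% stated in the text.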
DATA PROCESSING
Distribution of abundance values
When I researched the petroleum generation of the Upper Palaeozoic Group in Hunan Province, I found that abundance values of disseminated organic matters in ancient sedimentary rocks obeyed a gamma distribution. Now some evidence shows that the values of most geochemical indices used for surface chemical exploration can also be described by the same distribution function.
Reduction of random fluctuation of observed values of an index
It is probable that each observed value of an index contains random noise resulting from factors such as instrument quality, capability of operators, meteorological conditions and so on. Such random noise may be reduced (or even eliminated) in a way that I call the “three-point method”. As mentioned above, if the radius of the special circle that we want to control is $R$, the side-line length of the triangular lattice should be $1.732R$. At the centre (centre of gravity) of any triangular lattice, we can establish an assumptive sampling point called the “central virtual point” (Fig. 1). If the average of the three values observed at the apices of the triangular lattice is used as the virtual value $v$ of the central virtual point, the $v$ value will be more representative than the observed values at the apices. This method is better than others such as trend surface analysis. If the exploration region is broad, the number of central virtual points will be nearly twice the number of true sampling points. The virtual points constitute regular hexagons with a side-line length of $R$ and an area of $2.598R^2$.
Delimitation of single-index anomalous region
Anomalies exist objectively, but the threshold of anomalous values (i.e. the upper limit of “normal values”) decides the total area of delimited anomalous regions. If the threshold is too high, some pools may be omitted; if it is too low, some non-pool-generated anomalous regions will be included. As stated above, the values of a geochemical index approximately obey a gamma distribution. Consequently, the total area of delimited anomalous regions will make up 10-16% of the total area of the exploration region when the threshold is $v_m + s_v$ (mean value plus standard deviation), or 2-5% when the threshold is $v_m + 2s_v$ (mean value plus double standard deviation). That is to say, once the threshold of anomalous values has been determined, the percentage of anomalous area is also determined.
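The link between the threshold and the percentage of anomalous area can be illustrated numerically. The sketch below is an illustration only: the paper does not state the gamma parameters, so integer shape values (the Erlang case, which has a closed-form tail) and unit scale are assumed here, and the function name is hypothetical.

```python
import math

def erlang_tail(shape, m):
    """P(X > mean + m * sd) for a gamma distribution with integer shape
    and unit scale (assumed parameters, not taken from the paper)."""
    mean, sd = shape, math.sqrt(shape)
    t = mean + m * sd
    # Erlang survival function: exp(-t) * sum_{i < shape} t^i / i!
    return math.exp(-t) * sum(t ** i / math.factorial(i) for i in range(shape))

for shape in (1, 2, 5, 20):
    print(shape,
          round(erlang_tail(shape, 1), 4),   # fraction above mean + sd
          round(erlang_tail(shape, 2), 4))   # fraction above mean + 2 sd
```

For shape 1 (the exponential case) the two fractions are $e^{-2}$ and $e^{-3}$, roughly 13.5% and 5.0%. As the shape grows, the fractions approach the normal-distribution values of about 15.9% and 2.3%.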
In order to guarantee that all sufficiently large pool-generated anomalous regions are under control, we may use a smaller threshold instead of the bigger one. Moreover, in the process of determining the threshold value, it is also necessary to consider geological-geochemical conditions such as the shape of the possible trap, the position of faults, and the distribution and maturity of source rock. It must be pointed out that an anomalous region delimited by one index may differ, in geochemical meaning as well as in position, from an anomalous region delimited by another index. Perhaps some anomalies relate to oil-pools, some relate to gas-pools, and the rest are non-pool-generated. We have to use the effective values of central virtual points for delimiting the single-index anomalous regions.
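The three-point method and the single-index threshold described in this section can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the point labels and the sample values are hypothetical, and the population standard deviation is used because the paper does not say which estimator is intended.

```python
from statistics import mean, pstdev

def virtual_values(observed, triangles):
    """Three-point method: the virtual value of each central virtual point
    is the average of the values observed at the three apices."""
    return [mean(observed[p] for p in tri) for tri in triangles]

def single_index_anomalies(values, m=1.0):
    """Flag values above the threshold mean + m * standard deviation.
    Assumption: population standard deviation (pstdev)."""
    threshold = mean(values) + m * pstdev(values)
    return [v > threshold for v in values]

# Hypothetical observations at four sampling points forming two triangles.
observed = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 10.0}
triangles = [("a", "b", "c"), ("b", "c", "d")]
v = virtual_values(observed, triangles)
print(v)
print(single_index_anomalies([1.0, 1.0, 1.0, 1.0, 5.0], m=1.0))
```

A lower `m` delimits a larger anomalous area, which matches the trade-off between omitted pools and non-pool-generated anomalies discussed in the text.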
Delimitation of comprehensive anomaly
The comprehensive anomalies can be delimited with either virtual values of central virtual points or observed values of real sampling points. Here let us deal with the virtual values only. Before delimitation of comprehensive anomalous regions it is necessary to calculate the mean value $v_m$ and standard deviation $s_v$ for each index, and to regularize the original data (virtual values of central points). A desirable method for the data regularization is to divide the difference between the virtual value $v$ and the minimum virtual value $v_{\min}$ by the difference between the mean virtual value $v_m$ and $v_{\min}$, and to take the result
$$x = \frac{v - v_{\min}}{v_m - v_{\min}}$$
as a new virtual value used for the delimitation of anomalous regions. After regularization of the original data, the mean value $x_m$ of $x$ is 1.000, and the standard deviation $s_x$ is equal to the standard deviation $s_v$ of $v$ divided by $v_m - v_{\min}$, namely $s_x = s_v/(v_m - v_{\min})$.
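The regularization step and its two stated properties (mean equal to 1 and standard deviation scaled by the same divisor) can be checked with a short sketch. The function name and sample values are hypothetical, and the population standard deviation is assumed.

```python
from statistics import mean, pstdev

def regularize(values):
    """Regularize one index: x = (v - v_min) / (v_mean - v_min).
    Requires that the values are not all equal (v_mean != v_min)."""
    v_min, v_m = min(values), mean(values)
    return [(v - v_min) / (v_m - v_min) for v in values]

v = [2.0, 4.0, 6.0, 12.0]   # hypothetical virtual values of one index
x = regularize(v)
print(x)                     # regularized values
print(mean(x))               # mean of x
print(pstdev(x), pstdev(v) / (mean(v) - min(v)))  # the two should agree
```

Because the transformation is linear, the same relation holds whichever standard deviation estimator is used, provided it is applied consistently to $v$ and $x$.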
If $n$ indices, $x_1, x_2, \ldots, x_j, \ldots, x_n$, are used for the surface chemical exploration, then the regularized values, $x_{1k}, x_{2k}, \ldots, x_{jk}, \ldots, x_{nk}$, may be considered as the coordinates of the point $k$ (namely the point No. $k$) in this $n$-dimensional space. Thus each central virtual point is at a definite position in the $n$-dimensional space.
Now we must give a threshold value to each index respectively. Because any index may be different from another in geochemical significance, it is reasonable that the threshold values, $x_{10}, x_{20}, \ldots, x_{j0}, \ldots, x_{n0}$, are varied. That is to say, every index may get a weight factor (see below). For example, it is permissible (even necessary) to give a lower threshold $x_m + 2s_x$ (mean value plus double standard deviation) to an important index (e.g. benzene abundance), and to give a higher threshold $x_m + 3s_x$ (mean value plus triple standard deviation) to a relatively unimportant index.
Let us imagine that there is a theoretical “$n$-dimensional ellipsoid” whose semi-axes are $rx_{10}, rx_{20}, \ldots, rx_{n0}$, where $r$ is not less than 1.00. This ellipsoid may be described by the equation
$$\frac{x_1^2}{(rx_{10})^2} + \frac{x_2^2}{(rx_{20})^2} + \cdots + \frac{x_n^2}{(rx_{n0})^2} = 1.$$
We adopt the rule that a central virtual point outside the ellipsoid is considered an anomalous point, and that the points inside the ellipsoid are not anomalous. Any central virtual point $k$ has a definite $R_k$ value calculated by the equation
$$R_k^2 = w_{1k}^2 + w_{2k}^2 + \cdots + w_{jk}^2 + \cdots + w_{nk}^2,$$
where $w_{jk}$ is an abbreviation of the ratio $x_{jk}/x_{j0}$, and $1/x_{j0}$ may be considered as the weight coefficient of $x_{jk}$.
When $R_k$ is greater than $r$, the point $k$ will be anomalous because it is bound to be outside the ellipsoid. Conversely, when $R_k$ is less than $r$, the point $k$ lies inside the ellipsoid and is not anomalous. When the $R_k$ values of all central virtual points have been calculated and the $R_k$ contour map has been drawn, the comprehensive anomalous regions may immediately be obtained. The total area of anomalous regions depends on the $r$ value. From the above-mentioned equation, $R_k$ is geometrically the radius vector of a point $k'$ whose coordinates are $w_{1k}, w_{2k}, \ldots, w_{jk}, \ldots, w_{nk}$. We can think of the point $k'$ as the point $k$ itself with the coordinate scale in the $n$-dimensional space changed: $x_{jk}$ is transformed into $w_{jk}$ by being divided by $x_{j0}$. A simpler way of determining anomalous points is to divide the difference $v - v_{\min}$ by the value $v_m + Ms_v - v_{\min}$, and directly obtain
$$w_{jk} = \frac{v - v_{\min}}{v_m + Ms_v - v_{\min}}.$$
The coefficient $M$ depends on the importance of the index. The above-mentioned method for delimitation of comprehensive anomalous regions may be called “ellipsoid multivariate analysis” or the “ellipsoid method”.
When the anomalous regions are delimited, we must make a reasonable explanation for the generation of the delimited anomalous regions using all geological, geochemical and geophysical materials in order to put forward a proposal for further exploration. In the process of explanation, it is necessary to consider the position of anomalous regions in the $n$-dimensional space. The calculating method is as follows. Assume that there are $m$ central virtual points in a delimited anomalous region; then there are $m$ equations:
$$\begin{aligned}
R_1^2 &= w_{11}^2 + w_{21}^2 + \cdots + w_{n1}^2,\\
R_2^2 &= w_{12}^2 + w_{22}^2 + \cdots + w_{n2}^2,\\
&\;\;\vdots\\
R_m^2 &= w_{1m}^2 + w_{2m}^2 + \cdots + w_{nm}^2.
\end{aligned}$$
If the size of the vector sum of $R_1, R_2, \ldots, R_m$ is $R_0$ in the $n$-dimensional space, then the cosine values of the azimuth angles of $R_0$ are
$$\begin{aligned}
c_1 &= (w_{11} + w_{12} + \cdots + w_{1m})/R_0 = \sum_{k=1}^{m} w_{1k}/R_0,\\
c_2 &= (w_{21} + w_{22} + \cdots + w_{2m})/R_0 = \sum_{k=1}^{m} w_{2k}/R_0,\\
&\;\;\vdots\\
c_n &= (w_{n1} + w_{n2} + \cdots + w_{nm})/R_0 = \sum_{k=1}^{m} w_{nk}/R_0.
\end{aligned}$$
Considering $c_1^2 + c_2^2 + \cdots + c_n^2 = 1$, we have
$$R_0^2 = \Bigl(\sum_{k=1}^{m} w_{1k}\Bigr)^2 + \Bigl(\sum_{k=1}^{m} w_{2k}\Bigr)^2 + \cdots + \Bigl(\sum_{k=1}^{m} w_{nk}\Bigr)^2.$$
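The ellipsoid test and the direction cosines used in this section can be sketched in a few lines. This is an illustrative sketch only, assuming the definitions $w_{jk} = x_{jk}/x_{j0}$, $R_k^2 = \sum_j w_{jk}^2$ and $c_j = \sum_k w_{jk}/R_0$; the function names, the threshold values and the sample points are hypothetical.

```python
import math

def radius(x_point, x_thresholds):
    """R_k for one central virtual point: sqrt of the sum of (x_jk / x_j0)^2."""
    return math.sqrt(sum((x / x0) ** 2 for x, x0 in zip(x_point, x_thresholds)))

def is_anomalous(x_point, x_thresholds, r=1.0):
    """A point is anomalous when it lies outside the ellipsoid, i.e. R_k > r."""
    return radius(x_point, x_thresholds) > r

def direction_cosines(points, x_thresholds):
    """c_j = (sum over k of w_jk) / R_0 for the points of one anomalous region,
    where R_0 is the length of the vector sum of the weighted points."""
    n = len(x_thresholds)
    sums = [sum(p[j] / x_thresholds[j] for p in points) for j in range(n)]
    r0 = math.sqrt(sum(s * s for s in sums))
    return [s / r0 for s in sums]

thresholds = [2.0, 4.0]          # hypothetical thresholds x_10, x_20
print(is_anomalous([2.0, 4.0], thresholds))   # w = (1, 1), R_k = sqrt(2)
print(is_anomalous([0.6, 1.6], thresholds))   # w = (0.3, 0.4), R_k = 0.5
print(direction_cosines([[3.0, 0.4], [2.6, 0.8]], thresholds))
```

A cosine close to 1 for one index indicates that the anomalous region is dominated by that index, which is the information used when explaining the origin of the region.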
The $c$ values are important for discriminating pool-generated anomalous regions from non-pool-generated anomalous regions, because the $c_j$ value reflects the relationship between the anomalous region and the index $x_j$.
CONCLUSIONS
Application of the equilateral triangle network enables us to cut down the exploration costs. The side-line length of the triangular lattices must be 1.732 times the radius of the inscribed circle (special circle) of the minimum anomalous region that we want to control.
If $n$ geochemical indices are used for surface chemical exploration, then any central virtual point (or real sampling point) will be at a definite position in the $n$-dimensional space. By the ellipsoid method proposed in this paper, the point positions in this space may be used for discriminating points of comprehensively anomalous regions from others. All virtual values of central virtual points (or observed values obtained from real sampling points) should be regularized before determination of comprehensively anomalous points.
Acknowledgement: The author is greatly indebted to Prof. Liu Guangding, academician of Academia Sinica, who read the manuscript and gave warm encouragement.