Robust statistical methods for impulse noise suppressing of spread Spectrum induced polarization data, with application to a mine site, Gansu Province, China Weiqiang Liu, Rujun Chen, Hongzhu Cai, Weibin Luo PII: DOI: Reference:
S0926-9851(16)30119-7 doi: 10.1016/j.jappgeo.2016.04.020 APPGEO 2985
To appear in:
Journal of Applied Geophysics
Received date: Revised date: Accepted date:
17 April 2015 21 April 2016 28 April 2016
Please cite this article as: Liu, Weiqiang, Chen, Rujun, Cai, Hongzhu, Luo, Weibin, Robust statistical methods for impulse noise suppressing of spread Spectrum induced polarization data, with application to a mine site, Gansu Province, China, Journal of Applied Geophysics (2016), doi: 10.1016/j.jappgeo.2016.04.020
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT Robust statistical methods for impulse noise suppressing
US CR IP T
of Spread Spectrum Induced Polarization Data, with application to a mine site, Gansu Province, China Weiqiang Liua
Hongzhu Caic
Weibin Luod
Institute of Geophysical and Geochemical Exploration CAGS, Chinese
MA N
a
Rujun Chen*b
Academy Of Geological Sciences, Langfang, Hebei 065000, PR China b
School of Geoscience and Info—physics, Central South University, Changsha,
University of Utah, Consortium for Electromagnetic Modeling and Inversion
TE
c
D
Hunan Province 410083,PR China.
Gansu Non-ferrous Geological Survey Institute, Lanzhou, Gansu 730000, PR China
AC
d
CE P
(CEMI), Salt Lake City, Utah, USA
ABSTRACT In this paper, we investigated the robust processing of noisy spread spectrum induced polarization (SSIP) data. SSIP is a new frequency domain induced polarization method that transmits pseudo-random m-sequence as source current where m-sequence is a broadband signal. The potential information at multiple frequencies can be obtained through in measurement. Removing the *Corresponding author E-mail addresses:
[email protected] (Chen Ru-jun),
[email protected] (Liu Wei-qiang).
ACCEPTED MANUSCRIPT
noise is a crucial problem for SSIP data processing. Considering that the
US CR IP T
ordinary mean stack and digital filter if not capable of reducing the impulse noise effectively in SSIP data processing, the impact of impulse noise will remain in the complex resistivity spectrum that will affect the interpretation of profile anomalies. We implemented a robust statistical method to SSIP data processing. The robust least-squares regression is used to fit and remove the linear trend
MA N
from the original data before stacking. The robust M estimate is used to stack the data of all periods. The robust smooth filter is used to suppress the residual noise
D
for data after stacking. For robust statistical scheme, the most appropriate
TE
influence function and iterative algorithm are chosen by testing the simulated
CE P
data to suppress the outliers’ influence. We tested the benefits of the robust SSIP data processing using examples of SSIP data recorded in a test site beside a mine
AC
in Gansu Province, China.
Keywords: Spread Spectrum Induced Polarization, Robust statistical methods, Gaussian random noise, Spike impulsive noise, Complex resistivity.
1. INTRODUCTION Induced Polarization (IP) methods performed in time or frequency domain are widely used at various stages of geophysical prospecting (Weller et al, 2010; Ntarlagiannis, 2010; Ryan, 2012). There have been great developments of IP in the aspects of the system of instruments, forward modeling and inversion
ACCEPTED MANUSCRIPT methods (Khesin, 1997; Titov et al., 2002; Loke et al, 2006). Spread spectrum induced polarization (SSIP) is a new frequency-domain induced polarization
US CR IP T
(FDIP) method (Xi et al., 2013; Xi et al., 2014). For SSIP, the transmitter transmits pseudo-random m-sequence waveforms as source current, and the receiver measures the potential difference between two electrodes.
Pseudo-random m-sequence is a kind of binary sequence generated using maximal linear feedback shift registers. This sequence is spectrally flat similar to
MA N
the random signal. Practical application for m-sequence is in digital communication systems that employ direct-sequence spread spectrum and
D
frequency-hopping spread spectrum transmission systems (Golomb, 1994). Its
TE
application in electrical prospecting is emerging recently (Duncan et al., 1980, Ziolkowski et al., 2007). The major reason for transmitting pseudo-random
CE P
m-sequence as source current is the broadband property of this signal. For SSIP, the complex resistivity at multiple frequencies can be obtained simultaneously.
AC
The underground geological structures can be estimated by calculating the IP parameters further (Chen et al., 2010). However, SSIP measurements are contaminated by noise from a number of different sources, including telluric current, industrial current interference, major power lines and so on. When a survey area is located beside a mine, there will be spike impulse interference appearing continuously, the amplitude of which is much higher than the Gaussian random noise and normal SSIP signal. This interference is mainly from the impulsive stray current and high-amplitude sferics pulses (Li et al., 1994). Stray current is caused by the electric power
ACCEPTED MANUSCRIPT equipment. Sferics pulses are produced by lightning strokes propagating in the ionosphere-earth waveguide cavity (Jewell and Ward, 1963). There is a huge
US CR IP T
frequency overlap between SSIP signal and impulsive interference. If the interference is not effectively suppressed, the calculation of complex resistivity will be greatly affected.
Mean stack and digital filter are the common methods to reduce noise in FDIP data processing (Zonge and Wynn, 1975; Xi et al., 2014). Mean stack can
MA N
effectively suppress the Gaussian random noise. If there is a little impulsive noise, this noise can also be reduced by prolonging observation time and
D
increasing stack times. But when the interference is serious and impulsive noise
TE
is present continuously, mean stack is not effective. Digital filter based on the Discrete Fourier Transform (DFT) is widely used in a variety of FDIP
CE P
instruments, which can suppress many kinds of noise by designing high-pass, low-pass, band-pass and notch filter. However there is a huge overlap between
AC
SSIP signal and impulsive interference in the frequency domain, which leads to unsatisfactory result of digital filtering for impulsive noise suppressing. Many other practical methods have been developed for noise reduction in geophysics. These methods include the mean filter (Canales, 1984) and the median filter (Gallagher and Wise, 1981). Although the mean filter can suppress the Gaussian random noise effectively, it cannot suppress the spike impulsive noise. In contrast, the median filter can suppress the spike impulsive noise effectively but it cannot suppress the Gaussian random noise. Additionally, wavelet analysis is also applied in FDIP data processing, which may cause signal distortion when
ACCEPTED MANUSCRIPT impulsive noise is serious. Robust processing of noisy (SSIP) data is investigated in this paper. Robust
US CR IP T
Statistical Methods are a generalized conception (Huber, 1981). This method has been widely used in geophysical signal processing. It is routinely used for estimating high-quality Magnetotelluric (MT) transfer functions (Egbert, et al., 1986; Chave, et al., 1987; Chave, 1989). In addition, the technique is also applied for gravity and magnetic interpretation (Silva and Cutrim, 1989),
MA N
controlled-source electromagnetic (CSEM) processing (Streich et al., 2013), transient electromagnetic (TEM) measurements processing (Buselli et al., 1996),
D
seismic data processing (Sabbione, 2013; Li et al., 2014) and geophysical
TE
monitoring problems (Lyubushin et al., 2002). In contrast, there have been few robust method applications in FDIP data processing. Whereas traditional method
CE P
cannot suppress spike impulsive interference effectively, we implemented robust statistical method to the SSIP data processing to improve the data quality.
AC
Specific applications include the following three aspects: robust least-squares regression is used to remove the linear trend from the original data. Robust M estimate is used to stack the data of all periods. Robust smooth filter is used to suppress noise in data after stacking. We first introduced the basic principle of robust M estimate for one-dimensional data and robust regression for multidimensional data. For robust statistical scheme, the most appropriate influence function and iterative algorithm are chosen by testing simulated data to suppress the outliers’ influence. Then we introduced the basic principle and data processing procedure of SSIP. Finally, we demonstrated the effectiveness of this
ACCEPTED MANUSCRIPT processing approach. We carried SSIP testing for a 2 D survey line beside a mine in Gansu
US CR IP T
Province of northwest of China. There is a disseminated lead and zinc ore fault initially detected underground in the survey area, The SSIP profile was centered on the top of the target using intermediate gradient array protocol with multiple current electrode distance. The transmitter transmits 5-order m-sequence as source current, and the receiver array is based on GPS sync and ZigBee network, 25
MA N
receivers or 100 channels are supported for data acquisition. We applied robust methods to improve the data quality. Gaussian random noise and peak noise is
suppressed effectively. There is a good agreement between the apparent
TE
D
resistivity anomalies and the geological target body with high polarization and low resistivity, which can help to improve the interpretation of previous
CE P
geophysical surveys in the survey area.
AC
2. ROBUST STATISTICAL METHOD There are outliers in original data inevitably due to the observation error and spike interference in the Process to obtain SSIP data. Some ordinary statistical methods are quite sensitive to outliers. A few outliers can change the statistical results significantly. The objective of robust statistical method is to deal with the assumed model using reasonable method and suppress outliers’ influence (Huber, 1981). 2.1 Robust M estimate Huber (1964) proposed Robust M estimate method, which is similar to the
ACCEPTED MANUSCRIPT Maximum Likelihood Estimate. To solve the practical problems using statistical method, firstly, we need to obtain original data
yi through
repeated
US CR IP T
observations. Where yi i , i 1, 2,3, 4.......N , 1 , 2 , 3 ,......, N is error sequence, N is the number of observations. In order to estimate the true value from multiple observations, According to the maximum likelihood criterion (R A Fisher,1950), The true value of can be calculated by minimizing the loss
N
MA N
function:
( i 1
yi
)
is called the Scale
D
where, the true value is also called Location parameter.
(1)
TE
parameter, representing residual distribution range. In order to calculate equation
CE P
(1), we set
( z)
d ( z) dz
(2)
AC
Then a necessary condition for a the minimization is that satisfy N
( i 1
yi
)0
(3)
By solving the equation (2), we can get the estimated value of . ( x) is called as Influence Function, which represents a class of functions that determine the influence of original observations in calculating the estimated value . When loss functions ( z )
z2 , influence function ( z ) z , the solution is 2
ACCEPTED MANUSCRIPT N
yi / N . This is the ordinary least squares estimation, also called mean i 1
US CR IP T
stack. This method is quite sensitive to outliers. The principle of robust M estimation is to make the ( x) be a bounded odd function, namely ( x) ( x) . Additionally, it is close to ( x) x for
x
small and becoming a constant or decreasing to zero for
x
large values.
Thus, the weight of outliers is reduced. There are various types of ( x)
MA N
functions in robust statistics(Holland,1977), such as: ―Andrews‖ (Andrews et al., 1972), ―Beaton‖(Beaton and Turkey, 1974), ―Talwar‖ (Hinich and Talwar, 1975), ―Cauchy‖ (Cauchy or t-likelihood),
―Welsch ‖(Dennis and Welsch, 1976),
TE
D
―Huber ‖(Huber, 1964), ―Logistic‖(logistic), ―Fair‖(Fair, 1974), ―Hampel‖(Andrews et al., 1972). The different loss functions and influence
CE P
functions are given in Table 1 and plotted in Fig 1. The most appropriate influence function and iterative algorithm are chosen by testing the simulated
AC
noise data to suppress the influence of outliers afterward. The key aspect of robust M estimate is to solve the nonlinear equations (3). Huber (1981) presented two kinds of iterative algorithms: Modified residuals and modified weights. (1) Modified residuals Firstly, let the median value and median absolute deviation of original observation sequence be the initial estimates 0 and 0
0 median yi
(4)
ACCEPTED MANUSCRIPT 0 median yi 0
(5)
The iterative formula is defined as:
US CR IP T
y 1 N ( i k ) 0 N 0 k 1 k i 1N y 1 '( i k ) N i 1 0
Where, is the derivative of influence function . When '
N
( '
i 1
MA N
equal to 0, the denominator should be replaced by a constant C ,
(6)
yi k
0
) is
1
(2) Modified weights
D
n will lead to the exact solution of equations (3) in a finite number of steps.
TE
Also use the formula (4) and (5) to define the initial estimates 0 and 0 , the
N
k 1
AC
CE P
iterative formula is defined as:
w i 1 N
k i
yi
(7)
w i 1
k i
with the weighting functions:
( w k i
yi k
0 yi k 0
)
(8)
The iteration limit is the solution of equations (3) Besides the two algorithms above, Huber (1981) also presented the joint M-estimates of location and scale. Englund (1988) also presented recursive M estimators of location and scale for dependent sequences. These algorithms are
ACCEPTED MANUSCRIPT complex and will not be discussed here. 2.2 Robust least squares regression
US CR IP T
Robust least squares regression is a further development of robust M estimates, which can be used for two-dimensional or multidimensional data fitting (Holland,1977;Street,1988). As for regression model: y Xβ ε
ε
are n 1 vector. X is a n p matrix.
MA N
Where y and
(9)
is a p 1 vector.
A robust estimate of β should minimize N
yi xi β
D
(
)
(10)
TE
i 1
CE P
Where, yi is the ith element of y . x i is the ith row vector of X , is a scalar parameter, which decides the range of residual and the weight of data.
AC
is the derivative function of the loss function ,Thus the satisfy N
x ( i 1
ij
yi xi β
)0
j 1, 2,..., p
(11)
Equation (11) is also a nonlinear equation and the iterative method is required. Iteratively reweighted least-squares are the common method (Holland, 1977). The iterative formula is defined as:
βk 1 βk (XT WX)1 XT W(y βk ) Where W is a n n diagonal matrix defined by weight function w( With:
(12)
y Xβ0
),
ACCEPTED MANUSCRIPT w(r )
(r )
(13)
r
US CR IP T
The initial estimation of β 0 and 0 can be obtained by the ordinary least-squares regression and calculating median absolute deviation. Similar to robust M estimate, there are various types of ( x) functions in robust least squares regression. The different loss functions, influence functions and weighting functions are given in Table 1.
MA N
For the robust least squares regression y Xβ ε , when the X is not
a
n p matrix, but a n 1 vector composed of 1, namely [1, 1, 1, 1, …, 1], The
D
calculated result using iterative reweighted least-squares algorithm is the
Name Andrew s
Beaton
The different loss function, influence function and weight function in robust
AC
Table 1
CE P
TE
same as the robust M estimate value of y by modified weights method.
statistics
(r) r 2 A [1 cos( )] A 2 2A B2 r 23 2 [1 [1 ( B ) ] ] 2 B 2
(r)
w(r)
r A sin( ) A 0
r A sin( ) A r 0
2 r 2 r[1- ] B 0
r 2 2 [1- ] B 0
Range
r A r A r B r B
ACCEPTED MANUSCRIPT
Talwar
r 2 / 2 2 T / 2
Cauchy
C2 r ln[1 ( )2 ] 2 C
Logistic
r
r ( )2 W2 [1 e W ] 2 r2 2 2 H r - H 2 r L2 ln[cosh( )] L
r[e
r 2 ) W
]
r 1 C
2
[e
(
r 2 ) W
]
r sign(r)H
1 H r
r Ltanh( ) L r r 1 F
L r tanh( ) r L 1 r 1 F
r asign(r) c r sign(r) a c b 0
1 asign(r) r a c r sign(r) cb r 0
D
TE
r2 2 2 a a2 a r 2 2 r 1 cr a a c b 2 c b 0
CE P
(
2
r H r H
r a a r b b r c r c
AC
Hampel
1
r 1 C
r r F2 [ ln(1 )] F F
Fair
r T
US CR IP T
Huber
r T
1 0
MA N
Welsch
r 0
In table 1, ―A, B, T, C, W, H, L, F, a, b, c‖ are tuning constants. Let the tuning constants be the numerical value that Holland (1977) proposed, the curves of different Influence function ( x) are illustrated in Fig 1.
ACCEPTED MANUSCRIPT
-2 -10
10
2
Cauchy
0
-5 -10
0
-2 -10
10
2
Logistic
2
Huber
0
0
-2 -10
10
0
0
10
0
10
0
10
0
-2 -10
10
2
0
-2 -10
-2 -10
10
0
10
0
-2 -10
D
0
0
2
MA N
Talwar
5
0
Welsch
0
Fair
-2 -10
0
US CR IP T
0
2
Bisquare
2
Andrews
Hampel
2
TE
Fig. 1. Function curve of different influence functions
CE P
2.3 Simulation tests of various iterative algorithm and influence function In order to choose the best iterative algorithm and influence function to
AC
suppress the outliers, the simulated noise is generated by computer, which contains two parts, one representing the Gaussian random noise with a standard deviation 1 and mean value 0, another representing the spike impulsive noise with a standard deviation varying from 0 to 10 and mean value varying from 0 to 5. The spike noise can be treated as outliers of the Gaussian noise. The percentage of outliers ranges from 0 to 35%. The test is carried out for modified residuals and modified weights, respectively. The closer the absolute value of the estimated results i
is to 0,
the better the effect of noise suppression will be. The absolute value of the
ACCEPTED MANUSCRIPT estimated results for different influence functions using the modified residuals are given in Table 2. The absolute value of the estimated results for different
US CR IP T
influence functions using modified weights are given in Table 3. Where, ―mean‖ and ―median‖ represents the absolute value of estimate results through mean stack and median stack respectively.
Percent of outliers
0
5%
10%
15%
20%
25%
30%
35%
D
Influence
MA N
Table 2 Estimate results of different influence functions using modified residuals
TE
function
0.0000
0.9726
2.1108
2.7840
3.5126
5.7169
5.4073
5.0835
Median
0.0922
0.2223
0.2589
0.4510
0.4647
0.4647
0.5482
0.7709
Hampel
CE P
Mean
0.0377
0.0885
0.1088
0.0756
0.1055
0.0375
0.2255
0.3914
0.0416
0.0865
0.0933
0.0497
0.0806
0.0189
0.2076
0.2509
0.0418
0.0866
0.0935
0.0505
0.0804
0.0153
0.2076
0.2557
0.0451
0.0865
0.0573
0.0247
0.0388
0.0015
0.2009
0.1921
Cauchy
0.0412
0.1066
0.1271
0.1556
0.2100
0.1813
0.4009
0.6451
Welsch
0.0415
0.0868
0.0976
0.0647
0.0887
0.0086
0.2164
0.3136
Huber
0.0285
0.1448
0.1998
0.3231
0.4168
0.5456
0.8038
1.0705
Logistic
0.0413
0.1678
0.2224
0.3675
0.4655
0.6069
0.8623
1.1572
Fair
0.0438
0.2018
0.2742
0.5035
0.6259
0.8926
1.1770
1.5407
Andrews Bisquare
AC
Talwar
ACCEPTED MANUSCRIPT
Percent of outliers
0
5%
10%
Mean
0.0000
0.9726
Median
0.0922
Hampel
US CR IP T
Table 3 Estimate results of different influence functions using modified weights
15%
20%
25%
30%
35%
2.1108
2.7840
3.5126
5.7169
5.4073
5.0835
0.2223
0.2589
0.4510
0.4647
0.4647
0.5482
0.7709
0.0377
0.1480
0.1818
0.2597
0.3280
0.3932
0.6629
0.8481
Andrews
0.0416
0.1499
Bisquare
0.0418
0.1499
Talwar
0.0763
0.2713
Cauchy
0.0412
Welsch
Influence
0.2683
0.3550
0.4109
0.6825
0.8558
0.1967
0.2530
0.3545
0.4119
0.6817
0.8338
0.2942
0.1901
0.2535
0.3622
0.5837
0.7952
0.1066
0.1271
0.1556
0.2100
0.1813
0.4009
0.6451
0.0415
0.0868
0.1106
0.0647
0.0887
0.0086
0.2164
0.3136
Huber
0.0285
0.1448
0.1998
0.3231
0.4168
0.5456
0.8038
1.0705
Logistic
TE
0.0413
0.1678
0.2224
0.3675
0.4655
0.6069
0.8623
1.1572
0.0438
0.2018
0.2742
0.5035
0.6259
0.8926
1.1770
1.5407
AC
Fair
D
0.1968
CE P
MA N
function
From Table 2 and Table 3, we can see that: the mean is the worst over all performers. Performance of the mean rapidly worsens with an increasing in the percentage of outliers present. The estimator deviates significantly from 0. For modified residuals, the estimating error of ―Huber‖ ―Logistic‖ ―Fair‖ is greater than the median method and the estimating error of other six kinds of function is less than the median method. The ―Talwar‖ performs best over the whole range of different percentage of spike noise. For modified weights, the estimating error of ―Welsch‖ ―Cauchy‖ is less than the median method and the estimating error of
ACCEPTED MANUSCRIPT other kinds of function is greater than the median method slightly. The ―Welsch‖ perform best over the whole range of different percentage of spike noise. The
US CR IP T
estimator of ―Cauchy‖ ―Welsch‖ ―Huber‖ ―Logistic‖ ―Fair‖ using modified weights are the same as the estimator using modified residuals.
From Table 2 and Table 3, we can draw a conclusion that the outliers’ effect on estimator is decided by the ( x) function form. For ―Hampel‖ ―Andrews‖ ―Bisquare‖ ―Talwar‖, they are decreasing to zero for x large values. For
MA N
―Cauchy‖ ―Welsch‖, they tend to be zero, but not be equal to zero for x large values. For ―Huber‖ ―Logistic‖ ―Fair‖, they are becoming constant for x large
D
values. The degree of suppressing outliers for three kinds of ( x) function is
TE
different, so the estimator of different ( x) function using various iterative
CE P
algorithms is different.
In summary, for robust M estimate, using ―Talwar‖ function and the calculated modified residuals can suppress the outliers best. For robust least
AC
squares regression, using ―Welsch‖ function with iteratively reweighted least-squares is most appropriate.
3. APPLICATION TO A FIELD CASE We applied the robust processing methods to SSIP data recorded in a test site beside a mine in Baiyin city, Gansu Province of northwest of China. The survey area is located in the east of Beiqilian Caledonian fold. The lava in this area is developmental. The deposit is under the control of the regional structures. We carried SSIP testing in a 2D survey line. The design of the survey was based
ACCEPTED MANUSCRIPT on a target concept and refined from the interpretation of previous geophysical surveys (CSAMT, TEM and TDIP/resistivity) organized by Chinese Bureau of
US CR IP T
Geological Survey and regional structural analysis. There is a disseminated lead and zinc ore fault initially detected underground in the survey area, which is an east-west trend. The SSIP survey line is perpendicular to the fault which is roughly oriented in north-south direction, and small number survey stations are set in the south. The SSIP profile (1980 m, 100 potential electrodes with
MA N
non-polarized electrodes in -1000 m to 1000 m with spacing of 20 m) was centered on the top of the target using intermediate gradient array protocol with
D
multiple current electrode distances. Currents were injected from variable current
AC
CE P
TE
electrode distance AB /2 with a range from 1400 m to 2600 m in turn (Fig. 2).
Fig. 2. Map of survey line: Red circle along the line is the location of current
ACCEPTED MANUSCRIPT electrode and max current electrode spacing is 5200 m, Red thick lines in the middle of
US CR IP T
the line is the potential electrode and max potential electrode spacing is 2000 m.
The receiver array is based on GPS sync and ZigBee network, 25 receivers or 100 channels are supported for data acquisition. Intermediate gradient array protocol with multiple current electrode distances is adopted. By using the existing multi-channel transmitter and receiver with a little extra work, we can
MA N
get more apparent resistivity and apparent phase data for 2 D inversion. The transmitter generates 5-order m-sequence as source current with the sampling rate of 64 Hz and 1024 data per period, The value of current is 8 A and the base
D
frequency is 1/16 Hz. Transmitting current (5-order m-sequence) corresponds to
TE
a period (16 s) and part of the amplitude spectrum of this signal at 0-1 Hz is
CE P
shown in Fig. 3. The power of this signal at frequencies 1/16 Hz—1 Hz is more uniformly flat. The complex resistivity with a similar SNR at 16 frequencies can
AC
be calculated simultaneously by transmitting m-sequence current and measuring the potential information.
ACCEPTED MANUSCRIPT
(a) 10
0 -5 -10
0
2
4
6
2000
1000 500
0.1
0.2
0.3
0.4 0.5 0.6 Frequency / Hz
12
0.7
14
0.8
0.9
16
1
(a) Transmitting current ( 5-order m sequence ) corresponds to one period (16
TE
Fig. 3.
0
10
D
0
8 Time / s (b)
MA N
Amplitude
1500
US CR IP T
I/A
5
CE P
s); (b) Part of the amplitude spectrum of transmitting current at 0-1 Hz
3.1 Processing of SSIP data
AC
Firstly, we implemented the robust statistical method to SSIP data obtained in one survey station to demonstrate the detail of this processing method. For this survey station, current electrodes A and B is located in -2200 m and 2200 m, the potential electrode M and N are located in 760 m and 780 m. The distance in the profile is 770 m. For the survey station, we obtained SSIP data of sufficient periods (more than 160 periods) by measuring repeatedly for a long time (more than 40 minutes). Application of robust processing method includes three stages. The robust least squares regression is used to fit and remove the linear trend from the original data before stacking. The robust M estimate is used to stack the
ACCEPTED MANUSCRIPT data of all periods. The robust smooth filter is used to suppress residual noise in data after stacking.
US CR IP T
In SSIP, when the time for acquiring data is too long, the observed potential signal will contain trending item with the time, including direct current component, linear trending or nonlinear trending. The trending is mainly caused by the change of electrode charge, spontaneous potential, zero drift of amplifier, variation of air temperature and other low-frequency interference, which can
MA N
affect the Fourier transform of the potential data. Removing the trending from the original data before stacking time series is an essential task for repeated
D
observation SSIP data. Nonlinear trending only occurs in some special
TE
circumstances. The direct current components and linear trending are common in
value.
CE P
SSIP data, which can be removed by subtracting the mean value and linear fitting
SSIP data in the survey station mentioned before have 160 periods in total
AC
by repeated measuring, with 1024 data per period (16 s). We should stack each data of all the 160 periods. Then, we can get a period of data (1024 data) after 160 stacks. However, before stacking, removing the trending is necessary. We used the robust least-squares regression and the ordinary least squares regression to fit the linear trend respectively. Fig 4 shows one of the 1024 data in different periods from 1 to 160. The data of different periods contains the background Gaussian random noise, the outliers caused by the spike noise and the linear trends caused by other reasons. Fitting result of the ordinary least-squares is affected by the spike noise greatly. In contrast, the fitting result of the robust least
ACCEPTED MANUSCRIPT squares is almost not affected by the outliers.
20
US CR IP T
Original data Odinary least squares Robust least squares
19
17
16
MA N
U ( mV )
18
15
13 0
TE
40
60 80 100 Repeated observations
120
140
160
One of the 1024 data (a period) in repeated observation periods and the linear
CE P
Fig. 4.
20
D
14
trend fitting using different regression methods
AC
After subtracting the fitting value of the linear trend, we stack the data of all periods to suppress the Gaussian noise and spike noise. Fig 5 shows the results of mean stack and robust stack for one of the 1024 data. The estimation of mean stack is affected by the outliers greatly, which deviate significantly from the real value. In contrast, the robust stack is almost not affected by the outliers that caused by spike impulsive noise. So when the interference is serious and impulsive noise is present continuously, mean stack is not effective, robust stack can reduce Gaussian random noise and impulse noise effectively.
ACCEPTED MANUSCRIPT
19 Original data after detrending Mean stack Robust stack
US CR IP T
18
U ( mV )
17
16
15
13
20
40
60 80 100 Repeated observations
120
140
160
D
12 0
MA N
14
CE P
TE
Fig. 5. The Comparision of mean stack and Robust stack for detrend data
For the 1024 data points of a period, we processed each data point by removing the linear trending and stacking data of all the 160 periods. Finally, we
AC
obtained the signal in one period consisting in 16 s of measurements. Fig 6 shows the original potential signal in 160 periods and the result in one period after robust stacking.
ACCEPTED MANUSCRIPT
(a) 40
0 -20 -40
0
2
4
0
2
4
6
40
MA N
0 -20
Fig. 6.
6
8 Time / s
10
12
14
16
10
12
14
16
D
-40
8 Time / s (b)
(a) Potential signal in 160 periods before robust stacking;
TE
U( mV)
20
US CR IP T
U( mV)
20
CE P
(b) The result in one period after robust stacking.
The robust stacking can remove most of the random noise and spike noise.
AC
However, the residual noise may still remain in the SSIP data after stacking when the interference is very serious. We proposed a robust smoothing filter to suppress spike impulsive noise and Gaussian random noise for SSIP signal after stacking. Robust smoothing filter also can be used for spike filtering for the survey line. The essential idea is to replace the observed value in the center of a sliding window by the Robust M estimate value of the data within the window. For an example, the data within a sliding window is [2,-2, 1,-1, 100]. The result of mean filter is 20, which cannot suppress spike impulsive noise. The result of median filter is 1, which cannot suppress random noise. The M estimate is 0,
ACCEPTED MANUSCRIPT which can reduce the spike impulsive noise and random noise meanwhile. The smoothing filter may lead to signal distortion when the sliding window is too
US CR IP T
large. So when the interference is not particularly serious, this processing method is not necessary.
Fig 7 shows the smooth processing result of SSIP data after stacking using the piecewise robust smooth filter. For this survey point, noise in potential
MA N
signal is further suppressed.
(a)
40
0
D
U(mV)
20
0
2
4
6
8 Times /s (b)
10
12
14
16
4
6
8 Times / s
10
12
14
16
CE P
-40
TE
-20
40
0
AC
U(mV)
20
-20 -40
Fig. 7.
0
2
(a) Potential difference curve before Robust Smooth filtering; (a) Potential difference curve after Robust Smooth filtering
3.2 Calculated results of the complex resistivity The formula of complex resistivity in FDIP is:
ACCEPTED MANUSCRIPT ( w) K
U ( w) I ( w)
(14)
US CR IP T
Where, K is the setting coefficient, U (w) is the frequency spectrum of observed potential difference, I ( w) is the frequency spectrum of transmitting current.
There are four steps to processing SSIP data of multiple periods in one survey point using robust methods,
MA N
1) A period data contains 1024 data points. For each sampling of all the periods, using robust least-squares regression to fit and remove the linear trending.
TE
D
2) Using robust M estimate to stack potential difference and current data of all periods and using the robust smooth filter to suppress residual noise.
CE P
3) Getting frequency spectrum of transmitting current and potential difference using Discrete Fourier Transform.
AC
4) Calculating complex resistivity spectrum according to the formula (14). Although the spectrum of the complex resistivity in SSIP is from 0 Hz to 32 Hz, We only take complex resistivity at a few frequencies from 1/16 Hz to 1 Hz (the first 2-17 data) as the results to ensure the SNR. We further calculate the mean of the results at every four frequencies as the complex resistivity at 0.16 Hz, 0.41 Hz, 0.66 Hz and 0.91 Hz. For this survey point, 160 periods of SSIP data have been obtained through repeated observation. In order to study the convergence of calculated results with increasing stack times, we calculate the complex resistivity at four frequencies
ACCEPTED MANUSCRIPT and stack from 10 times to 160 times. Fig 8-Fig 9 show the convergence of calculated results with increasing stack times using different processing methods
(a)
(b)
782 780 778 mean stack robust stack
776 0
50 100 Stack times (c)
mean stack robust stack
780
150
D
800
mean stack robust stack
798
50 100 Stack times
CE P
0
150
Resistivity( *m )
802
796
785
0
50 100 Stack times (d)
150
814
TE
Resistivity( *m )
804
Resistivity( *m )
790
MA N
Resistivity( *m )
784
774
US CR IP T
at the four frequencies, respectively.
812 810 808
mean stack robust stack
806 804
0
50 100 Stack times
150
Fig. 8. Convergence of apparent complex resistivity with increasing stack times at 4
AC
frequencies using mean stack and robust satck: (a) 0.16 Hz; (b) 0.41 Hz; (c) 0.66 Hz; (d) 0.91 Hz.
ACCEPTED MANUSCRIPT (a)
(b) 88
28 26 24 22
0
50 100 Stack times (c)
82
0
50 100 Stack times (d)
150
136 134 132 0
50 100 Stack times
Phase(mrad)
176
mean stack robust stack
mean stack robust stack
174 172 170
MA N
Phase(mrad)
138
130
84
80
150
mean stack robust stack
86
US CR IP T
mean stack robust stack
30
Phase(mrad)
Phase(mrad)
32
150
168 166
0
50 100 Stack times
150
D
Fig. 9. Convergence of phase with increasing stack times at 4 frequencies using mean
TE
stack and robust satck: (a) 0.16 Hz; (b) 0.41 Hz; (c) 0.66 Hz; (d) 0.91 Hz.
CE P
In Fig. 8-Fig. 9, we present the convergence of apparent complex resistivity and phase with increasing stack times using different methods at four frequencies.
AC
For ordinary mean stack, the complex resistivity and phase converge to the correct solution after 160 stacks, because the original data are contaminated by Gaussian random noise and impulse spike noise and mean stack is not effective to remove the spike noise appearing continuously. For robust methods, the calculated result is more accurate and the results have already converged to the correct solution after 180 stacking. The robust processing can suppress the Gaussian random noise and spike noise effectively, which can improve the quality of SSIP data processing and save measuring time in the field.
ACCEPTED MANUSCRIPT
Using the robust method and traditional method to process all the SSIP data
US CR IP T
for different current electrode distance, profiles of apparent complex resistivity and phase at different frequencies can be obtained. In order to highlight the anomalies, we displayed the profiles of apparent complex resistivity and phase from distance -600 m to 600 m for current electrode distance AB/2 from -1600 m
frequency=0.16 Hz frequency=0.41 Hz frequency=0.66 Hz frequency=0.91 Hz
a1
1200 800 400
D TE
800
0
a2
CE P
2000 1600 1200 800 400 0
b1
1200
400
AC
Apparent Resistivity ( Ohm-m )
2000 1600 1200 800 400 0
1600
MA N
to 2600 m in Fig. 10 and Fig. 11.
a3
b2
1000 800 600 400 200 0
b3
500 400 300 200 100
0
1200
a4
600 500 400 300 200 100 0
b4
a5
1000 800 600 400 200 0
b5
800 400 0
1600 1200 800 400 0 -600
-400
-200
0
200
Profile coordinate ( m )
400
600
-600
-400
-200
0
200
Profile coordinate ( m )
400
600
ACCEPTED MANUSCRIPT
Fig. 10. Profiles of apparent complex resistivity at 4 frequencies for different current
US CR IP T
electrode distance. (a1) traditional method when AB/2=1600 m; (a2) traditional method when AB/2=1800 m; (a3) traditional method when AB/2=2000 m; (a4) traditional method when AB/2=2200 m; (a5) traditional method when AB/2=2400 m; (b1) robust method when AB/2=1600 m; (b2) robust method when AB/2=1800 m; (b3) robust
MA N
method when AB/2=2000 m; (b4) robust method when AB/2=2200 m; (b5) robust
AC
CE P
TE
D
method when AB/2=2400 m.
ACCEPTED MANUSCRIPT
frequency=0.16 Hz frequency=0.41 Hz frequency=0.66 Hz frequency=0.91 Hz
400 300
300
200
200
100
100
0
0
a2
400 300
b2
400 300
200
200
100
100
0
MA N
0
a3
2000
b3
1400 1000
1000
600
0
200
-1000
-200
-2000
a4
1000
D
Apparent Phase ( mrad )
b1
US CR IP T
a1
400
TE
600 200
1000 600 200
AC
-200 -600
-400
-200
0
600 200 -200
a5
CE P
-200
b4
1000
b5
1000 600 200 -200
200
Profile coordinate ( m )
400
600
-600
-400
-200
0
200
400
600
Profile coordinate ( m )
Fig. 11. Profiles of apparent phase at 4 frequencies for different current electrode distance. (a1) traditional method when AB/2=1600 m; (a2) traditional method when AB/2=1800 m; (a3) traditional method when AB/2=2000 m; (a4) traditional method when AB/2=2200 m; (a5) traditional method when AB/2=2400 m; (b1) robust method when AB/2=1600 m; (b2) robust method when AB/2=1800 m; (b3) robust method when AB/2=2000 m; (b4) robust method when AB/2=2200 m; (b5) robust method when
ACCEPTED MANUSCRIPT
AB/2=2400 m.
US CR IP T
Fig. 10 shows profiles of apparent complex resistivity at 4 frequencies for different current electrode distance using the traditional method and the robust method respectively. Although the results as seen in the profile processed by the traditional method show obvious similarity with the robust processing data, robust processing can improve the quality of the data and remove the outliers
MA N
caused by environmental interference. The anomalies are more obvious and reasonable. Fig. 11 shows profiles of the apparent phase at 4 frequencies, which
D
shows similar characteristics to the apparent complex resistivity profiles.
TE
However, phase suffered from lighter contamination, compare to complex resistivity.
CE P
In Fig. 10, profiles show a low subsurface apparent resistivity anomaly from a distance at -190 m to -10 m. The anomalies increase with increasing current
AC
electrode distance AB/2. In the meanwhile, in Fig. 11, profiles show a high subsurface apparent phase anomaly from survey station at -190 m to -10 m. The anomalies also increase with increasing current electrode distance AB/2. Both apparent complex resistivity and apparent phase anomalies are related to the low resistance and high polarization Lead-zinc ore bodies. The complex resistivity at four frequencies shows compatible anomalies in profiles. However, the apparent phase at 0.16 Hz when AB/2 =2400 m shows a negative phase anomaly in the profile. This may be a false anomaly caused by other electromagnetic interference, which should be studied further. In SSIP, the complex resistivity
ACCEPTED MANUSCRIPT and phase at multiple frequencies can be calculated simultaneously by transmitting pseudo-random m-sequence as source current. The information is
US CR IP T
more sufficient with the same cost of field work. There is a good agreement between apparent resistivity and the interpretation of previous geophysical surveys.
4. CONCLUSIONS
MA N
Our results demonstrated that SSIP method transmitting pseudo-random m-sequence for current injection and calculating complex resistivity response at four frequencies is practically applicable in exploring for structurally controlled
D
deposits in the survey area. We also show that the measuring time can be reduced
TE
and that the data quality can be generally improved by using the robust
CE P
processing method to suppress the outliers caused by spike noise. Applications of the robust approach described are not limited to SSIP data. Robust regression,
AC
robust stack, and robust smooth filter also can be used for data of multiple periods in other geophysical methods. Robust processing is effective for removing the outliers caused by spike noise, comparing to the traditional method. However, the robust processing is not necessary for SSIP data with high SNR. When SSIP is contaminated mainly by the Gaussian random noise, compared with ordinary mean stack, the robust processing does not change the calculated results significantly. In addition, the pseudo-random m-sequence not only has the broad band characteristic, but also has special autocorrelation and
ACCEPTED MANUSCRIPT cross correlation characteristic. The noise suppressing methods based on
US CR IP T
m-sequence correlation identification will be investigated in the future study.
ACKNOWLEDGMENTS
This research is supported by the Fundamental Research Funds for the Central Universities of Central South University, China. SSIP data measured in
MA N
Gansu Province, China were kindly supplied by Champion Geophysical Technology Ltd. We are grateful to the help of company technician Wu Hong,
D
Yao Hong-chun, Shen Rui-jie, Shi Hong-hua and Qiu Jie-ting in SSIP method
TE
instruction, instrument and data processing. We thank reviewers and editor for
CE P
their reviews and comments.
AC
REFERENCES
Andrews, D. F., & Hampel, F. R. (2015). Robust estimates of location: survey and advances. Princeton University Press. Beaton, A. E., & Tukey, J. W. (1974). The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data. Technometrics, 16 (2), 147-185. Buselli, G., & Cameron, M. (1996). Robust statistical methods for reducing sferics noise contaminating transient electromagnetic measurements. Geophysics, 61 (6), 1633-1646.
ACCEPTED MANUSCRIPT
Canales, L. (1984). Random noise reduction: 54th Annual International
US CR IP T
Meeting, SEG, Expanded Abstracts, 525–527. Chave, A. D., Thomson, D. J., & Ander, M. E. (1987). On the robust estimation of power spectra, coherences, and transfer functions. Journal of Geophysical Research: Solid Earth (1978–2012), 92 (B1), 633-648. Chave, A. D., & Thomson, D. J. (1989). Some comments on
MA N
magnetotelluric response function estimation. Journal of Geophysical Research: Solid Earth (1978–2012), 94 (B10), 14215-14225.
D
Chen, R., Zhangxiang, H., Jieting, Q., Lanfang, H., & Zixing, C. (2010).
TE
Distributed data acquisition unit based on GPS and ZigBee for electromagnetic
CE P
exploration. In Instrumentation and Measurement Technology Conference (I2MTC), 2010 IEEE (981-985). Dennis Jr, J. E., & Welsch, R. E. (1978). Techniques for nonlinear least
AC
squares and robust regression. Communications in Statistics-Simulation and Computation, 7 (4), 345-359. Duncan, P. M., Hwang, A., Edwards, R. N., Bailey, R. C., & Garland, G. D. (1980). The development and applications of a wide band electromagnetic sounding system using a pseudo-noise source. Geophysics, 45 (8), 1276-1296. Egbert, G. D., & Booker, J. R. (1986). Robust estimation of geomagnetic transfer functions. Geophysical Journal International, 87 (1), 173-194. Englund, J. E., Holst, U., & Ruppert, D. (1988). Recursive M-estimators of
ACCEPTED MANUSCRIPT
location and scale for dependent sequences. Scandinavian journal of statistics,
US CR IP T
147-159. Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 309-368. Fair, R. C. (1974). On the robust estimation of econometric models. In
MA N
Annals of Economic and Social Measurement, 3(4), 117-128.
Gallagher Jr, N. C., & Wise, G. L. (1981). A theoretical analysis of the
D
properties of median filters. Acoustics, Speech and Signal Processing, IEEE
TE
Transactions on, 29 (6), 1136-1141.
CE P
Golomb, S. W. (1994). Shift-register sequences and spread-spectrum communications. In Spread Spectrum Techniques and Applications, 1994. IEEE ISSSTA'94., IEEE Third International Symposium on (pp. 14-15).
AC
Huber, P. J. (1964). Robust estimation of a location parameter. The Annals of Mathematical Statistics, 35 (1), 73-101. Huber, P. J. (1981). Robust Statistics. New York: John Wiley and Sons. Hinich, M. J., & Talwar, P. P. (1975). A simple method for robust regression. Journal of the American Statistical Association, 70 (349), 113-119. Holland, P. W., & Welsch, R. E. (1977). Robust regression using iteratively reweighted least-squares. Communications in Statistics-Theory and Methods, 6 (9), 813-827.
ACCEPTED MANUSCRIPT
Jewell, T. R., & Ward, S. H. (1963). The influence of conductivity
US CR IP T
inhomogeneities upon audio-frequency magnetic fields. Geophysics, 28 (2), 201-221.
Joyce, R. A., Glaser, D. R., Werkema, D. D., & Atekwana, E. A. (2012). Spectral induced polarization response to nanoparticles in a saturated sand matrix. Journal of Applied Geophysics, 77, 63-71.
MA N
Khesin, B., Alexeyev, V., & Eppelbaum, L. (1997). Rapid methods for interpretation of induced polarization anomalies. Journal of applied geophysics,
D
37 (2), 117-130.
TE
Lyubushin Jr, A. A. (2002). A robust wavelet-aggregated signal for
CE P
geophysical monitoring problems. IZVESTIIA PHizvestiia physical earth C/C fizika zemli-rossiiskaia academia Nauk, 38 (9), 745-755. Loke, M. H., Chambers, J. E., & Ogilvy, R. D. (2006). Inversion of 2D
AC
spectral induced polarization imaging data. Geophysical Prospecting, 54 (3), 287-301.
Li, X., Chen, W., & Zhou, Y. (2014). A robust method for analyzing the instantaneous attributes of seismic data: The instantaneous frequency estimation based on ensemble empirical mode decomposition. Journal of Applied Geophysics, 111, 102-109. Ntarlagiannis, D., Doherty, R., & Williams, K. H. (2010). Spectral induced polarization signatures of abiotic FeS precipitation. Geophysics, 75 (4), 127-133.
ACCEPTED MANUSCRIPT
Street, J. O., Carroll, R. J., & Ruppert, D. (1988). A note on computing
American Statistician, 42 (2), 152-154.
US CR IP T
robust regression estimates via iteratively reweighted least squares. The
Silva, J. B., & Cutrim, A. O. (1989). A robust maximum likelihood method for gravity and magnetic interpretation. Geoexploration, 26(1), 1-31. Streich, R., Becken, M., & Ritter, O. (2013). Robust processing of noisy
MA N
land-based controlled-source electromagnetic data. Geophysics, 78 (5), E237-E247.
D
Sabbione, J. I., & Velis, D. R. (2013). A robust method for microseismic
TE
event detection based on automatic phase pickers. Journal of Applied Geophysics,
CE P
99, 42-50.
Titov, K., Komarov, V., Tarasov, V., & Levitski, A. (2002). Theoretical and experimental study of time domain-induced polarization in water-saturated sands.
AC
Journal of Applied Geophysics, 50 (4), 417-433. Weller, A., Nordsiek, S., & Debschütz, W. (2010). Estimating permeability of sandstone data by nuclear magnetic resonance and spectral-induced polarization. Geophysics, 75 (6), E215-E226. Wei, Li. Y. Q. L. (1994). Suppression of spike interferencein magnetotelluric survey. Chinese Journal of Geophysics(in Chinese), 37(S2), 493-500. Xi, X., Yang, H., He, L. F., & Chen, R. J. (2013). Chromite mapping using
ACCEPTED MANUSCRIPT
induced polarization method based on spread spectrum technology. In
Problems, 2013, 1-7.
US CR IP T
Symposium on the Application of Geophysics to Engineering and Environmental
Xi, X., Yang, H., Zhao, X., Yao, H., Qiu, J., Shen, R., & Chen, R. (2014). Large scale distributed 2D/3D FDIP system based on zigbee network and GPS. In Symposium on the Application of Geophysics to Engineering and
MA N
Environmental Problems, 2014, 130-139.
Zonge, K. L., & Wynn, J. C. (1975). Recent advances and applications in
D
complex resistivity measurements. Geophysics, 40 (5), 851-864.
TE
Ziolkowski, A., Hobbs, B. A., & Wright, D. (2007). Multitransient
AC
CE P
electromagnetic demonstration survey in France. Geophysics, 72 (4), 197-209.
ACCEPTED MANUSCRIPT
Highlights
For robust statistical methods, most appropriate influence function and
US CR IP T
iterative algorithm are chosen by testing simulated data to suppress outliers’ influence.
We implement robust processing method for observed spread spectrum induced polarization data.
Robust statistical methods can reduce Gaussian random noise and spike
MA N
noise effectively.
Profile of apparent complex resistivity can more objectively reflect the
CE P
TE
D
actual situation of geological target.
AC