Robust statistical methods for impulse noise suppressing of spread spectrum induced polarization data, with application to a mine site, Gansu province, China

Robust statistical methods for impulse noise suppressing of spread spectrum induced polarization data, with application to a mine site, Gansu province, China

    Robust statistical methods for impulse noise suppressing of spread Spectrum induced polarization data, with application to a mine sit...

829KB Sizes 12 Downloads 31 Views

    Robust statistical methods for impulse noise suppressing of spread Spectrum induced polarization data, with application to a mine site, Gansu Province, China Weiqiang Liu, Rujun Chen, Hongzhu Cai, Weibin Luo PII: DOI: Reference:

S0926-9851(16)30119-7 doi: 10.1016/j.jappgeo.2016.04.020 APPGEO 2985

To appear in:

Journal of Applied Geophysics

Received date: Revised date: Accepted date:

17 April 2015 21 April 2016 28 April 2016

Please cite this article as: Liu, Weiqiang, Chen, Rujun, Cai, Hongzhu, Luo, Weibin, Robust statistical methods for impulse noise suppressing of spread Spectrum induced polarization data, with application to a mine site, Gansu Province, China, Journal of Applied Geophysics (2016), doi: 10.1016/j.jappgeo.2016.04.020

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT Robust statistical methods for impulse noise suppressing

US CR IP T

of Spread Spectrum Induced Polarization Data, with application to a mine site, Gansu Province, China Weiqiang Liua

Hongzhu Caic

Weibin Luod

Institute of Geophysical and Geochemical Exploration CAGS, Chinese

MA N

a

Rujun Chen*b

Academy Of Geological Sciences, Langfang, Hebei 065000, PR China b

School of Geoscience and Info—physics, Central South University, Changsha,

University of Utah, Consortium for Electromagnetic Modeling and Inversion

TE

c

D

Hunan Province 410083,PR China.

Gansu Non-ferrous Geological Survey Institute, Lanzhou, Gansu 730000, PR China

AC

d

CE P

(CEMI), Salt Lake City, Utah, USA

ABSTRACT In this paper, we investigated the robust processing of noisy spread spectrum induced polarization (SSIP) data. SSIP is a new frequency domain induced polarization method that transmits pseudo-random m-sequence as source current where m-sequence is a broadband signal. The potential information at multiple frequencies can be obtained through in measurement. Removing the *Corresponding author E-mail addresses: [email protected] (Chen Ru-jun), [email protected] (Liu Wei-qiang).

ACCEPTED MANUSCRIPT

noise is a crucial problem for SSIP data processing. Considering that the

US CR IP T

ordinary mean stack and digital filter if not capable of reducing the impulse noise effectively in SSIP data processing, the impact of impulse noise will remain in the complex resistivity spectrum that will affect the interpretation of profile anomalies. We implemented a robust statistical method to SSIP data processing. The robust least-squares regression is used to fit and remove the linear trend

MA N

from the original data before stacking. The robust M estimate is used to stack the data of all periods. The robust smooth filter is used to suppress the residual noise

D

for data after stacking. For robust statistical scheme, the most appropriate

TE

influence function and iterative algorithm are chosen by testing the simulated

CE P

data to suppress the outliers’ influence. We tested the benefits of the robust SSIP data processing using examples of SSIP data recorded in a test site beside a mine

AC

in Gansu Province, China.

Keywords: Spread Spectrum Induced Polarization, Robust statistical methods, Gaussian random noise, Spike impulsive noise, Complex resistivity.

1. INTRODUCTION Induced Polarization (IP) methods performed in time or frequency domain are widely used at various stages of geophysical prospecting (Weller et al, 2010; Ntarlagiannis, 2010; Ryan, 2012). There have been great developments of IP in the aspects of the system of instruments, forward modeling and inversion

ACCEPTED MANUSCRIPT methods (Khesin, 1997; Titov et al., 2002; Loke et al, 2006). Spread spectrum induced polarization (SSIP) is a new frequency-domain induced polarization

US CR IP T

(FDIP) method (Xi et al., 2013; Xi et al., 2014). For SSIP, the transmitter transmits pseudo-random m-sequence waveforms as source current, and the receiver measures the potential difference between two electrodes.

Pseudo-random m-sequence is a kind of binary sequence generated using maximal linear feedback shift registers. This sequence is spectrally flat similar to

MA N

the random signal. Practical application for m-sequence is in digital communication systems that employ direct-sequence spread spectrum and

D

frequency-hopping spread spectrum transmission systems (Golomb, 1994). Its

TE

application in electrical prospecting is emerging recently (Duncan et al., 1980, Ziolkowski et al., 2007). The major reason for transmitting pseudo-random

CE P

m-sequence as source current is the broadband property of this signal. For SSIP, the complex resistivity at multiple frequencies can be obtained simultaneously.

AC

The underground geological structures can be estimated by calculating the IP parameters further (Chen et al., 2010). However, SSIP measurements are contaminated by noise from a number of different sources, including telluric current, industrial current interference, major power lines and so on. When a survey area is located beside a mine, there will be spike impulse interference appearing continuously, the amplitude of which is much higher than the Gaussian random noise and normal SSIP signal. This interference is mainly from the impulsive stray current and high-amplitude sferics pulses (Li et al., 1994). Stray current is caused by the electric power

ACCEPTED MANUSCRIPT equipment. Sferics pulses are produced by lightning strokes propagating in the ionosphere-earth waveguide cavity (Jewell and Ward, 1963). There is a huge

US CR IP T

frequency overlap between SSIP signal and impulsive interference. If the interference is not effectively suppressed, the calculation of complex resistivity will be greatly affected.

Mean stack and digital filter are the common methods to reduce noise in FDIP data processing (Zonge and Wynn, 1975; Xi et al., 2014). Mean stack can

MA N

effectively suppress the Gaussian random noise. If there is a little impulsive noise, this noise can also be reduced by prolonging observation time and

D

increasing stack times. But when the interference is serious and impulsive noise

TE

is present continuously, mean stack is not effective. Digital filter based on the Discrete Fourier Transform (DFT) is widely used in a variety of FDIP

CE P

instruments, which can suppress many kinds of noise by designing high-pass, low-pass, band-pass and notch filter. However there is a huge overlap between

AC

SSIP signal and impulsive interference in the frequency domain, which leads to unsatisfactory result of digital filtering for impulsive noise suppressing. Many other practical methods have been developed for noise reduction in geophysics. These methods include the mean filter (Canales, 1984) and the median filter (Gallagher and Wise, 1981). Although the mean filter can suppress the Gaussian random noise effectively, it cannot suppress the spike impulsive noise. In contrast, the median filter can suppress the spike impulsive noise effectively but it cannot suppress the Gaussian random noise. Additionally, wavelet analysis is also applied in FDIP data processing, which may cause signal distortion when

ACCEPTED MANUSCRIPT impulsive noise is serious. Robust processing of noisy (SSIP) data is investigated in this paper. Robust

US CR IP T

Statistical Methods are a generalized conception (Huber, 1981). This method has been widely used in geophysical signal processing. It is routinely used for estimating high-quality Magnetotelluric (MT) transfer functions (Egbert, et al., 1986; Chave, et al., 1987; Chave, 1989). In addition, the technique is also applied for gravity and magnetic interpretation (Silva and Cutrim, 1989),

MA N

controlled-source electromagnetic (CSEM) processing (Streich et al., 2013), transient electromagnetic (TEM) measurements processing (Buselli et al., 1996),

D

seismic data processing (Sabbione, 2013; Li et al., 2014) and geophysical

TE

monitoring problems (Lyubushin et al., 2002). In contrast, there have been few robust method applications in FDIP data processing. Whereas traditional method

CE P

cannot suppress spike impulsive interference effectively, we implemented robust statistical method to the SSIP data processing to improve the data quality.

AC

Specific applications include the following three aspects: robust least-squares regression is used to remove the linear trend from the original data. Robust M estimate is used to stack the data of all periods. Robust smooth filter is used to suppress noise in data after stacking. We first introduced the basic principle of robust M estimate for one-dimensional data and robust regression for multidimensional data. For robust statistical scheme, the most appropriate influence function and iterative algorithm are chosen by testing simulated data to suppress the outliers’ influence. Then we introduced the basic principle and data processing procedure of SSIP. Finally, we demonstrated the effectiveness of this

ACCEPTED MANUSCRIPT processing approach. We carried SSIP testing for a 2 D survey line beside a mine in Gansu

US CR IP T

Province of northwest of China. There is a disseminated lead and zinc ore fault initially detected underground in the survey area, The SSIP profile was centered on the top of the target using intermediate gradient array protocol with multiple current electrode distance. The transmitter transmits 5-order m-sequence as source current, and the receiver array is based on GPS sync and ZigBee network, 25

MA N

receivers or 100 channels are supported for data acquisition. We applied robust methods to improve the data quality. Gaussian random noise and peak noise is

suppressed effectively. There is a good agreement between the apparent

TE

D

resistivity anomalies and the geological target body with high polarization and low resistivity, which can help to improve the interpretation of previous

CE P

geophysical surveys in the survey area.

AC

2. ROBUST STATISTICAL METHOD There are outliers in original data inevitably due to the observation error and spike interference in the Process to obtain SSIP data. Some ordinary statistical methods are quite sensitive to outliers. A few outliers can change the statistical results significantly. The objective of robust statistical method is to deal with the assumed model using reasonable method and suppress outliers’ influence (Huber, 1981). 2.1 Robust M estimate Huber (1964) proposed Robust M estimate method, which is similar to the

ACCEPTED MANUSCRIPT Maximum Likelihood Estimate. To solve the practical problems using statistical method, firstly, we need to obtain original data

 yi  through

repeated

US CR IP T

observations. Where yi    i , i  1, 2,3, 4.......N ,  1 ,  2 ,  3 ,......,  N is error sequence, N is the number of observations. In order to estimate the true value  from multiple observations, According to the maximum likelihood criterion (R A Fisher,1950), The true value of  can be calculated by minimizing the loss

N

MA N

function:

 ( i 1

yi  



)

is called the Scale

D

where, the true value  is also called Location parameter.

(1)

TE

parameter, representing residual distribution range. In order to calculate equation

CE P

(1), we set

 ( z) 

d  ( z) dz

(2)

AC

Then a necessary condition for a the minimization is that  satisfy N

 ( i 1

yi  



)0

(3)

By solving the equation (2), we can get the estimated value  of  .  ( x) is called as Influence Function, which represents a class of functions that determine the influence of original observations in calculating the estimated value  . When loss functions  ( z ) 

z2 , influence function ( z )  z , the solution is 2

ACCEPTED MANUSCRIPT N

   yi / N . This is the ordinary least squares estimation, also called mean i 1

US CR IP T

stack. This method is quite sensitive to outliers. The principle of robust M estimation is to make the  ( x) be a bounded odd function, namely ( x)   ( x) . Additionally, it is close to  ( x)  x for

x

small and becoming a constant or decreasing to zero for

x

large values.

Thus, the weight of outliers is reduced. There are various types of  ( x)

MA N

functions in robust statistics(Holland,1977), such as: ―Andrews‖ (Andrews et al., 1972), ―Beaton‖(Beaton and Turkey, 1974), ―Talwar‖ (Hinich and Talwar, 1975), ―Cauchy‖ (Cauchy or t-likelihood),

―Welsch ‖(Dennis and Welsch, 1976),

TE

D

―Huber ‖(Huber, 1964), ―Logistic‖(logistic), ―Fair‖(Fair, 1974), ―Hampel‖(Andrews et al., 1972). The different loss functions and influence

CE P

functions are given in Table 1 and plotted in Fig 1. The most appropriate influence function and iterative algorithm are chosen by testing the simulated

AC

noise data to suppress the influence of outliers afterward. The key aspect of robust M estimate is to solve the nonlinear equations (3). Huber (1981) presented two kinds of iterative algorithms: Modified residuals and modified weights. (1) Modified residuals Firstly, let the median value and median absolute deviation of original observation sequence be the initial estimates  0 and  0

0  median  yi 

(4)

ACCEPTED MANUSCRIPT  0  median  yi  0 

(5)

The iterative formula is defined as:

US CR IP T

y  1 N  ( i k ) 0  N 0  k 1   k  i 1N y  1  '( i k )  N i 1 0

Where,  is the derivative of influence function . When '

N

 ( '

i 1

MA N

equal to 0, the denominator should be replaced by a constant C ,

(6)

yi   k

0

) is

1
(2) Modified weights

D

 n will lead to the exact solution of equations (3) in a finite number of steps.

TE

Also use the formula (4) and (5) to define the initial estimates  0 and  0 , the

N

 k 1 

AC

CE P

iterative formula is defined as:

w i 1 N

k i

yi

(7)

w i 1

k i

with the weighting functions:

( w  k i

yi   k

0 yi   k 0

)

(8)

The iteration limit is the solution of equations (3) Besides the two algorithms above, Huber (1981) also presented the joint M-estimates of location and scale. Englund (1988) also presented recursive M estimators of location and scale for dependent sequences. These algorithms are

ACCEPTED MANUSCRIPT complex and will not be discussed here. 2.2 Robust least squares regression

US CR IP T

Robust least squares regression is a further development of robust M estimates, which can be used for two-dimensional or multidimensional data fitting (Holland,1977;Street,1988). As for regression model: y  Xβ  ε

ε

are n 1 vector. X is a n  p matrix.

MA N

Where y and

(9)

is a p 1 vector.

A robust estimate of β should minimize N

yi  xi β



D

 (

)

(10)

TE

i 1

CE P

Where, yi is the ith element of y . x i is the ith row vector of X ,  is a scalar parameter, which decides the range of residual and the weight of data.

AC

 is the derivative function of the loss function  ,Thus the  satisfy N

x ( i 1

ij

yi  xi β



)0

j  1, 2,..., p

(11)

Equation (11) is also a nonlinear equation and the iterative method is required. Iteratively reweighted least-squares are the common method (Holland, 1977). The iterative formula is defined as:

βk 1  βk  (XT WX)1 XT W(y  βk ) Where W is a n  n diagonal matrix defined by weight function w( With:

(12)

y  Xβ0



),

ACCEPTED MANUSCRIPT w(r ) 

 (r )

(13)

r

US CR IP T

The initial estimation of β 0 and  0 can be obtained by the ordinary least-squares regression and calculating median absolute deviation. Similar to robust M estimate, there are various types of  ( x) functions in robust least squares regression. The different loss functions, influence functions and weighting functions are given in Table 1.

MA N

For the robust least squares regression y  Xβ  ε , when the X is not

a

n  p matrix, but a n 1 vector composed of 1, namely [1, 1, 1, 1, …, 1], The

D

calculated result  using iterative reweighted least-squares algorithm is the

Name Andrew s

Beaton

The different loss function, influence function and weight function in robust

AC

Table 1

CE P

TE

same as the robust M estimate value of y by modified weights method.

statistics

 (r) r  2 A [1  cos( )] A  2 2A   B2 r 23  2 [1  [1  ( B ) ] ]  2 B  2

 (r)

w(r)

r  A sin( ) A  0

r A  sin( ) A r  0

2  r 2 r[1-   ]   B 0 

  r 2 2 [1-   ]  B 0 

Range

r A r A r B r B

ACCEPTED MANUSCRIPT

Talwar

r 2 / 2  2 T / 2

Cauchy

C2 r ln[1  ( )2 ] 2 C

Logistic

r

r  ( )2 W2 [1  e W ] 2  r2  2  2 H r - H  2 r L2 ln[cosh( )] L

r[e

r 2 ) W

]

r 1   C

2

[e

(

r 2 ) W

]

r  sign(r)H

1  H r 

r Ltanh( ) L r r 1 F

L r tanh( ) r L 1 r 1 F

r asign(r)    c r sign(r) a  c b  0

1  asign(r)  r   a c  r sign(r)  cb r 0 

D

TE

r2 2  2 a  a2  a r  2  2  r 1 cr a a  c b 2 c b 0 

CE P

(

2

r H r H

r a  a  r  b  b  r  c r c 

AC

Hampel

1

r 1   C

r r F2 [  ln(1  )] F F

Fair

r T

US CR IP T

Huber

r T

1  0

MA N

Welsch

r  0

In table 1, ―A, B, T, C, W, H, L, F, a, b, c‖ are tuning constants. Let the tuning constants be the numerical value that Holland (1977) proposed, the curves of different Influence function ( x) are illustrated in Fig 1.

ACCEPTED MANUSCRIPT

-2 -10

10

2

Cauchy

0

-5 -10

0

-2 -10

10

2

Logistic

2

Huber

0

0

-2 -10

10

0

0

10

0

10

0

10

0

-2 -10

10

2

0

-2 -10

-2 -10

10

0

10

0

-2 -10

D

0

0

2

MA N

Talwar

5

0

Welsch

0

Fair

-2 -10

0

US CR IP T

0

2

Bisquare

2

Andrews

Hampel

2

TE

Fig. 1. Function curve of different influence functions

CE P

2.3 Simulation tests of various iterative algorithm and influence function In order to choose the best iterative algorithm and influence function to

AC

suppress the outliers, the simulated noise is generated by computer, which contains two parts, one representing the Gaussian random noise with a standard deviation 1 and mean value 0, another representing the spike impulsive noise with a standard deviation varying from 0 to 10 and mean value varying from 0 to 5. The spike noise can be treated as outliers of the Gaussian noise. The percentage of outliers ranges from 0 to 35%. The test is carried out for modified residuals and modified weights, respectively. The closer the absolute value of the estimated results  i

is to 0,

the better the effect of noise suppression will be. The absolute value of the

ACCEPTED MANUSCRIPT estimated results for different influence functions using the modified residuals are given in Table 2. The absolute value of the estimated results for different

US CR IP T

influence functions using modified weights are given in Table 3. Where, ―mean‖ and ―median‖ represents the absolute value of estimate results through mean stack and median stack respectively.

Percent of outliers

0

5%

10%

15%

20%

25%

30%

35%

D

Influence

MA N

Table 2 Estimate results of different influence functions using modified residuals

TE

function

0.0000

0.9726

2.1108

2.7840

3.5126

5.7169

5.4073

5.0835

Median

0.0922

0.2223

0.2589

0.4510

0.4647

0.4647

0.5482

0.7709

Hampel

CE P

Mean

0.0377

0.0885

0.1088

0.0756

0.1055

0.0375

0.2255

0.3914

0.0416

0.0865

0.0933

0.0497

0.0806

0.0189

0.2076

0.2509

0.0418

0.0866

0.0935

0.0505

0.0804

0.0153

0.2076

0.2557

0.0451

0.0865

0.0573

0.0247

0.0388

0.0015

0.2009

0.1921

Cauchy

0.0412

0.1066

0.1271

0.1556

0.2100

0.1813

0.4009

0.6451

Welsch

0.0415

0.0868

0.0976

0.0647

0.0887

0.0086

0.2164

0.3136

Huber

0.0285

0.1448

0.1998

0.3231

0.4168

0.5456

0.8038

1.0705

Logistic

0.0413

0.1678

0.2224

0.3675

0.4655

0.6069

0.8623

1.1572

Fair

0.0438

0.2018

0.2742

0.5035

0.6259

0.8926

1.1770

1.5407

Andrews Bisquare

AC

Talwar

ACCEPTED MANUSCRIPT

Percent of outliers

0

5%

10%

Mean

0.0000

0.9726

Median

0.0922

Hampel

US CR IP T

Table 3 Estimate results of different influence functions using modified weights

15%

20%

25%

30%

35%

2.1108

2.7840

3.5126

5.7169

5.4073

5.0835

0.2223

0.2589

0.4510

0.4647

0.4647

0.5482

0.7709

0.0377

0.1480

0.1818

0.2597

0.3280

0.3932

0.6629

0.8481

Andrews

0.0416

0.1499

Bisquare

0.0418

0.1499

Talwar

0.0763

0.2713

Cauchy

0.0412

Welsch

Influence

0.2683

0.3550

0.4109

0.6825

0.8558

0.1967

0.2530

0.3545

0.4119

0.6817

0.8338

0.2942

0.1901

0.2535

0.3622

0.5837

0.7952

0.1066

0.1271

0.1556

0.2100

0.1813

0.4009

0.6451

0.0415

0.0868

0.1106

0.0647

0.0887

0.0086

0.2164

0.3136

Huber

0.0285

0.1448

0.1998

0.3231

0.4168

0.5456

0.8038

1.0705

Logistic

TE

0.0413

0.1678

0.2224

0.3675

0.4655

0.6069

0.8623

1.1572

0.0438

0.2018

0.2742

0.5035

0.6259

0.8926

1.1770

1.5407

AC

Fair

D

0.1968

CE P

MA N

function

From Table 2 and Table 3, we can see that: the mean is the worst over all performers. Performance of the mean rapidly worsens with an increasing in the percentage of outliers present. The estimator deviates significantly from 0. For modified residuals, the estimating error of ―Huber‖ ―Logistic‖ ―Fair‖ is greater than the median method and the estimating error of other six kinds of function is less than the median method. The ―Talwar‖ performs best over the whole range of different percentage of spike noise. For modified weights, the estimating error of ―Welsch‖ ―Cauchy‖ is less than the median method and the estimating error of

ACCEPTED MANUSCRIPT other kinds of function is greater than the median method slightly. The ―Welsch‖ perform best over the whole range of different percentage of spike noise. The

US CR IP T

estimator of ―Cauchy‖ ―Welsch‖ ―Huber‖ ―Logistic‖ ―Fair‖ using modified weights are the same as the estimator using modified residuals.

From Table 2 and Table 3, we can draw a conclusion that the outliers’ effect on estimator is decided by the  ( x) function form. For ―Hampel‖ ―Andrews‖ ―Bisquare‖ ―Talwar‖, they are decreasing to zero for x large values. For

MA N

―Cauchy‖ ―Welsch‖, they tend to be zero, but not be equal to zero for x large values. For ―Huber‖ ―Logistic‖ ―Fair‖, they are becoming constant for x large

D

values. The degree of suppressing outliers for three kinds of  ( x) function is

TE

different, so the estimator of different  ( x) function using various iterative

CE P

algorithms is different.

In summary, for robust M estimate, using ―Talwar‖ function and the calculated modified residuals can suppress the outliers best. For robust least

AC

squares regression, using ―Welsch‖ function with iteratively reweighted least-squares is most appropriate.

3. APPLICATION TO A FIELD CASE We applied the robust processing methods to SSIP data recorded in a test site beside a mine in Baiyin city, Gansu Province of northwest of China. The survey area is located in the east of Beiqilian Caledonian fold. The lava in this area is developmental. The deposit is under the control of the regional structures. We carried SSIP testing in a 2D survey line. The design of the survey was based

ACCEPTED MANUSCRIPT on a target concept and refined from the interpretation of previous geophysical surveys (CSAMT, TEM and TDIP/resistivity) organized by Chinese Bureau of

US CR IP T

Geological Survey and regional structural analysis. There is a disseminated lead and zinc ore fault initially detected underground in the survey area, which is an east-west trend. The SSIP survey line is perpendicular to the fault which is roughly oriented in north-south direction, and small number survey stations are set in the south. The SSIP profile (1980 m, 100 potential electrodes with

MA N

non-polarized electrodes in -1000 m to 1000 m with spacing of 20 m) was centered on the top of the target using intermediate gradient array protocol with

D

multiple current electrode distances. Currents were injected from variable current

AC

CE P

TE

electrode distance AB /2 with a range from 1400 m to 2600 m in turn (Fig. 2).

Fig. 2. Map of survey line: Red circle along the line is the location of current

ACCEPTED MANUSCRIPT electrode and max current electrode spacing is 5200 m, Red thick lines in the middle of

US CR IP T

the line is the potential electrode and max potential electrode spacing is 2000 m.

The receiver array is based on GPS sync and ZigBee network, 25 receivers or 100 channels are supported for data acquisition. Intermediate gradient array protocol with multiple current electrode distances is adopted. By using the existing multi-channel transmitter and receiver with a little extra work, we can

MA N

get more apparent resistivity and apparent phase data for 2 D inversion. The transmitter generates 5-order m-sequence as source current with the sampling rate of 64 Hz and 1024 data per period, The value of current is 8 A and the base

D

frequency is 1/16 Hz. Transmitting current (5-order m-sequence) corresponds to

TE

a period (16 s) and part of the amplitude spectrum of this signal at 0-1 Hz is

CE P

shown in Fig. 3. The power of this signal at frequencies 1/16 Hz—1 Hz is more uniformly flat. The complex resistivity with a similar SNR at 16 frequencies can

AC

be calculated simultaneously by transmitting m-sequence current and measuring the potential information.

ACCEPTED MANUSCRIPT

(a) 10

0 -5 -10

0

2

4

6

2000

1000 500

0.1

0.2

0.3

0.4 0.5 0.6 Frequency / Hz

12

0.7

14

0.8

0.9

16

1

(a) Transmitting current ( 5-order m sequence ) corresponds to one period (16

TE

Fig. 3.

0

10

D

0

8 Time / s (b)

MA N

Amplitude

1500

US CR IP T

I/A

5

CE P

s); (b) Part of the amplitude spectrum of transmitting current at 0-1 Hz

3.1 Processing of SSIP data

AC

Firstly, we implemented the robust statistical method to SSIP data obtained in one survey station to demonstrate the detail of this processing method. For this survey station, current electrodes A and B is located in -2200 m and 2200 m, the potential electrode M and N are located in 760 m and 780 m. The distance in the profile is 770 m. For the survey station, we obtained SSIP data of sufficient periods (more than 160 periods) by measuring repeatedly for a long time (more than 40 minutes). Application of robust processing method includes three stages. The robust least squares regression is used to fit and remove the linear trend from the original data before stacking. The robust M estimate is used to stack the

ACCEPTED MANUSCRIPT data of all periods. The robust smooth filter is used to suppress residual noise in data after stacking.

US CR IP T

In SSIP, when the time for acquiring data is too long, the observed potential signal will contain trending item with the time, including direct current component, linear trending or nonlinear trending. The trending is mainly caused by the change of electrode charge, spontaneous potential, zero drift of amplifier, variation of air temperature and other low-frequency interference, which can

MA N

affect the Fourier transform of the potential data. Removing the trending from the original data before stacking time series is an essential task for repeated

D

observation SSIP data. Nonlinear trending only occurs in some special

TE

circumstances. The direct current components and linear trending are common in

value.

CE P

SSIP data, which can be removed by subtracting the mean value and linear fitting

SSIP data in the survey station mentioned before have 160 periods in total

AC

by repeated measuring, with 1024 data per period (16 s). We should stack each data of all the 160 periods. Then, we can get a period of data (1024 data) after 160 stacks. However, before stacking, removing the trending is necessary. We used the robust least-squares regression and the ordinary least squares regression to fit the linear trend respectively. Fig 4 shows one of the 1024 data in different periods from 1 to 160. The data of different periods contains the background Gaussian random noise, the outliers caused by the spike noise and the linear trends caused by other reasons. Fitting result of the ordinary least-squares is affected by the spike noise greatly. In contrast, the fitting result of the robust least

ACCEPTED MANUSCRIPT squares is almost not affected by the outliers.

20

US CR IP T

Original data Odinary least squares Robust least squares

19

17

16

MA N

U ( mV )

18

15

13 0

TE

40

60 80 100 Repeated observations

120

140

160

One of the 1024 data (a period) in repeated observation periods and the linear

CE P

Fig. 4.

20

D

14

trend fitting using different regression methods

AC

After subtracting the fitting value of the linear trend, we stack the data of all periods to suppress the Gaussian noise and spike noise. Fig 5 shows the results of mean stack and robust stack for one of the 1024 data. The estimation of mean stack is affected by the outliers greatly, which deviate significantly from the real value. In contrast, the robust stack is almost not affected by the outliers that caused by spike impulsive noise. So when the interference is serious and impulsive noise is present continuously, mean stack is not effective, robust stack can reduce Gaussian random noise and impulse noise effectively.

ACCEPTED MANUSCRIPT

19 Original data after detrending Mean stack Robust stack

US CR IP T

18

 U ( mV )

17

16

15

13

20

40

60 80 100 Repeated observations

120

140

160

D

12 0

MA N

14

CE P

TE

Fig. 5. The Comparision of mean stack and Robust stack for detrend data

For the 1024 data points of a period, we processed each data point by removing the linear trending and stacking data of all the 160 periods. Finally, we

AC

obtained the signal in one period consisting in 16 s of measurements. Fig 6 shows the original potential signal in 160 periods and the result in one period after robust stacking.

ACCEPTED MANUSCRIPT

(a) 40

0 -20 -40

0

2

4

0

2

4

6

40

MA N

0 -20

Fig. 6.

6

8 Time / s

10

12

14

16

10

12

14

16

D

-40

8 Time / s (b)

(a) Potential signal in 160 periods before robust stacking;

TE

 U( mV)

20

US CR IP T

 U( mV)

20

CE P

(b) The result in one period after robust stacking.

The robust stacking can remove most of the random noise and spike noise.

AC

However, the residual noise may still remain in the SSIP data after stacking when the interference is very serious. We proposed a robust smoothing filter to suppress spike impulsive noise and Gaussian random noise for SSIP signal after stacking. Robust smoothing filter also can be used for spike filtering for the survey line. The essential idea is to replace the observed value in the center of a sliding window by the Robust M estimate value of the data within the window. For an example, the data within a sliding window is [2,-2, 1,-1, 100]. The result of mean filter is 20, which cannot suppress spike impulsive noise. The result of median filter is 1, which cannot suppress random noise. The M estimate is 0,

ACCEPTED MANUSCRIPT which can reduce the spike impulsive noise and random noise meanwhile. The smoothing filter may lead to signal distortion when the sliding window is too

US CR IP T

large. So when the interference is not particularly serious, this processing method is not necessary.

Fig 7 shows the smooth processing result of SSIP data after stacking using the piecewise robust smooth filter. For this survey point, noise in potential

MA N

signal is further suppressed.

(a)

40

0

D

 U(mV)

20

0

2

4

6

8 Times /s (b)

10

12

14

16

4

6

8 Times / s

10

12

14

16

CE P

-40

TE

-20

40

0

AC

U(mV)

20

-20 -40

Fig. 7.

0

2

(a) Potential difference curve before Robust Smooth filtering; (a) Potential difference curve after Robust Smooth filtering

3.2 Calculated results of the complex resistivity The formula of complex resistivity in FDIP is:

ACCEPTED MANUSCRIPT  ( w)  K

U ( w) I ( w)

(14)

US CR IP T

Where, K is the setting coefficient, U (w) is the frequency spectrum of observed potential difference, I ( w) is the frequency spectrum of transmitting current.

There are four steps to processing SSIP data of multiple periods in one survey point using robust methods,

MA N

1) A period data contains 1024 data points. For each sampling of all the periods, using robust least-squares regression to fit and remove the linear trending.

TE

D

2) Using robust M estimate to stack potential difference and current data of all periods and using the robust smooth filter to suppress residual noise.

CE P

3) Getting frequency spectrum of transmitting current and potential difference using Discrete Fourier Transform.

AC

4) Calculating complex resistivity spectrum according to the formula (14). Although the spectrum of the complex resistivity in SSIP is from 0 Hz to 32 Hz, We only take complex resistivity at a few frequencies from 1/16 Hz to 1 Hz (the first 2-17 data) as the results to ensure the SNR. We further calculate the mean of the results at every four frequencies as the complex resistivity at 0.16 Hz, 0.41 Hz, 0.66 Hz and 0.91 Hz. For this survey point, 160 periods of SSIP data have been obtained through repeated observation. In order to study the convergence of calculated results with increasing stack times, we calculate the complex resistivity at four frequencies

ACCEPTED MANUSCRIPT and stack from 10 times to 160 times. Fig 8-Fig 9 show the convergence of calculated results with increasing stack times using different processing methods

(a)

(b)

782 780 778 mean stack robust stack

776 0

50 100 Stack times (c)

mean stack robust stack

780

150

D

800

mean stack robust stack

798

50 100 Stack times

CE P

0

150

Resistivity(  *m )

802

796

785

0

50 100 Stack times (d)

150

814

TE

Resistivity(  *m )

804

Resistivity(  *m )

790

MA N

Resistivity(  *m )

784

774

US CR IP T

at the four frequencies, respectively.

812 810 808

mean stack robust stack

806 804

0

50 100 Stack times

150

Fig. 8. Convergence of apparent complex resistivity with increasing stack times at 4

AC

frequencies using mean stack and robust satck: (a) 0.16 Hz; (b) 0.41 Hz; (c) 0.66 Hz; (d) 0.91 Hz.

ACCEPTED MANUSCRIPT (a)

(b) 88

28 26 24 22

0

50 100 Stack times (c)

82

0

50 100 Stack times (d)

150

136 134 132 0

50 100 Stack times

Phase(mrad)

176

mean stack robust stack

mean stack robust stack

174 172 170

MA N

Phase(mrad)

138

130

84

80

150

mean stack robust stack

86

US CR IP T

mean stack robust stack

30

Phase(mrad)

Phase(mrad)

32

150

168 166

0

50 100 Stack times

150

D

Fig. 9. Convergence of phase with increasing stack times at 4 frequencies using mean

TE

stack and robust satck: (a) 0.16 Hz; (b) 0.41 Hz; (c) 0.66 Hz; (d) 0.91 Hz.

CE P

In Fig. 8-Fig. 9, we present the convergence of apparent complex resistivity and phase with increasing stack times using different methods at four frequencies.

AC

For ordinary mean stack, the complex resistivity and phase converge to the correct solution after 160 stacks, because the original data are contaminated by Gaussian random noise and impulse spike noise and mean stack is not effective to remove the spike noise appearing continuously. For robust methods, the calculated result is more accurate and the results have already converged to the correct solution after 180 stacking. The robust processing can suppress the Gaussian random noise and spike noise effectively, which can improve the quality of SSIP data processing and save measuring time in the field.

ACCEPTED MANUSCRIPT

Using the robust method and traditional method to process all the SSIP data

US CR IP T

for different current electrode distance, profiles of apparent complex resistivity and phase at different frequencies can be obtained. In order to highlight the anomalies, we displayed the profiles of apparent complex resistivity and phase from distance -600 m to 600 m for current electrode distance AB/2 from -1600 m

frequency=0.16 Hz frequency=0.41 Hz frequency=0.66 Hz frequency=0.91 Hz

a1

1200 800 400

D TE

800

0

a2

CE P

2000 1600 1200 800 400 0

b1

1200

400

AC

Apparent Resistivity ( Ohm-m )

2000 1600 1200 800 400 0

1600

MA N

to 2600 m in Fig. 10 and Fig. 11.

a3

b2

1000 800 600 400 200 0

b3

500 400 300 200 100

0

1200

a4

600 500 400 300 200 100 0

b4

a5

1000 800 600 400 200 0

b5

800 400 0

1600 1200 800 400 0 -600

-400

-200

0

200

Profile coordinate ( m )

400

600

-600

-400

-200

0

200

Profile coordinate ( m )

400

600

ACCEPTED MANUSCRIPT

Fig. 10. Profiles of apparent complex resistivity at 4 frequencies for different current

US CR IP T

electrode distance. (a1) traditional method when AB/2=1600 m; (a2) traditional method when AB/2=1800 m; (a3) traditional method when AB/2=2000 m; (a4) traditional method when AB/2=2200 m; (a5) traditional method when AB/2=2400 m; (b1) robust method when AB/2=1600 m; (b2) robust method when AB/2=1800 m; (b3) robust

MA N

method when AB/2=2000 m; (b4) robust method when AB/2=2200 m; (b5) robust

AC

CE P

TE

D

method when AB/2=2400 m.

ACCEPTED MANUSCRIPT

frequency=0.16 Hz frequency=0.41 Hz frequency=0.66 Hz frequency=0.91 Hz

400 300

300

200

200

100

100

0

0

a2

400 300

b2

400 300

200

200

100

100

0

MA N

0

a3

2000

b3

1400 1000

1000

600

0

200

-1000

-200

-2000

a4

1000

D

Apparent Phase ( mrad )

b1

US CR IP T

a1

400

TE

600 200

1000 600 200

AC

-200 -600

-400

-200

0

600 200 -200

a5

CE P

-200

b4

1000

b5

1000 600 200 -200

200

Profile coordinate ( m )

400

600

-600

-400

-200

0

200

400

600

Profile coordinate ( m )

Fig. 11. Profiles of apparent phase at 4 frequencies for different current electrode distance. (a1) traditional method when AB/2=1600 m; (a2) traditional method when AB/2=1800 m; (a3) traditional method when AB/2=2000 m; (a4) traditional method when AB/2=2200 m; (a5) traditional method when AB/2=2400 m; (b1) robust method when AB/2=1600 m; (b2) robust method when AB/2=1800 m; (b3) robust method when AB/2=2000 m; (b4) robust method when AB/2=2200 m; (b5) robust method when

ACCEPTED MANUSCRIPT

AB/2=2400 m.

US CR IP T

Fig. 10 shows profiles of apparent complex resistivity at 4 frequencies for different current electrode distance using the traditional method and the robust method respectively. Although the results as seen in the profile processed by the traditional method show obvious similarity with the robust processing data, robust processing can improve the quality of the data and remove the outliers

MA N

caused by environmental interference. The anomalies are more obvious and reasonable. Fig. 11 shows profiles of the apparent phase at 4 frequencies, which

D

shows similar characteristics to the apparent complex resistivity profiles.

TE

However, phase suffered from lighter contamination, compare to complex resistivity.

CE P

In Fig. 10, profiles show a low subsurface apparent resistivity anomaly from a distance at -190 m to -10 m. The anomalies increase with increasing current

AC

electrode distance AB/2. In the meanwhile, in Fig. 11, profiles show a high subsurface apparent phase anomaly from survey station at -190 m to -10 m. The anomalies also increase with increasing current electrode distance AB/2. Both apparent complex resistivity and apparent phase anomalies are related to the low resistance and high polarization Lead-zinc ore bodies. The complex resistivity at four frequencies shows compatible anomalies in profiles. However, the apparent phase at 0.16 Hz when AB/2 =2400 m shows a negative phase anomaly in the profile. This may be a false anomaly caused by other electromagnetic interference, which should be studied further. In SSIP, the complex resistivity

ACCEPTED MANUSCRIPT and phase at multiple frequencies can be calculated simultaneously by transmitting pseudo-random m-sequence as source current. The information is

US CR IP T

more sufficient with the same cost of field work. There is a good agreement between apparent resistivity and the interpretation of previous geophysical surveys.

4. CONCLUSIONS

MA N

Our results demonstrated that SSIP method transmitting pseudo-random m-sequence for current injection and calculating complex resistivity response at four frequencies is practically applicable in exploring for structurally controlled

D

deposits in the survey area. We also show that the measuring time can be reduced

TE

and that the data quality can be generally improved by using the robust

CE P

processing method to suppress the outliers caused by spike noise. Applications of the robust approach described are not limited to SSIP data. Robust regression,

AC

robust stack, and robust smooth filter also can be used for data of multiple periods in other geophysical methods. Robust processing is effective for removing the outliers caused by spike noise, comparing to the traditional method. However, the robust processing is not necessary for SSIP data with high SNR. When SSIP is contaminated mainly by the Gaussian random noise, compared with ordinary mean stack, the robust processing does not change the calculated results significantly. In addition, the pseudo-random m-sequence not only has the broad band characteristic, but also has special autocorrelation and

ACCEPTED MANUSCRIPT cross correlation characteristic. The noise suppressing methods based on

US CR IP T

m-sequence correlation identification will be investigated in the future study.

ACKNOWLEDGMENTS

This research is supported by the Fundamental Research Funds for the Central Universities of Central South University, China. SSIP data measured in

MA N

Gansu Province, China were kindly supplied by Champion Geophysical Technology Ltd. We are grateful to the help of company technician Wu Hong,

D

Yao Hong-chun, Shen Rui-jie, Shi Hong-hua and Qiu Jie-ting in SSIP method

TE

instruction, instrument and data processing. We thank reviewers and editor for

CE P

their reviews and comments.

AC

REFERENCES

Andrews, D. F., & Hampel, F. R. (2015). Robust estimates of location: survey and advances. Princeton University Press. Beaton, A. E., & Tukey, J. W. (1974). The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data. Technometrics, 16 (2), 147-185. Buselli, G., & Cameron, M. (1996). Robust statistical methods for reducing sferics noise contaminating transient electromagnetic measurements. Geophysics, 61 (6), 1633-1646.

ACCEPTED MANUSCRIPT

Canales, L. (1984). Random noise reduction: 54th Annual International

US CR IP T

Meeting, SEG, Expanded Abstracts, 525–527. Chave, A. D., Thomson, D. J., & Ander, M. E. (1987). On the robust estimation of power spectra, coherences, and transfer functions. Journal of Geophysical Research: Solid Earth (1978–2012), 92 (B1), 633-648. Chave, A. D., & Thomson, D. J. (1989). Some comments on

MA N

magnetotelluric response function estimation. Journal of Geophysical Research: Solid Earth (1978–2012), 94 (B10), 14215-14225.

D

Chen, R., Zhangxiang, H., Jieting, Q., Lanfang, H., & Zixing, C. (2010).

TE

Distributed data acquisition unit based on GPS and ZigBee for electromagnetic

CE P

exploration. In Instrumentation and Measurement Technology Conference (I2MTC), 2010 IEEE (981-985). Dennis Jr, J. E., & Welsch, R. E. (1978). Techniques for nonlinear least

AC

squares and robust regression. Communications in Statistics-Simulation and Computation, 7 (4), 345-359. Duncan, P. M., Hwang, A., Edwards, R. N., Bailey, R. C., & Garland, G. D. (1980). The development and applications of a wide band electromagnetic sounding system using a pseudo-noise source. Geophysics, 45 (8), 1276-1296. Egbert, G. D., & Booker, J. R. (1986). Robust estimation of geomagnetic transfer functions. Geophysical Journal International, 87 (1), 173-194. Englund, J. E., Holst, U., & Ruppert, D. (1988). Recursive M-estimators of

ACCEPTED MANUSCRIPT

location and scale for dependent sequences. Scandinavian journal of statistics,

US CR IP T

147-159. Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philosophical Transactions of the Royal Society of London. Series A, Containing Papers of a Mathematical or Physical Character, 309-368. Fair, R. C. (1974). On the robust estimation of econometric models. In

MA N

Annals of Economic and Social Measurement, 3(4), 117-128.

Gallagher Jr, N. C., & Wise, G. L. (1981). A theoretical analysis of the

D

properties of median filters. Acoustics, Speech and Signal Processing, IEEE

TE

Transactions on, 29 (6), 1136-1141.

CE P

Golomb, S. W. (1994). Shift-register sequences and spread-spectrum communications. In Spread Spectrum Techniques and Applications, 1994. IEEE ISSSTA'94., IEEE Third International Symposium on (pp. 14-15).

AC

Huber, P. J. (1964). Robust estimation of a location parameter. The Annals of Mathematical Statistics, 35 (1), 73-101. Huber, P. J. (1981). Robust Statistics. New York: John Wiley and Sons. Hinich, M. J., & Talwar, P. P. (1975). A simple method for robust regression. Journal of the American Statistical Association, 70 (349), 113-119. Holland, P. W., & Welsch, R. E. (1977). Robust regression using iteratively reweighted least-squares. Communications in Statistics-Theory and Methods, 6 (9), 813-827.

ACCEPTED MANUSCRIPT

Jewell, T. R., & Ward, S. H. (1963). The influence of conductivity

US CR IP T

inhomogeneities upon audio-frequency magnetic fields. Geophysics, 28 (2), 201-221.

Joyce, R. A., Glaser, D. R., Werkema, D. D., & Atekwana, E. A. (2012). Spectral induced polarization response to nanoparticles in a saturated sand matrix. Journal of Applied Geophysics, 77, 63-71.

MA N

Khesin, B., Alexeyev, V., & Eppelbaum, L. (1997). Rapid methods for interpretation of induced polarization anomalies. Journal of applied geophysics,

D

37 (2), 117-130.

TE

Lyubushin Jr, A. A. (2002). A robust wavelet-aggregated signal for

CE P

geophysical monitoring problems. IZVESTIIA PHizvestiia physical earth C/C fizika zemli-rossiiskaia academia Nauk, 38 (9), 745-755. Loke, M. H., Chambers, J. E., & Ogilvy, R. D. (2006). Inversion of 2D

AC

spectral induced polarization imaging data. Geophysical Prospecting, 54 (3), 287-301.

Li, X., Chen, W., & Zhou, Y. (2014). A robust method for analyzing the instantaneous attributes of seismic data: The instantaneous frequency estimation based on ensemble empirical mode decomposition. Journal of Applied Geophysics, 111, 102-109. Ntarlagiannis, D., Doherty, R., & Williams, K. H. (2010). Spectral induced polarization signatures of abiotic FeS precipitation. Geophysics, 75 (4), 127-133.

ACCEPTED MANUSCRIPT

Street, J. O., Carroll, R. J., & Ruppert, D. (1988). A note on computing

American Statistician, 42 (2), 152-154.

US CR IP T

robust regression estimates via iteratively reweighted least squares. The

Silva, J. B., & Cutrim, A. O. (1989). A robust maximum likelihood method for gravity and magnetic interpretation. Geoexploration, 26(1), 1-31. Streich, R., Becken, M., & Ritter, O. (2013). Robust processing of noisy

MA N

land-based controlled-source electromagnetic data. Geophysics, 78 (5), E237-E247.

D

Sabbione, J. I., & Velis, D. R. (2013). A robust method for microseismic

TE

event detection based on automatic phase pickers. Journal of Applied Geophysics,

CE P

99, 42-50.

Titov, K., Komarov, V., Tarasov, V., & Levitski, A. (2002). Theoretical and experimental study of time domain-induced polarization in water-saturated sands.

AC

Journal of Applied Geophysics, 50 (4), 417-433. Weller, A., Nordsiek, S., & Debschütz, W. (2010). Estimating permeability of sandstone data by nuclear magnetic resonance and spectral-induced polarization. Geophysics, 75 (6), E215-E226. Wei, Li. Y. Q. L. (1994). Suppression of spike interferencein magnetotelluric survey. Chinese Journal of Geophysics(in Chinese), 37(S2), 493-500. Xi, X., Yang, H., He, L. F., & Chen, R. J. (2013). Chromite mapping using

ACCEPTED MANUSCRIPT

induced polarization method based on spread spectrum technology. In

Problems, 2013, 1-7.

US CR IP T

Symposium on the Application of Geophysics to Engineering and Environmental

Xi, X., Yang, H., Zhao, X., Yao, H., Qiu, J., Shen, R., & Chen, R. (2014). Large scale distributed 2D/3D FDIP system based on zigbee network and GPS. In Symposium on the Application of Geophysics to Engineering and

MA N

Environmental Problems, 2014, 130-139.

Zonge, K. L., & Wynn, J. C. (1975). Recent advances and applications in

D

complex resistivity measurements. Geophysics, 40 (5), 851-864.

TE

Ziolkowski, A., Hobbs, B. A., & Wright, D. (2007). Multitransient

AC

CE P

electromagnetic demonstration survey in France. Geophysics, 72 (4), 197-209.

ACCEPTED MANUSCRIPT

Highlights 

For robust statistical methods, most appropriate influence function and

US CR IP T

iterative algorithm are chosen by testing simulated data to suppress outliers’ influence. 

We implement robust processing method for observed spread spectrum induced polarization data.

Robust statistical methods can reduce Gaussian random noise and spike

MA N



noise effectively.

Profile of apparent complex resistivity can more objectively reflect the

CE P

TE

D

actual situation of geological target.

AC