Outlier detection algorithm for satellite gravity gradiometry data using wavelet shrinkage de-noising

Outlier detection algorithm for satellite gravity gradiometry data using wavelet shrinkage de-noising

Geodesy and Geodynamics 2012,3(2) :47-52 http://www. jgg09. com Doi:l0.3724/SP.J.l246.2012.00047 Outlier detection algorithm for satellite gravity ...

4MB Sizes 0 Downloads 73 Views

Geodesy and Geodynamics

2012,3(2) :47-52

http://www. jgg09. com Doi:l0.3724/SP.J.l246.2012.00047

Outlier detection algorithm for satellite gravity gradiometry data using wavelet shrinkage de-noising Wu Yunlong 1 ' 2 , Li Hui 1 ' 2 , Zou Zhengbo 1 ' 2 , Kang Kabman 1 ' 2 and Muhammad Sadiq3 1

Key lAboratory of Earthqw.ke Geodesy, ImtiJiae of Sei.mwlogy, China Earthqw.ke Administration, Wuhon 430071 , Chino

2

Crustal Movement Laboratory, Wuhan 430071, China

3

Deparmumt of Earth &iences, Quaid-i-Azam Umversiiy, Islamabad 45320, Pakistan

Abstract: On the basis of wavelet theory, we propose an outlier-detection algorithm for satellite gravity gradiometry by applying a wavelet-shrinkage-de-noising method to some simulation data with white noise and outliers. The result shows that this novel algorithm has a 97% success rate in outlier identification and that it can be efficiently used for pre-processing real satellite gravity gradiometry data.

Key words: satellite gravity gradiometry ; outlier detection; wavelet shrinkage ; threshold ; Haar wavelet

key step in GOCE data pre-process, and the purpose of

1 Introduction

which is to identify and remove outliers in the GOCE GG data, which may then be used for earth gravity-

The GOCE (Gravity field and steady-state Ocean Cir-

model determination. As is well known, it is impossi-

culation Explorer) satellite has been io orbit for 2 years

ble to ascertain the number and distribution of outliers

since its successful launch on 27 March 2009 by ESA

in the data obtained from real mission environment.

(European Space Agency) . Its mission is to determine

Thus , outlier-detection algorithms should be developed

a unique earth gravity field model and its geoid on a

and vertified by some simulation study , so that certain

global scale with high accuracy and spatial resolu-

novel outlier detection algorithms may be used in a sat-

tion [11 • The payload consists of an electrostatic gradiom-

ellite gravity gradiometry data pre-processing program.

eter and a combined GPS/GLONASS precise positioning

Existing outlier detection algorithms in simulation

system. However, outliers ( grossly inconsistent data

study are derived from statistical methods, including

points) exist in the GOCE gravity graident ( GG) meas-

Thresholding, Mahalanobis distance , Grubbs ' s test,

urements due to inevitable measurement noise , instru-

and Dixon test. The tmditional statistical methods are

ment imperfection, attitude error, misreading and calcu-

fast algorithms and can be easily applied to outlier de-

lation error. A simulation study indicates that even a

small 0. 2% of outliers in GOCE gravity graident ( GG) observations can lead to an adverse effect on gravity-

model deternrination. Therefore , outlier detection is a Received;2012.Q4-10; Aocepted;20!2.()5.()1

tection[2-81. However, statistical methods suffer the

disadvantages of small data set and low success rate in detecting outliers. The performance is especially unsatisfactory, if only a siogle statistical method is applied to gravity gradiometry , which is affected by multiple

Corresponding author: Wu Yunlong, Tel: + 86-13886175033, E-mail:

factors. In view of the characteristics of the gravity-ra-

yunlongwu@ gmail. com

diometry data, such as huge data sets and wide error

This work is supported by the Director Foundation of the Institute of Seis-

mology , China Earthquake Administration ( IS20 1126025 ) ; The Basis Research Foundation of Key laboratory of Geospace Environment & Geod-

esy Ministry of Education, China (10-01-09)

resources, a more novel outlier-detection algorithm

should be developed. The wavelets method has the characterstics of good

48

Vol. 3

Geodesy and Geodynamics

display of time-frequency and multi-resolution analy-

nation. Lastly, a bulk or block outlier, typically

sis. It is widely applied in signal de-noising, f:tlter

caused by instrument malfunction.

process, numerical calculation and data analysis in geodesy and geophysics research[ 9 ]. In this paper, we

When selecting a wavelets analysis for de-noising, there are normally two criteriaes:

present some outlier-detection algorithms in satellite

( 1 ) smoothness: The de-noised signal should keep

gravity gradiometry using wavelets shrinkage de-noi-

the same smoothness as the raw signal under most cir-

sing, based on the wavelets theory. We then generate

cumstances. ( 2 ) similarity : The variance estimate between the

simulation data sets with white noise and outliers , and lastly test the effectiveness and reliability of the devel-

de-noised and the raw signals should be minimum

oped outlier-detection algorithm by simulated computa-

der the worst condition.

tion and analysis.

un-

Commonly used methods for wavelet de-noising are modulus-maximum de-noising, correlation de-noising ,

2

Wavelets method for outlier detec-

translation-invariant-wavelet de-noising, and waveletshrinkage de-noising[ll-l3l. The wavelet shrinkage de-

tion

noising method is based on the principle of minimal vaOutliers in a set of data are points that are grossly inconsistent with the remainder of the data set[

101

riance and it determines the thresholds by unbiased

Their

risk estimation of the coefficients , resulting in de-

values are beyond possible maximum errors of normal

noised signal that can best satisfy the above-mentioned

measurements. There are many reasons for their occur-

criteria. It shows better effect than the other methods,

rence , including instrument malfunction, misreading,

and is thus widely used for research and application.

and miscalculation. Outliers should be detected and re-

Because of its good characteristics in time-frequency

placed by some normal values before data processing. Three types of outliers may occur in a data set[lOI

characterization and its strong adaptivity, it can be used in error analysis for different frequencies. Thus,

(Fig. 1 ) . Firstly, an additive outlier is an apparently

in this study, we apply this method to outlier detection

isolated data spike often superposed on the signal. Sec-

for satellite gravity gradiometry data.



ondly, an innovative outlier often occurs at a place

We use the well- known Haar wavelets for a time se-

where the signal itself has an extreme value and is dif-

ries. The Harr wavelet coefficients are transformed as

ficult to identify; it is caused by measurement contami-

follow: bulk or block:

innovative additive Figure 1

Tlnee types of outliers[IOl

Wu Yunlong,et al. Outlier detection algorithm for satellite gravity gradiometry No.2

49

data using wavelet shrinkage de-noising

the same. The residual data series ri can be computed by the

(1)

reconstructed signals x"' ri

=X; -

x~ , i

=1 , ·· · , n

(3)

where s 1 , 11 are the smoothed wavelet coefficients of the jth level; d1 ,k are the detailed wavelet coefficients.

The position of an outlier can be identified by the re-

When applied to satellite gravity gradiometry data ,

sidual series r;.

the wavelet coefficients of the jth level are transformed ,

By following these steps , outliers in the data appear

using wavelet transformation ; then a threshold is deter-

as spikes in the residual signal, and thus can be easily

mined for the coefficients' soft-threshold procedure. A

found. The detection scheme is shown in figure 2.

typical threshold is selected with respect to the expected signal/ noise level and the minimum magnitude of the outliers[ 8 l. For the simulation computation in this work, the threshold is determined by the characteristics-simulation data. In a real satellite gravity gradiometry data de-noising process in the future, the added outliers can be identified by changing the threshold. Also the initial guess of the threshold , which is the standard deviation of the detailed coefficients , can be used in the outlier detection.

3

Quality assessments

For quality assessment of the performance of the outlier-detection methods in the simulation computation of SGG data, the following two ratios are introduced. ( 1 ) The outlier rate of success ( ORS) , which describes the number of correctly identified outliers ( n' ) with respect to the number of all outliers ( n °) ,

The reconstructed satellite-gravity-gradiometry signals

x~

are transformed by inverse wavelet , using

0RS=n'ln°

(4)

threshold coefficients. ( 2) The outlier rate of failure ( ORF) , which provides information about incorrectly detected outliers

( r/) with respect to all data points ( n) ,

(2)

ORF=rlln

(5)

where d' l,k are detailed coefficients of the jth level after

Note that both ORS and ORF can only be compu-

threshold; the smoothed wavelet coefficients s 1 , 11 remain

ted in the simulation study, but are unknown in a real

Figure 2

Wavelet outlier detecting scheme for gravity gradiometxy data

50

Vol. 3

Geodesy and Geodynamics

GOCE-data set. These ratios are used to evalute the

the computation of the simulated data, and then the

outlier detection methods in the simulation study, and

detected outliers are replaced by interpolation values.

to provide a reference for pre-processing of the real

The results of computation are given in table 3 and fig-

GOCE data.

ure 3 , the latter showing data sets both before and after the outlier detection.

4

Computations and analysis

The result indicates that most outliers can be detected by this method, with a total ORS reaching 94. 7% , or

4.1

161 outliers detected. On the other hand, there are

Data

536 true values misjudged as outliers, and the ORF is

( 1 ) Model gradients

3. 1%. The detection detail shows that most innovative

The Vzz tensor of the SGG measurement is used in the

outliers can be processed succesafully. The undetected

computation. The model gradients are simulated from

outliers are within a bulk of outliers , the reason being

the geopotential model EGM96 complete up to degree

that an individual outlier ( at least four points wide )

and order 300, based on GRS80 reference ellipsoid.

has caused a characteristic pattern after application of

The data set covers 17280 points along the GOCE orbit

the wavelet algorithm. For bulk outliers, the recogni-

of revolution for 1 day, with sampling interval of 5 sec-

tion of characteristic pattern is less sensitive , and the

onds[I•-ISJ. The generated model gradients are taken

success rate is lower than the innovative outliers.

as a clean data set for the following computation. The

In figure 4 , the red curve is the raw signal and the

parameters used for data simulation are listed in table

blue line, the signal after removal of outliers. The

1 , and the statistics of simulated gravity gradiometry

detected outliers become clear by stacking the two

data in table 2.

data sets together. The result indicates that the wavelet

( 2) Gradients with white noise and outliers

shrinkage de-noising method may be applied to GOCE

The standard deviationis a is computed first, and then

data pre-process for outlier detection.

normally distributed white noises are added in the simulation gradients data with 0 as the expected value and 0. 01 a as the standard deviation. Approximately 1% of the 170 data points in the data set were infected by outliers. These outliers are ran-

domly varying absolute values of ( 1 - 5 )

X

Table 1

Related parameters used for data simulation

Reference Semimajor axis ellipsoid ( km)

GRSSO

Cycle Height Eccenln.CI'ty Inclination ( degree) ( second) ( km)

6628

0.001

96.7

5375

250

10 -• s _,,

being composed of the following : (a) 150 outliers are randomly distributed in the data

Statistic of simulated gravity gradiometry data(Unit:lO-'s _,)

Table 2

set as innovative outliers.

(b) One group of outliers are added as block outliers with a length of 20 points. This is to simulate unreliable data caused by unstable satellite environment.

Maximum

Minimum

Average

Staodanl

Vzz 2754.645033 2729.279761 2740.099471 7. 560515653

The simulation data with white noise and outliers are generated as mentioned above for the following compuTable 3

tation.

( 3 ) Real GOCE data

Results of outlier detection in gravity gradiometry data( Unit: %)

The real GOCE data section is also selected for compu-

Outlier rate of suceess

Outlier rate

Innovative outliers

98.3

3.2

Bulk outliers

63

1

Total

97.6

3.1

of failare

tation, in order to check the outlier detection method

for the simulation study.

4. 2

Results and analysis

The wavelet shrinkage de-noising method is applied to

Wu Yunlong,et al. Outlier detection algorithm for satellite gravity gradiometry No.2

....

~ll

J~

51

data using wavelet shrinkage de-noising 2770 2765 2760 2755 2750 2745

~ 2740 ~ 2735

2730 2725

12730 ~

0

1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 ~ Paints ( a) Original data with white noise and outliers Figure 3

( b) The data after outlier removal

Original data and the data after outlier removal

2760 1

-

Rawdata After detection

2755

...

~

-=

2750

e:

2745

0

~

0

·~

~

.<:> 0

-;:; 2740

....

:;; "

""

3"

2735

-.;"

"'

2730

2725 0

2000

4000

6000

10000

8000

12000

14000

16000

18000

Paints

Figure 4

5

Effect of outlier detection for real GOCE data using wavelet analysis method

Conclusion and outlook

In this paper, a wavelet-shrinkage-de-noising method

References [ 1]

ports for Mission Selection, the Four Candidate Earth Explorer

for satellite gravity gradiometry ( SGG) measurements has been presented and evaluated. The computation re-

ESA. Gravity field and steady-state ocean circulation mission. Re-

Core Missions, ESA SP-1233(1), 1999. [ 2]

Luo Zhicai, et al. Pre-processing of the GOCE satellite gravity gra-

sult indicates that this method can reach to a high out-

diometry data. Geometries and Information Science of Wuhan Uni-

lier rate of success and a low outlier rate of failure, and

versity , 2009 , 34 ( 10 ) : 1163 - 1167 . ( in Chinese)

is adaptable to various outlier types. It shows that this

[ 3]

method can be a good pre-processing algorithm for the detection of outliers in real SGG data.

[ 4]

Bouman J, Koop R, Tscherning C and Visser P. Calibration of GOCE SGG data using high-low SST, terrestrial gravity data, and

In the future , a more detailed study , including more

global gravity field models. Journal of Geodesy, 2004, 78 : 124 -

suitable wavelet-base selection and wavelet-transform level , will be carried out in order to discover the most

Bouman J. Quick-look outlier detection for GOCE gravity gradients. Newton' s Bulletin 2, 2004 , 78 - 87.

137. [5 ]

Barnett V and Lewis T . Outliers in statistical data ( 3 rd edn) . John Wiley, Chichester, 1994.

effective outlier detection method for SGG data. [ 6]

Xu Tianhe, et al. Outlier snooping based on the test statistic of moving windows and it' s applications in GOCE data preprocessing. Acta Geodaetica et Cartographica Sinica,2009 ,38 ( 5) :391 -

52

Geodesy and Geodynamics Journal of Geodesy, 2005,78,509-519.

396. (in Chinese)

[7]

Koop R, Bouman J, Schrama E and Visser P. Calibration and er-

[ 11 ]

ror assessment of GOCE data. In: Schwarz A. lAG Symp Proc

Heidelberg New York, 2002.

Booman J, Kern M, Koop R, Pail R,

(7) ,56-58. (in Chine•e) [ 12]

J1aawnans

berger T. Comparison of outlier detection algorithms for GOCE

Chineoe) [13]

-88.

[14]

dyoami.,, 2010,30(2) ,71-75. (in Chinese)

and its progress. Geomatics and Information Science of Wuhan U-

[10]

Kern M,Preimesherger T,allesch M, et al. Outlier detection al-

gorithms and their pedOmnmce in GOCE gravity field processing.

Xu Xinyu,et al. Research on analysis and simulation of gravity gradiometty error of GOCE satellite. Journal of Geodesy and Geo-

Ning Jinsheng, et al. Applications of wavelet analysis in geodesy

nivemity. 2004, 29(8) :659-664. (in Chinese)

Qing Qianqing, et al. Applications of wavelet analysis. Xi' an: Xidian University Press, 1994 . (in Chinese)

posia 129 - Gravity, 2005, Geoid and Space Missions Springer ,83

[ 9]

Lin Jianxin, et al. Application of wavelet analysis in seismic data denoising. Progress in Geophysics, 2006,21(2) :541-545. (in

R ond PreDnes-

gravity gradients. In Jekeli C,Bastos Land Fernandes J. lAG Sym-

Li Haidong , et al. Wavelet denoising based on technique of

threshold. Computer Technology and Development, 2009, 19

125, Vistas for geodesy in the new millenium. Springer, Berlin

[ 8]

Vol. 3

[15]

Preimesberger T and Pail R. GOCE quick-look gravity solution: application of the seminalytic approach in the case of data gaps and non-repeat mbi". Stud. Ceoph Geod. ,2003,(47) ,435 - 453.