Geodesy and Geodynamics
2012,3(2) :47-52
http://www. jgg09. com Doi:l0.3724/SP.J.l246.2012.00047
Outlier detection algorithm for satellite gravity gradiometry data using wavelet shrinkage de-noising Wu Yunlong 1 ' 2 , Li Hui 1 ' 2 , Zou Zhengbo 1 ' 2 , Kang Kabman 1 ' 2 and Muhammad Sadiq3 1
Key lAboratory of Earthqw.ke Geodesy, ImtiJiae of Sei.mwlogy, China Earthqw.ke Administration, Wuhon 430071 , Chino
2
Crustal Movement Laboratory, Wuhan 430071, China
3
Deparmumt of Earth &iences, Quaid-i-Azam Umversiiy, Islamabad 45320, Pakistan
Abstract: On the basis of wavelet theory, we propose an outlier-detection algorithm for satellite gravity gradiometry by applying a wavelet-shrinkage-de-noising method to some simulation data with white noise and outliers. The result shows that this novel algorithm has a 97% success rate in outlier identification and that it can be efficiently used for pre-processing real satellite gravity gradiometry data.
Key words: satellite gravity gradiometry ; outlier detection; wavelet shrinkage ; threshold ; Haar wavelet
key step in GOCE data pre-process, and the purpose of
1 Introduction
which is to identify and remove outliers in the GOCE GG data, which may then be used for earth gravity-
The GOCE (Gravity field and steady-state Ocean Cir-
model determination. As is well known, it is impossi-
culation Explorer) satellite has been io orbit for 2 years
ble to ascertain the number and distribution of outliers
since its successful launch on 27 March 2009 by ESA
in the data obtained from real mission environment.
(European Space Agency) . Its mission is to determine
Thus , outlier-detection algorithms should be developed
a unique earth gravity field model and its geoid on a
and vertified by some simulation study , so that certain
global scale with high accuracy and spatial resolu-
novel outlier detection algorithms may be used in a sat-
tion [11 • The payload consists of an electrostatic gradiom-
ellite gravity gradiometry data pre-processing program.
eter and a combined GPS/GLONASS precise positioning
Existing outlier detection algorithms in simulation
system. However, outliers ( grossly inconsistent data
study are derived from statistical methods, including
points) exist in the GOCE gravity graident ( GG) meas-
Thresholding, Mahalanobis distance , Grubbs ' s test,
urements due to inevitable measurement noise , instru-
and Dixon test. The tmditional statistical methods are
ment imperfection, attitude error, misreading and calcu-
fast algorithms and can be easily applied to outlier de-
lation error. A simulation study indicates that even a
small 0. 2% of outliers in GOCE gravity graident ( GG) observations can lead to an adverse effect on gravity-
model deternrination. Therefore , outlier detection is a Received;2012.Q4-10; Aocepted;20!2.()5.()1
tection[2-81. However, statistical methods suffer the
disadvantages of small data set and low success rate in detecting outliers. The performance is especially unsatisfactory, if only a siogle statistical method is applied to gravity gradiometry , which is affected by multiple
Corresponding author: Wu Yunlong, Tel: + 86-13886175033, E-mail:
factors. In view of the characteristics of the gravity-ra-
yunlongwu@ gmail. com
diometry data, such as huge data sets and wide error
This work is supported by the Director Foundation of the Institute of Seis-
mology , China Earthquake Administration ( IS20 1126025 ) ; The Basis Research Foundation of Key laboratory of Geospace Environment & Geod-
esy Ministry of Education, China (10-01-09)
resources, a more novel outlier-detection algorithm
should be developed. The wavelets method has the characterstics of good
48
Vol. 3
Geodesy and Geodynamics
display of time-frequency and multi-resolution analy-
nation. Lastly, a bulk or block outlier, typically
sis. It is widely applied in signal de-noising, f:tlter
caused by instrument malfunction.
process, numerical calculation and data analysis in geodesy and geophysics research[ 9 ]. In this paper, we
When selecting a wavelets analysis for de-noising, there are normally two criteriaes:
present some outlier-detection algorithms in satellite
( 1 ) smoothness: The de-noised signal should keep
gravity gradiometry using wavelets shrinkage de-noi-
the same smoothness as the raw signal under most cir-
sing, based on the wavelets theory. We then generate
cumstances. ( 2 ) similarity : The variance estimate between the
simulation data sets with white noise and outliers , and lastly test the effectiveness and reliability of the devel-
de-noised and the raw signals should be minimum
oped outlier-detection algorithm by simulated computa-
der the worst condition.
tion and analysis.
un-
Commonly used methods for wavelet de-noising are modulus-maximum de-noising, correlation de-noising ,
2
Wavelets method for outlier detec-
translation-invariant-wavelet de-noising, and waveletshrinkage de-noising[ll-l3l. The wavelet shrinkage de-
tion
noising method is based on the principle of minimal vaOutliers in a set of data are points that are grossly inconsistent with the remainder of the data set[
101
riance and it determines the thresholds by unbiased
Their
risk estimation of the coefficients , resulting in de-
values are beyond possible maximum errors of normal
noised signal that can best satisfy the above-mentioned
measurements. There are many reasons for their occur-
criteria. It shows better effect than the other methods,
rence , including instrument malfunction, misreading,
and is thus widely used for research and application.
and miscalculation. Outliers should be detected and re-
Because of its good characteristics in time-frequency
placed by some normal values before data processing. Three types of outliers may occur in a data set[lOI
characterization and its strong adaptivity, it can be used in error analysis for different frequencies. Thus,
(Fig. 1 ) . Firstly, an additive outlier is an apparently
in this study, we apply this method to outlier detection
isolated data spike often superposed on the signal. Sec-
for satellite gravity gradiometry data.
•
ondly, an innovative outlier often occurs at a place
We use the well- known Haar wavelets for a time se-
where the signal itself has an extreme value and is dif-
ries. The Harr wavelet coefficients are transformed as
ficult to identify; it is caused by measurement contami-
follow: bulk or block:
innovative additive Figure 1
Tlnee types of outliers[IOl
Wu Yunlong,et al. Outlier detection algorithm for satellite gravity gradiometry No.2
49
data using wavelet shrinkage de-noising
the same. The residual data series ri can be computed by the
(1)
reconstructed signals x"' ri
=X; -
x~ , i
=1 , ·· · , n
(3)
where s 1 , 11 are the smoothed wavelet coefficients of the jth level; d1 ,k are the detailed wavelet coefficients.
The position of an outlier can be identified by the re-
When applied to satellite gravity gradiometry data ,
sidual series r;.
the wavelet coefficients of the jth level are transformed ,
By following these steps , outliers in the data appear
using wavelet transformation ; then a threshold is deter-
as spikes in the residual signal, and thus can be easily
mined for the coefficients' soft-threshold procedure. A
found. The detection scheme is shown in figure 2.
typical threshold is selected with respect to the expected signal/ noise level and the minimum magnitude of the outliers[ 8 l. For the simulation computation in this work, the threshold is determined by the characteristics-simulation data. In a real satellite gravity gradiometry data de-noising process in the future, the added outliers can be identified by changing the threshold. Also the initial guess of the threshold , which is the standard deviation of the detailed coefficients , can be used in the outlier detection.
3
Quality assessments
For quality assessment of the performance of the outlier-detection methods in the simulation computation of SGG data, the following two ratios are introduced. ( 1 ) The outlier rate of success ( ORS) , which describes the number of correctly identified outliers ( n' ) with respect to the number of all outliers ( n °) ,
The reconstructed satellite-gravity-gradiometry signals
x~
are transformed by inverse wavelet , using
0RS=n'ln°
(4)
threshold coefficients. ( 2) The outlier rate of failure ( ORF) , which provides information about incorrectly detected outliers
( r/) with respect to all data points ( n) ,
(2)
ORF=rlln
(5)
where d' l,k are detailed coefficients of the jth level after
Note that both ORS and ORF can only be compu-
threshold; the smoothed wavelet coefficients s 1 , 11 remain
ted in the simulation study, but are unknown in a real
Figure 2
Wavelet outlier detecting scheme for gravity gradiometxy data
50
Vol. 3
Geodesy and Geodynamics
GOCE-data set. These ratios are used to evalute the
the computation of the simulated data, and then the
outlier detection methods in the simulation study, and
detected outliers are replaced by interpolation values.
to provide a reference for pre-processing of the real
The results of computation are given in table 3 and fig-
GOCE data.
ure 3 , the latter showing data sets both before and after the outlier detection.
4
Computations and analysis
The result indicates that most outliers can be detected by this method, with a total ORS reaching 94. 7% , or
4.1
161 outliers detected. On the other hand, there are
Data
536 true values misjudged as outliers, and the ORF is
( 1 ) Model gradients
3. 1%. The detection detail shows that most innovative
The Vzz tensor of the SGG measurement is used in the
outliers can be processed succesafully. The undetected
computation. The model gradients are simulated from
outliers are within a bulk of outliers , the reason being
the geopotential model EGM96 complete up to degree
that an individual outlier ( at least four points wide )
and order 300, based on GRS80 reference ellipsoid.
has caused a characteristic pattern after application of
The data set covers 17280 points along the GOCE orbit
the wavelet algorithm. For bulk outliers, the recogni-
of revolution for 1 day, with sampling interval of 5 sec-
tion of characteristic pattern is less sensitive , and the
onds[I•-ISJ. The generated model gradients are taken
success rate is lower than the innovative outliers.
as a clean data set for the following computation. The
In figure 4 , the red curve is the raw signal and the
parameters used for data simulation are listed in table
blue line, the signal after removal of outliers. The
1 , and the statistics of simulated gravity gradiometry
detected outliers become clear by stacking the two
data in table 2.
data sets together. The result indicates that the wavelet
( 2) Gradients with white noise and outliers
shrinkage de-noising method may be applied to GOCE
The standard deviationis a is computed first, and then
data pre-process for outlier detection.
normally distributed white noises are added in the simulation gradients data with 0 as the expected value and 0. 01 a as the standard deviation. Approximately 1% of the 170 data points in the data set were infected by outliers. These outliers are ran-
domly varying absolute values of ( 1 - 5 )
X
Table 1
Related parameters used for data simulation
Reference Semimajor axis ellipsoid ( km)
GRSSO
Cycle Height Eccenln.CI'ty Inclination ( degree) ( second) ( km)
6628
0.001
96.7
5375
250
10 -• s _,,
being composed of the following : (a) 150 outliers are randomly distributed in the data
Statistic of simulated gravity gradiometry data(Unit:lO-'s _,)
Table 2
set as innovative outliers.
(b) One group of outliers are added as block outliers with a length of 20 points. This is to simulate unreliable data caused by unstable satellite environment.
Maximum
Minimum
Average
Staodanl
Vzz 2754.645033 2729.279761 2740.099471 7. 560515653
The simulation data with white noise and outliers are generated as mentioned above for the following compuTable 3
tation.
( 3 ) Real GOCE data
Results of outlier detection in gravity gradiometry data( Unit: %)
The real GOCE data section is also selected for compu-
Outlier rate of suceess
Outlier rate
Innovative outliers
98.3
3.2
Bulk outliers
63
1
Total
97.6
3.1
of failare
tation, in order to check the outlier detection method
for the simulation study.
4. 2
Results and analysis
The wavelet shrinkage de-noising method is applied to
Wu Yunlong,et al. Outlier detection algorithm for satellite gravity gradiometry No.2
....
~ll
J~
51
data using wavelet shrinkage de-noising 2770 2765 2760 2755 2750 2745
~ 2740 ~ 2735
2730 2725
12730 ~
0
1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 ~ Paints ( a) Original data with white noise and outliers Figure 3
( b) The data after outlier removal
Original data and the data after outlier removal
2760 1
-
Rawdata After detection
2755
...
~
-=
2750
e:
2745
0
~
0
·~
~
.<:> 0
-;:; 2740
....
:;; "
""
3"
2735
-.;"
"'
2730
2725 0
2000
4000
6000
10000
8000
12000
14000
16000
18000
Paints
Figure 4
5
Effect of outlier detection for real GOCE data using wavelet analysis method
Conclusion and outlook
In this paper, a wavelet-shrinkage-de-noising method
References [ 1]
ports for Mission Selection, the Four Candidate Earth Explorer
for satellite gravity gradiometry ( SGG) measurements has been presented and evaluated. The computation re-
ESA. Gravity field and steady-state ocean circulation mission. Re-
Core Missions, ESA SP-1233(1), 1999. [ 2]
Luo Zhicai, et al. Pre-processing of the GOCE satellite gravity gra-
sult indicates that this method can reach to a high out-
diometry data. Geometries and Information Science of Wuhan Uni-
lier rate of success and a low outlier rate of failure, and
versity , 2009 , 34 ( 10 ) : 1163 - 1167 . ( in Chinese)
is adaptable to various outlier types. It shows that this
[ 3]
method can be a good pre-processing algorithm for the detection of outliers in real SGG data.
[ 4]
Bouman J, Koop R, Tscherning C and Visser P. Calibration of GOCE SGG data using high-low SST, terrestrial gravity data, and
In the future , a more detailed study , including more
global gravity field models. Journal of Geodesy, 2004, 78 : 124 -
suitable wavelet-base selection and wavelet-transform level , will be carried out in order to discover the most
Bouman J. Quick-look outlier detection for GOCE gravity gradients. Newton' s Bulletin 2, 2004 , 78 - 87.
137. [5 ]
Barnett V and Lewis T . Outliers in statistical data ( 3 rd edn) . John Wiley, Chichester, 1994.
effective outlier detection method for SGG data. [ 6]
Xu Tianhe, et al. Outlier snooping based on the test statistic of moving windows and it' s applications in GOCE data preprocessing. Acta Geodaetica et Cartographica Sinica,2009 ,38 ( 5) :391 -
52
Geodesy and Geodynamics Journal of Geodesy, 2005,78,509-519.
396. (in Chinese)
[7]
Koop R, Bouman J, Schrama E and Visser P. Calibration and er-
[ 11 ]
ror assessment of GOCE data. In: Schwarz A. lAG Symp Proc
Heidelberg New York, 2002.
Booman J, Kern M, Koop R, Pail R,
(7) ,56-58. (in Chine•e) [ 12]
J1aawnans
berger T. Comparison of outlier detection algorithms for GOCE
Chineoe) [13]
-88.
[14]
dyoami.,, 2010,30(2) ,71-75. (in Chinese)
and its progress. Geomatics and Information Science of Wuhan U-
[10]
Kern M,Preimesherger T,allesch M, et al. Outlier detection al-
gorithms and their pedOmnmce in GOCE gravity field processing.
Xu Xinyu,et al. Research on analysis and simulation of gravity gradiometty error of GOCE satellite. Journal of Geodesy and Geo-
Ning Jinsheng, et al. Applications of wavelet analysis in geodesy
nivemity. 2004, 29(8) :659-664. (in Chinese)
Qing Qianqing, et al. Applications of wavelet analysis. Xi' an: Xidian University Press, 1994 . (in Chinese)
posia 129 - Gravity, 2005, Geoid and Space Missions Springer ,83
[ 9]
Lin Jianxin, et al. Application of wavelet analysis in seismic data denoising. Progress in Geophysics, 2006,21(2) :541-545. (in
R ond PreDnes-
gravity gradients. In Jekeli C,Bastos Land Fernandes J. lAG Sym-
Li Haidong , et al. Wavelet denoising based on technique of
threshold. Computer Technology and Development, 2009, 19
125, Vistas for geodesy in the new millenium. Springer, Berlin
[ 8]
Vol. 3
[15]
Preimesberger T and Pail R. GOCE quick-look gravity solution: application of the seminalytic approach in the case of data gaps and non-repeat mbi". Stud. Ceoph Geod. ,2003,(47) ,435 - 453.