Available online at www.sciencedirect.com
Procedia Engineering
Procedia Engineering 15 (2011) 1942–1946 www.elsevier.com/locate/procedia
Advanced in Control Engineering and Information Science
Scattered Noisy Data Fitting Using Bivariate Splines
Tianhe Zhou, Zhong Li*
Department of Mathematical Sciences, Zhejiang Sci-Tech University, Hangzhou, China, 310018
Abstract
In this paper, we present an extension of the weighted least squares method to fit Hermite scattered data with noise. This method differs from the method in [1], which can only handle Lagrange scattered data. We give some numerical experiments to show the performance of our method. In addition, assuming that the number of noisy data is large enough and that the noise term is uniformly distributed on the interval [-1, 1], we show that the error bound can be improved by averaging the coefficients of several splines constructed by fitting different sets of data.
© 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of [CEIS 2011]
Keywords: Bernstein-Bezier representation of spline; Hermite scattered data; Extension of weighted least squares method
1. Introduction
Spline functions are piecewise polynomial functions which have a certain smoothness over given polygonal domains. They are very flexible for approximating known or unknown functions or any given data sets. In CAGD (computer-aided geometric design) and geology, the reconstruction or approximation of a curve or surface from a scattered data set is a commonly encountered problem.
* Corresponding author. Tel.: +86-571-86843540; Fax: +86-571-86843224. E-mail address:
[email protected]. This research is supported by the National Natural Science Foundation of China under Grant No. 60903143 and 51075421; the Natural Science Foundation of Zhejiang Province of China under Grant No. Y1090141 and Y1110504; the Qianjiang Talent Project of Zhejiang Province of China under Grant No. QJD0902006; the Science Foundation of Zhejiang Sci-Tech University under Grant No. 0813826-Y.
1877-7058 © 2011 Published by Elsevier Ltd. doi:10.1016/j.proeng.2011.08.362
In this paper, we deal with Hermite scattered data. In addition, we present a way to analyze the method from a probabilistic point of view. In many situations we have not only Lagrange scattered data but also Hermite scattered data (cf. [2]). In meteorology, for example, the wind velocity is the derivative of the wind potential function.

Let $V = \{v_i = (x_i, y_i)\}_{i=1}^{N}$ be a set of points lying in a domain $\Omega \subset \mathbb{R}^2$ with polygonal boundary, and let $f_i^{\mu,\nu}$, $0 \le \mu + \nu \le r$, $i = 1, \dots, N$, be a corresponding set of real numbers, where $\mu, \nu$ are nonnegative integers. The data are assumed to be contaminated by noise:

$$f_i^{\mu,\nu} = D_x^{\mu} D_y^{\nu} f(x_i, y_i) + \epsilon_i^{\mu,\nu},$$

where $f$ belongs to the standard Sobolev space $W_\infty^{r}(\Omega)$ and $\epsilon_i^{\mu,\nu}$ are noise terms. We propose to construct a surface $s \in S_d^r(\triangle)$ which minimizes

$$L(s) := \sum_{i=1}^{N} \sum_{0 \le \mu+\nu \le r} \omega_i^{\mu,\nu} \left( |\triangle|^{\mu+\nu} \left| D_x^{\mu} D_y^{\nu} s(v_i) - f_i^{\mu,\nu} \right| \right)^2,$$

where $S_d^r(\triangle)$ is a spline space defined in the next section and $0 \le \omega_i^{\mu,\nu} \le 1$ are weight terms. We call this the Extension of Weighted Least Squares Method. As in [1], when the data are reliable (which means $|\epsilon_i^{\mu,\nu}| = 0$ or $|\epsilon_i^{\mu,\nu}| \ll 1$), we set $\omega_i^{\mu,\nu} = 1$. When some of the data $f_i^{\mu,\nu}$ are not reliable, we set the corresponding weight terms $\omega_i^{\mu,\nu} < 1$ (e.g. $\omega_i^{\mu,\nu} = 0.1, 0.01, \dots$, depending on the size of $|\epsilon_i^{\mu,\nu}|$; that is, we choose the weight terms according to the size of the noise).

2. Probability Analysis
Given a triangulation $\triangle$ and integers $0 \le r < d$, we write
$$S_d^r(\triangle) := \{ s \in C^r(\Omega) : s|_T \in P_d, \ \forall T \in \triangle \}$$

for the usual space of splines of degree $d$ and smoothness $r$, where $P_d$ is the $\binom{d+2}{2}$-dimensional space of bivariate polynomials of degree at most $d$. For each triangle $T = \langle v_1, v_2, v_3 \rangle$ in $\triangle$ with vertices $v_1, v_2, v_3$, the corresponding polynomial piece $s|_T$ is written in the form

$$s|_T = \sum_{i+j+k=d} c_{ijk}^T B_{ijk}^T,$$

where $B_{ijk}^T$ are the Bernstein-Bezier polynomials of degree $d$ associated with $T$. In particular, if $(\lambda_1, \lambda_2, \lambda_3)$ are the barycentric coordinates of any point $u \in \mathbb{R}^2$ relative to the triangle $T$, then

$$B_{ijk}^T(u) := \frac{d!}{i!\,j!\,k!} \lambda_1^i \lambda_2^j \lambda_3^k, \quad i + j + k = d.$$
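To make the Bernstein-Bezier form concrete, the following Python sketch evaluates barycentric coordinates and the basis polynomials $B_{ijk}^T$ on a single triangle. The function names (`barycentric`, `bernstein`, `eval_piece`) and the dictionary layout used for the coefficients $c_{ijk}^T$ are illustrative assumptions and do not come from the paper.

```python
from math import factorial

def barycentric(u, v1, v2, v3):
    """Barycentric coordinates (l1, l2, l3) of point u relative to triangle <v1, v2, v3>."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = u, v1, v2, v3
    det = (x2 - x1) * (y3 - y1) - (x3 - x1) * (y2 - y1)  # twice the signed area
    if det == 0:
        raise ValueError("degenerate triangle")
    l2 = ((x - x1) * (y3 - y1) - (x3 - x1) * (y - y1)) / det
    l3 = ((x2 - x1) * (y - y1) - (x - x1) * (y2 - y1)) / det
    return 1.0 - l2 - l3, l2, l3

def bernstein(d, i, j, k, lam):
    """Bernstein-Bezier polynomial B_ijk of degree d at barycentric coordinates lam."""
    if i + j + k != d:
        raise ValueError("indices must satisfy i + j + k = d")
    l1, l2, l3 = lam
    coef = factorial(d) // (factorial(i) * factorial(j) * factorial(k))
    return coef * l1**i * l2**j * l3**k

def eval_piece(coeffs, d, lam):
    """Evaluate s|_T = sum c_ijk * B_ijk, with coeffs given as {(i, j, k): c_ijk}."""
    return sum(c * bernstein(d, i, j, k, lam) for (i, j, k), c in coeffs.items())

# Hypothetical usage: all coefficients equal to 1 reproduce the constant 1,
# because the basis polynomials sum to one on every triangle.
d = 3
T = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
ones = {(i, j, d - i - j): 1.0 for i in range(d + 1) for j in range(d + 1 - i)}
lam = barycentric((0.2, 0.3), *T)
print(eval_piece(ones, d, lam))
```

The partition-of-unity check in the usage lines is a convenient sanity test for any implementation of this basis.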
More details about spline functions can be found in [3-4].

Suppose there is a sufficiently large number of scattered data with noise. We divide these data into $M$ sets. Let $\{\{f_i^{\mu,\nu}\}_j,\ 0 \le \mu + \nu \le 1,\ i = 1, \dots, N\}_{j=1}^{M}$ be the $M$ sets of scattered data with noise. Each set can be used to construct a spline. Let $Pf_j$ be the spline constructed from the $j$-th data set. We use

$$s := \frac{1}{M} \sum_{j=1}^{M} Pf_j$$

to approximate the original function $f$. It is easy to see that $Pf_1, Pf_2, \dots, Pf_M$ is a sequence of independent random variables, each having the same distribution with mean $\mu$ and variance $\sigma^2$ (here $\mu$ denotes the mean, not the derivative index used above). We suppose that the noise term is uniformly distributed on the interval $[-1, 1]$; then it is easy to see that $\mu = f$ and $\sigma^2 = 1/3$. According to the Central Limit Theorem (cf. [5]), the distribution of

$$\frac{Pf_1 + Pf_2 + \cdots + Pf_M - Mf}{\sigma \sqrt{M}}$$

tends to the standard normal as $M \to \infty$. That is,

$$P\left\{ \frac{Pf_1 + Pf_2 + \cdots + Pf_M - Mf}{\sigma \sqrt{M}} \le c \right\} \to \Phi(c) \quad \text{as } M \to \infty,$$

where $\Phi$ denotes the cumulative distribution function of the standard normal distribution. Then, given a constant $\delta > 0$, we have

$$P\{ |s - f| < \delta \} = P\left\{ \left| \frac{Pf_1 + Pf_2 + \cdots + Pf_M - Mf}{\sigma \sqrt{M}} \right| \le \frac{\delta \sqrt{M}}{\sigma} \right\} \cong 2\Phi\left( \frac{\delta \sqrt{M}}{\sigma} \right) - 1.$$
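The probability estimate $2\Phi(\delta\sqrt{M}/\sigma) - 1$ is easy to evaluate numerically. The short Python sketch below does so with $\sigma^2 = 1/3$, as in the text; the function names and the tolerance $\delta = 0.1$ are illustrative assumptions, not values taken from the paper.

```python
from math import erf, sqrt

def std_normal_cdf(c):
    """Cumulative distribution function Phi of the standard normal distribution."""
    return 0.5 * (1.0 + erf(c / sqrt(2.0)))

def prob_within(delta, M, sigma2=1.0 / 3.0):
    """Approximate P{|s - f| < delta} = 2 * Phi(delta * sqrt(M) / sigma) - 1."""
    sigma = sqrt(sigma2)
    return 2.0 * std_normal_cdf(delta * sqrt(M) / sigma) - 1.0

# Hypothetical tolerance delta = 0.1; the estimate grows towards 1 as M increases.
for M in (20, 50, 100, 250, 500):
    print(M, round(prob_within(0.1, M), 4))
```

For a fixed tolerance the estimate increases with $M$, which is the sense in which averaging more splines improves the error bound.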
According to the Central Limit Theorem, the number $M$ should be as large as possible. In practice, however, the distribution is already close to the standard normal when $M > 12$.

3. Numerical Experiments
3.1. Example 1.
Consider 1500 random points $(x_i, y_i)$ over $[0,1] \times [0,1]$, as shown in Fig. 1. Let

$$\{ (x_i, y_i, D_x^{\mu} D_y^{\nu} f(x_i, y_i) + \epsilon_i^{\mu,\nu}),\ 0 \le \mu + \nu \le 1,\ i = 1, \dots, 1500 \}$$

be a scattered data set, where $f(x, y)$ is the test function and $\epsilon_i^{\mu,\nu}$ are noise. We use the following test functions to generate the scattered data set:

$$f_1(x, y) = 2x^4 + 5y^4,$$

$$f_2(x, y) = \arctan(2x + y^2),$$

$$\begin{aligned} f_3(x, y) = {} & 0.75 \exp(-0.25(9x - 2)^2 - 0.25(9y - 2)^2) \\ & + 0.75 \exp(-(9x + 1)^2/49 - (9y + 1)/10) \\ & + 0.5 \exp(-0.25(9x - 7)^2 - 0.25(9y - 3)^2) \\ & - 0.2 \exp(-(9x - 4)^2 - (9y - 7)^2). \end{aligned}$$

We set $\epsilon_i^{\mu,\nu} = 0$ when the corresponding point $(x_i, y_i)$ is marked with a cross (which means the data are reliable), and set $\epsilon_i^{\mu,\nu}$ to be a random number between -1 and 1 when the corresponding point $(x_i, y_i)$ is marked with a dot (which means the data are not reliable). Next, the weights are set to $\omega_i = 1$ and $\omega_i = 0.01$ for the reliable and unreliable data, respectively. The spline space $S_8^2(\triangle)$ is employed to find the fitting surfaces, where $\triangle$ is the triangulation given in Fig. 1. We compare the maximum errors against the exact function for our method and the traditional least squares method in Table 1. The maximum errors are measured using $101 \times 101$ equally spaced points over $[0,1] \times [0,1]$. From Table 1, we can see that the surface created by the traditional least squares method does not perform well on the test functions, whereas our method produces much better approximate solutions.

Fig. 1. The scattered data and the triangulation
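For readers who want to reproduce the error measure, the Python sketch below defines the three test functions and the maximum error over a $101 \times 101$ equally spaced grid. It is only a sketch: `max_error` is a hypothetical helper name, and $f_2$ is transcribed as $\arctan(2x + y^2)$, which is our reading of the printed formula.

```python
from math import atan, exp

def f1(x, y):
    return 2.0 * x**4 + 5.0 * y**4

def f2(x, y):
    # Transcribed from the text as arctan(2x + y^2); treat this reading as an assumption.
    return atan(2.0 * x + y**2)

def f3(x, y):
    return (0.75 * exp(-0.25 * (9 * x - 2) ** 2 - 0.25 * (9 * y - 2) ** 2)
            + 0.75 * exp(-((9 * x + 1) ** 2) / 49 - (9 * y + 1) / 10)
            + 0.5 * exp(-0.25 * (9 * x - 7) ** 2 - 0.25 * (9 * y - 3) ** 2)
            - 0.2 * exp(-((9 * x - 4) ** 2) - (9 * y - 7) ** 2))

def max_error(approx, exact, n=101):
    """Maximum of |approx - exact| over an n-by-n equally spaced grid on [0,1] x [0,1]."""
    grid = [k / (n - 1) for k in range(n)]
    return max(abs(approx(x, y) - exact(x, y)) for x in grid for y in grid)

# Hypothetical usage: compare a crude stand-in approximation against f1.
print(max_error(lambda x, y: 0.0, f1))
```

Any fitted surface can be passed as `approx`, provided it is callable as a function of `(x, y)`.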
Table 1. The approximation errors

Test function    our method    traditional least squares method
f1(x, y)         0.1071        0.6576
f2(x, y)         0.0185        0.2996
f3(x, y)         0.0161        0.4890
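The effect of the weights in Example 1 can be illustrated on a much smaller problem. The Python sketch below minimizes the same kind of weighted functional, with values and first derivatives as data, but over a single bivariate quadratic polynomial rather than the spline space $S_8^2(\triangle)$. That substitution, the helper names, the parameter `h` (standing in for the scaling factor in front of the derivative terms) and the synthetic data are all assumptions made for illustration; this is not the authors' implementation.

```python
import random

ORDERS = (0, 1, 1)  # mu + nu for the value row, the d/dx row and the d/dy row

def basis_rows(x, y):
    """Monomial basis (1, x, y, x^2, xy, y^2) and its x- and y-derivatives at (x, y)."""
    value = [1.0, x, y, x * x, x * y, y * y]
    d_dx = [0.0, 1.0, 0.0, 2.0 * x, y, 0.0]
    d_dy = [0.0, 0.0, 1.0, 0.0, x, 2.0 * y]
    return value, d_dx, d_dy

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def weighted_hermite_fit(points, data, weights, h=1.0):
    """Minimize sum_i sum_{mu+nu<=1} w * (h^(mu+nu) * residual)^2 via the normal equations.

    points[i] = (x, y); data[i] = (f, f_x, f_y); weights[i] = (w, w_x, w_y).
    """
    n = 6
    A = [[0.0] * n for _ in range(n)]
    b = [0.0] * n
    for (x, y), targets, ws in zip(points, data, weights):
        for row, target, w, order in zip(basis_rows(x, y), targets, ws, ORDERS):
            scale = w * h ** (2 * order)
            for p in range(n):
                b[p] += scale * row[p] * target
                for q in range(n):
                    A[p][q] += scale * row[p] * row[q]
    return solve(A, b)

def exact_data(coef, x, y):
    """Value and first derivatives of the quadratic with coefficients coef at (x, y)."""
    return tuple(sum(c * r for c, r in zip(coef, row)) for row in basis_rows(x, y))

# Synthetic demo: every second point carries uniform noise in [-1, 1].
rng = random.Random(0)
true_coef = [1.0, 2.0, -1.0, 3.0, 0.5, -2.0]
points = [(rng.random(), rng.random()) for _ in range(200)]
data, w_weighted, w_plain = [], [], []
for idx, (x, y) in enumerate(points):
    clean = exact_data(true_coef, x, y)
    noisy = idx % 2 == 1
    data.append(tuple(v + rng.uniform(-1.0, 1.0) for v in clean) if noisy else clean)
    w_weighted.append((0.01,) * 3 if noisy else (1.0,) * 3)
    w_plain.append((1.0,) * 3)

def coef_error(fit):
    return max(abs(a - b) for a, b in zip(fit, true_coef))

print("weighted fit, max coefficient error:  ", coef_error(weighted_hermite_fit(points, data, w_weighted)))
print("unweighted fit, max coefficient error:", coef_error(weighted_hermite_fit(points, data, w_plain)))
```

The normal equations are adequate for a six-coefficient toy problem; a spline fit of realistic size would more likely use a sparse or orthogonalization-based solver.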
3.2. Example 2.
The spline space and test functions are the same as in Example 1. Here, suppose there is a sufficiently large number of scattered data. We divide these data into $M$ sets and use $s := \frac{1}{M} \sum_{j=1}^{M} Pf_j$ to approximate the test functions. Unlike in Example 1, we set all the weight terms $\omega = 1$, regardless of whether the data are reliable or unreliable. Furthermore, we choose different values of $M$ to check how the performance of our method depends on $M$. Some numerical results are given in Table 2. From Table 2, we can see that the error decreases as $M$ becomes larger. However, the error in approximating $f_3(x, y)$ using $M = 500$ is larger than that using $M = 250$. This is because the formula $P\{ |s - f| < \delta \} \cong 2\Phi(\delta\sqrt{3M}) - 1$ is a statement about probability: the bound $|s - f| < \delta$ holds with probability approximately $2\Phi(\delta\sqrt{3M}) - 1$, not with certainty.

Table 2. The approximation errors with different M
Test function    M=20      M=50      M=100     M=250     M=500
f1(x, y)         0.0756    0.0583    0.0301    0.0191    0.0120
f2(x, y)         0.0772    0.0411    0.0322    0.0221    0.0131
f3(x, y)         0.1265    0.0981    0.0976    0.0878    0.0902
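The probabilistic nature of the estimate can be checked with a small simulation. The Python sketch below averages $M$ independent noise samples, uniform on $[-1, 1]$, and compares the observed frequency of $|\text{mean}| < \delta$ with $2\Phi(\delta\sqrt{3M}) - 1$. It is a scalar stand-in for a single data value, not a spline fit, and the function names, $\delta = 0.1$, the number of trials and the seed are all illustrative assumptions.

```python
import random
from math import erf, sqrt

def clt_probability(delta, M):
    """2 * Phi(delta * sqrt(3M)) - 1, using the identity 2 * Phi(c) - 1 = erf(c / sqrt(2))."""
    return erf(delta * sqrt(3.0 * M) / sqrt(2.0))

def empirical_probability(delta, M, trials=2000, seed=0):
    """Fraction of trials in which the mean of M uniform[-1, 1] samples lies within delta of 0."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean_noise = sum(rng.uniform(-1.0, 1.0) for _ in range(M)) / M
        if abs(mean_noise) < delta:
            hits += 1
    return hits / trials

# Hypothetical tolerance delta = 0.1, evaluated for the values of M used in Table 2.
for M in (20, 50, 100, 250, 500):
    print(M, round(clt_probability(0.1, M), 3), empirical_probability(0.1, M))
```

Because each run is a finite sample, the observed frequency fluctuates around the estimate, which is consistent with an occasional non-monotone entry such as the $f_3$ row of Table 2.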
References
[1] T. Zhou and D. Han, A weighted least squares method for scattered data fitting, J. Comput. Appl. Math. 2008; 217:56-63.
[2] T. Zhou, D. Han and M. J. Lai, Energy Minimization Method for Scattered Data Hermite Interpolation, Appl. Numer. Math. 2008; 58:646-659.
[3] M. J. Lai and L. L. Schumaker, Spline Functions on Triangulations, Cambridge University Press, Cambridge, 2007.
[4] M. von Golitschek, M. J. Lai and L. L. Schumaker, Error bounds for minimal energy bivariate polynomial splines, Numer. Math. 2002; 93:315-331.
[5] K. Subrahmaniam, A Primer in Probability, Second Edition, Revised and Expanded, Marcel Dekker, Inc., 1990.