$\dfrac{\partial}{\partial\Theta}\displaystyle\int \tau_0(x)f(x,\Theta)\,dx = 0$, we get
$$E_\Theta[(\xi_1 + \dots + \xi_n)/n] = \Theta.$$
Further,
$$0 = \frac{\partial}{\partial\Theta}\int \tau_0(x)\Big(-\frac{n}{\Theta} + \frac{\mathbf{1}'x}{\Theta^2}\Big)f(x,\Theta)\,dx = \int \tau_0(x)\,\frac{1}{\Theta^4}\big[(\mathbf{1}'x)^2 - 2(n+1)\Theta\,\mathbf{1}'x + n(n+1)\Theta^2\big]f(x,\Theta)\,dx,$$
thus also $(\mathbf{1}'\xi)^2$ is the UBUE of its mean value. As
$$E_\Theta[(\xi_1 + \dots + \xi_n)^2] = D_\Theta(\xi_1 + \dots + \xi_n) + [E_\Theta(\xi_1 + \dots + \xi_n)]^2 = n\Theta^2 + n^2\Theta^2,$$
it follows that $E_\Theta[(\mathbf{1}'\xi)^2] = n(n+1)\Theta^2$. If $g(\Theta) = \Theta^2$, $\Theta\in(0,\infty)$, then the UBUE is $\tau_g(\xi) = (\mathbf{1}'\xi)^2/[n(n+1)]$, and $D_\Theta[\tau_g(\xi)] = (4n+6)\Theta^4/[n(n+1)]$. The reader is referred to Example 3.2.1.5 and especially to Example 3.2.1.6 as representing a continuation of Example 3.7.

Example 3.8. Let
$$(dP_\Theta^\xi/d\lambda)(x) = f(x,\Theta) = f_0(x)\exp[\tau(x)Q(\Theta) + R(\Theta)],$$
where $\Theta\in\Theta$, $\tau_0\in\mathfrak{F}$. Let $Q(\cdot)$ and $R(\cdot)$ be differentiable. Then
$$\int \tau_0(x)f_0(x)\exp[\tau(x)Q(\Theta) + R(\Theta)]\,d\lambda(x) = \Phi(\Theta) = 0 \Rightarrow$$
$$\Rightarrow 0 = \frac{d}{d\Theta}\int \tau_0(x)f_0(x)\exp[\tau(x)Q(\Theta) + R(\Theta)]\,d\lambda(x) = \int \tau_0(x)\big[\tau(x)\,dQ(\Theta)/d\Theta + dR(\Theta)/d\Theta\big]f(x,\Theta)\,d\lambda(x).$$
Thus $\tau(\xi)$ is the UBUE of the function
$$g(\Theta) = E_\Theta[\tau(\xi)] = -[dR(\Theta)/d\Theta]\,/\,[dQ(\Theta)/d\Theta],$$
where $dQ(\Theta)/d\Theta \ne 0$. It can be shown analogously that $\tau^2(\xi)$ is the UBUE of the function $g_1(\Theta) = E_\Theta[\tau^2(\xi)]$, $\Theta\in\Theta$, etc. See also Theorem 3.1.4 and Remark 3.2.1. The reader is recommended to verify that the foregoing examples (except Example 3.5) are particular cases of Example 3.8.
3.1 Sufficient statistics

Definition 3.1.1. The statistic $T$ is said to be sufficient for the class $\mathcal{P}^\xi$ (or for the set $\Theta$) if for every $A\in\mathcal{B}^n$ there exists a $\mathcal{B}_0^n$-measurable function $P(A|T(\cdot)): \mathbb{R}^n\to\langle 0,1\rangle$, where $\mathcal{B}_0^n = T^{-1}(\mathcal{B}^s)$, such that
$$\forall\{B_0\in\mathcal{B}_0^n\}\ \forall\{\Theta\in\Theta\}\quad \int_{B_0}\chi_A(x)\,dP_\Theta^\xi(x) = \int_{B_0} P(A|T(x))\,dP_\Theta^\xi(x).$$
This means that the $T$-conditional probability of the event $A\in\mathcal{B}^n$ can be determined simultaneously for the whole set $\Theta$; $P(A|T(\cdot))$ does not depend on $\Theta\in\Theta$ (see also Section 2.3, from Definition 2.3.2 to Lemma 2.3.8).
Theorem 3.1.1. Let $(\mathbb{R}^n, \mathcal{B}^n)$ be a measurable space, $\mathbb{R}^n$ the $n$-dimensional Euclidean space, $\mathcal{B}^n$ the Borel $\sigma$-algebra of this space, and $T: (\mathbb{R}^n, \mathcal{B}^n)\to(\mathbb{R}^s, \mathcal{B}^s)$ a statistic for the class of probability measures $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^k\}$ which is dominated by a $\sigma$-finite measure $\mu$. Then $T$ is the sufficient statistic for $\Theta$ if, and only if,
1. $\forall\{\Theta\in\Theta\}\ \exists\{g(\cdot,\Theta): \mathcal{B}^s\text{-measurable and non-negative}\}$;
2. $\exists\{h(\cdot): \mathbb{R}^n\to\mathbb{R}^1,\ \mathcal{B}^n\text{-measurable and non-negative}\}$;
3. $\forall\{\Theta\in\Theta\}\quad (dP_\Theta^\xi/d\mu)(x) = f(x,\Theta) = g[T(x),\Theta]\,h(x)\ [\mathcal{B}^n, \mu]$.

Proof. Suppose $T$ is a sufficient statistic. Let $\mathcal{P}_1 = \{P_{\Theta_1}^\xi, P_{\Theta_2}^\xi, \ldots\}$ be a countable subclass of the class $\mathcal{P}^\xi$ equivalent to it (see Lemma 2.3.10), and put
$$\lambda = \sum_{j=1}^\infty c_j P_{\Theta_j}^\xi,\quad c_j > 0,\ j = 1, 2, \ldots,\quad \sum_{j=1}^\infty c_j = 1.$$
As $T$ is sufficient, we get for an arbitrary $A\in\mathcal{B}^n$
$$\forall\{B_0\in\mathcal{B}_0^n = T^{-1}(\mathcal{B}^s)\}\ \forall\{\Theta\in\Theta\}\quad \int_{B_0}\chi_A(x)\,dP_\Theta^\xi(x) = \int_{B_0} P(A|T(x))\,dP_\Theta^\xi(x) \Rightarrow$$
$$\Rightarrow \int_{B_0}\chi_A(x)\,d\Big(\sum_{j=1}^k c_j P_{\Theta_j}^\xi\Big)(x) = \int_{B_0} P(A|T(x))\,d\Big(\sum_{j=1}^k c_j P_{\Theta_j}^\xi\Big)(x).$$
If we let $k\to\infty$, we obtain
$$\int_{B_0}\chi_A(x)\,d\lambda(x) = \int_{B_0} P(A|T(x))\,d\lambda(x) = \int_{B_0} E_\lambda(\chi_A(\cdot)|T(x))\,d\lambda(x).$$
Thus $P(A|T(\cdot))$ is the $T$-conditional mean value of the characteristic function $\chi_A(\cdot)$ with respect to the measure $\lambda$. If we restrict the measures $P_\Theta^\xi$ and $\lambda$ to $\mathcal{B}_0^n$, the Radon–Nikodym derivative $(dP_\Theta^\xi/d\lambda)(\cdot)$ is then $\mathcal{B}_0^n$-measurable, and therefore according to Lemma 2.3.9 $(dP_\Theta^\xi/d\lambda)(x) = g[T(x),\Theta]\ [\mathcal{B}_0^n, \lambda]$. It can be shown that $(dP_\Theta^\xi/d\lambda)(x) = g[T(x),\Theta]$ is true modulo $[\mathcal{B}^n, \lambda]$ as well. For $A\in\mathcal{B}^n$, obviously
$$P_\Theta^\xi(A) = \int \chi_A(x)\,dP_\Theta^\xi(x) = \int P(A|T(x))\,dP_\Theta^\xi(x) = \int E_\lambda(\chi_A(\cdot)|T(x))\,dP_\Theta^\xi(x).$$
As $E_\lambda[\chi_A(\cdot)|T(\cdot)]$ is $\mathcal{B}_0^n$-measurable, we get
$$\int E_\lambda(\chi_A(\cdot)|T(x))\,dP_\Theta^\xi(x) = \int E_\lambda(\chi_A(\cdot)|T(x))\,g(T(x),\Theta)\,d\lambda(x) =$$
$$= \int E_\lambda[\chi_A(\cdot)\,g(T(\cdot),\Theta)\,|\,T(x)]\,d\lambda(x) = \int \chi_A(x)\,g(T(x),\Theta)\,d\lambda(x).$$
Thus $(dP_\Theta^\xi/d\lambda)(x) = g[T(x),\Theta]\ [\mathcal{B}^n, \lambda]$; if we denote $h(x) = (d\lambda/d\mu)(x)\ [\mathcal{B}^n, \mu]$, then according to Lemma 2.3.11 $(dP_\Theta^\xi/d\mu)(x) = g[T(x),\Theta]h(x)\ [\mathcal{B}^n, \mu]$.

Let the conditions 1, 2 and 3 of the assertion now be fulfilled. Then
$$(d\lambda/d\mu)(x) = \sum_{j=1}^\infty c_j\,g(T(x),\Theta_j)\,h(x)$$
and
$$(dP_\Theta^\xi/d\lambda)(x) = g(T(x),\Theta)\Big/\sum_{j=1}^\infty c_j\,g(T(x),\Theta_j) = g^*(T(x),\Theta)\quad [\mathcal{B}^n, \lambda].$$
For $A\in\mathcal{B}^n$, $B_0\in\mathcal{B}_0^n$ it follows that
$$\forall\{\Theta\in\Theta\}\quad P_\Theta^\xi(A\cap B_0) = \int_{B_0}\chi_A(x)\,dP_\Theta^\xi(x) = \int_{B_0} P_\Theta(A|T(x))\,dP_\Theta^\xi(x) = \int_{B_0} P_\Theta(A|T(x))\,g^*(T(x),\Theta)\,d\lambda(x),$$
$$\int_{B_0}\chi_A(x)\,dP_\Theta^\xi(x) = \int_{B_0}\chi_A(x)\,g^*(T(x),\Theta)\,d\lambda(x) = \int_{B_0} E_\lambda[\chi_A(\cdot)\,g^*(T(\cdot),\Theta)\,|\,T(x)]\,d\lambda(x) = \int_{B_0} g^*(T(x),\Theta)\,E_\lambda(\chi_A(\cdot)|T(x))\,d\lambda(x).$$
Thus
$$\forall\{\Theta\in\Theta\}\ \forall\{B_0\in\mathcal{B}_0^n\}\quad \int_{B_0}\big[P_\Theta(A|T(x)) - E_\lambda(\chi_A(\cdot)|T(x))\big]g^*(T(x),\Theta)\,d\lambda(x) = 0 \Rightarrow$$
$$\Rightarrow \forall\{\Theta\in\Theta\}\quad P_\Theta(A|T(x))\,g^*(T(x),\Theta) = E_\lambda(\chi_A(\cdot)|T(x))\,g^*(T(x),\Theta)\quad [\mathcal{B}_0^n, \lambda].$$
As $P_\Theta^\xi\{x: g^*[T(x),\Theta] = 0\} = 0$, obviously
$$P_\Theta(A|T(x)) = E_\lambda(\chi_A(\cdot)|T(x))\quad [\mathcal{B}_0^n, P_\Theta^\xi],$$
so that $E_\lambda[\chi_A(\cdot)|T(x)] = P[A|T(x)]$ is a variant of the $T$-conditional probability of the set $A$ which does not depend on $\Theta\in\Theta$.
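The factorization criterion can be made concrete with a small numerical sketch (our illustration, not from the text): for $n$ independent $N(\Theta, 1)$ observations the joint density splits as $f(x,\Theta) = g[T(x),\Theta]h(x)$ with $T(x) = \sum x_i$, $g(t,\Theta) = \exp(\Theta t - n\Theta^2/2)$ and $h(x) = (2\pi)^{-n/2}\exp(-\sum x_i^2/2)$.

```python
import math

def joint_density(x, theta):
    # product of N(theta, 1) densities
    p = 1.0
    for xi in x:
        p *= math.exp(-(xi - theta) ** 2 / 2) / math.sqrt(2 * math.pi)
    return p

def g(t, theta, n):
    # depends on the data only through T(x) = sum(x)
    return math.exp(theta * t - n * theta ** 2 / 2)

def h(x):
    # does not depend on theta
    n = len(x)
    return (2 * math.pi) ** (-n / 2) * math.exp(-sum(xi ** 2 for xi in x) / 2)

x = [0.3, -1.2, 2.5, 0.0]
for theta in (-1.0, 0.0, 0.7, 2.0):
    lhs = joint_density(x, theta)
    rhs = g(sum(x), theta, len(x)) * h(x)
    assert abs(lhs - rhs) < 1e-12 * max(lhs, rhs)
print("f(x, theta) = g(T(x), theta) * h(x) at every tested theta")
```

The data enter $g$ only through $T(x)$, which is exactly the content of condition 3 of Theorem 3.1.1.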
Theorem 3.1.2. Let $T: \mathbb{R}^n\to\mathbb{R}^s$ be a sufficient statistic for $\Theta$ in $(\mathbb{R}^n, \mathcal{B}^n, \mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^k\})$, and let $g(\cdot): \Theta\to\mathbb{R}^1$ be a function to be estimated. Let $\mathcal{K}$ be the class of unbiased estimators $\tau_g(\cdot): \mathbb{R}^n\to\mathbb{R}^1$ of the function $g(\cdot)$, and $E_\Theta[\tau_g^2(\xi)] < \infty$ for each $\Theta\in\Theta$. Then for any $\tau_g\in\mathcal{K}$ the following assertions are valid:
1. $E[\tau_g(\cdot)|T(\xi)]$ does not depend on $\Theta\in\Theta$ (thus it is a statistic for $\Theta$);
2. $\forall\{\Theta\in\Theta\}\ E_\Theta\{E[\tau_g(\cdot)|T(\xi)]\} = g(\Theta)$ (thus $E[\tau_g(\cdot)|T(\xi)]$ is an unbiased estimator of the function $g(\Theta)$, $\Theta\in\Theta$);
3. $\forall\{\Theta\in\Theta\}\ D_\Theta\{E[\tau_g(\cdot)|T(\xi)]\} \le D_\Theta\{\tau_g(\xi)\}$, and for a given $\Theta_0\in\Theta$
$$D_{\Theta_0}\{E[\tau_g(\cdot)|T(\xi)]\} = D_{\Theta_0}\{\tau_g(\xi)\} \Leftrightarrow \tau_g(x) = E[\tau_g(\cdot)|T(x)]\quad [\mathcal{B}^n, P_{\Theta_0}^\xi].$$
Proof. Let $A\in\mathcal{B}^n$; then for each $B_0\in\mathcal{B}_0^n$
$$\int_{B_0}\chi_A(x)\,dP_\Theta^\xi(x) = \int_{B_0} E[\chi_A(\cdot)|T(x)]\,dP_\Theta^\xi(x),$$
so that for every simple function $h(x) = \sum_{i=1}^k a_i\chi_{A_i}(x)$ analogously
$$\int_{B_0} h(x)\,dP_\Theta^\xi(x) = \int_{B_0} E[h(\cdot)|T(x)]\,dP_\Theta^\xi(x).$$
Let $h(\cdot)$ be a non-negative $\mathcal{B}^n$-measurable function and $\{h_j(\cdot)\}_{j=1}^\infty$ a sequence of simple functions $0 \le h_j(\cdot)\uparrow h(\cdot)$. We show that
$$E[h_j(\cdot)|T(x)] \le E[h_{j+1}(\cdot)|T(x)]\quad [\mathcal{B}_0^n, P_\Theta^\xi].$$
For each $B_0\in\mathcal{B}_0^n$ we have
$$\int_{B_0} E[h_j(\cdot)|T(x)]\,dP_\Theta^\xi(x) = \int_{B_0} h_j(x)\,dP_\Theta^\xi(x) \le \int_{B_0} h_{j+1}(x)\,dP_\Theta^\xi(x) = \int_{B_0} E[h_{j+1}(\cdot)|T(x)]\,dP_\Theta^\xi(x).$$
As $h_j(\cdot)\uparrow h(\cdot)$, the sequence $\{E[h_j(\cdot)|T(x)]\}_{j=1}^\infty$ is non-decreasing and there exists $\lim_{j\to\infty} E[h_j(\cdot)|T(x)]$; this limit is a $\mathcal{B}_0^n$-measurable function of $x$. Further, for each $\Theta\in\Theta$ and for each $B_0\in\mathcal{B}_0^n$ we get
$$\int_{B_0}\lim_{j\to\infty} E[h_j(\cdot)|T(x)]\,dP_\Theta^\xi(x) \overset{\text{B.L.}}{=} \lim_{j\to\infty}\int_{B_0} E[h_j(\cdot)|T(x)]\,dP_\Theta^\xi(x) =$$
$$= \lim_{j\to\infty}\int_{B_0} h_j(x)\,dP_\Theta^\xi(x) \overset{\text{B.L.}}{=} \int_{B_0} h(x)\,dP_\Theta^\xi(x) = \int_{B_0} E[h(\cdot)|T(x)]\,dP_\Theta^\xi(x).$$
Thus $\lim_{j\to\infty} E[h_j(\cdot)|T(x)] = E[h(\cdot)|T(x)]\ [\mathcal{B}_0^n, P_\Theta^\xi]$, and since the left-hand side does not depend on $\Theta\in\Theta$, the right-hand side does not depend on $\Theta\in\Theta$ either. (The notation B.L. above the equals sign means utilization of the B. Levi theorem.) Assertion 2 is obvious, according to Corollary 2.3.1.

Further,
$$D_\Theta\{\tau_g(\xi)\} = E_\Theta\{\big(\tau_g(\xi) - E[\tau_g(\cdot)|T(\xi)] + E[\tau_g(\cdot)|T(\xi)] - g(\Theta)\big)^2\} =$$
$$= E_\Theta\{\big(\tau_g(\xi) - E[\tau_g(\cdot)|T(\xi)]\big)^2\} + D_\Theta\{E[\tau_g(\cdot)|T(\xi)]\} + 2E_\Theta\{\big[\tau_g(\xi) - E(\tau_g(\cdot)|T(\xi))\big]\big[E(\tau_g(\cdot)|T(\xi)) - g(\Theta)\big]\},$$
where the cross term equals
$$E_\Theta\big\{\big[E(\tau_g(\cdot)|T(\xi)) - g(\Theta)\big]\,E\{\big[\tau_g(\cdot\,\cdot) - E(\tau_g(\cdot)|T(\cdot\,\cdot))\big]\,|\,T(\xi)\}\big\}.$$
As $E\{[\tau_g(\cdot\,\cdot) - E(\tau_g(\cdot)|T(\cdot\,\cdot))]\,|\,T(\xi)\} = 0$, there also is
$$E_\Theta\{\big[\tau_g(\xi) - E(\tau_g(\cdot)|T(\xi))\big]\big[E(\tau_g(\cdot)|T(\xi)) - g(\Theta)\big]\} = 0,$$
and thus
$$D_\Theta\{E[\tau_g(\cdot)|T(\xi)]\} + E_\Theta\{\big[\tau_g(\xi) - E(\tau_g(\cdot)|T(\xi))\big]^2\} = D_\Theta\{\tau_g(\xi)\},$$
which implies the assertion 3. In the last step of the proof, it is necessary to realize that $E[\tau_g(\cdot)|T(\xi)] - g(\Theta)$ is a $\mathcal{B}_0^n$-measurable function, to utilize Lemma 2.3.6 and the following consequence of the Chebyshev inequality:
$$E_{\Theta_0}\{\big[\tau_g(\xi) - E(\tau_g(\cdot)|T(\xi))\big]^2\} = 0 \Leftrightarrow P_{\Theta_0}^\xi\{\tau_g(\xi) = E[\tau_g(\cdot)|T(\xi)]\} = 1.$$
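The variance decomposition that closes the proof can be verified by exact enumeration in a toy model (our illustration, not the book's). For $n$ independent Bernoulli($\Theta$) trials with sufficient $T = \sum\xi_i$, take the unbiased estimator $\tau_g(\xi) = \xi_1$ of $g(\Theta) = \Theta$; conditioning gives $E[\xi_1|T] = T/n$, and $D_\Theta(\xi_1) = D_\Theta(T/n) + E_\Theta[(\xi_1 - T/n)^2]$ holds exactly.

```python
from itertools import product

def enumerate_moments(n, theta):
    # exact expectations over all 2^n outcomes of n Bernoulli(theta) trials
    var_raw = var_rb = resid = 0.0
    for outcome in product((0, 1), repeat=n):
        prob = 1.0
        for b in outcome:
            prob *= theta if b else (1.0 - theta)
        raw = outcome[0]               # tau_g = first observation
        rb = sum(outcome) / n          # E[tau_g | T] = T / n
        var_raw += prob * (raw - theta) ** 2
        var_rb += prob * (rb - theta) ** 2
        resid += prob * (raw - rb) ** 2
    return var_raw, var_rb, resid

n, theta = 4, 0.3
v_raw, v_rb, resid = enumerate_moments(n, theta)
assert abs(v_raw - theta * (1 - theta)) < 1e-12        # D(xi_1)
assert abs(v_rb - theta * (1 - theta) / n) < 1e-12     # D(T/n)
assert abs(v_raw - (v_rb + resid)) < 1e-12             # decomposition
print("D(tau) = D(E[tau|T]) + E[(tau - E[tau|T])^2] verified")
```

The conditioned estimator's dispersion is smaller by exactly the non-negative residual term, as assertion 3 states.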
Example 3.1.1. Let
$$(dP_\Theta^\xi/d\lambda)(x) = n(x; A\Theta, \Sigma) = \big\{1\big/\big\{(2\pi)^{n/2}[\det(\Sigma)]^{1/2}\big\}\big\}\exp\Big(-\frac12(x - A\Theta)'\Sigma^{-1}(x - A\Theta)\Big),$$
where regularity of the covariance matrix $\Sigma$ of the random vector $\xi$ and $R(A) = k < n$ are assumed. The aim is to improve an unbiased estimator $\tau_g(x) = L'x + l$ of the function $g(\Theta) = p'\Theta$, $\Theta\in\Theta$, relating to Theorem 3.1.2. In our case the following must hold:
$$\forall\{\Theta\in\Theta = \mathbb{R}^k\}\quad E_\Theta(L'\xi + l) = L'A\Theta + l = p'\Theta \Leftrightarrow l = 0 \ \wedge\ A'L = p.$$
An unbiased linear estimator of the function $g(\Theta) = p'\Theta$, $\Theta\in\Theta = \mathbb{R}^k$, exists if, and only if, $p\in\mathcal{M}(A')$ (the column space of the matrix $A'$), and its form is $\tau_g(x) = L'x$, where $A'L = p$. In the following, the sufficient statistic $T$ is required. This is $T(x) = A'\Sigma^{-1}x$, according to Theorem 3.1.1. It is sufficient to rewrite the density function $n(x; A\Theta, \Sigma)$ in the form $n(x; A\Theta, \Sigma) = g[T(x),\Theta]h(x)$, where
$$h(x) = \big\{1\big/\big\{(2\pi)^{n/2}[\det(\Sigma)]^{1/2}\big\}\big\}\exp\Big(-\frac12 x'\Sigma^{-1}x\Big),\qquad g[T(x),\Theta] = \exp\Big(\Theta'A'\Sigma^{-1}x - \frac12\Theta'A'\Sigma^{-1}A\Theta\Big).$$
According to Theorem 3.1.2, the improved estimator is
$$\bar\tau_g(x) = E[\tau_g(\xi)|T(x)] = E[L'\xi\,|\,A'\Sigma^{-1}x].$$
We shall show that $\bar\tau_g(x) = p'(A'\Sigma^{-1}A)^{-1}A'\Sigma^{-1}x$; therefore let us consider the random vector
$$(\tau_g(\xi), T(\xi)')' = (L'\xi, (A'\Sigma^{-1}\xi)')' \sim N\big[(p'\Theta, (A'\Sigma^{-1}A\Theta)')', \Gamma\big],$$
where
$$\Gamma = \begin{pmatrix} L'\Sigma L & p' \\ p & A'\Sigma^{-1}A \end{pmatrix}.$$
If we denote $\Gamma_{11} = L'\Sigma L$, $\Gamma_{12} = p'$, $\Gamma_{22} = A'\Sigma^{-1}A$, it can easily be proved (for more detail see Anděl [2]) that the corresponding elements $\Gamma^{11}$, $\Gamma^{12}$, $\Gamma^{22}$, $\Gamma^{21}$ of the inverse matrix $\Gamma^{-1}$ are
$$\Gamma^{11} = (\Gamma_{11} - \Gamma_{12}\Gamma_{22}^{-1}\Gamma_{21})^{-1},\quad \Gamma^{22} = (\Gamma_{22} - \Gamma_{21}\Gamma_{11}^{-1}\Gamma_{12})^{-1},\quad \Gamma^{12} = -\Gamma^{11}\Gamma_{12}\Gamma_{22}^{-1} = (\Gamma^{21})'.$$
The $A'\Sigma^{-1}\xi$-conditional density function of the random variable $U = \tau_g(\xi) = L'\xi$ is
$$f(u\,|\,A'\Sigma^{-1}x) = n\big(u;\ p'\Theta + \Gamma_{12}\Gamma_{22}^{-1}(A'\Sigma^{-1}x - A'\Sigma^{-1}A\Theta),\ (\Gamma^{11})^{-1}\big) = n\big(u;\ p'(A'\Sigma^{-1}A)^{-1}A'\Sigma^{-1}x,\ (\Gamma^{11})^{-1}\big),$$
thus $\bar\tau_g(x) = E[\tau_g(\xi)|T(x)] = p'(A'\Sigma^{-1}A)^{-1}A'\Sigma^{-1}x$. According to Theorem 3.1, it can be shown that $\bar\tau_g(\cdot)$ is even the UBUE. If $\tau_0(\xi)\in\mathfrak{F}$, then
$$0 = \Phi(\Theta) = \int \tau_0(x)\,n(x; A\Theta, \Sigma)\,dx \Rightarrow 0 = \frac{\partial}{\partial\Theta}\int \tau_0(x)\,n(x; A\Theta, \Sigma)\,dx = \int \tau_0(x)\big(A'\Sigma^{-1}x - A'\Sigma^{-1}A\Theta\big)\,n(x; A\Theta, \Sigma)\,dx,$$
which implies that $L'A'\Sigma^{-1}\xi$ is the UBUE of its mean value $E_\Theta(L'A'\Sigma^{-1}\xi) = L'A'\Sigma^{-1}A\Theta$. If $L' = p'(A'\Sigma^{-1}A)^{-1}$ is chosen, then
$$p'(A'\Sigma^{-1}A)^{-1}A'\Sigma^{-1}x = \bar\tau_g(x).$$
Thus $\bar\tau_g(\xi)$ is not only the improved estimator, but it is the UBUE as well. For the sake of completeness, it should be said that the assumptions $R(\Sigma) = n$ and $R(A) = k \le n$ are sufficient for the existence of the inverse matrices $\Sigma^{-1}$ and $(A'\Sigma^{-1}A)^{-1}$.

Remark 3.1.1. The improvement $E[\tau_g(\cdot)|T(\xi)]$ of the estimator $\tau_g(\xi)$ does not mean that in all situations we obtain such an estimator from the class $\mathcal{K}$ which has the least possible dispersion. The completeness of the sufficient statistic is itself a sufficient condition for this.
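The claim that $\bar\tau_g(x) = p'(A'\Sigma^{-1}A)^{-1}A'\Sigma^{-1}x$ has the smallest dispersion among linear unbiased estimators $L'x$ with $A'L = p$ can be illustrated numerically; the matrices below are arbitrary choices of ours, not from the text. The dispersion of $L'\xi$ is $L'\Sigma L$, and the minimizing choice is $L_* = \Sigma^{-1}A(A'\Sigma^{-1}A)^{-1}p$.

```python
import numpy as np

# hypothetical design: n = 4 observations, k = 2 parameters
A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
Sigma = np.diag([1.0, 2.0, 0.5, 1.5])      # regular covariance matrix
p = np.array([1.0, 1.0])                    # estimating g(theta) = p' theta

Si = np.linalg.inv(Sigma)
L_star = Si @ A @ np.linalg.inv(A.T @ Si @ A) @ p   # improved estimator's L

# unbiasedness requires A'L = p
assert np.allclose(A.T @ L_star, p)

# a competing unbiased choice with A'L = p (least-squares weights)
L_ols = A @ np.linalg.inv(A.T @ A) @ p
assert np.allclose(A.T @ L_ols, p)

var_star = L_star @ Sigma @ L_star   # dispersion of the improved estimator
var_ols = L_ols @ Sigma @ L_ols      # dispersion of the competitor
assert var_star <= var_ols + 1e-12
print(f"improved variance {var_star:.4f} <= competitor {var_ols:.4f}")
```

This is the familiar optimality of the generalized least-squares weighting, here obtained by conditioning on the sufficient statistic $A'\Sigma^{-1}\xi$.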
Definition 3.1.2. Let $\mathcal{P}^{T(\xi)} = \{P_\Theta^{T(\xi)}: P_\Theta^\xi\in\mathcal{P}^\xi\}$ be the class of probability measures $P_\Theta^{T(\xi)}$ induced by a statistic $T: \mathbb{R}^n\to\mathbb{R}^s$; $\forall\{B\in\mathcal{B}^s\}\ P_\Theta^{T(\xi)}(B) = P_\Theta^\xi[T^{-1}(B)]$. The statistic $T$ is complete if every $\mathcal{P}^{T(\xi)}$-integrable function $h(\cdot): \mathbb{R}^s\to\mathbb{R}^1$ such that $\forall\{\Theta\in\Theta\}\int h(t)\,dP_\Theta^{T(\xi)}(t) = 0$ has the property $\forall\{\Theta\in\Theta\}\ P_\Theta^{T(\xi)}\{t: h(t)\ne 0\} = 0$.

Theorem 3.1.3. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^k\}$ be a class of probability measures on $\mathcal{B}^n$, $g(\cdot): \Theta\to\mathbb{R}^1$ a function to be estimated, $\mathcal{K}$ a non-empty class of unbiased estimators $\tau_g(\cdot): \mathbb{R}^n\to\mathbb{R}^1$ such that $\forall\{\Theta\in\Theta\}\ E_\Theta[\tau_g^2(\xi)] < \infty$, and $T: \mathbb{R}^n\to\mathbb{R}^s$ a sufficient and complete statistic. Then for each $\tilde\tau_g(\cdot)\in\mathcal{K}$ the random variable $E[\tilde\tau_g(\cdot)|T(\xi)]$ possesses the following properties:
1. $\forall\{\Theta\in\Theta\}\ E_\Theta\{E[\tilde\tau_g(\cdot)|T(\xi)]\} = g(\Theta)$;
2. $\forall\{\tau_g\in\mathcal{K}\}\ \forall\{\Theta\in\Theta\}\ D_\Theta\{E[\tilde\tau_g(\cdot)|T(\xi)]\} \le D_\Theta\{\tau_g(\xi)\}$.

Proof. The assertion 1 is obvious. Let
$$f(T(x)) = E[\tilde\tau_g(\cdot)|T(x)] - E[\tau_g(\cdot)|T(x)],$$
i.e. $f(\cdot): \mathbb{R}^s\to\mathbb{R}^1$. According to the property 1 above, for each $\Theta\in\Theta$
$$\int f(T(x))\,dP_\Theta^\xi(x) = \int f(t)\,dP_\Theta^{T(\xi)}(t) = g(\Theta) - g(\Theta) = 0.$$
Thus
$$\forall\{\Theta\in\Theta\}\quad P_\Theta^{T(\xi)}\{t: f(t)\ne 0\} = P_\Theta^\xi\{x: E[\tilde\tau_g(\cdot)|T(x)] \ne E[\tau_g(\cdot)|T(x)]\} = 0.$$
Further,
$$D_\Theta\{E[\tilde\tau_g(\cdot)|T(\xi)]\} = D_\Theta\{E[\tau_g(\cdot)|T(\xi)]\} \overset{\text{Th. 3.1.2}}{\le} D_\Theta\{\tau_g(\xi)\}.$$
Remark 3.1.2. If a complete sufficient statistic for $\Theta$ exists, then, according to Theorem 3.1.3, the UBUE of any function of the parameter $\Theta\in\Theta$ is a function of this complete statistic.

Example 3.1.2. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in(0,1)\} \ll \mu$,
$$f(x,\Theta) = (dP_\Theta^\xi/d\mu)(x) = \binom{n}{x}\Theta^x(1-\Theta)^{n-x},\quad x = 0, 1, \ldots, n.$$
Obviously $T(x) = x$ is a sufficient statistic for $(0,1) = \Theta$. Prove that it is complete. Let $g(\cdot): \{0, 1, \ldots, n\}\to\mathbb{R}^1$ be a $\mu$-measurable and $P_\Theta^\xi$-integrable function (i.e. a finite one), and further let
$$\forall\{\Theta\in(0,1)\}\quad 0 = \int g(x)\,dP_\Theta^\xi(x) = \sum_{x=0}^n \binom{n}{x}\Theta^x(1-\Theta)^{n-x}g(x).$$
If we denote $\Theta/(1-\Theta) = u$, we see that the polynomial
$$g(0)\binom{n}{0} + g(1)\binom{n}{1}u + \cdots + g(n)\binom{n}{n}u^n$$
is identically zero on $(0,\infty)$. This implies that $g(x) = 0$ for $x = 0, 1, \ldots, n$. Thus, according to Theorem 3.1.3, every ($P_\Theta^\xi$-integrable) function $f(\cdot): \{0, 1, \ldots, n\}\to\mathbb{R}^1$ is the best unbiased estimator of its mean value:
$$E_\Theta(\xi) = \int x\,dP_\Theta^\xi(x) = \sum_{k=0}^n k\binom{n}{k}\Theta^k(1-\Theta)^{n-k} = n\Theta,$$
$$E_\Theta(\xi^2) = \int x^2\,dP_\Theta^\xi(x) = \sum_{k=0}^n k^2\binom{n}{k}\Theta^k(1-\Theta)^{n-k} = (n\Theta)^2 + n\Theta(1-\Theta),$$
$$D_\Theta(\xi) = E_\Theta(\xi^2) - [E_\Theta(\xi)]^2 = n\Theta(1-\Theta),$$
$$E_\Theta(\xi(n-\xi)) = n^2\Theta - n^2\Theta^2 - n\Theta(1-\Theta) = n^2\Theta(1-\Theta) - n\Theta(1-\Theta) = n(n-1)\Theta(1-\Theta).$$
If $g(\Theta) = \Theta$, then $\tau_g(x) = x/n$; if $g(\Theta) = n\Theta(1-\Theta) = D_\Theta(\xi)$, then $\tau_g(x) = x(n-x)/(n-1)$, etc.

Example 3.1.3. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in(0,\infty)\} \ll \mu$, $f(x,\Theta) = (dP_\Theta^\xi/d\mu)(x) ={}$
$e^{-\Theta}\Theta^x/x!$, $x = 0, 1, 2, \ldots$. The statistic $T(x) = x$ is obviously sufficient. Prove that it is complete. If, namely, a function $h(\cdot): \{0, 1, 2, \ldots\}\to\mathbb{R}^1$ is $\mathcal{P}^\xi$-integrable and for each $\Theta\in(0,\infty)$
$$E_\Theta[h(\xi)] = e^{-\Theta}\Big(h(0) + \frac{h(1)}{1!}\Theta + \cdots\Big) = 0,$$
then obviously $h(0) = h(1) = \dots = 0$. Thus, according to Theorem 3.1.3, every $\mathcal{P}^\xi$-integrable function $f(\cdot): \{0, 1, \ldots\}\to\mathbb{R}^1$ is the UBUE of its mean value. If, for example, $g(\Theta) = \Theta = E_\Theta(\xi) = D_\Theta(\xi)$, then the UBUE is $\tau_g(\xi) = \xi$. Let $x\in\mathbb{R}^n$ be a realization of the random vector $\xi$ whose components are stochastically independent and have the same probability distribution characterized by the function $h(x,\Theta) = e^{-\Theta}\Theta^x/x!$. In this case
$$f(x,\Theta) = \prod_{i=1}^n e^{-\Theta}\Theta^{x_i}/x_i!$$
and the probability distribution $\varphi(\cdot,\Theta)$ of the statistic $T(\xi) = \sum_{i=1}^n \xi_i$ is
$$\varphi(t,\Theta) = e^{-n\Theta}(n\Theta)^t/t!,\quad t = 0, 1, 2, \ldots$$
(this can easily be proved by means of characteristic functions). According to the foregoing, we see that the statistic $T(\xi) = \sum_{i=1}^n \xi_i$ is sufficient and complete, and, for example, the estimator $\tau_g(\xi) = (1/n)\sum_{i=1}^n \xi_i$ is the UBUE of its mean value $g(\Theta) = \Theta$ $(= E_\Theta[\tau_g(\xi)])$.
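The distributional claim used above — that $T(\xi) = \sum\xi_i$ for $n$ iid Poisson($\Theta$) components is Poisson($n\Theta$), proved in the text via characteristic functions — can be cross-checked by direct convolution (our illustrative sketch).

```python
import math

def poisson_pmf(t, rate):
    return math.exp(-rate) * rate ** t / math.factorial(t)

def convolve_pmfs(p, q, size):
    # distribution of the sum of two independent non-negative integer variables
    return [sum(p[i] * q[t - i] for i in range(t + 1)) for t in range(size)]

theta, n, size = 1.3, 3, 30
single = [poisson_pmf(t, theta) for t in range(size)]
total = single[:]
for _ in range(n - 1):
    total = convolve_pmfs(total, single, size)

# compare with the claimed Poisson(n * theta) law
for t in range(15):
    assert abs(total[t] - poisson_pmf(t, n * theta)) < 1e-10
print("sum of n iid Poisson(theta) variables is Poisson(n*theta)")
```

The truncation at `size` terms is harmless here because the convolution of the first `size` probabilities is exact for every index below `size`.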
Example 3.1.4. Let a random variable $\xi$ having a rectangular probability distribution, i.e. a probability density with respect to the Lebesgue measure of the form
$$f(x,\Theta) = \begin{cases} 1/\Theta & \text{if } x\in[0,\Theta],\\ 0 & \text{if } x\notin[0,\Theta],\end{cases}$$
be realized stochastically independently $n$ times. The statistic $T(x) = \max\{x_1, x_2, \ldots, x_n\}$, where $x_i$, $i = 1, 2, \ldots, n$, is the $i$th realization of the random variable $\xi$, is used for estimating the parameter $\Theta$. It can easily be proved that the probability density of the random variable $T(\xi)$, where $\xi = (\xi_1, \ldots, \xi_n)'$ is the random vector with independent components having the same probability distribution as $\xi$, is
$$g(t) = \begin{cases} n t^{n-1}/\Theta^n & \text{if } t\in[0,\Theta],\\ 0 & \text{if } t\notin[0,\Theta].\end{cases}$$
(The $n$th order statistic $\xi_{(n)} = \max\{\xi_1, \xi_2, \ldots, \xi_n\}$ is at stake. Its probability density for an arbitrary continuous random variable $\xi$ is determined by its distribution function $F$: $g(t) = n[F(t)]^{n-1}\,dF(t)/dt$.) As
$$E_\Theta[T(\xi)] = \int_0^\Theta t\,\frac{n t^{n-1}}{\Theta^n}\,dt = \frac{n}{n+1}\Theta\quad\text{and}\quad E_\Theta[T^2(\xi)] = \frac{n}{n+2}\Theta^2,$$
the statistic $T^*(\xi) = \dfrac{n+1}{n}T(\xi)$ is an unbiased estimator of the parameter $\Theta$, and its dispersion is
$$D_\Theta[T^*(\xi)] = \frac{\Theta^2}{n(n+2)}.$$
(With increasing $n$, the dispersion $D_\Theta[T^*(\xi)]$ decreases more quickly than for the estimators that attain the lower Rao–Cramér boundary — see Section 3.2.1. We ask readers studying Section 3.2.1 to realize that the requirements of regularity quoted there are not fulfilled in Example 3.1.4.)

We now show that the estimator $T(\xi)$, and therefore $T^*(\xi)$ as well, is sufficient for an arbitrary parametric space $\Theta$. Consider an $n$-tuple of order statistics
$$\xi_{(1)} = \min\{\xi_1, \ldots, \xi_n\} \le \xi_{(2)} \le \cdots \le \xi_{(n-1)} \le \xi_{(n)} = \max\{\xi_1, \ldots, \xi_n\} = T(\xi),$$
whose joint probability density is
$$h(x_{(1)}, \ldots, x_{(n)}) = \begin{cases} n!/\Theta^n & \text{for } 0 \le x_{(1)} \le x_{(2)} \le \cdots \le x_{(n)} \le \Theta,\\ 0 & \text{in other cases}.\end{cases}$$
The vector of the order statistics is obviously a sufficient statistic for an arbitrary parametric space $\Theta$. The $T(x)$ $(= x_{(n)})$-conditional probability distribution of the random variables $\xi_{(1)}, \ldots, \xi_{(n-1)}$ is given by the probability density
$$h(x_{(1)}, \ldots, x_{(n-1)}\,|\,x_{(n)}) = h(x_{(1)}, \ldots, x_{(n)})\Big/\int\!\cdots\!\int h(x_{(1)}, \ldots, x_{(n)})\,dx_{(1)}\ldots dx_{(n-1)} = (n-1)!\big/x_{(n)}^{\,n-1}$$
for the arguments $x_{(1)}, \ldots, x_{(n-1)}$ satisfying the inequalities $0 \le x_{(1)} \le \cdots \le x_{(n-1)} \le x_{(n)}$. The latter probability density does not depend on the parameter $\Theta$, thus $T(\xi)$ is a sufficient statistic for an arbitrary parametric space $\Theta$.

For the parametric space $\Theta = [\Theta_1, \Theta_2]$, where $0 < \Theta_1$, $T(\xi)$ is, however, not a complete statistic, since a non-zero function $s(\cdot): \mathbb{R}^1\to\mathbb{R}^1$ such that
$$\forall\{\Theta\in[\Theta_1, \Theta_2]\}\quad \int_0^\Theta s(u)\,\frac{n u^{n-1}}{\Theta^n}\,du = 0$$
can easily be found. For example, one can choose $s(\cdot)$ in the following form:
$$s(u) = \begin{cases} p_k(u) & \text{if } u\in[0,\Theta_1],\\ 0 & \text{if } u\notin[0,\Theta_1],\end{cases}$$
where $p_k(\cdot)$ is a non-zero polynomial with the property
$$\int_0^{\Theta_1} p_k(u)\,u^{n-1}\,du = 0.$$
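The incompleteness construction can be made fully explicit (our numerical sketch; the specific polynomial is an assumption for illustration). With $p_1(u) = u - c$ and $c = n\Theta_1/(n+1)$ one has $\int_0^{\Theta_1}(u-c)u^{n-1}\,du = 0$, and then $E_\Theta[s(T(\xi))] = (n/\Theta^n)\int_0^{\Theta_1}(u-c)u^{n-1}\,du = 0$ for every $\Theta \ge \Theta_1$, although $s$ is not the zero function.

```python
def moment_integral(c, theta1, n):
    # closed form of int_0^{theta1} (u - c) * u^(n-1) du
    return theta1 ** (n + 1) / (n + 1) - c * theta1 ** n / n

def mean_s_of_T(c, theta1, theta, n):
    # E_Theta[s(T)] = (n / Theta^n) * int_0^{theta1} (u - c) u^(n-1) du,
    # because s vanishes outside [0, theta1] and theta1 <= theta
    return n / theta ** n * moment_integral(c, theta1, n)

n, theta1 = 3, 1.0
c = n * theta1 / (n + 1)                 # chosen so the integral vanishes
assert abs(moment_integral(c, theta1, n)) < 1e-15

for theta in (1.0, 1.5, 2.0, 5.0):       # any Theta in [Theta1, Theta2]
    assert abs(mean_s_of_T(c, theta1, theta, n)) < 1e-15
print("nonzero s with E_Theta[s(T)] = 0 for all Theta >= Theta1: T not complete")
```

A non-zero statistic with identically zero mean over the whole parametric space is exactly what Definition 3.1.2 forbids for a complete statistic.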
We wish to remark that a statistic complete for one parametric space need not be complete for another. Therefore it is necessary to bear in mind which parametric space is under consideration.

Theorem 3.1.4. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta \supset \mathop{\times}_{j=1}^k[-a_j, a_j]\} \ll \mu$,
$$(dP_\Theta^\xi/d\mu)(x) = f_0(x)\exp\Big[\sum_{j=1}^k \Theta_j U_j(x) + V(\Theta)\Big].$$
The statistic $U(x) = (U_1(x), \ldots, U_k(x))'$ is sufficient and complete for $\Theta$.
Proof. According to Theorem 3.1.1, the sufficiency is obvious. The completeness is yet to be proved. Denote $\mathcal{P}^{U(\xi)} = \{P_\Theta^{U(\xi)}: \Theta\in\Theta\}$, $P_\Theta^{U(\xi)}(B) = P_\Theta^\xi[U^{-1}(B)]$, $B\in\mathcal{B}^k$; let $g(\cdot): \mathbb{R}^k\to\mathbb{R}^1$ be a $\mathcal{B}^k$-measurable and $\mathcal{P}^{U(\xi)}$-integrable function such that $\forall\{\Theta\in\Theta\}\int g(u)\,dP_\Theta^{U(\xi)}(u) = 0$. Further, denote $g^\pm = \max\{0, \pm g\}$; obviously $g^+(\cdot)\ge 0$, $g^-(\cdot)\ge 0$ and $g(\cdot) = g^+(\cdot) - g^-(\cdot)$. According to the assumption,
$$\int g^+(u)\,dP_\Theta^{U(\xi)}(u) = \int g^-(u)\,dP_\Theta^{U(\xi)}(u).$$
As for each $B\in\mathcal{B}^k$
$$P_\Theta^{U(\xi)}(B) = \int_{U^{-1}(B)}\exp\Big(\sum_{j=1}^k \Theta_j U_j(x) + V(\Theta)\Big)f_0(x)\,d\mu(x) \overset{\text{L. 2.3.6}}{=} \int_B \exp(\Theta'u)\exp[V(\Theta)]\,E_\mu[f_0(\cdot)|u]\,d\mu^U(u),$$
we get for $P_\Theta^{U(\xi)}$
$$dP_\Theta^{U(\xi)}(u) = \exp(\Theta'u)\exp[V(\Theta)]\,d\nu(u),$$
where $d\nu(u) = E_\mu[f_0(\cdot)|u]\,d\mu^U(u)$. Furthermore,
$$\int g^+(u)\,dP_\Theta^{U(\xi)}(u) = \int g^-(u)\,dP_\Theta^{U(\xi)}(u) \Rightarrow \int \exp(\Theta'u)\,d\varkappa^+(u) = \int \exp(\Theta'u)\,d\varkappa^-(u),$$
where $g^\pm(u)\,d\nu(u) = d\varkappa^\pm(u)$. The equality
$$\int \exp(\Theta'u)\,d\varkappa^+(u) = \int \exp(\Theta'u)\,d\varkappa^-(u)$$
is true on $\Theta \supset \mathop{\times}_{j=1}^k[-a_j, a_j]$. Consider now the analytical functions
$$\varphi^\pm(z_1, \ldots, z_k) = \int \exp\Big(\sum_{j=1}^k z_j u_j\Big)\,d\varkappa^\pm(u_1, \ldots, u_k),\quad z_j = \Theta_j + \mathrm{i}t_j,\ -a_j \le \Theta_j \le a_j,\ -\infty < t_j < \infty,\ j = 1, \ldots, k.$$
According to the assumption, for $\Theta_j\in[-a_j, a_j]$, $t_j = 0$, $j = 1, \ldots, k$, we have $\varphi^+(z_1, \ldots, z_k) = \varphi^-(z_1, \ldots, z_k)$, so that with respect to the analyticity of the functions $\varphi^\pm(z_1, \ldots, z_k)$ we get the equality $\varphi^+(z_1, \ldots, z_k) = \varphi^-(z_1, \ldots, z_k)$ on the set $\{(\Theta_j, t_j): \Theta_j\in[-a_j, a_j],\ t_j\in(-\infty,\infty),\ j = 1, \ldots, k\}$. If we now put $\Theta_1 = \dots = \Theta_k = 0$, we get
$$\int \exp(\mathrm{i}t'u)\,d\varkappa^+(u) = \int \exp(\mathrm{i}t'u)\,d\varkappa^-(u) \Rightarrow \varkappa^+ = \varkappa^- \Rightarrow g^+(\cdot) = g^-(\cdot)\ [\nu] \Rightarrow g(\cdot) = 0\ [\nu].$$

* The class of such distributions is called the Darmois–Koopman class, or the class of exponential distributions. The probability density of an exponential type is often given in the form
$$f(x,\Theta) = C(\Theta)\tilde f(x)\exp\Big(\sum_{j=1}^k Q_j(\Theta)U_j(x)\Big),$$
and the substitution $\nu_j = Q_j(\Theta)$, $j = 1, \ldots, k$, is called a natural parametrization. In our theorem, a class of exponential distributions with natural parameters is considered.
Example 3.1.5. Let the class $\mathcal{P} = \{P_\Theta: \Theta\in(-\infty,\infty)\times(0,\infty)\} \ll \lambda$ be such that
$$(dP_\Theta/d\lambda)(\bar x, q) = \varphi_1(\bar x, q, \Theta_1, \Theta_2)$$
(obviously the joint probability distribution of the arithmetic mean $\bar\xi = (1/n)\sum_{i=1}^n \xi_i$ and of the quadratic form $Q(\xi) = \sum_{i=1}^n(\xi_i - \bar\xi)^2$ in the case $\xi \sim N(\mathbf{1}\Theta_1, \Theta_2 I)$ is under discussion). We make a regular transformation of the parameters: $\vartheta_1 = \Theta_1/(2\Theta_2)$, $\vartheta_2 = -1/(2\Theta_2)$. Using this notation, we get
$$\varphi_1(\bar x, q, \Theta_1, \Theta_2) = \varphi(\bar x, q, \vartheta_1, \vartheta_2) = f_0(\bar x, q)\exp\big[\vartheta_1 U_1(\bar x, q) + \vartheta_2 U_2(\bar x, q) + V(\vartheta_1, \vartheta_2)\big],$$
where
$$f_0(\bar x, q) = \Big(\frac{n}{2\pi}\Big)^{1/2}\frac{q^{(n-1)/2 - 1}}{2^{(n-1)/2}\,\Gamma\big(\frac{n-1}{2}\big)},\qquad U_1(\bar x, q) = 2n\bar x,\qquad U_2(\bar x, q) = n\bar x^2 + q,$$
$$(\vartheta_1, \vartheta_2)\in(-\infty,\infty)\times(-\infty,0).$$
According to Theorem 3.1.4, the completeness is already obvious. It should be remarked here that the condition $\mathop{\times}_{j=1}^k[-a_j, a_j] \subset \Theta$ from Theorem 3.1.4 can be replaced by the condition $\mathop{\times}_{j=1}^k[a_j, b_j] \subset \Theta$, $a_j < b_j$, $j = 1, \ldots, k$, with an analogous method of proving. The latter form of the condition was used in our example.
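The natural-parameter rewrite of Example 3.1.5 can be checked pointwise (our sketch, with arbitrarily chosen test values): the exponent of the joint density of $(\bar x, q)$, namely $-n(\bar x - \Theta_1)^2/(2\Theta_2) - q/(2\Theta_2)$, should equal $\vartheta_1 U_1 + \vartheta_2 U_2$ plus a term depending on $\Theta$ alone.

```python
def exponent_direct(xbar, q, t1, t2, n):
    # exponent of the joint density of (xbar, q) for N(t1, t2) samples
    return -n * (xbar - t1) ** 2 / (2 * t2) - q / (2 * t2)

def exponent_natural(xbar, q, t1, t2, n):
    # natural parametrization from Example 3.1.5
    v1 = t1 / (2 * t2)                # theta_1 / (2 theta_2)
    v2 = -1 / (2 * t2)                # -1 / (2 theta_2)
    U1 = 2 * n * xbar
    U2 = n * xbar ** 2 + q
    const = -n * t1 ** 2 / (2 * t2)   # depends on Theta only (part of V)
    return v1 * U1 + v2 * U2 + const

n = 5
for (xbar, q, t1, t2) in [(0.2, 3.1, 0.0, 1.0), (-1.0, 0.7, 0.5, 2.0)]:
    a = exponent_direct(xbar, q, t1, t2, n)
    b = exponent_natural(xbar, q, t1, t2, n)
    assert abs(a - b) < 1e-12
print("v1*U1 + v2*U2 absorbs the whole (xbar, q) dependence of the exponent")
```

This confirms that $(U_1, U_2)$ are the natural sufficient statistics of the family, which is what Theorem 3.1.4 needs.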
Definition 3.1.3. The statistic $T: \mathbb{R}^n\to\mathbb{R}^s$ sufficient for the class of probability measures $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\}$ defined on $\mathcal{B}^n$ is said to be minimal if for every other sufficient statistic $V: \mathbb{R}^n\to\mathbb{R}^r$ there exists a $\mathcal{B}^r$-measurable function $g(\cdot): \mathbb{R}^r\to\mathbb{R}^s$ such that $g(V) = T$.

Definition 3.1.4. Let $\mathcal{P}^{T(\xi)} = \{P_\Theta^{T(\xi)}: P_\Theta^\xi\in\mathcal{P}^\xi\}$ be the class of probability measures $P_\Theta^{T(\xi)}$ induced by a statistic $T: \mathbb{R}^n\to\mathbb{R}^s$; $\forall\{B\in\mathcal{B}^s\}\ P_\Theta^{T(\xi)}(B) = P_\Theta^\xi[T^{-1}(B)]$. A statistic $T$ is boundedly complete if every bounded function $h(\cdot): \mathbb{R}^s\to\mathbb{R}^1$ such that $\forall\{\Theta\in\Theta\}\int h(t)\,dP_\Theta^{T(\xi)}(t) = 0$ has the property $\forall\{\Theta\in\Theta\}\ P_\Theta^{T(\xi)}\{t: h(t)\ne 0\} = 0$.

Theorem 3.1.5. Let $\mathcal{P}^\xi$ be the class of probability measures from Theorem 3.1.3, $T$ be the minimal sufficient statistic for $\mathcal{P}^\xi$, $V$ be sufficient and boundedly complete for $\mathcal{P}^\xi$; $T: \mathbb{R}^n\to\mathbb{R}^s$, $V: \mathbb{R}^n\to\mathbb{R}^r$, $\mathcal{B}_T^n = T^{-1}(\mathcal{B}^s)$, $\mathcal{B}_V^n = V^{-1}(\mathcal{B}^r)$. Then
$$\forall\{M_0\in\mathcal{B}_V^n\}\ \exists\{M_{00}\in\mathcal{B}_T^n\}\ \forall\{\Theta\in\Theta\}\quad P_\Theta^\xi(M_0\,\Delta\,M_{00}) = 0.$$

Proof. The minimality of $T$ implies that there exists a $\mathcal{B}^r$-measurable function $g(\cdot): \mathbb{R}^r\to\mathbb{R}^s$ such that $T = g(V)$ (which obviously implies $\mathcal{B}_T^n \subset \mathcal{B}_V^n$). For each $M\in\mathcal{B}^n$, define a function $\varphi_M(\cdot): \mathbb{R}^n\to\mathbb{R}^1$ in the following way:
$$\varphi_M(x) = P\{M\,|\,g(V(x))\} - P\{M\,|\,V(x)\}.$$
$\varphi_M$ is $\mathcal{B}_V^n$-measurable and for each $\Theta\in\Theta$
$$E_\Theta[\varphi_M(\xi)] = \int P(M|g(V(x)))\,dP_\Theta^\xi(x) - \int P(M|V(x))\,dP_\Theta^\xi(x) = P_\Theta^\xi(M) - P_\Theta^\xi(M) = 0.$$
As $\forall\{\Theta\in\Theta\}\ |\varphi_M(\cdot)| = |P(M|g(V)) - P(M|V)| \le 1\ [\mathcal{B}^n, P_\Theta^\xi]$, the bounded completeness implies
$$\forall\{\Theta\in\Theta\}\quad P_\Theta^{V(\xi)}\{v: \varphi(v) = 0\} = P_\Theta^\xi\{x: \varphi_M(x) = 0\} = 1.$$
Thus it is proved that
$$\forall\{M\in\mathcal{B}^n\}\ \forall\{\Theta\in\Theta\}\quad P_\Theta^\xi\{x: P(M|V(x)) = P(M|T(x))\} = 1.$$
As
$$\forall\{M_0\in\mathcal{B}_V^n\}\ \forall\{\Theta\in\Theta\}\quad \chi_{M_0}(x) = E[\chi_{M_0}(\cdot)|V(x)]\quad [\mathcal{B}_V^n, P_\Theta^\xi],$$
we have
$$\forall\{M_0\in\mathcal{B}_V^n\}\ \forall\{\Theta\in\Theta\}\quad P_\Theta^\xi\{x: \chi_{M_0}(x) = P(M_0|T(x))\} = 1.$$
The function $f(\cdot) = P(M_0|T(\cdot))$ is $\mathcal{B}_T^n$-measurable, so that $M_{00} = f^{-1}(\mathbb{R}^1\setminus\{0\})\in\mathcal{B}_T^n$, from which it follows that $\forall\{\Theta\in\Theta\}\ P_\Theta^\xi(M_0\,\Delta\,M_{00}) = 0$.

Remark 3.1.3. Since events with zero probability can be neglected from the point of view of statistical applications, the boundedly complete sufficient statistic is considered to be practically minimal.

Definition 3.1.5. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\}$ be a class of probability measures defined on $(\mathbb{R}^n, \mathcal{B}^n)$, which class is dominated by a $\sigma$-finite measure $\mu$. The elements $x_1, x_2$ of the sample space $\mathbb{R}^n$ are said to be equivalent (the notation $\sim$ is used for equivalence; $x_1 \sim x_2$) if the ratio $f(x_1,\Theta)/f(x_2,\Theta) = K(x_1, x_2)$ does not depend on $\Theta\in\Theta$; here $f(x,\Theta) = (dP_\Theta^\xi/d\mu)(x)$.

Remark 3.1.4. As the function $f(\cdot,\Theta)$ is defined uniquely, except for a $\mu$-zero set from $\mathcal{B}^n$ which may depend on $\Theta\in\Theta$, and since further the cardinality of the set $\Theta$ may be the cardinality of the continuum, Definition 3.1.5 does not guarantee that the decomposition of the sample space $\mathbb{R}^n$ by the classes of equivalent elements is a measurable decomposition.
Theorem 3.1.6. Let $T: \mathbb{R}^n\to\mathbb{R}^s$ be a statistic for $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\}$ such that $x_1 \sim x_2 \Leftrightarrow T(x_1) = T(x_2)$. Then $T$ is a sufficient and minimal statistic.

Proof. Denote $g(t,\Theta) = (dP_\Theta^{T(\xi)}/d\mu^T)(t)$. According to Definition 2.3.2 and Lemma 2.3.8, there exists a measure that is a variant of the $t$-conditional measure $\mu$; this measure is denoted $\mu(\cdot\,|\,t)$. Evidently, for $A\in\mathcal{B}^n$
$$\forall\{B\in\mathcal{B}^s\}\quad \int_{T^{-1}(B)}\chi_A(x)\,d\mu(x) = \int_{T^{-1}(B)}\mu(A|T(x))\,d\mu(x) = \int_B \mu(A|t)\,d\mu^T(t)$$
and
$$E_\mu[\chi_A(\cdot)f(\cdot,\Theta)\,|\,t] = \int_A f(x,\Theta)\,d\mu(x|t).$$
According to the definition of a $T$-conditional mean value,
$$\forall\{B\in\mathcal{B}^s\}\quad \int_{T^{-1}(B)}\chi_A(x)f(x,\Theta)\,d\mu(x) = \int_{T^{-1}(B)}E_\mu[\chi_A(\cdot)f(\cdot,\Theta)\,|\,T(x)]\,d\mu(x).$$
For $A = \mathbb{R}^n$ we get
$$\int_{T^{-1}(B)}f(x,\Theta)\,d\mu(x) = P_\Theta^\xi(T^{-1}(B)) = \int_B E_\mu[f(\cdot,\Theta)\,|\,t]\,d\mu^T(t) \Rightarrow$$
there exists a version $E_\mu[f(\cdot,\Theta)\,|\,t]$ such that
$$g(t,\Theta) = E_\mu[f(\cdot,\Theta)\,|\,t] = \int f(y,\Theta)\,d\mu(y|t).$$
On $A_t^{(x)} = \{y: T(y) = T(x) = t\}$ (obviously $A_t^{(x)}\in\mathcal{B}^n$) we have, according to the assumption, $f(y,\Theta) = f(x,\Theta)\,K(y,x)$, so that
$$g(t,\Theta) = f(x,\Theta)\int_{A_t^{(x)}}K(y,x)\,d\mu(y|t) = f(x,\Theta)\int K(y,x)\,d\mu(y|t).$$
Thus $f(x,\Theta) = g[T(x),\Theta]h(x)$, where $h(x) = 1\big/\int K(y,x)\,d\mu(y|t)$, which, according to Theorem 3.1.1, proves the sufficiency of $T$ for $\mathcal{P}^\xi$.

Further, let $V: \mathbb{R}^n\to\mathbb{R}^r$ be a sufficient statistic for $\mathcal{P}^\xi$, $V \ne T$. Then, according to Theorem 3.1.1,
$$(dP_\Theta^\xi/d\mu)(x) = g_1(V(x),\Theta)\,h_1(x) = f(x,\Theta).$$
Thus, if $V(x_1) = V(x_2)$, then $f(x_1,\Theta)/f(x_2,\Theta) = h_1(x_1)/h_1(x_2)$ does not depend on $\Theta$, i.e. $x_1 \sim x_2 \Leftrightarrow T(x_1) = T(x_2)$. Thus the implication $V(x_1) = V(x_2) \Rightarrow T(x_1) = T(x_2)$ holds. Hence there exists a function (Definition 3.1.3) $g(\cdot): \mathbb{R}^r\to\mathbb{R}^s$ such that $g(V) = T$, being, according to Lemma 2.3.9, $\mathcal{B}^r$-measurable, which proves the minimality for $\mathcal{P}^\xi$. For further detail see references [6–8, 15, 20, 21, 31, 41, 52, 54, 63, 64, 77, 79–81, 83, 88, 103, 111, 132, 137, 140].
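The equivalence criterion of Theorem 3.1.6 can be probed numerically (our sketch) for $N(\Theta, 1)$ samples with $T(x) = \sum x_i$: the likelihood ratio $f(x_1,\Theta)/f(x_2,\Theta)$ is constant in $\Theta$ exactly when the sums agree.

```python
import math

def log_density(x, theta):
    return sum(-(xi - theta) ** 2 / 2 - 0.5 * math.log(2 * math.pi) for xi in x)

def log_ratio_spread(x1, x2, thetas):
    # how much log f(x1, theta) - log f(x2, theta) varies over theta
    vals = [log_density(x1, t) - log_density(x2, t) for t in thetas]
    return max(vals) - min(vals)

thetas = [-2.0, -0.5, 0.0, 1.0, 3.0]
same_T = ([1.0, 2.0, 0.0], [3.0, -1.0, 1.0])    # both sum to 3
diff_T = ([1.0, 2.0, 0.0], [0.0, 0.0, 0.0])     # sums 3 vs 0

assert log_ratio_spread(*same_T, thetas) < 1e-12   # ratio free of theta
assert log_ratio_spread(*diff_T, thetas) > 1.0     # ratio depends on theta
print("likelihood ratio is theta-free iff T(x1) == T(x2)")
```

The log-ratio equals $\Theta(\sum x_{1i} - \sum x_{2i})$ plus a $\Theta$-free constant, so it is $\Theta$-free precisely on the equivalence classes of $T$.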
3.2 Inequalities of the estimation theory

3.2.1 One-dimensional (scalar) parameters

Definition 3.2.1.1. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^1\}$, $(dP_\Theta^\xi/d\mu)(x) = f(x,\Theta)$, and let $\Theta$ be an open set in $\mathbb{R}^1$. The class $\mathcal{P}^\xi$ is said to be regular with respect to the first derivative if
1. $\forall\{\Theta\in\Theta\}\ f(x,\Theta) > 0\ [\mathcal{B}^n, \mu]$;
2. $\forall\{x\in\mathbb{R}^n\}\ \forall\{\Theta\in\Theta\}\ \exists\ \partial\ln f(x,\Theta)/\partial\Theta$;
3. $\forall\{\Theta\in\Theta\}\ \displaystyle\int[\partial\ln f(x,\Theta)/\partial\Theta]f(x,\Theta)\,d\mu(x) = \frac{d}{d\Theta}\int f(x,\Theta)\,d\mu(x) = 0$;
4. $\forall\{\Theta\in\Theta\}\ 0 < \displaystyle\int[\partial\ln f(x,\Theta)/\partial\Theta]^2 f(x,\Theta)\,d\mu(x) < \infty$.

If we add the following conditions 5 and 6,
5. $\forall\{x\in\mathbb{R}^n\}\ \forall\{\Theta\in\Theta\}\ \exists\ \partial^2\ln f(x,\Theta)/\partial\Theta^2$;
6. $\forall\{\Theta\in\Theta\}\ \displaystyle\int\frac{\partial^2 f(x,\Theta)}{\partial\Theta^2}\,d\mu(x) = \frac{d^2}{d\Theta^2}\int f(x,\Theta)\,d\mu(x) = 0$,

to the preceding conditions 1–4, the class $\mathcal{P}^\xi \ll \mu$ is said to be regular with respect to the second derivative if all of the conditions 1–6 are fulfilled. (The definition can be formulated using $\exists\{M\in\mathcal{B}^n, \mu(M^c) = 0\}\ \forall\{x\in M\}$ instead of $\forall\{x\in\mathbb{R}^n\}$; however, the following considerations are essentially unchanged.)
Theorem 3.2.1.1. Let $\mathcal{P}^\xi \ll \mu$ be a class regular with respect to the second derivative. Then
$$\forall\{\Theta\in\Theta\}\quad E_\Theta\Big[\Big(\frac{\partial\ln f(\xi,\Theta)}{\partial\Theta}\Big)^2\Big] = E_\Theta\Big[-\frac{\partial^2\ln f(\xi,\Theta)}{\partial\Theta^2}\Big].$$

Proof.
$$\frac{\partial\ln f(x,\Theta)}{\partial\Theta} = \frac{1}{f(x,\Theta)}\,\frac{\partial f(x,\Theta)}{\partial\Theta},$$
$$\frac{\partial^2\ln f(x,\Theta)}{\partial\Theta^2} = -\frac{1}{f^2(x,\Theta)}\Big(\frac{\partial f(x,\Theta)}{\partial\Theta}\Big)^2 + \frac{1}{f(x,\Theta)}\,\frac{\partial^2 f(x,\Theta)}{\partial\Theta^2} = -\Big(\frac{\partial\ln f(x,\Theta)}{\partial\Theta}\Big)^2 + \frac{1}{f(x,\Theta)}\,\frac{\partial^2 f(x,\Theta)}{\partial\Theta^2}.$$
Taking the mean value and using condition 6 of Definition 3.2.1.1,
$$E_\Theta\Big[\frac{\partial^2\ln f(\xi,\Theta)}{\partial\Theta^2}\Big] = -E_\Theta\Big[\Big(\frac{\partial\ln f(\xi,\Theta)}{\partial\Theta}\Big)^2\Big] + \int\frac{\partial^2 f(x,\Theta)}{\partial\Theta^2}\,d\mu(x) = -E_\Theta\Big[\Big(\frac{\partial\ln f(\xi,\Theta)}{\partial\Theta}\Big)^2\Big].$$

Remark 3.2.1.1. If the class $\mathcal{P}^\xi$ with a one-dimensional parameter is regular with respect to the first derivative, then for each $\Theta\in\Theta$ there exists a random variable $\eta_\Theta = \partial\ln f(\xi,\Theta)/\partial\Theta$ such that $E_\Theta(\eta_\Theta) = 0$ and $D_\Theta(\eta_\Theta) = E_\Theta[(\partial\ln f(\xi,\Theta)/\partial\Theta)^2]$; if $\mathcal{P}^\xi$ is regular with respect to the second derivative, then $D_\Theta(\eta_\Theta) = E_\Theta[-\partial^2\ln f(\xi,\Theta)/\partial\Theta^2]$.

Definition 3.2.1.2. Let $\mathcal{P}^\xi \ll \mu$ ($\sigma$-finite measure) with a one-dimensional parameter $\Theta$ be regular with respect to the first derivative and let $g(\cdot): \Theta\to\mathbb{R}^1$ be a differentiable function. An unbiased estimator $\tau_g$ is said to be regular with respect to the first derivative if
$$\forall\{\Theta\in\Theta\}\quad \frac{dg(\Theta)}{d\Theta} = \frac{d}{d\Theta}\int \tau_g(x)f(x,\Theta)\,d\mu(x) = \int \tau_g(x)\,\eta_\Theta(x)f(x,\Theta)\,d\mu(x)\ \big({} = \mathrm{cov}_\Theta(\tau_g(\xi), \eta_\Theta)\big),$$
where $\eta_\Theta = \partial\ln f(\xi,\Theta)/\partial\Theta$.
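The defining identity $dg(\Theta)/d\Theta = \mathrm{cov}_\Theta(\tau_g(\xi), \eta_\Theta)$ can be checked by exact enumeration in a finite model (our illustrative sketch): for a single Bernoulli($\Theta$) observation with $\tau_g(x) = x$ and $g(\Theta) = \Theta$, the covariance should equal $dg/d\Theta = 1$.

```python
def score(x, theta):
    # d/d theta of log [theta^x * (1 - theta)^(1 - x)]
    return x / theta - (1 - x) / (1 - theta)

def cov_tau_score(theta):
    # exact covariance over the two outcomes; note E[score] = 0
    pmf = {0: 1 - theta, 1: theta}
    e_tau = sum(x * p for x, p in pmf.items())
    return sum(p * (x - e_tau) * score(x, theta) for x, p in pmf.items())

for theta in (0.1, 0.5, 0.9):
    # g(theta) = theta, so dg/dtheta = 1
    assert abs(cov_tau_score(theta) - 1.0) < 1e-12
print("cov(tau_g, eta_theta) = dg/dtheta = 1 for Bernoulli")
```

This is the regularity property that the covariance-matrix argument in the next theorem relies on.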
Theorem 3.2.1.2. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^1\}$ be regular with respect to the first derivative and let $\tau_g$ be a regular unbiased estimator of the function $g(\cdot)$. Then
$$\forall\{\Theta\in\Theta\}\quad D_\Theta(\tau_g(\xi)) \ge \frac{[dg(\Theta)/d\Theta]^2}{D_\Theta(\eta_\Theta)},$$
for a given $\Theta_0\in\Theta$ the equality $D_{\Theta_0}(\tau_g(\xi)) = [dg(\Theta_0)/d\Theta]^2/D_{\Theta_0}(\eta_{\Theta_0})$ holds if, and only if, $P_{\Theta_0}^\xi\{\tau_g(\xi) - g(\Theta_0) = b(\Theta_0)\eta_{\Theta_0}\} = 1$, and if $b(\Theta_0) = 0$, then $D_{\Theta_0}[\tau_g(\xi)] = 0 \Rightarrow dg(\Theta_0)/d\Theta = 0$. Here the symbol $b(\Theta_0)$ denotes a real number which can be different for different values of the parameter $\Theta_0$.

Proof. If the notation $\zeta_\Theta = a[\tau_g(\xi) - g(\Theta)] + b\eta_\Theta$ is used, then obviously $E_\Theta(\zeta_\Theta) = 0$. Here, according to the assumed regularity of the estimator $\tau_g$, $dg(\Theta)/d\Theta = \mathrm{cov}_\Theta[\tau_g(\xi), \eta_\Theta]$. As every covariance matrix has to be at least positive semidefinite, the covariance matrix
$$\Sigma_0 = \begin{pmatrix} D_\Theta[\tau_g(\xi)] & dg(\Theta)/d\Theta \\ dg(\Theta)/d\Theta & D_\Theta(\eta_\Theta) \end{pmatrix}$$
of the random vector $(\tau_g(\xi) - g(\Theta), \eta_\Theta)'$ must possess this property, and therefore its determinant $D_\Theta[\tau_g(\xi)]\,D_\Theta(\eta_\Theta) - [dg(\Theta)/d\Theta]^2$ must be non-negative, which implies the first assertion (the value $D_\Theta(\eta_\Theta)$ differs from zero because of the regularity of $\mathcal{P}^\xi$). Further,
$$D_{\Theta_0}(\tau_g(\xi)) = [dg(\Theta_0)/d\Theta]^2/D_{\Theta_0}(\eta_{\Theta_0}) \Leftrightarrow \det(\Sigma_0) = 0 \Leftrightarrow R(\Sigma_0) = 1$$
(its rank cannot be zero, since $D_{\Theta_0}(\eta_{\Theta_0}) > 0$)
$$\Leftrightarrow \exists\{(a_0, b_0) \ne (0, 0)\}\quad (a_0, b_0)\Sigma_0(a_0, b_0)' = 0 \Leftrightarrow P_{\Theta_0}^\xi\{a_0[\tau_g(\xi) - g(\Theta_0)] + b_0\eta_{\Theta_0} = 0\} = 1.$$
The value $a_0$ cannot be zero, since $a_0 = 0$ implies $D_{\Theta_0}(\eta_{\Theta_0}) = 0$, which is a contradiction. We use the notation $b_0/a_0 = -b(\Theta_0)$. If $b(\Theta_0) = 0$, then obviously (according to the first assertion) $D_{\Theta_0}[\tau_g(\xi)] = 0 \Rightarrow [dg(\Theta_0)/d\Theta]^2/D_{\Theta_0}(\eta_{\Theta_0}) \le 0 \Rightarrow dg(\Theta_0)/d\Theta = 0$, which proves the second assertion.

Remark 3.2.1.2. The value $D_{\Theta_0}(\eta_{\Theta_0})$ is called the Fisher information of the probability distribution (or of the vector $\xi$) at the point $\Theta_0$; also the symbol $F(\Theta_0)$ is used for it; $F(\cdot): \Theta\to\mathbb{R}^1$. The value $[dg(\Theta)/d\Theta]^2/D_\Theta(\eta_\Theta)$ is called the Rao–Cramér boundary; an estimator $\tau_g(\xi)$ whose dispersion decreases to the Rao–Cramér boundary is usually said to be efficient (for more detail see Rao [102, p. 283]). Some fundamental properties of the Fisher information for the class $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^1\} \ll \mu$, $(dP_\Theta^\xi/d\mu)(x) = f(x,\Theta)$, will now be given.

(a) If vectors $\xi_1, \ldots, \xi_k$ are stochastically independent and have the same probability distribution $P_\Theta^\xi\in\mathcal{P}^\xi$, then the Fisher information contained in this $k$-tuple is $kF(\Theta)$, where $F(\Theta)$ is the Fisher information of the single vectors $\xi_i$, $i = 1, \ldots, k$.

(b) In the class of probability measures $\mathcal{P}^\xi$, the Bhattacharyya distance (see Kendall and Buckland [54]) between the probability measures $P_{\Theta_1}^\xi$ and $P_{\Theta_2}^\xi$ is defined by the relation
$$h(P_{\Theta_1}^\xi, P_{\Theta_2}^\xi) = \arccos\int\sqrt{f(x,\Theta_1)f(x,\Theta_2)}\,d\mu(x).$$
If $\Delta\Theta$ is such a small value that in the development
$$f(x, \Theta + \Delta\Theta) = f(x,\Theta) + [\partial f(x,\Theta)/\partial\Theta]\,\Delta\Theta + (1/2!)[\partial^2 f(x,\Theta)/\partial\Theta^2](\Delta\Theta)^2 + \cdots$$
the third and higher powers can be neglected, then
$$h^2(P_\Theta^\xi, P_{\Theta+\Delta\Theta}^\xi) = \frac18 F(\Theta)(\Delta\Theta)^2.$$

(c) Let $T: (\mathbb{R}^n, \mathcal{B}^n)\to(\mathbb{R}^s, \mathcal{B}^s)$ be an arbitrary statistic for $\xi$, and let the class $\mathcal{P}^\xi \ll \mu$ satisfy the following conditions:
$$\forall\{B\in\mathcal{B}^s\}\quad \frac{d}{d\Theta}\int_{T^{-1}(B)}f(x,\Theta)\,d\mu(x) = \int_{T^{-1}(B)}\frac{\partial f(x,\Theta)}{\partial\Theta}\,d\mu(x)$$
and
$$\forall\{B\in\mathcal{B}^s\}\quad \frac{d}{d\Theta}\int_B g(t,\Theta)\,d\mu^T(t) = \int_B \frac{\partial g(t,\Theta)}{\partial\Theta}\,d\mu^T(t),$$
where $\mu^T(B) = \mu[T^{-1}(B)]$, $B\in\mathcal{B}^s$, and
$$\int_{T^{-1}(B)}f(x,\Theta)\,d\mu(x) = \int_{T^{-1}(B)}E_\mu[f(\cdot,\Theta)|T(x)]\,d\mu(x) = \int_B E_\mu[f(\cdot,\Theta)|t]\,d\mu^T(t),\quad B\in\mathcal{B}^s.$$
Denoting $g(t,\Theta) = E_\mu[f(\cdot,\Theta)|t] = (dP_\Theta^{T(\xi)}/d\mu^T)(t)$, $t\in\mathbb{R}^s$, $f'(x,\Theta) = \partial f(x,\Theta)/\partial\Theta$ and $g'(t,\Theta) = \partial g(t,\Theta)/\partial\Theta$, we get
$$\int_{T^{-1}(B)}[f'(x,\Theta)/f(x,\Theta)]\,dP_\Theta^\xi(x) = \frac{d}{d\Theta}\int_{T^{-1}(B)}f(x,\Theta)\,d\mu(x) = \frac{d}{d\Theta}\int_B g(t,\Theta)\,d\mu^T(t) = \int_B[g'(t,\Theta)/g(t,\Theta)]\,dP_\Theta^{T(\xi)}(t)$$
$$\Rightarrow E\{[f'(\cdot,\Theta)/f(\cdot,\Theta)]\,|\,t\} = g'(t,\Theta)/g(t,\Theta)\quad [\mathcal{B}^s, P_\Theta^{T(\xi)}].$$
Furthermore,
$$0 \le E_\Theta\big\{\big([g'(T(\xi),\Theta)/g(T(\xi),\Theta)] - f'(\xi,\Theta)/f(\xi,\Theta)\big)^2\big\} = F(\Theta) - F^T(\Theta),$$
since
$$E_\Theta\{[g'(T(\xi),\Theta)/g(T(\xi),\Theta)]\,[f'(\xi,\Theta)/f(\xi,\Theta)]\} = E_\Theta\{[g'(T(\xi),\Theta)/g(T(\xi),\Theta)]\,E\{[f'(\cdot,\Theta)/f(\cdot,\Theta)]\,|\,T(\xi)\}\} = E_\Theta\{[g'(T(\xi),\Theta)/g(T(\xi),\Theta)]^2\} = F^T(\Theta).$$
The statistic $T$ does not cause any loss of the Fisher information, i.e. $F^T(\Theta) = F(\Theta)$, $\Theta\in\Theta$, if, and only if,
$$E_\Theta\big\{\big([g'(T(\xi),\Theta)/g(T(\xi),\Theta)] - f'(\xi,\Theta)/f(\xi,\Theta)\big)^2\big\} = 0 \Leftrightarrow g'[T(\xi),\Theta]/g[T(\xi),\Theta] = f'(\xi,\Theta)/f(\xi,\Theta)\ [\mathcal{B}^n, P_\Theta^\xi]$$
$$\Leftrightarrow f(x,\Theta) = g[T(x),\Theta]h(x)$$
(see Theorem 3.1.1). The last assertion is formulated as the following theorem.
Theorem 3.2.1.3. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^1\} \ll \mu$ be regular, at least with respect to the first derivative, and let $T: \mathbb{R}^n\to\mathbb{R}^s$ be a sufficient statistic for $\Theta$. The class of probability measures induced by the statistic $T$ is denoted by $\mathcal{P}^{T(\xi)}$. The Fisher information of the class $\mathcal{P}^\xi$, $F(\cdot): \Theta\to\mathbb{R}^1$, and the Fisher information of the class $\mathcal{P}^{T(\xi)}$, $F^T(\cdot): \Theta\to\mathbb{R}^1$, are in this case the same.

Proof. According to Theorem 3.1.1,
$$f(x,\Theta) = g(T(x),\Theta)\,h(x) \Rightarrow (dP_\Theta^{T(\xi)}/d\mu^T)(t) = g(t,\Theta)\int_{A_t}h(x)\,d\mu(x),$$
where $A_t = \{x: T(x) = t\}\in\mathcal{B}^n$. Denoting $l(t) = \int_{A_t}h(x)\,d\mu(x)$, we get
$$f^T(t,\Theta) = g(t,\Theta)\,l(t).$$
Further,
$$F(\Theta) = \int\big(\partial\ln g(T(x),\Theta)/\partial\Theta\big)^2\,g(T(x),\Theta)\,h(x)\,d\mu(x) = \int\big(\partial\ln g(t,\Theta)/\partial\Theta\big)^2\,g(t,\Theta)\,l(t)\,d\mu^T(t) = F^T(\Theta).$$

Remark 3.2.1.3. Theorem 3.2.1.3 shows that a sufficient statistic preserves the Fisher information on a parameter.

Example 3.2.1.1. Let
$$(dP_\Theta^\xi/d\mu)(x) = \prod_{i=1}^n\frac{1}{\sigma\sqrt{2\pi}}\exp\Big(-\frac{(x_i - \Theta)^2}{2\sigma^2}\Big),\quad \Theta\in(-\infty,\infty) \Rightarrow \eta_\Theta = (1/\sigma^2)\sum_{i=1}^n(\xi_i - \Theta)$$
and $D_\Theta(\eta_\Theta) = F(\Theta) = n\sigma^2/\sigma^4 = n/\sigma^2$. The Fisher information in this case does not depend on $\Theta\in(-\infty,\infty)$, and the larger it is, the smaller is the dispersion of an efficient estimator.

Example 3.2.1.2. Let $(dP_\Theta^\xi/d\mu)(x) ={}$
$\prod_{j=1}^n(\Theta^{x_j}/x_j!)\exp(-\Theta)$, where $x = (x_1, \ldots, x_n)'$, $x_j\in\{0, 1, 2, \ldots\}$, $j = 1, \ldots, n$, $\Theta\in(0,\infty)$. Here $\eta_\Theta = (1/\Theta)\sum_{j=1}^n(\xi_j - \Theta)$ and $F(\Theta) = n/\Theta$.
Theorem 3.2.1.4. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta = \mathbb{R}^1\}$ be dominated by a $\sigma$-finite measure $\mu$, $g(\cdot): \mathbb{R}^1\to\mathbb{R}^1$ be differentiable on $\mathbb{R}^1$ and its derivative $g'(\cdot)$ be not identically zero on $\mathbb{R}^1$. Further, let the function $b(\cdot): \Theta\to\mathbb{R}^1$ from Theorem 3.2.1.2 be such that
$$\forall\{\Theta\in\Theta\}\quad P_\Theta^\xi\{\tau_g(\xi) - g(\Theta) = b(\Theta)\eta_\Theta\} = 1,$$
$$\forall\{\Theta\in\Theta\}\quad Q(\Theta) = \int_{-\infty}^\Theta\frac{dt}{b(t)},\qquad R(\Theta) = -\int_{-\infty}^\Theta\frac{g(t)}{b(t)}\,dt,$$
and
$$\lim_{\Theta\to-\infty}\ln f(x,\Theta) = \ln f_0(x).$$
Then
$$\forall\{\Theta\in\Theta\}\quad D_\Theta(\tau_g(\xi)) = (dg(\Theta)/d\Theta)^2/D_\Theta(\eta_\Theta) \Rightarrow f(x,\Theta) = f_0(x)\exp\big(Q(\Theta)\tau_g(x) + R(\Theta)\big).$$

Proof. According to the assumption and with respect to Theorem 3.2.1.2,
$$\forall\{\Theta\in\Theta\}\quad D_\Theta(\tau_g(\xi)) = (dg(\Theta)/d\Theta)^2/D_\Theta(\eta_\Theta) \Rightarrow \forall\{\Theta\in\Theta\}\quad P_\Theta^\xi\Big\{x: \frac{1}{b(\Theta)}\big[\tau_g(x) - g(\Theta)\big] = \partial\ln f(x,\Theta)/\partial\Theta\Big\} = 1.$$
For a fixed value $x_0$, let us integrate the equality
$$\partial\ln f(x_0, t)/\partial t = [1/b(t)]\,\tau_g(x_0) - g(t)/b(t)$$
over the interval $(-\infty, \Theta)$. We get
$$\ln f(x_0,\Theta) = \lim_{t\to-\infty}\ln f(x_0, t) + Q(\Theta)\tau_g(x_0) + R(\Theta) \Rightarrow f(x_0,\Theta) = f_0(x_0)\exp[Q(\Theta)\tau_g(x_0) + R(\Theta)].$$
The value $x_0$ is an arbitrary element of the set
$$A_\Theta = \Big\{x: \frac{1}{b(\Theta)}\tau_g(x) - \frac{g(\Theta)}{b(\Theta)} = \frac{\partial\ln f(x,\Theta)}{\partial\Theta}\Big\}$$
such that $P_\Theta^\xi(A_\Theta) = 1$. Thus the density
$$f(x,\Theta) = f_0(x)\exp[Q(\Theta)\tau_g(x) + R(\Theta)]$$
can be considered to be the Radon–Nikodym derivative $(dP_\Theta^\xi/d\mu)(x)$.
x
Corollary 3.2.1.1. The statistic $\tau_g(\cdot): \mathbb{R}^n\to\mathbb{R}^1$ from Theorem 3.2.1.4 is, according to Theorem 3.1.1, sufficient for $\Theta = \mathbb{R}^1$.

Corollary 3.2.1.2. If $\mathcal{P}^\xi$ is a class from Theorem 3.2.1.4, i.e.
$$(\mathrm{d}P_\Theta^\xi/\mathrm{d}\mu)(x) = f_0(x)\exp[Q(\Theta)\tau_g(x) + R(\Theta)],$$
then $\mathcal{P}^\xi$ is regular according to Definition 3.2.1.1 and
$$\eta_\Theta = \partial\ln f(x,\Theta)/\partial\Theta = \tau_g(x)\,\mathrm{d}Q(\Theta)/\mathrm{d}\Theta + \mathrm{d}R(\Theta)/\mathrm{d}\Theta \Rightarrow E_\Theta(\eta_\Theta) = E_\Theta(\tau_g(\xi))\,\mathrm{d}Q(\Theta)/\mathrm{d}\Theta + \mathrm{d}R(\Theta)/\mathrm{d}\Theta = 0,$$
which implies that $\tau_g(\xi)$ is an unbiased estimator of the function
$$g(\Theta) = -[\mathrm{d}R(\Theta)/\mathrm{d}\Theta]/[\mathrm{d}Q(\Theta)/\mathrm{d}\Theta]$$
(see Example 3.8). For an arbitrary $\Theta\in\mathbb{R}^1$, define $x = Q(\Theta)$, $y = R(\Theta)$ and in the orthogonal coordinate system $x, y$ consider the trajectory $\{(x,y): x = Q(\Theta), y = R(\Theta), \Theta\in\mathbb{R}^1\}$. According to the assumption, this trajectory has a tangent $y - y(\Theta) = -g(\Theta)(x - x(\Theta))$ at every point $(x(\Theta), y(\Theta))$. If the trajectory can be characterized by a function $p(\cdot): \{x = Q(\Theta): \Theta\in\mathbb{R}^1\}\to\mathbb{R}^1$, where $p[x(\Theta)] = -R(\Theta)$, then obviously
$$\frac{\mathrm{d}p(x)}{\mathrm{d}x}\bigg|_{x = x(\Theta)} = g(\Theta).$$
Corollary 3.2.1.3. If, in addition to Corollary 3.2.1.2, $\mathcal{P}^\xi$ in Theorem 3.2.1.4 is considered to be regular with respect to the second derivative, we get
$$D_\Theta(\tau_g(\xi)) = \frac{\mathrm{d}^2 p(x)}{\mathrm{d}x^2}\bigg|_{x = x(\Theta)}.$$
This relation can be easily derived from Theorems 3.2.1.4 and 3.2.1.2:
$$D_\Theta(\tau_g(\xi)) = (g'(\Theta))^2/D_\Theta(\eta_\Theta) = \Big[\frac{R''(\Theta)Q'(\Theta) - R'(\Theta)Q''(\Theta)}{Q'^2(\Theta)}\Big]^2\Big/E_\Theta\big(-\partial^2\ln f(\xi,\Theta)/\partial\Theta^2\big) =$$
$$= \Big[\frac{R''(\Theta)Q'(\Theta) - R'(\Theta)Q''(\Theta)}{Q'^2(\Theta)}\Big]^2\Big/E_\Theta\big(-Q''(\Theta)\tau_g(\xi) - R''(\Theta)\big) =$$
$$= \Big[\frac{R''(\Theta)Q'(\Theta) - R'(\Theta)Q''(\Theta)}{Q'^2(\Theta)}\Big]^2\Big/\frac{Q''(\Theta)R'(\Theta) - R''(\Theta)Q'(\Theta)}{Q'(\Theta)} = \frac{Q''(\Theta)R'(\Theta) - R''(\Theta)Q'(\Theta)}{Q'^3(\Theta)},$$
where $E_\Theta(\tau_g(\xi)) = -R'(\Theta)/Q'(\Theta)$ was used. If the function $y = p(x)$ is given in the parametric form $x = Q(\Theta)$, $y = -R(\Theta)$, $\Theta\in\mathbb{R}^1$, then obviously $\mathrm{d}y/\mathrm{d}x = -R'(\Theta)/Q'(\Theta)$ and
$$\frac{\mathrm{d}^2 y}{\mathrm{d}x^2} = \frac{\mathrm{d}}{\mathrm{d}\Theta}\Big(-\frac{R'(\Theta)}{Q'(\Theta)}\Big)\frac{\mathrm{d}\Theta}{\mathrm{d}x} = \frac{R'(\Theta)Q''(\Theta) - R''(\Theta)Q'(\Theta)}{Q'^2(\Theta)}\cdot\frac{1}{Q'(\Theta)} = \frac{Q''(\Theta)R'(\Theta) - R''(\Theta)Q'(\Theta)}{Q'^3(\Theta)},$$
by which our assertion is proved.
Remark 3.2.1.4. Theorem 3.2.1.4 can be easily modified for the case $\Theta = (a, b)\subset\mathbb{R}^1$.

Theorem 3.2.1.5. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^1\}\ll\mu$, $\Theta$ be an open set and
$$(\mathrm{d}P_\Theta^\xi/\mathrm{d}\mu)(x) = f(x,\Theta) = \prod_{j=1}^n \varphi(x_j,\Theta)$$
(the statistical independence of the components of the vector $\xi = (\xi_1, \dots, \xi_n)'$), and let the derivatives $\partial^2\ln\varphi(x,\Theta)/\partial\Theta\partial x$ exist. If $T$ is a sufficient statistic for $\Theta$, then
$$f(x,\Theta) = f_0(x)\exp\Big[\sum_{j=1}^n t(x_j)\,Q(\Theta) + n r(\Theta)\Big].$$

Proof. According to Theorem 3.1.1 and with respect to our assumptions,
$$\ln f(x,\Theta) = \sum_{j=1}^n \ln\varphi(x_j,\Theta) = \ln g[T(x),\Theta] + \ln h(x) \Rightarrow \sum_{j=1}^n \partial\ln\varphi(x_j,\Theta)/\partial\Theta = \partial\ln g[T(x),\Theta]/\partial\Theta \Rightarrow$$
$$\Rightarrow \partial^2\ln\varphi(x_k,\Theta)/\partial\Theta\partial x_k = \big[\partial^2\ln g(s,\Theta)/\partial\Theta\partial s\big]\big|_{s=T(x)}\,\partial T(x_1, \dots, x_n)/\partial x_k.$$
Let the value of the parameter $\Theta$ be fixed by $\Theta_0\in\Theta$. Then for $k = 1, \dots, n$
$$\frac{\partial^2\ln\varphi(x_k,\Theta)/\partial\Theta\partial x_k}{\partial^2\ln\varphi(x_k,\Theta_0)/\partial\Theta\partial x_k} = \frac{\big[\partial^2\ln g(s,\Theta)/\partial\Theta\partial s\big]\big|_{s=T(x)}}{\big[\partial^2\ln g(s,\Theta_0)/\partial\Theta\partial s\big]\big|_{s=T(x)}}.$$
In the following, we omit for the time being the index $k$. The right-hand side of the last equation is the same for every $k$; since the left-hand side for the index $k$ depends only on $x_k$, the common value cannot depend on $x$ at all, which is why it is denoted by $a(\Theta)$. For a fixed $\Theta_0$, $\partial^2\ln\varphi(x,\Theta_0)/\partial\Theta\partial x$ depends only on $x$, which we denote
$$\partial^2\ln\varphi(x,\Theta_0)/\partial\Theta\partial x = v(x).$$
Thus
$$\partial^2\ln\varphi(x,\Theta)/\partial\Theta\partial x = v(x)\,a(\Theta) \Rightarrow \partial\ln\varphi(x,\Theta)/\partial x = v(x)\int a(\Theta)\,\mathrm{d}\Theta + s(x) \Rightarrow$$
$$\Rightarrow \ln\varphi(x,\Theta) = \int v(x)\,\mathrm{d}x\int a(\Theta)\,\mathrm{d}\Theta + \int s(x)\,\mathrm{d}x + r(\Theta).$$
If the notation
$$Q(\Theta) = \int a(\Theta)\,\mathrm{d}\Theta,\qquad t(x) = \int v(x)\,\mathrm{d}x,\qquad \ln\varphi_0(x) = \int s(x)\,\mathrm{d}x$$
is used, then
$$f(x,\Theta) = \varphi(x_1,\Theta)\cdots\varphi(x_n,\Theta) = \varphi_0(x_1)\cdots\varphi_0(x_n)\exp\Big(\sum_{j=1}^n t(x_j)\,Q(\Theta) + n r(\Theta)\Big) = f_0(x)\exp\Big(\sum_{j=1}^n t(x_j)\,Q(\Theta) + n r(\Theta)\Big).$$
Remark 3.2.1.5. According to Theorem 3.1.4, $\sum_{j=1}^n t(\xi_j)$ is a sufficient and complete statistic for $\Theta$, so that it is the UBUE of its mean value. Hence, if
$$g(\Theta) = E_\Theta\Big[h\Big(\sum_{j=1}^n t(\xi_j)\Big)\Big],\quad \Theta\in\Theta,$$
where $h(\cdot): (\mathbb{R}^1, \mathcal{B}^1)\to(\mathbb{R}^1, \mathcal{B}^1)$, then $\tau_g(\xi_1, \dots, \xi_n) = h\big(\sum_{j=1}^n t(\xi_j)\big)$ is the UBUE of the function $g(\cdot)$.
Example 3.2.1.3. Let
$$f(x,\Theta) = \prod_{j=1}^n \binom{q}{x_j}\Theta^{x_j}(1-\Theta)^{q-x_j},\quad x_j = 0, 1, \dots, q,\ j = 1, \dots, n,\ \Theta\in(0,1).$$
Then obviously
$$f(x,\Theta) = f_0(x)\exp\Big(\sum_{j=1}^n x_j\,Q(\Theta) + R(\Theta)\Big),$$
where
$$f_0(x) = \prod_{j=1}^n\binom{q}{x_j},\qquad Q(\Theta) = \ln\frac{\Theta}{1-\Theta},\qquad R(\Theta) = qn\ln(1-\Theta),\qquad \frac{\mathrm{d}Q(\Theta)}{\mathrm{d}\Theta} = \frac{1}{\Theta(1-\Theta)},$$
and $\tau_g(x) = \sum_{j=1}^n x_j$ is the UBUE of the function $g(\cdot)$, where
$$g(\Theta) = -\frac{\mathrm{d}R(\Theta)/\mathrm{d}\Theta}{\mathrm{d}Q(\Theta)/\mathrm{d}\Theta} = -\frac{-qn/(1-\Theta)}{1/[\Theta(1-\Theta)]} = nq\Theta,$$
i.e. $E_\Theta\big(\sum_{j=1}^n\xi_j\big) = nq\Theta$. It is also obvious that
$$D_\Theta\Big(\sum_{j=1}^n\xi_j\Big) = \frac{\mathrm{d}g(\Theta)/\mathrm{d}\Theta}{\mathrm{d}Q(\Theta)/\mathrm{d}\Theta} = nq\,\Theta(1-\Theta).$$
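The identities used in Example 3.2.1.3, $E_\Theta(\sum\xi_j) = -R'/Q' = nq\Theta$ and $D_\Theta(\sum\xi_j) = g'/Q' = nq\Theta(1-\Theta)$, can be spot-checked numerically. The following sketch is ours (the values of $n$, $q$, $\Theta$ are arbitrary) and uses central finite differences for the derivatives of $Q$ and $R$.

```python
import math

n, q, theta = 7, 4, 0.3
h = 1e-6

Q = lambda t: math.log(t / (1 - t))       # Q(Theta) = ln(Theta/(1-Theta))
R = lambda t: q * n * math.log(1 - t)     # R(Theta) = qn ln(1-Theta)

dQ = (Q(theta + h) - Q(theta - h)) / (2 * h)
dR = (R(theta + h) - R(theta - h)) / (2 * h)

g_val = -dR / dQ                          # mean value g(Theta) = nq*Theta
dg = q * n                                # g'(Theta) = nq (g is linear in Theta)
disp = dg / dQ                            # dispersion g'(Theta)/Q'(Theta)

print(g_val, n * q * theta)               # both ≈ 8.4
print(disp, n * q * theta * (1 - theta))  # both ≈ 5.88
```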
Example 3.2.1.4. Let the result of an experiment be a realization of a random variable having the Γ-distribution, i.e. whose probability density has the form
$$f(x,\Theta) = \frac{x^{\Theta-1}\exp(-x)}{\Gamma(\Theta)} = \exp(-x)\exp[(\Theta-1)\ln x - \ln\Gamma(\Theta)],\quad x > 0,$$
and $\Theta\in(0,\infty)$. If the experiment is replicated independently $n$ times, we get $\mathcal{P}^\xi\ll\mu$, where
$$(\mathrm{d}P_\Theta^\xi/\mathrm{d}\mu)(x) = \prod_{i=1}^n\exp(-x_i)\cdot\exp\Big((\Theta-1)\sum_{i=1}^n\ln x_i - n\ln\Gamma(\Theta)\Big);\quad Q(\Theta) = \Theta - 1,\quad R(\Theta) = -n\ln\Gamma(\Theta),$$
$$\tau_g(x) = \sum_{i=1}^n\ln x_i,\qquad g(\Theta) = -R'(\Theta)/Q'(\Theta) = n(\ln\Gamma(\Theta))'.$$
Thus the expression $(1/n)\sum_{i=1}^n\ln x_i$ can be used for the estimate of $\mathrm{d}(\ln\Gamma(\Theta))/\mathrm{d}\Theta$. If the purpose of the experiment is to determine an estimate of $\Theta$, then on the basis of the facts given so far, it is not obvious how to construct the estimator. We return to this problem in Chapter 4 (Example 4.1.4).

Example 3.2.1.5. In Example 3.7 it was seen that the UBUE for $g(\Theta) = \Theta^2$ was the statistic $(\mathbf{1}'\xi)^2/[n(n+1)]$. The dispersion of this estimator is
$$D_\Theta(\tau_g(\xi)) = \frac{4n+6}{n(n+1)}\,\Theta^4,$$
while
$$(g'(\Theta))^2/D_\Theta(\eta_\Theta) = 4\Theta^4/n.$$
Thus $D_\Theta[\tau_g(\xi)] > (g'(\Theta))^2/D_\Theta(\eta_\Theta)$. This example shows that an unbiased estimator of a given function whose dispersion would decrease to the Rao–Cramér boundary need not exist. Further, it will be shown how it is possible in some cases to increase the lower boundary of dispersions of unbiased estimators.
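Example 3.2.1.5 can be illustrated by simulation. The sketch below is ours (the exponential model with density $\Theta^{-n}\exp(-\mathbf{1}'x/\Theta)$ of Example 3.7 is assumed; $n$, $\Theta$ and the seed are arbitrary): the UBUE $(\mathbf{1}'\xi)^2/[n(n+1)]$ of $\Theta^2$ is unbiased, yet its dispersion stays strictly above the Rao–Cramér boundary $4\Theta^4/n$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, theta = 5, 2.0
reps = 200_000

x = rng.exponential(theta, size=(reps, n))
tau = x.sum(axis=1)**2 / (n * (n + 1))            # UBUE of g(Theta) = Theta^2

var_true = (4 * n + 6) / (n * (n + 1)) * theta**4  # dispersion (4n+6)Theta^4/[n(n+1)]
rc_bound = 4 * theta**4 / n                        # Rao-Cramer boundary 4*Theta^4/n

print(tau.mean(), theta**2)                        # unbiasedness: both ≈ 4.0
print(tau.var(), var_true, rc_bound)               # empirical variance > RC boundary
```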
Definition 3.2.1.3. Let us take $\mathcal{P}^\xi$ from Definition 3.2.1.1 and let the conditions 1–6 given in this definition hold. If further for $m > 2$

7. $\forall\{x\in\mathbb{R}^n\}\forall\{\Theta\in\Theta\}\forall\{s = 3, \dots, m\}\ \exists\,\partial^s f(x,\Theta)/\partial\Theta^s$;
8. $\forall\{\Theta\in\Theta\}\forall\{s = 3, \dots, m\}\quad \displaystyle\int\frac{\partial^s f(x,\Theta)}{\partial\Theta^s}\,\mathrm{d}\mu(x) = \frac{\mathrm{d}^s}{\mathrm{d}\Theta^s}\int f(x,\Theta)\,\mathrm{d}\mu(x) = 0$;
9. the $m\times m$ matrix $B(\Theta)$, whose $(i,j)$th element is
$$\{B(\Theta)\}_{ij} = \int\frac{1}{f(x,\Theta)}\frac{\partial^i f(x,\Theta)}{\partial\Theta^i}\cdot\frac{1}{f(x,\Theta)}\frac{\partial^j f(x,\Theta)}{\partial\Theta^j}\,f(x,\Theta)\,\mathrm{d}\mu(x),$$
is regular for each $\Theta\in\Theta$,

then $\mathcal{P}^\xi$ is called regular with respect to the $m$th derivative.
Remark 3.2.1.6. If $\mathcal{P}^\xi$ is regular with respect to the $m$th derivative, then for each $\Theta\in\Theta$ there exists a random vector
$$\beta(\xi,\Theta) = \Big(\frac{1}{f(\xi,\Theta)}\frac{\partial f(\xi,\Theta)}{\partial\Theta}, \dots, \frac{1}{f(\xi,\Theta)}\frac{\partial^m f(\xi,\Theta)}{\partial\Theta^m}\Big)',\quad \Theta\in\Theta,$$
such that $E_\Theta[\beta(\xi,\Theta)] = 0$ and its covariance matrix $\Sigma_{\beta(\xi,\Theta)} = B(\Theta)$ is regular.

Definition 3.2.1.4. Let $\mathcal{P}^\xi$ be regular with respect to the $m$th derivative and let the function $g(\cdot): \Theta\to\mathbb{R}^1$ and the estimator $\tau_g\in\mathfrak{U}_g$ (the class of unbiased estimators of the function $g(\cdot)$) have the following properties:

1. $\forall\{\Theta\in\Theta\}\forall\{s = 1, \dots, m\}\ \exists\,\mathrm{d}^s g(\Theta)/\mathrm{d}\Theta^s$;
2. $\forall\{\Theta\in\Theta\}\forall\{s = 1, \dots, m\}$
$$\frac{\mathrm{d}^s g(\Theta)}{\mathrm{d}\Theta^s} = \frac{\mathrm{d}^s}{\mathrm{d}\Theta^s}\int\tau_g(x)f(x,\Theta)\,\mathrm{d}\mu(x) = \int\tau_g(x)\frac{\partial^s f(x,\Theta)}{\partial\Theta^s}\,\mathrm{d}\mu(x)\quad(= \mathrm{cov}_\Theta(\tau_g(\xi), \beta_s(\xi,\Theta))).$$
Then the estimator $\tau_g(\xi)$ is said to be regular in the $m$th derivative.

Theorem 3.2.1.6 (the Bhattacharyya lower boundary for the dispersion). Let $\mathcal{P}^\xi$ and the estimator $\tau_g(\xi)$ of the function $g(\cdot): \Theta\to\mathbb{R}^1$ be regular in the $m$th derivative. Then

1. $$D_\Theta(\tau_g(\xi)) \ge b_m(\Theta) = \Big(\frac{\mathrm{d}g(\Theta)}{\mathrm{d}\Theta}, \frac{\mathrm{d}^2 g(\Theta)}{\mathrm{d}\Theta^2}, \dots, \frac{\mathrm{d}^m g(\Theta)}{\mathrm{d}\Theta^m}\Big)B^{-1}(\Theta)\Big(\frac{\mathrm{d}g(\Theta)}{\mathrm{d}\Theta}, \dots, \frac{\mathrm{d}^m g(\Theta)}{\mathrm{d}\Theta^m}\Big)';$$
2. $D_{\Theta_0}(\tau_g(\xi)) = b_m(\Theta_0) \Leftrightarrow \exists\{L(\Theta_0)\in\mathbb{R}^m\}\ P_{\Theta_0}\{\tau_g(\xi) - g(\Theta_0) = L'(\Theta_0)\beta(\xi,\Theta_0)\} = 1$.

Proof. Consider an arbitrary $(m+1)$-dimensional random vector $\eta = (\eta_1, \eta_2')'$, where $\eta_2$ is the vector of the last $m$ components. The covariance matrix of the vector $\eta$ is
$$\Sigma = \begin{pmatrix}\sigma_{11} & \sigma_{12}\\ \sigma_{21} & \Sigma_{22}\end{pmatrix},$$
and let $\det(\Sigma_{22})\ne 0$. Denote by $\mu = (\mu_1, \mu_2')'$ the vector of the mean value of the vector $\eta$. The random variable $p'\eta_2$, $p\in\mathbb{R}^m$, is maximally correlated with $\eta_1$, i.e. $\rho(\eta_1, p'\eta_2) = \max\{\rho(\eta_1, x'\eta_2): x\in\mathbb{R}^m\}$, if and only if $p' = k\sigma_{12}\Sigma_{22}^{-1}$, $k > 0$. For $k = 1$
$$D(\eta_1 - \sigma_{12}\Sigma_{22}^{-1}\eta_2) = \sigma_{11} - \sigma_{12}\Sigma_{22}^{-1}\sigma_{21} \ge 0,\quad\text{and}$$
$$\sigma_{11} - \sigma_{12}\Sigma_{22}^{-1}\sigma_{21} = 0 \Leftrightarrow P\{\eta_1 - \mu_1 = \sigma_{12}\Sigma_{22}^{-1}(\eta_2 - \mu_2)\} = 1.$$
Choose now
$$\eta_1 = \tau_g(\xi) \Rightarrow \mu_1 = g(\Theta),\qquad \eta_2 = \beta(\xi,\Theta) \Rightarrow \mu_2 = 0.$$
Considering the assumption of regularity with respect to the $m$th derivative, we have $\sigma_{12} = \Big(\dfrac{\mathrm{d}g(\Theta)}{\mathrm{d}\Theta}, \dfrac{\mathrm{d}^2 g(\Theta)}{\mathrm{d}\Theta^2}, \dots, \dfrac{\mathrm{d}^m g(\Theta)}{\mathrm{d}\Theta^m}\Big)$ and $\Sigma_{22} = B(\Theta)$. This implies the validity of the first assertion and the validity of the implication
$$D_{\Theta_0}(\tau_g(\xi)) = b_m(\Theta_0) \Rightarrow \exists\{L(\Theta_0)\in\mathbb{R}^m\}\ P_{\Theta_0}\{\tau_g(\xi) - g(\Theta_0) = L'(\Theta_0)\beta(\xi,\Theta_0)\} = 1$$
from the second assertion. It suffices to choose $\Sigma_{22}^{-1}\sigma_{21} = B^{-1}(\Theta_0)\sigma_{21}$ for $L(\Theta_0)$. Further, if there exists $L(\Theta_0)$ such that
$$P_{\Theta_0}\{\tau_g(\xi) - g(\Theta_0) = L'(\Theta_0)\beta(\xi,\Theta_0)\} = 1,$$
then
$$\Sigma = \begin{pmatrix}L'(\Theta_0)B(\Theta_0)L(\Theta_0), & L'(\Theta_0)B(\Theta_0)\\ B(\Theta_0)L(\Theta_0), & B(\Theta_0)\end{pmatrix},$$
which implies
$$D_{\Theta_0}(\tau_g(\xi)) = L'(\Theta_0)B(\Theta_0)L(\Theta_0) = L'(\Theta_0)B(\Theta_0)B^{-1}(\Theta_0)B(\Theta_0)L(\Theta_0) =$$
$$= \Big(\frac{\mathrm{d}g}{\mathrm{d}\Theta}, \dots, \frac{\mathrm{d}^m g}{\mathrm{d}\Theta^m}\Big)B^{-1}(\Theta_0)\Big(\frac{\mathrm{d}g}{\mathrm{d}\Theta}, \dots, \frac{\mathrm{d}^m g}{\mathrm{d}\Theta^m}\Big)' = b_m(\Theta_0),$$
since $L'(\Theta_0)B(\Theta_0) = \sigma_{12} = \big(\mathrm{d}g/\mathrm{d}\Theta, \dots, \mathrm{d}^m g/\mathrm{d}\Theta^m\big)$.
Theorem 3.2.1.7. Let $\mathcal{P}^\xi$ and the estimator $\tau_g(\xi)$ of the function $g(\cdot): \Theta\to\mathbb{R}^1$ be regular in the $m$th derivative and
$$b_s(\Theta) = \Big(\frac{\mathrm{d}g(\Theta)}{\mathrm{d}\Theta}, \dots, \frac{\mathrm{d}^s g(\Theta)}{\mathrm{d}\Theta^s}\Big)B_s^{-1}(\Theta)\Big(\frac{\mathrm{d}g(\Theta)}{\mathrm{d}\Theta}, \dots, \frac{\mathrm{d}^s g(\Theta)}{\mathrm{d}\Theta^s}\Big)',$$
where $B_s$ arises from the matrix $B(\Theta)$ by omitting the $(m-s)$ last rows and columns. Then $b_s(\Theta) \le b_{s+1}(\Theta)$ for $s = 1, 2, \dots, (m-1)$.

Proof. If the notations
$$a' = \Big(\frac{\mathrm{d}g}{\mathrm{d}\Theta}, \dots, \frac{\mathrm{d}^s g}{\mathrm{d}\Theta^s}\Big),\quad b = \frac{\mathrm{d}^{s+1}g}{\mathrm{d}\Theta^{s+1}},\quad \Sigma_{11} = B_s(\Theta),\quad B_{s+1}(\Theta) = \begin{pmatrix}\Sigma_{11} & \sigma_{12}\\ \sigma_{21} & \sigma_{22}\end{pmatrix}$$
are used, then the inequality
$$(a', b)\begin{pmatrix}\Sigma_{11} & \sigma_{12}\\ \sigma_{21} & \sigma_{22}\end{pmatrix}^{-1}\begin{pmatrix}a\\ b\end{pmatrix} \ge a'\Sigma_{11}^{-1}a$$
is to be proved. This is implied by the following relations:
$$\begin{pmatrix}\Sigma_{11} & \sigma_{12}\\ \sigma_{21} & \sigma_{22}\end{pmatrix}^{-1} = \begin{pmatrix}I, & -\Sigma_{11}^{-1}\sigma_{12}\\ 0', & 1\end{pmatrix}\begin{pmatrix}\Sigma_{11}^{-1}, & 0\\ 0', & (\sigma_{22} - \sigma_{21}\Sigma_{11}^{-1}\sigma_{12})^{-1}\end{pmatrix}\begin{pmatrix}I, & 0\\ -\sigma_{21}\Sigma_{11}^{-1}, & 1\end{pmatrix} \Rightarrow$$
$$\Rightarrow (a', b)\,B_{s+1}^{-1}(\Theta)\begin{pmatrix}a\\ b\end{pmatrix} = a'\Sigma_{11}^{-1}a + (b - \sigma_{21}\Sigma_{11}^{-1}a)^2\,(\sigma_{22} - \sigma_{21}\Sigma_{11}^{-1}\sigma_{12})^{-1} \ge a'\Sigma_{11}^{-1}a.$$
The last inequality is a consequence of the fact that $\sigma_{22} - \sigma_{21}\Sigma_{11}^{-1}\sigma_{12} > 0$, which follows from the proof of Theorem 3.2.1.6.

Remark 3.2.1.7. The problem formulated in Example 3.2.1.5 can be solved in some cases by means of Theorems 3.2.1.6 and 3.2.1.7.
Example 3.2.1.6 (which is a continuation of Examples 3.7 and 3.2.1.5). Let $f(x,\Theta) = \Theta^{-n}\exp(-\mathbf{1}'x/\Theta)$; consider $b_2(\Theta)$ from Theorem 3.2.1.6. Then
$$\frac{1}{f(\xi,\Theta)}\frac{\partial f(\xi,\Theta)}{\partial\Theta} = -\frac{n}{\Theta} + \frac{\mathbf{1}'\xi}{\Theta^2},\qquad \frac{1}{f(\xi,\Theta)}\frac{\partial^2 f(\xi,\Theta)}{\partial\Theta^2} = \Big(-\frac{n}{\Theta} + \frac{\mathbf{1}'\xi}{\Theta^2}\Big)^2 + \frac{n}{\Theta^2} - \frac{2\,\mathbf{1}'\xi}{\Theta^3},$$
and, for $\tau_g(\xi) = (\mathbf{1}'\xi)^2/[n(n+1)]$ and $g(\Theta) = \Theta^2$,
$$b_2(\Theta) = \Big(\frac{\mathrm{d}g}{\mathrm{d}\Theta}, \frac{\mathrm{d}^2 g}{\mathrm{d}\Theta^2}\Big)B^{-1}(\Theta)\Big(\frac{\mathrm{d}g}{\mathrm{d}\Theta}, \frac{\mathrm{d}^2 g}{\mathrm{d}\Theta^2}\Big)',$$
where
$$B_{11}(\Theta) = n/\Theta^2;\qquad B_{12}(\Theta) = B_{21}(\Theta) = 0;\qquad B_{22}(\Theta) = 2n(n+1)/\Theta^4.$$
Since $\mathrm{d}g/\mathrm{d}\Theta = 2\Theta$ and $\mathrm{d}^2 g/\mathrm{d}\Theta^2 = 2$,
$$b_2(\Theta) = \frac{(2\Theta)^2\Theta^2}{n} + \frac{4\Theta^4}{2n(n+1)} = \frac{4\Theta^4}{n} + \frac{2\Theta^4}{n(n+1)} = \frac{4n+6}{n(n+1)}\,\Theta^4.$$
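The arithmetic of Example 3.2.1.6 can be verified exactly. The sketch below is ours and uses exact fractions, with $\Theta = 1$ factored out (every term scales as $\Theta^4$): with $B_{11} = n/\Theta^2$, $B_{12} = 0$, $B_{22} = 2n(n+1)/\Theta^4$ and $g(\Theta) = \Theta^2$, the second-order boundary $b_2$ equals the dispersion $(4n+6)\Theta^4/[n(n+1)]$ of the UBUE and exceeds the first-order (Rao–Cramér) boundary $b_1 = 4\Theta^4/n$.

```python
from fractions import Fraction

for n in range(2, 20):
    # B(Theta) at Theta = 1 for f(x, Theta) = Theta^(-n) exp(-1'x / Theta)
    B11 = Fraction(n)                 # n / Theta^2
    B22 = Fraction(2 * n * (n + 1))   # 2n(n+1) / Theta^4; B12 = B21 = 0
    dg, d2g = 2, 2                    # g(Theta) = Theta^2: g' = 2 Theta, g'' = 2

    b1 = Fraction(dg**2, 1) / B11     # Rao-Cramer boundary 4/n
    b2 = b1 + Fraction(d2g**2, 1) / B22          # Bhattacharyya boundary, m = 2
    disp = Fraction(4 * n + 6, n * (n + 1))      # dispersion of the UBUE

    assert b2 == disp and b2 > b1
print("b2 equals the dispersion of the UBUE for all tested n")
```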
We see that the lower boundary (greater than the Rao–Cramér boundary) of the type $b_2(\Theta)$ was attained by the estimator $\tau_g(\xi) = (\mathbf{1}'\xi)^2/[n(n+1)]$, which is known (see Example 3.7) to be the UBUE of the function $g(\Theta) = \Theta^2$.

Remark 3.2.1.8. It is quite natural to ask what the value $\lim_{s\to\infty} b_s(\Theta)$ is in the case when $\mathcal{P}^\xi$ and $\tau_g(\xi)$ are regular for every order of the derivative. An elegant example of such a situation is given by Machek [83, p. 78].

3.2.2 Multidimensional (vector) parameters
Definition 3.2.2.1. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^k\}\ll\mu$ (a σ-finite measure), $(\mathrm{d}P_\Theta^\xi/\mathrm{d}\mu)(x) = f(x,\Theta)$, $\Theta$ be an open set in $\mathbb{R}^k$, and let

1. $\forall\{\Theta\in\Theta\}\ f(x,\Theta) > 0\ [\mathcal{B}^n, \mu]$;
2. $\forall\{\Theta\in\Theta\}\forall\{x\in\mathbb{R}^n\}\forall\{i = 1, \dots, k\}\ \exists\,\partial f(x,\Theta)/\partial\Theta_i$;
3. $\forall\{\Theta\in\Theta\}\forall\{i = 1, \dots, k\}$
$$\int[\partial\ln f(x,\Theta)/\partial\Theta_i]\,f(x,\Theta)\,\mathrm{d}\mu(x) = \frac{\partial}{\partial\Theta_i}\int f(x,\Theta)\,\mathrm{d}\mu(x) = 0;$$
4. $\forall\{\Theta\in\Theta\}\forall\{i, j = 1, \dots, k\}\ \exists\displaystyle\int\frac{\partial\ln f(x,\Theta)}{\partial\Theta_i}\cdot\frac{\partial\ln f(x,\Theta)}{\partial\Theta_j}\,f(x,\Theta)\,\mathrm{d}\mu(x)$.

Further, let the matrix $F(\Theta)$,
$$\{F(\Theta)\}_{ij} = \int\frac{\partial\ln f(x,\Theta)}{\partial\Theta_i}\cdot\frac{\partial\ln f(x,\Theta)}{\partial\Theta_j}\,f(x,\Theta)\,\mathrm{d}\mu(x),$$
be positive definite. Then the class $\mathcal{P}^\xi$ is regular with respect to the first derivatives.

If besides the foregoing, the class $\mathcal{P}^\xi$ also satisfies the conditions

5. $\forall\{\Theta\in\Theta\}\forall\{i, j = 1, \dots, k\}\forall\{x\in\mathbb{R}^n\}\ \exists\,\partial^2\ln f(x,\Theta)/\partial\Theta_i\partial\Theta_j$;
6. $\forall\{\Theta\in\Theta\}\forall\{i, j = 1, \dots, k\}\quad \displaystyle\int\frac{\partial^2 f(x,\Theta)}{\partial\Theta_i\partial\Theta_j}\,\mathrm{d}\mu(x) = \frac{\partial^2}{\partial\Theta_i\partial\Theta_j}\int f(x,\Theta)\,\mathrm{d}\mu(x) = 0$,

then it is said to be regular with respect to the second derivatives.
Theorem 3.2.2.1. In the class $\mathcal{P}^\xi$ regular with respect to the second derivatives,
$$\forall\{\Theta\in\Theta\}\forall\{i, j = 1, \dots, k\}\quad E_\Theta\Big(\frac{\partial\ln f(\xi,\Theta)}{\partial\Theta_i}\cdot\frac{\partial\ln f(\xi,\Theta)}{\partial\Theta_j}\Big) = E_\Theta\Big(-\frac{\partial^2\ln f(\xi,\Theta)}{\partial\Theta_i\partial\Theta_j}\Big).$$
The proof of our theorem is analogous to the proof of Theorem 3.2.1.1.

Remark 3.2.2.1. If $\mathcal{P}^\xi$ from Definition 3.2.2.1 is regular in the first derivatives, then for each $\Theta\in\Theta$ there exists the random vector $\eta_\Theta = \partial\ln f(\xi,\Theta)/\partial\Theta$ such that $E_\Theta(\eta_\Theta) = 0$, and its covariance matrix $\Sigma_{\eta_\Theta}$ is
$$\Sigma_{\eta_\Theta} = F(\Theta) = E_\Theta\Big(\frac{\partial\ln f(\xi,\Theta)}{\partial\Theta}\Big(\frac{\partial\ln f(\xi,\Theta)}{\partial\Theta}\Big)'\Big);$$
if $\mathcal{P}^\xi$ is regular with respect to the second derivatives, then
$$\Sigma_{\eta_\Theta} = F(\Theta) = E_\Theta\Big(-\frac{\partial^2\ln f(\xi,\Theta)}{\partial\Theta\partial\Theta'}\Big).$$
Definition 3.2.2.2. Let $\mathcal{P}^\xi$ be a class of probability measures from Definition 3.2.2.1. Let $g(\cdot): \Theta\to\mathbb{R}^r$ and suppose that for all components $g_i(\Theta)$, $i = 1, \dots, r$, there exist their partial derivatives $\partial g_i(\Theta)/\partial\Theta_j$, $j = 1, \dots, k$. An unbiased estimator $\tau_g(\cdot): \mathbb{R}^n\to\mathbb{R}^r$ of the function $g(\cdot)$ is said to be regular for the set $\Theta$ (or for the class $\mathcal{P}^\xi$) if $\forall\{\Theta\in\Theta\}\forall\{i = 1, \dots, r\}\forall\{j = 1, \dots, k\}$
$$\partial g_i(\Theta)/\partial\Theta_j = \frac{\partial}{\partial\Theta_j}\int\tau_{g,i}(x)f(x,\Theta)\,\mathrm{d}\mu(x) = \int\tau_{g,i}(x)\frac{\partial f(x,\Theta)}{\partial\Theta_j}\,\mathrm{d}\mu(x)\quad(= \mathrm{cov}_\Theta(\tau_{g,i}(\xi), \eta_{\Theta,j})).$$
Here $\tau_{g,i}$ is the $i$th component of $\tau_g$ and $\eta_{\Theta,j}$ is the $j$th component of $\eta_\Theta$ from Remark 3.2.2.1.

Theorem 3.2.2.2. Let $\mathcal{P}^\xi$ be the class from Definition 3.2.2.1 and $g(\cdot): \Theta\to\mathbb{R}^r$, and let $\tau_g(\cdot): \mathbb{R}^n\to\mathbb{R}^r$ be from Definition 3.2.2.2. Furthermore, let the existence of the covariance matrix $\Sigma_{\tau_g}$ of the estimator $\tau_g$ be assumed. Then
$$\forall\{\Theta\in\Theta\}\forall\{a\in\mathbb{R}^r\}\quad a'\Sigma_{\tau_g}a \ge a'\frac{\partial g(\Theta)}{\partial\Theta'}F^{-1}(\Theta)\frac{\partial g'(\Theta)}{\partial\Theta}a,$$
and for a fixed $\Theta_0\in\Theta$
$$\forall\{a\in\mathbb{R}^r\}\ a'\Sigma_{\tau_g}a = a'\frac{\partial g(\Theta_0)}{\partial\Theta'}F^{-1}(\Theta_0)\frac{\partial g'(\Theta_0)}{\partial\Theta}a \Leftrightarrow$$
$$\Leftrightarrow \exists\{B(\Theta_0)\ \text{an}\ r\times k\ \text{matrix}\}\ P_{\Theta_0}\Big\{\tau_g(\xi) - g(\Theta_0) = B(\Theta_0)\frac{\partial\ln f(\xi,\Theta_0)}{\partial\Theta}\Big\} = 1.$$

Proof. The covariance matrix $\Sigma_0$ of the random vector $[(\tau_g(\xi) - g(\Theta))', (\partial\ln f(\xi,\Theta)/\partial\Theta)']'$ is
$$\Sigma_0 = \begin{pmatrix}\Sigma_{\tau_g}, & \partial g/\partial\Theta'\\ \partial g'/\partial\Theta, & F(\Theta)\end{pmatrix}.$$
Let the symbol $M$ denote the matrix
$$M = \begin{pmatrix}I, & -(\partial g/\partial\Theta')F^{-1}(\Theta)\\ 0, & I\end{pmatrix},$$
which is obviously regular ($\det(M) = 1$). Then
$$M\Sigma_0 M' = \Sigma_1 = \begin{pmatrix}\Sigma_{\tau_g} - (\partial g/\partial\Theta')F^{-1}(\Theta)\,\partial g'/\partial\Theta, & 0\\ 0, & F(\Theta)\end{pmatrix}.$$
The matrix $\Sigma_0$ is a covariance matrix, and thus it has to be at least positive semidefinite; with respect to the regularity of the matrix $M$, the matrix $\Sigma_1$ is then also at least positive semidefinite. Thus the matrix $\Sigma_{\tau_g} - (\partial g/\partial\Theta')F^{-1}(\Theta)\,\partial g'/\partial\Theta$ is at least positive semidefinite, i.e.
$$\forall\{a\in\mathbb{R}^r\}\ a'[\Sigma_{\tau_g} - (\partial g/\partial\Theta')F^{-1}(\Theta)\,\partial g'/\partial\Theta]a \ge 0,$$
which proves the first part of the assertion.

Suppose that for a fixed $\Theta_0$ there exists a matrix $B(\Theta_0)$ such that
$$P_{\Theta_0}\{\tau_g(\xi) - g(\Theta_0) = B(\Theta_0)\,\partial\ln f(\xi,\Theta_0)/\partial\Theta\} = 1 \Rightarrow$$
$$\Rightarrow \Sigma_0 = \begin{pmatrix}B(\Theta_0)F(\Theta_0)B'(\Theta_0), & B(\Theta_0)F(\Theta_0)\\ F(\Theta_0)B'(\Theta_0), & F(\Theta_0)\end{pmatrix} \Rightarrow \partial g/\partial\Theta' = B(\Theta_0)F(\Theta_0) \Rightarrow$$
$$\Rightarrow \Sigma_{\tau_g} = B(\Theta_0)F(\Theta_0)B'(\Theta_0) = B(\Theta_0)F(\Theta_0)F^{-1}(\Theta_0)F(\Theta_0)B'(\Theta_0) = (\partial g/\partial\Theta')F^{-1}(\Theta_0)\,\partial g'/\partial\Theta,$$
thus $\forall\{a\in\mathbb{R}^r\}\ a'\Sigma_{\tau_g}a = a'(\partial g(\Theta_0)/\partial\Theta')F^{-1}(\Theta_0)(\partial g'(\Theta_0)/\partial\Theta)a$.

Conversely, let $\forall\{a\in\mathbb{R}^r\}\ a'\Sigma_{\tau_g}a = a'(\partial g/\partial\Theta')F^{-1}(\Theta_0)(\partial g'/\partial\Theta)a$, i.e. $\forall\{a\in\mathbb{R}^r\}\ a'[\Sigma_{\tau_g} - (\partial g/\partial\Theta')F^{-1}(\Theta_0)\,\partial g'/\partial\Theta]a = 0$. Choosing for $a$ in sequence $e_1, e_2, \dots, e_r$, where $e_i = (0_1, \dots, 0_{i-1}, 1_i, 0_{i+1}, \dots, 0_r)'$, we get
$$0 = e_i'[\Sigma_{\tau_g} - (\partial g/\partial\Theta')F^{-1}(\Theta_0)\,\partial g'/\partial\Theta]e_i = D_{\Theta_0}\big(\tau_{g,i}(\xi) - \{(\partial g/\partial\Theta')F^{-1}(\Theta_0)\}_{i\cdot}\,\eta_{\Theta_0}\big) \Rightarrow$$
$$\Rightarrow P_{\Theta_0}\Big\{\tau_{g,i}(\xi) - g_i(\Theta_0) = \{(\partial g/\partial\Theta')F^{-1}(\Theta_0)\}_{i\cdot}\,\frac{\partial\ln f(\xi,\Theta_0)}{\partial\Theta}\Big\} = 1.$$
It is now obvious how to finish the proof of the second part of the assertion.

Remark 3.2.2.2. The matrix $F(\Theta)$ is called the Fisher information matrix. Its properties are analogous to the properties of the Fisher information mentioned in Remark 3.2.1.2.
Theorem 3.2.2.3. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^k\}\ll\mu$ be regular, at least with respect to the first derivatives, and let $T(\cdot): \mathbb{R}^n\to\mathbb{R}^s$ be a sufficient statistic for $\mathcal{P}^\xi$. The class of all probability measures induced by the statistic $T(\cdot)$ is denoted
$$\mathcal{P}^{T\xi} = \{P_\Theta^{T\xi}: \Theta\in\Theta\subset\mathbb{R}^k\}\ll\mu^T.$$
Then for each $\Theta\in\Theta$ the Fisher information matrix $F(\Theta)$ of the class $\mathcal{P}^\xi$ is the same as the Fisher information matrix $F^T(\Theta)$ of the class $\mathcal{P}^{T\xi}$. The proof is analogous to the proof of Theorem 3.2.1.3.

Example 3.2.2.1. Let
$$(\mathrm{d}P_\Theta^\xi/\mathrm{d}\lambda)(x) = f(x,\Theta) = \frac{1}{(2\pi)^{n/2}[\det(\Sigma)]^{1/2}}\exp\Big(-\frac{1}{2}(x - A\Theta)'\Sigma^{-1}(x - A\Theta)\Big),\quad \Theta\in\mathbb{R}^k,$$
where $\Sigma$ is the regular $n\times n$ covariance matrix of the vector $\xi$ and the rank of the $n\times k$ matrix $A$ is $R(A) = k \le n$. In this case, the Fisher information matrix $F(\Theta)$ has the form
$$F(\Theta) = E_\Theta\Big(-\frac{\partial^2\ln f(\xi,\Theta)}{\partial\Theta\partial\Theta'}\Big) = A'\Sigma^{-1}A.$$
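Example 3.2.2.1 can be checked numerically. The sketch below is ours, with an arbitrary small design matrix $A$ and covariance matrix $\Sigma$: the log-likelihood of the model $\xi\sim N(A\Theta, \Sigma)$ is quadratic in $\Theta$, so its negative Hessian is constant and a finite-difference Hessian reproduces the Fisher information matrix $A'\Sigma^{-1}A$.

```python
import numpy as np

A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])   # 3x2 design matrix, rank 2
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.2],
                  [0.0, 0.2, 1.5]])                  # regular covariance matrix
Si = np.linalg.inv(Sigma)
F = A.T @ Si @ A                                     # Fisher matrix A' Sigma^{-1} A

x = np.array([0.7, 1.1, 2.9])                        # one fixed observation

def loglik(th):
    r = x - A @ th
    return -0.5 * r @ Si @ r                         # additive constants dropped

# Central finite-difference Hessian of the log-likelihood (exact up to
# rounding, since the log-likelihood is quadratic in Theta).
th0, h = np.array([0.3, -0.2]), 1e-4
H = np.empty((2, 2))
for i in range(2):
    for j in range(2):
        ei, ej = np.eye(2)[i] * h, np.eye(2)[j] * h
        H[i, j] = (loglik(th0 + ei + ej) - loglik(th0 + ei - ej)
                   - loglik(th0 - ei + ej) + loglik(th0 - ei - ej)) / (4 * h * h)

print(np.max(np.abs(-H - F)))   # agreement of -Hessian with A' Sigma^{-1} A
```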
Definition 3.2.2.3. Let $\mathcal{P}^\xi$ be the class of probability measures from Definition 3.2.2.1, and let the conditions 1–6 from that Definition be fulfilled. Further, for $m > 2$ let the following conditions 7, 8 and 9 be fulfilled, where

7. $\forall\{x\in\mathbb{R}^n\}\forall\{\Theta\in\Theta\}\forall\{j_1, \dots, j_k: j_1 + \dots + j_k \le m\}\ \exists\,\dfrac{\partial^{\,j_1+\dots+j_k} f(x,\Theta)}{\partial\Theta_1^{j_1}\dots\partial\Theta_k^{j_k}}$;
8. $\forall\{\Theta\in\Theta\}\forall\{j_1, \dots, j_k: j_1 + \dots + j_k \le m\}\quad \displaystyle\int\frac{\partial^{\,j_1+\dots+j_k} f(x,\Theta)}{\partial\Theta_1^{j_1}\dots\partial\Theta_k^{j_k}}\,\mathrm{d}\mu(x) = 0$.
9. Before stating this condition, a table of the ordered first, second and up to the $m$th derivatives has to be given:

    s     | j_1  j_2  ...  j_k
    ------+-------------------
    1     |  1    0   ...   0
    ...   |
    k     |  0    0   ...   1    (the number of the first derivatives is k = C(k,1))
    k+1   |  2    0   ...   0
    k+2   |  1    1   ...   0
    ...   |
          |  0    0   ...   2    (the number of the second derivatives is C(k+1,2))
    ...   |
    s     |  0    0   ...   m    (etc., up to the mth derivatives)

Further, the symbol $\beta_i(x,\Theta)$ is used, which for the single indices $i = 1, 2, \dots, s$ is defined from the foregoing table as
$$\beta_1(x,\Theta) = \frac{1}{f(x,\Theta)}\frac{\partial f(x,\Theta)}{\partial\Theta_1},\quad \beta_2(x,\Theta) = \frac{1}{f(x,\Theta)}\frac{\partial f(x,\Theta)}{\partial\Theta_2},\ \dots,\ \beta_s(x,\Theta) = \frac{1}{f(x,\Theta)}\frac{\partial^m f(x,\Theta)}{\partial\Theta_k^m},$$
and then the condition 9 may be formulated in the following way: the $s\times s$ matrix $B(\Theta)$, where $s = \binom{k}{1} + \binom{k+1}{2} + \dots + \binom{k+m-1}{m}$ and whose $(i,j)$th element $\{B(\Theta)\}_{ij}$ is defined by
$$\{B(\Theta)\}_{ij} = \int\beta_i(x,\Theta)\,\beta_j(x,\Theta)\,f(x,\Theta)\,\mathrm{d}\mu(x),$$
is regular for each $\Theta\in\Theta$. The class $\mathcal{P}^\xi$ fulfilling the conditions 1–9 is said to be regular with respect to the $m$th derivative.
Remark 3.2.2.3. If $\mathcal{P}^\xi$ is regular with respect to the $m$th derivative, then for each $\Theta\in\Theta$ there exists a random vector
$$\beta(\xi,\Theta) = \Big(\frac{1}{f(\xi,\Theta)}\frac{\partial f(\xi,\Theta)}{\partial\Theta_1}, \dots, \frac{1}{f(\xi,\Theta)}\frac{\partial f(\xi,\Theta)}{\partial\Theta_k}, \dots, \frac{1}{f(\xi,\Theta)}\frac{\partial^m f(\xi,\Theta)}{\partial\Theta_k^m}\Big)',$$
whose mean value is $E_\Theta[\beta(\xi,\Theta)] = 0$, and the covariance matrix $\Sigma_{\beta(\xi,\Theta)} = B(\Theta)$ is regular.
Definition 3.2.2.4. Let $\mathcal{P}^\xi$ be regular with respect to the $m$th derivative and let $g(\cdot): \Theta\to\mathbb{R}^l$ and the unbiased estimator $\tau_g$ have the following properties:

1. $\forall\{\Theta\in\Theta\subset\mathbb{R}^k\}\forall\{j_1, \dots, j_k: j_1 + \dots + j_k \le m\}\forall\{i = 1, \dots, l\}\ \exists\,\dfrac{\partial^{\,j_1+\dots+j_k} g_i(\Theta)}{\partial\Theta_1^{j_1}\dots\partial\Theta_k^{j_k}}$;
2. $\forall\{\Theta\in\Theta\subset\mathbb{R}^k\}\forall\{j_1, \dots, j_k: j_1 + \dots + j_k \le m\}\forall\{i = 1, \dots, l\}$
$$\frac{\partial^{\,j_1+\dots+j_k} g_i(\Theta)}{\partial\Theta_1^{j_1}\dots\partial\Theta_k^{j_k}} = \int\tau_{g,i}(x)\frac{\partial^{\,j_1+\dots+j_k} f(x,\Theta)}{\partial\Theta_1^{j_1}\dots\partial\Theta_k^{j_k}}\,\mathrm{d}\mu(x)\quad(= \mathrm{cov}_\Theta(\tau_{g,i}(\xi), \beta_r(\xi,\Theta))),$$
where $r$ is the index determined for $j_1, \dots, j_k$ from the table in Definition 3.2.2.3. Then $\tau_g$ is called the estimator of the function $g(\cdot)$ regular with respect to the $m$th derivative.
Theorem 3.2.2.4. Let $\mathcal{P}^\xi$ and the estimator $\tau_g$ of the function $g(\cdot): \Theta\to\mathbb{R}^l$ be regular with respect to the $m$th derivative. Then

1. $\forall\{\Theta\in\Theta\}\forall\{a\in\mathbb{R}^l\}\ a'\Sigma_{\tau_g}a \ge a'MB^{-1}(\Theta)M'a$, where
$$M = \begin{pmatrix}\dfrac{\partial g_1}{\partial\Theta_1}, & \dots, & \dfrac{\partial g_1}{\partial\Theta_k}, & \dfrac{\partial^2 g_1}{\partial\Theta_1^2}, & \dots, & \dfrac{\partial^m g_1}{\partial\Theta_k^m}\\ \vdots & & & & & \vdots\\ \dfrac{\partial g_l}{\partial\Theta_1}, & \dots, & \dfrac{\partial g_l}{\partial\Theta_k}, & \dfrac{\partial^2 g_l}{\partial\Theta_1^2}, & \dots, & \dfrac{\partial^m g_l}{\partial\Theta_k^m}\end{pmatrix};$$
2. $\Sigma_{\tau_g} = MB^{-1}(\Theta_0)M' \Leftrightarrow \exists\{G(\Theta_0): l\times s\ \text{matrix}\}\ P_{\Theta_0}\{\tau_g(\xi) - g(\Theta_0) = G(\Theta_0)\beta(\xi,\Theta_0)\} = 1$,

where $s = \binom{k}{1} + \dots + \binom{k+m-1}{m}$. The proof of this theorem is omitted, owing to its similarity to the proofs of Theorems 3.2.1.6 and 3.2.2.2.

Theorem 3.2.2.5. Let $S$ and $T$ be convex closed sets in the $n$-dimensional Euclidean space, $S\cap T = \emptyset$ and $S$ be bounded. Then there exist $u\in\mathbb{R}^n$ and $c\in\mathbb{R}^1$ such that $\forall\{x\in S\}\ u'x > c\ \&\ \forall\{y\in T\}\ u'y < c$.

Proof. As $S$ is, according to the assumption, bounded, there exist points $x_0\in S$ and $y_0\in T$ such that
$$\|x_0 - y_0\| = \inf\{\|x - y\|: x\in S, y\in T\} = \min\{\|x - y\|: x\in S, y\in T\}.$$
Consider now a hyperplane which contains the point $(x_0 + y_0)/2$ and is orthogonal to the connecting line of the points $x_0$ and $y_0$, i.e.
$$(x_0 - y_0)'x = k\quad\big(= (x_0 - y_0)'(x_0 + y_0)/2\big).$$
Now it can be shown, from the statement of the theorem, that $u$ is $x_0 - y_0$ and $c = k$. Let the point $q$ satisfy the inequality $(x_0 - y_0)'q \le k$ and let $f(\cdot)$ be a function such that
$$f(\alpha) = \|\alpha q + (1-\alpha)x_0 - y_0\|^2 = \|\alpha(q - x_0) + (x_0 - y_0)\|^2 = \alpha^2\|q - x_0\|^2 + 2\alpha(q - x_0)'(x_0 - y_0) + \|x_0 - y_0\|^2.$$
This implies
$$\frac{\mathrm{d}f(\alpha)}{\mathrm{d}\alpha}\bigg|_{\alpha=0} = 2(q - x_0)'(x_0 - y_0) = 2q'(x_0 - y_0) - 2x_0'(x_0 - y_0) \le 2k - 2x_0'(x_0 - y_0) < 0.$$
The latest inequality is a consequence of the inequality
$$x_0'(x_0 - y_0) - k = x_0'(x_0 - y_0) - (x_0 + y_0)'(x_0 - y_0)/2 = \|x_0 - y_0\|^2/2 > 0.$$
From $f'(0) < 0$ it follows that, on the line segment with end points $x_0$ and $q$, there exists a point $w$ with the property $\|w - y_0\| < \|x_0 - y_0\|$; if $q$ belonged to $S$, then by convexity $w\in S$, which would contradict the minimality of $\|x_0 - y_0\|$. The point $q$ is an arbitrary point satisfying only the condition $(x_0 - y_0)'q \le k$,
from which it follows that $\forall\{x\in S\}\ (x_0 - y_0)'x > k$. The rest of the proof is now obvious.

Theorem 3.2.2.6. Let $S$ and $T$ be convex sets and let $\{S_n\}_{n=1}^\infty$ be a sequence of convex sets such that

1. $\forall\{n = 1, 2, \dots\}\exists\{u_n, c_n: u_n\in\mathbb{R}^n, c_n\in\mathbb{R}^1\}\ \forall\{x\in S_n\}\ u_n'x \ge c_n\ \&\ \forall\{y\in T\}\ u_n'y \le c_n$;
2. $\forall\{G(s, r): r\in(0,\infty), s\in S\}\ \exists\{x(s,r)\in S_n: n = 1, 2, \dots\ \text{with the exception of a finite number of indices}\}\ x(s,r)\in G(s,r)$, where $G(s,r)$ is a sphere with centre $s$ and radius $r$.

Then for $S$ and $T$ there exists a separating hyperplane, i.e. there exist $u\in\mathbb{R}^n$ and $c\in\mathbb{R}^1$ such that
$$\forall\{s\in S\}\ u's \ge c\quad\text{and}\quad \forall\{y\in T\}\ u'y \le c.$$

Proof. It may be assumed that $\|u_n\| = 1$ (if not, then $u_n/\|u_n\|$ and $c_n/\|u_n\|$ are considered). Let $y$ be an arbitrary element from $T$ and $\{s_n\}_{n=1}^\infty$ be an arbitrary bounded sequence, $s_n\in S_n$. Considering the condition 1 above and the Schwarz inequality, we get
$$-\|y\| \le u_n'y \le c_n \le u_n's_n \le \|s_n\|.$$
Thus the sequence $\{c_n\}_{n=1}^\infty$ is bounded and there exists its subsequence $\{c_{n'}\}_{n'=1}^\infty$ and a number $c$ such that $c_{n'}\to c$. As the sequence of vectors $u_n$ is on the unit sphere, which is a compact set, there exists its subsequence $\{u_{n''}\}_{n''=1}^\infty$ and the point $u$, $\|u\| = 1$, such that $\|u_{n''} - u\|\to 0$. According to the condition 1, $\forall\{n = 1, 2, \dots\ \text{with the exception of a finite number of indices}\}\ u_n'y \le c_n \Rightarrow u'y \le c$. For an arbitrary $s\in S$ there exists $\{s_n\}_{n=1}^\infty$, $s_n\in S_n$, such that $\|s_n - s\|\to 0$ (according to the second condition). Again, we get $u's \ge c$.
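The construction in the proof of Theorem 3.2.2.5 is directly computable. In the sketch below (ours; two disjoint balls are used as $S$ and $T$, so the closest points $x_0$, $y_0$ are available in closed form), the hyperplane $u'x = c$ with $u = x_0 - y_0$ and $c = u'(x_0 + y_0)/2$ strictly separates samples of the two sets.

```python
import numpy as np

rng = np.random.default_rng(3)

# S: ball of radius 1 centred at (0, 0); T: ball of radius 1 centred at (4, 0).
cS, cT, r = np.array([0.0, 0.0]), np.array([4.0, 0.0]), 1.0

# The closest points lie on the segment joining the centres.
d = (cT - cS) / np.linalg.norm(cT - cS)
x0, y0 = cS + r * d, cT - r * d

u = x0 - y0                      # normal of the separating hyperplane
c = u @ (x0 + y0) / 2            # hyperplane u'x = c through the midpoint

def sample_ball(center, radius, m):
    # random points strictly inside the ball
    v = rng.normal(size=(m, 2))
    v /= np.linalg.norm(v, axis=1, keepdims=True)
    return center + 0.99 * radius * rng.random((m, 1)) * v

S = sample_ball(cS, r, 1000)
T = sample_ball(cT, r, 1000)
print((S @ u > c).all(), (T @ u < c).all())   # True True
```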
Theorem 3.2.2.7. Let $T$ be a convex set and $s$ be its limit point (i.e. $\forall\{G(s,r): r\in(0,\infty)\}\exists\{t\in T\}\ t\in G(s,r)$); then there exist $u\in\mathbb{R}^n$ and $c\in\mathbb{R}^1$ such that $\forall\{x\in T\}\ u'x \le c$ and $u's = c$.

Proof. For the considered point $s$, let $\{s_n\}_{n=1}^\infty$ be such a sequence that $\|s_n - s\|\to 0$, $s_n\notin\bar T$ (the closure of the set $T$); therefore $S_n = \{s_n\}$, $S = \{s\}$ and $T$ satisfy the conditions 1 and 2 from Theorem 3.2.2.6 (the condition 1 by Theorem 3.2.2.5 applied to $\{s_n\}$ and $\bar T$). Thus there exists a hyperplane $u'x = c$ separating $s$ and $T$. As $u's \ge c$ and simultaneously $u's \le c$ ($s$ is the limit point of $T$), we have $u's = c$.

Remark 3.2.2.4. The hyperplane $u'x = c$ from Theorem 3.2.2.7 is usually called the supporting hyperplane of the convex set $T$ at the point $s$.

Theorem 3.2.2.8. Let $K\ (\subset\mathbb{R}^n)$ be a convex set and $f(\cdot): K\to\mathbb{R}^1$ be a convex function. Then
$$\forall\{x_0\in K\}\ \exists\{u(x_0)\in\mathbb{R}^n\}\ \forall\{x\in K\}\quad f(x) \ge f(x_0) + u'(x_0)(x - x_0).$$
Proof. Let $C$ be a set of elements of the $(n+1)$-dimensional Euclidean space such that $C = \{(x, y): x\in K, y \ge f(x)\}$. It can easily be proved that $C$ is a convex set and $(x_0, f(x_0))$ is, for $x_0\in K$, a limit point of $C$. According to Theorem 3.2.2.7 there exist $(u_1'(x_0), v)'\in\mathbb{R}^{n+1}$, $v\in\mathbb{R}^1$, and $c\in\mathbb{R}^1$ such that
$$\forall\{(x,y)\in C\}\quad u_1'(x_0)x + vy \ge c,\qquad u_1'(x_0)x_0 + v f(x_0) = c,\qquad (*)$$
where for $y > f(x)$ the sharp inequality in $(*)$ is true. If in $(*)$ $x_0$ is substituted for $x$ and $y > f(x_0)$, then $v > 0$, and $(*)$ implies
$$u_1'(x_0)x + v f(x) \ge u_1'(x_0)x_0 + v f(x_0) \Rightarrow f(x) \ge f(x_0) + \big(-u_1(x_0)/v\big)'(x - x_0).$$
To finish the proof, we put $u(x_0) = -u_1(x_0)/v$.
Theorem 3.2.2.9. Let $\xi: (\Omega, \mathcal{S})\to(\mathbb{R}^n, \mathcal{B}^n)$ be a random vector, $P^\xi$ be a probability measure on $\mathcal{B}^n$, and $f(\cdot): (\mathbb{R}^n, \mathcal{B}^n)\to\mathbb{R}^1$ a convex and $P^\xi$-integrable function. Then
$$f(E(\xi)) = f\Big(\int x\,\mathrm{d}P^\xi(x)\Big) \le \int f(x)\,\mathrm{d}P^\xi(x) = E(f(\xi)).$$

Proof. According to Theorem 3.2.2.8, at the point $\mu = E(\xi) = \int x\,\mathrm{d}P^\xi(x)$ there exists a vector $u'(\mu)$ such that
$$f(x) \ge f(\mu) + u'(\mu)(x - \mu) \Rightarrow \int f(x)\,\mathrm{d}P^\xi(x) \ge f(\mu),$$
since $\int u'(\mu)(x - \mu)\,\mathrm{d}P^\xi(x) = u'(\mu)(\mu - \mu) = 0$.

Definition 3.2.2.5. Let $\mathcal{P}^\xi = \{P_\Theta^\xi: \Theta\in\Theta\subset\mathbb{R}^k\}$, $g(\cdot): \Theta\to\mathbb{R}^r$ and $\tau_g(\cdot): \mathbb{R}^n\to\mathbb{R}^r$ be an unbiased estimator of the function $g(\cdot)$. Further, let $L(\cdot, \cdot\cdot): \mathbb{R}^r\times\mathbb{R}^r\to\mathbb{R}^1$ be, for each $y\in\mathbb{R}^r$, as the function of the first argument convex, $\mathcal{B}^r$-measurable and $\mathcal{P}^\xi$-integrable. An unbiased estimator $\tau_g^*(\xi)$ is called L-optimal if for each unbiased estimator $\tau_g(\xi)$ and for each $\Theta\in\Theta$
$$E_\Theta\{L(\tau_g^*(\xi), g(\Theta))\} \le E_\Theta\{L(\tau_g(\xi), g(\Theta))\}.$$
g(0))}.
Theorem 3.2.2.10. Let there exist under the conditions of Definition 3.2.2.5 n s for Θ. Then the L-optimal the complete sufficient statistic 7"(): 0t 01 estimator of the function g( · ) is τ%χ) =
Ε[τβ()\Τ(χ)],
where τβ(ξ) is an arbitrary unbiased estimator (the optimal estimator is obviously a function of the complete sufficient statistic). Proof. With respect to Theorem 3.2.2.9 and Corollary 2.3.1, we have Εθ{ί\τβ{ξ\
g(0)]}
= E^E{L[Tg0l
ρ(β)]|7«)}>
and simultaneously E{L[re(-),
g(0)]\T(x)}
^ L{E[Ta(-)\T(x)],
g(0)}.
Denoting τ^Χξ) = Ε[τβ()\Τ(ξ)] (according to Theorem 3.1.3, this estimator is unique for the entire class of unbiased estimators τβ(ξ)), we see that for an arbitrary unbiased estimator τ9(ξ) the inequality Es{L[r^l is obvious. 106
g(0)]}
^ ΕΜτ£ξ)9
flr(6>)]},
0e Θ,
n
x
Remark 3.2.2.5. If, for example, $\Theta\in\mathbb{R}^1$, $g(\cdot): \Theta\to\mathbb{R}^1$, $\tau_g(\cdot): \mathbb{R}^n\to\mathbb{R}^1$ is an unbiased estimator and $L(\cdot, \cdot\cdot)$ is chosen in the form $L(\tau_g, g) = (\tau_g - g)^2$ (which is in the first argument obviously a convex function), then the L-optimal estimator gives the UBUE (see Definition 3.2). To complement this Chapter, see the references given in Section 3.1 and also [12, 16, 25, 27, 28, 100, 101, 115, 126].
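Theorem 3.2.2.10 together with Remark 3.2.2.5 can be illustrated on a classical case. The sketch below is ours (a Poisson sample and $g(\Theta) = e^{-\Theta} = P_\Theta(\xi_1 = 0)$ are assumed): the crude unbiased estimator $1\{\xi_1 = 0\}$, conditioned on the complete sufficient statistic $T = \sum_j\xi_j$, gives $E[1\{\xi_1 = 0\}\,|\,T] = (1 - 1/n)^T$, which is unbiased with a much smaller dispersion.

```python
import math
import numpy as np

rng = np.random.default_rng(4)
n, theta = 10, 1.0
reps = 200_000

x = rng.poisson(theta, size=(reps, n))
T = x.sum(axis=1)                        # complete sufficient statistic

naive = (x[:, 0] == 0).astype(float)     # unbiased: E 1{xi_1 = 0} = exp(-Theta)
rb = (1 - 1 / n) ** T                    # E[1{xi_1 = 0} | T] = (1 - 1/n)^T

g_true = math.exp(-theta)
print(naive.mean(), rb.mean(), g_true)   # all ≈ 0.3679
print(naive.var(), rb.var())             # conditioned variance is far smaller
```

That the conditioned estimator is exactly unbiased can also be seen directly: $T\sim\mathrm{Poisson}(n\Theta)$, so $E[(1-1/n)^T] = \exp(n\Theta[(1-1/n)-1]) = e^{-\Theta}$.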