U.S.S.R. Comput.Maths.Math.Phys., Vol.24, No.6, pp.182-188, 1984
Printed in Great Britain
0041-5553/84 $10.00+0.00 ©1986 Pergamon Press Ltd.

LOWER BOUNDS OF THE CAPACITY OF L-DIMENSIONAL ALGEBRAS OF ESTIMATE-COMPUTING ALGORITHMS*

V.L. MATROSOV
For the most commonly used test spaces, polynomial lower bounds are obtained for the capacity of the algebra of computed operations over the class of estimate-computing algorithms. It follows from the upper bounds previously proved by the author that the present bounds are asymptotically unimprovable.

We shall retain the terminology given in /1/. In /1/ the following upper bounds of the capacity of the model of algorithms $(\mathfrak{M})_{\mathfrak{U}}$ in $M_i$ were obtained:

$$\Delta_{M_i}[(\mathfrak{M})_{\mathfrak{U}}]\le[1+\varepsilon_i(n,m,L)]\,\theta_i(n,m,L),\qquad(1.1)$$

where $M_0=R(n)$, $M_1$ is any space, $\varepsilon_i(n,m,L)\to0$ as $n,m,L\to\infty$, $\theta_0(n,m,L)=(2mL)^n$, $\theta_1(n,m,L)=(L+1)^{mn}$.

The question naturally arises as to the construction of lower bounds for $\Delta_{M_i}[(\mathfrak{M})_{\mathfrak{U}}]$ sufficiently close to the bound (1.1), and in particular, we wish to know whether or not the bound $\theta_i(n,m,L)$ is attained. Our aim below is to show that $\theta_i(n,m,L)$ is the lower bound of the capacity of the model $(\mathfrak{M})_{\mathfrak{U}}$ if $\mathfrak{U}=\mathfrak{U}_0$ is the algebra of computable operations containing as principal operations addition, multiplication, and subtraction: $\mathfrak{U}_0=\{+,\times,-\}$.

1. Maximum index of the system of events.
Let A=‘\@) be the space of parameters specifying the algorithm of the model %& Put CIMq[A] the classification of sample S7 in space R, realized by algorithm A. ct,(S~,fI)=(Cl,r[A]IAEW). AE~&, can be regarded as a function of two arguments:
Every solving algorithm i.e., Co, where a=A, SEM, G
induces
T(A)=(Z',[a=~},
as the number of elements
T(n)
of model
and
$o=F~,
of the set of
A) 1.
Indr”(S”)=ICl.w(S,. '&02&,~(~~}~(&~)~,
T(A) if
the system of events
T,=Ta[~]=((S,I)IA(S,a)=l,AE~).
We define the index of the system of events classifications Cl,(Sq:A): We put
and let
&!,t(A)=(J)s,1
be the
Zs-submodel
(i,),; then we have
Theorem 1. If all the objects of sample $S^q$ in problem $Z(I_0,S^q)$ have pairwise distinct proximity spectra with respect to the $Z^q$-submodel $(\mathfrak{M})_B$, then $\operatorname{Ind}^{T}(S^q)=2^q$.

Proof.
A=.4(S,
1,A).
A :MX&(O, We shall then say that model
and
We will show that any classification
(%)0. Given the vector-algorithm
&$= (A,,...,A‘),
of sample generating
A,=A(&'), ~'=(E,',....E,,'), +=R+", and for any pair of objects a$$
‘(B)#B$,‘$“(B),
i.e . jO={l, 2,...,L), ko=(l, 2,...,n), &={I, We define
the sequence
(1.2) ss may be realized the
Zs-submodel
byanalgorithm
(&)=.I. where
S',Sf~Sq, let
B = F(B,, . , BL), F 5G90, 2,...,m} exist such that
@'Ma (j0)+=a"icko (j0). of L operators 8,,...,B= with parameters
(1.3) fj,B, P, z"', j=l, 2,...,L,
B@Jt,: (,,'=1/2',i=l, 2,...,n, p‘= (p,‘, 1pr.‘), v=l, 2,....m. -f’= (rl’, .,,T,“‘), y,‘=i/Z’*-“‘“, z'=(i,l), We shall show that the estimates
e'=&'. of all objects
8=
*Zh. vychisl. Mat. mat. Fiz., 24, 12, 1881-1891, 1984
iB,, I-1
SEX',
supplied by the operator
are pairwise fact.
distinct,
” e:,(j)
e:,(j)
8,) =
(rlJ.
t,t'~(l,&...,q},
i.e., given any pair
.
af“,(i)
/ e:,,(i)
2 y:,[p+?&(j) +
we have
(r,,,(r,.,,8).
xn
/I~2I(+-)“-““n+“-‘” [g (+-)i I*;
(plj,
,
., p,j) =
:A
6$(j)].
._+ pf&, (j)] =
v=,
Y~l
I==,
Consequently,
TX
n
.
In binary
(lJ
+
i-i
;“,,‘(L)]‘+c’-“m” + . +C[ -La&) ]‘+n~m-‘i+‘i-“mri) i-1
form we obtain
the bounds
(r,,,8)=O, o,'(l)...e,'(l)...O,'(j)...~i,'(j) ...@.'(L)...&'(L), where
~~'(j)=(~v,'(j)...~,,i(j))~ j=l,P,...,L,
Since inequality quantities, where
Cl.31 holds,
v=l,Z?,...,W
+1,2,...,q.
then-(rl,,,8)#(rl,,8),
since
in the binary
form of these
(rl,,8)=O, @,"(l)...@~"(l) ...@."(j)...O,"(j)
[email protected]'(L) ...%"(L).
$i_0k_0j_0$ places are not the same.

Suppose we are given a classification $(S^q;n_1,\dots,n_d)$ of the sample $S^q$, defined by the conditions: objects .!?'I&?,, L?EC.%I, v+n,, j=i,&..., d. We consider the polynomial
It is easily
seen that
P
‘
cl&[A]=(Sy For, when
lb==+ aL
(Z 1
o=DF,‘..r#
A=f~c.
n,, ._ ,n,),
B,
,
r-1
II
(bi-b,)‘.
fj-r ‘4
t&(n,,...,nJ
ws,, 8) =L
(r,,,
F:, bd @I ) =
n Sl”,..
(r,,,, 8) =O.
Hence
from the
lJhen t&(n,,
(b,-bJ’=O.
..nr,
. . ..n..)
U-tinFst.bd(8) )>O, whence
(r,,, 13)=
$
jj (bcbd’-CC;[ ’ LEf”,, ..*a)
n (b.-b,,‘]-’ V-t lrcl
n (b,-br)‘. lc~r,,....ns)
But [ fi v-1 i+, Consequently,
(r,.,,B)>C,.
(b,-b,)‘]-‘
Since
n (b,-bJ’= IEl”,, “a,
AE((in,),, we obtain
[ ]II (bi-b,,)‘]-’ >ern,.....na) f#J.b (1.2) _
=L 1.

The theorem is proved.

2. Lower bound of the capacity of the algebra of algorithms in space $M^n$

Consider the test space $M^n=M_1\times\dots\times M_n$, $M_\nu\subseteq R^m$, with the metric $\rho_\nu(x,y)=\max_{i\le m}|x_i-y_i|$, $\nu=1,2,\dots,n$.

Theorem 2. The capacity of the model of algorithms $(\mathfrak{M})_0$ in space $M^n$ is lower-bounded:

$$\Delta_{M^n}[(\mathfrak{M})_0]\ge(L+1)^{mn}.$$

Lemma 1. A regular problem $Z=Z(I_0,S^q)$ with sample length $q=(L+1)^{mn}$ and a vector algorithm $\mathfrak{A}=(A_1,\dots,A_L)$ exist such that any two objects $S^t,S^{t'}\in S^q$, $t\ne t'$, have in space $M^n$ distinct proximity spectra with respect to the $Z^q$-submodel $(\mathfrak{M})_B$, where $B=F(B_1,\dots,B_L)$:
Ef;vfilf (B) F Bgj~‘iif’(U).
Proof. We construct
Is,= (d,
the samples
3, and L
a;=1+-
. .,a3 9
sp as follows:
a,'=O; ifv, v,i=1,2,...,m,
L+l ’ L-r,
Idr&L+l,
Given
the
1= numbers
(s,,L r mliGr,GL+l,
&,"...+_G% for every
all
objects
of the type
(2.U
. . , s,,. _‘“,),
, . , n.
k=l,2,.
The resulting sequence of objects forms the required sample $S^q=(S^1,\dots,S^{(L+1)^{mn}})$. The order of numbering objects (2.1) is not important here. Henceforth let $S_\nu=(S_{\nu1},\dots,S_{\nu n})$, $\nu=1,2,\dots,m$. The information vectors $\alpha(S_\nu)$ are arbitrary. The construction of $Z(I_0,S^q)$ is complete. The vector algorithm $\mathfrak{A}=(A_1,\dots,A_L)$ is defined as follows:
G=(e,‘, . . ..E.‘).
s'=(1,1),
A,=.4 (y',P’, E’, z’), are arbitrary,
L-‘-l+ e’
F&&i+ L L+i’ we consider
L-l -,.. L+1
there exists
S'ZS",
.,el’=i+(L+i)-I.
any pair of objects
s~=s.,:“,m, Since
r,EN.
r,EN}.
(L+l)m”
from 1 to
s,,t:: ,= C&L. V:, 1
7'9p'
,m,
set of all the elements
we number by the natural
where
i=1,2 ,...
an index
k=((1,2 ,...,
S’ =
s,,?,
n),
such that
s.f..z_+s.1...*_. we have v,*#w,*. Hence, for at least one v=l,2,.. .,m, Then, by the construction of the problem Z(I.",W,
p (S.,, Consier
the function
For clarity,
assume that
V"~>W,~.
s,:....< ) =I+ & qm&..r,)=i+&.
H,*(~)=p~(&~,z)--Em).
Then,
H.“,(S.:.. &=O, and H&(S&,)=(i+&)-(i+&)=+j+ Hence
where $a=P_i$. The lemma is proved.

Since the proximity spectra of objects of the sample $S^q$ are pairwise distinct, then, by Theorem 1, the index of the system of events $\operatorname{Ind}^{T}(S^q)=2^q$. Hence we obtain $\Delta_{M^n}[(\mathfrak{M})_0]\ge(L+1)^{mn}$. The theorem is proved.

3. Capacity of the algebra of algorithms in space $R(n)$

Theorem 3. The capacity of the model of algorithms $(\mathfrak{M})_0$ in the space of tests $M=R(n)$ is lower-bounded:

$$\Delta_{R(n)}[(\mathfrak{M})_0]\ge(2mL)^n.$$

For the proof, we construct the regular problem $Z(I_0,S^\delta)$ and the corresponding submodel, whose capacity is precisely equal to $(2mL)^n$ (Lemma 3). By Theorem 1, it is sufficient to show that all objects of working sample $S^\delta$ of problem $Z$ have pairwise distinct proximity spectra. This fact is obtained from Lemma 4, and is proved on the basis of certain properties (Lemma 5, 6) of the functions $H_{j\nu}^k(x)$, introduced in /1/.

Lemma 2. A regular problem $Z=Z(I_0,S^\delta)$ exists, where $\delta=(2mL)^n$, and a $Z^\delta$-submodel $(\mathfrak{M})_B$ such that $\Delta_{R(n)}[(\mathfrak{M})_B]=(2mL)^n$.
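Theorem 1 is the step that turns pairwise distinct proximity spectra into the statement that every classification of the sample is realizable, and its proof works with a polynomial in the estimates built from products of squared differences. The following is a minimal Python sketch of that mechanism, not the author's exact construction: the function names, the normalisation by the smallest non-zero value, and the threshold 0.5 are assumptions made for illustration. Given pairwise distinct real estimates, the polynomial vanishes on a chosen subset and is at least 1 elsewhere, using only addition, subtraction, multiplication and one constant factor.

```python
from math import prod


def separating_polynomial(estimates, chosen):
    """Return P(b) that is 0 on the chosen estimates and >= 1 on the others.

    `estimates` must be pairwise distinct; `chosen` is a set of indices.
    """
    roots = [estimates[t] for t in chosen]
    others = [b for t, b in enumerate(estimates) if t not in chosen]

    def raw(b):
        # Product of squared differences: zero exactly at the chosen estimates.
        return prod((b - r) ** 2 for r in roots)

    # Constant scale so that the smallest value off the chosen set equals 1.
    scale = min((raw(b) for b in others), default=1.0)
    return lambda b: raw(b) / scale


def realized_classifications(estimates):
    """Collect every labeling obtained by thresholding such polynomials."""
    q = len(estimates)
    found = set()
    for mask in range(2 ** q):
        chosen = {t for t in range(q) if (mask >> t) & 1}
        P = separating_polynomial(estimates, chosen)
        found.add(tuple(0 if P(b) < 0.5 else 1 for b in estimates))
    return found


# Hypothetical estimates, chosen only to be pairwise distinct.
estimates = [0.125, 0.5, 0.75, 2.0]
print(len(realized_classifications(estimates)), 2 ** len(estimates))
```

If two estimates coincided, the two objects could never be separated, which is why the distinctness of the spectra (and hence of the estimates) is the property the lemmas have to establish.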
Proof. &=(A,,...,A,)
We construct Z and the vector algorithm &: S,=(S,,...,S,), where and A,=A(y’ p’, d,z’), where y',p’ are arbitrary, ?=(l,l),
s.= (a,,, E’=(E,‘,.
. a..) ; .E,‘),
j=l,
2,. _, L. Given the constant
b=l.
Then,
a,&< . . . ia, By construction, S', we define the sequence
for every k=l,2,. ., n. a,*-eI‘>O Before constructing the sample satisfying
of real numbers,
Obviously,
c+-%e,'=m+j/(Lfl).
S,r, I &m‘+l,r the following conditions.
e*'>a,,-a,,
and
et'< . . .
and
(3.1) q--2mL+l,
Let
3m+Ll(L+l)
(3.2a) .
.
.
.
tS,,,,,+,,,<3m-i+2+1/(L+l), . . . . . . . .
.
.
.
.
(3.2b)
.
.
.
.
(3.2~)
3m+l/(L+1)~S~,_,+,~,<3m+2/(L+l); Sm-i+l+W(L+l) . . . . . .
.
3m-i+1+1/(L+1)CS,.,,,+,,r<3m-i+1+2/(L+l), r(i)=q-iL,
i=l,
2,.
. ., m,
m-l/(L+l)i
.
.
cS,r~r+n-,~-~,~~m-i-l-L/(L+l);
1-2/(L+1)tS,,,*,_,,_,,,cl-ll(L+1), . . . . . . . . . .
.
*.
. .
(3.2.d)
0~&,,,,,~1-L/(L+1).

Then the sample $S^\delta$ is given by

$$S^\delta=\{S\mid S=(S_{u_11},\dots,S_{u_nn});\ S_{u_kk}\in W_{u_k}^k\},$$

where $u_k\in\{1,2,\dots,2m+1\}\setminus\{m+1\}$, $k=1,2,\dots,n$. It is easily seen that $|S^\delta|=(2mL)^n$.

The element of the family $W_{m+1}^k$ is not present in the definition of $S^\delta$. It has to be introduced into the sequence (3.1) for the sake of symmetry of the notation.

The sequence of inequalities (3.2) is divided into systems. We associate with each system a family of elements of the sequence (3.1). We introduce the following notation for each of the families:
. . ,&‘+I)Ar
w,*=&,
Will=&l,+L,)I,. . , s,,,i,+t,J, ~“:1=&m,.), . . . . . . . . . . . . . . . . . . . I 6.3 r(‘+m,h 1, . . . . . . . . . . . . . . . . .
W m:r+t= (S [r(l+m-,,--1,L,.
W It.+,,= (S [1(Zm--L,-l,*, ..,S r(lm)* 1. We consider
the functions
we shall consider every k-l, 2, , n.
H,,*(S), defined
All the objects Lemma 3. with respect to the Z$-submodel
of sample x!? have pairwise (,J$)~(~).
We will show that, for any sets we have
distinct
j=l, 2,...,L,
proximity
for
spectra
(u,,...,u.), (w,,...,w,,),S’=(S.,,,...,S,,)Z(S.,,,...,S,.,)=SL’
&?&'w(A) Since s,;L.
in /l/:
$H_{j\nu}^k(x)=|a_{\nu k}-x|-\varepsilon_k^j$; here the identity permutations $\varphi_1(i)=i$, $\varphi_2(j)=j$, $i=1,2,\dots,m$,
# &(5,f&(A).
there exists (vi,...,~‘,)P(w,,...,w.), k={1,2,...,n), u,Zw,, and by construction, It suffices to show that there exist n, such that j-1,2 , . . . . L and v=l,2 ,...,
S,;, k=+
&'(j)+&"(j). For brevity, we put u*=v, w*=w, w=u+r, r>O. We first state an auxiliary lemma. cases. Lemma 4. holds. Proof.
If Let
H,A(S.,)Hmk(S,.+,,r)
O
and
(3.3)
To prove
for certain
(3.3), we need to consider
x1=1,2 ,...,
m,
j=1,2
,...,
L,
several
then
H,,*(S,,+,,r)
~~~'(j)=Sg'-a(H,,*(S,*))ZSg'-((Hj,*(S~,+,,*))~~r~,(j), where
a=P,.
And similarly
in the case
II,,'(&)O.
The lemma is proved.
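Lemma 4 says, in effect, that when the function $H$ takes values of opposite sign on two objects for some triple $(j,\nu,k)$, the two proximity spectra differ in the corresponding entry. The following is a minimal Python sketch of that idea, under stated assumptions: $H$ is taken in the form $|a_{\nu k}-x|-\varepsilon_k^j$, an entry of the spectrum is taken to be 1 exactly when $H\le0$, and any dependence on the information vectors is ignored. The function names and the toy data are hypothetical.

```python
def H(a, x, eps):
    """Assumed form of the threshold function: H = |a - x| - eps."""
    return abs(a - x) - eps


def proximity_spectrum(obj, training, eps_list):
    """Binary spectrum of `obj`, one entry per triple (j, v, k).

    Assumption: an entry is 1 exactly when H <= 0, i.e. when feature k of
    `obj` lies within eps_list[j][k] of feature k of training object v.
    """
    bits = []
    for eps in eps_list:                              # j = 1..L
        for a in training:                            # v = 1..m
            for a_k, x_k, e_k in zip(a, obj, eps):    # k = 1..n
                bits.append(1 if H(a_k, x_k, e_k) <= 0 else 0)
    return tuple(bits)


# Hypothetical toy data: m = 2 training objects, n = 1 feature, L = 2 thresholds.
training = [(0.0,), (1.0,)]
eps_list = [(0.25,), (0.6,)]
sample = [(0.1,), (0.3,), (0.5,), (0.9,)]

spectra = [proximity_spectrum(s, training, eps_list) for s in sample]
for s, sp in zip(sample, spectra):
    print(s, sp)
```

In this toy data the objects 0.1 and 0.3 lie on opposite sides of the threshold 0.25 around the first training object, so their spectra differ in the first entry; this is the situation the case analysis in the text sets out to produce for every pair of sample objects.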
(3.3)
case
1.
S.,, S&EW:,
i==(l, 2,...,
2m+l}\(m+l).
With
. or
.
.
.
.
.
.
.
.
.
.
a,~+s~-‘
else
i~(l.2,.
r=m-ii
a,,+sk~d.r
. ., m)
either
t,
.
.
.
j=r+i ,...,L-1,
,
L--r+,
_1
&r+Er=tSrr
~,~'(S,)>O,H,~*(S~.+,I~)
r,emfna5. m-l,
StiEW&+i+l.
2,,. . , L; Proof.
If, for some
if
H:,” (S.,)
1.
Let
.s,j, j=1, 2,...,L.
H,,‘(S.,)
jo.we have
j=l, 2,...,L;
&=WF,
H&,+I,(zS,h)
HTc,+l, (S,)G
Hence
jO, we have
H;.v (S.,)>O,
or
v=2,3,...,m,j=l,2,...,&
f&,,
~0, v=l,
H:,” (S.,)>O,
if
2,
.,
then
if and only if
HFov &)>0,
~al.+,,r-S.r~-s~'
Ia,v+,,k-SorI-sk'
which is equivalent
to
>O,v=l,2,...,m-4, %+n then H:C,_l,
then
(S444
so that, by the definition ]a,,+l,r-S.*l
In addition,
Let
then
If, for some
(S.,)
. . . , m.
>O, v=2,3,
2. Let
i-l,
iE(1,2,...,m}, &EWF.
Let
Hgcv+,,(S,)
or
H;<,_l, (S,)
1.
of
E:,
Icz,,-S,,I-E~~>O;then
then
a,,+,,,>a,,, Er'<
IG,-S,J- s?
Hk Icv+t)(SC,) (0,
2, . . . , L.
j=l,
Hjk~v_,)(S.I)>I~r-S.*l--e,'">O,
j=l,2,..., L. 2. Let
SWkEWm+f+,.
H;,,(S...)'O, then
(&,)-CO is treated
The case k&r, Case 2.
If
(S~~)>la,,-S,,(--E,h>O,j=l, 2,...,L.
%+n
If
~ac,-,~r-S~~~~~-S.~(, f$,,_,,(Sti)< I~sL-S.~I-Q'<(J.
then
Ht."(S*)
similarly.
Lemma
Srl=Wtik, one index being greater
s&w,:,
5 is proved. than the other.
We can assume without
loss of generality that Pi,. with i,,&=( 1, 2,...,m) by construction of the family of objects Wt* wehave H&-,,+,,(Sa)> 0 and for any v’ such that P’EW~,,, we have H$,,_,,,(S..,)
(0. With
where
i,, ilE(m+2,.
and by construction same time,
m-i,i-l
H,.k(S,)>O. It now remains
i,=m+i’+l,
of the sequence
H,,,‘(S~~)=l~,,-SDII-
il=m+i*+l,
i’>i’,
(3.1), we have
ek’>O, v,=m-?+I.
But
to apply Lemma 4 to complete
Case 3. SW,,s.C=w:, Then, either
iE (m+2,
where
.
1 of Lemma
by definition
of the families
and by property
the proof
H&i,+,,
6, we have
H,,A(S.*)=lav*-S.rl-E,'cO, v=m--is. v,Gm-i'
W,+,+,
At the
2 of Lemma
5,
in both subcases.
_ ,2m+l).
E:+', lGj
'+?+I
'+'<&<~~--er', arl-sI or else
so that, by property
L--r aV~-e*‘-'+'
v=m-i+l,
a,,_,,r-Er’
H,,:.,,(S.I)O,H=.l(SeAO, By Lemma 4, in the first subcase a,'(j+r)#0.~"(j+r), and in the second 0,' (L) ZB,*” (L)
case 4. SMEW,,‘, &EW,,~, Let
i =i’=i.
&={I,
2,.
. , m}, W(m+2,
Then, by definition
_
. , 2m+l},
of the families
iz,~+E:
,
iz=m+i’+l.
(3.2), for certain
j.j’E{l,
2,...,L]
r=m-i+i,
or G--e,"+'tS,,
a,,+EI‘~S.r
a,,-,,,-Ee*'
k (S.,)
H:#,)>O,
H,:+Q,(S,)
or H ,(t+,, (S,)
a)
G-1;
or
else
the following: then, by Lemma 5, and inequalities
HtF7-,, (S.d (0.
(3.4)‘
we have
(3.4)
f&:,-i, (S.*) -a bj
?=I:
fIj;,-,, 6s.J ‘0;
then H ,,;+,, (S.,) -=O,
By lemma
4, for a) we have
Hi,:+,, ah)
>O.
6,:-,,,ti)=@,z,,,(jj, and for bj we have (3.5)
In the ~~;;h;;;$$;e Let
i,#i', C-i,.
H I;+11 (&k) (0. v=m-i'+l; Put we obtain
, we have either
Then, H:,(S,)>O,
then either
Hj.k(S.k)>O. and all the more
or
H,,L(Sa)>O.
But
Hc~+,,,(S,,)
V
and using Lemma
5,
H$,,+,,(Sti)>O.
By Lemma 4, (3.3) or (3.5) holds, The case i'
and hence

The theorem is proved.

From Theorems 2 and 3, and the results of /1/, we have

Corollary. Given any $m,n,L\in N$,

$$(2mL)^n\le\Delta_{R(n)}[(\mathfrak{M})_0]\le[1+\varepsilon_0(m,n,L)](2mL)^n,$$
$$(L+1)^{mn}\le\Delta_{M^n}[(\mathfrak{M})_0]\le[1+\varepsilon_1(m,n,L)](L+1)^{mn},$$

where $\varepsilon_i(m,n,L)\to0$, $i=0,1$, $m,n,L\to\infty$.

Hence the quantities $(2mL)^n$ and $(L+1)^{mn}$ estimate up to an infinitesimal the capacity of model $(\mathfrak{M})_0$ in spaces $R(n)$ and $M^n$. Hence it follows in particular that $\theta_i(m,n,L)$ is its exact asymptotic lower bound.

Applying /2/, we obtain an estimate of the sample length, sufficient for training with reliability $1-\eta$. For instance, for the deterministic case and space $R(n)$ we have:

Theorem 4. Given recognition algorithm than
6, if the sample
anv 1-q the error frequencies of the '1,O-=rl
4. Capacity Put
Since model sufficiently
I--ln$- 11131.
(2mL)“[
L-bounded linear closure of model
Em,
g,(!DI) is often used to solve different practical problems in the case of large L, we shall study the capacity of PL{iD3) in more detail.
(l+e,)
the condition

The bound obtained in the corollary can be greatly strengthened for the model $\mathfrak{L}_L\{\mathfrak{M}\}$.

Theorem 5. The capacity of the $L$-bounded linear closure of model $\mathfrak{M}$ in space $R(n)$ is bounded:

$$([L/2]+1)mn\le\Delta_{R(n)}[\mathfrak{L}_L\{\mathfrak{M}\}]\le mn(L+1)+1.$$

Proof. The lower bound can easily be obtained by the method of constructing explained in /1/. To prove the upper bound, we take any algorithm A=~‘(~,~, and a regular problem from the expression
Z(I,,s');
A=B-C,
then the estimate
1-1
8-L
a-o,,
operators
C=C (C,, C,) , of the object
S'&?
can be evaluated
,_I
P,-a
where
AyA (I’, p’, e’, z’) . We introduce the notation
operator
B we associate
the
alV,=c,~~p.%z'pi ~~*=p(v,i,tj,~=(z,~.....x:,,j. With pI-(pl,',...,p,,'), mn-place
function
g"(p')=(r#,,@. The class of functions
Obviously,
class of separating c,, and to class
algorithms
cx,
if
pI. .
in space
g"(P')Cc,.
~=(g"(s)~~~~r(%,})
vector
can be regarded as a
p’ is referred to class .%!', if
g"(p')>
Here,
A~,,,[~~I=A.R,~,[~,('D~,JI=~~~+~. Hence it follows
that there is a sample
sa, all the classifications
of which are realized by
algorithms of 2r(fol,). R(n) there corresponds the sample p'"=(P',...,p'")in To the samples s",% in space space R”” which is divisible into two subsets by all possible methods with the aid of functions of '6. Take another class of functions of dimensionality (Lfl)mn:
Z%-(p(Z,y)
IZ= F; t, _
a,.,, blvj=R, y= (y,,,, . . . , y,.t)
where
With the sample
),
_
.
we associate
P"
~&j~*i&**j~&i
_
the sample of elements Y,=(l,.
P'"(Y,)=(($, Y,),..., (P",Y,)),
..,
R'L+""".
of
I), Y,=RmnL.
Class
b is the class of separating functions with respect to the constants (C,,G). Let us show that any classification of the sample p”(Y,) may be realized with the aid of the approximate function of 6. p"(E',). We can assume without Suppose we are given a subsample ;;(Y,) of the sample loss of generality that P(Y*)=((p~,Y,),..., space
In
R””
there
Then, by definition 1,
.
t 40)
corresponds of
p*,
(P',Y,))% p(Y,)
to sample
there exists
the
g'Ea
oe=Gq,. subsample
0" :‘p=(p',...,p').
of sample
v, w, u=(l,2 ,...,
such that, for any
r),
wE(r+
we have ga(p")>c,, g"(p')
We introduce S,=(ilP,(&)-I)
the notation we put
D'=((i, v, j)Ip,v'-~:GO), D'=((i, v, j)(pav'-~.QO).
bw,= ( and for the set of indices
of
a,*+%
if
(i,v,j)ED*,
ec,,
if
(i,v,j)ED',
For ti;e set of indices
S,={ilP,(S,)=O),
b,,-= By the definition
2 ~o~",sgl-"(Pi.'-s:). ‘-0 "_I PI,*,,--
if
‘&’ 1 e,*fl,
if
(i,v,j)ED", (i,v,j)ED'.
afvi, b,.,,for any triple of numbers
(i, v, j) .
~8, (b
WI"
satisfies the equation $g((p^t,Y_1))=g(p^t)$. Thus function $g$ refers all the elements of sample $\tilde p(Y_1)$, and only these elements, to one class, which it was required to prove.

But the capacity of this class of separating functions does not exceed the capacity of the class of hyperplanes of dimensionality $(L+1)mn$, which we know is equal to $(L+1)mn+1$. Hence $q\le(L+1)mn+1$. The theorem is proved.
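To see how much tighter the bounds of Theorem 5 are than the bound $(2mL)^n$ of Theorem 3, the quantities can simply be evaluated. The following is a small Python sketch, assuming $[L/2]$ denotes the integer part of $L/2$; the function names and the sample values of $m$, $n$, $L$ are hypothetical.

```python
def linear_closure_bounds(m, n, L):
    """Lower and upper bounds of Theorem 5 for the L-bounded linear closure in R(n)."""
    lower = (L // 2 + 1) * m * n     # ([L/2] + 1) m n, with [.] read as integer part
    upper = m * n * (L + 1) + 1      # capacity of hyperplanes of dimensionality (L+1)mn
    return lower, upper


def algebra_lower_bound(m, n, L):
    """Lower bound (2mL)^n of Theorem 3 for the algebra with +, -, x in R(n)."""
    return (2 * m * L) ** n


m, n, L = 3, 2, 4   # hypothetical values chosen only for illustration
print(linear_closure_bounds(m, n, L), algebra_lower_bound(m, n, L))
```

The capacity of the linear closure grows linearly in $m$, $n$ and $L$, whereas the bound for the algebra with multiplication is polynomial of degree $n$ in $mL$.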
The author thanks Yu. I. Zhuravlev for valuable comments.
REFERENCES

1. MATROSOV V.L., Capacity of algebraic extensions of the model of estimate-computing algorithms, Zh. vych. Mat. i mat. Fiz., 24, No.11, 1719-1730, 1984.
2. VAPNIK V.N. and CHERVONENKIS A.YA., Theory of pattern recognition (Teoriya raspoznavaniya obrazov), Nauka, Moscow, 1974.

Translated by D.E.B.