Chapter 8. Vector and Matrix Equations

8.1. Cauchy-Pexider-Sincov Equations

8.1.1. Cauchy-Pexider vector functional equations. Without resorting to the more involved results from vector and matrix calculus, we can treat vector and matrix functional equations only fleetingly. Extending the definition of ordinary real functions to matrices and vectors also frequently happens by means of such matrix and vector functional equations (cf. Sect. 4.2.1). If nothing else is said, the components are arbitrary real numbers. Instead of the generalized Cauchy functional equation

f(x + y) = f(x) + f(y),   (1)

where x and f(x) are vectors (not necessarily with the same number of components), we immediately investigate^109 the generalization of Pexider's equation

g(x + y) = h(x) + k(y).   (2)

Here, too, we first set x = 0 = (0, 0, ..., 0), respectively y = 0, and write h(0) = a, k(0) = b:

k(y) = g(y) − a,   (3)

h(x) = g(x) − b.   (4)

^109 See, among others, T. Szentmártony 1944; F. Kozin 1952, 1963; M. Hosszú 1955[b, d], 1959[i]; J. Aczél and M. Hosszú 1956; R. Theodorescu 1956; K. Pach and T. Frey 1960; M. Kuczma 1961[b], 1962[a]; S. Kurepa 1961[b], 1964[a, b, c, d]; H. Lenz 1962; J. M. Osborn 1961; G. Tevan 1961; F. Molnár 1961; R. Bellman and W. Karush 1962; K. Menger 1962; R. Meynieux 1962[a, b, c]; O. E. Gheorghiu 1963[b]; also for other similar equations.


If we write (1) in terms of the components f_i of f and x_i of x, we obtain

f_i(x_1 + y_1, x_2 + y_2, ..., x_n + y_n) = f_i(x_1, x_2, ..., x_n) + f_i(y_1, y_2, ..., y_n)   (i = 1, 2, ..., m).

The equations valid for the individual components are of the form of Eq. 5.1.4(5), whose continuous solutions are

f_i(x_1, x_2, ..., x_n) = c_i1 x_1 + c_i2 x_2 + ... + c_in x_n   (i = 1, 2, ..., m).   (5)

Therefore we have

Theorem 1. The general continuous solution of

f(x + y) = f(x) + f(y)   (1)

for m-dimensional f and n-dimensional x, y is

f(x) = C · x,   (6)

where C = || c_ij || is a constant matrix with m rows and n columns, which is multiplied by the vector x.

Substitution shows that (6) does satisfy Eq. (1). Because of (3), (4), and (5), we have also

Theorem 2. The general continuous system of solutions of

g(x + y) = h(x) + k(y)   (2)

is

g(x) = C · x + a + b,   h(x) = C · x + a,   k(x) = C · x + b.


These functions do in fact satisfy Eq. (2). Equations of a similar type are the vector equations solved in Sect. 2.3.3. Just as in the case of the ordinary Cauchy functional equation 2.1.1(1), these solutions here, too, are the most general bounded solutions on n-dimensional sets of positive measure. The method in Sect. 4.2.2, which shows that the integrable solutions are all continuous (even differentiable), also remains valid here.^110
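The linear solutions in Theorems 1 and 2 are easy to check numerically. The following sketch (NumPy; the matrix C and the vectors a, b are arbitrary test data, not taken from the text) verifies equations (1) and (2):

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4                      # dimensions of f(x) and of x

# Theorem 1: f(x) = C·x with an arbitrary m x n matrix C solves (1).
C = rng.standard_normal((m, n))
a, b = rng.standard_normal(m), rng.standard_normal(m)

def f(x): return C @ x
# Theorem 2: the Pexider system g = C·x + a + b, h = C·x + a, k = C·x + b.
def g(x): return C @ x + a + b
def h(x): return C @ x + a
def k(x): return C @ x + b

x, y = rng.standard_normal(n), rng.standard_normal(n)
assert np.allclose(f(x + y), f(x) + f(y))   # Cauchy equation (1)
assert np.allclose(g(x + y), h(x) + k(y))   # Pexider equation (2)
```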

8.1.2. Some simple matrix functional equations. Equation 8.1.1(1) with the solution 8.1.1(6) was also used to define matrices and was generalized to multidimensional matrices. In fact, no difficulty arises in using matrices in place of vectors in 8.1.1(1) for variables and function values, since with respect to addition, matrices with m rows and n columns can be considered to be vectors with mn components. The situation is different with the remaining Cauchy and Pexider functional equations in which multiplications also occur. In these, the multiplication relates to (square) matrices, which represent variables or functions (or both). For example, the functional equation

F(X · Y) = F(X) · F(Y),   (1)

where X and F(X) are square matrices not necessarily of the same order, has been investigated many times,^111 but is solved still only for continuous F.

^110 See also B. Szőkefalvi-Nagy 1936; A. Weil 1940; C. Chevalley 1946; A. Kuwagaki 1962; etc.

^111 Among others by A. Hurwitz 1894; I. Schur 1901, 1927, 1928; G. Frobenius and I. Schur 1906; C. Stephanos 1913; A. Weinstein 1923; H. Weyl 1925[a, b, c, d], 1939; J. Neumann 1927; H. Nakano 1932; D. E. Littlewood 1935, 1936, 1937, 1940; F. D. Murnaghan 1938; A. Weil 1940; O. Perron 1942; T. Satô 1942; P. Reisch 1944; C. Chevalley 1946; E. Hille 1948; J. Dieudonné 1951; M. P. Schützenberger 1955; S. Gołąb 1956[d], 1959[a]; E. Hille and R. S. Phillips 1957; O. E. Gheorghiu and V. Mioc 1958; S. Kurepa 1958[a], 1959[a, b], 1960[a, b, c, d, e], 1961[a], 1962[a, b, c], 1963; M. Hosszú 1959[e], 1960[c]; M. Kucharzewski 1959, 1960[c]; M. Kuczma 1959[a, b, c], 1961[a], 1964; M. Velasco de Pando 1959; A. Oprea 1960; A. J. M. Spencer and R. S. Rivlin 1960; A. Kuwagaki 1961[a, b], 1962; J. Aczél, B. Barna, J. Erdős, et al. 1962; M. Kucharzewski and M. Kuczma 1962, 1963[a, d, e], 1964[a]; H. J. Hoehnke 1962; R. Nevanlinna 1962; D. A. Robinson 1962[b]; A. Zajtz 1962[a, b], 1964; S. Cater 1963; O. E. Gheorghiu 1963[a]; O. Taussky and H. Wielandt 1963; J. L. Brenner 1964; M. Kuczma and A. Zajtz 1964; several of these authors have also investigated other similar equations.


If in particular F is a scalar function (that is, F(X) is a matrix of the first order), the solution of

F(X · Y) = F(X) F(Y),   (2)

which yields the value 1 for the identity matrix and which is analytic or even merely continuous in the neighborhood of the identity matrix, is always of the form

F(X) = |det X|^α   or   F(X) = |det X|^α sgn(det X),   (3)

where α is an arbitrary real constant and det X is the determinant of X. The functional equation (2) usually occurs in the numerous systems of conditions characterizing the determinants of matrices.^112 Following M. Hosszú 1960[c] we sketch a proof of the more general

Theorem 1. The most general solution of the functional equation

F(X · Y) = F(X) F(Y),   (2)

defined for nth order nonsingular square matrices X, Y, where the values of F are real numbers, is

F(X) = f(det X),   (4)

where f(x) is an arbitrary multiplicative function:

f(xy) = f(x) f(y)   (xy ≠ 0).   (5)
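The continuous solutions (3) can be checked numerically; the sketch below (the matrix order 4 and the exponent α = 1.7 are arbitrary choices, not from the text) verifies multiplicativity and the normalization at the identity:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 1.7                                  # arbitrary real exponent

def F(X):
    # continuous solution (3): |det X|^alpha (optionally times sgn(det X))
    return abs(np.linalg.det(X)) ** alpha

X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))
assert np.isclose(F(X @ Y), F(X) * F(Y))     # equation (2)
assert np.isclose(F(np.eye(4)), 1.0)         # value 1 at the identity matrix
```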

We make use of a theorem of J. Dieudonné 1943 asserting that every nonsingular matrix can be written as a product of matrices differing from the identity matrix in but one element. For easier understanding we give the details of the proof of Theorem 1 only in the case n = 2. By the theorem of Dieudonné just quoted, every nonsingular matrix X is a product of two kinds of matrices:

(i) Nondiagonal matrices as, for example, (1 b; 0 1),

^112 See among others, K. Weierstrass 1886; C. Stephanos 1903; K. Hensel 1903, 1928; L. Kronecker 1913; C. Carathéodory 1918; W. Iliovici 1927[a, b, c]; L. Bieberbach 1929; O. Haupt 1930; O. Schreier and E. Sperner 1931; J. Dieudonné 1943; E. Artin 1944, 1957; T. Szentmártony 1944; H. W. E. Jung 1952; K. Menger 1952; R. R. Stoll 1952; G. Gáspár 1953, 1955[a, b, c, d], 1957, 1959, 1960[a, b], 1962[a, b, c], 1963[a, b], 1964; G. Pickert 1953; M. Stojaković 1954, 1957; I. Heller 1955; A. Climescu 1956[a]; W. Graeub 1958; A. Bergmann 1959; G. Tevan 1961; S. Cater 1962; F. Kozin and K. Menger 1963; etc.


for which by repeated use of (2)

F[(1 b; 0 1)]² = F[(1 b; 0 1) · (1 b; 0 1)] = F[(1 2b; 0 1)];   (6)

on the other hand,

(2 0; 0 1) · (1 b; 0 1) · (2 0; 0 1)⁻¹ = (1 2b; 0 1),

and since the values of F are real numbers (which commute) and F(E) = 1 apart from the trivial solution F ≡ 0, thus by repeated use of (2) and (6),

F[(1 b; 0 1)] = F(E) = 1,   (7)

E being the identity matrix. For the other nondiagonal matrix (1 0; c 1) differing from E in one element, it is proved similarly that F ascribes to it the same value as to E.

(ii) To diagonal matrices such as

(d 0; 0 1)   and   (1 0; 0 d)

F ascribes one and the same value, depending only on d:

F[(d 0; 0 1)] = F[(1 0; 0 d)] = f(d).   (8)

Thus for an arbitrary X, which by the theorem of Dieudonné is a product of such factors with diagonal elements d₁, d₂, we have by (7), (8), and (2)

F(X) = f(d₁) f(d₂).

The determinants of the nondiagonal factors being all equal to 1, the theorem of Dieudonné implies that

det X = d₁ d₂,

and thus with

f(x) = F[(x 0; 0 1)]

we have (4) and, by resubstitution into (2), also (5), Q.E.D.

Theorem 1 remains true if singular matrices are allowed in (2) also, and the above proof remains valid in more general fields and skew fields than that of the real numbers. By Theorem 3 of Sect. 2.1.2, we have the following

Corollary. The most general solutions of (2), continuous in a neighborhood of a nonsingular matrix (X, Y nonsingular matrices), are of the form (3).

Here, we now present the solution of the following matrix functional equations,^113 which are considerably easier to treat than (1):
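The factorization underlying the proof can be made concrete for n = 2. The sketch below handles the generic case X₁₁ ≠ 0 (the pivoting needed when X₁₁ = 0 is omitted); it is an illustration of the Dieudonné-type decomposition, not part of the original proof:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((2, 2))
a, b = X[0]
c, d = X[1]
det = np.linalg.det(X)
assert abs(a) > 1e-12 and abs(det) > 1e-12    # generic, nonsingular case

# Four factors, each differing from the identity in a single element:
T1 = np.array([[1.0, 0.0], [c / a, 1.0]])     # transvection, det = 1
D1 = np.array([[a, 0.0], [0.0, 1.0]])         # diagonal factor d1 = a
D2 = np.array([[1.0, 0.0], [0.0, det / a]])   # diagonal factor d2 = det/a
T2 = np.array([[1.0, b / a], [0.0, 1.0]])     # transvection, det = 1

assert np.allclose(T1 @ D1 @ D2 @ T2, X)      # X is the product of the factors
assert np.isclose(det, D1[0, 0] * D2[1, 1])   # det X = d1 * d2, as in the proof
```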

F(X · Y) = F(X) · Y   and   F(X · Y) = X · F(Y),

where, of course, X, Y, F(X), F(Y) now are square matrices of the same order. By substitution of the identity matrix E,

X = E,   respectively   Y = E,

we obtain, without any assumption, just as in Sect. 1.1.1 with F(E) = C:

^113 For these and for similar equations, see among others H. Nakano 1934; R. Bellman 1952; I. Colojoară 1958; M. R. Mehdi 1959; R. C. Buck 1960; S. Gołąb 1960[a]; W. Graeub 1960; L. Mirsky 1960; J. Aczél 1961[i], 1962[e]; G. N. Sakovič 1961; J. Aczél and M. Hosszú 1963; A. W. Marshall and I. Olkin 1964; A. Zajtz 1964; etc.


Theorem 2.

F(X) = C · X,   respectively   F(X) = X · C,

are the most general solutions of the functional equations

F(X · Y) = F(X) · Y;   F(X · Y) = X · F(Y),

respectively, X, Y, F(X) being square matrices of equal order.

In addition to the functional equations

F(xy) = F(x) · F(y),   F(x + y) = F(x) + F(y)

for scalar arguments, the functional equation

F(x + y) = F(x) · F(y)

has been investigated in great detail.^114 The latter has an important application in the theory of homogeneous transition probabilities. We shall only treat a similar application for the inhomogeneous case.
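Theorem 2 can again be confirmed numerically; in the sketch below (arbitrary random data) F(X) = C·X reproduces the first equation and F(X) = X·C the second:

```python
import numpy as np

rng = np.random.default_rng(3)
C = rng.standard_normal((4, 4))
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))

def F1(X): return C @ X        # solution of F(X·Y) = F(X)·Y
def F2(X): return X @ C        # solution of F(X·Y) = X·F(Y)

assert np.allclose(F1(X @ Y), F1(X) @ Y)
assert np.allclose(F2(X @ Y), X @ F2(Y))
```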

8.1.3. Generalizations of Sincov's functional equation. In the functional equation

F(x, z) = G(x, y) · H(y, z)   (1)

^114 See, among others, G. Pólya 1928; M. Nagumo 1936; B. Szőkefalvi-Nagy 1936; K. Yosida 1936; W. Doeblin 1938; M. Fréchet 1938; H. Schwerdtfeger 1938; A. de Mira Fernandes 1940; A. Weil 1940; B. Kerékjártó 1941[a]; G. Hajós 1942; F. Kárteszi and F. Zigány 1942; E. Hille 1948; O. E. Gheorghiu 1952[c], 1959[b, c], 1960[b], 1961[b], 1964; J. L. Doob 1953; M. Ghermănescu 1953, 1956; M. P. Schützenberger 1953; B. Gyires 1954[a, b], 1955; D. G. Austin 1956, 1958; K. L. Chung 1956, 1960; I. Fenyő 1956[a]; O. Onicescu, G. Mihoc, and C. Ionescu-Tulcea 1956; R. Theodorescu 1956; R. W. Bass 1957; W. Feller 1957; E. Hille and R. S. Phillips 1957; A. Climescu 1958; O. E. Gheorghiu and V. Mioc 1958; W. B. Jurkat 1958; S. Kurepa 1958[a, b], 1959; A. Balogh 1959[a, b]; R. Bellman 1960, 1961[b]; V. P. Čistjakov 1960; G. Ciucu and R. Theodorescu 1960; M. Ghermănescu 1960[b]; V. Mioc and B. Crstici 1961[a, b]; A. Geier 1961; O. E. Gheorghiu, V. Mioc, and B. Crstici 1961; M. Kucharzewski and M. Kuczma 1961[b]; G. N. Sakovič 1961; H. Schmidt 1961[a]; H. G. Tillman 1961; L. Young 1961; A. Drăculănescu 1962; J. Aczél and Z. Daróczy 1963[c]; M. Bajraktarević 1963; D. Ž. Djoković 1963[b]; C. G. Popa 1963; M. Kuczma 1964; E. Vajzović 1964; etc.


let x, y, z be vectors (or elements of an arbitrary set) and F, G, H nonsingular square matrices (or elements of any group, where the sign "·" then naturally denotes the group multiplication). Strictly speaking, it suffices to assume the existence of three constant vectors a, b, c for which

G(a, y)   and   H(b, c)

are nonsingular matrices. First, we set y = b in (1) and write

G(x, b) = L(x),   H(b, z) = N(z):

F(x, z) = L(x) · N(z).   (2)

Now we substitute (2) back into (1):

L(x) · N(z) = G(x, y) · H(y, z).

After setting x = a, we multiply on the left with the inverse G(a, y)⁻¹ of the matrix G(a, y), which is nonsingular according to our assumption:

H(y, z) = G(a, y)⁻¹ · L(a) · N(z),

or with G(a, y)⁻¹ · L(a) = M(y) (nonsingular),

H(y, z) = M(y) · N(z).   (3)

Finally, we substitute (2) and (3) back into (1):

L(x) · N(z) = G(x, y) · M(y) · N(z).

If we put z = c here, then we can multiply from the right with the inverse N(c)⁻¹ · M(y)⁻¹ of the nonsingular matrix

M(y) · N(c) = G(a, y)⁻¹ · G(a, b) · H(b, c),

and get

G(x, y) = L(x) · M(y)⁻¹.   (4)

On the other hand, (2), (3), and (4) do in fact satisfy Eq. (1). Thus we have proved the following Theorem 1.

Under the conditions

det G(a, y) ≠ 0,   det H(b, c) ≠ 0,   (5)

the vector-matrix functions

F(x, z) = L(x) · N(z),   G(x, y) = L(x) · M(y)⁻¹,   H(y, z) = M(y) · N(z)

(which can also be interpreted as solutions for functions with variables in arbitrary sets and values in an arbitrary group, without the restrictions (5)) are the general solutions of

F(x, z) = G(x, y) · H(y, z).   (1)

No assumptions were made here concerning F, G, H (as continuity, etc.). Actually, this Theorem 1 is but a variant of Theorem 2 in Sect. 7.1.1.
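Theorem 1 can be illustrated with random nonsingular matrices standing for L(x), M(y), N(z); the sketch below (sample points and matrices are arbitrary test data) verifies equation (1) for all triples:

```python
import numpy as np

rng = np.random.default_rng(4)
order, npts = 3, 4   # matrix order; number of sample points for x, y, z

# One matrix per sample point; adding 3E makes singularity very unlikely.
L = rng.standard_normal((npts, order, order)) + 3 * np.eye(order)
M = rng.standard_normal((npts, order, order)) + 3 * np.eye(order)
N = rng.standard_normal((npts, order, order)) + 3 * np.eye(order)

def F(x, z): return L[x] @ N[z]
def G(x, y): return L[x] @ np.linalg.inv(M[y])
def H(y, z): return M[y] @ N[z]

for x in range(npts):
    for y in range(npts):
        for z in range(npts):
            assert np.allclose(F(x, z), G(x, y) @ H(y, z))   # equation (1)
```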

In Sect. 8.1.4 we shall apply Sincov's special case

F(s, u) = F(s, t) · F(t, u),   (6)

where s, t, u are scalar (real) quantities, to the determination of inhomogeneous transition probabilities. Here we consider the following generalization of (6):

F(x, z) = G[F(x, y), F(y, z)],   (7)

or, with another notation,

F(x, y) ∘ F(y, z) = F(x, z)   (8)

(cf. 7.2.1(7)), where the variables x, y, z are elements of an entirely arbitrary set A and the functional values F(x, y), F(y, z), F(x, z) are elements of an entirely arbitrary group B with the "multiplication" u ∘ v = G(u, v). This possibility of generalization has already been indicated for Eq. (1). Here the solution simply proceeds as follows: We assume (8) only for one x = a and introduce the notation

F(a, z) = n(z).

With this, (8) becomes

n(z) = n(y) ∘ F(y, z),

and because B is a group, it follows that

F(y, z) = n(y)⁻¹ ∘ n(z),

where u⁻¹ is the inverse element of u in B. If we replace y by x and z by y, we see that the solution of (8) can only be of the form

F(x, y) = n(x)⁻¹ ∘ n(y).   (9)


On the other hand, (9) with arbitrary n(x) satisfies Eq. (8) for all values of x, y, z:

F(x, y) ∘ F(y, z) = n(x)⁻¹ ∘ n(y) ∘ n(y)⁻¹ ∘ n(z) = n(x)⁻¹ ∘ n(z) = F(x, z).

Thus we have proved

Theorem 2. The general solution of the equation

F(x, z) = F(x, y) ∘ F(y, z),   (8)

where the domain of F is an arbitrary set A and the range of F lies in an arbitrary group B, is of the form

F(x, y) = n(x)⁻¹ ∘ n(y),   (9)

where n(x) is an entirely arbitrary function defined on A with values in B. Moreover, if (8) is satisfied for one x = a and for all y, z in A, then it is valid for all values x ∈ A also.

Here, too, no assumptions relative to continuity, etc., were made about F(x, y). We expressly note this because (7) is an inhomogeneous generalization of 2.2.1(5) in the same sense as that in which Eq. 5.1.2(1) is a generalization of 2.1.1(1), but 2.2.1(5) was solvable only under continuity assumptions. A certain converse of this result will interest us here:

Theorem 3. If a function F(x, y) is defined for x, y ∈ A and if its values lie in a set B, in which for all pairs of elements of the form F(x, y), F(y, z) (the second variable of the first element must be identical to the first variable of the second) an operation ∘ is defined such that the functional equation

F(x, y) ∘ F(y, z) = F(x, z)   (8)

is valid for all x, y, z ∈ A, and such that F(x, y) assumes all values in B on variation of y for all fixed values x and on variation of x for one value y = a, then B is a group with respect to the operation ∘.

We have to prove that:

(a) For all values u, v ∈ B also u ∘ v ∈ B.

(b) For all values u, v, w ∈ B, the equation u ∘ (v ∘ w) = (u ∘ v) ∘ w holds.

(c) An e ∈ B exists such that

e ∘ u = u   (10)

for all u ∈ B.

(d) For every u ∈ B, there exists a u⁻¹ ∈ B with

u⁻¹ ∘ u = e.   (11)

Our assumptions assure that with arbitrary values u, v, w there exists, for example, for a a value of x with F(a, x) = u, for x a value of y with F(x, y) = v, and for y a value of z with F(y, z) = w. The proof of Theorem 3 proceeds as follows:

(a) u ∘ v = F(a, x) ∘ F(x, y) = F(a, y) ∈ B.

(b) (u ∘ v) ∘ w = [F(a, x) ∘ F(x, y)] ∘ F(y, z) = F(a, y) ∘ F(y, z) = F(a, z) = F(a, x) ∘ F(x, z) = F(a, x) ∘ [F(x, y) ∘ F(y, z)] = u ∘ (v ∘ w).

(c) With e = F(a, a), (10) is satisfied:

u = F(a, x) = F(a, a) ∘ F(a, x) = e ∘ u.

(d) According to the assumption, there exists for an arbitrary value u a t ∈ A with F(t, a) = u; then with u⁻¹ = F(a, t), Eq. (11) is satisfied:

u⁻¹ ∘ u = F(a, t) ∘ F(t, a) = F(a, a) = e,

which was to be proved.

With this, we can again obtain the result of Theorem 2, Sect. 6.3.1 for quasigroups. As mentioned in Sect. 7.1.3, a set A is called a quasigroup if in it a unique operation F(x, y) is defined, and if there exists for every pair of elements x, z exactly one value y and for every y, z exactly one value x = G(z, y) with F(x, y) = z.
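Theorem 2 above is easy to test with a concrete group; in the sketch below B is the group of invertible 3×3 matrices and n assigns an arbitrary invertible matrix to each point of A (all data hypothetical):

```python
import numpy as np

rng = np.random.default_rng(5)
npts, order = 4, 3
n_ = rng.standard_normal((npts, order, order)) + 3 * np.eye(order)

def F(x, y):
    # solution (9): F(x, y) = n(x)^{-1} ∘ n(y) in the group of invertible matrices
    return np.linalg.inv(n_[x]) @ n_[y]

for x in range(npts):
    for y in range(npts):
        for z in range(npts):
            assert np.allclose(F(x, y) @ F(y, z), F(x, z))   # Sincov equation (8)
```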

An associative quasigroup is a group. We thus prove

Theorem 4. If A forms a quasigroup with respect to the operation F(x, y) and the transitivity relation

F[F(x, z), F(y, z)] = F(x, y)   (12)

holds, then A forms a group relative to the inverse operation x = z ∘ y = G(z, y) of F(x, y) = z, and the general solution of (12) is

F(x, y) = x ∘ y⁻¹.   (13)

Since A forms a quasigroup relative to F(x, y), and the transitivity of the inverse operation follows from (12), the assumptions of Theorem 3 are satisfied, and A forms a group relative to the operation G, as asserted. Since, moreover, z = F(x, y) is equivalent to x = G(z, y) = z ∘ y, we find that

F(x, y) = z = x ∘ y⁻¹,

that is, (13), which completes the proof. Comparison of (13) with (9) shows that in this case n(z) = z⁻¹. By means of the same method, the generalization

F(x, y) = H[F(x, z), F(y, z)]   (14)
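Solution (13) of the transitivity relation (12) can be checked in any group; here, as a sketch, in the group of invertible matrices:

```python
import numpy as np

rng = np.random.default_rng(6)
inv = np.linalg.inv

def F(x, y):
    # solution (13): F(x, y) = x ∘ y^{-1}
    return x @ inv(y)

x, y, z = (rng.standard_normal((3, 3)) + 3 * np.eye(3) for _ in range(3))
assert np.allclose(F(F(x, z), F(y, z)), F(x, y))   # transitivity relation (12)
```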

(see 7.1.3(2) and 7.2.1(7)) of the transitivity equation can also be solved in an elementary manner.

Theorem 5. If A is the domain of F, B the range of F, and B at the same time the domain and range of H, if H(u, v) = w can be solved uniquely for u (u = G(w, v)), and if F(x, y) = u and F(t, a) = v can be solved for y and t respectively, then the general solution of

F(x, y) = H[F(x, z), F(y, z)]   (14)

is

F(x, y) = n(x)⁻¹ ∘ n(y),   H(u, v) = u ∘ v⁻¹,

where ∘ is a group operation in B.


This may also be deduced under weaker conditions from Theorem 2 of Sect. 6.3.1. If we set z = a in (14) and let F(x, a) = f(x), we find immediately

F(x, y) = H[f(x), f(y)],

which is one part of (16). If this is substituted back into (14), we obtain for H(u, v) the transitivity equation

H(u, v) = H[H(u, w), H(v, w)],

and, according to Theorem 2 of Sect. 6.3.1, it follows that

H(u, v) = u ∘ v⁻¹.   (15)

This proves

F(x, y) = H[f(x), f(y)] = f(x) ∘ f(y)⁻¹.   (16)

Theorem 6.

H(u, v) = u ∘ v⁻¹,   F(x, y) = H[f(x), f(y)] = f(x) ∘ f(y)⁻¹

constitutes the general system of solutions of

F(x, y) = H[F(x, z), F(y, z)]   (x, y, z ∈ A; F(s, t), H(u, v) ∈ B)   (14)

if H(u, v) is reducible on the left and F(x, a) = u ∈ B can be solved for x in A.

Similarly, the algebraic considerations of Sect. 6.3.1 (and also those mentioned in Sects. 5.3.1, 6.4.1, 6.5.1, 7.1.2, 7.1.3, etc.) could also be included in this chapter. The case in (8) where x ≤ y ≤ z are real numbers, the values of F are matrices, and ∘ is matrix multiplication finds application in probability theory.

8.1.4. Application to the determination of inhomogeneous transition probabilities.^115 If f_jk(t, u) denotes the probability for an

^115 For Sects. 8.1.3 and 8.1.4, see, among others, A. N. Kolmogorov 1931, 1935; M. Fréchet 1932[a, b], 1933, 1938; G. C. Moisil 1932; S. Gołąb 1933; B. Hostinský 1939; E. Sperner 1948, 1949; A. Tortrat 1949; R. L. Dobrušin 1953; J. L. Doob 1953; J. Aczél 1955[a], 1956[b], 1960[b], 1961[b]; R. Theodorescu 1955, 1960[a, b]; M. Hosszú 1958; R. M. Redheffer 1959; W. T. Reid 1959; G. Ciucu and R. Theodorescu 1960; M. Ghermănescu 1960[b]; G. Vrănceanu 1962; G. Vrănceanu and C. G. Vrănceanu 1962. J. Aczél and J. Egerváry 1957 and J. Egerváry 1957 also treat the case in which the matrices F are singular.

object (particle, system) to change in the time interval (t, u) from a state E_j into another E_k, then under the Markov assumption of the independence of transitions during mutually disjoint or only adjacent time intervals, the following equations must be valid:

f_ik(s, u) = Σ_{j=1}^n f_ij(s, t) f_jk(t, u)   (a ≤ s ≤ t ≤ u; i, k = 1, 2, ..., n),   (1)

Σ_{j=1}^n f_ij(s, t) = 1   (a ≤ s ≤ t; i = 1, 2, ..., n).   (2)

Here, a is the initial point of our time measurement (the assumption that such a point exists is not essential). Equation (1) follows from the fact that the transition in the time interval (s, u) from E_i to E_k occurs in such a way that at the intermediate time t one of the possible states E_j (j = 1, 2, ..., n) is assumed, while (2) simply follows from the fact that during (s, t) the transition from E_i to some state E_j (j = 1, 2, ..., n) certainly takes place. It should be noted here that the functional equation 5.1.2(4) of the inhomogeneous composed Poisson distribution is a special case of (1) (with a suitable special choice of the f_jk), and that the result obtained there can also be deduced from the results that will follow here. We note that (1) is just the matrix functional equation

F(s, u) = F(s, t) · F(t, u)   (a ≤ s ≤ t ≤ u)   (3)

for the matrices F(s, t) = || f_ij(s, t) ||. Here we assume that F(s, t) or at least

F(a, t) = W(t)   (4)

is nonsingular. The following should be noted:

Lemma. If F(s, t) (that is, the functions f_ij(s, t)) or at least det F(s, t) is continuous and the equally natural equation

f_ij(t, t) = 1 for j = i,   f_ij(t, t) = 0 for j ≠ i

holds, that is, F(t, t) = E (identity matrix) (if no time passes, the system remains in the same state, that is, it makes no jumps), then the solution F(s, t) of (3) is always nonsingular.

Let us assume that there exists a pair of values (s, t) with

det F(s, t) = 0.   (5)

Because of the continuity, it follows from

det F(s, s) = 1

that the determinant is also different from 0 in a sufficiently small neighborhood of (s, s). Let, for example,

det F(s, t) ≠ 0   for   t < t₀,   (6)

but

det F(s, t₀) = 0,   (7)

that is, let t = t₀ be the next zero of f(t) = det F(s, t) to the right of s (such a point exists because of continuity and because of assumption (5), to be contradicted). Since also det F(t₀, t₀) = 1, then because of continuity,

det F(t̄, t₀) ≠ 0   (8)

holds if t̄ is near enough (from the left) to t₀. We shall choose such a t̄ < t₀. Then because of (3), we have

det F(s, t₀) = det F(s, t̄) · det F(t̄, t₀),

where the left side is equal to 0 because of (7), but the right side is different from 0 owing to (6) and (8), and this is the sought contradiction, thus assuring that F(s, t) is nonsingular and proving therewith our lemma.

Equation (2) states that the row sums of the matrix F(t, u) are equal to 1, thus that it is stochastic. Obviously, f_jk(t, u) ≥ 0, but this we shall not utilize here. First we solve (3) without assuming (2). In (3), set s = a as in 8.1.3(8) and multiply with the inverse W(t)⁻¹ of the matrix (4), which is assumed to be nonsingular. Then we obtain [cf. 8.1.3(9)]

F(t, u) = W(t)⁻¹ · W(u).   (9)


On the other hand, every F(t, u) of the form (9) with arbitrary nonsingular W(t) satisfies functional equation (3):

F(s, t) · F(t, u) = W(s)⁻¹ · W(t) · W(t)⁻¹ · W(u) = W(s)⁻¹ · W(u) = F(s, u).

Thus we have proved

Theorem 1.

F(t, u) = W(t)⁻¹ · W(u)   (a ≤ t ≤ u)   (9)

is the general nonsingular solution (or the general continuous one with F(t, t) = E) of

F(s, u) = F(s, t) · F(t, u)   (a ≤ s ≤ t ≤ u).   (3)

It even suffices that (3) and

det F(s, t) = det || f_ij(s, t) || ≠ 0   (10)

hold for s = a in order for (9), and at the same time (3) and (10), to be ensured for arbitrary a ≤ s ≤ t ≤ u. If we write our result in component form, we obtain the following

Corollary. If (1) and (10) hold for s = a and for all a ≤ t ≤ u, then

f_ij(s, t) = Σ_{k=1}^n W_ki(s) w_kj(t),   (11)

where w_kj(u) is arbitrary with det w_kj(u) ≠ 0 and W_ki(s) is the quotient of the adjoint of w_ki(s) and of the determinant det w_kj(s); and then (1) and (10) also hold for arbitrary a ≤ s ≤ t ≤ u. Now we also consider (2), again only for s = a, and substitute

f_ij(a, t) = Σ_{k=1}^n W_ki(a) w_kj(t)

into it:

1 = Σ_{j=1}^n f_ij(a, t) = Σ_{j=1}^n Σ_{k=1}^n W_ki(a) w_kj(t)   (i = 1, 2, ..., n).

If we multiply these equations by w_li(a) and sum over i, we obtain

Σ_{j=1}^n w_lj(t) = Σ_{i=1}^n w_li(a)   (l = 1, 2, ..., n).

In fact,

Σ_{i=1}^n w_li(a) W_ki(a)

is an element of the product of a matrix with its inverse, that is, an element of the identity matrix; it must therefore be equal to

δ_kl = 1 for k = l,   δ_kl = 0 for k ≠ l.

Hence we have proved: If (1), (10), and (2) are valid for s = a, then the row sum of W(t) in (9) is constant. As we have seen, we can choose W(t) = F(a, t), and the elements of this matrix are probabilities; thus W(t) can be chosen stochastic in (9) and (11). Conversely, from

Σ_{j=1}^n w_kj(t) = constant,   that is,   Σ_{j=1}^n w_kj(t) = Σ_{j=1}^n w_kj(s),

(2) follows for arbitrary a ≤ s ≤ t, where W_ki(s) is again the quotient of the adjoint of w_ki(s) and of the determinant det w_kj(s).
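The result of this section can be illustrated numerically. The family W(t) below is a hypothetical choice, stochastic and nonsingular by construction; F(t, u) = W(t)⁻¹ · W(u) then satisfies (3), has row sums 1 as in (2), and satisfies F(t, t) = E:

```python
import numpy as np

n = 3
P = np.full((n, n), 1.0 / n)            # a fixed stochastic matrix

def W(t):
    s = 0.5 / (1.0 + t)                 # weight in (0, 0.5]: keeps W(t) nonsingular
    return (1 - s) * np.eye(n) + s * P  # rows still sum to 1, so W(t) is stochastic

def F(t, u):
    return np.linalg.inv(W(t)) @ W(u)

s, t, u = 0.2, 1.0, 2.5
assert np.allclose(F(s, u), F(s, t) @ F(t, u))   # equation (3)
assert np.allclose(F(t, u).sum(axis=1), 1.0)     # row sums equal 1, as in (2)
assert np.allclose(F(t, t), np.eye(n))           # F(t, t) = E
```

Entries of such an F(t, u) need not be nonnegative in general; as noted above, nonnegativity is not utilized here.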

8.2. Associativity, Transformation, and Distributivity Equations

8.2.1. The transformation equation. The transformation equation

f[f(x, U), V] = f[x, G(U, V)]   (1)

of the m-parameter transformations in n-dimensional space belongs to the theory of continuous groups. Here,^116 we indicate a simple but often useful theorem for a very special case. Equation (1) is the generalization of 7.1.3(15) to vectors. In (1), let x, f(x, U) be n-dimensional vectors, while U = (u₁, ..., u_m) is a collection of m parameters, thus itself an m-dimensional vector of the parameter space. This we write with ordinary capital letters instead of boldface lowercase letters in order to emphasize the difference in dimensions. V and G(U, V) are thus m-dimensional vectors too. Our restriction consists first of all in the fact that the dimension n of the vector space of the x shall not be smaller than the dimension m of the parameter space:

n ≥ m.

We shall frequently represent the n-dimensional vector x with the notation

x = (X; ξ),

where X has dimension m and ξ is (n − m)-dimensional; that is, X has the first m components of the vector x and ξ the remaining components. In (1) we set U = Y and

x = (A; ξ),

where A is a constant m-dimensional vector; we obtain

f{f[(A; ξ), Y], V} = f[(A; ξ), G(Y, V)].

If we introduce the notation

h(y) = h[(Y; η)] = f[(A; η), Y],

then we obtain

f[h(y), V] = h[(G(Y, V); η)].

We now assume that

h(y) = x

can be solved uniquely for y:

y = k(x),   in particular   Y = K(x),   η = κ(x).

Then h(y) and k(x) are inverse functions to each other:

k[k⁻¹(y)] = y,   h(y) = k⁻¹(y),   K[k⁻¹(y)] = Y,   κ[k⁻¹(y)] = η.

We then obtain

f(x, V) = k⁻¹[(G(K(x), V); κ(x))]   (2)

^116 For Sects. 8.2.1 and 8.2.2, see, among others, A. Suškevič 1929[a, b]; B. Kerékjártó 1930, 1941[b, c, d], 1942; G. Vrănceanu 1947, 1962; V. V. Stepanov 1950; B. A. Sevast'janov 1951; J. Aczél 1955[d], 1956[b], 1959[i], 1960[b, d], 1961[j], 1962[c]; M. Hosszú 1955[d], 1956, 1957[a], 1959[b, c, f, i, k], 1960[d, e], 1962[a, b], 1963[c]; J. Aczél and M. Hosszú 1956; R. Bellman 1958[d]; V. D. Belousov 1960[d]; S. B. Prešić 1960[a]; J. Aczél, M. Hosszú, and E. G. Straus 1961; M. Kuczma 1961[a]; etc.

as solution of (1). On the other hand, there follows the associativity of G,

G[G(U, V), W] = G[U, G(V, W)],   (3)

if it is assumed that f(x, U) = f(x, V) holds for every x only if U = V (weak left reducibility). This again is a weaker condition than reducibility. From (1),

f{x, G[G(U, V), W]} = f{f[x, G(U, V)], W} = f{f[f(x, U), V], W} = f[f(x, U), G(V, W)] = f{x, G[U, G(V, W)]}


follows, just as in Sect. 7.1.3, paragraph C, and the asserted associativity (3) follows on the basis of the assumption of weak left reducibility just made. But if G is associative, then (1) is actually satisfied by (2):

f[f(x, U), V] = k⁻¹[(G[G(K(x), U), V]; κ(x))] = k⁻¹[(G[K(x), G(U, V)]; κ(x))] = f[x, G(U, V)].

Thus we have proved the following

Theorem. If dim x ≥ dim U, if there exists an A = (a₁, a₂, ..., a_m) for which x = f[(A; η), Y] can be solved uniquely for y = (Y; η), and if for U ≠ V the inequality f(x, U) ≠ f(x, V) holds, then

f(x, U) = k⁻¹[(G(K(x), U); κ(x))]   (2)

with arbitrary k(x) = (K(x); κ(x)) having unique inverse is the general solution of

f[f(x, U), V] = f[x, G(U, V)].   (1)
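A toy instance of the theorem (n = 2, m = 1, with the associative parameter composition G(u, v) = u + v and a hypothetical invertible map k) shows how solution (2) reproduces equation (1):

```python
import numpy as np

def k(x):                       # k(x) = (K(x); kappa(x)), invertible on R^2
    return np.array([x[0] ** 3 + x[1], x[1]])

def k_inv(y):
    return np.array([np.cbrt(y[0] - y[1]), y[1]])

def G(u, v):                    # associative parameter composition
    return u + v

def f(x, u):                    # solution (2): f(x, U) = k^{-1}[(G(K(x), U); kappa(x))]
    K, kappa = k(x)
    return k_inv(np.array([G(K, u), kappa]))

x = np.array([0.7, -1.2])
u, v = 0.9, -2.3
assert np.allclose(f(f(x, u), v), f(x, G(u, v)))   # transformation equation (1)
```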

8.2.2. The translation equation. The translation equation

f[f(x, U), V] = f(x, U + V),   (1)

where

U + V = (u₁, ..., u_m) + (v₁, ..., v_m) = (u₁ + v₁, ..., u_m + v_m)

is ordinary vector addition, can be solved in a similar elementary way, not only for n = dim x ≥ dim U = m but also for n < m.


(a) For n ≥ m, it follows from Sect. 8.2.1 that

f(x, U) = k⁻¹[k(x) + (E_m; 0) · U] = k⁻¹[k(x) + C · U],

where E_m is the identity matrix of mth order and C is a matrix with n rows and m columns, which is multiplied by the m-dimensional column vector U. Conversely, every

f(x, U) = k⁻¹[k(x) + C · U]   (2)

with arbitrary constant (n × m) matrix C satisfies Eq. (1):

f[f(x, U), V] = k⁻¹[k(x) + C · U + C · V] = k⁻¹[k(x) + C · (U + V)] = f(x, U + V).

On the other hand, the condition that

f[(A; η), Y] = x   (3)

can be solved for y = (Y; η) is satisfied only if C can be completed to a nonsingular square matrix of nth order (or, what is equivalent, can be reduced to a nonsingular matrix of mth order). Since the vector addition U + V is associative, we do not need any special condition here to assure associativity. Thus we have proved the

Theorem 1. If dim x ≥ dim U and f[(A; η), Y] = x can be solved uniquely for y = (Y; η), then

f(x, U) = k⁻¹[k(x) + C · U]   (2)

with an arbitrary vector function (with unique inverse) k(x) and with an arbitrary constant matrix C = (C₀; C₁) (C₀ nonsingular) is the general solution of

f[f(x, U), V] = f(x, U + V).   (1)

(b) Now we pass to the case n = dim x < dim U = m.

8.

Vector and M a t r i x Equations

Here we write the m-dimensional vector U a s follows:

u=(;), where u has dimension n and p has dimension ( m - n). Let x = a (constant) in (1):

f l f [ a , i31' i:,I and choose also p

=

f b (i" +

u+u

v11

(4)

= - v:

i 0 11 .

f l f k ?( 3 1 ' (31

u f v

= 4.7

If we assume that

f[..

(3

=

x

(5)

can be solved uniquely with respect to u, that is,

then it follows from

that =

g"31

Further, let

From our assumption, the existence of an inverse of h(z) = n obviously follows: z = h-'(x) = k ( x ) .

8.2.

Associativity, Transformation, Distributivity

369

T h e functions introduced last are related, as can be seen immediately, as follows:

k(n)

=

h-l(x) = g [ (

f j] .

T h u s we have

This we substitute back into (4) with

p = 0:

With

this becomes

so that from (6)

follows. If we finally substitute back into (4),we obtain

k-"k(a)

+ u + l ( P ) + u + +)I

that is, l(P

+ v)

= k-"k(a)

=

l(P)

+ u I- u + l ( P + v)],

+ lb).

This, however, is just Cauchy's functional equation 8.1.1( 1). If g [ ( t ) ] and thus l ( p ) , are assumed to be continuous or bounded, then I(v) = A

is the general solution.

*

v


Thus the solution of (1) in this case is

= k-"k(x)

+ (EJ)

*

(,")I

= k-'[k(x)

+c

'

U],

that is, of the form (2), and it does in fact satisfy ( l) , as was already shown in (a). However, the solution f ( x , U ) = k-'[k(x)

+ 4u)1,

(7)

occurring in the discontinuous, unbounded case with arbitrary I( U ) , that satisfies the equation I(U

+ V)

= I(U)

+ W),

which according to G. HAMEL 1905 possesses also discontinuous, unbounded solutions (cf. Sect. 2.1.1), also satisjies (1). Therefore, this assumption, which does not occur in case (a), cannot be removed in case (b). I n fact, the solutions (7) with discontinuous I were eliminated in (a) by the condition of unique solvability of ( 3 ) , whereas the similar condition of unique solvability of (5) in (b) does not eliminate them. It is readily seen that here, too, the condition of solvability gives rise to C = ( C , A ) where C , is nonsingular. Thus, we have proved Also in the case dim x < dim U the general solutions of (1) are thefunctions (2) with arbitrary k ( x ) (with unique inverse), and with arbitrary constant matrices C = (CoA ) (Co nonsingular) if f [ a , (31 = x possesses a unique solution u = g [ ( t ) ] and g [ ( z ) ] is continuous or bounded. Theorem 2.

8.2.3. THE ASSOCIATIVITY EQUATION. The associativity equation

g[g(u, v), w] = g[u, g(v, w)],   dim u = dim v = dim w = dim g = m,   (1)

which we now write with lowercase boldface letters throughout, since only vectors of equal dimension are involved, also belongs to the theory of continuous groups and is investigated here again under strongly restrictive conditions. In connection with the previous section, the question could be posed as to when the transformation equation 8.2.1(1)


can be reduced to the translation equation 8.2.2(1), that is, when an additive parameter can be introduced, which is equivalent to

g(u, v) = h[h⁻¹(u) + h⁻¹(v)],   dim u = dim g = dim h = m.   (2)

Then, according to 8.2.2(2), the solution of 8.2.1(1) is of the form

f(x, U) = k⁻¹[k(x) + C · H(U)],

where H(U) takes the place of the function h⁻¹(u) in (2). We shall establish conditions such that (2) is the only possible solution of (1). Such a system of conditions is¹¹⁷:

1. The parameter space represents an Abelian semigroup with respect to the operation g(u, v) = u ∘ v.

2. There is a domain of real (scalar) operators defined on the parameter space with the following property: uˣ ∘ uʸ = u^(x+y).

3. There exists a basis a₁, a₂, ..., aₘ with respect to the operation u ∘ v; that is, every vector u can be uniquely represented in the form

u = a₁^(u₁) ∘ a₂^(u₂) ∘ ⋯ ∘ aₘ^(uₘ).

We now define

h(x) = h(x₁, x₂, ..., xₘ) = a₁^(x₁) ∘ a₂^(x₂) ∘ ⋯ ∘ aₘ^(xₘ),

which according to condition 3 assumes every possible u-value exactly once. Then

h(x + y) = a₁^(x₁+y₁) ∘ a₂^(x₂+y₂) ∘ ⋯ ∘ aₘ^(xₘ+yₘ) = h(x) ∘ h(y),

where, in addition to the definition of h(x), the commutativity, the associativity, and condition 2 were used.

117 See, among others, M. HOSSZÚ 1955[d], 1956, 1962[c]; J. ACZÉL AND M. HOSSZÚ 1956; F. A. SOLOHOWE 1958; J. ACZÉL 1959[i].

Condition 3 means that h(x) = u


has a unique inverse, and thus we have

h[h⁻¹(u) + h⁻¹(v)] = u ∘ v,

that is, (2). On the other hand, (2) with an arbitrary h having a unique inverse satisfies Eq. (1) and conditions 1, 2, and 3. Thus, we have proved the following

Theorem. Conditions 1, 2, and 3 are necessary and sufficient for

g(u, v) = h[h⁻¹(u) + h⁻¹(v)]

with an h having a unique inverse to be the general solution of

g[g(u, v), w] = g[u, g(v, w)].
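Conditions 1-3 can be realized concretely; in the following sketch (our illustration, not from the text) the parameter space is the positive quadrant under the componentwise product u ∘ v, the scalar operators are componentwise powers, and the basis is aᵢ = (1, ..., e, ..., 1), so that h(x) = (exp x₁, ..., exp xₘ).

```python
import math

# Concrete realization of conditions 1-3 (illustrative choices):
# positive vectors under the componentwise product, with
# h(x) = a_1^{x_1} ∘ ... ∘ a_m^{x_m} = (exp(x_1), ..., exp(x_m)).

def h(x):
    return tuple(math.exp(t) for t in x)

def h_inv(u):                      # condition 3: h has a unique inverse
    return tuple(math.log(t) for t in u)

def g(u, v):                       # g(u, v) = h[h^{-1}(u) + h^{-1}(v)]
    return h(tuple(p + q for p, q in zip(h_inv(u), h_inv(v))))

u, v, w = (1.5, 0.25), (2.0, 3.0), (0.5, 4.0)

# g reproduces the componentwise product u∘v ...
assert all(abs(p - q * r) < 1e-12 for p, q, r in zip(g(u, v), u, v))
# ... and hence satisfies the associativity equation (1)
assert all(abs(p - q) < 1e-12 for p, q in zip(g(g(u, v), w), g(u, g(v, w))))
```

Here g is associative precisely because h transports vector addition onto the semigroup operation, which is the content of the theorem.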

The associativity equation (1) can be interpreted for two-dimensional vectors as associativity of an operation between complex numbers. In that case, differentiability conditions may be considered for the solution, as in Sect. 7.2.2.¹¹⁸

118 See, among others, A. KUWAGAKI 1952[c]; M. HOSSZÚ 1955[d, e]; D. J. HANSEN 1964.

8.2.4. THE GENERALIZED DISTRIBUTIVITY EQUATION. The generalized distributivity equation

f[g(x, y), U] = h[k(x, U), l(y, U)]   (1)

for vectors will be investigated again only under certain restrictive assumptions.¹¹⁹ One such limitation could be that f(z, U), k(x, U), l(y, U) are transformations into which additive parameters can be introduced (cf. Sects. 8.2.2, 8.2.3, and 8.2.5). As in Sect. 7.2.4, we assume here, among other things, the existence of a vector, for example, U = 0 = (0, 0, ..., 0), such that k(x, 0) and l(x, 0) are constant:

k(x, 0) = a,   l(x, 0) = b.

Differentiability is also assumed here, even at U = 0. The derivatives of vector-vector functions are matrices.

119 See, among others, O. ONICESCU 1927; M. HOSSZÚ 1955[d], 1957[c], 1959[b, i]; J. ACZÉL 1959[i]; L. FUCHS 1963.
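Before the derivation, a simple member of the solution family to be obtained may make these assumptions concrete. In the following sketch (all concrete choices are our assumptions, not from the text), g(x, y) = x + y and h(u, v) = u + v, while f, k, l depend affinely on x through a coefficient C(U); choosing C(0) = 0 makes k(x, 0) and l(y, 0) constant, as assumed above.

```python
# One illustrative member of the solution family of (1):
#   f(z, U) = C(U)·z + a(U) + b(U),
#   k(x, U) = C(U)·x + a(U),   l(y, U) = C(U)·y + b(U),
# with g(x, y) = x + y and h(u, v) = u + v.  C, a, b are arbitrary.

def C(U):                      # scalar coefficient, vanishing at U = 0
    return U[0] + 2.0 * U[1]

def a(U):
    return U[0] ** 2

def b(U):
    return 3.0 * U[1]

def g(x, y): return x + y
def h(u, v): return u + v
def f(z, U): return C(U) * z + a(U) + b(U)
def k(x, U): return C(U) * x + a(U)
def l(y, U): return C(U) * y + b(U)

x, y, U = 1.3, -0.4, (0.8, 0.5)
# the generalized distributivity equation (1):
assert abs(f(g(x, y), U) - h(k(x, U), l(y, U))) < 1e-12
# k(·, 0) and l(·, 0) are constant, as assumed:
assert k(5.0, (0.0, 0.0)) == k(-7.0, (0.0, 0.0))
```

Both sides of (1) equal C(U)(x + y) + a(U) + b(U), which is why the affine structure appears again in the general solution below.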


We differentiate (1) with respect to a component of U = (u₁, u₂, ..., uₘ), say u₁, and denote the derivatives of f(z, U), k(x, U), l(y, U) with respect to this component (which, as derivatives of vectors with respect to scalars, are themselves vectors) by the index 2, whereas h₁(u, v) and h₂(u, v) denote the matrix derivatives with respect to u and v, respectively:

f₂[g(x, y), U] = h₁[k(x, U), l(y, U)] · k₂(x, U) + h₂[k(x, U), l(y, U)] · l₂(y, U).

If we set U = 0, we obtain

f₂[g(x, y), 0] = h₁(a, b) · k₂(x, 0) + h₂(a, b) · l₂(y, 0).

With

r(z) = f₂(z, 0),   p(x) = h₁(a, b) · k₂(x, 0),   q(y) = h₂(a, b) · l₂(y, 0),

this becomes

r[g(x, y)] = p(x) + q(y),

and if r has a unique inverse, namely r⁻¹, then

g(x, y) = r⁻¹[p(x) + q(y)]

also holds. We substitute this into (1) with U = C ≠ 0 (constant):

h[k(x, C), l(y, C)] = f{r⁻¹[p(x) + q(y)], C}.

If the functions

k(x, C) = u,   l(y, C) = v

also have unique inverses with respect to x and y, respectively, that is,

x = φ(u),   y = ψ(v),


then with

f[r⁻¹(z), C] = w(z),   p[φ(u)] = s(u),   q[ψ(v)] = t(v),

it follows that

h(u, v) = w[s(u) + t(v)].

If we substitute this too back into (1), we obtain, with

w⁻¹{f[r⁻¹(z), U]} = c(z, U),   s{k[p⁻¹(x), U]} = d(x, U),   t{l[q⁻¹(y), U]} = e(y, U),

Pexider's equation

c(x + y, U) = d(x, U) + e(y, U),

which, aside from the dependence on U, has the form 8.1.1(2) and therefore the bounded solutions

c(z, U) = C(U) · z + a(U) + b(U),   d(x, U) = C(U) · x + a(U),   e(y, U) = C(U) · y + b(U)

(C is a matrix; a, b are vectors). Therefore, under the above assumptions,

g(x, y) = r⁻¹[p(x) + q(y)],   h(u, v) = w[s(u) + t(v)],

f(x, U) = w[C(U) · r(x) + a(U) + b(U)],

k(x, U) = s⁻¹[C(U) · p(x) + a(U)],   l(y, U) = t⁻¹[C(U) · q(y) + b(U)]

is the general system of local solutions of (1). Substitution shows directly that these functions do satisfy Eq. (1). The exact statement of the conditions will be left to the reader this time.

The distributivity equation

f[g(x, y), U] = g[f(x, U), f(y, U)]   (2)


for vectors will be solved here by means of a differentiation process, under the substantially weaker conditions of the existence of an identity transformation

f(x, E) = x

and the following inequality for the dimensions:

n = dim x < dim U = m.

Moreover, both assumptions can be eliminated. We differentiate our vector equation (2) with respect to the vector u, that is, with respect to the first n components of U, at U = E. As a result, we obtain an equation between matrices:

f_u[g(x, y), E] = g₁(x, y) · f_u(x, E) + g₂(x, y) · f_u(y, E).   (3)

We introduce two new vector functions k(x) and l(x, y) by

k′(x) = f_u[k(x), E],

that is,

k⁻¹(z) = ∫ f_u(z, E)⁻¹ dz = h(z);   (4)

thus

l₁(x, y) + l₂(x, y) = E

holds, where E denotes the identity matrix. This yields, because of (5), the following Theorem.

Every continuously differentiable local solution g of

f[g(x, y), U] = g[f(x, U), f(y, U)]   (2)

has the form

g(x, y) = h⁻¹{h(y) + p[h(x) − h(y)]}

(p, h are differentiable, h also invertible), if dim x < dim U and f(x, E) = x, det f_u(x, E) ≠ 0, and if (4) has a locally invertible solution k.

The determination of f(x, U) has not succeeded so far under these conditions. A proof of the conjecture that, as before,

f(x, U) = h⁻¹[C(U) · h(x) + a(U)]

would be of great importance, among other things, for the theory of geometric objects.

8.2.5. VECTOR OPERATIONS WHICH ARE AUTOMORPHIC WITH RESPECT TO CERTAIN TRANSFORMATIONS. A vector operation g(x, y) is called automorphic with respect to the transformations f(x, U) if the image of the vector operation g(x, y) on two vectors x and y under the transformation f is equal to the vector operation on the images f(x, U) and f(y, U):

f[g(x, y), U] = g[f(x, U), f(y, U)].   (1)

Thus, these are special cases of the distributivity equation 8.2.4(2) for vectors, that is, also of 8.2.4(1), with known f(x, U). G. DARBOUX 1875 has proved, in establishing the parallelogram rule for the composition of forces (vector addition) in an elementary geometric manner, that the continuous Abelian group of the three-dimensional vector operations g(x, y) that are automorphic with respect to rotations is isomorphic with the vector group having vector addition as group operation. All g(x, y) are of the form

g(x, y) = H⁻¹[H(x) + H(y)]   (2)

with

H(x) = h(|x|) x,   (3)

and these functions always satisfy the above conditions. Here h(t) is an arbitrary (continuous) scalar function and |x| the magnitude of the vector x. The rotational automorphism means, as in Sect. 1.3, that a simultaneous rotation of x and y results in the same rotation of g(x, y). From this, it may be deduced¹²⁰ that the continuous Abelian group of three-dimensional vector operations that are automorphic with respect to rotations and with respect to dilatations is isomorphic with the vector group that has vector addition as group operation

with

h(t) = tᵃ

(a a scalar). By "dilatation" we mean the transformation f(x, u) = ux, where u is a scalar. Here we prove another theorem, also discovered by M. HOSSZÚ 1955[d], which determines the vector operations g(x, y) automorphic with respect to rotations and dilatations, with no restrictions on the number of dimensions and without assuming group properties, but under the assumption of differentiability of g(x, y) at (0, 0). Equation (2) with (3) is differentiable at (0, 0) only for a = 0. The dilatation automorphism states that

u · g(x, y) = g(ux, uy),   (4)

and if we take the derivatives of both sides of (4) with respect to u, we obtain (cf. Sect. 7.2.1)

g(x, y) = g₁(ux, uy) · x + g₂(ux, uy) · y,

and with u = 0 and g₁(0, 0) = A, g₂(0, 0) = B (constant matrices), we obtain

g(x, y) = A · x + B · y.

120 See M. HOSSZÚ 1955[d], 1957[c]; M. HOSSZÚ AND E. VINCZE 1962[a]. Concerning this class of problems, see also, among others, F. SIACCI 1899[a, b, c]; G. HAMEL 1903; R. SCHIMMACK 1903, 1908, 1909[c]; F. SCHUR 1903; J. ACZÉL 1960[h], 1961[g], 1962[b].

Thus we have

Theorem 1. The functions

g(x, y) = A · x + B · y

are the most general vector operations that are automorphic with respect to dilatations and differentiable at x = y = 0; that is, this is the general solution of

u · g(x, y) = g(ux, uy)   (4)

that is differentiable at (0, 0). Equation (4) is a generalization of Euler's functional equation 5.2.1(1) to vectors. For y = x and y = −x there now follows

g(x, x) = (A + B) · x,   respectively   g(x, −x) = (A − B) · x.

If rotational automorphism is also assumed, then these resulting vectors remain unchanged under a rotation around the x-axis, which is possible only if

(A + B)x = αx,   (A − B)x = βx   (α, β scalars).

These equations can be satisfied for arbitrary vectors x, on the other hand, only if

A + B = αE,   A − B = βE   (E identity matrix),

that is,

g(x, y) = ax + by   (a, b scalars).

These operations are, in fact, automorphic with respect to rotations and dilatations and are differentiable at (0, 0). Thus we have

Theorem 2. Functions of the form

g(x, y) = ax + by   (a, b scalars)

are the most general vector operations differentiable at x = y = 0 and automorphic with respect to rotations and dilatations in a space of arbitrary dimension.

It is seen that they always associate with the vector pair x, y a vector g(x, y) which lies in the plane spanned by them. Numerous other problems in the theory of geometric transformations may also be handled by solving functional equations.
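Theorem 2 can be spot-checked numerically in three dimensions. The following sketch (our illustration; the rotation about the z-axis and the scalars a, b are arbitrary assumptions) verifies that g(x, y) = ax + by commutes with a rotation R and a dilatation u, i.e. R·g(x, y) = g(Rx, Ry) and u·g(x, y) = g(ux, uy).

```python
import math

# Numerical check of Theorem 2 for g(x, y) = a·x + b·y in R^3;
# a, b and the sample vectors are illustrative choices.

a, b = 2.0, -0.5

def g(x, y):
    return tuple(a * xi + b * yi for xi, yi in zip(x, y))

def rot_z(t, v):               # rotation by angle t about the z-axis
    c, s = math.cos(t), math.sin(t)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

x, y, t, u = (1.0, -2.0, 0.3), (0.4, 2.5, -1.0), 0.77, 1.9

# rotational automorphism: R·g(x, y) = g(R·x, R·y)
lhs = rot_z(t, g(x, y))
rhs = g(rot_z(t, x), rot_z(t, y))
assert all(abs(p - q) < 1e-12 for p, q in zip(lhs, rhs))

# dilatation automorphism: u·g(x, y) = g(u·x, u·y)
dil = g(tuple(u * xi for xi in x), tuple(u * yi for yi in y))
assert all(abs(u * p - q) < 1e-12 for p, q in zip(g(x, y), dil))
```

Both identities hold by the linearity of g, which is exactly the content of the theorem; the resulting vector ax + by visibly lies in the plane spanned by x and y.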