A Data Transformation Model for Factor Analysis*

P. M. BENTLER

Mathematical Biosciences 2, 145-149 (1968). Copyright © 1968 by American Elsevier Publishing Company, Inc.

Communicated by Richard Bellman

* Presented at the January, 1967, meetings of the California State Psychological Association.


The traditional approach to factor analysis views this multivariate data analysis technique primarily as a method for determining component sources of variation in a derived matrix that excludes the unsystematic or error variation found in the original data matrix. The problem of transformation is only of concern in the final step of the procedure, that of finding meaningful locations for the reference vectors. In this article we describe an orientation toward factor analysis viewing it primarily as a technique for transforming the data matrix directly into factor score and factor loading matrices.

INTRODUCTION

Although a number of multivariate models exist for linear analysis of an n by m (n > m) matrix X, almost all are variations of the data analysis technique known as factor analysis. The techniques of factor analysis have recently been described by Horst [4]. We may review the essentials. The matrix X of observed scores is hypothesized to be composed of a common component C and a unique component U such that

X = C + U    (1)

and

C'U = U'C = 0.    (2)


Thus U and C are assumed orthogonal. It follows immediately that

X'X = C'C + U'U.    (3)

The matrices U and C are generally unknown. U'U is traditionally assumed diagonal and is estimated in some manner. The matrix (X'X - U'U) is then analyzed for its eigenvectors L and eigenvalues d²:

C'C = Ld²L',    (4)

where d² is a diagonal matrix and L'L = I, the identity matrix. Then

F = Ld    (5)

is the factor loading matrix representing the degree of association of variables (columns of X) and factors (columns of F).

C'C = FF'    (6)

is the "fundamental postulate" of factor analysis.
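
To make Eqs. (3)-(6) concrete, the following is a minimal numerical sketch (in NumPy, a modern convenience and not part of the original presentation) in which U'U is taken as known for a synthetic data matrix and F = Ld is extracted from the eigendecomposition of X'X - U'U:

    import numpy as np

    rng = np.random.default_rng(0)
    n, m, r = 200, 6, 2                        # n observations, m variables, r common factors

    C = rng.standard_normal((n, r)) @ rng.standard_normal((r, m))   # common part of rank r
    U = 0.1 * rng.standard_normal((n, m))                           # unique/error part
    X = C + U                                  # Eq. (1); here C'U is only approximately zero

    UtU = U.T @ U                              # suppose U'U has been estimated in some manner

    # Eqs. (3)-(4): eigendecomposition of X'X - U'U gives C'C = L d^2 L'.
    evals, vecs = np.linalg.eigh(X.T @ X - UtU)
    keep = np.argsort(evals)[::-1][:r]         # retain the r largest eigenvalues
    d2, L = evals[keep], vecs[:, keep]

    F = L * np.sqrt(d2)                        # Eq. (5): factor loading matrix F = Ld
    resid = (X.T @ X - UtU) - F @ F.T          # Eq. (6): FF' reproduces C'C up to sampling error
    print(np.linalg.norm(resid) / np.linalg.norm(X.T @ X))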

There are two main indeterminacies in the formulations of factor analysis. The model presented in (1) and (2) is not unique: many matrices U and C exist for which (1) and (2) hold. For example, one may note that the matrix of unique scores can be composed of two orthogonal sources of error scores such that

U = E₁ + E₂    (7)

and

E₁'E₂ = 0.    (8)

Then

X'X = C'C + E₁'E₁ + E₂'E₂,    (9)

and another model for (3) is

X'X = C₁'C₁ + E₂'E₂,  where  C₁'C₁ = C'C + E₁'E₁.    (10)

A frequent procedure in solving the dilemma posed by this problem is to find diagonal values for U'U such that the trace of U'U is maximized, subject to the constraint that C'C remain Gramian, regardless of the resulting rank of C'C. Unfortunately, no procedures have been developed for an adequate solution to this problem, so that the current goal of choice appears to be that of finding the loading matrix F that minimizes the off-diagonal residual elements in (X'X - FF') for a given specified rank of F [3]. U'U is computed as a consequence of the choice of the matrix F.
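
The quantity being minimized can be written down directly. The sketch below (illustrative NumPy with a hypothetical helper name, offdiag_residual; it is not the actual procedure of [3]) evaluates the sum of squared off-diagonal residuals of (X'X - FF') for a trial loading matrix F of specified rank:

    import numpy as np

    def offdiag_residual(XtX, F):
        # Sum of squared off-diagonal elements of (X'X - FF'); a minres-type
        # criterion seeks the F of given rank minimizing this quantity, with
        # U'U then taken from the leftover diagonal.
        R = XtX - F @ F.T
        R = R - np.diag(np.diag(R))            # discard the diagonal
        return float(np.sum(R ** 2))

    # Toy usage with a random cross-product matrix and a rank-2 trial F.
    rng = np.random.default_rng(1)
    A = rng.standard_normal((100, 4))
    XtX = A.T @ A
    F_trial = rng.standard_normal((4, 2))
    print(offdiag_residual(XtX, F_trial))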

A second indeterminacy arises in factor analysis in the choice of the matrix F. If F provides one solution for the factor loading matrix in (5), then

FT = LdT    (11)

provides another solution compatible with the fundamental postulate presented in (6) if T is any square normal (orthonormal) matrix; that is,

FTT'F' = FF' = C'C.    (12)

Since an infinite number of T matrices exist, the solution for F is indeterminate unless additional constraints are placed upon F. The problem posed by a selection of T is usually called the problem of rotation or transformation. It is hoped that a transformation can be found that provides for "meaningful" interpretations of the matrix FT. It is only at this step that the problem of transformation traditionally enters the mathematical formulation of factor analysis.
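
A brief numerical check of the rotational indeterminacy in (11) and (12), again an illustrative NumPy sketch rather than part of the original text: any square orthonormal T leaves FF' unchanged.

    import numpy as np

    rng = np.random.default_rng(2)
    F = rng.standard_normal((6, 3))                      # a loading matrix F = Ld as in Eq. (5)
    T, _ = np.linalg.qr(rng.standard_normal((3, 3)))     # a random square orthonormal T, T'T = I

    FT = F @ T                                           # an alternative solution, Eq. (11)
    print(np.allclose(FT @ FT.T, F @ F.T))               # Eq. (12): FTT'F' = FF'  ->  True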

In this paper we describe a new orientation toward factor analysis that describes the entire factor analytic model as a model of transformation.

FORMULATION

We may write the Eckart-Young [1] decomposition of the observed score matrix X as

X = PbQ'    (13)

where P'P = I, Q'Q = I, and b is a diagonal matrix of order equal to or less than m. P and Q are sometimes known as the left and right orthonormals of X.
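
In present-day terms Eq. (13) is the singular value decomposition of X. A minimal NumPy sketch (a modern convenience, not part of the original presentation):

    import numpy as np

    rng = np.random.default_rng(3)
    n, m = 50, 4
    X = rng.standard_normal((n, m))

    P, b, Qt = np.linalg.svd(X, full_matrices=False)     # X = P b Q', with b the singular values
    print(np.allclose(X, P @ np.diag(b) @ Qt))           # Eq. (13) holds
    print(np.allclose(P.T @ P, np.eye(m)),               # P'P = I
          np.allclose(Qt @ Qt.T, np.eye(m)))             # Q'Q = I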

We assume Eq. (1) holds and that there exist real-numbered matrices C and U such that

C = KdL'    (K'K = I; L'L = I)    (14)

and

U = MgN'    (M'M = I; N'N = I),    (15)

where d is a diagonal matrix of order equal to or less than m and g is a diagonal matrix of order m. Further, we assume that K and M are orthogonal,

K'M = M'K = 0.    (16)

It then follows that C and U are orthogonal, since C'U = LdK'MgN' = 0. The traditional factor analytic equations (3) through (6) follow explicitly. Note that we have not required U'U to be a diagonal matrix, since U'U = Ng²N' is diagonal only if N = I; this additional restriction may be placed upon the U matrix, if one wishes, without difficulties for the ensuing discussion.

discussion.

Now we note that the matrices of the matris matrix,

= 0.

equations

required

= 0. It then follows

since

C’U = LdK’MgN’ The traditional

Al. BES7

X.

The matrix

C and U are themselves

K, usually

allows one to transform

transformations

called the common-factor

the X matrix

directly

score

to the C matrix

by the multiplication KK’X

= KK’(C

+ U) = KK’(KdL’

+ MgN’)

= KdL’ + 0 = C

(l’i)

Similarl\-, MM’X

= MM’(C =0+

Thus,

it is possible

as another

for this purpose. matrices technique

U.

(18)

X = KK’X

+ MM’S

formulation

of factor

no useful technique

K or M subject

of C’C = X’KK’X, C itself

MgN’=

has suggested

of transformation,

analysis.

itself

for estimating

If such procedures

were to be developed,

and the factor

analysis

can compute

as traditionally

procedures the matrix

score and factor directly.

the matrix

presented

the

of the trace or rank

it would seem possible to develop iterative

K and Ld of (14) could be calculated

still remain.

(19)

to (19) and the minimization

would be determinate, of factor

+ AfgiY’)

to write

multivariate

Although matrices

-1 U) = MM’(KdL’

in (ll),

loading

At present K.

no

The problem

would

of course

An interesting conclusion arises from the transformation model of factor analysis we have presented. If we represent the general inverse, or pseudo-inverse (e.g., [2, 5, 6]), of the matrices C and U as Cⁱ and Uⁱ in

Cⁱ = Ld⁻¹K' = (C'C)ⁱC'    (20)

and

Uⁱ = Ng⁻¹M' = (U'U)ⁱU',    (21)

we obtain some useful left inverses for the matrix X itself. When the rank of C is m, LL' = I, and hence if we premultiply X by Cⁱ, we obtain

CⁱX = Ld⁻¹K'(KdL' + MgN') = LL' = I.    (22)

Furthermore,

UⁱX = Ng⁻¹M'(KdL' + MgN') = NN' = I,    (23)

since we assumed the rank of U to be m. It would seem possible to utilize the information in (22) and (23) in iterative procedures designed to estimate C and U.
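
Finally, a short sketch of (20)-(23) under the full-rank condition rank(C) = rank(U) = m, so that LL' = NN' = I (again an illustrative NumPy construction, not part of the original paper):

    import numpy as np

    rng = np.random.default_rng(5)
    n, m = 40, 4

    # Full column rank construction with K'M = 0.
    Q, _ = np.linalg.qr(rng.standard_normal((n, 2 * m)))
    K, M = Q[:, :m], Q[:, m:]
    L = np.linalg.qr(rng.standard_normal((m, m)))[0]
    N = np.linalg.qr(rng.standard_normal((m, m)))[0]
    d = np.diag(rng.uniform(1.0, 2.0, m))
    g = np.diag(rng.uniform(0.1, 0.3, m))

    C, U = K @ d @ L.T, M @ g @ N.T
    X = C + U

    Ci = L @ np.linalg.inv(d) @ K.T            # Eq. (20): C^i = L d^-1 K'
    Ui = N @ np.linalg.inv(g) @ M.T            # Eq. (21): U^i = N g^-1 M'

    print(np.allclose(Ci, np.linalg.pinv(C)))  # agrees with the Moore-Penrose pseudo-inverse
    print(np.allclose(Ci @ X, np.eye(m)))      # Eq. (22): C^i X = I
    print(np.allclose(Ui @ X, np.eye(m)))      # Eq. (23): U^i X = I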

REFERENCES

1. C. Eckart and G. Young, The approximation of one matrix by another of lower rank, Psychometrika 1(1936), 211-218.
2. T. N. E. Greville, The pseudoinverse of a rectangular or singular matrix and its application to the solution of systems of linear equations, SIAM Rev. 1(1959), 38-43.
3. H. H. Harman and W. H. Jones, Factor analysis by minimizing residuals (minres), Psychometrika 31(1966), 351-368.
4. P. Horst, Factor analysis of data matrices, Wiley, New York, 1965.
5. R. Penrose, A generalized inverse for matrices, Proc. Camb. Phil. Soc. 51(1955), 406-413.
6. R. Rado, Note on generalized inverses of matrices, Proc. Camb. Phil. Soc. 52(1956), 600-601.