MATHEMATICAL BIOSCIENCES

A Data Transformation Model for Factor Analysis*

P. M. BENTLER

Communicated by Richard Bellman

The traditional approach to factor analysis views this multivariate data analysis technique primarily as a method for determining component sources of variation in a derived matrix that excludes the unsystematic or error variation found in the original data matrix. The problem of transformation is only of concern in the final step of the procedure, that of finding meaningful locations for the reference vectors. In this article we describe an orientation toward factor analysis viewing it primarily as a technique for transforming the data matrix directly into factor score and factor loading matrices.
INTRODUCTION

Although a number of multivariate models exist for linear analysis of an n by m (n > m) matrix X, almost all are variations of the data analysis technique known as factor analysis. The techniques of factor analysis have recently been described by Horst [4]. We may review the essentials. The matrix X of observed scores is hypothesized to be composed of a common component C and a unique component U such that

X = C + U    (1)

and

C'U = U'C = 0.    (2)

* Presented at the January, 1967, meetings of the California State Psychological Association.

Mathematical Biosciences 2, 145-149 (1968)
Copyright © 1968 by American Elsevier Publishing Company, Inc.
Thus U and C are assumed orthogonal. The matrices U and C are generally unknown. It follows immediately that

X'X = C'C + U'U.    (3)

U'U is traditionally assumed diagonal, and is estimated in some manner. The matrix (X'X - U'U) is then analyzed for its eigenvectors L and eigenvalues d²,

C'C = Ld²L',    (4)

where d² is a diagonal matrix and L'L = I, the identity matrix. Then

F = Ld    (5)

is the factor loading matrix representing the degree of association of variables (columns of X) and factors (columns of F), and

C'C = FF'    (6)

is the "fundamental postulate" of factor analysis.
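As a numerical illustration of the review above, the following sketch (not part of the original article; the matrix sizes, the synthetic data, and the use of the true diagonal of U'U as its estimate are assumptions made only for illustration) constructs a data matrix of the form (1), forms X'X - U'U as in (3), and extracts the loading matrix F = Ld of (4) and (5).

import numpy as np

rng = np.random.default_rng(0)
n, m, r = 200, 6, 2                        # n observations, m variables, r common factors (assumed sizes)

# Construct X = C + U with C of rank r and U approximately orthogonal to C, as in Eqs. (1)-(2).
scores = rng.standard_normal((n, r))
pattern = rng.standard_normal((r, m))
C = scores @ pattern                       # common part
U = 0.3 * rng.standard_normal((n, m))      # unique part

X = C + U
XtX = X.T @ X

# Estimate U'U by a diagonal matrix (here simply the true diagonal, purely for illustration).
UtU_est = np.diag(np.diag(U.T @ U))

# Eigendecomposition of (X'X - U'U): C'C = L d^2 L', Eq. (4).
evals, L = np.linalg.eigh(XtX - UtU_est)
order = np.argsort(evals)[::-1][:r]        # keep the r largest roots
d = np.sqrt(evals[order])
L = L[:, order]

F = L * d                                  # factor loading matrix F = Ld, Eq. (5)
resid = (XtX - UtU_est) - F @ F.T          # fundamental postulate, Eq. (6)
# Relative residual; small but not zero, since (1)-(2) hold only approximately for this synthetic X.
print(np.linalg.norm(resid) / np.linalg.norm(XtX))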
There are two main indeterminacies in the formulations of factor analysis. The model presented in (1) and (2) is not unique. Many matrices U and C exist for which (1) and (2) hold. For example, one may note that the matrix of unique scores can be composed of two orthogonal sources of error scores such that

U = E₁ + E₂    (7)

and

E₁'E₂ = 0.    (8)

Then another model for (3) is

X'X = C'C + E₁'E₁ + E₂'E₂ = C₁'C₁ + E₂'E₂,    (9)

where

C₁'C₁ = C'C + E₁'E₁.    (10)

A frequent goal in solving the dilemma posed by this problem of choice is to find diagonal values for U'U such that the trace of the resulting C'C is maximized, subject to the constraint that C'C remain Gramian, regardless of the rank of C'C. Unfortunately, no procedures have been developed for an adequate solution to this problem, so that the current procedure appears to be that of finding the loading matrix F that minimizes the off-diagonal residual elements of the matrix (X'X - FF') for a given specified rank of F [3]. U'U is computed as a consequence of the choice of the matrix F.
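The non-uniqueness in (7) through (10) can be checked directly. The sketch below is illustrative only (the construction of C, E₁, and E₂ and the least-squares orthogonalization are assumptions, not part of the article); it builds two error sources orthogonal to each other and to C, and shows that absorbing E₁ into the common part leaves X'X unchanged.

import numpy as np

rng = np.random.default_rng(1)
n, m = 200, 6

def orth_against(A, B):
    # Remove from A its least-squares projection onto the column space of B.
    return A - B @ np.linalg.lstsq(B, A, rcond=None)[0]

C = rng.standard_normal((n, 2)) @ rng.standard_normal((2, m))              # common part
E1 = 0.3 * orth_against(rng.standard_normal((n, m)), C)                    # first error source, C'E1 = 0
E2 = 0.3 * orth_against(rng.standard_normal((n, m)), np.hstack([C, E1]))   # second source, orthogonal to C and E1

U = E1 + E2                          # Eq. (7); E1'E2 = 0 as in Eq. (8)
X = C + U
C1 = C + E1                          # absorb one error source into the "common" part

print(np.allclose(X.T @ X, C1.T @ C1 + E2.T @ E2))   # Eq. (9): the same X'X from either decomposition
print(np.allclose(C1.T @ C1, C.T @ C + E1.T @ E1))   # Eq. (10)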
A second indeterminacy arises in factor analysis. If F provides one solution for the factor loading matrix in (5), then

FT = LdT    (11)

provides another solution compatible with the fundamental postulate presented in (6) if T is any square normal (orthonormal) matrix; that is,

FTT'F' = FF' = C'C.    (12)

Since an infinite number of such T matrices exist, the mathematical formulation of factor analysis is indeterminate unless additional constraints are placed upon F. The problem posed by a selection of T is usually called the problem of rotation or transformation. It is only at this step that the problem of transformation traditionally enters the factor analytic solution. It is hoped that a transformation can be found that provides for "meaningful" interpretations of the matrix FT. In this paper we describe a new orientation toward factor analysis that describes the entire factor analysis model as a model of transformation.
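The rotational indeterminacy of (11) and (12) is easy to exhibit numerically. In the sketch below (illustrative sizes and a random orthonormal T, not taken from the article), any orthonormal T leaves FF' unchanged.

import numpy as np

rng = np.random.default_rng(2)
m, r = 6, 2
F = rng.standard_normal((m, r))            # any factor loading matrix (illustrative)

# Any square orthonormal T (here obtained from a QR factorization) gives an equally valid solution FT.
T, _ = np.linalg.qr(rng.standard_normal((r, r)))
FT = F @ T

# Eqs. (11)-(12): FT reproduces the same matrix FF' = C'C, so the rotation is undetectable from C'C.
print(np.allclose(FT @ FT.T, F @ F.T))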
FORMULATION
We may write the Eckart-Young [1] decomposition of the observed score matrix X as

X = PbQ',    (13)

where P'P = I, Q'Q = I, and b is a diagonal matrix of order equal to or less than m. P and Q are sometimes known as the left and right orthonormals of X.
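In current numerical practice the decomposition (13) is obtained as the singular value decomposition of X. The following sketch is an illustration under assumed sizes, not part of the original article.

import numpy as np

rng = np.random.default_rng(3)
n, m = 200, 6
X = rng.standard_normal((n, m))

# Eckart-Young decomposition X = P b Q', Eq. (13): the "economy size" singular value decomposition.
P, b, Qt = np.linalg.svd(X, full_matrices=False)    # P: n x m, b: m singular values, Qt = Q'

print(np.allclose(X, P @ np.diag(b) @ Qt))          # X is reconstructed exactly
print(np.allclose(P.T @ P, np.eye(m)), np.allclose(Qt @ Qt.T, np.eye(m)))  # P'P = I, Q'Q = I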
We assume Eq. (1) holds and that there exist real-numbered matrices C and U such that

C = KdL'    (K'K = I; L'L = I)    (14)

and

U = MgN'    (M'M = I; N'N = I),    (15)

where d is a diagonal matrix of order equal to or less than m and g is a diagonal matrix of order m. Further, we assume K'M = 0, so that C and U are orthogonal, since

C'U = LdK'MgN' = 0.

Note that we have not explicitly required

U'U = Ng²N'    (16)

to be a diagonal matrix; the traditional factor analytic equations (3) through (6) follow only if N = I. This additional restriction may be placed upon the U matrix, if one wishes, without difficulties for the ensuing discussion.
Now we note that the matrices C and U are themselves transformations of the matrix X. The matrix K, usually called the common-factor score matrix, allows one to transform the X matrix directly to the C matrix by the multiplication

KK'X = KK'(C + U) = KK'(KdL' + MgN') = KdL' + 0 = C.    (17)

Similarly,

MM'X = MM'(C + U) = MM'(KdL' + MgN') = 0 + MgN' = U.    (18)

Thus, it is possible to write

X = KK'X + MM'X    (19)

as another formulation of factor analysis. Although at present no useful technique has suggested itself for estimating the matrices K or M subject to (19) and the minimization of the trace or rank of C'C = X'KK'X, it would seem possible to develop iterative procedures for this purpose. If such procedures were to be developed, the factor analysis would be determinate, and the factor score and factor loading matrices K and Ld of (14) could be calculated directly; one could also compute the matrix C itself from K. The problem of transformation, as traditionally presented in (11), would of course still remain.
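The transformations (17) through (19) can be verified directly once matrices satisfying (14) through (16) and K'M = 0 are in hand. The sketch below is illustrative only (the sizes, the random orthonormal bases, and the diagonal entries are assumptions, not part of the article); it constructs such matrices and checks that KK'X and MM'X recover C and U.

import numpy as np

rng = np.random.default_rng(4)
n, m, r = 200, 6, 2

# Build orthonormal K (n x r) and M (n x m) with K'M = 0 by splitting one QR basis.
Q, _ = np.linalg.qr(rng.standard_normal((n, r + m)))
K, M = Q[:, :r], Q[:, r:]

d = np.diag(rng.uniform(1.0, 3.0, size=r))          # diagonal, order r <= m
g = np.diag(rng.uniform(0.1, 0.5, size=m))          # diagonal, order m
L, _ = np.linalg.qr(rng.standard_normal((m, r)))    # L'L = I
N, _ = np.linalg.qr(rng.standard_normal((m, m)))    # N'N = I

C = K @ d @ L.T                                     # Eq. (14)
U = M @ g @ N.T                                     # Eq. (15)
X = C + U

# Eqs. (17)-(19): K and M transform X directly into its common and unique parts.
print(np.allclose(K @ K.T @ X, C))                  # KK'X = C
print(np.allclose(M @ M.T @ X, U))                  # MM'X = U
print(np.allclose(K @ K.T @ X + M @ M.T @ X, X))    # X = KK'X + MM'X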
An interesting conclusion arises from the transformation model of factor analysis we have presented. If we represent the general inverse, or pseudo-inverse (e.g., [2, 5, 6]), of the matrices C and U as Cⁱ and Uⁱ in

Cⁱ = Ld⁻¹K' = (C'C)ⁱC'    (20)

and

Uⁱ = Ng⁻¹M' = (U'U)ⁱU',    (21)

we obtain some useful left inverses for the matrix X itself. If we premultiply X by Cⁱ, we obtain

CⁱX = LL'.    (22)

When the rank of C is m, LL' = I, and hence CⁱX = I. Furthermore,

UⁱX = NN' = I    (23)

since we assumed the rank of U to be m. It would seem possible to utilize the information in (22) and (23) in iterative procedures designed to estimate C and U.
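The left-inverse relations (20) through (23) can be checked with the Moore-Penrose inverse. The sketch below (again illustrative: the rank of C is taken equal to m, and all matrices are randomly generated) uses the same kind of construction as above, now with a full-rank common part.

import numpy as np

rng = np.random.default_rng(5)
n, m = 200, 6

# Orthonormal K and M with K'M = 0; C = KdL' has full column rank m so that C^i X = I holds.
Q, _ = np.linalg.qr(rng.standard_normal((n, 2 * m)))
K, M = Q[:, :m], Q[:, m:]
d = np.diag(rng.uniform(1.0, 3.0, size=m))
g = np.diag(rng.uniform(0.1, 0.5, size=m))
L, _ = np.linalg.qr(rng.standard_normal((m, m)))
N, _ = np.linalg.qr(rng.standard_normal((m, m)))

C = K @ d @ L.T
U = M @ g @ N.T
X = C + U

Ci = np.linalg.pinv(C)                              # Moore-Penrose inverse of C, Eq. (20)
Ui = np.linalg.pinv(U)                              # Moore-Penrose inverse of U, Eq. (21)

print(np.allclose(Ci, L @ np.linalg.inv(d) @ K.T))  # pseudo-inverse agrees with the closed form Ld^{-1}K'
print(np.allclose(Ci @ X, np.eye(m)))               # C^i X = I, Eq. (22), rank of C equal to m
print(np.allclose(Ui @ X, np.eye(m)))               # U^i X = I, Eq. (23), rank of U equal to m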
REFERENCES

1. C. Eckart and G. Young, The approximation of one matrix by another of lower rank, Psychometrika 1(1936), 211-218.
2. T. N. E. Greville, The pseudoinverse of a rectangular or singular matrix and its application to the solution of systems of linear equations, SIAM Rev. 1(1959), 38-43.
3. H. H. Harman and W. H. Jones, Factor analysis by minimizing residuals (minres), Psychometrika 31(1966), 351-368.
4. P. Horst, Factor analysis of data matrices, Wiley, New York, 1965.
5. R. Penrose, A generalized inverse for matrices, Proc. Camb. Phil. Soc. 51(1955), 406-413.
6. R. Rado, Note on generalized inverses of matrices, Proc. Camb. Phil. Soc. 52(1956), 600-601.