
Factor Analysis and Latent Structure Analysis: Overview

David J Bartholomew, London School of Economics, London, UK

© 2015 Elsevier Ltd. All rights reserved.

Abstract

For over 100 years, factor analysis has probably been the most widely used multivariate technique among social scientists. It is still not well known that it is one of a family of methods which share the same theoretical framework and pose similar problems of interpretation. These are known, collectively, as latent variable methods. The family also includes latent class analysis and latent profile analysis. The family connection is described here within a framework designed to show what the methods have in common. Latent variables play a fundamental role in social science, and some occurrences beyond this family are briefly touched upon.

Introduction

The terms factor analysis and latent structure analysis refer to two aspects of essentially the same problem. In practice, a 'factor,' as the term is used in factor analysis, is simply a latent variable, and the failure to recognize this equivalence has often caused much misunderstanding. In ordinary statistical usage the term 'factor' is used, rather loosely, to refer to almost any variable. The reason for this imprecision is historical, in that factor analysis is possibly the oldest multivariate statistical method. It was invented by Charles Spearman around 1904, and was developed and initially used almost exclusively by psychologists. It is now possible to see the two strands in their proper relationship.

A latent variable is any variable occurring in a statistical problem which cannot be observed. This may be because it is inconvenient or impracticable to observe it, or because it cannot be observed even in principle. The latter case includes what we might call hypothetical variables, which are introduced into a statistical problem because they simplify the interpretation of the results. For example, one of the earliest and most enduring examples of a latent variable is 'general ability.' This is closely related to the subject of Spearman's original work on general intelligence. It is commonly found that scores on tests of arithmetic, for example, made on the same individual are highly correlated with one another. This is usually put down to the fact that all such scores depend on an underlying ability to perform such tests. By introducing a latent variable representing arithmetical ability, the interpretation of the data can be considerably simplified. Sometimes the object of the analysis is less specific and we merely inquire whether there might be any latent variable which plays such an explanatory role, without knowing in advance what it is. The problem then is to identify what that latent variable might be in terms which make substantive sense.

Viewed in this way, latent variable analysis is a special case of a much wider range of statistical problems concerning unobserved variables, which has been treated in Bartholomew (2013). These include missing values, simple mixtures, and, when viewed appropriately, time series analysis. Indeed, the whole of statistics is included if we approach the subject from a Bayesian perspective, where parameters are treated, in effect, as latent variables.

The Basic Idea

Although factor analysis and other latent variable methods are often regarded as relatively advanced multivariate methods, they all depend on a very simple idea. This idea is very familiar to students in elementary statistics courses under the guise of what is called 'spurious' correlation. There it is explained that a correlation between two variables may not be due to a real causal link between them, but may be caused by the presence of a third variable which is related to both. This common influence results in a correlation which is properly called 'spurious.' To take an old example where the relationships are now relatively well understood, consider the variables involved in cancer of the lung. It has long been known that the amount of tobacco smoke inhaled by an individual is positively correlated with the risk of cancer of the lung. But it could be argued that this correlation could be spurious because it might be induced by some other variable. For example, living in an urban environment, with its polluted atmosphere, might be the cause of both smoking and lung cancer. Had this been the case, the correlation between smoking and lung cancer would have vanished if the analysis had been done for subgroups whose members all had the same exposure to polluted air. In essence, the general aim is to find other variables which, if held constant, would eliminate the correlation between smoking and cancer.

In any latent variable analysis the aim is to 'explain' the correlations between observed variables by showing that they could have arisen because of their dependence on other, unobserved, variables. This idea may be further illustrated by a numerical example using only binary variables called A and B and expressing their relationship in the familiar form of a 2 × 2 table.

          A = 1    A = 0    Total
B = 1       700      400     1100
B = 0       300      600      900
Total      1000     1000     2000


This table shows a positive association between A and B: if A = 1, B is more likely to be 1 than if A = 0. Suppose now there is a third binary variable called C. We might then look at the association between A and B for each of the two values taken by C. Suppose the breakdown gives the following two tables.

C = 1
          A = 1    A = 0    Total
B = 1       640      160      800
B = 0       160       40      200
Total       800      200     1000


C = 0
          A = 1    A = 0    Total
B = 1        60      240      300
B = 0       140      560      700
Total       200      800     1000

It is clear that there is no association between A and B in either of the last two tables; the frequencies in the first two columns are in exactly the same proportion as in the final column. This implies that the association in the first table is entirely spurious because, when we hold C constant, the association vanishes.

Although this highly simplified example conveys the basic idea, real problems are much more complicated. The variables need not be binary but could be continuous, categorical (with or without ordering of the categories), or any mixture of the two. In our example there was only one latent variable, C. In practice there may be several and, as with the observed variables, they may be continuous or categorical. In all cases the essential idea is to find whether variables exist, like C, such that, when we condition on them, the observed variables become independent. We now go over this ground again, but this time in greater generality.
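Before doing so, the arithmetic of the example can be checked mechanically. The short sketch below (an illustration added here, using only the counts in the tables above) compares the conditional proportions Pr{B = 1 | A} overall and within each level of C:

```python
import numpy as np

# Counts from the tables above, indexed [B, A] with columns (A=1, A=0).
overall = np.array([[700, 400],    # B = 1
                    [300, 600]])   # B = 0
c1 = np.array([[640, 160],
               [160,  40]])
c0 = np.array([[ 60, 240],
               [140, 560]])

def p_b1_given_a(table):
    """Pr{B = 1 | A = 1} and Pr{B = 1 | A = 0} from a 2x2 count table."""
    return table[0] / table.sum(axis=0)

print(p_b1_given_a(overall))  # [0.7 0.4] -- A and B associated marginally
print(p_b1_given_a(c1))       # [0.8 0.8] -- no association given C = 1
print(p_b1_given_a(c0))       # [0.3 0.3] -- no association given C = 0
```

Conditioning on C equalizes the proportions: exactly the conditional independence that latent variable models seek.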

Formal Statement of the Conceptual Framework

The last section sets out the key idea underlying all latent variable models. Now the general approach is expressed more formally in mathematical terms. This is essential if the full generality is to be grasped but, to ease the path of the nonmathematical reader, we treat only the case of continuous variables and simplify the presentation by omitting details such as the limits of integration. Full details will be found in Bartholomew et al. (2011), but the reader who is interested only in specific models may pass immediately to later sections.

Denote the set of observed variables by the p-vector x. Its elements, $x_1, x_2, \ldots, x_p$, may be continuous or categorical or a mixture of the two. These are the generalization of the binary variables A and B of the last section and they are often referred to as 'manifest' variables. The latent variables are similarly denoted by the q-vector z. The essential thing about this kind of model is that z is not observed. For this reason everything must be expressed in terms of the distribution of the observable variables x. The relationship between the two can be obtained by using standard probability methods as

$f(x) = \int f(x, z)\,dz$    [1]

Here, as already noted, the equation is written as if the variables were all continuous – hence the integral in eqn [1]. This is purely for convenience and the translation needed to cover other cases does not affect the logic of the argument. The next step is to write

$f(x, z) = f(z)\,f(x \mid z)$    [2]

This decomposition provides us with the two elements to which the description of the last section referred – namely a statement of how the manifest variables would vary if z were known, given here by $f(x \mid z)$, and the prior distribution of z. If both parts of eqn [2] can be specified, f(x) can be determined and thus the way is open to estimate any parameters of the model from a likelihood function constructed from f(x). To scale z, that is, to find a value of z for an individual with a given x, we may use the posterior distribution given by

$f(z \mid x) = f(z)\,f(x \mid z)/f(x)$    [3]

A scale value, or factor score as it is known, for a particular z can then be found by using some measure of location of its posterior distribution, such as $E(z_i \mid x)$.

This specification is deceptively simple. It merely spells out some of the consequences of regarding x and z as random variables. To turn it into a model we have to specify f(z) and $f(x \mid z)$. Here we face a fundamental problem because the decomposition of eqn [2] is not unique – there are infinitely many possible pairs, each leading to the same f(x) and hence indistinguishable empirically from one another. The choice of f(z) is resolved by appealing to the fact, established above, that, since z is a latent variable, it can be constructed to have any convenient distribution. For instance, as a matter of convention one may choose the z's to be standard normal and mutually independent.

The core of the model-building exercise centers on the choice of the conditional distribution $f(x \mid z)$. This has to be considered in the context of individual problems but there is one further simplification which applies to all models. This simplification involves making explicit what is already implicit in our formulation. The fact that the x's are supposed to be indicators of z implies that the x's will be mutually correlated – because of their common dependence on some, at least, of the z's. Hence, if z were to be fixed, the source of those correlations would vanish. Conditional on z, the x's would then be independent. If this were not so, it could be inferred that there was at least one other z, not already included, giving rise to the correlations remaining. Sufficient z's therefore have to be included in the model to eliminate the correlations after conditioning on z. It then follows that

$f(x \mid z) = \prod_{i=1}^{p} f_i(x_i \mid z)$    [4]

This is often referred to as the assumption of conditional (or local) independence but it is not an assumption in the usual sense. It is, rather, a statement of what is meant by saying that z explains the interdependence of the x's. It is, essentially, a definition of q as the smallest integer for which an equation of the form [4] can be constructed.

How this may be done in particular cases will be shown in the following sections, but there is an important general principle which offers a fruitful way forward. Distributions of the exponential family are widely used in statistical modeling because they lead to estimates with highly desirable properties, such as sufficiency. They provide a rationale for many of the most important models used in statistics, such as generalized linear models. Essentially the same properties can be utilized here. They are set out in the article entitled The General Linear Latent Variable Model in Bartholomew et al. (2011). The essential idea is to make what is sometimes called the 'natural' or 'canonical' parameter of the one-parameter exponential family distribution a linear function of the latent variables. If that is done, it turns out that the posterior distribution of z given x of eqn [3] depends on the x's only through q linear combinations of the x's, which we may write

$X_j = \sum_{i=1}^{p} a_{ij} x_i \quad (j = 1, 2, \ldots, q)$    [5]

This is a remarkable result, which justifies the common practice of 'estimating' latent variables from sums, or linear combinations, of observed variables.
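As a concrete illustration of eqns [1]–[3], the sketch below (a hypothetical toy model, not taken from the original article) posits a single standard normal latent variable z and Bernoulli manifest variables whose logits are linear in z, then computes f(x) and the factor score E(z | x) by numerical integration; the intercepts and slopes are invented for illustration:

```python
import numpy as np

# Toy model: z ~ N(0, 1); x_i | z ~ Bernoulli(p_i(z)), logit p_i(z) = a_i + b_i * z.
a = np.array([-0.5, 0.0, 0.5, 1.0])   # intercepts (assumed values)
b = np.array([ 1.0, 1.5, 0.8, 1.2])   # slopes (assumed values)
x = np.array([ 1,   0,   1,   1  ])   # an observed response pattern

z = np.linspace(-6, 6, 2001)                        # integration grid
prior = np.exp(-z**2 / 2) / np.sqrt(2 * np.pi)      # f(z): standard normal density

p = 1 / (1 + np.exp(-(a[:, None] + b[:, None] * z)))         # p_i(z) on the grid
lik = np.prod(np.where(x[:, None] == 1, p, 1 - p), axis=0)   # f(x|z), as in eqn [4]

fx = np.trapz(prior * lik, z)                       # f(x), eqn [1]
posterior = prior * lik / fx                        # f(z|x), eqn [3]
print(np.trapz(z * posterior, z))                   # factor score E(z|x)
```

Because the logits are linear in z, the posterior depends on x only through the linear combination $\sum_i b_i x_i$, in line with eqn [5].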

Structure of the Family

As we have already noted, latent variables can be classified into several types. Many, like intelligence, are conceived of as being continuous, in which case we are looking for a scale on which individuals can be located. In other contexts it is more appropriate to think of the latent variable as categorical. In that case individuals are supposed to belong to one of several classes, which may or may not be ordered. What is true of the latent variables is, of course, true for the manifest variables, and the only essential difference between the various methods is in the types of variables for which they are appropriate. A convenient way of displaying the relationship between the methods is to introduce the fourfold classification of Table 1. We classify the manifest and the latent variables as 'metrical' or 'categorical.' In the former case we mean that they take values on some continuous or discrete scale of measurement; in the latter they fall into categories.

Table 1    Classification of latent variable methods

                         Manifest variables
Latent variables         Metrical                     Categorical
Metrical                 Factor analysis              Latent trait analysis
Categorical              Latent profile analysis      Latent class analysis

This classification is not exhaustive. It does not, for example, include cases where the two kinds of variables occur together among the manifest (or latent) variables. Neither does it distinguish between continuous and discrete metrical variables, nor between ordered and unordered categorical variables. Nevertheless, it does serve to provide a broad and convenient framework in which to place the most widely used methods covered in this article. The terminology reflects the diverse origins of the subject, accumulated over almost a century.

Members of the Family

Factor Analysis

This is the oldest and most widely used latent variable method. The model is usually written as

$x = \mu + \Lambda z + e$    [6]

where $E(z) = E(e) = 0$; $\mathrm{cov}(e) = \Psi$, a diagonal matrix; $\mathrm{cov}(z, e) = 0$; and $z \sim N(0, I)$. An equivalent way of writing the model, which conforms more closely to the general treatment above, is to specify the conditional distribution of x given z as

$x \mid z \sim N(\mu + \Lambda z, \Psi)$    [7]

which makes it clear that z influences x only through its mean, which is linear in z. The marginal distribution of x is $N(\mu, \Lambda\Lambda' + \Psi)$ and it is from this that the parameters $\Lambda$ and $\Psi$ must be estimated. The matrix $\Lambda\Lambda' + \Psi$ is sometimes referred to as the covariance structure. Fitting the model consists in determining $\Lambda$ and $\Psi$ so as to make the fitted and observed covariances of the x's as close as possible. This can be done without invoking the distributional assumptions at all, since the covariance structure derived from eqn [6] is the same regardless of the assumptions about the form of the distributions of z and e. However, without such assumptions nothing can be said about goodness of fit or the sampling variation of the estimators. Posterior analysis allows the distribution of z given x to be determined, which for the factor model leads to

$z \mid x \sim N\big(\Lambda' \Sigma^{-1}(x - \mu),\ (I + \Lambda' \Psi^{-1} \Lambda)^{-1}\big)$    [8]

where $\Sigma = \Lambda\Lambda' + \Psi$. As predictors of z given x we can use the mean values of this distribution, replacing the parameters by their sample estimates.
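A minimal numerical sketch of eqns [6]–[8] follows (simulated data; scikit-learn's FactorAnalysis is used purely for illustration, not because the article prescribes it):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n, p, q = 5000, 6, 2

# Simulate from eqn [6]: x = mu + Lambda z + e, with diagonal cov(e) = Psi.
Lambda = rng.normal(size=(p, q))
psi = rng.uniform(0.3, 1.0, size=p)
z = rng.normal(size=(n, q))
X = z @ Lambda.T + rng.normal(scale=np.sqrt(psi), size=(n, p))

fa = FactorAnalysis(n_components=q).fit(X)
L, Psi = fa.components_.T, np.diag(fa.noise_variance_)   # estimated Lambda, Psi

# Factor scores via the posterior mean of eqn [8]: Lambda' Sigma^{-1} (x - mu).
Sigma = L @ L.T + Psi
scores = (X - fa.mean_) @ np.linalg.inv(Sigma) @ L

# Should match sklearn's own scores: its formula is the same quantity,
# rewritten via the Woodbury identity.
print(np.allclose(scores, fa.transform(X)))
```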

Interpretation and Rotation in the Case of the Factor Model

The interpretation of factors involves procedures that are common to all linear latent variable models. They are most familiar in the context of factor analysis so, although treated here, their generality must be emphasized. The problem may arise in two forms. If there is some idea of what factors to expect (as in the arithmetical ability example above), the question is how to confirm their existence. If, on the other hand, the analysis is purely exploratory, further guidance is needed on how to assign meaning to any factor that is found. The essential idea for doing this is to look at the relationships between the latent and manifest variables. This is facilitated by noting that the element $\lambda_{ij}$ of $\Lambda$ may be interpreted as the correlation between $x_i$ and $z_j$. Manifest variables which are highly correlated with $z_i$, say, have a lot in common with $z_i$, which, in turn, is highly influential in determining those x's. It must then be asked what it is that those x's have in common with one another which they also share with $z_i$.

Interpretation is often facilitated by a process known as 'rotation.' Thus far the treatment has supposed that there was only one solution to the problem of fitting the model. In fact there are infinitely many models which all predict the same covariance structure. These solutions can be generated from one another by a procedure known as 'rotation' (because of its geometrical interpretation). Each such solution represents an equivalent way of describing the latent space, but some ways may be easier to interpret than others. For example, a solution for $\Lambda$ which has at most one nonzero value in each row implies that the x's have been divided into disjoint groups, each group depending on a single latent variable. That latent variable is then interpreted in the light of what the members of that subset have in common.
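Continuing the sketch above, a rotation criterion such as varimax can be applied at the fitting stage. The snippet below assumes a recent scikit-learn (0.24 or later, which exposes a varimax option) and simply inspects the rotated loadings:

```python
from sklearn.decomposition import FactorAnalysis

# Varimax seeks a rotation that drives each row of the loading matrix
# toward a single dominant entry, easing interpretation.
fa_rot = FactorAnalysis(n_components=2, rotation="varimax").fit(X)
print(fa_rot.components_.T.round(2))   # rotated loadings: rows = variables
```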

Latent Class Models

If the latent space consists of a finite (usually small) set of classes and if the manifest variables are also categorical, the bottom right cell of Table 1 applies. The commonest case is where the manifest variables are binary. If there are c latent classes, the prior distribution f(z) becomes a discrete probability distribution over the classes, $\eta_j$ being the probability of belonging to class j. In this case these probabilities can be treated as unknown parameters to be estimated. If, for convenience, the two values which each $x_i$ takes are coded as 1 and 0, the conditional distribution $f_i(x_i \mid z)$ consists of two probabilities: $\Pr\{x_i = 1 \mid z\}$ and $\Pr\{x_i = 0 \mid z\}$. The obvious choice for the distribution of $x_i$ is the Bernoulli distribution, written as

$f_i(x_i \mid j) = \{\pi_i(j)\}^{x_i} \{1 - \pi_i(j)\}^{1 - x_i} \quad x_i = 0, 1; \; j = 1, 2, \ldots, c$    [9]

The joint probability distribution of x is then

$f(x) = \sum_{j=1}^{c} \eta_j \prod_{i=1}^{p} \{\pi_i(j)\}^{x_i} \{1 - \pi_i(j)\}^{1 - x_i}$    [10]

This can be used to construct the likelihood function from which the parameters may be estimated. Posterior analysis for this case is concerned with allocating individuals to latent classes after x has been observed, and this is done easily by substituting into eqn [3]. This gives us the probability that an individual with observed vector x belongs to each of the classes. The number of parameters in this model increases rapidly with c, and serious problems of identifiability arise if there are more than, say, four or five classes. The standard errors of the parameter estimates also increase rapidly as c increases.
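A small sketch of eqns [9] and [10] and of the posterior allocation follows (the parameter values here are invented for illustration):

```python
import numpy as np

eta = np.array([0.4, 0.6])                 # class probabilities eta_j
pi = np.array([[0.9, 0.8, 0.7, 0.6],       # pi_i(1): item probabilities, class 1
               [0.2, 0.3, 0.1, 0.4]])      # pi_i(2): item probabilities, class 2
x = np.array([1, 1, 0, 1])                 # an observed binary response pattern

# f(x | class j): eqn [9], multiplied over items as in eqn [4].
lik = np.prod(pi**x * (1 - pi)**(1 - x), axis=1)

fx = np.sum(eta * lik)                     # f(x), eqn [10]
posterior = eta * lik / fx                 # Pr{class j | x}, via eqn [3]
print(posterior)                           # allocate to the most probable class
```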

Latent Trait Models

These models were devised primarily for use in educational testing, where the latent trait refers to some ability. Thus there is usually only one latent variable, representing that ability, and many indicators. The latter are often binary, corresponding to responses to the items being 'right' or 'wrong.' Latent trait modeling has become a specialized field, often referred to as item response theory, with a literature and notation of its own. The model may also be used in other fields, such as sociology, where it may be more appropriate to introduce several latent variables.

A latent trait model with binary x's is similar to a latent class model. The prior distribution is now continuous and will usually be taken as standard normal. The response variables will be Bernoulli random variables but now they depend on the continuous latent variables. Since the Bernoulli distribution is a member of the exponential family, the appropriate form for $\pi_i(z)$ turns out to be

$\operatorname{logit} \pi_i(z) = \alpha_i + \alpha_{i1} z_1 + \alpha_{i2} z_2 + \cdots + \alpha_{iq} z_q$    [11]

Other widely used versions of the model exist in which the logit function on the left-hand side of eqn [11] is replaced by $\Phi^{-1}(\cdot)$ (the inverse standard normal distribution function). These give very similar results to eqn [11] but they lack the sufficiency properties. If q = 1 and the $\alpha_{ij}$'s are the same for all i, the model is a random effects version of the Rasch model (see, for example, Bartholomew, 1996). In the latter model the trait values are taken as parameters rather than random variables so, in the strict sense of classical inference, the Rasch model is not a latent variable model.

Other Models

The latent profile models of Table 1 have not been included here because they have found relatively little practical application. It has, however, been shown by Molenaar and von Eye (1994) that they are virtually indistinguishable from factor models, in the sense that they have an equivalent covariance structure. Nor have we considered hybrid models in which the manifest and latent variables may be of mixed type, discrete or continuous. A particular advantage of our general approach is that it lends itself to the treatment of all such models, as exemplified in Bartholomew et al. (2011).

Relationships between Latent Variables

In much sociological research, the interest is not so much in the latent variables themselves, as individual entities, but in the relationships between them. The interrelationship of manifest variables is a major subject of investigation in social science, covering techniques such as path analysis, regression, log-linear analysis, and graphical models. It is natural to want to extend these models to include latent variables. This has given rise to linear structural relations modeling, which can be implemented using widely available software packages such as LISREL, Mplus, AMOS, and EQS. In essence, such models have two parts: (1) a measurement model, which supposes that each latent variable is linked to its own set of indicators through a factor model, and (2) a structural model, which specifies the (linear) relationships among the latent variables. Such a model is fitted by equating the observed and theoretical covariance matrices. A much more general framework, which allows a wider range of models, is provided by Bartholomew et al. (2011). Although these methods are very widely used, serious questions have been raised about the identifiability of the models (see Anderson and Gerbing, 1988; Croon and Bolck, 1997; Bartholomew et al., 2011). These authors have suggested that it may be better to separate the measurement and structural parts of the analysis. This can be done by constructing 'estimates' of the latent variables and then exploring their interrelationships by the traditional methods used for manifest variables.
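The two-step route just described can be sketched as follows (hypothetical simulated data; factor scores are estimated per block and then related by ordinary regression):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(1)
n = 2000

# Two latent variables, each measured by its own set of three indicators.
z1 = rng.normal(size=n)
z2 = 0.6 * z1 + rng.normal(scale=0.8, size=n)          # structural relation
X1 = np.outer(z1, [0.9, 0.8, 0.7]) + rng.normal(scale=0.5, size=(n, 3))
X2 = np.outer(z2, [0.8, 0.9, 0.6]) + rng.normal(scale=0.5, size=(n, 3))

# Step 1 (measurement): estimate factor scores for each latent variable.
s1 = FactorAnalysis(n_components=1).fit_transform(X1).ravel()
s2 = FactorAnalysis(n_components=1).fit_transform(X2).ravel()

# Step 2 (structural): relate the score estimates by ordinary regression.
# The sign is arbitrary (loading indeterminacy), and the magnitude is
# attenuated by measurement error -- the issue Croon and Bolck (1997) address.
slope = np.polyfit(s1, s2, 1)[0]
print(slope)
```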

Estimation of Parameters

The feature of latent variable models which has posed considerable practical problems in the past is the large number of unknown parameters that have to be estimated. For example, a standard factor analysis model with p variables and q factors involves pq + p parameters (the pq loadings and the p residual variances) and, since p may easily exceed 10, or even 100, this leads inevitably to large numbers of parameters. In the early days of factor analysis, great ingenuity had to be exercised to devise methods of estimating the parameters which were computationally feasible. The calculation of standard errors was almost impossible and this led, perhaps, to overoptimistic claims about what had been established. Modern computing facilities have radically altered the situation, but words of caution are still in order.

General methods such as maximum likelihood are, of course, available and they have the great advantage of yielding asymptotic standard errors and covariances of the estimators. Bayesian methods are, likewise, not restricted, in principle at least, by the number of parameters. The latter provide estimates of the posterior distributions of the parameters, which give a more complete picture of the uncertainty surrounding the estimates. Indeed, in a certain sense, the two approaches may be said to have converged when there is a large number of parameters to be estimated; the justification for this remark will become clearer as we proceed.

The standard asymptotic theory of maximum likelihood estimation depends on the behavior of the likelihood in the immediate neighborhood of the maximum. Empirical work with the method has shown that the sampling distribution only approaches the normal form implied by the asymptotic theory if the sample size is very large indeed. This means that the asymptotic standard errors may not be an adequate approximation in many practical situations. Furthermore, the approach to the maximum of the standard iterative methods may be extremely slow. With problems involving relatively small numbers of parameters, and with adequate computing power, this is not usually a serious problem, and the software on which most practical applications have been based has often been equal to the task. Nevertheless, there are considerable attractions in moving to Bayesian methods, especially as the number of parameters increases.

Bayesian methods involve approximating the posterior distribution of the parameters. This enables one to find the maximum of this distribution and, if required, other properties of the posterior distribution. Provided that the prior distribution chosen for the parameters is fairly 'flat,' there will be very little difference between the posterior distribution and the likelihood function in the neighborhood of the maximum. The method is thus, for all practical purposes, an approximation to the method of maximum likelihood.


The method, known as Markov Chain Monte Carlo, generates the empirical posterior distribution from which summary values, like the maximum, can be estimated. The Markov Chain part of the name comes from the fact that the posterior distribution emerges as the steady state of a certain Markov chain. Repeated empirical realizations of the chain then generate a random sample from the posterior distribution. The method, illustrated numerically, is described in Bartholomew et al. (2011). The particular attractiveness of the method is that it involves a very large number of very simple steps – a task for which computers are ideally suited. It therefore offers a way forward in what have, hitherto, been extremely intractable numerical problems.
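To illustrate the idea in miniature (a hypothetical sketch, not the article's own algorithm), the random-walk Metropolis sampler below draws from the posterior f(z | x) of the toy latent trait model used earlier, using only the unnormalized product f(z) f(x | z); no evaluation of f(x) is required:

```python
import numpy as np

rng = np.random.default_rng(2)
a = np.array([-0.5, 0.0, 0.5, 1.0])    # intercepts (assumed values, as before)
b = np.array([ 1.0, 1.5, 0.8, 1.2])    # slopes (assumed values)
x = np.array([ 1,   0,   1,   1  ])    # observed response pattern

def log_post(z):
    """log f(z) + log f(x|z), up to a constant: f(x) is never needed."""
    p = 1 / (1 + np.exp(-(a + b * z)))
    return -z**2 / 2 + np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))

z, draws = 0.0, []
for _ in range(20000):
    prop = z + rng.normal(scale=0.8)             # random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(z):
        z = prop                                 # accept; otherwise keep z
    draws.append(z)

# Posterior mean after discarding a burn-in; close to the quadrature answer.
print(np.mean(draws[2000:]))
```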

Historical Background

Factor analysis was invented by Spearman (1904) for the specific purpose of operationalizing the notion of general intelligence. He remained wedded to the view that there was a single dominant latent variable of human ability, which he named 'g,' and which accounted for most of the variation in observed performance. This idea was challenged by Godfrey Thomson, who introduced what he first called the 'sampling of bonds' model, which had an identical correlation structure (see, for example, Thomson, 1938). Over the years, Thomson was at pains to emphasize that his model had been produced primarily to show that there was more than one model which would account equally well for the data, rather than that it corresponded to reality. More recently, what has become known as Thomson's 'bonds model' has received more attention (see, for example, Bartholomew et al., 2009) as a serious competitor to Spearman's model, in that there is no good biological ground for distinguishing between them. It can also be generalized in various ways which have considerable practical plausibility. Following Spearman, Thurstone (1947) emphasized the multifactorial approach and, for that purpose, generalized Spearman's treatment to allow for several factors.

Factor analysis was ahead of its time in two senses. First, it was a sophisticated multivariate technique introduced long before the supporting statistical technology was available. In consequence, its development was somewhat idiosyncratic, and its separation from the statistical mainstream gave it an alien appearance which persists to this day and still causes some suspicion and misunderstanding. Second, it is a computer-intensive technique whose full potential could only be realized with the arrival of powerful electronic computers in the 1980s. This lack of computing power distorted the early development of the subject by forcing a concentration on the search for computational simplifications.

Latent structure analysis, which covers the remaining cells of Table 1, was introduced by Lazarsfeld after World War II, with a view to its applications in sociology. A book-length treatment by Lazarsfeld and Henry (1968) held the field for many years as the definitive source of information, but this has now been superseded by a new generation of books, of which Heinen (1996) is both comprehensive and relatively recent.


Latent structure analysis came on the scene half a century after factor analysis, and in a different disciplinary context. For that reason there was, until very recently, little cross-fertilization between the two fields. Yet, when viewed from a level of sufficient generality, the methods are essentially the same. In the postwar era more powerful methods became available, but the whole field has largely remained a set of distinct subspecialisms, each with its own literature and technology. It is over 40 years since Anderson (1959) pointed out that all latent variable methods shared a common conceptual framework, but his insight has been slow to yield its full potential. The classification of methods given in Table 1 was made on the basis of a common approach in Bartholomew (1984), the culmination of which is to be seen in Bartholomew et al. (2011).

Future Developments

The advent of massive computer power has changed the practice of multivariate analysis radically, and of latent variable analysis in particular. It is not possible to foresee all of the consequences of this development. The limiting factor is no longer computing power but the difficulty of obtaining data of sufficient quality and quantity to justify the very complicated models which the theory provides and computers can handle. The precision with which models with many parameters can be estimated may often be very low unless the sample size runs into many thousands. This makes it all the more important to estimate adequately the sampling variability of the parameter estimates on which the interpretation depends. Traditionally this has been done by finding asymptotic standard errors, but these can be very imprecise, as we have already remarked. However, it is now possible to supplement these results by resampling methods such as the bootstrap, or by other straightforward simulation methods.

There are other fields in which latent variable models are, or may be, used which currently exist in isolation. Some of these are covered in Bartholomew (2013). One generalization, which may not be obvious, is the latent time series. Some work has been done for the case where the latent process is a Markov chain. In this area the term 'hidden' is often used instead of 'latent,' and this helps to conceal the family connections (for an introduction see Zucchini and MacDonald, 2009). An application in a more traditional time series context will be found in Harvey and Chung (2000). Also, there has been a great deal of important work done by economists on what they call unobserved heterogeneity. This essentially involves the introduction of latent variables into econometric models, and there is substantial scope for new work in this area.
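Returning to the resampling suggestion above, a hypothetical sketch (continuing the simulated factor analysis example) shows how bootstrap standard errors for loadings can be obtained by refitting on resampled rows:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

def bootstrap_loading_se(X, n_components=1, n_boot=200, seed=3):
    """Bootstrap standard errors for factor loadings by resampling rows."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    boots = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)            # resample with replacement
        fa = FactorAnalysis(n_components=n_components).fit(X[idx])
        L = fa.components_.T
        L *= np.sign(L[0])        # fix sign indeterminacy (adequate for one factor)
        boots.append(L)
    return np.std(boots, axis=0)                    # SE for each loading

# Example: se = bootstrap_loading_se(X1)   # X1 from the two-step sketch above
```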

Notes on Bibliography

The literature in this field is enormous. On the one hand, there are classics dating back to the early years of the last century or, in the case of latent structure models, almost as long. These are distinguished from current work by three things. First, they were written before the computer era, and much attention is typically given to the complications of computation. Second, they were written largely from within particular disciplinary perspectives and hence suffer from a lack of recognition of the wider context. Third, being mainly the work of those whose interests lay in substantive fields, they are strong on substantive relevance but sometimes weak in technical generality. Some examples are provided in the following Bibliography, but they should only be read in the light of these limitations.

A useful source on factor analysis is Cudeck and MacCallum (2007). This publication marked the 100th anniversary of Spearman's introduction of the method and it looks both backward and forward. A very wide-ranging treatment, covering latent structure methods more generally, is provided by Skrondal and Rabe-Hesketh (2004). A useful text on the theoretical basis of latent variable methods is provided by Bartholomew et al. (2011). At the textbook level, for students in the social and behavioral sciences, there is a good deal relating to latent variable analysis in Bartholomew et al. (2008). The references in these books will lead on to much reliable information for those new to the field.

See also: Bayesian Graphical Models and Networks; Bayesian Models in Neuroscience; Bayesian Statistics; Bayesian Theory, History, Applications, and Contemporary Directions; Factor Analysis and Latent Structure Analysis: Confirmatory Factor Analysis; Factor Analysis and Latent Structure: IRT and Rasch Models; Latent Structure and Causal Variables; Markov Chain Monte Carlo Methods; Monte Carlo Methods and Bayesian Computation: Importance Sampling; Monte Carlo Methods and Bayesian Computation: Overview; Psychometrics: Preference Models with Latent Variables.

Bibliography

Anderson, T.W., 1959. Some scaling models and estimation procedures in the latent class model. In: Grenander, U. (Ed.), Probability and Statistics. Wiley, New York, pp. 9–38.
Anderson, J.C., Gerbing, D.W., 1988. Structural equation modeling in practice: a review and recommended two-step approach. Psychological Bulletin 103, 411–423.
Bartholomew, D.J., 1984. The foundations of factor analysis. Biometrika 71, 221–232.
Bartholomew, D.J., 1996. The Statistical Approach to Social Measurement. Academic Press, San Diego, CA.
Bartholomew, D.J., 2013. Unobserved Variables: Models and Misunderstandings. Springer, Heidelberg.
Bartholomew, D.J., Deary, I.J., Lawn, M., 2009. A new lease of life for Thomson's bonds model. Psychological Review 116, 567–579.
Bartholomew, D.J., Knott, M., Moustaki, I., 2011. Latent Variable Models and Factor Analysis: A Unified Approach, third ed. John Wiley and Sons Ltd., Chichester.
Bartholomew, D.J., Steele, F., Moustaki, I., Galbraith, J.I., 2008. Analysis of Multivariate Social Science Data, second ed. CRC Press, Boca Raton, FL.
Croon, M., Bolck, A., 1997. On the Use of Factor Scores in Structural Equations Models. Technical Report 97.10.102/7. Work and Organization Research Centre, Tilburg University.
Cudeck, R., MacCallum, R.C. (Eds.), 2007. Factor Analysis at 100: Historical Developments and Future Directions. Lawrence Erlbaum Associates Inc., Mahwah, NJ.
Harvey, A., Chung, C.-H., 2000. Estimating the underlying change in unemployment in the UK. Journal of the Royal Statistical Society A 163, 303–339.
Heinen, T., 1996. Latent Class and Discrete Latent Trait Models: Similarities and Differences. Sage, Thousand Oaks, CA.
Lazarsfeld, P.F., Henry, N.W., 1968. Latent Structure Analysis. Houghton Mifflin, New York.
Molenaar, P.C.W., von Eye, A., 1994. On the arbitrary nature of latent variables. In: von Eye, A., Clogg, C.C. (Eds.), Latent Variables Analysis: Applications for Developmental Research. Sage, Thousand Oaks, CA.
Skrondal, A., Rabe-Hesketh, S., 2004. Generalized Latent Variable Modeling: Multilevel, Longitudinal, and Structural Equation Models. Chapman and Hall/CRC, Boca Raton, FL.
Spearman, C., 1904. General intelligence, objectively determined and measured. American Journal of Psychology 15, 201–293.
Thomson, G.H., 1938. The Factorial Analysis of Human Ability. University of London Press, London.
Thurstone, L.L., 1947. Multiple Factor Analysis. University of Chicago Press, Chicago.
Zucchini, W., MacDonald, I.L., 2009. Hidden Markov Models for Time Series: An Introduction Using R. Chapman and Hall/CRC Press, Boca Raton, FL.