Diagnosing incremental learning: Some probabilistic models for measuring change and testing hypotheses about growth

0191-491X/93 $24.00. Studies in Educational Evaluation, Vol. 19, pp. 349-362, 1993. Printed in Great Britain. All rights reserved. © 1993 Pergamon Press Ltd

DIAGNOSING INCREMENTAL LEARNING: SOME PROBABILISTIC MODELS FOR MEASURING CHANGE AND TESTING HYPOTHESES ABOUT GROWTH¹

Rolf Langeheine
Institute for Science Education, University of Kiel, Germany

Introduction

This contribution is about longitudinal curriculum development and assessment. More precisely, it is about how to measure growth (in knowledge or abilities) over time. The general structure of the data to be analyzed is a data cube such as that in Figure 1.

[Figure 1: data cube with three axes: subjects (1, ..., v, ..., N), items (1, ..., i, ..., K), and time points (1, ..., t, ..., T).]

Figure 1: The General Data Situation Considered in This Paper

We thus have measurements from N subjects on K items (variables, indicators) at T points in time, with treatments (e.g., a curriculum) given between time points. Given such a situation, there are in principle a variety of ways to attack the problem of measuring change and making statements about growth. Before doing so, it appears useful to look for factors that may be relevant in guiding the analysis. Suffice it to say that in this paper the focus will be on categorical manifest variables (e.g., dichotomous pass/fail items) and ordered categorical (ordinal) manifest variables (e.g., solution of some task: 0 = not at all, 1 = poor, ..., 4 = good, 5 = excellent).

These variables will be considered as indicators of continuous, categorical or ordered categorical latent variables. As a consequence, statements about change and growth will be made on the latent level. To do so, only probabilistic models will be considered, that is, models that conceive of the probability of a response (reaction, behavior) of some subject as a function of a set of parameters to be estimated. Two rather broad classes of models will be considered: the Rasch model and its generalizations for continuous latent variables, and the latent class model and its generalizations for categorical and ordered categorical latent variables. All of these generalizations are dynamic versions of the respective model for a single point in time. However, time enters into these models only as an implicit variable rather than an explicit one. All of the models considered thus belong to the class of discrete time models.

To save space, no details will be given with respect to problems of parameter estimation and identifiability of models. For details, see the references given. I will also refrain from presenting real data examples (which may be found in the references, however), and instead present a variety of models in a rather abstract fashion. In some cases, I will use artificial examples that may help to better understand the philosophy of some specific model. Finally, in order to simplify things, I will, in general, start with a specific model for a single time point only, then extend it to T = 2 points in time, and comment only very briefly on extensions to more than 2 points in time.

Continuous Latent Variables

Dichotomous Items: T = 1 Point in Time

Given K items that are considered as indicators of some construct (a latent variable, say math skill), the Rasch model (Rasch, 1960) describes the probability that subject v solves item i correctly by means of the logistic function

$$p_{vi} = \frac{\exp(\xi_v - \sigma_i)}{1 + \exp(\xi_v - \sigma_i)} \qquad (v = 1, \ldots, N;\ i = 1, \ldots, K), \tag{1}$$

where $\xi_v$ is the ability of subject v to solve items of this type, $\sigma_i$ is the difficulty of item i, and exp(...) denotes the exponential function. This model has a number of very useful properties (see, e.g., Fischer, 1974; 1976; 1977). The Rasch model may be extended in several ways for measuring change when data are available from two or more points in time. Rost and Spada (1983) present a hierarchy of models, some of which will be considered in what follows.
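As a small illustration of model 1, the following Python sketch evaluates the logistic function for a few ability and difficulty values. The function name and all numerical values are made up for illustration and are not taken from the paper.

```python
import math

def rasch_probability(xi_v: float, sigma_i: float) -> float:
    """Model 1: probability that a subject with ability xi_v solves
    an item with difficulty sigma_i."""
    return math.exp(xi_v - sigma_i) / (1.0 + math.exp(xi_v - sigma_i))

# Hypothetical abilities (rows) and item difficulties (columns).
abilities = [-1.0, 0.0, 1.0]
difficulties = [0.0, 1.0]
table = [[rasch_probability(x, s) for s in difficulties] for x in abilities]

for x, row in zip(abilities, table):
    print(x, [round(p, 3) for p in row])
```

Whenever ability equals difficulty the probability is 0.5, and only the difference between the two parameters matters.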

Dichotomous Items: T = 2 Points in Time

The simplest extension that may be thought of adds a change parameter to model 1, now describing the probability of a correct response of subject v on item i at time point t by

$$p_{vit} = \frac{\exp(\xi_v + \delta_t - \sigma_i)}{1 + \exp(\xi_v + \delta_t - \sigma_i)} \tag{2}$$

where the effect of change between time points is measured by $\delta_t$. Note that, if we set $\delta_t = 0$ at t = 1, there is only one additional parameter ($\delta_2$) to be estimated. Since all other sets of parameters (the $\xi$'s and the $\sigma$'s) remain constant over time, this model can quantify global change only, that is, change that is equal for all subjects. For details of this model, both with respect to its formal properties and applications, see Häussler (1978; 1981), Rost and Spada (1983) and Spada (1973a, 1973b, 1976).

Assume, for a moment, that the results of some analysis have led to the conclusion that global change is in fact present. Can we be sure that this change is exclusively due to the treatment given between the two time points? The answer is "no" as long as we do not have at least a control group (without a treatment) in addition to the treatment group.
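The sense in which model 2 captures global change only can be checked numerically. In this Python sketch (all parameter values are hypothetical), the shift in log-odds between t = 1 and t = 2 comes out the same for every combination of subject and item.

```python
import math

def p_model2(xi_v: float, sigma_i: float, delta_t: float) -> float:
    """Model 2: Rasch model with an additive change parameter delta_t."""
    z = xi_v + delta_t - sigma_i
    return 1.0 / (1.0 + math.exp(-z))

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

# delta_1 is fixed at 0; delta_2 = 0.8 is a made-up value.
delta = {1: 0.0, 2: 0.8}

# Log-odds shift from t = 1 to t = 2 for several subjects and items.
shifts = [
    logit(p_model2(xi, s, delta[2])) - logit(p_model2(xi, s, delta[1]))
    for xi in (-1.5, 0.0, 2.0)
    for s in (-0.5, 1.0)
]
print([round(s, 6) for s in shifts])
```

Every entry of `shifts` equals the assumed $\delta_2$ up to rounding, regardless of ability or item difficulty.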

g    Group           effect parameters (entries are q_gj)
                     η_1     η_2     τ
1    Curriculum A    1       0       1
2    Curriculum B    0       1       1
3    Control         0       0       1

Figure 2: An Example of a Design Matrix for Two Treatment Groups and a Control Group in the Linear Logistic Test Model

In order to extend the simple global change model 2 to enable the simultaneous analysis of several treatment groups and a control group, Fischer (1976, 1977) has proposed what he calls the linear logistic test model (LLTM). The main feature of this model is that it allows for a decomposition of the change parameter $\delta$ at t = 2 (which, in order to avoid misunderstandings, will now be denoted by $\delta_{g2}$, where g refers to treatment group g) into a sum of linearly additive effect parameters:

$$\delta_{g2} = \sum_{j=1}^{m} q_{gj}\,\eta_j + \tau \tag{3}$$

where the single quantities have the following meaning: $q_{gj}$ is the dose of treatment j given to subjects of group g. In the simplest case the treatment may be either present ($q_{gj} = 1$) or absent ($q_{gj} = 0$). $\eta_j$ quantifies the effect of treatment j, and $\tau$ stands for all other causes of change unrelated to the treatment(s) (such as biological or social growth factors). As Figure 2 shows, the $\tau$ parameter may be formally handled as a treatment parameter with $q_{gj} = 1$ for all groups. In this example group 1 is given curriculum A only and group 2 is given curriculum B only, whereas neither of these treatments is applied to the control group. Note that it is possible to extend the design by additional groups getting two or even more treatments, some of which may differ from the curricula (such as training in how to work effectively).

The main advantage of the LLTM may be seen in the fact that it allows a variety of hypotheses to be tested by imposing restrictions on the parameters, e.g., for the above example:

1. no change: $\eta_1 = \eta_2 = \tau = 0$;
2. ineffective treatment(s): $\eta_j = 0$; $\eta_1 = \eta_2 = 0$;
3. equal treatment effects: $\eta_1 = \eta_2$;
4. treatment effects only: $\tau = 0$.
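The decomposition in formula 3 and the four restricted hypotheses can be sketched in a few lines of Python. The design matrix below is the one from Figure 2, with $\tau$ handled as a third column of ones; the effect values themselves are invented purely for illustration.

```python
# Rows: group 1 (curriculum A), group 2 (curriculum B), group 3 (control).
# Columns: q_g1 (eta_1), q_g2 (eta_2), and a column of ones for tau.
Q = [
    [1, 0, 1],
    [0, 1, 1],
    [0, 0, 1],
]

def change_parameters(design, effects):
    """Formula 3: delta_g2 = sum_j q_gj * eta_j + tau for every group g."""
    return [sum(q * e for q, e in zip(row, effects)) for row in design]

# Hypothetical effect vectors (eta_1, eta_2, tau).
full_model     = [0.6, 0.4, 0.2]
no_change      = [0.0, 0.0, 0.0]   # hypothesis 1
ineffective    = [0.0, 0.0, 0.2]   # hypothesis 2 (both treatments ineffective)
equal_effects  = [0.5, 0.5, 0.2]   # hypothesis 3
treatment_only = [0.6, 0.4, 0.0]   # hypothesis 4

for name, effects in [("full", full_model), ("no change", no_change),
                      ("ineffective", ineffective), ("equal", equal_effects),
                      ("treatment only", treatment_only)]:
    print(name, change_parameters(Q, effects))
```

Under each restriction the control group's change parameter reduces to $\tau$ alone, which is what makes the control group necessary for separating treatment effects from other causes of change.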

The adequacy of each single hypothesis may be evaluated by means of a $\chi^2$ statistic (e.g., Pearson's $X^2$ or the likelihood ratio statistic). Using conditional likelihood ratio tests, it is possible, in addition, to test whether some specific model is superior to another one (i.e., results in a better fit), provided these models are nested (all of the hypotheses 1 to 4 above are nested within the full model with $\eta_1$, $\eta_2$ and $\tau \neq 0$). Note that while an LLTM of this type allows change to be explained by means of specific effect parameters, it still is a global change model, since the change parameter is equal for all subjects given a certain treatment. Another restrictive assumption implicit in model 2, with or without extension 3, is that the K items of the first time point and the K items of the second time point taken together must conform to the simple Rasch model 1 in each group.

In what follows, we will therefore consider some models with relaxed assumptions of one kind or another. Whereas model 2 and the LLTM require the items to be homogeneous with respect to the Rasch model, the following model drops the assumption of a single latent variable (unidimensionality):

$$p_{vit} = \frac{\exp(\xi_{vi} + \delta_t)}{1 + \exp(\xi_{vi} + \delta_t)} \tag{4}$$

where $\xi_{vi}$ is the ability of subject v to solve item i. This model thus allows for interactions between subjects and items. Put differently, item difficulties may vary between subjects. Since the $\xi_{vi}$ are assumed constant over time, model 4 again is a global change model (there is only one $\delta$ if we set $\delta_1 = 0$, applying to all subjects and dimensions or items). As in the case of model 2, $\delta_2$ in model 4 may be decomposed according to formula 3 for a design with several treatment groups, which again results in a global change model with equal treatment effect $\eta_j$ for all subjects and all dimensions of a treatment. This is Fischer's (1976; 1977; 1989; see also Fischer & Formann, 1982) linear logistic test model with relaxed assumptions (LLRA).

Though the change models considered so far, especially those with the extension to several treatment groups, have some appeal and have been successfully applied in several contexts, they appear rather restrictive by allowing for global change only. We will therefore finally consider a model that allows for subject-specific change:

$$p_{vit} = \frac{\exp(\xi_{vt} - \sigma_i)}{1 + \exp(\xi_{vt} - \sigma_i)} \tag{5}$$

where $\xi_{vt}$ is the ability of subject v at time point t. Whereas the item parameters remain constant over time, each subject is characterized by a specific ability parameter at each point in time, thus resulting in subject-specific change in ability ($\delta_v = \xi_{v2} - \xi_{v1}$). Rost and Spada (1983; see also Spada 1973a; 1973b) consider model 5 as the standard model for the measurement of person-specific change. On closer inspection the model presented by Embretson (1991) turns out to be equivalent to the Rost and Spada model 5. Unfortunately, so far there exists no extension of this model to several treatment groups.

The models considered so far are but a few out of a larger number of models, some of which may be more relevant in a specific context. I will therefore briefly refer to some extensions. Rost and Spada (1983) present a hierarchy of eight different models, with model 2 considered above as the simplest model. Spada (1973a) considers extensions of models 1, 2 and 5 that allow for a decomposition of item difficulties into a number of operations necessary to solve an item (e.g., three operations are necessary when solving √9/0.3 − 5), as well as a decomposition of training transfer due to a specific operation. Some of the models are easily generalized to more than two time points, whether the same items have been measured across time or different (however parallel) items have been used (e.g., models 2, LLTM, and 5). Fischer (1989) has shown how to extend the LLRA to several time points and different items per time point.
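Subject-specific change can be made concrete with a short Python sketch. It assumes that model 5 has the same logistic form as model 1, with a time-specific ability in place of a constant one; the three subjects and their ability values are hypothetical.

```python
import math

def p_model5(xi_vt: float, sigma_i: float) -> float:
    """Assumed form of model 5: ability is specific to subject and time point."""
    return 1.0 / (1.0 + math.exp(-(xi_vt - sigma_i)))

# Hypothetical abilities (xi_v1, xi_v2) for three subjects.
abilities = {"v1": (-0.5, 0.7), "v2": (0.2, 0.2), "v3": (1.0, 0.4)}

# Subject-specific change: delta_v = xi_v2 - xi_v1.
change = {v: xi2 - xi1 for v, (xi1, xi2) in abilities.items()}

# Resulting change in the probability of solving an item of difficulty 0.
sigma = 0.0
gain_in_probability = {
    v: p_model5(xi2, sigma) - p_model5(xi1, sigma)
    for v, (xi1, xi2) in abilities.items()
}
print(change)
print(gain_in_probability)
```

Unlike in the global change models, one subject here improves, one stays put, and one declines.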

Ordered Categorical (Ordinal) Items

Rasch models for measuring change that take into account the ordinal information of the data are rare. Fischer and Formann (1982) present an extension of the LLRA to polytomous items. Though such items may be ordinal, the polytomous version of the LLRA does not exploit the ordinal information of the data. Some models that do so have been proposed by Rost (1989). One of his models is an analog of the subject-specific change model 5 considered above, now formalized as a three-way generalization of Masters' (1982) partial credit model. Recently, Fischer and Parzer (1991) combined the rating scale model of Andrich (1978), which assumes equidistant scoring for ordinal items, with a linear decomposition of the item parameters similar to that in the LLRA.

Rasch Models for Measuring Change: Pros and Cons

The most striking observation is that there is a great variety of Rasch models for measuring change that appear well suited both to curriculum construction and to evaluation (see, e.g., Spada 1973a; 1973b; 1976; 1977). In some of these models the emphasis is on the evaluation of the relative merits of different educational programs (and, maybe, on biological growth factors), thus aiming at explaining change by pinpointing the hypothetical causes underlying change. Moreover, these models have some attractive properties:

1. Measurement scale: in models 1, 2 and 5 considered above, measurement is on a difference scale; the effect parameters of the LLTM and the LLRA lie on an absolute scale.

2. Specific objectivity: given that the model holds, the comparison of the ability parameters of two subjects (and, likewise, the comparison of item and treatment parameters) does not depend on the sample(s) of items and/or persons (only the accuracy of the estimates depends on the samples). Models 1 and 2 are sample-free with respect to both items and subjects, model 5 is so with respect to items, and the evaluation of the treatment effects in the LLTM and the LLRA is sample-free with respect to subjects.

[Figure 3, left panel (t = 1): the eight response patterns for items A, B and C, each marked by the number of items solved (0 to 3).

A:             0 0 0 0 1 1 1 1
B:             0 0 1 1 0 0 1 1
C:             0 1 0 1 0 1 0 1
items solved:  0 1 1 2 1 2 2 3

Right panel (t = 2): the same eight response patterns, with x's marking the specific patterns of change considered.]

Figure 3: Response Patterns and Number of Items Solved at t = 1 (panel on the left); Response Patterns and Some Specific Patterns of Change at t = 2 (panel on the right)

However, the Rasch model has some properties that do not allow certain specific hypotheses of change to be tested. Consider, for a moment, a Rasch model for three dichotomous items A, B and C and a single time point. As the left panel of Figure 3 shows, data may be aggregated into a contingency table with eight response patterns. One property of the Rasch model is that the specific response vector of a subject contains no information for estimating that subject's ability parameter $\xi_v$ beyond the number of items solved. More precisely, all subjects with the same number of items solved are estimated to have the same ability, irrespective of which item(s) were solved (e.g., response patterns 001, 010, 100).

Now assume that the three items are ordered in difficulty, with item A being the easiest and item C being the most difficult one. If, in addition, we have measurements at two points in time, we might be interested in hypotheses like these: Where do we find those subjects at the second time point who do not solve any item initially? How many of them stay with their initial response pattern, and how many of them go ahead by solving the easiest item, the first two items or even all three items (panel on the right of Figure 3)? Likewise, what about those subjects who initially solve the easiest item, the first two items or all three items? The x's in the panel on the right of Figure 3 imply that we assume no decay. Is such a hypothesis tenable? Hypotheses like these imply that we are interested in certain groups and how they change. This leads to the second class of models considered in this paper: latent class models.
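The point about raw scores, and the kind of hypothesis the Rasch model cannot express, can be illustrated with a small Python sketch. The enumeration of patterns follows Figure 3; the reading of "no decay" as movement only between the four hierarchically ordered patterns 000, 100, 110 and 111 is an assumption made here for illustration.

```python
from itertools import product

# All eight response patterns for items A, B, C (1 = solved).
patterns = ["".join(map(str, p)) for p in product((0, 1), repeat=3)]

# The Rasch model treats patterns with the same raw score as equivalent.
by_score = {}
for pattern in patterns:
    by_score.setdefault(pattern.count("1"), []).append(pattern)

# Assumed hierarchy: A easiest, C hardest, so the ordered patterns are these.
ordered = ["000", "100", "110", "111"]

# No decay: from each ordered pattern at t = 1, subjects may only stay
# or move to a pattern with more items solved at t = 2.
no_decay = {
    start: [end for end in ordered if end.count("1") >= start.count("1")]
    for start in ordered
}

print(by_score)
print(no_decay)
```

The three patterns with raw score 1 are indistinguishable to the Rasch model, whereas the hypotheses above single out 100 as the only one consistent with the item hierarchy.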

Categorical and Ordered Categorical Latent Variables

Dichotomous Items

Very often the interest is not in quantifying a person's scale value on a latent dimension but in whether persons may be classified into two or more groups. Mastery learning is but one example, where, in the simplest case, we might be interested in whether the subjects of some sample may be allocated to two groups only: masters and non-masters. Figure 4 gives a simplified hypothetical example for four dichotomous items A, B, C and D with response categories - (item not solved) and + (item solved).

class   group          class prop. δ   response   conditional response probabilities ρ
                                                  A     B     C     D
1       non-masters    0.6             -          1.0   1.0   1.0   1.0
                                       +          0.0   0.0   0.0   0.0
2       masters        0.4             -          0.0   0.0   0.0   0.0
                                       +          1.0   1.0   1.0   1.0

Figure 4: Hypothetical Example of a Simplified 2-Class Mastery Learning Model

According to this example, 60% of the sample belong to the non-mastery class and 40% to the mastery class. The classes are characterized by (conditional - see below) response probabilities for all items. For the sake of simplicity, I have assumed response probabilities of 1.0 (0.0) for a correct response on all items in the mastery (non-mastery) class.

Though the compatibility of these assumptions with real data may be checked, such a model has obvious disadvantages: (a) the model would fit the data only if nonzero frequencies were observed in just two of the 16 cells of the contingency table (i.e., those with response patterns 0000 and 1111), and such data are rather unlikely; and (b) response probabilities of one and zero throughout result in a deterministic formulation of a latent class model. As a consequence, such a model no longer is a latent class model. In the latent class model, however, all probabilities are estimated from the data. Assume that the results of such an analysis turned out to be those in the left-hand panel of Figure 5. Again, we have a mastery class (30% in proportion) with high probabilities for solving an item correctly and a class of non-masters (70%) with low probabilities for a correct response on all items. The latent class model (Lazarsfeld, 1950; Lazarsfeld & Henry, 1968; for a recent review see Langeheine, 1988) for a situation like this (one time point) is given by

$$P_{ijkl} = \sum_{s=1}^{S} \delta_s^{X}\, \rho_{is}^{AX}\, \rho_{js}^{BX}\, \rho_{ks}^{CX}\, \rho_{ls}^{DX} \tag{6}$$

where $P_{ijkl}$ is the proportion in the population in cell (i, j, k, l), $\delta_s^{X}$ is the proportion of class s of the latent categorical variable X, and $\rho_{is}^{AX}$ is the conditional probability for a response in category i of item A, given membership in class s (and likewise for items B, C and D). The model assumes that classes are mutually exclusive and exhaustive and that local independence of items holds within classes.

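Left-hand panel (t = 1)

class   δ     response   A     B     C     D
1       0.7   -          .80   .85   .90   .95
              +          .20   .15   .10   .05
2       0.3   -          .05   .10   .15   .20
              +          .95   .90   .85   .80

Right-hand panel (t = 2)

class   δ     response   A     B     C     D
1       .14   -          .80   .85   .90   .95
              +          .20   .15   .10   .05
2       .45   -          .20   .30   .90   .95
              +          .80   .70   .10   .05
3       .41   -          .05   .10   .15   .20
              +          .95   .90   .85   .80

Lower right corner: six empty boxes, one for each combination of a class at t = 1 (2 classes) and a class at t = 2 (3 classes).

Figure 5: A Hypothetical Example of a 2-Class Mastery Learning Model at t = 1 (left-hand panel) and a 3-Class Latent Class Model at t = 2 (right-hand panel)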
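To see how model 6 turns class proportions and conditional response probabilities into expected cell proportions, the following Python sketch uses the hypothetical values of the left-hand panel of Figure 5 (two classes, four items). The function and variable names are illustrative only.

```python
from itertools import product

# Left-hand panel of Figure 5: class proportions and P(+) for items A-D.
class_prop = [0.7, 0.3]             # class 1: non-masters, class 2: masters
p_correct = [
    [0.20, 0.15, 0.10, 0.05],       # class 1
    [0.95, 0.90, 0.85, 0.80],       # class 2
]

def cell_probability(pattern):
    """Model 6: sum over classes of the class proportion times the product
    of conditional response probabilities (local independence)."""
    total = 0.0
    for delta_s, probs in zip(class_prop, p_correct):
        term = delta_s
        for response, p in zip(pattern, probs):
            term *= p if response == 1 else 1.0 - p
        total += term
    return total

# Expected proportions for all 16 cells of the A x B x C x D table.
cells = {pattern: cell_probability(pattern) for pattern in product((0, 1), repeat=4)}

for pattern, prob in cells.items():
    print(pattern, round(prob, 6))
```

In contrast to the deterministic model of Figure 4, every one of the 16 cells receives a nonzero expected proportion, and the proportions sum to one.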

Now assume that measurements on the four variables are also available from a second point in time and that a latent class analysis of these data required a 3-class solution in order to obtain an acceptable fit (again, the fit of a specific model may be evaluated by chi-squared statistics). The result may look like that in the panel on the right of Figure 5. As at t = 1 we have a class of non-masters (class 1) and a class of masters (class 3). In addition, there is an intermediate class (class 2) with high probabilities for solving items A and B correctly but low probabilities for a correct response on items C and D. As the class proportions show, there is obviously change in the positive direction (the proportion of non-masters decreased from .7 at t = 1 to .14 at t = 2). However, separate analyses of the data from each time point do not tell us where members of the two classes at t = 1 go at t = 2. The six boxes at the lower right corner of Figure 5 are therefore left empty. This question may be easily answered by performing a simultaneous analysis of the data from both time points. The empty boxes may then be filled by probabilities denoting the transitions from a given class at t = 1 to each of the classes at t = 2. We thus have a latent class model with latent change, which is a special case of the more general latent Markov model (Langeheine & Van de Pol, 1990a; 1990b).

The latent variables of this subsection have been conceived of so far as being unordered categorical. In principle, there are several ways to specify models that conceive of latent variables as being ordered categorical. One way to do so is to use manifest items that have been constructed a priori so that they form a hierarchy with item difficulties increasing, say, from A to D. As an example, consider Figure 6 with four pass (+)/fail (-) items and repeated measurements from two points in time. For t = 1 we specify five classes. These classes may be ordered by putting constraints on the response probabilities. Let a minus denote a low probability for solving an item and a plus denote a high probability. Then class 1 will contain only subjects with low probabilities for solving all items. Class 2 is characterized by a high probability for solving the easiest item only, and so on up to class 5, where there is a high probability for solving all four items.

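[Figure 6 layout: for each of t = 1 and t = 2, five ordered classes with response probabilities for items A to D marked as low (-) or high (+), class proportions δ at t = 1, and a matrix of transition probabilities τ from the classes at t = 1 to the classes at t = 2.

class   A   B   C   D
1       -   -   -   -
2       +   -   -   -
3       +   +   -   -
4       +   +   +   -
5       +   +   +   +]

Figure 6: A Hypothetical Example of a Latent Markov Model with 5 Ordered Classes at t = 1 and 2 and 4 Dichotomous Items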

At t = 1 the proportion in latent class x is now denoted by $\delta^{1}_{x}$. If we impose the same structure on the response probabilities at t = 2, the transition probabilities $\tau^{21}_{y|x}$ tell us where someone goes at t = 2 given membership in some class at t = 1. The model for a situation like this is given by:

$$P_{ijkl\,i'j'k'l'} = \sum_{x}\sum_{y} \delta^{1}_{x}\, \rho^{A1}_{i|x}\, \rho^{B1}_{j|x}\, \rho^{C1}_{k|x}\, \rho^{D1}_{l|x}\; \tau^{21}_{y|x}\; \rho^{A2}_{i'|y}\, \rho^{B2}_{j'|y}\, \rho^{C2}_{k'|y}\, \rho^{D2}_{l'|y} \tag{7}$$

where $P_{ijkl\,i'j'k'l'}$ is the proportion in the population in cell ijkl (t = 1) and i'j'k'l' (t = 2), $\delta^{1}_{x}$ is the proportion of latent class x at t = 1, $\rho^{A1}_{i|x}$ is the conditional probability for a response in category i of item A (given membership in class x at t = 1), $\tau^{21}_{y|x}$ is the probability of a transition from class x at t = 1 to class y at t = 2, and $\rho^{A2}_{i'|y}$ is the conditional probability for a response in category i' of item A at t = 2 (given membership in class y). Note that there are formally two latent variables, indexed by x at t = 1 and by y at t = 2. However, these two variables may be conceived of as a single variable that may change over time (or not). In order to make the model identifiable, all (indexed) sets of parameters have to sum to unity (e.g., $\sum_x \delta^{1}_{x} = \sum_i \rho^{A1}_{i|x} = \sum_y \tau^{21}_{y|x} = 1.0$). Of course this also holds for the sets of parameters in model 6.

Model 7 is a generalized version of the classical latent Markov model, which applies to the case of a single indicator measured repeatedly over time. The extension considered above is to multiple indicators (Langeheine & Van de Pol, 1993). As in the case of the LLTM and the LLRA, a variety of hypotheses may be tested by imposing restrictions on the parameters of model 7, such as:

1. No latent change: T = I, where T denotes the matrix of transition probabilities and I is the identity matrix with ones in the main diagonal and zeros elsewhere.

2. No change at all: T = I plus time-homogeneous response probabilities.

3. No decay: all $\tau$'s below the main diagonal are zero.

4. Progress from t to t + 1 is restricted to a fixed range, say from class 1 (2) to class 3 (4). All of these hypotheses concern the structural part of the model.

5. Restrictions on the response probabilities may be used to define a variety of measurement models. Different types of latent distance models that are compatible with the example in Figure 6 are discussed in Langeheine (1988).
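The structure of model 7 and the "no decay" restriction can be sketched in Python as follows. To keep the example short it uses three ordered classes and two dichotomous items rather than the five classes and four items of Figure 6, and all parameter values are invented. Rows of the transition matrix index the class at t = 1 and columns the class at t = 2.

```python
from itertools import product

def pattern_prob(pattern, p_correct):
    """Probability of a response pattern within one class (local independence)."""
    prob = 1.0
    for response, p in zip(pattern, p_correct):
        prob *= p if response == 1 else 1.0 - p
    return prob

def joint_probability(pat1, pat2, delta, rho1, rho2, tau):
    """Model 7: sum over classes x (t = 1) and y (t = 2) of
    delta_x * P(pat1 | x) * tau[x][y] * P(pat2 | y)."""
    return sum(
        delta[x] * pattern_prob(pat1, rho1[x]) * tau[x][y] * pattern_prob(pat2, rho2[y])
        for x in range(len(delta))
        for y in range(len(tau[x]))
    )

def is_no_decay(tau):
    """Hypothesis 3: all transition probabilities below the main diagonal are zero."""
    return all(tau[x][y] == 0.0 for x in range(len(tau)) for y in range(x))

# Hypothetical parameters: 3 ordered classes, 2 items (P(+) per class and item).
delta = [0.5, 0.3, 0.2]
rho = [[0.1, 0.1], [0.9, 0.1], [0.9, 0.9]]   # same at both time points
tau = [
    [0.4, 0.4, 0.2],
    [0.0, 0.6, 0.4],
    [0.0, 0.0, 1.0],
]

patterns = list(product((0, 1), repeat=2))
total = sum(joint_probability(a, b, delta, rho, rho, tau)
            for a in patterns for b in patterns)

# Implied class proportions at t = 2.
class_prop_t2 = [sum(delta[x] * tau[x][y] for x in range(3)) for y in range(3)]
print(round(total, 12), is_no_decay(tau), class_prop_t2)
```

Using the same `rho` at both time points corresponds to time-homogeneous response probabilities; replacing `tau` by the identity matrix would additionally give hypothesis 2 (no change at all).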

Before closing this subsection, I will refer to two extensions. First, model 7 is easily extended to more than two time points. In fact, the specific Markov assumptions (one-step transitions from t to t + 1 only, e.g., from t = 1 to t = 2 and from t = 2 to t = 3, but no higher order transitions; stable transition probabilities across time) may be tested only when measurements are available from at least three points in time. Second, just as in the LLTM and the LLRA, Markov models may be extended to the simultaneous analysis of several groups (defined by additional external categorical variables such as socio-economic status or curricula). In Figure 7 we have the same situation as in Figure 2 above.

group   curric.   initial distrib. δ's   response probs. ρ's   transition probs. τ's
1       A
2       B
3       control

Figure 7: An Example of a Multiple Group Markov Analysis with 2 Treatment Groups and a Control Group

This results in a huge number of hypotheses specifiable in addition to the single group case. In other words: all, some or no groups are equal in $\delta$'s and/or $\rho$'s and/or $\tau$'s. (See Van de Pol & Langeheine (1990) for details.)

Ordered Categorical (Ordinal) Items

Whereas the extension to unordered polytomous items poses no new problems in either latent class or Markov analysis, only the models of Rost (1989) exploit the ordinal information of the data. One of his models that is relevant in the present context allows for subject-specific (categorical) change, i.e., individuals may change class membership across time while response probabilities are assumed to be time homogeneous. This model is formalized as a three-way latent class model in analogy with the partial credit model of Masters (1982) in order to take into account the ordinal information of the data.

Latent Class and Markov Models for Measuring Change: Pros and Cons

As was the case for Rasch models, we again may conclude that there is a multitude of models for measuring categorical change. In all these cases the general multiple indicator latent Markov model (defined above for two points in time only) is the point of departure. Special cases of this model are obtained by letting items and/or persons change across time or not. In addition, this class of models offers high flexibility in testing specific hypotheses (e.g., no decay or the choice from among different measurement models). Contrary to Rasch models, specific response patterns (both on the manifest and latent level) play an important role in this class of models, which may therefore be called models of structural change.

This class of models is not without problems, however. Whereas most of the Rasch models are easily extended to different sets of items administered across time, we have so far assumed that the same item set is used in Markov analysis. In principle, models for different item sets may be thought of. However, this case deserves special consideration. An advantage of Rasch analysis is that many of the models from this class work even for relatively small samples. This is due to using conditional maximum likelihood estimation procedures where one conditions, e.g., on the raw score of the subjects (number of items solved) in estimating the item parameters. Markov analysis, on the other hand, generally requires large samples, depending on the number of items, the number of time points and the number of groups. As an example, take the simultaneous multiple group analysis considered above. For three groups, two time points and four dichotomous items this results in a 3 × 2^4 × 2^4 cross table with 768 cells. Many of these cells will be empty or have low frequencies. We are thus faced with the problem of sparse data.

As a consequence, the chi-squared statistics used in evaluating model fit may be questionable, since they no longer follow the chi-squared distribution. In order to cope with this problem, several suggestions have been made in the literature. A generally satisfactory solution is lacking, however. Fortunately this problem has no consequence for estimating the parameters of a Markov model, at least in most cases. Problems of model identifiability are dealt with in the references mentioned.
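The size of the cross table in the multiple group example follows directly from the design, as this short Python sketch shows. The function name and the sample size of 600 are hypothetical and serve only to illustrate how quickly the average expected frequency per cell drops.

```python
def n_cells(groups: int, items: int, categories: int, time_points: int) -> int:
    """Number of cells in the groups x (categories^items)^time_points cross table."""
    return groups * (categories ** items) ** time_points

cells = n_cells(groups=3, items=4, categories=2, time_points=2)

# Average expected frequency per cell for a made-up total sample of 600 subjects.
sample_size = 600
average_per_cell = sample_size / cells

print(cells, round(average_per_cell, 3))
# Adding a third time point multiplies the table size by another factor of 2**4.
print(n_cells(groups=3, items=4, categories=2, time_points=3))
```

With fewer subjects than cells, most cells must be empty, which is the sparse-data situation described above.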

Whereas the class of Markov models presented here thus appears attractive in curriculum research as well, I know of hardly any applications. One exception is Kordes (1978), who used Markov chains (very simple ones, however) in describing the process and structure of argumentative communication. One reason for this deficit clearly is that problems of parameter estimation hampered the applicability of Markov models for years. Fortunately, these problems have been surmounted during the past decade.

Notes

1. Thanks are especially due to Jürgen Rost, who helped me in coping with the somewhat conflicting terminology/notation used by different (groups of) authors writing on Rasch measurement.

References

Andrich, D. (1978). A rating formulation for ordered response categories. Psychometrika, 43, 561-573.

Embretson, S.E. (1991). A multidimensional latent trait model for measuring learning and change. Psychometrika, 56, 495-515.

Fischer, G.H. (1974). Einführung in die Theorie psychologischer Tests [An introduction to the theory of psychological testing]. Bern: Huber.

Fischer, G.H. (1976). Some probabilistic models for measuring change. In D.N.M. De Gruijter & L.J.Th. van der Kamp (Eds.), Advances in psychological and educational measurement. London: Wiley.

Fischer, G.H. (1977). Linear logistic test models: Theory and application. In H. Spada & W.F. Kempf (Eds.), Structural models of thinking and learning. Bern: Huber.

Fischer, G.H. (1989). An IRT-based model for dichotomous longitudinal data. Psychometrika, 54, 599-624.

Fischer, G.H., & Formann, A.K. (1982). Veränderungsmessung mittels linear-logistischer Modelle [Measurement of change with linear logistic test models]. Zeitschrift für Differentielle und Diagnostische Psychologie, 3, 75-99.

Fischer, G.H., & Parzer, P. (1991). An extension of the rating scale model with an application to the measurement of change. Psychometrika, 56, 637-651.

Häussler, P. (1978). Evaluation of two teaching programs based on structural learning principles. Studies in Educational Evaluation, 4, 145-161.

Häussler, P. (1981). Denken und Lernen Jugendlicher beim Erkennen funktionaler Beziehungen [Juvenile reasoning and learning in the identification of functional relationships]. Bern: Huber.

Kordes, H. (1978). Measurement and educational evaluation. Studies in Educational Evaluation, 4, 163-183.

Langeheine, R. (1988). New developments in latent class theory. In R. Langeheine & J. Rost (Eds.), Latent trait and latent class models. New York: Plenum.

Langeheine, R., & Van de Pol, F. (1990a). A unifying framework for Markov modeling in discrete space and discrete time. Sociological Methods and Research, 18, 416-441.

Langeheine, R., & Van de Pol, F. (1990b). Veränderungsmessung bei kategorialen Daten [Models for measuring change in categorical data analysis]. Zeitschrift für Sozialpsychologie, 21, 88-100.

Langeheine, R., & Van de Pol, F. (1993). Multiple indicator Markov models. In R. Steyer, K.F. Wender, & K.F. Widaman (Eds.), Psychometric methodology. Proceedings of the 7th European Meeting of the Psychometric Society in Trier. Stuttgart: Fischer.

Lazarsfeld, P.F. (1950). The logical and mathematical foundation of latent structure analysis. In S.A. Stouffer, L. Guttman, E.A. Suchman, P.F. Lazarsfeld, S.A. Star, & J.A. Clausen (Eds.), Measurement and prediction. Princeton: Princeton University Press.

Lazarsfeld, P.F., & Henry, N.W. (1968). Latent structure analysis. Boston: Houghton-Mifflin.

Masters, G. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149-174.

Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Copenhagen: The Danish Institute for Educational Research.

Rost, J. (1989). Rasch models and latent class models for measuring change with ordinal variables. In R. Coppi & S. Bolasco (Eds.), Multiway data analysis. Amsterdam: North-Holland.

Rost, J., & Spada, H. (1983). Die Quantifizierung von Lerneffekten anhand von Testdaten [The quantification of learning effects from test data]. Zeitschrift für Differentielle und Diagnostische Psychologie, 4, 29-49.

Spada, H. (1973a). Denk- und Lernmodelle der Rasch-Messtheorie unter psychologischen, formalen und didaktischen Aspekten [Rasch measurement and models of thinking and learning: Psychological, formal and didactic aspects]. In H. Spada, P. Häussler, & W. Heyner (Eds.), Denkoperationen und Lernprozesse als Grundlage für lernzielorientierten Unterricht. Kiel: IPN-Arbeitsberichte No. 5.

Spada, H. (1973b). Die Analyse kognitiver Lerneffekte mit stichproben-unabhängigen Verfahren [The analysis of cognitive learning effects using sample-free methods]. In K. Frey & M. Lang (Eds.), Kognitionspsychologie und naturwissenschaftlicher Unterricht. Bern: Huber.

Spada, H. (1976). Modelle des Denkens und Lernens [Models of thinking and learning]. Bern: Huber.

Spada, H. (1977). Logistic models of learning and thought. In H. Spada & W.F. Kempf (Eds.), Structural models of thinking and learning. Bern: Huber.

Van de Pol, F., & Langeheine, R. (1990). Mixed Markov latent class models. In C.C. Clogg (Ed.), Sociological methodology 1990. Oxford: Blackwell.

The Author

ROLF LANGEHEINE holds the position of research associate at the Institute for Science Education at the University of Kiel, Department of Educational and Psychological Methodology. Over many years, his interest has been in the analysis of categorical data by loglinear and latent class models, with a focus during the past few years on the analysis of change by Markov chain models for sequential categorical data.