Knowledge-Based Systems 21 (2008) 852–855
An algorithm of constructing concept lattices for CAT with cognitive diagnosis

Yang Shuqun a,b,*, Ding Shuliang c, Cai Shengzhen b, Ding Qiulin a

a College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China
b Faculty of Software, Fujian Normal University, Fuzhou 350007, China
c Computer Information Engineering Institute, Jiangxi Normal University, Nanchang 330027, China

Article info
Article history: Received 22 December 2007; Accepted 30 March 2008; Available online 7 April 2008
Keywords: Concept lattices; Ea-matrix; CD–CAT
Abstract
Concept lattices have proved useful in many fields, such as machine learning and knowledge discovery, in which the construction of concept lattices has been studied extensively. A modification of Godin's algorithms is proposed. Taking attribute hierarchies into account, an algorithm is presented for computerized adaptive testing (CAT) with cognitive diagnosis (CD). Both algorithms are studied experimentally, and the algorithmic complexity of the algorithm for CD–CAT is studied theoretically (time complexity in the best case and in the worst case). With eight attributes, test results are generated for both algorithms. With 10 attributes, a quadratic regression is given for the CPU time of the algorithm for CD–CAT.
© 2008 Elsevier B.V. All rights reserved.
* Corresponding author. Address: College of Information Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China. Tel.: +86 2584895499. E-mail address: [email protected] (S. Yang).
doi:10.1016/j.knosys.2008.03.056

1. Introduction

As a conceptual clustering method, concept lattices have proved beneficial to machine learning and knowledge discovery [1–4] since Wille proposed them [5]. Algorithms for constructing concept lattices play an important role in formal concept analysis, and the incremental update algorithms proposed by Godin [6] are the most widely recognized of them. However, when the lattice is initialized with the two elements $(G, \emptyset)$ and $(\emptyset, M)$, where $G$ is the set of objects and $M$ is the set of attributes, Godin's algorithms have a flaw: they are not suitable for a formal context in which some attribute is shared by all objects. For example, in the formal context of Table 1, $a$ is an attribute that every object shares. The concept lattice obtained from Table 1 by Godin's algorithms is shown in Fig. 1. In Fig. 1, the set of attributes of the concept #0 $(\{1, 2, 3, 4, 5, 6\}, \emptyset)$ is $\emptyset$, which would mean that the intersection of the attribute sets of all objects is empty. In fact, the intersection is $\{a\}$. Hence, Godin's algorithms should be modified.

In the field of cognitive diagnosis, the rule space model (RSM) [7] is one of the classic models and the attribute hierarchy method (AHM) [8] is a newer method. Reduced Q-matrices play an important role in RSM and AHM. The Ea-matrix is the transpose of the reduced Q-matrix. It has been proved that Ea-matrices can serve as formal contexts
and the concept lattices from Ea-matrices are the models of CD–CAT [9]. An algorithm for constructing concept lattices for CD–CAT therefore deserves attention.

The paper is organized as follows. Section 2 gives the main definitions of formal concept analysis (FCA) and of the Ea-matrix. Section 3 proposes an algorithm for generating concept lattices for CD–CAT and gives its theoretical justification. Section 4 reports the results of an experimental comparison and gives statistical models for the algorithm and Godin's algorithms. With 10 attributes, we give a quadratic regression for the algorithm for CD–CAT.

2. Main definitions

2.1. Formal concept analysis

We introduce the standard FCA notation [10], which is used throughout the paper. A formal context is a triple of sets $(G, M, I)$, where $G$ is the set of objects, $M$ is the set of attributes, and $I \subseteq G \times M$. The star operator $*$ is defined by (1) and (2) below; it is order-reversing (antitone). Here, $P(A)$ stands for the power set of $A$.

$$\forall A \in P(G),\quad A^{*} = \{\, m \in M \mid gIm,\ \forall g \in A \,\} \in P(M) \tag{1}$$

$$\forall B \in P(M),\quad B^{*} = \{\, g \in G \mid gIm,\ \forall m \in B \,\} \in P(G) \tag{2}$$

It can be proved that $**$ is a closure operator. If $(A, B) \in P(G) \times P(M)$, $A^{*} = B$, and $B^{*} = A$, then $(A, B)$ is called a concept. Suppose that $(A_1, B_1)$ and $(A_2, B_2)$ are concepts; the order relation "$\le$" is defined by (3).
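The derivation operators (1) and (2) can be illustrated on the context of Table 1. The following is a minimal Python sketch, not the authors' implementation; the names `CONTEXT`, `star_objects`, `star_attributes` and `is_concept` are hypothetical and chosen only for this example.

```python
# Formal context of Table 1: object -> set of attributes it has.
CONTEXT = {
    1: {"a"},
    2: {"a", "b"},
    3: {"a", "c"},
    4: {"a", "c", "d"},
    5: {"a", "b", "c"},
    6: {"a", "b", "c", "d"},
}
G = frozenset(CONTEXT)   # all objects
M = frozenset("abcd")    # all attributes


def star_objects(A):
    """Eq. (1): the attributes shared by every object in A."""
    return frozenset(m for m in M if all(m in CONTEXT[g] for g in A))


def star_attributes(B):
    """Eq. (2): the objects that have every attribute in B."""
    return frozenset(g for g in G if all(m in CONTEXT[g] for m in B))


def is_concept(A, B):
    """(A, B) is a concept when A* = B and B* = A."""
    return star_objects(A) == frozenset(B) and star_attributes(B) == frozenset(A)


extent = star_attributes({"c"})   # objects that have attribute c
intent = star_objects(extent)     # {c}**, the closure of {c}
print(sorted(extent), sorted(intent))
print(is_concept(extent, intent), is_concept(G, set()))
```

Under these definitions the pair $(G, \emptyset)$ is not a concept of Table 1, because every object shares the attribute $a$.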
Table 1
A formal context

Object   a   b   c   d
1        1   0   0   0
2        1   1   0   0
3        1   0   1   0
4        1   0   1   1
5        1   1   1   0
6        1   1   1   1

3. Constructing concept lattices for CD–CAT
3.1. The modification of Godin's algorithm

In Godin's algorithms, given a formal context with object set $G$ and attribute set $M$, the concept lattice is initialized with the two elements $(G, \emptyset)$ and $(\emptyset, M)$. If $A$ is the set of objects that have all attributes, then $(\emptyset, M)$ should be modified to $(A, M)$. If the intersection of the attribute sets of all objects is $B$ and $B$ is not empty, then $(G, \emptyset)$ should be modified to $(G, B)$.

In Godin's algorithms, the lattice is updated each time an object is added. When an object $i$ with all attributes is added to the lattice, $(\emptyset, M)$ is modified to $(\{i\}, M)$, so $(\emptyset, M)$ is eventually modified to $(A, M)$. However, when the intersection $B$ of the attribute sets of all objects is not empty, $(G, \emptyset)$ is never modified to $(G, B)$, and this is what produces the wrong concept lattice. If the concept lattice is initialized with $(G, B)$ and $(\emptyset, M)$ instead, the result of Godin's algorithm is correct.

3.2. An algorithm of constructing concept lattices for CD–CAT
Fig. 1. The concept lattice from Table 1.
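For the context of Table 1, the corrected initialization discussed in Section 3.1 can be sketched as follows. This is a minimal Python illustration of the initial elements only, not of Godin's incremental update itself; the function names `initial_elements` and `objects_with_all_attributes` are hypothetical.

```python
# Formal context of Table 1: object -> set of attributes it has.
TABLE_1 = {
    1: {"a"},
    2: {"a", "b"},
    3: {"a", "c"},
    4: {"a", "c", "d"},
    5: {"a", "b", "c"},
    6: {"a", "b", "c", "d"},
}


def initial_elements(context, attributes):
    """Return the corrected initial elements (G, B) and (empty set, M)."""
    objects = frozenset(context)
    all_attrs = frozenset(attributes)
    # B: the attributes shared by every object.
    common = all_attrs
    for attrs in context.values():
        common &= attrs
    return (objects, common), (frozenset(), all_attrs)


def objects_with_all_attributes(context, attributes):
    """A: the objects that possess every attribute, so that the
    bottom element eventually becomes (A, M)."""
    all_attrs = frozenset(attributes)
    return frozenset(g for g, attrs in context.items() if all_attrs <= attrs)


top, bottom = initial_elements(TABLE_1, "abcd")
print(sorted(top[0]), sorted(top[1]))
print(sorted(objects_with_all_attributes(TABLE_1, "abcd")))
```

With this initialization the top element for Table 1 carries the shared attribute $a$ instead of the empty attribute set.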
$$(A_1, B_1) \le (A_2, B_2) \iff A_1 \subseteq A_2 \quad (\text{or } B_1 \supseteq B_2) \tag{3}$$
The set of all concepts of a formal context forms a complete lattice $L(G, M, I)$. The greatest lower bound and the least upper bound are defined by (4) and (5), respectively.

$$(A_1, B_1) \wedge (A_2, B_2) = (A_1 \cap A_2,\ (B_1 \cup B_2)^{**}) \tag{4}$$

$$(A_1, B_1) \vee (A_2, B_2) = ((A_1 \cup A_2)^{**},\ B_1 \cap B_2) \tag{5}$$
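The bounds (4) and (5) can be computed directly from the two derivation operators. The following is a minimal Python sketch on the context of Table 1; the helper names `intent_of`, `extent_of`, `concept_from_attributes`, `meet` and `join` are hypothetical.

```python
# Formal context of Table 1: object -> set of attributes it has.
CONTEXT = {1: {"a"}, 2: {"a", "b"}, 3: {"a", "c"},
           4: {"a", "c", "d"}, 5: {"a", "b", "c"}, 6: {"a", "b", "c", "d"}}
G = frozenset(CONTEXT)
M = frozenset("abcd")


def intent_of(objs):
    """A*: the attributes shared by every object in objs."""
    return frozenset(m for m in M if all(m in CONTEXT[g] for g in objs))


def extent_of(attrs):
    """B*: the objects that have every attribute in attrs."""
    return frozenset(g for g in G if all(m in CONTEXT[g] for m in attrs))


def concept_from_attributes(attrs):
    """The concept (B*, B**) generated by an attribute set B."""
    extent = extent_of(attrs)
    return extent, intent_of(extent)


def meet(c1, c2):
    """Eq. (4): greatest lower bound of two concepts."""
    (a1, b1), (a2, b2) = c1, c2
    return a1 & a2, intent_of(extent_of(b1 | b2))


def join(c1, c2):
    """Eq. (5): least upper bound of two concepts."""
    (a1, b1), (a2, b2) = c1, c2
    return extent_of(intent_of(a1 | a2)), b1 & b2


c_ab = concept_from_attributes({"a", "b"})
c_ac = concept_from_attributes({"a", "c"})
for extent, intent in (meet(c_ab, c_ac), join(c_ab, c_ac)):
    print(sorted(extent), sorted(intent))
```

Each concept is represented as an (extent, intent) pair of frozensets, so both bounds are again concepts of the same form.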
Suppose that $(A_1, B_1)$ and $(A_2, B_2)$ are concepts with $(A_1, B_1) \le (A_2, B_2)$, and that there is no concept $(A_3, B_3)$ such that $(A_1, B_1) \ne (A_3, B_3)$, $(A_2, B_2) \ne (A_3, B_3)$, and $(A_1, B_1) \le (A_3, B_3) \le (A_2, B_2)$. Then $(A_1, B_1)$ is called a lower neighbor of $(A_2, B_2)$, and $(A_2, B_2)$ is called an upper neighbor of $(A_1, B_1)$.

2.2. Ea-matrix

Ea-matrices are derived from attribute hierarchies. A hierarchy describes the dependency relations between the attributes. For example, in Fig. 2, $a, b, c, d$ are cognitive attributes (attributes for short); $a$ is prerequisite to $b$ and $c$, and $c$ is prerequisite to $d$. If an attribute hierarchy has $n$ attributes, then the corresponding reduced Q-matrix is an $n \times m$ matrix, where $m$ is the number of items [7]. Every column (i.e., item) is a vector of 0s and 1s. If the $j$th element is 1, then the item has the $j$th attribute. For example, $(1\ 0\ 1\ 1)^T$ is a column of the reduced Q-matrix from Fig. 2, and it also expresses the set $\{a, c, d\}$. Every item must respect the attribute hierarchy; e.g., $(1\ 1\ 0\ 1)^T$ is not a column of the reduced Q-matrix because it contains $d$ without its prerequisite $c$. The Ea-matrix is the transpose of the reduced Q-matrix. The Ea-matrix from Fig. 2 is shown in Table 1.
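The relation between the hierarchy of Fig. 2 and Table 1 can be checked by brute force. The sketch below is an illustration only: it enumerates every non-empty attribute set that respects the prerequisites, and does not reproduce the augment algorithm used in the paper. The prerequisite map `PREREQS` is an assumption read off the description of Fig. 2, and the function names are hypothetical.

```python
from itertools import combinations

# Direct prerequisites, as described for Fig. 2 (assumed): a -> b, a -> c, c -> d.
PREREQS = {"a": set(), "b": {"a"}, "c": {"a"}, "d": {"c"}}
ATTRS = sorted(PREREQS)


def satisfies_hierarchy(item):
    """True when every attribute in item has all its direct prerequisites in item."""
    return all(PREREQS[x] <= item for x in item)


def ea_rows():
    """All non-empty attribute sets that respect the hierarchy, as 0/1 rows."""
    rows = []
    for size in range(1, len(ATTRS) + 1):
        for combo in combinations(ATTRS, size):
            if satisfies_hierarchy(set(combo)):
                rows.append(tuple(int(x in combo) for x in ATTRS))
    return rows


for row in ea_rows():
    print(row)
```

Checking only direct prerequisites is enough here, because the check is applied to every attribute of the candidate set, so indirect prerequisites are covered as well.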
Before presenting the algorithm for CD–CAT, we discuss some characterizations of the Ea-matrix and of the concept lattices derived from it. Throughout this paper, we denote the Ea-matrix by $E$ and the attribute hierarchy by $S$. Let $\emptyset$ be a row in which every element is zero, whose length is clear from the context. Clearly, $\emptyset$ cannot express an item. Every row of $E$ can be expressed as a set of attributes (e.g., 1 0 1 0 stands for the attribute set $\{a, c\}$ in Table 1). Hence, $\emptyset$ also stands for the empty set.

Definition 1. An attribute is called a start attribute or start node if and only if no other attribute is prerequisite to it.

Because every row of $E$ must satisfy the prerequisites of the attribute hierarchy, every row of $E$ can be expressed by a sub-graph of $S$ that meets the prerequisites. For example, the Ea-matrix from Fig. 2 (i.e., Table 1) has six rows, so there are six corresponding sub-graphs of Fig. 2. The row (1 0 1 1) corresponds to the sub-graph $(\{a, c, d\}, \{(a, c), (c, d)\})$, in which $\{a, c, d\}$ is the set of vertices and $\{(a, c), (c, d)\}$ is the set of edges.

Property 1. A sub-graph of $S$ expresses a row of $E$ if and only if, for every attribute $i$ in the sub-graph, all nodes prerequisite to $i$ are in the sub-graph.

Proof. We first prove necessity. Let $S_1$ be a sub-graph of $S$ that expresses a row of $E$. For every node $x$ in $S_1$, if $y$ is a prerequisite node of $x$ in $S$, then $y$ must be in $S_1$; otherwise $S_1$ could not express an item, which contradicts the meaning of the Ea-matrix. We next prove sufficiency. Because a row is required only to satisfy the prerequisites of the attribute hierarchy, sufficiency follows immediately. This completes the proof. □

Property 2. If $E$ is expressed by $(a_{ij})_{m \times n}$, then the rows of $E$ can be arranged to satisfy

$$\sum_{j=1}^{n} a_{(i+1)j} - \sum_{j=1}^{n} a_{ij} = 0 \quad \text{or} \quad \sum_{j=1}^{n} a_{(i+1)j} - \sum_{j=1}^{n} a_{ij} = 1$$

for $i \in \{1, \dots, m-1\}$.
Fig. 2. Attributes hierarchy.
Proof. $S$ itself obviously expresses a row of $E$. If a node of $S$ that is not prerequisite to any other node is deleted and the rest of $S$ is called $S_1$, then $S_1$ still satisfies the prerequisites in $S$, so $S_1$ also expresses a row of $E$. The remaining cases follow by analogy. Hence, for every
$i \in \{1, 2, \dots, n\}$, there exists at least one row of $E$ whose elements sum to $i$. If the rows are ordered by the sum of their elements, the rows of $E$ therefore satisfy the stated condition. This completes the proof. □

Lemma 1. If $E$ is derived from an attribute hierarchy, then the join (union) of every pair of rows of $E$ is still a row of $E$, and the intersection of every pair of rows of $E$ is a row of $E$ or $\emptyset$. That is, $E$ is closed under the join operation, and $E \cup \{\emptyset\}$ is closed under the intersection operation.

Proof. Suppose that $r_1$ and $r_2$ are a pair of rows of $E$. Then $r_1 \ne \emptyset$ and $r_2 \ne \emptyset$, so $r_1 \cup r_2 \ne \emptyset$. For every $i \in r_1 \cup r_2$, if an attribute node $k$ is prerequisite to $i$, then $k \in r_1$ or $k \in r_2$ according to Property 1, so $k \in r_1 \cup r_2$. From Property 1, $r_1 \cup r_2$ is a row of $E$. If $r_1 \cap r_2 \ne \emptyset$, then for every $i \in r_1 \cap r_2$ and every node $k$ prerequisite to $i$, we have $k \in r_1$ and $k \in r_2$, so $k \in r_1 \cap r_2$. From Property 1, $r_1 \cap r_2$ is a row of $E$. This completes the proof. □

If every row of $E$ is regarded as an object, then $E$ is a formal context and gives rise to a concept lattice. In fact, Table 1 is the formal context from Fig. 2, and Fig. 1 is the concept lattice from Table 1. In particular, a concept lattice derived from an Ea-matrix is called $L(Ea)$.

Theorem 1. Suppose $S$ is an attribute hierarchy, $E$ is derived from $S$, and $L(Ea)$ is the concept lattice from the Ea-matrix. If $S$ has one start node, then the set of concepts is $\{(i^{**}, i^{*}) \mid i \text{ is an object of } E\}$. If $S$ has multiple start nodes, then the set of concepts is $\{(i^{**}, i^{*}) \mid i \text{ is an object of } E\} \cup \{(\emptyset^{*}, \emptyset)\}$.

Proof. Suppose $i$ is an object of $E$ and $B$ is the set of attributes of $i$; then $(B^{*}, B) \in L(Ea)$. Let $B_1$ and $B_2$ be the attribute sets of an arbitrary pair of objects of $E$. According to Lemma 1, $B_1 \cup B_2$ is a row of $E$, so there is an object whose set of attributes is $B_1 \cup B_2$, and hence $(B_1 \cup B_2)^{**}$ equals $B_1 \cup B_2$. Then $((B_1 \cup B_2)^{*}, (B_1 \cup B_2)^{**})$ is a concept of $L(Ea)$. According to Eq. (4), it is the greatest lower bound of $(B_1^{*}, B_1)$ and $(B_2^{*}, B_2)$. Hence, the set of attributes of the greatest lower bound of $(B_1^{*}, B_1)$ and $(B_2^{*}, B_2)$ is still a row of $E$.

Case 1. If $S$ has one start node, then $B_1 \cap B_2 \ne \emptyset$. According to Lemma 1, $B_1 \cap B_2$ is a row of $E$, so by Eq. (5) the set of attributes of the least upper bound of $(B_1^{*}, B_1)$ and $(B_2^{*}, B_2)$ is still a row of $E$. Therefore the objects and the concepts of $L(Ea)$ are in one-to-one correspondence. The first result is proved.

Case 2. If $S$ has multiple start nodes, then $B_1 \cap B_2 \ne \emptyset$ or $B_1 \cap B_2 = \emptyset$. If $B_1 \cap B_2 \ne \emptyset$, the result follows as in Case 1. If $B_1 \cap B_2 = \emptyset$, then $(\emptyset^{*}, \emptyset)$ is the least upper bound of $(B_1^{*}, B_1)$ and $(B_2^{*}, B_2)$. The second result is proved. This completes the proof. □
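Theorem 1 means that the concepts of $L(Ea)$ can be read directly off the rows of $E$, without forming new concepts. The following Python sketch builds the lattice in that way, under two assumptions that are made for this example only: the caller states whether the hierarchy has multiple start nodes (the `multi_start` flag), and a concept is linked to every row that is a proper subset with exactly one attribute fewer, as Theorem 2 states for neighbors. The name `build_lattice` is hypothetical.

```python
# Rows of Table 1 written as attribute strings: object -> attributes.
ROWS = {1: "a", 2: "ab", 3: "ac", 4: "acd", 5: "abc", 6: "abcd"}


def build_lattice(rows, multi_start=False):
    """Return (concepts, parents).

    concepts maps each intent to its extent; parents maps each intent to
    the intents of its upper neighbors (one attribute fewer).
    """
    intents = {frozenset(attrs) for attrs in rows.values()}
    if multi_start:
        intents.add(frozenset())   # the extra all-zero row
    # Sort by number of attributes, then alphabetically for stable output.
    ordered = sorted(intents, key=lambda s: (len(s), sorted(s)))
    concepts = {
        b: frozenset(g for g, attrs in rows.items() if b <= set(attrs))
        for b in ordered
    }
    parents = {
        b: [p for p in ordered if len(p) == len(b) - 1 and p < b]
        for b in ordered
    }
    return concepts, parents


concepts, parents = build_lattice(ROWS)
for intent, extent in concepts.items():
    ups = ["".join(sorted(p)) for p in parents[intent]]
    print("".join(sorted(intent)), sorted(extent), ups)
```

The extent of each concept is obtained in one pass over the rows, and the only comparisons made are between rows whose sizes differ by one.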
Theorem 2. Suppose $S$ is an attribute hierarchy, $E$ is derived from $S$, and $L(Ea)$ is the concept lattice from $E$. If $(A_1, B_1)$ and $(A_2, B_2)$ are concepts of $L(Ea)$, and $(A_1, B_1)$ is an upper neighbor of $(A_2, B_2)$, then $B_1 \subseteq B_2$ and $|B_2| - |B_1| = 1$. The cardinality of a set $X$ is denoted by $|X|$.

Proof. According to the properties above, Theorem 2 is easily proved.

According to Theorem 1 and Theorem 2, the algorithm for constructing concept lattices for CD–CAT is easily obtained. The algorithm is as follows:

Program (Ea-matrix)
{
  Judge whether S, corresponding to the Ea-matrix, has one start node or multiple start nodes;
  If (S has multiple start nodes)
    Add $\emptyset$ as a row of the Ea-matrix;
  Compute the number of attributes of every row of the Ea-matrix;
  Sort all rows of the Ea-matrix by the number of attributes;
  For every object i, if the number of its attributes is n,
    search for an object m such that $m^{*} \subseteq i^{*}$ and $|m^{*}| = n - 1$;
    Make $(m^{**}, m^{*})$ the parent concept of $(i^{**}, i^{*})$;
}

4. The analysis of the algorithm for CD–CAT

To gain some understanding of the relative behavior of the algorithm for CD–CAT proposed in Section 3 and Godin's algorithm, both were implemented in Matlab 6.5.1 on Microsoft Windows XP Professional (2002) with a Pentium(R) 4 CPU at 1.70 GHz and 512 MB of memory.

If the number of rows of the Ea-matrix is $m$, then the space complexity is $m$ (with one start node) or $m + 1$ (with multiple start nodes). The time spent iterating to create the concept lattice is the major factor in the complexity of the algorithm. Suppose $n$ is the number of attributes. The best-case time complexity is $O(n - 1)$, which comes from the special attribute hierarchy that is linear. The worst-case time complexity is $O\left(\sum_{i=1}^{n-1} C_n^{i} C_n^{i+1}\right)$, which comes from the special attribute hierarchy whose attributes are independent. Here

$$C_n^{m} = \frac{\prod_{i=0}^{m-1} (n - i)}{m!}.$$

The first advantage of the algorithm for CD–CAT is that the database is scanned only once. The second advantage is that, unlike other algorithms for constructing concept lattices, it need not spend any time on forming new concepts.

Two types of comparison were performed: the first is based on Godin's algorithm and the second on the algorithm for CD–CAT. The experiment was designed as follows. First, we randomly generate an adjacency matrix of order 8 and then compute the reachability matrix. Second, we obtain the reduced Q-matrix by the augment algorithm that we proposed earlier [Shuliang Ding, Fen Luo, Haijing Lin, Xiangbo Wang, Complement to Q matrix theory, Presentation at International Meeting of the Psychometric Society 2007, Tokyo, Japan]. The Ea-matrix is the transpose of the reduced Q-matrix. As a formal context, the Ea-matrix is the input of the two algorithms and the concept lattice is the output. Third, we measure the CPU time of the two algorithms. For every formal context, each algorithm was run 100 times and the mean CPU time was recorded. Finally, the mean CPU times are compared. Fig. 3 shows the CPU time (in
seconds) obtained for simulations with 8 attributes using Godin's algorithm and the algorithm for CD–CAT.

Fig. 3. CPU time for data elements with eight attributes.

To show the empirical complexity of the algorithm for CD–CAT, Fig. 4 gives a quadratic regression of the CPU time of the algorithm for CD–CAT with 10 attributes. The quadratic regression is as follows:

$$y = 0.4552 - 8.3888 \times 10^{-3}\, x + 0.4041 \times 10^{-4}\, x^{2}$$

Fig. 4. Regression for CPU time.

5. Conclusion and perspectives

In this work, an algorithm for generating the concept lattices of CD–CAT was proposed and analyzed. The characterizations of the Ea-matrix, on which the strategies of the algorithm are based, were discussed. Empirical comparisons with Godin's algorithms were carried out to see whether the algorithm for CD–CAT is worth considering. The empirical evidence showed that it performed very well in generating concept lattices with 8 attributes. For 10 attributes, a quadratic regression was given to evaluate the relationship between CPU time and the number of concepts. The resulting algorithm can be used to obtain the models of CD for CAT. Hence, the perspective of this work is to apply this approach to the design of CD–CAT.

Acknowledgments

The work was supported by the Natural Science Foundation of China (Grant No. 60263005), the Natural Science Foundation of China (Grant No. 60673014), and the Natural Science Foundation of Fujian province (Grant No. 2007J0178).

References
[1] P. Valtchev, R. Missaoui, R. Godin, Formal Concept Analysis for Knowledge Discovery and Data Mining: The New Challenges, Springer-Verlag, Berlin Heidelberg, 2004.
[2] C.T. Lin, C.-B. Chen, W.-H. Wu, Fuzzy clustering algorithm for latent class model, Statistics and Computing 14 (2004) 299–310.
[3] R. Cole, P. Eklund, G. Stumme, Document retrieval for e-mail search and discovery using formal concept analysis, Applied Artificial Intelligence 17 (3) (2003) 257–280.
[4] S.O. Kuznetsov, Machine learning on the basis of formal concept analysis, Automation and Remote Control 62 (10) (2001) 1543–1564.
[5] R. Wille, Restructuring lattice theory: an approach based on hierarchies of concepts, in: I. Rival (Ed.), Ordered Sets, Reidel, Dordrecht, Boston, 1982, pp. 445–470.
[6] R. Godin, R. Missaoui, H. Alaoui, Incremental concept formation algorithms based on Galois (concept) lattices, Computational Intelligence 11 (2) (1995) 247–267.
[7] K.K. Tatsuoka, Computerized cognitive diagnostic adaptive testing: effect on remedial instruction as empirical validation, Journal of Educational Measurement 34 (1) (1997) 3–20.
[8] J.P. Leighton, M.J. Gierl, S.M. Hunka, The attribute hierarchy method for cognitive assessment: a variation on Tatsuoka's rule-space approach, Journal of Educational Measurement 41 (3) (2004) 205–237.
[9] S. Yang, S. Ding, X. Zhu, Q. Ding, The design and realization of CAT with cognitive diagnosis based on formal concept analysis, in: Proceedings of the Third International Conference on Natural Computation, 2007.
[10] B. Ganter, R. Wille, Formal Concept Analysis: Mathematical Foundations, Springer, Berlin, 1999.