A novel Supervised Competitive Learning algorithm

Qun Dai*, Gang Song

College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

* Corresponding author. Tel.: +86 25 84593038; fax: +86 25 84498069. E-mail address: [email protected] (Q. Dai).

Article history: Received 9 July 2015; received in revised form 1 January 2016; accepted 4 January 2016. Communicated by R.W. Newcomb.

Abstract: Competitive learning is a mechanism well suited to the learning paradigm of regularity detection, and it is typically unsupervised. In this work, however, a novel Supervised Competitive Learning (SCL) algorithm is proposed for the generation of Multiple Classifier Systems (MCSs), and it is substantially supervised. The SCL algorithm seeks to strengthen simultaneously both the accuracy of and the diversity among the base classifiers of an MCS, in a supervised and competitive manner. Our inspiration for the development of SCL comes intuitively from modern education concepts and from the classical competitive learning algorithms. The experimental study of this work shows that the SCL algorithm effectively improves the classification and generalization performance of the constructed MCSs. © 2016 Elsevier B.V. All rights reserved.

Keywords: Competitive learning; Supervised Competitive Learning (SCL) algorithm; Multiple Classifier Systems (MCSs); Ordinary Supervised Learning (OSL) algorithm; Pattern classification

1. Introduction

Competition in neural networks means that, given an input pattern, the processing elements of a network compete for the "resources", such as the output [1–4]. For every input pattern, all the processing elements generate an output, but only the "most desirable" output is adopted, and only the winning processing element is updated. Competition is needed because it produces specialization in the network: through competition, the processing elements adapt to, and specialize in, different regions of the pattern space. In many situations resources are limited, and competition recreates these natural constraints in the environment [1–4].

In the past two decades, Multiple Classifier Systems (MCSs) have become a well-established research direction [5]. Intuitively, a critical factor in the successful construction of an MCS is that its base classifiers exhibit diversified performance [5]. Researchers have developed abundant algorithms to build superior MCSs by simultaneously exploiting both the accuracy of the base classifiers and the diversity among them [5–15]. In this work, we propose a novel Supervised Competitive Learning (SCL) algorithm for the generation of MCSs. SCL fundamentally differs from the classical competitive learning algorithms in that it is substantially supervised, and it seeks to enhance both the accuracy of and the diversity among the base classifiers of an MCS at the same time, in a supervised and competitive manner.

The rest of this paper is organized as follows. Section 2 describes the proposed SCL algorithm for the generation of diversified MCSs in detail. Section 3 briefly reviews related research on the diversity property of MCSs and on diversity measures for MCSs. Section 4 reports the results of the experimental study. Finally, conclusions are drawn in Section 5.
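For contrast with the supervised scheme developed in this paper, the following is a minimal sketch of a classical unsupervised competitive learning update (winner-take-all prototype learning in the spirit of [1,16,17]); the prototype count, learning rate and epoch count are illustrative choices, not values from the paper.

```python
import numpy as np

def competitive_learning(X, n_prototypes=4, lr=0.1, epochs=20, seed=0):
    """Classical winner-take-all competitive learning (unsupervised).

    For each input pattern the prototypes compete; only the closest
    (winning) prototype is updated, pulled toward the input. No class
    labels are used anywhere.
    """
    rng = np.random.default_rng(seed)
    # Initialize prototypes on randomly chosen input patterns.
    W = X[rng.choice(len(X), size=n_prototypes, replace=False)].astype(float)
    for _ in range(epochs):
        for x in X:
            winner = np.argmin(np.linalg.norm(W - x, axis=1))  # competition
            W[winner] += lr * (x - W[winner])                  # only the winner learns
    return W
```

SCL keeps the winner-take-all competition but, as described in Section 2, replaces the unsupervised distance criterion with a supervised approximation error, and the competing units are classifiers rather than prototypes.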

2. A novel Supervised Competitive Learning (SCL) algorithm for MCSs

2.1. Our main ideas about the proposed SCL algorithm for MCSs

In this work, we propose a novel SCL algorithm for the generation of diversified MCSs that is substantially supervised. Our inspiration for the development of SCL comes intuitively from modern education concepts and from the classical competitive learning algorithms. Firstly, training and building an excellent MCS is much like talent cultivation. In pedagogical practice, it is reasonable and indispensable to teach groups of students in accordance with their aptitude and special characteristics, so that students give full play to their potential, make up for their individual deficiencies, develop an interest in learning, and strengthen their confidence, thus finally promoting their all-round development.
Similarly, with the SCL algorithm, MCSs are constructed in a supervised and competitive manner, in accordance with their "aptitude", that is, with their classification capability.

Secondly, our inspiration for SCL also comes from the rationale of the classical competitive learning algorithms. A competition mechanism is required in the building of MCSs because specialization of the base classifiers is obtained through competition. In the context of an MCS, specialization means that, through competition, the base classifiers learn and specialize in different regions of the hypothesis space. Viewed another way, specialization of the individual base classifiers can be understood as one concrete realization of MCS diversity. And, as in the classical competitive learning algorithms [1,16,17], resources in an MCS are restricted, so the learning opportunities of the individual base classifiers are restricted as well: the classifiers must compete with each other to acquire the opportunity to learn.

Analogous to the teaching method that instructs students in accordance with their aptitude, and to the classical competitive learning algorithms, our proposed SCL algorithm builds MCSs competitively, based upon their "aptitude", that is, their classification capability. Specifically, in the initialization stage, a set of base classifiers is initialized randomly. Next, in the preliminary training stage, each base classifier is grounded on the whole training dataset for a certain number of preliminary training epochs. Then, in the competitive learning stage, all the base classifiers are trained competitively: intensive training is carried out only for the winner classifiers that stand out from the competition. Finally, after the former three stages, the final testing stage is implemented, in which the MCS is tested on the testing dataset.

Concerning the implementation of the competitive learning stage: after the preliminary training stage has been completed, for each training sample the classification capability of each base classifier on that sample is evaluated in a supervised manner. The base classifiers with sufficient classification capability on a training sample are regarded as "winners" of the competition for that specific sample. The "winner takes all" strategy is employed in our SCL algorithm: only the winning classifiers of the competition for a specific training sample have the chance to learn it, that is, to be trained with it. Conversely, the losing classifiers get no opportunity to learn it, and likewise no chance to forget what they have already learned.

Satisfyingly, the proposed SCL algorithm effectively boosts the diversity of the generated MCS and, consequently, significantly improves its pattern classification performance. This result may be attributed to the important role that the property of diversity plays in an MCS: the performance of an MCS depends not only on the power of the individual classifiers in the system, but also on the independence among them [18,19]. The effectiveness of the SCL algorithm can be explained by the fact that it simultaneously enhances both the power of the individual classifiers and the diversity among them. Equivalently, the SCL algorithm effectively enhances the "beneficial diversity" among the classifiers in the MCS, where "beneficial diversity" means the diversity of an MCS that truly improves its classification performance. As pointed out by Kuncheva, diversity is generally beneficial, but it is not a substitute for accuracy [19]. Therefore, the key to the classification performance of an MCS is its "beneficial diversity".

In the preliminary training stage of the SCL algorithm, fundamental training of all the constituent classifiers is performed, laying a solid foundation for the recognition capability of the entire MCS and for the successful implementation of the subsequent competitive learning stage. In the competitive learning stage, selective reinforcement learning is applied to the winners of the competition, which simultaneously boosts the power of each individual classifier and the diversity among them. In fact, the competitive learning stage of SCL can be understood as a final refinement and reinforcement of the MCS, indispensable to the further improvement of its classification performance.

Besides, while effectively enhancing the diversity and classification performance of the generated MCS, the SCL algorithm avoids the need for ensemble selection. Although the paradigm of ensemble selection is very influential in the field of ensemble learning [20–33], developing an ensemble selection algorithm with superior performance remains a rather difficult task; indeed, the ensemble selection problem has been proven NP-complete [34,35]. Moreover, although ensemble selection can reduce memory requirements and computational costs, improving the efficiency of decision making, a noteworthy weakness of almost all existing ensemble selection algorithms is that they curtly abandon all the classifiers not selected into the pruned ensemble, wasting useful resources and information. From this point of view, the SCL algorithm can be considered superior to ensemble selection algorithms in these respects.

Our SCL algorithm belongs to the family of methods that manipulate the training data: since the winning classifiers to be trained on a specific training sample are determined competitively, different base classifiers end up being trained on different training data in the competitive learning stage. Moreover, according to the taxonomy drawn by the authors of [36], our algorithm belongs to the methods that vary the set of hypotheses accessible to the MCS. As to whether SCL is an explicit or an implicit diversity method, we might say that it falls in between. It is somewhat similar to Bagging [6]; however, Bagging randomly samples the training patterns to produce a different training set for each member network, whereas our algorithm competitively, rather than randomly, selects a subset of classifiers for each training sample. It is also similar to Boosting [8] in some ways; however, Boosting directly and explicitly manipulates the training data distribution to ensure some form of diversity in the MCS, which is fundamentally different from our algorithm. To be exact, our SCL algorithm is a heuristic diversity method. The formal description of the proposed SCL algorithm for MCSs is given in the following sections.

2.2. The algorithm description of the proposed SCL algorithm for MCSs

2.2.1. The initialization of our MCSs with ICBP as the base model

For the sake of originality, effectiveness, ease of implementation and simplicity, we adopt the Improved Circular Back-Propagation (ICBP) neural network model, proposed in one of our previous works [37], as the base classifier in our MCSs; our proposed SCL algorithm generalizes directly and easily to MCSs with any other type of base learning model. The initialization of our MCSs with ICBP as the base model is realized exactly as the initialization of the n-Bits Binary Coding ICBP Ensemble System (nBBC-ICBP-ES) [15]. For details about the ICBP network model and nBBC-ICBP-ES, please refer to our published works [37,15].


2.2.2. The main principle of the proposed SCL algorithm for MCSs

As illustrated in Fig. 1, the main process of an Ordinary Supervised Learning (OSL) algorithm for MCSs can be divided into three stages: (1) the initialization stage of the MCS; (2) the ordinary training stage of the MCS; and (3) the final testing stage of the MCS. In contrast, as illustrated in Fig. 2, the main process of the SCL algorithm can be divided into four stages: (1) the initialization stage of the MCS; (2) the preliminary training stage of the MCS; (3) the competitive learning stage of the MCS; and (4) the final testing stage of the MCS.

Fig. 1. The flow diagram of the Ordinary Supervised Learning (OSL) algorithm for MCSs.

Fig. 2. The flow diagram of the SCL algorithm.

With the SCL algorithm, an MCS is constructed in a supervised and competitive way, according to the "aptitude", that is, the classification capability, of its constituent base models. In detail, in the initialization stage, a set of base classifiers is initialized randomly; specifically, in this work, the MCS is initialized as the initialized nBBC-ICBP-ES with ICBPs as its base models. Next, in the preliminary training stage, basic training is performed: each base classifier is preliminarily trained on the entire training dataset for a certain number of preliminary training epochs. This preliminary training stage is important and indispensable to the successful implementation of SCL, as it lays a firm foundation for the recognition ability of the whole MCS and for the effective implementation of the subsequent competitive learning stage.

Then, in the competitive learning stage, all the base classifiers are trained competitively. For each training sample, the classification capability of each base classifier on that sample is evaluated in a supervised way. The base classifiers with sufficiently powerful classification capability on a training sample are regarded as "winners" of the competition for that specific sample, and the "winner takes all" strategy is implemented: only the winning classifiers of the competition for a specific training sample get the opportunity to learn it, that is, to be trained with it. More specifically, the classification capability of a base classifier on a specific training sample is evaluated according to the supervised approximation error it makes on that sample: the smaller the supervised approximation error a classifier makes, the stronger its classification capability.

The detailed implementation of the competitive learning stage is shown in Fig. 3. First, for each training sample, the approximation errors made on it by all the base classifiers in the MCS are calculated. Next, all the base classifiers in the MCS are ranked according to their approximation errors on the specific training sample. Then, the classifiers ranked in the top nwc places become the winners, where the variable nwc denotes the number of winning classifiers in each competition. nwc is a crucial parameter of the SCL algorithm and should be preset appropriately. If nwc is set too small, the classification performance of the entire MCS may be weakened, since too few classifiers become winners in the competitive learning stage and get the opportunity to be trained on the training sample under processing. However, nwc should not be set too large either: as nwc grows toward the total number of member networks in the MCS, the SCL algorithm gradually degrades into a common supervised training algorithm that trains each member network on all the training samples for a certain number of training epochs. In other words, when nwc becomes too large, the capability of SCL to enhance MCS diversity weakens, and the whole SCL algorithm degrades into an Ordinary Supervised Learning (OSL) algorithm for the MCS.

Fig. 3. The sorting list of base classifiers based on their approximation errors on a specific training sample. If the number of winning classifiers, i.e., the variable nwc of the SCL algorithm, is set to three, the first three classifiers in the sorted list, i.e., Net#i1, Net#i2, and Net#i3, become the winners of this competition and get the opportunity to be trained with the specific training sample.
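To make the stages concrete, here is a minimal sketch of the SCL training procedure under stated assumptions: the base learners are stand-ins exposing a partial_fit(x, t) single-sample training step and a predict(x) output method (both hypothetical interfaces; the paper's base models are ICBP networks, not reproduced here), and the supervised approximation error is taken as the squared error between a classifier's output and the target coding.

```python
import numpy as np

def scl_train(classifiers, X, y_coded, pre_epochs=10, comp_epochs=10, nwc=3):
    """Supervised Competitive Learning for an MCS (sketch).

    classifiers : base learners, each exposing partial_fit(x, t) (one
                  training step on a single sample) and predict(x).
    X           : (N, d) array of training inputs.
    y_coded     : (N, c) array of target output codings.
    nwc         : number of winning classifiers per competition.
    """
    # Preliminary training stage: every classifier sees every sample.
    for _ in range(pre_epochs):
        for x, t in zip(X, y_coded):
            for clf in classifiers:
                clf.partial_fit(x, t)

    # Competitive learning stage: winner-takes-all per training sample.
    for _ in range(comp_epochs):
        for x, t in zip(X, y_coded):
            # Supervised approximation error of each classifier on this sample.
            errors = [np.sum((clf.predict(x) - t) ** 2) for clf in classifiers]
            # The nwc classifiers with the smallest errors win the competition...
            winners = np.argsort(errors)[:nwc]
            # ...and only the winners are trained on this sample; the losers
            # neither learn it nor get the chance to forget what they know.
            for i in winners:
                classifiers[i].partial_fit(x, t)
    return classifiers
```

Unlike Bagging, which assigns each member a random bootstrap sample of the data, the per-sample competition above assigns training data to members by their current competence.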

3. Diversity measures for MCSs

Diversity can be interpreted from different angles, such as independence, orthogonality or complementarity [38,39]. In the following, four diversity measures are reviewed, which were proposed independently by different researchers in the literature [5].

3.1. The disagreement measure

In 1996, Skalak [40] proposed the disagreement measure to evaluate the diversity between two base classifiers; Ho [11] also employed it to evaluate the diversity in a decision forest. The measure is based on the intuition that two diverse classifiers perform differently on the same training sample. Given two base classifiers $C_i$ and $C_j$, let $num(a,b)$ be the number of training samples on which the oracle outputs of $C_i$ and $C_j$ are $a$ and $b$, respectively. The disagreement between the two base classifiers is then measured by

$$dis_{i,j} = \frac{num(1,-1) + num(-1,1)}{num(-1,-1) + num(-1,1) + num(1,-1) + num(1,1)} \tag{1}$$

The disagreement within the whole set of base classifiers is calculated by averaging over all pairs of base classifiers:

$$dis = \frac{2}{N_C (N_C - 1)} \sum_{i=1}^{N_C} \sum_{j=i+1}^{N_C} dis_{i,j} \tag{2}$$

where $N_C$ denotes the total number of base classifiers. Since, for any pair of base classifiers, $num(-1,-1) + num(-1,1) + num(1,-1) + num(1,1) = N_{Tr}$, where $N_{Tr}$ denotes the total number of training samples, it follows that

$$dis = \frac{2}{N_{Tr} N_C (N_C - 1)} \sum_{i=1}^{N_C} \sum_{j=i+1}^{N_C} \bigl( num_{i,j}(1,-1) + num_{i,j}(-1,1) \bigr) \tag{3}$$

The diversity among the set of base classifiers increases as the value of the disagreement measure increases [5].
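As an illustration, the following sketch computes Eq. (3) from a matrix of oracle outputs; the representation (entry +1 when classifier i labels sample k correctly, -1 otherwise) follows the definitions above, while the function and variable names are mine.

```python
import numpy as np
from itertools import combinations

def disagreement(oracle):
    """Average pairwise disagreement measure, Eq. (3).

    oracle : (N_C, N_Tr) array with entries +1 (correct) / -1 (wrong).
    """
    n_c, n_tr = oracle.shape
    total = 0.0
    for i, j in combinations(range(n_c), 2):
        # num_{i,j}(1,-1) + num_{i,j}(-1,1): samples on which the pair disagrees.
        total += np.sum(oracle[i] != oracle[j])
    return 2.0 * total / (n_tr * n_c * (n_c - 1))

# Example: three classifiers, five training samples.
oracle = np.array([[ 1,  1, -1,  1, -1],
                   [ 1, -1, -1,  1,  1],
                   [-1,  1,  1,  1, -1]])
print(disagreement(oracle))  # higher values indicate more diversity
```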

3.2. The double-fault measure

The double-fault measure was proposed by Giacinto and Roli [41] to select the most independent classifiers from a set of base classifiers. The double-fault measure between a pair of base classifiers is calculated by

$$DF_{i,j} = \frac{num(-1,-1)}{num(-1,-1) + num(-1,1) + num(1,-1) + num(1,1)} \tag{4}$$

This measure originates from the idea that, for two classifiers to be diverse, they should make different mistakes: as Giacinto and Roli put it, the fewer coincident errors a pair of classifiers makes, the more diverse they are. The double-fault measure over the whole set of base classifiers is calculated as

$$DF = \frac{2}{N_{Tr} N_C (N_C - 1)} \sum_{i=1}^{N_C} \sum_{j=i+1}^{N_C} num_{i,j}(-1,-1) \tag{5}$$

The diversity among the set of classifiers decreases as the value of the double-fault measure increases [5].
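Under the same oracle-output representation as the sketch in Section 3.1, Eq. (5) can be computed as follows (a sketch; the names are mine):

```python
import numpy as np
from itertools import combinations

def double_fault(oracle):
    """Average pairwise double-fault measure, Eq. (5).

    oracle : (N_C, N_Tr) array with entries +1 (correct) / -1 (wrong).
    """
    n_c, n_tr = oracle.shape
    total = 0.0
    for i, j in combinations(range(n_c), 2):
        # num_{i,j}(-1,-1): samples that both classifiers get wrong.
        total += np.sum((oracle[i] == -1) & (oracle[j] == -1))
    return 2.0 * total / (n_tr * n_c * (n_c - 1))
```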

3.3. Kohavi–Wolpert variance

The Kohavi–Wolpert (KW) variance originates from the bias–variance decomposition of the error of a classifier proposed by Kohavi and Wolpert [42]. The variance of the predicted class label $y$ for a sample $x$ is originally calculated as

$$variance_x = \frac{1}{2}\left(1 - \sum_{i=1}^{N_{class}} P(y = \omega_i \mid x)^2\right) \tag{6}$$

where $N_{class}$ denotes the number of output classes. Since $N_{class} = 2$ in the case of oracle outputs, and $P(y = 1 \mid x) + P(y = -1 \mid x) = 1$, it follows that

$$variance_x = \frac{1}{2}\left(1 - P(y = 1 \mid x)^2 - P(y = -1 \mid x)^2\right) = P(y = 1 \mid x)\,P(y = -1 \mid x) = P(O = 1 \mid x)\,P(O = -1 \mid x) \tag{7}$$

where $O$ denotes the oracle output. The probability $P(O = -1 \mid x_i)$ can be computed as

$$P(O = -1 \mid x_i) = \frac{l_i}{N_C}, \qquad l_i = N_C \sum_{O_{ij} = -1} weight_j \tag{8}$$

namely, $l_i$ equals the product of $N_C$ and the sum of the weights of the base classifiers that misclassify the training sample $x_i$. A modified version of the KW variance was proposed by Kuncheva and Whitaker [10] to measure the diversity of an MCS:

$$KW = \frac{1}{N_{Tr} N_C^2} \sum_{i=1}^{N_{Tr}} l_i (N_C - l_i) \tag{9}$$

The diversity of an MCS increases with the value of the KW variance [5].
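A sketch of Eq. (9), under the simplifying assumption of equal classifier weights ($weight_j = 1/N_C$), in which case $l_i$ is simply the number of classifiers that misclassify sample $i$:

```python
import numpy as np

def kw_variance(oracle):
    """Kohavi-Wolpert variance, Eq. (9), assuming equal classifier weights.

    oracle : (N_C, N_Tr) array with entries +1 (correct) / -1 (wrong).
    """
    n_c, n_tr = oracle.shape
    l = np.sum(oracle == -1, axis=0)  # l_i: misclassification count per sample
    return float(np.sum(l * (n_c - l))) / (n_tr * n_c ** 2)
```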

3.4. Measurement of inter-rater agreement

This measure, denoted $\kappa$, was developed by Fleiss as a measure of inter-classifier reliability [43]. It quantifies the level of agreement within a set of classifiers; hence, based on the assumption that a set of classifiers should disagree with one another to be diverse, MCS diversity decreases as the value of $\kappa$ increases [5]. The inter-rater agreement $\kappa$ is calculated as

$$\kappa = 1 - \frac{\sum_{i=1}^{N_{Tr}} (N_C - l_i)\, l_i}{N_{Tr} N_C (N_C - 1)\, aveAcc\, (1 - aveAcc)} \tag{10}$$

where $aveAcc$ denotes the average classification accuracy of the base classifiers on the training data.
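Eq. (10) can be sketched in the same way (equal classifier weights assumed; the names are mine):

```python
import numpy as np

def interrater_kappa(oracle):
    """Inter-rater agreement kappa, Eq. (10).

    oracle : (N_C, N_Tr) array with entries +1 (correct) / -1 (wrong).
    """
    n_c, n_tr = oracle.shape
    l = np.sum(oracle == -1, axis=0)  # misclassification count per sample
    ave_acc = np.mean(oracle == 1)    # average accuracy of the base classifiers
    denom = n_tr * n_c * (n_c - 1) * ave_acc * (1.0 - ave_acc)
    return 1.0 - np.sum((n_c - l) * l) / denom
```

Note that, as with the double-fault measure, lower values of kappa indicate more diversity.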

4. Experimental studies

4.1. Experimental datasets

Experiments were conducted on four benchmark classification datasets: the Semeion Handwritten Digit Recognition dataset, the Pen-Based Recognition of Handwritten Digits dataset, the Optical Recognition of Handwritten Digits dataset, and the Image Segmentation dataset, all drawn from the UCI Machine Learning Repository [44]. Information about the datasets used in our experiments is listed in Table 1.

Table 1. Datasets used for the classification tasks.

Data set             Attributes  Classes  Size of the whole data set  Size of the test set
Semeion Digits       256         10       1593                        159
Pen-Based Digits     16          10       7494                        749
Optical Digits       64          10       5620                        562
Image Segmentation   19          7        2310                        231

4.2. Experimental results

The average misclassification percentages and standard deviations of OSL and SCL applied to the three groups of nBBC-ICBP-ES [15], where n = 5, 6 or 7, on the four benchmark classification tasks using 10-fold cross validation are shown in Table 2 for comparison. It can be clearly observed from Table 2 that, out of the twelve pairwise comparisons, in eleven cases the nBBC-ICBP-ES trained with SCL obtains the smallest average misclassification percentage. With respect to the standard deviations, Table 2 also shows that, out of the twelve pairwise comparisons, in ten cases the nBBC-ICBP-ES trained with SCL obtains the smallest standard deviation.

The detailed numbers of misclassified test samples on the four tasks, achieved by applying OSL or SCL to train the three groups of nBBC-ICBP-ES using 10-fold cross validation, are displayed in Figs. 4–7. The four figures clearly show that the overall level of misclassification counts obtained by the three groups of nBBC-ICBP-ES trained with SCL is lower than with OSL. Frequently, Competitive-7BBC, i.e., the MCS obtained by applying SCL to train 7BBC-ICBP-ES, attains the smallest number of misclassified test samples.

Table 3 lists the t-test results for the classification performances obtained by applying OSL and SCL to train the three groups of nBBC-ICBP-ES on the four classification tasks with 10-fold cross validation. Table 3 clearly shows that, in eleven of the twelve t-tests, the three groups of nBBC-ICBP-ES trained with SCL significantly improve the classification performance compared to those trained with OSL (viz. Competitive-nBBC versus nBBC) at the 5% significance level, i.e., t-value ≤ −1.8331.
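For reference, here is a minimal sketch of the kind of paired one-tailed t-test reported in Table 3; the critical value −1.8331 is the 5% one-tailed quantile of the t distribution with 10 − 1 = 9 degrees of freedom, and the fold-wise error vectors below are placeholders, not the paper's measurements.

```python
import numpy as np
from scipy import stats

# Per-fold misclassification percentages from 10-fold cross validation
# (placeholder values).
err_scl = np.array([8.1, 9.0, 7.5, 8.8, 9.4, 7.9, 8.6, 8.2, 9.1, 7.7])
err_osl = np.array([11.2, 12.4, 10.9, 11.8, 13.0, 10.5, 12.1, 11.6, 12.7, 11.0])

# Paired t-test on the fold-wise results, one-tailed in the direction
# "SCL error < OSL error"; significant at the 5% level if t <= -1.8331.
t_value, p_two_sided = stats.ttest_rel(err_scl, err_osl)
p_one_sided = p_two_sided / 2 if t_value < 0 else 1.0 - p_two_sided / 2
print(t_value, p_one_sided, t_value <= -1.8331)
```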


Table 2. Comparison of the average misclassification percentages and standard deviations of the OSL algorithm and the proposed SCL algorithm applied to the three groups of nBBC-ICBP-ES, where n = 5, 6 or 7, on the four classification tasks using 10-fold cross validation.

Data set            5BBC           Competitive-5BBC  6BBC           Competitive-6BBC  7BBC          Competitive-7BBC
Semeion Digits      35.53 (15.16)  11.7 (3.75)       20.75 (16.62)  10.13 (2.96)      11.76 (2.91)  8.43 (1.55)
Pen-Based Digits    5.77 (1.13)    3.79 (1.21)       4.43 (1.08)    2.0 (0.58)        4.3 (0.82)    1.58 (0.47)
Optical Digits      3.63 (0.97)    3.15 (0.51)       2.81 (0.67)    2.42 (0.45)       2.44 (0.83)   1.99 (0.47)
Image Segmentation  5.32 (1.61)    5.41 (1.7)        5.41 (1.53)    4.81 (1.25)       5.5 (1.67)    4.63 (1.32)

Remark (1): nBBC, where n = 5, 6 or 7, denotes the MCS obtained by applying the OSL algorithm to train the corresponding nBBC-ICBP-ES, while Competitive-nBBC, where n = 5, 6 or 7, denotes the MCS obtained by applying the proposed SCL algorithm to train the corresponding nBBC-ICBP-ES. These abbreviations have the same meanings throughout the paper. Remark (2): Standard deviations are listed in parentheses. Bold in the original typeset table marks the smallest value in each group of experiments.

Fig. 4. The number of misclassified test samples on the task of Semeion Handwritten Digit Recognition achieved by implementing OSL or SCL to train the three groups of nBBC-ICBP-ES.

Fig. 5. The number of misclassified test samples on the task of Pen-Based Recognition of Handwritten Digits achieved by implementing OSL or SCL to train the three groups of nBBC-ICBP-ES.

Fig. 6. The number of misclassified test samples on the task of Optical Recognition of Handwritten Digits achieved by applying OSL or SCL to train the three groups of nBBC-ICBP-ES.

Fig. 7. The number of misclassified test samples on the task of Image Segmentation achieved by applying OSL or SCL to train the three groups of nBBC-ICBP-ES.

Table 3. Comparison, using t-tests, of the classification performances obtained by applying the OSL algorithm and the proposed SCL algorithm to train the three groups of nBBC-ICBP-ES on the four classification tasks with 10-fold cross validation.

Data set            Competitive-5BBC vs. 5BBC  Competitive-6BBC vs. 6BBC  Competitive-7BBC vs. 7BBC
Semeion Digits      -6.43                      -1.99                      -3.57
Pen-Based Digits    -3.52                      -5.47                      -10.61
Optical Digits      -1.92                      -1.91                      -1.96
Image Segmentation  0.25*                      -1.9                       -2.24

Remark: Items without an asterisk indicate that the nBBC-ICBP-ES trained with SCL significantly improves the classification performance compared to the nBBC-ICBP-ES trained with OSL (viz. Competitive-nBBC versus nBBC) at the 5% significance level, i.e., t-value ≤ −1.8331. Items with an asterisk (*) indicate that the difference between the classification results of the two algorithms is not significant at the 5% significance level, i.e., t-value > −1.8331.

5. Conclusion

In this work, a novel Supervised Competitive Learning (SCL) algorithm has been proposed for the generation of Multiple Classifier Systems (MCSs). The SCL algorithm fundamentally differs from the classical competitive learning algorithms in that it is substantially supervised, and it seeks to enhance both the accuracy of and the diversity among the base classifiers of an MCS at the same time, in a supervised and competitive manner. The comparative experiments carried out on four benchmark classification tasks using 10-fold cross validation verify that, compared to the Ordinary Supervised Learning (OSL) algorithm, the SCL algorithm successfully improves the classification and generalization performance of the constructed MCSs.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grant no. 61473150.

References

[1] Y.-J. Zhang, Z.-Q. Liu, Self-splitting competitive learning: a new on-line clustering paradigm, IEEE Trans. Neural Netw. 13 (2002). [2] B. Widrow, M.E. Hoff, Adaptive switching circuits, in: 1960 IRE WESCON Convention Record, New York, 1960, pp. 96–104. [3] R. Winter, B. Widrow, MADALINE RULE II: a training algorithm for neural networks, in: Proceedings of the IEEE International Conference on Neural Networks, 1988.


[4] B. Widrow, M.A. Lehr, 30 years of adaptive neural networks: perceptron, madaline, and backpropagation, Proc. IEEE 78 (9) (1990).
[5] E.K. Tang, P.N. Suganthan, X. Yao, An analysis of diversity measures, Mach. Learn. 65 (2006) 247–271.
[6] L. Breiman, Bagging predictors, Mach. Learn. 24 (1996) 123–140.
[7] Y. Freund, R.E. Schapire, A decision-theoretic generalization of on-line learning and an application to boosting, in: Proceedings of the 2nd European Conference on Computational Learning Theory, 1995.
[8] Y. Freund, R.E. Schapire, Experiments with a new boosting algorithm, in: Proceedings of the 13th International Conference on Machine Learning, 1996.
[9] T.G. Dietterich, Ensemble methods in machine learning, in: Multiple Classifier Systems, Lecture Notes in Computer Science, vol. 1857, Springer, Berlin Heidelberg, 2000, pp. 1–15.
[10] L.I. Kuncheva, C.J. Whitaker, Measures of diversity in classifier ensembles and their relationship with the ensemble accuracy, Mach. Learn. 51 (2003) 181–207.
[11] T.K. Ho, The random subspace method for constructing decision forests, IEEE Trans. Pattern Anal. Mach. Intell. 20 (1998) 832–844.
[12] R.E. Schapire, Y. Singer, Improved boosting algorithms using confidence-rated predictions, Mach. Learn. 37 (1999) 297–336.
[13] Y. Liu, X. Yao, T. Higuchi, Evolutionary ensembles with negative correlation learning, IEEE Trans. Evol. Comput. 4 (2000) 380–387.
[14] Q. Dai, The build of a dynamic classifier selection ICBP system and its application to pattern recognition, Neural Comput. Appl. 19 (2010) 123–137.
[15] Q. Dai, N.Z. Liu, The build of n-bits binary coding ICBP ensemble system, Neurocomputing 74 (2011) 3509–3519.
[16] D. Rumelhart, D. Zipser, Feature discovery by competitive learning, Cogn. Sci. 9 (1985) 75–112.
[17] T. Kohonen, Self-organizing feature maps, in: Self-Organization and Associative Memory, Springer Series in Information Sciences, Springer, Berlin Heidelberg, 1989.
[18] L.I. Kuncheva, Diversity in multiple classifier systems, Inf. Fusion 6 (2005) 3–4.
[19] L.I. Kuncheva, An experimental study on diversity for bagging and boosting with linear classifiers, Inf. Fusion 3 (2002) 245–258.
[20] I. Partalas, G. Tsoumakas, I. Vlahavas, An ensemble uncertainty aware measure for directed hill climbing ensemble pruning, Mach. Learn. 81 (2010) 257–282.
[21] R.E. Banfield, L.O. Hall, K.W. Bowyer, W.P. Kegelmeyer, Ensemble diversity measures and their application to thinning, Inf. Fusion 6 (2005) 49–62.
[22] R. Caruana, A. Niculescu-Mizil, G. Crew, A. Ksikes, Ensemble selection from libraries of models, in: Proceedings of the 21st International Conference on Machine Learning, 2004.
[23] D. Margineantu, T. Dietterich, Pruning adaptive boosting, in: Proceedings of the 14th International Conference on Machine Learning, 1997.
[24] G. Giacinto, F. Roli, G. Fumera, Design of effective multiple classifier systems by clustering of classifiers, in: Proceedings of the 15th International Conference on Pattern Recognition, 2000.
[25] W. Fan, F. Chu, H. Wang, P.S. Yu, Pruning and dynamic scheduling of cost-sensitive ensembles, in: Proceedings of the Eighteenth National Conference on Artificial Intelligence, American Association for Artificial Intelligence, 2002.
[26] G. Martinez-Munoz, A. Suarez, Aggregation ordering in bagging, in: Proceedings of the International Conference on Artificial Intelligence and Applications, 2004.
[27] G. Martinez-Munoz, A. Suarez, Pruning in ordered bagging ensembles, in: Proceedings of the 23rd International Conference on Machine Learning, 2006.
[28] I. Partalas, G. Tsoumakas, I. Vlahavas, Focused Ensemble Selection: A Diversity-based Method for Greedy Ensemble Selection, IOS Press, Amsterdam, 2008.
[29] G. Tsoumakas, L. Angelis, I. Vlahavas, Selective fusion of heterogeneous classifiers, Intell. Data Anal. 9 (2005) 511–525.
[30] Q. Dai, A competitive ensemble pruning approach based on cross-validation technique, Knowl.-Based Syst. 37 (2013) 394–414.
[31] Q. Dai, An efficient ensemble pruning algorithm using One-Path and Two-Trips searching approach, Knowl.-Based Syst. 51 (2013) 85–92.


[32] Q. Dai, A novel ensemble pruning algorithm based on randomized greedy selective strategy and ballot, Neurocomputing 122 (2013) 258–265.
[33] Q. Dai, Z. Liu, ModEnPBT: a modified backtracking ensemble pruning algorithm, Appl. Soft Comput. 13 (2013) 4292–4302.
[34] I. Partalas, G. Tsoumakas, I. Vlahavas, Pruning an ensemble of classifiers via reinforcement learning, Neurocomputing 72 (2009) 1900–1909.
[35] C. Tamon, J. Xiang, On the boosting pruning problem, in: Proceedings of the 11th European Conference on Machine Learning, Springer, Berlin, 2000.
[36] G. Brown, J. Wyatt, R. Harris, X. Yao, Diversity creation methods: a survey and categorisation, Inf. Fusion 6 (2005) 1–28.
[37] Q. Dai, S.C. Chen, B.Z. Zhang, Improved CBP neural network model with applications in time series prediction, Neural Process. Lett. 18 (2003) 197–211.
[38] Q. Hu, D. Yu, Entropies of fuzzy indiscernibility relation and its operations, Int. J. Uncertain. Fuzziness Knowl. Based Syst. 12 (2004) 575–589.
[39] Q. Hu, D. Yu, M. Wang, Constructing rough decision forests, in: Proceedings of the Tenth Conference on Rough Sets, Fuzzy Sets, Data Mining and Granular Computing, 2005.
[40] D. Skalak, The sources of increased accuracy for two proposed boosting algorithms, in: Proceedings of the American Association for Artificial Intelligence, AAAI-96, Integrating Multiple Learned Models Workshop, 1996.
[41] G. Giacinto, F. Roli, Design of effective neural network ensembles for image classification processes, Image Vis. Comput. J. 19 (2001) 699–707.
[42] R. Kohavi, D. Wolpert, Bias plus variance decomposition for zero-one loss functions, in: L. Saitta (Ed.), Proceedings of the 13th International Conference on Machine Learning, Morgan Kaufmann, 1996, pp. 275–283.
[43] J. Fleiss, Statistical Methods for Rates and Proportions, John Wiley & Sons, 1981.
[44] UCI Machine Learning Repository, 〈http://www.ics.uci.edu/~mlearn/MLRepository.html〉 or ftp.ics.uci.edu: pub/machine-learning-databases.

Qun Dai received a B.Sc. degree from Nanjing University, PR China. In March 2003, she completed her M.S. degree in Computer Science at Nanjing University of Aeronautics and Astronautics (NUAA), and then worked at the College of Information Science and Technology of NUAA as an Assistant Lecturer. She received a Ph.D. degree in Computer Science there in 2009. Since 2010, she has been an Associate Professor at the College of Computer Science and Technology of NUAA. Her research interests focus on neural computing, pattern recognition and machine learning.

Gang Song received a Bachelor of Science degree in Computer Science from Anhui University of Technology, China. He then entered Nanjing University of Aeronautics and Astronautics, Nanjing, China, as a graduate student. His research interests include pattern recognition, intelligent systems and machine learning.
