Random ensemble of fuzzy rule-based models


Xingchen Hu a,b,∗, Witold Pedrycz a,c, Xianmin Wang d

a Department of Electrical and Computer Engineering, University of Alberta, Edmonton T6R 2V4 AB, Canada
b College of Systems Engineering, National University of Defense Technology, Changsha 410073, PR China
c Systems Research Institute, Polish Academy of Sciences, Warsaw, Poland
d Hubei Subsurface Multi-scale Imaging Key Laboratory, Institute of Geophysics and Geomatics, China University of Geosciences, Wuhan, PR China

∗ Corresponding author at: Department of Electrical and Computer Engineering, University of Alberta, Edmonton T6R 2V4 AB, Canada. E-mail address: [email protected] (X. Hu).

Highlights

• We design bagging and boosting mechanisms for assembling fuzzy rule-based models.
• We quantify and analyze the performance of the ensemble mechanism.
• We thoroughly study the predominant parameters of the resulting ensemble models.

Article history: Received 26 July 2018; Received in revised form 8 May 2019; Accepted 8 May 2019.

Keywords: Random ensemble; Fuzzy rule-based model; Random forest; Boosting; Performance

Abstract

Fuzzy rule-based models, owing to their modular architecture, have attracted attention and found practical applications because of their nonlinear characteristics and substantial interpretability. Data-driven fuzzy modeling is one of the most prevailing approaches, and the performance of such fuzzy models is directly affected by the fundamental bias–variance dilemma. The concept and ensuing topologies of the ensemble strategy (bagging and boosting) offer an efficient way of constructing models that address this dilemma and achieve a sound tradeoff. In this study, we design an ensemble fuzzy rule-based model in the setting of random forest and boosting mechanisms. To demonstrate the feasibility of the proposed method, we focus on the regression type of models. First, we design a method for assembling fuzzy rule-based models to improve the prediction accuracy. Second, we quantify the performance of the ensemble mechanism. To illustrate the effectiveness and discuss the main features of the proposed method, a series of publicly available datasets are considered in the experimental studies.

1. Introduction

Fuzzy modeling of real-world systems has emerged as an important approach in many fields of science and engineering. A fuzzy rule-based model exhibits a modular topology that has been widely studied and accepted, since the resulting architecture is interpretable, and the nonlinearity of its characteristics contributes to the model's performance. Among the increasing number of variants of fuzzy rule-based models, data-driven constructs are the most dominant [1–5]. Obviously, in practice, the performance of fuzzy models depends on the amount and quality of the available data. In other words, in system modeling, one must consider the

fundamental bias–variance dilemma and become cognizant of the existing tradeoff. In this setting, the concept and ensuing topologies of the random forest [6] offer an efficient idea for addressing the issue, viz., constructing a model in which both the bias and the variance are reduced. Interest in random forests, regarded as a sound concept and a viable algorithmic vehicle [7–10], has grown in recent decades. The resulting model is made more reliable by combining the outputs of several different models built in the presence of various subsets of data and features (variables). In this framework, several randomized decision trees are trained and combined to produce an ensemble learning model that copes with classification [11–15], regression [16–21] or clustering problems [10,22].

The concept of fuzzy sets has been introduced to ensemble modeling in recent research studies with the motivation that, in most practical problems, information granularity is associated with the data and with the problem of classification or prediction. Multiple fuzzy decision trees, referred to as a fuzzy random forest, have been proposed by Bonissone [23]. For example, a fuzzy random forest has been used to address imperfect data in


classification tasks [11] and was implemented to perform facial visual surveillance [24]. In [12], a fuzzy rule-based multiclassification system has been designed with bagging and feature selection methods. Scherer introduced a series of methods applied to fuzzy, neuro-fuzzy, and neuro-rough-fuzzy ensembles handling classification tasks [13,14]; for instance, an AdaBoost ensemble of relational neuro-fuzzy classifiers, an ensemble of Takagi–Sugeno system classifiers, and ensembles of rough-neuro-fuzzy modular classifiers were considered. As to regression problems, ensemble methods have been studied in a variety of settings, leading to some promising results. For example, the least squares support vector regression method has been used in an ensemble to predict hydropower consumption [17]. Bootstrap feature subsets ensemble regression trees have been designed for dealing with large-scale and noisy data [16], and a fuzzy regression tree forest has been proposed in [18] and [19] to improve the performance of a single fuzzy regression tree. In [20], fuzzy linear model trees have been arranged into an ensemble topology for multioutput systems. The random forest framework has also been used with ensemble neural networks [21] to improve the prediction accuracy and computing efficiency. A survey of ensemble learning for regression problems has been presented in [25], offering a comprehensive discussion and categorization of various approaches; the study identified several main phases in building ensemble models: generation, pruning, and integration. Ren has reviewed several traditional and state-of-the-art ensemble methods dealing with classification and regression problems [26]. In the cited review, it has been pointed out that only a few ensemble regression methods have been studied for fuzzy models, and this research direction is worth pursuing.

The ultimate objective of this study is to design augmented fuzzy rule-based models in the framework of ensemble methods. The proposed model relies upon the commonly encountered architecture of rule-based models and exploits the "standard" design method, viz., the condition parts of the rules are built through fuzzy clustering, whereas the parameters of the linear models forming the conclusion parts are optimized by the Least Squares Estimation (LSE) method [27]. To demonstrate the feasibility of this method, we focus on the regression type of models. The originality of this study is twofold. First, we design a method for assembling fuzzy rule-based models to improve the prediction accuracy. Second, we quantify the performance of the ensemble mechanism. To this end, we thoroughly study several important parameters, such as the number of fuzzy rules, the number of base models, the number of input variables involved in the model, and the threshold value of boosting. Afterwards, we provide some indications on how to choose these parameters.

The study is organized in the following manner. In Section 2, we briefly review the general architecture of fuzzy rule-based models. Next, in Section 3, we introduce the ensemble mechanism of fuzzy rule-based models. To demonstrate and analyze the performance of the proposed method, we report a series of experimental studies completed for a suite of publicly available datasets (Section 4). We draw the main conclusions and discuss future work in Section 5.
2. Fuzzy rule-based regression system modeling

In this study, we consider a multiple input and single output (MISO) nonlinear fuzzy system modeling architecture proposed by Takagi and Sugeno [28]. We assume that the TS fuzzy rule-based model is built on the basis of data D composed of input–output pairs (x_k, y_k), k = 1, 2, ..., where x_k is the n-dimensional input data and y_k is the corresponding output. Each if–then statement (the ith rule) of the model composed of c rules has the following form:

\[ \text{If } x \text{ is } A_i(x), \text{ then } y_i = p_{i0} + p_{i1}x_1 + \cdots + p_{in}x_n \tag{1} \]

where A_i(x) in the condition part of the rule is a multidimensional fuzzy set, and the conclusion part of the rule is a linear function with parameters p_{i0}, p_{i1}, ..., p_{in}. The fuzzy sets are constructed through clustering. Clustering, especially fuzzy clustering, plays a significant role in fuzzy modeling [29–32]: the obtained clusters, regarded as information granules, not only capture the detailed structure of the data but also define the functional modules of the model. Typically, the Fuzzy C-Means (FCM) [33] clustering algorithm is used. The FCM is performed over the data positioned in the combined input–output space (viz., the clustering is carried out in R^{n+1}). The parts of the prototypes located in the input space are used as cluster centers v_i to calculate the membership functions standing in the condition parts of the rules, as follows [33]:

\[ A_i(x_k) = \frac{1}{\sum_{j=1}^{c} \left( \dfrac{\|x_k - v_i\|^2}{\|x_k - v_j\|^2} \right)^{1/(m-1)}} \tag{2} \]

where x_k is the kth input datum, v_i represents the cluster centers in the input space, c is the predetermined number of clusters, m is the fuzzification coefficient (m > 1), and \(\|\cdot\|\) stands for the distance between the inputs and the prototypes. Typically, a weighted Euclidean distance function is used in this calculation [31,32]. Here, the membership function A_i(x_k) satisfies the obvious requirement stemming from the FCM algorithm, namely, \(\sum_{i=1}^{c} A_i(x_k) = 1\). Once the TS fuzzy model has been constructed, for any input x, the model returns the output computed as follows:

\[ \hat{y}(x) = \sum_{i=1}^{c} A_i(x)\, y_i \tag{3} \]

In this study, we consider two commonly encountered types of conclusions used in the rules. The simplest is the zero-order TS fuzzy model, where the parameters p_{i1}, p_{i2}, ..., p_{in} are set to 0. In the first-order TS fuzzy model, the conclusion parts of the fuzzy rules are described as linear functions. The two scenarios yield the corresponding formulas:

\[ \hat{y}(x) = \sum_{i=1}^{c} A_i(x)\, p_{i0} \tag{4} \]

\[ \hat{y}(x) = \sum_{i=1}^{c} A_i(x)\,(p_{i0} + p_{i1}x_1 + \cdots + p_{in}x_n) \tag{5} \]

where the parameters p_{i0}, p_{i1}, p_{i2}, ..., p_{in} in (4) or (5) are optimized by the LSE method.
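To make the construction above concrete, the following sketch builds a first-order TS fuzzy model with FCM antecedents (Eq. (2)) and LSE-estimated consequents (Eq. (5)). This is a minimal illustration under our own naming (TSFuzzyModel and its methods are not the authors' code), and it uses a plain, unweighted Euclidean distance in the FCM step.

```python
import numpy as np

class TSFuzzyModel:
    """First-order Takagi-Sugeno model: FCM antecedents + LSE consequents."""

    def __init__(self, c=5, m=2.0, n_iter=100, seed=0):
        self.c, self.m, self.n_iter, self.seed = c, m, n_iter, seed

    def _memberships(self, X, centers):
        # Eq. (2): A_i(x) = 1 / sum_j (||x - v_i||^2 / ||x - v_j||^2)^(1/(m-1))
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2) + 1e-12
        ratio = (d2[:, :, None] / d2[:, None, :]) ** (1.0 / (self.m - 1.0))
        return 1.0 / ratio.sum(axis=2)            # shape (N, c); rows sum to 1

    def fit(self, X, y):
        rng = np.random.default_rng(self.seed)
        Z = np.column_stack([X, y])               # FCM runs in the joint space R^(n+1)
        U = rng.random((len(Z), self.c))
        U /= U.sum(axis=1, keepdims=True)
        for _ in range(self.n_iter):              # standard FCM alternation
            Um = U ** self.m
            centers = (Um.T @ Z) / Um.sum(axis=0)[:, None]
            U = self._memberships(Z, centers)
        self.v = centers[:, :X.shape[1]]          # input-space parts of the prototypes
        # LSE for the consequents of Eq. (5): one regressor block A_i(x)*[1, x] per rule.
        A = self._memberships(X, self.v)
        X1 = np.column_stack([np.ones(len(X)), X])
        Phi = (A[:, :, None] * X1[:, None, :]).reshape(len(X), -1)
        self.p, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        return self

    def predict(self, X):
        # Eq. (5): membership-weighted sum of the local linear models.
        A = self._memberships(X, self.v)
        X1 = np.column_stack([np.ones(len(X)), X])
        return ((A[:, :, None] * X1[:, None, :]).reshape(len(X), -1)) @ self.p
```

A zero-order model (Eq. (4)) is obtained by replacing the regressor block [1, x] with the constant 1 alone.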

3. A general topology of random ensemble fuzzy rule-based models

Ensemble learning is widely used because it makes it possible to transform a relatively weak learner into a more powerful one [34]. After combining a set of base learning models, one anticipates that the ensemble learning model may perform better than any individual model. In ensemble learning models, two fundamental constructs are considered, namely, the base learning models and the ensemble strategies. Here, the base learner is a fuzzy rule-based model, while the two ensemble strategies involve the bagging (with the random subspace technique) and boosting methods. Along with these two constructs, the aggregation scheme is decided upon. Bagging and boosting are two typical techniques of ensemble learning used to obtain better prediction results by combining multiple weaker predictors.


3.1. Bagging strategy of an ensemble fuzzy rule-based model

Bagging is used to neutralize the instability of the base fuzzy rule-based models. In this process, the instances obtained from the original training dataset are randomly sampled with replacement. The generated new training dataset inevitably contains some duplicated samples, while some other instances are omitted. The random subspace technique is an attribute-based bagging approach: a subset of the features of the data is randomly selected, potentially reducing the correlation of the base models. The base fuzzy rule-based models are built on these randomly generated training datasets and then aggregated into an ensemble model. The aggregation method is used to obtain the final output by combining the outputs of all the base models. For the regression problem, there are several aggregation mechanisms, including averaging, weighting, selective weighting, and several machine learning methods. In the existing research, the averaging method is the most commonly used strategy because of its simplicity, associated with typically good results.

Suppose that there is a dataset D(x, y) composed of N input–output pairs of instances with n-dimensional input variables. The bagging strategy is formally defined as follows: (1) a bootstrap sample \(L_k^* = (x_k^*, y_k^*)\) is independently drawn N times from dataset D with replacement to form the subspace data \(D^*(x^*, y^*)\). (2) The bootstrapped predictor is calculated as \(f(\cdot) = g((x_1^*, y_1^*), \ldots, (x_k^*, y_k^*), \ldots, (x_N^*, y_N^*))\). (3) Repeat (1)–(2) P times and compute the bagged estimator as follows:

\[ F(x) = \frac{1}{P}\sum_{i=1}^{P} f_i(x^*) \tag{6} \]

In what follows, we explain why the bagging strategy works. Eq. (6) can be rewritten as

\[ F(x) = E_D f(x^*) \tag{7} \]

Then,

\[ E_D\big(y - f(x^*)\big)^2 = y^2 - 2y\,E_D f(x^*) + E_D f^2(x^*) \tag{8} \]

By applying the inequality \(E X^2 \ge (E X)^2\) to the third term of (8), and considering (7), we obtain

\[ E_D\big(y - f(x^*)\big)^2 \ge \big(y - F(x)\big)^2 \tag{9} \]

Hence, we observe that the mean squared error of F(x) is not greater than the mean squared error of f(x*) on dataset D. The more the outputs f(x*) of the base models differ, the greater the improvement the ensemble model may produce.

The overall architecture of the proposed bagging ensemble fuzzy rule-based models is illustrated in Fig. 1.

Fig. 1. Conceptual view of a bagging ensemble of fuzzy rule-based models.

The overall process follows the bagging method in the context of the random forest mechanism [6]; the pseudocode is shown in Algorithm 1. Typically, for regression, l is taken to be ⌊L/3⌋, with a minimum of 5 (as recommended in the literature on random forests for regression [6]), where ⌊⌋ represents the floor function and L is the total number of input attributes. The performance of the bagging ensemble fuzzy rule-based models is expressed as a mean squared error (MSE), as follows:

\[ \mathrm{MSE} = \frac{1}{N}\sum_{k=1}^{N} \big(y_k - F(x_k)\big)^2 \tag{10} \]
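The bagging procedure of Algorithm 1 and Eq. (6) can be sketched as follows, reusing the TSFuzzyModel class from Section 2. This is an illustrative sketch under our own naming, not the authors' implementation; bootstrap rows and a random attribute subset are drawn for each base model, and the predictions are equally averaged.

```python
import numpy as np

def bagging_ensemble(X, y, P=30, c=5, l=None, seed=0):
    """Train P base TS fuzzy models on bootstrap samples and random feature subsets."""
    rng = np.random.default_rng(seed)
    N, L = X.shape
    if l is None:
        l = max(L // 3, min(5, L))                    # floor(L/3), with a minimum of 5
    models = []
    for _ in range(P):
        rows = rng.integers(0, N, size=N)             # bootstrap: N rows with replacement
        cols = rng.choice(L, size=l, replace=False)   # random subspace of attributes
        base = TSFuzzyModel(c=c, seed=int(rng.integers(1 << 30)))
        models.append((cols, base.fit(X[rows][:, cols], y[rows])))
    return models

def bagging_predict(models, X):
    # Eq. (6): equally weighted average of the base-model outputs.
    return np.mean([m.predict(X[:, cols]) for cols, m in models], axis=0)
```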

3.2. Boosting strategy of an ensemble fuzzy rule-based model

Boosting is a mechanism that sequentially builds an ensemble model by training each new fuzzy rule-based model with a bias toward the training instances on which the previous models performed worse. As in the bagging approach, the first model is trained on a dataset that is randomly selected with replacement from the original training dataset. Then, the training instances are processed by the first model, and the instances on which it performs worse are marked. The marked instances are more likely to be picked as samples of the training dataset for the second model. Thus, over such iterations, the instances that are difficult to deal with are more likely to be used to train the base models. Hence, different base models specialize in different subsets of instances, and the ensemble model can perform better. The base fuzzy rule-based models are built using the respective generated subsets of the training dataset. The weighted averaging method is usually used for aggregating the results of the base models; the weight of each base model depends on its performance on its subset of data.

Adaptive Boosting (AdaBoost) is one of the most commonly encountered implementations of the boosting mechanism [35,36]. AdaBoost can be regarded as a procedure for greedily minimizing the loss function \(E(y - F(x))^2\). In the ith iteration, F(x) is updated to \(F(x) \leftarrow F(x) + \alpha_i f_i(x^*)\), where \(\alpha_i\) is the weight of the base model. Thus, AdaBoost can be viewed as a functional gradient descent approach to the objective function; this observation not only explains why AdaBoost works but also represents a guideline for designing new boosting algorithms.

We recall the AdaBoost.RT algorithm to be used in the subsequent steps. The AdaBoost.RT algorithm is used in this study because it not only is better than a single model with more than 99% confidence but also outperforms other boosting algorithms on most datasets [37–39]. Here, R and T stand for regression and threshold, respectively. The boosting ensemble fuzzy rule-based model follows the architecture of AdaBoost.RT displayed in Fig. 2; Algorithm 2 represents the algorithmic layout implemented in this study.

Fig. 2. Conceptual view of a boosting ensemble of fuzzy rule-based models.

Initially, the instances have equal chances of being drawn from the training set. The error rate ε_i is calculated using the notion of a threshold θ for demarcating the accuracy of predictions. In this mechanism, the strategy attempts to find the base model f_i(x*) with a small ε_i by reducing the magnitude of the relative error. Depending upon the value of ρ, we arrive at linear, square or cubic relationships tuning the error rate. The probability is updated according to the loss function: the higher the loss of the kth instance, the greater its probability of being picked as a member of the sampled training set for the next base model. In this way, successive base models focus on these difficult instances step by step, which makes the overall model fit the training dataset more closely. Finally, F(x) is additive with respect to the base models.

The similarities between the bagging and boosting strategies are as follows: (a) they both aggregate several base models into one ensemble model; (b) the training datasets are generated by random sampling; and (c) they both use an averaging strategy to compute the output of the overall model. The differences between the two strategies are as follows: (a) in the bagging strategy, new models are built independently, whereas boosting adds new models according to the failures of the previous models; (b) bagging generates training datasets by uniform random sampling, while boosting selects training samples with weights emphasizing the most difficult cases; and (c) the output of bagging is an equally weighted average, whereas the output of boosting gives more weight to the base models with better performance on the training data. Additionally, the time complexity of the bagging strategy is O(n N log(N) P), where N is the number of instances, n is the number of attributes, and P is the number of base fuzzy models. The complexity of the boosting strategy is O(n N P).
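The boosting loop of Algorithm 2 can be sketched in the form given by Shrestha and Solomatine for AdaBoost.RT [37]. The sketch below is our own hedged rendering, not the authors' code: the absolute relative error of each instance is compared against the threshold θ, the error rate is raised to the power ρ, well-predicted instances are down-weighted, and the base models are combined with weights log(1/β_i).

```python
import numpy as np

def adaboost_rt(X, y, P=30, c=5, theta=0.1, rho=2, seed=0):
    """AdaBoost.RT-style boosting of TS fuzzy models (see Algorithm 2)."""
    rng = np.random.default_rng(seed)
    N = len(y)
    D = np.full(N, 1.0 / N)                 # sampling distribution over instances
    models, betas = [], []
    for _ in range(P):
        rows = rng.choice(N, size=N, replace=True, p=D)
        f = TSFuzzyModel(c=c, seed=int(rng.integers(1 << 30))).fit(X[rows], y[rows])
        are = np.abs(f.predict(X) - y) / (np.abs(y) + 1e-12)   # absolute relative error
        eps = np.clip(D[are > theta].sum(), 1e-12, 1 - 1e-12)  # error rate w.r.t. theta
        beta = eps ** rho                   # rho tunes the error rate (rho = 2 here)
        D = np.where(are <= theta, D * beta, D)   # shrink weights of easy instances
        D /= D.sum()
        models.append(f)
        betas.append(beta)
    w = np.log(1.0 / np.array(betas))       # base-model weights from the error rates
    return models, w / w.sum()

def adaboost_rt_predict(models, w, X):
    return np.sum([wi * f.predict(X) for wi, f in zip(w, models)], axis=0)
```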

4. Experimental studies

To demonstrate and analyze the performance of the proposed methods, we use several publicly available datasets obtained from the KEEL-dataset repository (http://www.keel.es/) and the UCI Machine Learning repository (http://archive.ics.uci.edu/ml/). We select 12 regression datasets, covering the small size and low dimensionality category, the medium size and dimensionality category, the large size and medium dimensionality category, and the medium size and large dimensionality category. A summary of the datasets is shown in Table 1. These datasets are used to compare the traditional zero-order and first-order fuzzy rule-based models with the proposed ensemble fuzzy models. Each dataset is split and processed in the experiments by running a 10-fold cross-validation. In the FCM clustering procedure used for the base fuzzy models, we set the fuzzification coefficient to 2.0, which is the most commonly used value for this algorithm. In the experimental studies, we set the number of base models of the ensemble models to 10, 30, 50, 70 and 90. As to the number of randomly selected attributes, we start with l set to ⌊L/3⌋, as recommended in [6], and sweep through the range of higher values until all the input attributes are involved. The power coefficient ρ is set to 2, as studied in [37].

Table 1. Summary of datasets.

Dataset      Instances   Input attributes
Abalone      4,177       8
Casp         45,730      9
Compactiv    8,192       21
Corel        68,040      15
Energy       768         8
Mv           40,768      10
Parkinsons   5,875       21
Pole         14,998      26
Stock        950         9
Tic          9,822       86
Treasury     1,049       15
Yacht        308         6
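The evaluation protocol described above can be sketched as a 10-fold cross-validation loop over the grids of c and P. The synthetic data below merely stand in for one of the repository datasets, and the functions reuse the sketches given earlier; none of this is the authors' code.

```python
import numpy as np

def cross_validate(X, y, c=5, P=30, folds=10, seed=0):
    """Mean and standard deviation of the per-fold testing MSE (Eq. (10))."""
    idx = np.random.default_rng(seed).permutation(len(y))
    mses = []
    for f in range(folds):
        test = idx[f::folds]                 # every folds-th index forms one fold
        train = np.setdiff1d(idx, test)
        models = bagging_ensemble(X[train], y[train], P=P, c=c, seed=seed + f)
        err = y[test] - bagging_predict(models, X[test])
        mses.append(float((err ** 2).mean()))
    return np.mean(mses), np.std(mses)

# Synthetic stand-in for a repository dataset: X is (N, L), y is (N,).
rng = np.random.default_rng(0)
X = rng.random((500, 6))
y = np.sin(X @ np.arange(1, 7)) + 0.1 * rng.standard_normal(500)

for c in (3, 5, 7, 9, 11):                   # the grids used in the experiments
    for P in (10, 30, 50, 70, 90):
        mean_mse, std_mse = cross_validate(X, y, c=c, P=P)
        print(f"c={c:2d} P={P:2d}: MSE {mean_mse:.4f} +/- {std_mse:.4f}")
```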


Table 2. MSEs obtained for various fuzzy models (testing data). For each dataset and each number of rules c ∈ {3, 5, 7, 9, 11}, the table reports the mean ± standard deviation of the testing MSE for the original model and for its bagging and boosting ensembles, separately for the zero-order and first-order fuzzy models. The numbers in bold indicate that the difference in performance between the ensemble model and the original model is statistically significant.

To illustrate the overall performance of the proposed ensemble fuzzy models, Table 2 shows the MSE values of the generic fuzzy models based on clustering (the models produce the output with all input attributes) and of the proposed bagging and boosting ensemble fuzzy models. For the results of the bagging ensemble models, we select the best results over the different numbers of base fuzzy models and of randomly picked attributes; the corresponding numbers of base fuzzy models and attributes are listed in Tables 3 and 4, respectively. As to the results of the boosting ensemble models, we show the best results over the different numbers of base fuzzy models; the corresponding numbers of base fuzzy models are shown in Table 5. The two-tailed t-test [40] is used to test whether two groups of results are significantly different; the significance level for the null hypothesis (the means of the two results being equal) is set to 5%.
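For reference, the significance test can be carried out as sketched below, assuming two arrays holding the per-fold MSE values of the compared models; scipy's independent-samples t-test is two-tailed by default.

```python
from scipy import stats

def significantly_different(mse_a, mse_b, alpha=0.05):
    """Two-tailed t-test on two groups of MSE results (e.g., 10-fold CV scores)."""
    _, p_value = stats.ttest_ind(mse_a, mse_b)
    return p_value < alpha      # True -> reject equal means at the 5% level
```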


Table 3. The numbers of base models in the bagging ensemble fuzzy models producing the best results.

Zero-order fuzzy models
Dataset      c=3   c=5   c=7   c=9   c=11
Abalone      30    30    30    30    30
Casp         70    30    30    30    30
Compactiv    30    30    90    50    50
Corel        30    30    30    30    30
Energy       30    50    70    50    70
Mv           70    70    70    70    70
Parkinsons   30    30    30    30    30
Pole         70    50    90    70    90
Stock        30    30    90    90    90
Tic          50    50    70    50    70
Treasury     90    90    90    50    30
Yacht        30    30    30    30    30

First-order fuzzy models
Dataset      c=3   c=5   c=7   c=9   c=11
Abalone      30    30    30    30    30
Casp         50    50    50    50    50
Compactiv    30    30    90    50    70
Corel        30    30    30    30    30
Energy       70    30    70    30    70
Mv           50    50    50    50    50
Parkinsons   70    70    70    70    70
Pole         90    90    50    70    90
Stock        30    30    70    50    90
Tic          50    70    50    50    70
Treasury     30    30    70    50    50
Yacht        30    30    50    90    90

Table 4. The numbers of randomly picked attributes producing the best results.

Zero-order fuzzy models
Dataset      c=3   c=5   c=7   c=9   c=11
Abalone      6     6     3     3     3
Casp         6     5     5     5     5
Compactiv    16    16    21    19    21
Corel        12    10    12    11    12
Energy       6     6     6     7     7
Mv           5     4     4     5     5
Parkinsons   9     9     9     15    15
Pole         24    25    25    26    26
Stock        4     4     6     6     6
Tic          70    73    70    73    73
Treasury     6     4     5     6     4
Yacht        2     2     2     2     2

First-order fuzzy models
Dataset      c=3   c=5   c=7   c=9   c=11
Abalone      4     6     6     6     6
Casp         8     8     8     8     8
Compactiv    16    16    21    21    21
Corel        14    14    15    15    14
Energy       7     8     7     8     7
Mv           8     8     8     9     9
Parkinsons   20    19    19    19    19
Pole         24    25    25    26    26
Stock        9     9     9     8     7
Tic          70    67    70    73    70
Treasury     11    9     10    10    10
Yacht        3     5     5     5     5

Table 5. The numbers of base models in the boosting ensemble fuzzy models producing the best results.

Zero-order fuzzy models
Dataset      c=3   c=5   c=7   c=9   c=11
Abalone      30    30    90    90    30
Casp         90    90    30    70    70
Compactiv    10    30    50    30    10
Corel        50    30    50    50    90
Energy       10    90    90    30    70
Mv           30    10    10    70    50
Parkinsons   70    90    90    90    90
Pole         30    30    30    10    30
Stock        70    90    10    30    50
Tic          30    30    30    30    30
Treasury     70    30    30    30    30
Yacht        30    30    30    50    30

First-order fuzzy models
Dataset      c=3   c=5   c=7   c=9   c=11
Abalone      90    90    30    70    70
Casp         50    30    30    30    70
Compactiv    90    70    70    90    90
Corel        30    50    50    70    30
Energy       10    10    30    90    50
Mv           30    30    30    30    30
Parkinsons   90    90    90    90    90
Pole         70    90    90    90    90
Stock        90    50    90    90    70
Tic          90    90    70    70    30
Treasury     50    90    50    90    70
Yacht        70    90    70    90    50

As shown in the table, the numbers in bold indicate that the difference in performance between the ensemble model and the original one is statistically significant. In most cases (95.4%), the ensemble models increase the prediction accuracy compared to that of a single model; moreover, in 57.9% of the cases the accuracy is statistically significantly higher. The performance of the zero-order fuzzy rule-based models is usually worse than that of the first-order models. The ensemble approach

Please cite this article as: X. Hu, W. Pedrycz https://doi.org/10.1016/j.knosys.2019.05.011.

and

X. Wang,

Random

Abalone Casp Compactiv Corel Energy Mv Parkinsons Pole Stock Tic Treasury Yacht

c=3

c=5

c=7

c=9

c = 11

90 50 90 30 10 30 90 70 90 90 50 70

90 30 70 50 10 30 90 90 50 90 90 90

30 30 70 50 30 30 90 90 90 70 50 70

70 30 90 70 90 30 90 90 90 70 90 90

70 70 90 30 50 30 90 90 70 30 70 50

offers a greater performance improvement for the zero-order fuzzy rule-based models. Specifically, 70.8% of the zero-order fuzzy models are significantly improved by these augmentation strategies. This result is convincing and expected: ensemble learning exhibits higher effectiveness when dealing with relatively weaker models. Overall, the comparison of the performance of the original fuzzy rule-based models and of the ensemble fuzzy models shows that, in most cases, the ensemble fuzzy models offer better results. However, ensembles of first-order fuzzy models still significantly outperform ensembles of zero-order fuzzy models. This indicates that the performance of the base fuzzy models still has a fundamental impact on the combined outcomes. Moreover, the bagging method helps reduce overfitting, similarly to the random forest method.

In Tables 3 and 4, we list the numbers of base fuzzy models and of randomly picked attributes, respectively, for which the bagging ensemble fuzzy models obtain the best results. It can be observed from these tables that, generally, the bagging ensemble fuzzy models do not require too many base fuzzy models. Specifically, 65% of the cases involve combining fewer than 50 base models to improve the prediction accuracy, while only 21.7% and 13.3% of the cases need 70 and 90 base models, respectively, to achieve better results. A possible reason is that the fuzzy rule-based model is already a relatively good nonlinear regression model, so that combining a small number of single models is sufficient for approaching the best results. Additionally, most of the first-order ensemble models achieve the best results when we pick all or most of the input attributes; however, some of the zero-order ensemble models need fewer than half of the input attributes to improve performance. A possible reason behind this phenomenon is that, for the first-order ensemble models, the conclusion parts of the rules involve the input variables, and a smaller number of input attributes could reduce the performance of the linear functions, so almost every input attribute counts for the prediction. In contrast, the conclusion part of a rule of a zero-order ensemble model is a constant value, independent of the input variables; thus, the number of input attributes has less impact on the overall performance.


Table 6. The settings of the threshold θ used in the experimental studies.

Dataset      θ (zero-order)   θ (first-order)
Abalone      0.0001           0.0001
Casp         0.1              0.001
Compactiv    0.1              0.1
Corel        0.001            0.001
Energy       0.1              0.001
Mv           0.01             0.01
Parkinsons   0.1              0.1
Pole         0.001            0.00001
Stock        0.1              0.1
Tic          0.00001          0.00001
Treasury     0.1              0.00001
Yacht        0.05             0.00001

In Table 5, we list the numbers of base fuzzy models in the boosting ensemble fuzzy models. Similarly to the bagging strategy, in most cases (53.3%) the boosting ensemble fuzzy models combine a small number of base models, e.g., 10–50. Several cases (30.0%), e.g., the Energy, Parkinsons, Pole, Stock and Yacht datasets, require numerous base models. Furthermore, in general, the first-order fuzzy models commonly require a greater number of base models. This probably occurs because there are more parameters in the conclusion parts of the fuzzy rules, so that combining more base models helps keep the overall model stable.

Since the first-order ensemble fuzzy models obtain much better prediction accuracy, in what follows, we display more details of the impact of the parameter settings on the output results. In Fig. 3, we show the MSE performance of the bagging ensemble fuzzy models with various numbers of clusters and of models on the testing datasets. As these figures show, the number of clusters and the number of models are two critical parameters that impact the performance of the ensemble fuzzy models. In general, the more clusters (rules) are used, the better the performance of the model: with a larger number of clusters, the clustering procedure becomes more capable of capturing the detailed structure of the data. In addition, usually, the greater the number of models that have been combined, the more accurate the outputs. This tendency exhibits some fluctuations in several cases, such as the Energy, Mv and Yacht datasets, probably because of the data features and the randomness of sampling. Comparing these two parameters, the effect of the number of rules is more significant than that of the number of ensemble models. From a certain perspective, a classical TS fuzzy rule-based model is itself similar to an ensemble model assembling a number of linear functions (models) as fuzzy rules. In this regard, it is reasonable that a single fuzzy model is more sensitive to the number of rules than are the ensemble fuzzy models, because the former is closer to the data and more sensitive to the bias–variance of the data distribution.

Fig. 3. MSE of bagging ensemble fuzzy models for various numbers of clusters and of models — results obtained on testing datasets (l = ⌊L/3⌋): (a) Abalone, (b) Casp, (c) Compactiv, (d) Corel, (e) Energy, (f) Mv, (g) Parkinsons, (h) Pole, (i) Stock, (j) Tic, (k) Treasury, (l) Yacht.

In Fig. 4, we demonstrate the impact of another influential parameter on the performance of an ensemble model, viz., the number of input attributes. In these experiments, we set the number of base fuzzy models to 90. Considering the first-order ensemble fuzzy models, the higher the number of attributes, the better the overall performance, probably because the results of most of the base fuzzy models become more accurate. We reckon that if the number of input variables is too small, the base models can hardly offer reasonable results, since fewer input variables can be used in the conclusion parts of the fuzzy rules. In contrast, if the model has more input attributes, e.g., approximately 80%–90% of all the input attributes, it can both retain randomness and produce meaningful results.

Fig. 4. MSE of bagging ensemble fuzzy models for various numbers of clusters and of attributes — testing datasets (P = 90): (a) Abalone, (b) Casp, (c) Compactiv, (d) Corel, (e) Energy, (f) Mv, (g) Parkinsons, (h) Pole, (i) Stock, (j) Tic, (k) Treasury, (l) Yacht.

In Fig. 5, we illustrate the MSE performance of the boosting ensemble fuzzy models with various numbers of clusters and of models on the testing datasets. We observe the general tendency, in most cases, that the MSE decreases with increasing numbers of both clusters and base models. Furthermore, we also note that, in contrast to the bagging ensemble fuzzy models, there is potential overfitting when too many base models are involved in some cases, such as the Abalone, Treasury, Pole, Tic and Yacht datasets. This problem can be overcome by setting an appropriate value of the threshold θ. The threshold settings, obtained by the trial-and-error method, are listed in Table 6. When overfitting occurs in the boosting strategy, we decrease the value of the threshold θ so that instances with large regression errors have a lower probability of being selected into the training datasets by random sampling.

Fig. 5. MSE of boosting ensemble fuzzy models for various numbers of clusters and of models — results obtained on testing datasets: (a) Abalone, (b) Casp, (c) Compactiv, (d) Corel, (e) Energy, (f) Mv, (g) Parkinsons, (h) Pole, (i) Stock, (j) Tic, (k) Treasury, (l) Yacht.
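The trial-and-error selection of θ can be organized as a small grid search; the sketch below is our own illustration (the candidate grid mirrors the values appearing in Table 6), and it scores candidates on a held-out split rather than on the training data.

```python
import numpy as np

def select_theta(X, y, grid=(0.1, 0.05, 0.01, 0.001, 0.0001, 0.00001),
                 c=5, P=30, seed=0):
    """Pick the threshold theta that minimizes the MSE on a holdout split."""
    idx = np.random.default_rng(seed).permutation(len(y))
    cut = int(0.8 * len(y))
    tr, va = idx[:cut], idx[cut:]             # simple 80/20 split for tuning
    best_theta, best_mse = None, np.inf
    for theta in grid:                        # decreasing theta curbs overfitting
        models, w = adaboost_rt(X[tr], y[tr], P=P, c=c, theta=theta, seed=seed)
        mse = float(((y[va] - adaboost_rt_predict(models, w, X[va])) ** 2).mean())
        if mse < best_mse:
            best_theta, best_mse = theta, mse
    return best_theta
```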


In summary, we compare the performance of these two ensemble strategies on fuzzy models in general. First, the bagging and boosting methods reduce the variance of a single prediction by combining different base models, so that the resulting models achieve higher stability and accuracy. However, there is no outright winner, since the performance depends on the data. Second, we observe that the ensemble method can augment a model with a lower prediction error if the single model performs very poorly, whereas if the performance of a single model can be enhanced on its own, the usefulness of the ensemble strategy in reducing the pitfalls of a single model is limited. Third, bagging is a good approach to avoiding overfitting, whereas boosting may aggravate the overfitting problem; if so, it is possible to mitigate this problem by selecting better parameters in the boosting method.

5. Conclusions

In this study, we put forward a new method for developing fuzzy rule-based models in the framework of both the bagging and boosting ensemble mechanisms and reported the results of a series of experimental studies. The comparative studies helped demonstrate the performance of the ensemble fuzzy rule-based models and determine several significant parameter settings for the models, such as the number of prototypes (clusters), the number of involved input attributes and the number of base models. The main results clearly indicate that, for most datasets, the proposed ensemble fuzzy rule-based models significantly outperform individual fuzzy models. The underlying nature of the proposed methodology suggests that future studies explore building other types of fuzzy models and implementing such models in other settings, such as classification and decision-making.

Acknowledgments

Support from the Canada Research Chair (CRC) Program and the Natural Sciences and Engineering Research Council of Canada (NSERC) is fully acknowledged. This work was also partially supported by the National Science Centre (NCN), Poland, under Grant No. UMO-2012/05/B/ST6/03068 and the National Natural Science Foundation of China (Grant No. 41372341). Xingchen Hu is supported by the China Scholarship Council under Grant No. 201306110018.

Declaration of competing interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.knosys.2019.05.011.

References

[1] J. Zhang, Z. Deng, K.S. Choi, S. Wang, Data-driven elastic fuzzy logic system modeling: Constructing a concise system with human-like inference mechanism, IEEE Trans. Fuzzy Syst. PP (99) (2017) 1.
[2] H. Zuo, G. Zhang, W. Pedrycz, V. Behbood, J. Lu, Fuzzy regression transfer learning in Takagi–Sugeno fuzzy models, IEEE Trans. Fuzzy Syst. 25 (6) (2017) 1795–1807.
[3] B. Rezaee, M.H.F. Zarandi, Data-driven fuzzy modeling for Takagi–Sugeno–Kang fuzzy system, Inform. Sci. 180 (2) (2010) 241–255.
[4] A.A. Adebowale, E.F. Sami, Modeling and identification of nonlinear systems: A review of the multimodel approach – Part 1, IEEE Trans. Syst. Man Cybern.: Syst. (2017).


[5] S.M. Zhou, J.Q. Gan, Low-level interpretability and high-level interpretability: a unified view of data-driven interpretable fuzzy system modelling, Fuzzy Sets and Systems 159 (23) (2008) 3091–3131.
[6] L. Breiman, Random forests, Mach. Learn. 45 (1) (2001) 5–32.
[7] J. Xia, S. Zhang, G. Cai, L. Li, Q. Pan, Adjusted weight voting algorithm for random forests in handling missing values, Pattern Recognit. 69 (C) (2017) 52–60.
[8] H.R. Zhang, F. Min, Three-way recommender systems based on random forests, Knowl.-Based Syst. 91 (C) (2016) 275–286.
[9] H. Ishwaran, The effect of splitting on random forests, Mach. Learn. 99 (1) (2015) 75–118.
[10] J. Hu, T. Li, H. Wang, H. Fujita, Hierarchical cluster ensemble model based on knowledge granulation, Knowl.-Based Syst. 91 (2016) 179–188.
[11] J.M. Cadenas, M.C. Garrido, R. Martínez, P.P. Bonissone, Extending information processing in a fuzzy random forest ensemble, Soft Comput. 16 (5) (2012) 845–861.
[12] K. Trawiński, O. Cordón, A. Quirin, On designing fuzzy rule-based multiclassification systems by combining FURIA with bagging and feature selection, Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 19 (04) (2011) 589–633.
[13] R. Scherer, Multiple Fuzzy Classification Systems, Vol. 288, Springer, 2012.
[14] R. Scherer, Designing boosting ensemble of relational fuzzy systems, Int. J. Neural Syst. 20 (05) (2010) 381–388.
[15] X.Z. Wang, et al., A study on relationship between generalization abilities and fuzziness of base classifiers in ensemble learning, IEEE Trans. Fuzzy Syst. 23 (5) (2015) 1638–1654.
[16] X. Wang, P. Yuan, Z. Mao, M. You, Molten steel temperature prediction model based on bootstrap feature subsets ensemble regression trees, Knowl.-Based Syst. 101 (C) (2016) 48–59.
[17] S. Wang, L. Yu, L. Tang, S. Wang, A novel seasonal decomposition based least squares support vector regression ensemble learning approach for hydropower consumption forecasting in China, Energy 36 (11) (2011) 6542–6554.
[18] F. Gasir, K. Crockett, Z. Bandar, Inducing fuzzy regression tree forests using artificial immune systems, Int. J. Uncertain. Fuzziness Knowl.-Based Syst. 20 (supp02) (2012) 133–157.
[19] F. Gasir, K. Crockett, On the suitability of type-1 fuzzy regression tree forests for complex datasets, in: International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems, Springer International Publishing, 2016.
[20] D. Aleksovski, J. Kocijan, S. Džeroski, Ensembles of fuzzy linear model trees for the identification of multioutput systems, IEEE Trans. Fuzzy Syst. 24 (4) (2016) 916–929.
[21] P.B. Zhang, Z.X. Yang, A novel AdaBoost framework with robust threshold and structural optimization, IEEE Trans. Cybern. PP (99) (2016) 1–13.
[22] P. Rathore, J.C. Bezdek, S.M. Erfani, Ensemble fuzzy clustering using cumulative aggregation on random projections, IEEE Trans. Fuzzy Syst. (2017).
[23] P. Bonissone, J.M. Cadenas, M.C. Garrido, A fuzzy random forest, Internat. J. Approx. Reason. 51 (7) (2010) 729–747.
[24] R. Jiang, A. Bouridane, D. Crookes, Privacy-protected facial biometric verification using fuzzy forest learning, IEEE Trans. Fuzzy Syst. 24 (4) (2016) 779–790.
[25] J. Mendes-Moreira, C. Soares, A.M. Jorge, Ensemble approaches for regression: A survey, ACM Comput. Surv. 45 (1) (2012) 10.
[26] Y. Ren, L. Zhang, P.N. Suganthan, Ensemble classification and regression – recent developments, applications and future directions, IEEE Comput. Intell. Mag. 11 (1) (2016) 41–53.
[27] P. Lancaster, M. Tismenetsky, The Theory of Matrices: With Applications, Elsevier, 1985.
[28] T. Takagi, M. Sugeno, Fuzzy identification of systems and its applications to modeling and control, IEEE Trans. Syst. Man Cybern. 15 (1) (1985) 116–132.
[29] W. Pedrycz, M. Reformat, Evolutionary fuzzy modeling, IEEE Trans. Fuzzy Syst. 11 (5) (2003) 652–665.
[30] M.M. Ahmed, N.A.M. Isa, Knowledge base to fuzzy information granule: A review from the interpretability-accuracy perspective, Appl. Soft Comput. 54 (2017) 121–140.
[31] X. Hu, W. Pedrycz, X. Wang, Granular fuzzy rule-based models: a study in a comprehensive evaluation of fuzzy models, IEEE Trans. Fuzzy Syst. 25 (5) (2017) 1342–1355.
[32] X. Hu, W. Pedrycz, X. Wang, Development of granular models through the design of a granular output space, Knowl.-Based Syst. 134 (2017) 159–171.
[33] J.C. Bezdek, R. Ehrlich, W. Full, FCM: The fuzzy c-means clustering algorithm, Comput. Geosci. 10 (2–3) (1984) 191–203.
[34] I.H. Witten, E. Frank, M.A. Hall, C.J. Pal, Data Mining: Practical Machine Learning Tools and Techniques, Morgan Kaufmann, 2016.
[35] Y. Freund, R.E. Schapire, Experiments with a new boosting algorithm, in: ICML, Vol. 96, 1996.
[36] J. Friedman, T. Hastie, R. Tibshirani, Additive logistic regression: a statistical view of boosting (with discussion and a rejoinder by the authors), Ann. Statist. 28 (2) (2000) 337–407.
[37] D.L. Shrestha, D.P. Solomatine, Experiments with AdaBoost.RT, an improved boosting scheme for regression, Neural Comput. 18 (7) (2006) 1678–1710.


[38] H. Drucker, Improving regressors using boosting techniques, in: ICML, Vol. 97, 1997.
[39] J. Sun, H. Fujita, P. Chen, H. Li, Dynamic financial distress prediction with concept drift based on time weighting combined with AdaBoost support vector machine ensemble, Knowl.-Based Syst. 120 (2017) 4–14.
[40] W. Haynes, Student's t-test, Springer New York, 2013.
