Expert Systems With Applications 140 (2020) 112869
Machine learning-based design support system for the prediction of heterogeneous machine parameters in Industry 4.0

Luca Romeo a,b,∗, Jelena Loncarski d, Marina Paolanti a, Gianluca Bocchini c, Adriano Mancini a, Emanuele Frontoni a

a Department of Information Engineering (DII), Università Politecnica delle Marche, Via Brecce Bianche 12, Ancona 60131, Italy
b Department of Cognition, Motion and Neuroscience and Computational Statistics and Machine Learning, Fondazione Istituto Italiano di Tecnologia, Genova, Italy
c Digital Product Specialist with Xelexia s.r.l, Pesaro, Italy
d Department of Engineering Sciences, Division for Electricity Research, Uppsala University, Uppsala, Sweden
Article info

Article history:
Received 8 February 2019
Revised 2 August 2019
Accepted 9 August 2019
Available online 10 August 2019

Keywords:
Design support system
Machine learning
Decision tree
Nearest-Neighbor
Neighborhood component features selection
Abstract

In engineering practice, it frequently occurs that designers and final or intermediate users have to roughly estimate some basic performance or specification data on the basis of the input data available at the moment, which can be time-consuming. There is a need for a tool that fills this gap in the optimization of engineering design processes by making use of advances in the artificial intelligence field. This paper aims to fill this gap by introducing an innovative Design Support System (DesSS), originated from the Decision Support System, for the prediction and estimation of machine specification data, such as machine geometry and machine design, on the basis of heterogeneous input parameters. As the main core of the developed DesSS, we introduce different machine learning (ML) approaches based on Decision/Regression Tree, k-Nearest Neighbors, and Neighborhood Component Features Selection. Experimental results obtained on a real use case and using two different real datasets demonstrate the reliability and effectiveness of the proposed approach. The innovative machine learning-based DesSS, meant to support design choices, can bring various benefits such as easier decision-making, conservation of the company's knowledge, savings in man-hours, and higher computational speed and accuracy. © 2019 Elsevier Ltd. All rights reserved.
1. Introduction

The 4th Generation of the Industrial Revolution, called "Industry 4.0", was triggered by the rapid development of several fields, including electrical and electronic, information and advanced manufacturing technology. Germany is leading this transformation, enabling manufacturing and service innovation (Heiner, Fettke, Feld, & Hoffmann, 2014). The penetrating influence of information technologies on manufacturing systems leads to an increasing collection of data (Yin, Xi, Sun, & Wang, 2018). In fact, nowadays companies are dealing with the big data challenge (Chen & Zhang, 2014; Lee, Kao, & Yang, 2014). The collected manufacturing big data can be further analyzed and used to create innovative applications, enabling product and service optimization.
∗ Corresponding author.
E-mail addresses: [email protected] (L. Romeo), [email protected] (J. Loncarski), [email protected] (M. Paolanti), [email protected] (G. Bocchini), [email protected] (A. Mancini), [email protected] (E. Frontoni).
https://doi.org/10.1016/j.eswa.2019.112869
Smart manufacturing is currently evolving the concept of real physical systems into high-level cyber technologies. The main objectives in this context are the development of dynamic and flexible business and engineering processes, where smart factories will have the capabilities of self-awareness, self-prediction, self-comparison, self-reconfiguration, and self-maintenance (Lee et al., 2014). They will solve many key issues in manufacturing, such as meeting individual customer requirements, optimized decision-making, and resource and energy efficiency. In this context, predictive algorithms based on machine learning (ML) approaches can represent a viable solution to analyze data, predict machine performance degradation, and autonomously manage and optimize services, products and needs (Razavi-Far, Farajzadeh-Zanjani, & Saif, 2017; Susto, Schirru, Pampuri, McLoone, & Beghi, 2015). Starting from the concept of the smart factory and reaching the application of information technology, the missing gap is the development of decision analytics in between (Horita, de Albuquerque, Marchezini, & Mendiondo, 2017). The Decision Support System
Fig. 1. DesSS framework: the user can upload the dataset from the front end app. The cloud platform manages the training process while saving the dataset and the machine learning model in the cloud storage. The user can provide the request for the prediction of the output based on the learned model.
(DSS) is an effective tool which can facilitate decision making in order to solve the identified problem. DSSs integrate multiple functions such as analysis, modeling, prediction, optimization and diagnosis, typically combining data and models (Kulhavy, 2003). Nowadays, DSSs are popular in various domains, including business, engineering, military, and medicine (Cabrerizo, MorenteMolinera, Pérez, López-Gijón, & Herrera-Viedma, 2015; Liu, Hsiao, & Hsiao, 2014; Malmir, Amini, & Chang, 2017; Prasad & Ratna, 2018). DSSs are used for different purposes in a manufacturing industry (Karmarkar & Gilke, 2018; Sancin, Dobravc, & Dolak, 2010). Some of the examples are decision support for process control, process quality control (Jiang, Sun, Wang, & Zhang, 2010), assisting managers (Khatun & Miah, 2016), customer satisfaction (Alaoui & Tkiouat, 2018), supply chain (Villegas & Pedregal, 2018), etc. Since nowadays database and data warehousing technologies are becoming increasingly efficient, also the interest in data-driven DSS is growing (Kulhavy, 2003). Arising from the DSSs, a novel concept of data-driven Design Support System (DesSS) is introduced in this paper to address some of the optimization problems in engineering design processes. Usually the design problems are rather complex and sometimes they involve also interdisciplinary aspects. The designers’ tasks are various, from information gathering, problem-solving and thinking, documenting and planning the work, to the models that need to be implemented in order to verify the theoretical implications. All these tasks may be highly time-consuming, but also with the risk of failing. The design automation can viably be adopted in this case in order to ease the design process. The overview of the proposed DesSS is presented in Fig. 1 including the app engine, cloud storage and cloud platform. The core of the DesSS framework is the cloud platform, i.e. the design assistant based on artificial intelligence and machine learning algorithms. These algorithms are able to predict the features of the new possible versions of the product or a service by learning, correlating, and interpreting the parameters of a database containing the preliminary features of a product. The main goal of DesSS is to provide the decision support in the design field, which is mainly discussed in this paper. Another important goal is the collection and conservation of company knowledge over time by using the labels collected by the human designers for the development and training of machine learning algorithms and storing these models together with datasets in cloud storage. Even though the fundamental goal of DSS and Expert System (ES) is basically the same, i.e. they seek to improve the quality of the decision, there are still many differences when comparing their objectives, operation, user and development methodology (Ford, 1985). A DSS is an interactive, computer-based information system that utilizes data and models together with a comprehensive database in order to solve unstructured or semi-structured prob-
lems. On the other hand, an ES is a computer program which is knowledge-based, containing an expert’s knowledge for a particular domain and aiming at solving structured problems. Additionally, ES contains an explanation mechanism which provides the user of the ES with some detail of the reasoning process and why a certain solution is recommended (Turban & Watkins, 1986). With this in mind, DesSS is more close to the DSS rather than ES, having the same objectives (improving the quality of the decision), same operation (flexibility in confronting the problem, programming language), same type of the users (applied to business area, the user is often the decision-maker who helped design the system), and same development methodology (high degree user involvement in order to ensure the DSS effectiveness, flexibility,possibility of decision support in early stages). On the contrary, ES is not flexible and it requires longer and more complex developing and testing stage. The application domain of DesSSs is varying from design (design of the 3D office chairs (Jindo, Hirasago, & Nagamachi, 1995), interior design (Ogino, 2017), etc.), assisting persons (Okamoto et al., 2016), construction (Tanaka & Tsuda, 2016), product-service systems (Akasaka, Nemoto, Kimita, & Shimomura, 2012), and many more. DesSSs are usually knowledge-based (Blondet, Duigou, & Boudaoud, 2019; Kinoshita, Sugawara, & Shiratori, 1988; Tseng & Huang, 2008) or simulation-based (Tanaka & Tsuda, 2016). The knowledge-based DesSSs rely mainly on expert knowledge of the designers and are highly dependent on the quality of the knowledge. They are usually subject to subjectivity. Moreover, in the early stage of the design, the designers face the following challenges: (i) the challenge of creating the specific knowledge about the complex engineering systems, (ii) the challenge of developing knowledge-based methods, (iii) the challenge of building the ability to learn from data and cases, (iv) the challenge of capturing and reusing the implicit knowledge, etc (Wang et al., 2017). On the other hand, the design process optimization can be reproduced more objectively by the use of numerical simulations. Computer Aided Engineering (CAE) is a tool that supports finding the outcome by applying discrete solution of partial differential equations for the phenomena to be analyzed. In some cases, with complicated structure of target and with different heterogeneous input, the designers might have difficulties in applying the numerical simulation in order to get the solution for the specific design problem. In both cases, the design process requires time and knowledge of the designers. As the new technologies can deliver information faster, and data warehousing allows manipulation of large amount of data available, the authors feel the lack of datadriven approaches for its integration with DesSSs. This motivation arose from the fact that the intelligent decision making and decision support are closely interrelated to modern business development. This is especially true in the case of Industry 4.0 paradigm, where the continuous increase of available data opens the realm of possibilities to machine learning approaches. Some approaches (Ogino, 2017; Okamoto et al., 2016) tried to extend this formulation by introducing respectively multi-agent system and the probabilistic Naive Bayes (NB) classifier for selecting the right design choice. 
In particular, most related to our work is the paper (Ogino, 2017) which proposed the application of datadriven approach (i.e., NB) for selecting the best design idea suitable for the user’s requirement design style. However, the main difference with respect to the above-mentioned paper (Ogino, 2017) can be resumed according to the different (i) field of application and (ii) the employed machine learning approaches. Previously, another methodology was used to support the design, called PAPRIKA methodology (Hansen & Ombler, 2008; Heikkilä, Dalgaard, & Koskinen, 2013). Therefore, the method proposed involves the pairwise ranking of all undominated pairs of all possible alternatives represented by the value model. This approach is a familiar solution to the pervasive problem of how to combine alternatives’ fea-
tures on multiple criteria in order to rank alternatives (ComesañaCampos, Cerqueiro-Pequeño, & Bouza-Rodríguez, 2018; Hansen & Ombler, 2008). Differently from (Comesaña-Campos et al., 2018; Hansen & Ombler, 2008; Heikkilä et al., 2013), our approach has more general implications: our target outputs are not a finite set of alternatives which are ranked according to a total score. Indeed, we treated the design problem as a classification/regression task where both continuous and categorical target output may be predicted according to a model trained with past data. The goal of this paper is to introduce the novel concept of machine learning-based DesSS in Industry 4.0 framework. In particular the accent was on the employment of standard well-known machine learning approaches for an innovative application. We propose a DesSS to support the designing choices based on the available data of the company production processes. The core of the DesSS system is the Cloud Platform with the machine learning algorithm able to predict the machine specification data and other metrics based on the known parameters from the manufacturer or from the simulation model. For instance, the algorithm is capable of predicting the machine geometry and material on the basis of several input parameters: machine performance (torque, efficiency, speed, etc.), final application, market, and cost range. The experimental verification on two datasets confirms the appropriateness and the effectiveness of the proposed machine learningbased DesSS in the estimation of the machine specification data and other output of interest. In this context, we introduce the application of three standard machine learning approaches such as Decision/Regression Tree (DT/RT) (Breiman, Friedman, Olshen, & Stone, 1984), and KNearest Neighbors (KNN) (Altman, 1992), and Neighborhood Component Features Selection (NCFS) (Yang, Wang, & Zuo, 2012) in order to extract decisional information from two heterogeneous sets of data in a short and versatile time. The choice of applying standard machine learning algorithms was motivated by the availability of the data (often the data available are few and it is not the case of big data), and when compared to other machine learning techniques such as Support Vector Machine (SVM) (Cortes & Vapnik, 1995), NB (Russell & Norvig, 2002), Restricted Boltzmann Machine (RBM) (Larochelle & Bengio, 2008) and Deep Belief Network (DBN) (Hinton, Osindero, & Teh, 2006), they show superior behavior in terms of interpretability, convergence, computation effort and accuracy. The predicted parameters are then used for the recommendation towards the best technical choice for the designers and technicians. At the same time, thanks to the integration with not purely technical parameters (such as cost and market), the tool can be used for marketing and sales purposes (e.g., for the estimation of the project feasibility, presentations of offers). Moreover, the DesSS is updated continuously after every iteration, and it is possible to re-train the model. It is worth mentioning that, when compared to the traditional design flow where the designer performs simulations or decides on the basis of his own experience, machine learning-based DesSSs allow considerable savings in terms of man-hour. 
Another important benefit in applying the machine learning-based DesSS can be seen in the presence of cloud storage, having in this way the possibility to have the models available on request at any time and preserving the companies’ knowledge over time. To the best of authors’ knowledge, when considering methodologies that are not based on DesSSs, for the prediction or estimation of some electrical motor specification data and parameters model-based approaches and simulation tools (e.g., CAE) are usually used (Duan, Zivanovic, Al-Sarawi, & Mba, 2016; Haque, 2008; Krings et al., 2017; Tessarolo, Martin, Diffen, Branz, & Bailoni, 2014a; Tessarolo et al., 2014b). However, when compared with the simulation tools such as CAE, the proposed machine learning-based DesSS results in higher computational speed and accuracy. Accord-
ingly, the employed machine learning model (i.e., DT and KNN) is easily interpretable with respect to other classifiers (e.g., neural networks) proposed in the literature for the estimation of motor data and parameters ((Bilski, 2014; Jin, Wang, & Yang, 2017; Jirdehi & Rezaei, 2016; Song et al., 2017)). The proposed approach allows to extract the most relevant predictors while designing new variants of products. Additionally, the paper differs from the approaches in the literature dealing with the parameter estimation and controller design like for example adaptive control methods (El-Sousy, 2013; Li, Chen, & Yao, 2018; Xu, Huang, Su, & Shi, 2019). These approaches utilize the specific simplified models of electric motors in order to design a proper controller for improving the performance of electric drive systems. On the contrary, our work aims to provide recommendations towards the best technical choices, learning from a different set of heterogeneous input parameters, not purely electrical or mechanical (e.g., cost, market). Moreover, the proposed ML approach, as the main core of DesSS, can be generalized in various applications, as it is not limited only to electric drive application. In particular, even though the given example of a real use case falls within the field of prediction of specific parameters of electric motors, the proposed DesSS can be applied to any other field in Industry 4.0 framework other than electrical and electronic engineering, where the design is of interest and where is necessary to improve current design practice, i.e. architecture, chemistry, medicine, etc. by adapting the algorithm and making use of the specific datasets. To summarize, the main contributions of this paper in the context of DSS and ES are: (i) the introduction of a tool that can help the designers to estimate quickly some performance or specification data, (ii) the introduction of the novel concept of machine learning-based DesSS to support the designing choice, (iii) higher computational speed and accuracy compared to simulation tools (notable savings in man-hour when compared to the traditional design practice), (iv) the preservation of the company’s knowledge over time and having the possibility to have the models on request at any time, (v) the proposed DesSS can be extended also to the other fields, where the current design practice needs improvements. The paper is organized as follows: Section 2 gives an introduction to the DesSS and describes the deployed machine learning algorithms; Section 3 describes the two datasets used for the analysis; final sections present the results (Section 4) together with the description of the real use case and DesSS implementation (Section 5), and finally the conclusions with future works (Section 6). 2. Method DesSS framework is presented in Fig. 1, and it includes the app engine (where user uploads the dataset from the front end app), cloud storage and cloud platform (manages the training process while saving the dataset and the machine learning algorithm in the cloud storage). The core of the DesSS system are the machine learning algorithms. These algorithms are able to propose the characteristics of the new possible versions of the product or a service by learning, correlating, and interpreting the parameters of a database, which represents the characteristics of a product. DesSSs arise from DSSs and are optimized for engineering design processes. 
Machine learning-based DesSSs are able to analyze every type of data: continuous variables, discrete variables and nonnumerical classes. This feature makes these tools superior to other tools (e.g. operating on mathematical basis), being more robust and allowing a broader and complete set of solutions. The machine learning-based DesSS has the possibility to re-train the model every time new data are inserted, optimizing and refining predictive capacity over time.
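As a purely illustrative sketch of this upload, train, predict and re-train loop (Fig. 1), the snippet below shows how a client could interact with such a platform. The base URL, endpoint routes, payload fields and file name are hypothetical assumptions and do not describe the actual platform API.

```python
import requests

BASE_URL = "https://desss.example.com/api"  # hypothetical endpoint, not the real platform

# 1) Upload a dataset from the front end app to the cloud storage (hypothetical route)
with open("oracool.csv", "rb") as f:
    requests.post(f"{BASE_URL}/datasets", files={"file": f}, data={"space": "oracool"})

# 2) Ask the cloud platform to (re-)train the ML model for the chosen space
requests.post(f"{BASE_URL}/spaces/oracool/train", json={"model": "knn_ncfs"})

# 3) Request a prediction for an unseen candidate's input parameters
candidate = {"Frequency": 50, "Voltage": 230, "Application": "dishwasher",
             "Market": "Europe", "Rated Power": 120.0}
resp = requests.post(f"{BASE_URL}/spaces/oracool/predict", json=candidate)
print(resp.json())  # e.g. predicted stator/rotor geometry and materials
```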
The main goals of the DesSS are:

1. Decision support in the design field, which is mainly discussed in this paper;
2. Collection and conservation of company knowledge over time, by using the labels collected by the human designers for training the machine learning algorithms while storing these models together with the datasets in cloud storage.

DesSSs are able to offer solutions with virtually low computation time. When compared to the classic design flow, where the designer needs to perform calculations and simulations in order to choose a specific product or decides on the basis of his or her own experience, machine learning-based DesSSs allow considerable savings in terms of time and costs.

We propose the application of three standard machine learning algorithms in order to solve the regression and classification tasks on two datasets. These algorithms are the main core of the presented DesSS, as shown in Fig. 1. Once the parameters are loaded and the domain is chosen, the web app sends a request to the cloud platform, which receives the data from the web interface and starts the training process. At the same time, the machine learning model is updated continuously as new inputs and outputs are supplied. The dataset inserted in the front end app as well as the machine learning model are also stored in the cloud storage. Afterwards, the user can request the prediction/recommendation of the output based on the unseen candidate's input parameters inserted in the web interface. More details on the real-world implementation of the framework are given in Section 5.

The study aims to implement and validate the proposed DesSS over two different real datasets. The first one is related to a heterogeneous set of electric motors including single-phase induction motors, shaded-pole induction motors, and permanent magnet synchronous motors, both single- and three-phase. The second dataset is related to the parameters of a compressor, in particular to the blade geometry. More details on the two datasets are given in Section 3.

2.1. Machine learning approaches

We decided to apply DT/RT (Breiman et al., 1984), KNN and KNN+NCFS (Yang et al., 2012) for solving both the classification and the regression task. The rational motivation behind this choice lies in the main advantages of the considered machine learning models. Although both classifiers allow learning a non-linear decision boundary, the RT/DT model has the following strengths:

• Interpretability: the RT/DT often relies on an intuitive notion of interpretability. However, the degree of interpretability depends on the model size (i.e., number of nodes and depth of the tree) (Molnar, Casalicchio, & Bischl, 2019);
• Computation effort for the training phase: even a shallow tree can be a fast classifier (Sani, Lei, & Neagu, 2018). The training time for DT is usually much faster than for RBM and DBN (Safavian & Landgrebe, 1991).
In order to support the motivation behind the application of RT/DT for solving our task, we have quantified the measures of interpretability (see Section 4.3) in terms of model size (i.e., number of nodes and depth of the tree). We have also measured the computation effort for the training phase in Section 4.1.3 (see Fig. 7(a)). Accordingly, the KNN classifier presents the following advantages:

• Simple to understand: the explanation of the output of the KNN can be provided based on the neighbors. This aspect is the result of the intrinsic interpretability of the lazy learner approach (Cunningham & Delany, 2007).
• Easy to implement and debug: the memory-based methodology behind the KNN classifier is transparent (Cunningham & Delany, 2007).
• Convergence: as the amount of data goes to infinity, the error rate of the algorithm converges to at most twice the minimum achievable error rate of an optimal classifier (i.e., one that exploits the true class distributions) (Bishop, 2006). The assumption required for convergence is that there exists a true decision surface expressible with the attributes (i.e., the conditional density function is continuous and not equal to zero) (Duda, Hart, & Stork, 2012).
• No training step: KNN does not explicitly build any model; it simply predicts new data based on the most similar historical data (i.e., the majority class among the nearest neighbors) (Bishop, 2006; Duda et al., 2012).
Although the prediction of KNN could become time-consuming during the testing stage, the experimental results highlight that the computation time of KNN for the testing phase is reasonably fast (see Fig. 7(b)). Interpretability is one of the most salient factors in a DesSS. We should know not only the predicted parameter but also why and how the prediction was made. Knowing the "why" can help to discover more about the problem and the most salient predictors involved in the prediction process. Thus, the necessity for interpretability in DesSSs arises from the problem formulation (Doshi-Velez & Kim, 2017), meaning that it is not enough to get the answer (the what); the model should also give an explanation of how it came to the answer (the why), because a correct prediction only partially solves the original problem formulation. For instance, the model may be able to properly predict geometry parameters based on different heterogeneous features, unveiling the prediction process as well as the main predictors correlated with the considered target response. Moreover, considering apparently simple and easily interpretable ML models decreases the gap between data-driven and knowledge-based approaches, with the aim of not employing black-box models, which reduce the interpretability of the function modeled by the ML algorithm. For instance, the logic behind the DT classifier allows one to understand why a certain element was assigned to one class or the other. We have also considered an extension of a weighted KNN, named KNN+NCFS, in order to improve the interpretability of the KNN model while decreasing the overfitting and the generalization error.

2.1.1. Decision/Regression tree

The implemented DT/RT is a CART model (Breiman et al., 1984) which selects the predictor that maximizes the splitting-criterion gain over all possible splits of all predictors. The splitting criterion for the DT is the Gini's diversity index, defined as follows:
Q_\tau(T) = \sum_{k=1}^{K} p_{\tau k} (1 - p_{\tau k})    (1)
where pτk is the proportion of data points in region Rτ assigned to class k. This measure encourages the formation of regions in which a high portion of the data points are assigned to one class. Accordingly, the Gini index is more robust than the misclassification rate for growing the tree because (i) it is more sensitive to node probabilities and (ii) it is differentiable and hence better suited to gradient-based optimization approaches (Bishop, 2006). For the regression tree, the splitting criterion corresponds to the residual sum-of-squares:
Q_\tau(T) = \sum_{x_n \in R_\tau} \{y_n - f_\tau(x_n)\}^2    (2)
Fig. 2. Regression and Decision Tree corresponding to the partitioning of input space related to the dataset I respectively for the prediction of the Stator Turns (Turns A) (a) and Stator Material (b) (see Section 3.1 for more details).
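As a minimal, illustrative sketch of the CART models of Fig. 2, the trees could be fitted with scikit-learn as shown below. The CSV file name, column names and the one-hot encoding step are assumptions (scikit-learn requires categorical predictors to be encoded, unlike the CART tooling implied in the paper), not the authors' actual implementation.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier, DecisionTreeRegressor

df = pd.read_csv("oracool.csv")  # hypothetical file with the Table 1 columns

input_cols = ["Frequency", "Voltage", "IP-Grade", "Application", "Market",
              "Efficiency", "Max Torque", "Rated Power", "Standard cost"]
# Encode categorical predictors so the scikit-learn trees can split on them
X = pd.get_dummies(df[input_cols])

# Decision Tree for a categorical output (Gini splitting criterion, Eq. (1))
clf = DecisionTreeClassifier(criterion="gini", max_depth=5, random_state=0)
clf.fit(X, df["Stator material"])

# Regression Tree for a real-valued output (residual sum-of-squares criterion, Eq. (2))
reg = DecisionTreeRegressor(max_depth=5, random_state=0)
reg.fit(X, df["Turns A"])
```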
Table 1
Oracool dataset: input & output.

I/O      Label                                                                                                Type
Input    Frequency, Voltage, IP-Grade, Application, Market                                                    Categorical
Input    RPM at Best Efficiency Point (BEP), RPM at max torque, Min RPM, Max RPM, Efficiency, Torque at BEP,  Integer/Real
         Max Torque, Rated Power, Standard cost, Variable cost
Output   Stator type, Stator thickness, Stator material, Rotor type, Rotor material                           Categorical
Output   Stator height, Number of Stator Turns (Turns A), Diameter of Stator Turns (Diameter Turns A),        Integer/Real
         Number of Rotor Turns (Turns B), Diameter of Rotor Turns (Diameter Turns B), Resistance, Rotor height
where xn is the predictor, yn is the target response and fτ is the mapping function. Notice that the CART model can handle both categorical and real input variables. Fig. 2a and b show, respectively, an example of the RT and DT designed to split the input space of the employed real dataset (see Section 3 for more details). The target variable for the Decision Tree is the Stator material, while the output for the Regression Tree is the Diameter Turns A (see Table 1). Within each region/leaf there is a separate model for predicting the target response. In the classification task, we aim to assign each region to a specific class, while in the regression task we might simply predict a constant over each region (see Fig. 2b).

2.1.2. K-nearest neighbors and neighborhood component features selection

The KNN is a lazy learning approach which implements a simple and efficient nonlinear decision rule. This approach often yields consistent results compared with other state-of-the-art machine learning approaches, such as neural networks and support vector machines. Additionally, we decided to go further by exploring a weighted variant of the KNN classifier, the NCFS approach (Yang et al., 2012), which learns an optimal feature weighting by minimizing the regularized leave-one-out error. The NCFS thus aims to maximize the leave-one-out classification accuracy:
E(\omega) = \sum_{i} \sum_{j} y_{ij} p_{ij} - \lambda \sum_{l=1}^{d} \omega_l^2    (3)
where ω is the weighting vector, pij is the probability that xi selects xj as its reference point, and yij is a variable set to 1 if and only if yi = yj, and yij = 0 otherwise. We set the regularization term λ = 10^-5 on the validation set, as the best choice to perform feature selection while avoiding overfitting. Fig. 3 shows an example of the optimal feature weights computed for the employed real dataset - Oracool (see Section 3 for more details), in order to estimate the target variable Stator Material. Although the KNN is widely used to predict discrete response values, it may be easily generalized to estimate real output values in the regression task. In this case, the output is the average of the values of its k nearest neighbors (Altman, 1992).
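A compact, illustrative sketch of the KNN classifier and regressor is shown below. NCFS itself is not available in scikit-learn, so Neighborhood Components Analysis is included only as a loosely related analogue (it learns a linear transformation that improves nearest-neighbour accuracy rather than the per-feature weights of Eq. (3)); the data shapes and the choice of k are placeholder assumptions.

```python
import numpy as np
from sklearn.neighbors import (KNeighborsClassifier, KNeighborsRegressor,
                               NeighborhoodComponentsAnalysis)
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 14))          # toy feature matrix (e.g. Table 2-like inputs)
y_cls = rng.integers(0, 3, size=200)    # toy categorical target
y_reg = rng.normal(size=200)            # toy real-valued target

# Plain KNN: majority vote for classification, neighbour average for regression
knn_c = KNeighborsClassifier(n_neighbors=5).fit(X, y_cls)
knn_r = KNeighborsRegressor(n_neighbors=5).fit(X, y_reg)

# NCA as an analogue of a weighted KNN: learn a metric, then classify in that space
nca_knn = make_pipeline(NeighborhoodComponentsAnalysis(random_state=0),
                        KNeighborsClassifier(n_neighbors=5)).fit(X, y_cls)
```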
3. Datasets

Two different datasets were employed in the study. The first dataset, Dataset I - Oracool, contains the different specification data and other metrics of the electric motor. The input parameters are connected to the machine performance, field of application, market and costs, while the output parameters refer to the geometry and design. The second dataset, Dataset II - Idelph, contains different compressor parameters and other metrics. The input parameters are connected to the blade geometry, while the output parameters refer to: input mass flow, blade torque, total input/output pressure, mass flow rate, etc. The two datasets are described in more detail below.
3.1. Dataset I

The first dataset employed in this study is called Oracool and it is composed of different motor specification data and other metrics. The motors considered here are single-phase induction motors, shaded-pole induction motors, and permanent magnet synchronous motors, both single- and three-phase. The rated power of the considered motors ranges from 50 W to 800 W, the maximum torque ranges from 0.05 Nm to about 0.5 Nm, and the rated speed ranges from 1000 rpm to 12000 rpm. The input parameters can be divided into motor performance (e.g., torque, voltage, efficiency, speed-RPM), field of application, i.e. Application (e.g., washing machine, dishwasher, air conditioner, heat pump), geographic market, i.e. Market (e.g., USA, Europe), and range of costs, i.e. Standard and Variable cost. Based on these heterogeneous input parameters, the main goal is to provide a reliable prediction giving as the output the motor geometry and motor design, as described in Table 1. The different inputs (predictors) are presented in detail in Table 1, as well as the outputs to be estimated by the different machine learning algorithms. The predictors are described by categorical (i.e., type of application), integer (i.e., minimum/maximum RPM) or real variables (i.e., Rated Power).
Fig. 3. Feature weights computed from the NCFS algorithm related to the real dataset for the prediction of the Stator material.

Table 2
Idelph dataset: input & output.

I/O      Label                                                                                                Type
Input    Inlet angle, Outlet angle, Inlet diameter, Outlet diameter, Axial length, Chord length, Chord angle, Integer/Real
         Number of blades, LE radius, Maximum curve-camber, Position of the maximum camber, Trailing edge
         (TE) thickness, LE position angle, Speed
Output   Input mass flow, Blade torque, Total input pressure, Total output pressure, Mass flow rate,          Integer/Real
         Prevalence, Input power, Output power, Efficiency
3.2. Dataset II

The second dataset employed in this study is called Idelph. The dataset contains the compressor geometry features, which have been used to estimate some of the compressor parameters and other metrics. In particular, the input parameters refer to the blade geometry (inlet/outlet angle, inlet/outlet diameter, axial length, chord length and angle, number of blades, leading edge (LE) radius, maximum curve-camber, position of the maximum camber, trailing edge (TE) thickness, LE position angle, and speed), and the output parameters to be estimated are: input mass flow, blade torque, total input/output pressure, mass flow rate, prevalence, input/output power, and efficiency. These parameters are listed in Table 2. For the sake of simplicity, the input parameters, i.e. the blade geometry characteristics, are not explained in detail here. More details can be found in Boyce (2012). Regarding the output parameters, the total input/output pressure is the sum of the static and dynamic pressure; the prevalence is the difference in static pressure between input and output; the input power is the electrical power and the output power is the mechanical power of the compressor; the efficiency is the ratio between input and output power.
On the other hand, the label assignment for the design output parameters (i.e., stator type in Dataset I) was driven by the professional skills of the designers as well as their related knowledge of the market and application. The above-described labeling procedure poses different challenges:

• It is time-consuming in terms of human resources. This leads to not always having the possibility to retrieve the labels from the target datasets. Generally speaking, the amount of data fed into the supervised learning algorithm is limited by the capability and availability of the designers. This problem supports our choice of applying apparently simple machine learning models on a relatively small dataset, without resorting to more complex models which might be more susceptible to overfitting.
• The labels are affected by inter-designer variability. The designers can choose different output parameters based on their different views of the application domain. We have tried to alleviate this problem by aggregating the responses of the three expert designers according to a majority vote approach (a minimal sketch of this aggregation is given after this list).
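A minimal sketch of the majority-vote aggregation mentioned above could look as follows; the example labels are hypothetical and only illustrate the procedure.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most frequent label among the designers' annotations."""
    return Counter(labels).most_common(1)[0][0]

# hypothetical annotations of three expert designers for one sample
print(majority_vote(["laminated steel", "laminated steel", "solid steel"]))
# -> 'laminated steel'
```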
In this context, as future direction, we are considering to employ the PAPRIKA approach (Hansen & Ombler, 2008; Heikkilä et al., 2013) in order to improve the labeling procedure by ranking the different response of each expert designers, in order to evaluate the confidence level of the assigned label.
3.3. Labeling procedure 4. Experimental results The labeling procedure was performed by three expert designers who exploited their professional experience to assign the output values, based on the provided input features. The choice behind the geometrical, mechanical and electrical output parameters is also supported by (i) the partial knowledge of the physical equations describing those particular relationships and (ii) the employed sensor measurement. Hence, a model-based tool supported the labeling procedure for the parameters used in this analysis.
In this Section the experimental results are presented, obtained from the real use case, i.e. an example of innovative consultancy&design company. The company is offering the technical project support and in this regard it has developed an open innovation and project development platform, enabling the cooperation between companies and specialists by sharing skills and competencies. More details on the real use case and DesSS implementation
Table 3
Range of Hyperparameters (Hyp) for the proposed ML models and all competitors' ML approaches.

Model                          Hyp                          Range
DT/RT                          max n° of splits             {5, 10, 15, 20, 25}
                               min n° of leaf size          {50, 60, 70, 80, 90, 100}
KNN                            n° of neighbors              {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
KNN+NCFS                       λ                            {10^-3, 10^-2, 0.1, 1, 10, 100}
RF                             max n° of splits             {5, 10, 15, 20, 25}
                               n° of RT                     {50, 100, 150, 200, 250}
                               n° of features to select     {all/4, all/3, all/2, all}
Boosting approaches            max n° of splits             {100, 200, 300, 400, 500}
                               max n° of cycles             {100, 200, 300, 400, 500}
SVM linear (Bilski, 2014)      Box Constraint               {10^-3, 10^-2, 0.1, 1, 10}
SVM Gaussian (Bilski, 2014)    Box Constraint               {10^-2, 0.1, 1, 10, 10^2, 10^3, 10^4}
                               Kernel Scale                 {10^-2, 0.1, 1, 10, 10^2, 10^3, 10^4}
RBM, DBN (Jin et al., 2017)    learning rate                {10^-5, 10^-4, 10^-3, 10^-2, 0.1}
                               n° of hidden layers          {1, 2, 4, 8, 16}
                               n° of units                  {16, 32, 64, 128, 256}
are given in Section 5. All the experiments are reproducible and were performed using an Azure virtual machine (D4 v3 instance) with a 2.3 GHz Intel XEON E5-2673 v4 (Broadwell) processor and 16 GB RAM.

4.1. Results: classification and regression task

In this Section, the performance of the ML algorithms selected for the DesSS implementation, according to Section 2.1, was evaluated on the two datasets. Additionally, the proposed DesSS approach based on DT/RT and KNN was compared with other state-of-the-art approaches employed in different application domains, ranging from the prediction or estimation of electrical motor data (Bilski, 2014) to DesSS (Ogino, 2017). In particular, we have considered the following classifiers:

• The Naive Bayes (NB) classifier employed in Ogino (2017) for interior design;
• The support vector machine (SVM) employed in Bilski (2014) for the identification of motor parameters and in Wang (2011) for solving a regression task in order to predict the relationship between product form elements and product images;
• The deep belief network (DBN) employed in Jin et al. (2017) for the estimation of motor data and parameters.
Linear and Gaussian SVM as well as the DBN were also considered for solving the regression task. The experimental comparison also includes other bagged (i.e., Random Forest (RF)) and boosting methodologies for solving the classification task (i.e., linear programming (LP), random under-sampling (RUS), random Subspace (Subspace) and Totally corrective (Total)) and the regression task (i.e., Bag and least-square (LS)). Moreover, we also compared the performance of the proposed approach with respect to CAE simulation software, widely used in engineering design practice. The employed CAE is the commercial package ANSYS with a finite element analysis. In both experiments we performed a 10-fold Cross-Validation (10-CV) procedure, while the hyperparameters were optimized by implementing a grid search in a nested 5-CV. Hence, each split of the outer CV loop was trained with the optimal hyperparameters (in terms of macro-f1 or RMSE) tuned in the inner CV loop. Although this procedure was computationally expensive, it allowed us to obtain an unbiased and robust performance evaluation (Cawley & Talbot, 2010). Table 3 shows the different hyperparameters for the proposed ML models and all competitors' ML approaches, as well as the grid-search set.
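The nested cross-validation protocol described above can be sketched with scikit-learn as follows. The estimator, hyperparameter grid and synthetic data are placeholders; the sketch only illustrates the 10-fold outer / 5-fold inner structure and the macro-f1 scoring, not the authors' actual tooling.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))
y = rng.integers(0, 3, size=300)

# Inner 5-fold grid search tunes the hyperparameters (cf. Table 3) on each training split
inner = GridSearchCV(KNeighborsClassifier(),
                     param_grid={"n_neighbors": list(range(1, 11))},
                     scoring="f1_macro", cv=5)

# Outer 10-fold CV gives a less biased estimate of the macro-f1 performance
outer_scores = cross_val_score(inner, X, y, scoring="f1_macro", cv=10)
print(outer_scores.mean(), outer_scores.std())
```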
The performance on the classification task is evaluated in terms of the macro-f1 score. The macro-f1 is computed by averaging the f1 score computed for each label. Since this metric weights all classes equally regardless of class imbalance, it represents a more consistent metric than standard accuracy. The performance of the RT is analyzed in terms of Mean Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and Pearson Correlation. Afterwards, we performed the analysis of computation effort in terms of training and testing time. Finally, the stability of the model was analyzed as new inputs and outputs are supplied over time.

4.1.1. DATASET I

• Classification task

Fig. 4(a) gives the comparison of the three classifiers, i.e. DT, KNN, and KNN+NCFS, in terms of macro-f1 score, taking into account Dataset I. It can be noted that the best prediction results were obtained by KNN+NCFS, whose lowest score, 0.83, occurs for the estimation of the stator material. The KNN shows its lowest score, of about 0.78, for the stator material. Since the NCFS selects and models the most relevant features, the obtained results show an improvement in the generalization performance with respect to KNN. The DT has its lowest score, of about 0.67, for the estimation of the stator material. Lower scores for DT can also be noted for the estimation of stator type and rotor material compared to the other two algorithms. Table 4 shows the results of the best machine learning approach (i.e., KNN+NCFS) for solving the classification task for all possible categorical outputs of Dataset I (i.e., Stator type, Stator thickness, Stator material, Rotor type and Rotor material). The performance is summarized in terms of the macro-f1 score. The prediction of the Stator material seems more challenging, while the prediction of the other parameters is noticeably high. This is also expected, having in mind that this design parameter may be affected by a higher intrinsic inter-variability of the designers' choices during the labeling procedure.
Table 4
Macro-f1 for the KNN+NCFS for all categorical outputs of Dataset I.

Output (Dataset I)    macro-f1 (std)
Stator type           1
Stator thickness      1
Stator material       0.83 (0.19)
Rotor type            1
Rotor material        0.90 (0.14)
Fig. 4. Dataset I: classification and regression results.
• Regression task
Fig. 4(b) shows the Pearson correlation between the predicted output and the ground truth for RT, KNN and KNN+NCFS. It is clearly visible from Fig. 4(b) that for the parameters from Dataset I, the lowest correlation factor was obtained in the case of Diameter Turns B with value of 0.491 in the case of RT. The rational motivation behind the very high results of KNN and KNN+NCFS may lie in the power of this approach to learn non-linear decision boundary while being robust to noisy training data. Although we have performed the standard pruning strategy while selecting the best hyperparameters, the RT may be very sensitive to noise data since it is a high-variance model (i.e., relatively small changes in the data can yield large changes in the resulting model). The lowest value for KNN and KNN+NCFS can be seen for the prediction of Rotor height parameter. Table 5 shows the results of the KNN-NCFS in terms of Mean Square Error (MSE), Root Mean Square Error (RMSE), Main Absolute Percentage (MAPE), and correlation, while considering the prediction of all the outputs from Dataset I. From this table very high values in terms of correlation can be noted for all possible target
outputs, which was also expected. Additionally, a high correlation does not always reflect a low MSE, RMSE and MAPE: in the estimation of some outputs (i.e., Turns A and Resistance) there is a systematic error (that can be easily filtered) between the predicted response and the ground truth, but at the same time the two variables are highly correlated. For instance, in the case of Resistance prediction both the error (i.e., MSE = 9499.24, RMSE = 54.971 and
Table 5
Mean Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and correlation for the predicted real response of KNN+NCFS for Dataset I.

Output (Dataset I)    MSE           RMSE      MAPE    Correlation
Stator height         20.415        3.475     0.056   0.937
Turns A               2.485 · 10^5  449.353   0.155   0.933
Diameter Turns A      0.001         0.033     0.055   0.956
Turns B               7.325 · 10^3  66.950    0.056   0.949
Diameter Turns B      0.001         0.024     0.074   0.983
Resistance            9499.224      54.971    2.839   0.978
Rotor Height          18.395        3.765     0.059   0.895
Table 6
Comparison with respect to other machine learning-based DesSS in terms of macro-f1: classification task, Dataset I.

Method                        Stator type  Stator thickness  Stator material  Rotor type  Rotor material
DT                            0.94         0.98              0.67             0.97        0.83
KNN                           1            1                 0.77             1           0.88
KNN+NCFS                      1            1                 0.83             1           0.90
RF                            0.96         0.98              0.71             0.98        0.82
LP Boost                      0.21         0.35              0.83             0.35        0.66
RUS Boost                     0.96         0.48              0.76             0.47        0.89
Subspace Boost                0.94         0.99              0.56             1           0.62
TotalBoost                    0.21         0.35              0.76             0.35        0.65
NB (Ogino, 2017)              0.85         0.92              0.41             0.93        0.62
SVM Linear (Bilski, 2014)     1            1                 0.57             1           0.72
SVM Gaussian (Bilski, 2014)   1            0.99              0.68             0.99        0.75
RBM                           0.96         1                 0.42             1           0.58
DBN (Jin et al., 2017)        0.96         1                 0.41             1           0.54
Table 7
Comparison with respect to other machine learning-based DesSS in terms of correlation: regression task, Dataset I.

Method                        Stator height  Turns A  Diameter Turns A  Turns B  Diameter Turns B  Resistance  Rotor height
RT                            0.84           0.86     0.87              0.90     0.49              0.81        0.89
KNN                           0.93           0.95     0.95              0.97     0.99              0.98        0.84
KNN+NCFS                      0.94           0.93     0.96              0.95     0.98              0.98        0.89
RF                            0.91           0.93     0.92              0.96     0.92              0.99        0.88
Bag Boosting                  0.93           0.90     0.89              0.96     0.96              0.98        0.87
LS Boost                      0.84           0.94     0.92              0.93     0.92              1           0.92
SVM linear (Bilski, 2014)     0.86           0.19     0.90              0.45     0.96              0.64        0.74
SVM Gaussian (Bilski, 2014)   0.15           0.35     0.64              0.17     0.86              0.22        0.06
RBM                           0.25           0.12     0.72              0.01     0.91              0.13        0.18
DBN (Jin et al., 2017)        0.25           0.08     0.60              0.04     0.69              0.11        0.11
MAPE = 2.839) and the correlation (0.978) are high, thus revealing a similar trend with a systematic error between the two responses. Since the MAPE normalizes or weights the errors by the inverse of the actual observation value (De Myttenaere, Golden, Le Grand, & Rossi, 2016), it may lead to a biased estimation that differs from the MSE and RMSE (Tofallis, 2015). However, the MAPE can be a salient measure for evaluating the performance on electrical features (e.g., input/output power).

• Comparison with respect to other machine learning-based DesSS
Table 6 shows the comparison of the machine learning algorithms employed in the implementation of the DesSS with respect to the other state-of-the-art approaches summarized in Section 4.1, for the classification task. The comparison is given for all the outputs to be predicted from Dataset I. The highest scores are given in bold for each predicted output. It can be noted that KNN+NCFS gives the highest score for all the output parameters. This result supports the choice made in Section 2.1, where the most easily interpretable ML algorithms were preferred for the DesSS implementation. Although in some cases SVM, RBM and DBN achieved very competitive performance, their computation effort is greater (see Section 4.1.3) and the interpretability of the model is lower (e.g., SVM Gaussian, DBN). Table 7 shows the comparison for the regression task. In this case KNN and KNN+NCFS give the highest scores for five out of seven output parameters (i.e., Stator height, Turns A, Diameter Turns A, Turns B and Diameter Turns B). On the other hand, LS Boost and RF are competitive in predicting Resistance and Rotor Height. Although these Boosting and Bagged tree strategies increase the generalization performance (i.e., decrease overfitting), the KNN+NCFS discloses very similar results.

• Comparison with respect to CAE simulation software
Fig. 5(a) shows the comparison between the predicted Diameter Turns A parameter and its real values obtained with the CAE simulation, mainly used for the engineering design purposes. For the
comparison, the best performing ML algorithm from Table 7 has been used, i.e. KNN+NCFS. A good match between the two methods can be observed for most of the samples.

4.1.2. DATASET II

• Regression task

Fig. 6 shows the Pearson correlation between the predicted output and the ground truth. In this case, KNN+NCFS shows the lowest values for almost all the output parameters, while the KNN is the best, except for the Total input pressure parameter, which seems the most challenging to predict. This fact outlines the differences between the two datasets. Since the KNN+NCFS is most effective when dealing with redundant and irrelevant features, all the inputs of Dataset II may be relevant for performing a reliable prediction, without the need for any feature selection/scaling. Table 8 shows the results of the regression task for KNN in terms of MSE, RMSE, MAPE, and correlation, considering the prediction of all the outputs from Dataset II. Very high values can be noted in terms of correlation for all possible target outputs, with the lowest identified for the Total input pressure (0.697). This
Table 8
Mean Square Error (MSE), Root Mean Square Error (RMSE), Mean Absolute Percentage Error (MAPE) and correlation for the predicted response of KNN regression for Dataset II.

Output (Dataset II)     MSE       RMSE      MAPE    Correlation
Blade torque            0.042     0.182     0.048   0.950
Input mass flow         0.006     0.067     0.029   0.948
Total input pressure    0.060     0.233     0.128   0.697
Total output pressure   359.472   17.661    0.036   0.934
Mass flow rate          < 10^-8   < 10^-4   0.036   0.952
Prevalence              370.887   16.547    0.037   0.932
Input power             < 10^-3   0.015     0.054   0.926
Output power            < 10^-3   0.014     0.061   0.913
Efficiency              < 10^-3   0.010     0.007   0.926
proaches exceed those of the proposed ML models for several parameters (i.e., Total input/output pressure, mass flow rate, prevalence, input/output power and efficiency); their scores are higher but still comparable with those of the proposed ML models. This may reflect the greater difficulty of the regression task for Dataset II with respect to Dataset I. However, the proposed ML approaches achieved a consistent correlation score, while ensuring a lower computational effort and higher model interpretability. The lowest scores still occur for the Total input pressure parameter for all the methods. In particular, the low correlation scores of the RBM and DBN approaches for Total input/output pressure and Prevalence can be explained by the fact that these models may be more sensitive to overfitting. In fact, these models achieved a high correlation (i.e., > 0.7) in the training stage for Total input/output pressure and Prevalence but are not able to generalize well on the test set (i.e., < 0.2).

• Comparison with respect to CAE
Fig. 5(b) shows the comparison between the predicted Blade Torque parameter and real values obtained with the CAE simulation. For the comparison the best performing ML algorithm from the Table 9 has been used, i.e. RT. The good matching between the two methods can be observed for most of the samples.
Fig. 5. Predicted vs Real response.
Fig. 6. Dataset II: regression results (Correlation of the predicted output w.r.t. the ground truth).
confirms the effectiveness of the selected ML approaches for the DesSS implementation.
4.1.3. Computation effort analysis on dataset II Starting from the high performance achieved by the proposed ML model, we decided to test the computation effort with respect to other competitors as well as the CAE simulation tool. Fig. 7 compares the time effort (test and training stage) of the proposed regression algorithms (i.e., RT, KNN and KNN+NCFS) with respect to other algorithms used in the experiments and the CAE simulation tool. In particular, the computation time was averaged over all the target parameters of Dataset II. The computation time duration of the CAE tool is significantly higher (p < .01) than the time spent by all ML models for performing both the training and test stage. In particular, the RT and KNN are able to learn the model significantly (p < .01) faster than other competitors. Although the computation effort of the KNN+NCFS for predicting the output is greater than other competitors (i.e., SVML, SVMG, DBN, RBM), it can be considered reasonably fast (on average 1.455 ± 0.148 ms) in order to give a timely and consistent prediction, while increasing the KNN model interpretability by weighting the most relevant predictors. Having in mind the Table 9, where some Boost algorithms showed superior behavior, here it is clear that KNN and RT have lower computational time for learning the model. All the ML based methodologies show a reasonably fast computation time for the testing stage. This confirms the effectiveness of the selection of the ML approaches with respect to the state-of-the-art approaches for the DesSS implementation, when considering a trade-off between the model interpretability, computation effort and accuracy prediction. Moreover, it is worth mentioning that CAE requires a knowledge about the physical system and the development of the real physical model for each dataset, while ML model we propose is able to learn relevant pattern from data. Accordingly it may be quite robust with the respect to the change of the dataset/ parameters, requiring only the new training stage, i.e. less timeconsuming.
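A simple way to reproduce this kind of training/testing time comparison is sketched below; the reported figures in the paper come from the Azure setup described in Section 4, so the numbers obtained from this toy data are only illustrative.

```python
import time
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X_tr, y_tr = rng.normal(size=(500, 14)), rng.normal(size=500)
X_te = rng.normal(size=(100, 14))

for name, model in [("RT", DecisionTreeRegressor(random_state=0)),
                    ("KNN", KNeighborsRegressor(n_neighbors=5))]:
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)                       # training stage
    t_train = time.perf_counter() - t0
    t0 = time.perf_counter()
    model.predict(X_te)                         # testing stage
    t_test = time.perf_counter() - t0
    print(f"{name}: train {t_train*1e3:.2f} ms, test {t_test*1e3:.2f} ms")
```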
• Comparison with respect to other machine learning-based DesSS
Table 9 shows the comparison of the machine learning algorithms employed in the implementation of DesSS with respect to other state-of-the art approaches summarized in Section 4.1, in terms of correlation. The comparison is given for all the outputs to be predicted from Dataset II. The highest scores are given in bold for each column. The performances of Boosting and RF ap-
4.2. Stability of the model

The DesSS is updated continuously once new inputs and target outputs are available. This leads to re-training the proposed ML algorithms. In this context we have studied how the predictor importance changes over time according to these updates. Fig. 8 shows
Table 9
Comparison with respect to other machine learning-based DesSS in terms of correlation: regression task, Dataset II.

Method                    Blade Torque  Input mass flow  Total input pressure  Total output pressure  Mass flow rate  Prevalence  Input power  Output power  Efficiency
RT                        0.95          0.92             0.76                  0.92                   0.95            0.92        0.92         0.90          0.93
KNN                       0.95          0.95             0.70                  0.93                   0.95            0.93        0.93         0.91          0.93
KNN+NCFS                  0.93          0.93             0.65                  0.88                   0.93            0.88        0.90         0.86          0.81
RF                        0.95          0.94             0.80                  0.94                   0.96            0.93        0.94         0.93          0.94
Bag Boost                 0.95          0.94             0.81                  0.94                   0.96            0.94        0.93         0.93          0.92
LS Boost                  0.95          0.94             0.77                  0.94                   0.98            0.94        0.93         0.93          0.92
SVML (Bilski, 2014)       0.95          0.92             0.66                  0.91                   0.87            0.91        0.92         0.92          0.92
SVMG (Bilski, 2014)       0.93          0.93             0.80                  0.73                   0.92            0.73        0.89         0.89          0.82
RBM                       0.45          0.56             0.10                  0.07                   0.14            0.16        0.74         0.73          0.76
DBN (Jin et al., 2017)    0.76          0.76             0.11                  0.07                   0.41            0.16        0.56         0.49          0.45
the predictor importance (weights) for the prediction of Total output pressure in Dataset II, computed for each timestamp as the model was re-trained with the new input-output data available. These weights are strictly related to the model coefficients. In particular, for the RT they were computed by summing the changes in the mean squared error due to splits on every predictor and normalizing the sum by the number of branch nodes. On the other hand, for the KNN+NCFS we computed the predictor importance according to the weights learned by NCFS. All the weights were normalized from 0 to 100. The feature weights change over time according to the new information provided to the model. This fact highlights the importance of updating the model continuously over time.
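A sketch of this kind of importance tracking is given below, using scikit-learn's impurity-based importances for the RT as a stand-in for the split-based measure described in the text; the data, batch sizes and the 0-100 scaling by the maximum weight are illustrative assumptions.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X_all = rng.normal(size=(600, 14))
y_all = X_all[:, 0] * 2.0 + rng.normal(scale=0.1, size=600)  # toy target

history = []
for n in (200, 400, 600):          # model re-trained as new data become available
    rt = DecisionTreeRegressor(random_state=0).fit(X_all[:n], y_all[:n])
    weights = 100 * rt.feature_importances_ / rt.feature_importances_.max()
    history.append(weights)        # track how the importances drift over time

print(np.round(history[-1], 1))
```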
5.1. Real use case

A web platform was developed for validating the DesSS, with the main purpose of increasing the engineering and innovation capability of the company. This process enables collaboration between those who need particular expertise on a project or an activity and those who own it. The offered service allows companies, professionals, experts in various disciplines, and technicians to cooperate worldwide on projects. The intention was to integrate the DesSS developed in this work with the services that the company offers, with the following main benefits:
4.3. Measuring the interpretability of DT/RT model The interpretability of DT/RT model was measured according to the model size (i.e., number of nodes and depth of the tree) of the learned trees (Molnar et al., 2019). For DT, the number of nodes is the number of tests that form up the decision boundaries while the depth of the tree is the maximum number of tests that have to be made for a single example to be classified (Rüping, 2006). The model size of the learned RT for the Idelph dataset is higher than the model size of the learned DT and RT for the Oracool dataset (see Fig. 9).
Efficiency
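As a rough sketch, the two model-size measures can be extracted from a fitted tree as follows; the use of scikit-learn and the synthetic classification data are assumptions made only to show where the quantities come from.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)  # placeholder design data
dt = DecisionTreeClassifier(random_state=0).fit(X, y)

n_nodes = dt.tree_.node_count   # number of tests/nodes forming the decision boundaries
depth = dt.get_depth()          # maximum number of tests needed to classify one example
print(f"model size: {n_nodes} nodes, depth {depth}")
```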
5. DesSS implementation on a real use case

This section describes in more detail the real use case used in this analysis, together with the real-world implementation of the DesSS framework presented in Fig. 1.

5.1. Real use case

A web platform was developed for validating the DesSS, with the main purpose of increasing the engineering and innovation capability of the company. This process enables collaboration between those who need a particular expertise on a project or an activity and those who own it. The service offered allows companies, professionals, experts in various disciplines, and technicians to cooperate worldwide on projects. The intention was to integrate the DesSS developed in this work with the services that the company offers, with the following main benefits:

• Facilitation of easier decision-making for the designers and technicians;
• Possibility to merge the classification of real data obtained from simulation tools or laboratory measurements with design data of a different nature (classes or values);
• Creation of models specifically built on proprietary data and management of corporate knowledge;
• Continuous updating and refinement of the model thanks to daily use that expands the database with new verified data;
• Use of the tool as simulation software (such as CAE software) able to define algorithms that provide a model of reality, without knowing in detail the physics, sometimes very complex, that governs the phenomenon;
• Immediacy of the response once the machine learning model is created, with higher computational speed and accuracy compared to simulation tools or other approaches.
5.2. DesSS implementation

The DesSS was implemented on the real use case described in Section 5.1, according to the DesSS framework presented in Fig. 1.
Fig. 10. Web App description on a real use case (Dataset II: Idelph).
Fig. 10 shows the web app of the DesSS related to the second dataset, the Idelph space domain (i.e., Idelph Wing Design). The Setup Spaces field is visible for the creation of a new space and of the parameters to be inserted during the training stage. In particular, several blade geometry parameters are inserted and grouped in terms of type, class, min/max value, unit of measure, role, etc. Additionally, the parameters to be estimated (e.g., out diameter, in diameter) have been selected as the outputs. Fig. 11 outlines how the user-designer interacts with the DesSS framework already given in Fig. 1. From the interface, the user has access to the front-end app (see Fig. 10), where the domain of the project can be defined, as well as the space where the input and output parameters are inserted and defined. Afterwards, it is possible to obtain the predicted output of the DesSS. The admin has access to the cloud where the data storage and the ML algorithms reside. Starting from the obtained experimental results, KNN, KNN+NCFS and RT were chosen for solving the classification (i.e., KNN and KNN+NCFS) and regression (i.e., KNN, KNN+NCFS and RT) tasks, as the best trade-off between model interpretability, computation effort and predictive performance. These ML algorithms were encapsulated in the cloud platform as the main core of the implemented DesSS. The predicted responses provide reasonable support to guide the design toward the best possible technical choice.

Fig. 11. User experience flow chart and DesSS framework.
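The paper does not detail the platform's internal data format or serving code; the sketch below illustrates one possible way a "space" of input/output parameters could be described and a trained model encapsulated and queried. All names, the schema, and the use of joblib/scikit-learn are hypothetical assumptions, not the platform's actual implementation.

```python
import joblib
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical "space" definition mirroring the web-app setup (names are illustrative only)
space = {
    "domain": "Idelph Wing Design",
    "inputs": [
        {"name": "blade_angle",  "type": "float", "min": 10.0, "max": 70.0,  "unit": "deg"},
        {"name": "blade_height", "type": "float", "min": 5.0,  "max": 120.0, "unit": "mm"},
    ],
    "outputs": ["out_diameter", "in_diameter"],
}

# Training stage: fit one regressor per output and persist it for the cloud back end
X = np.random.rand(200, len(space["inputs"]))   # placeholder training data
Y = np.random.rand(200, len(space["outputs"]))
for j, target in enumerate(space["outputs"]):
    model = KNeighborsRegressor(n_neighbors=5).fit(X, Y[:, j])
    joblib.dump(model, f"{target}.joblib")      # serialized model encapsulated in the platform

# Prediction stage: the front end sends new input parameters and receives the estimated outputs
def predict(space, x_new):
    return {t: float(joblib.load(f"{t}.joblib").predict([x_new])[0]) for t in space["outputs"]}

print(predict(space, [35.0, 60.0]))
```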
6. Conclusions

The main goal of this work was to create a system that brings together in a single tool many years of experience of designers and laboratory technicians. It provides the research and product development departments with a tool that can guide the designer towards the best possible technical choice. We propose a novel concept of machine learning-based DesSS to support the design choices, based on the available data of the company production processes. In particular, the focus was on the employment of standard, well-known machine learning approaches for an innovative application. The core of the DesSS is the cloud platform with the design assistant based on machine learning algorithms.

The system has been tested with the two datasets made available from the real use case. It is based on machine learning approaches and is able to predict the machine specification data and other metrics from the parameters known to the manufacturer. For instance, the algorithm is capable of predicting the machine geometry and material on the basis of several input parameters: machine performance (torque, efficiency, speed, etc.), final application, market, and cost range. Another example comes from the second dataset, where the DesSS is capable of predicting the compressor performance (blade torque, mass flow, pressure, electrical power, etc.) on the basis of blade geometry parameters given as input to the system. In this respect, the application of three standard machine learning approaches, namely DT/RT, KNN, and NCFS, was introduced as a viable trade-off between interpretability and model complexity. Moreover, the experimental results confirm that they are also a good compromise between predictive accuracy, speed and robustness (i.e., results are stable over time). These algorithms are able to extract decisional information from a heterogeneous set of data quickly and flexibly, while ensuring higher interpretability of the model.
However, in a more challenging scenario (i.e., the Idelph dataset) the increase in model complexity may lead to a decrease in model interpretability. On the other hand, small trees are unnatural and do not enclose enough information (Bratko, 1997). The experimental results (see Fig. 9) demonstrated how the learned DT/RT is discriminative (high performance) and at the same time interpretable (i.e., the maximum number of tests [decision rules] that have to be made for a single example to be classified is less than 20 for the Idelph dataset and less than 10 for the Oracool dataset). The predicted parameters are then used to recommend the best technical choice to the designers and technicians.

The main advantages and benefits of the real-world implementation of the developed system can be seen in: i) easier decision-making; ii) higher computational speed and accuracy compared to simulation tools (notable savings in man-hours when compared to the traditional design practice); iii) the preservation of the company's knowledge. Experimental results, obtained on the real use case by using two datasets, demonstrated the appropriateness of the application of the machine learning approaches as the main core of the DesSS. In particular, the results show high reliability of the KNN, KNN+NCFS and RT for solving the classification and regression tasks. When compared to other machine learning approaches available in the literature for the estimation of motor parameters or for DesSS implementation, such as Support Vector Machine (SVM), Naïve Bayes (NB), and Deep Belief Network (DBN), the chosen ML algorithms show higher performance for the majority of the outputs to be predicted. Moreover, the computational effort is also lower than that of the other ML approaches and of the CAE tool.

Future work may address the investigation of the scalability of the system and the development of a general framework and guidelines for easier serialization. Moreover, the use of the DesSS tool can be extended to other fields of application. In this scenario, we are working to integrate the DesSS into a serverless platform where it is possible to trigger the ML components (i.e., DT, RT, KNN and KNN+NCFS) by an ingestion event (Azure blob creation, AWS S3 file creation). Hence, it is possible to manage this event by using a message/event bus connected to a lambda function or, more generally, a cloud function that invokes the ML framework; a minimal sketch of such an event-triggered function is given after the list below. This configuration will enhance the scalability of the system while allowing the continuous update of the ML model. Starting from the impact and the implications, we aim to overcome some limitations of our study by following specific future directions:
• The validation of the proposed DesSS in a specific domain with a limited set of data. There is a need to test the proposed machine learning approaches with different datasets belonging to other domains. In particular, we aim to scale up the proposed approach to more challenging datasets, e.g., highly unbalanced settings with a huge amount of heterogeneous data;
• The management of the intrinsic variability of the annotation procedure. The related future direction aims to improve the annotation procedure by ranking the different responses of each expert designer (e.g., the PAPRIKA approach (Hansen & Ombler, 2008; Heikkilä et al., 2013));
• The optimization of the online training procedure. As the volume of the input increases, the continuous update of the KNN may pose a challenge due to the "curse of dimensionality" and the high in-memory search cost (Yang, Yu, & Liu, 2014). We plan to optimize the already introduced feature selection technique (NCFS) in order to deal with this challenge while preserving the interpretability of the model.
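Relating to the serverless integration outlined above, the following is a minimal sketch of an event-triggered cloud function that re-trains a model when a new data file lands in object storage. The AWS Lambda/S3 wiring, the bucket layout, the CSV schema with a "target" column, and the re-training routine are all illustrative assumptions and not part of the system described in the paper.

```python
import io
import json

import boto3
import joblib
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor

s3 = boto3.client("s3")
MODEL_KEY = "models/knn_dess.joblib"  # illustrative storage location

def handler(event, context):
    """Hypothetical AWS Lambda entry point triggered by an S3 file-creation event."""
    record = event["Records"][0]["s3"]
    bucket, key = record["bucket"]["name"], record["object"]["key"]

    # Ingest the newly uploaded input/output data (CSV assumed for illustration)
    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
    data = pd.read_csv(io.BytesIO(body))
    X, y = data.drop(columns=["target"]).values, data["target"].values

    # Re-train the ML component and store the updated model back in the cloud
    model = KNeighborsRegressor(n_neighbors=5).fit(X, y)
    buf = io.BytesIO()
    joblib.dump(model, buf)
    s3.put_object(Bucket=bucket, Key=MODEL_KEY, Body=buf.getvalue())

    return {"statusCode": 200, "body": json.dumps({"retrained_on": key})}
```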
Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

CRediT authorship contribution statement

Luca Romeo: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Software, Writing - original draft. Jelena Loncarski: Formal analysis, Investigation, Writing - original draft. Marina Paolanti: Investigation, Validation, Writing - review & editing. Gianluca Bocchini: Data curation, Validation, Software, Visualization. Adriano Mancini: Supervision, Validation, Software, Writing - review & editing. Emanuele Frontoni: Supervision, Validation, Writing - original draft.

References

Akasaka, F., Nemoto, Y., Kimita, K., & Shimomura, Y. (2012). Development of a knowledge-based design support system for product-service systems. Computers in Industry, 63(4), 309–318. Alaoui, Y. L., & Tkiouat, M. (2018). Developing a decision making support tool for planning customer satisfaction strategies in microfinance industry. In International conference on intelligent systems and computer vision (ISCV) (pp. 1–7). Altman, N. S. (1992). An introduction to kernel and nearest-neighbor nonparametric regression. The American Statistician, 46(3), 175–185. Bilski, P. (2014). Application of support vector machines to the induction motor parameters identification. Measurement, 51, 377–386. Bishop, C. M. (2006). Pattern recognition and machine learning (Information science and statistics). Berlin, Heidelberg: Springer-Verlag. Blondet, G., Duigou, J. L., & Boudaoud, N. (2019). A knowledge-based system for numerical design of experiments processes in mechanical engineering. Expert Systems with Applications, 122, 289–302. Boyce, M. P. (2012). 7 - axial-flow compressors. In Gas turbine engineering handbook (pp. 303–355). Oxford: Butterworth-Heinemann. (4th ed.). Bratko, I. (1997). Machine learning: Between accuracy and interpretability. In Learning, networks and statistics (pp. 163–177). Springer. Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. Monterey, CA: Wadsworth and Brooks. Cabrerizo, F. J., Morente-Molinera, J. A., Pérez, I. J., López-Gijón, J., & Herrera-Viedma, E. (2015). A decision support system to develop a quality management in academic digital libraries. Information Sciences, 323, 48–58. Cawley, G. C., & Talbot, N. L. (2010). On over-fitting in model selection and subsequent selection bias in performance evaluation. Journal of Machine Learning Research, 11(Jul), 2079–2107. Chen, C. P., & Zhang, C.-Y. (2014). Data-intensive applications, challenges, techniques and technologies: A survey on big data. Information Sciences, 275, 314–347. Comesaña-Campos, A., Cerqueiro-Pequeño, J., & Bouza-Rodríguez, J. B. (2018). The value index as a decision support tool applied to a new system for evaluating and selecting design alternatives. Expert Systems with Applications, 113, 278–300. Cortes, C., & Vapnik, V. (1995). Support-vector networks. Machine Learning, 20(3), 273–297. Cunningham, P., & Delany, S. J. (2007). k-nearest neighbour classifiers. Multiple Classifier Systems, 34(8), 1–17. De Myttenaere, A., Golden, B., Le Grand, B., & Rossi, F. (2016). Mean absolute percentage error for regression models. Neurocomputing, 192, 38–48. Doshi-Velez, F., & Kim, B. (2017). Towards a rigorous science of interpretable machine learning. arXiv:1702.08608. Duan, F., Zivanovic, R., Al-Sarawi, S., & Mba, D. (2016).
Induction motor parameter estimation using sparse grid optimization algorithm. IEEE Transactions on Industrial Informatics, 12(4), 1453–1461. Duda, R. O., Hart, P. E., & Stork, D. G. (2012). Pattern classification. John Wiley & Sons. El-Sousy, F. F. M. (2013). Adaptive dynamic sliding-mode control system using recurrent RBFN for high-performance induction motor servo drive. IEEE Transactions on Industrial Informatics, 9(4), 1922–1936. Ford, F. N. (1985). Decision support systems and expert systems: A comparison. Information & Management, 8(1), 21–26. Hansen, P., & Ombler, F. (2008). A new method for scoring additive multi-attribute value models using pairwise rankings of alternatives. Journal of Multi-Criteria Decision Analysis, 15(3–4), 87–107. Haque, M. H. (2008). Determination of NEMA design induction motor parameters from manufacturer data. IEEE Transactions on Energy Conversion, 23(4), 997–1004. Heikkilä, T., Dalgaard, L., & Koskinen, J. (2013). Designing autonomous robot systems-evaluation of the r3-cop decision support system approach. In Safecomp 2013-workshop DECS (ERCIM/EWICS workshop on dependable embedded and cyber-physical systems) of the 32nd international conference on computer safety, reliability and security (p. NA).
Heiner, L., Fettke, P., Feld, T., & Hoffmann, M. (2014). Industry 4.0. Business & Information Systems Engineering, 6(4), 239–242. Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554. Horita, F. E., de Albuquerque, J. P., Marchezini, V., & Mendiondo, E. M. (2017). Bridging the gap between decision-making and emerging big data sources: An application of a model-based framework to disaster management in Brazil. Decision Support Systems, 97, 12–22. Jiang, X., Sun, X., Wang, S., & Zhang, X. (2010). An intelligent decision support system for process quality control. In IEEE international conference on automation and logistics (pp. 637–642). Jin, L., Wang, F., & Yang, Q. (2017). Performance analysis and optimization of permanent magnet synchronous motor based on deep learning. In 20th international conference on electrical machines and systems (ICEMS) (pp. 1–5). Jindo, T., Hirasago, K., & Nagamachi, M. (1995). Development of a design support system for office chairs using 3-d graphics. International Journal of Industrial Ergonomics, 15(1), 49–62. Jirdehi, M. A., & Rezaei, A. (2016). Parameters estimation of squirrel-cage induction motors using ANN and ANFIS. Alexandria Engineering Journal, 55(1), 357–368. Karmarkar, A., & Gilke, N. (2018). Fuzzy logic based decision support systems in variant production. Materials Today: Proceedings, 5(2, Part 1), 3842–3850. Khatun, M., & Miah, S. J. (2016). Design of a decision support system framework for small-business managers: A context of B2C e-commerce environment. In Future technologies conference (FTC) (pp. 1274–1281). Kinoshita, T., Sugawara, K., & Shiratori, N. (1988). Knowledge-based design support system for computer communication system. IEEE Journal on Selected Areas in Communications, 6(5), 850–861. Krings, A., Cossale, M., Tenconi, A., Soulard, J., Cavagnino, A., & Boglietti, A. (2017). Magnetic materials used in electrical machines: A comparison and selection guide for early machine design. IEEE Industry Applications Magazine, 23(6), 21–28. Kulhavy, R. (2003). A developer’s perspective of a decision support system. IEEE Control Systems, 23(6), 40–49. Larochelle, H., & Bengio, Y. (2008). Classification using discriminative restricted Boltzmann machines. In Proceedings of the 25th international conference on machine learning (pp. 536–543). ACM. Lee, J., Kao, H.-A., & Yang, S. (2014). Service innovation and smart analytics for industry 4.0 and big data environment. Procedia CIRP, 16, 3–8. Li, C., Chen, Z., & Yao, B. (2018). Adaptive robust synchronization control of a dual-linear-motor-driven gantry with rotational dynamics and accurate online parameter estimation. IEEE Transactions on Industrial Informatics, 14(7), 3013–3022. Liu, E., Hsiao, S.-W., & Hsiao, S.-W. (2014). A decision support system for product family design. Information Sciences, 281, 113–127. Malmir, B., Amini, M., & Chang, S. I. (2017). A medical decision support system for disease diagnosis under uncertainty. Expert Systems with Applications, 88, 95–108. Molnar, C., Casalicchio, G., & Bischl, B. (2019). Quantifying interpretability of arbitrary machine learning models through functional decomposition. arXiv:1904.03867. Ogino, A. (2017). A design support system for indoor design with originality suitable for interior style.
In International conference on biometrics and KANSEI engineering (ICBAKE) (pp. 74–79). Okamoto, S., Takematsu, S., Matsumoto, S., Otabe, T., Tanaka, T., & Tokuyasu, T. (2016). Development of design support system of a lane for cyclists and pedestrians. In 10th International conference on complex, intelligent, and software intensive systems (CISIS) (pp. 385–388). Prasad, D., & Ratna, S. (2018). Decision support systems in the metal casting industry: An academic review of research articles. Materials Today: Proceedings, 5(1, Part 1), 1298–1312.
Razavi-Far, R., Farajzadeh-Zanjani, M., & Saif, M. (2017). An integrated class-imbalanced learning scheme for diagnosing bearing defects in induction motors. IEEE Transactions on Industrial Informatics, 13(6), 2758–2769. Rüping, S. (2006). Learning interpretable models. Ph.D. thesis. der Universitat Dortmund am Fachbereich Informatik. Russell, S. J., & Norvig, P. (2002). Artificial intelligence: A modern approach (2nd Edition). Prentice Hall. Safavian, S. R., & Landgrebe, D. (1991). A survey of decision tree classifier methodology. IEEE Transactions on Systems, Man, and Cybernetics, 21(3), 660–674. Sancin, U., Dobravc, M., & Dolak, B. (2010). Human cognition as an intelligent decision support system for plastic products’ design. Expert Systems with Applications, 37(10), 7227–7233. Sani, H. M., Lei, C., & Neagu, D. (2018). Computational complexity analysis of decision tree algorithms. In M. Bramer, & M. Petridis (Eds.), Artificial intelligence XXXV (pp. 191–197). Song, J., Dong, F., Zhao, J., Zhao, J., Qian, Z., & Zhang, Q. (2017). A new regression modeling method for PMSLM design optimization based on k-nearest neighbor algorithm. In 20th International conference on electrical machines and systems (icems) (pp. 1–5). Susto, G. A., Schirru, A., Pampuri, S., McLoone, S., & Beghi, A. (2015). Machine learning for predictive maintenance: A multiple classifier approach. IEEE Transactions on Industrial Informatics, 11(3), 812–820. Tanaka, Y., & Tsuda, K. (2016). Developing design support system based on semantic of design model. Procedia Computer Science, 96, 1231–1239. Tessarolo, A., Martin, M. D., Diffen, D., Branz, M., & Bailoni, M. (2014a). Practical assessment of homothetic dimensioning criteria for induction motors. In 7th iet international conference on power electronics, machines and drives (PEMD 2014) (pp. 1–6). Tessarolo, A., Martin, M. D., Giulivo, D., Diffen, D., Lipardi, G., & Mazzuca, T. (2014b). A heuristic homotetic approach to the dimensioning of induction motors from specification data. In AEIT annual conference - from research to industry: The need for a more effective technology transfer (AEIT) (pp. 1–5). Tofallis, C. (2015). A better measure of relative prediction accuracy for model selection and model estimation. Journal of the Operational Research Society, 66(8), 1352–1362. Tseng, T.-L. B., & Huang, C.-C. (2008). Design support systems: A case study of modular design of the set-top box from design knowledge externalization perspective. Decision Support Systems, 44(4), 909–924. Turban, E., & Watkins, P. R. (1986). Integrating expert systems and decision support systems. MIS Quarterly, 10(2), 121–136. Villegas, M. A., & Pedregal, D. J. (2018). Supply chain decision support systems based on a novel hierarchical forecasting approach. Decision Support Systems, 114, 29–36. Wang, K.-C. (2011). A hybrid kansei engineering design expert system based on grey system theory and support vector regression. Expert Systems with Applications, 38(7), 8738–8750. Wang, R., Wang, G., Yan, Y., Chen, S., Allen, J. K., & Mistree, F. (2017). A framework for knowledge-intensive design decision support in model based realization of complex engineered systems. In 2017 ieee international conference on industrial engineering and engineering management (IEEM) (pp. 230–234). Xu, D., Huang, J., Su, X., & Shi, P. (2019). Adaptive command-filtered fuzzy backstepping control for linear induction motor with unknown end effect. Information Sciences, 477, 118–131. Yang, C., Yu, X., & Liu, Y. (2014). 
Continuous knn join processing for real-time recommendation. In 2014 IEEE international conference on data mining (pp. 640–649). IEEE. Yang, W., Wang, K., & Zuo, W. (2012). Neighborhood component feature selection for high-dimensional data. Journal of Computers, 7(1), 161. Yin, C., Xi, J., Sun, R., & Wang, J. (2018). Location privacy protection based on differential privacy strategy for big data in industrial internet-of-things. IEEE Transactions on Industrial Informatics. 1–1.