Available online at www.sciencedirect.com
ScienceDirect Procedia CIRP 59 (2017) 190 – 195
The 5th International Conference on Through-life Engineering Services (TESConf 2016)
Determination of optimum criteria for condition-based maintenance of automatic ticket gates using remote monitoring data
Yusuke Sato*, Akihiro Morimoto, Shozo Takata
Department of Business Design and Management, School of Creative Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
*Corresponding author. Tel.: +81-3-5286-3299; fax: +81-3-3202-2543. E-mail address:
[email protected]
Abstract
Condition-based maintenance is effective in improving availability by preventing failures, especially where the lives of equipment or components vary because of changing operating and environmental conditions. However, monitoring systems do not always detect failure symptoms accurately. In such cases, we need to determine proper criteria for executing preventive maintenance so as to minimize the total effects, which include both the effects of successful preventive maintenance and those of unsuccessful preventive maintenance. This paper proposes a method to determine the optimum criteria for executing preventive maintenance of mechatronics equipment. Because mechatronics equipment usually uses sensors installed for control purposes to monitor failure symptoms, the error messages generated from the sensor signals do not have a one-to-one correspondence with component deterioration and failures. Therefore, we need a method to relate the error messages to the deterioration and failures. We propose a four-step procedure for this purpose, in which a structural and functional analysis is integrated with an analysis of history data. The proposed method is applied to automatic ticket gates installed in train stations in Japan to verify its effectiveness.
© 2016 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of the scientific committee of the 5th International Conference on Through-life Engineering Services (TESConf 2016).
Keywords: Preventive maintenance criteria; Remote monitoring; Condition-based maintenance; Mechatronics equipment
1. Introduction
With the growing importance of maintenance in recent years [1], remote maintenance systems have been introduced in various fields [2, 3]. A remote maintenance system remotely collects information on machine conditions and identifies deterioration or failure symptoms, which can be used to trigger preventive maintenance (PM) actions. Remote maintenance systems are effective in reducing inspection and diagnosis costs, especially when many machines need to be maintained or when the machines are installed in places that are difficult to access. Air-conditioning equipment in large buildings and gas turbine engines in power generation plants spread around the world are typical examples.
ISSN 2212-8271. doi:10.1016/j.procir.2016.09.007
Yusuke Sato et al. / Procedia CIRP 59 (2017) 190 – 195
Remote maintenance systems are widely applied to mechatronics equipment (e.g., automatic ticket gates (ATGs) in train stations and air-conditioning systems in buildings) because such equipment is usually fitted with many sensors for control purposes, which makes remote maintenance systems easy to implement. However, sensors for control purposes cannot directly monitor failure occurrences or symptoms. Therefore, we need methods for extracting information on deterioration and failure symptoms from the monitored data and for setting proper criteria for PM execution. Here we have to consider the effects of both false alarms and overlooked symptoms, because any monitoring system can make a misjudgment. We therefore need to optimize the criteria for triggering PM execution so as to minimize the effects of such misjudgments.
Many studies have been conducted on condition-based maintenance (CBM). They focused on the diagnosis and prognosis of deterioration and failures based on monitoring data [4, 5] or on setting the criteria for PM execution [6]. Most of these studies adopted either a stochastic approach based on a stochastic model or a structural approach, in which the behaviors of machines with defective components are analyzed based on a structural and functional analysis of the machines. However, few studies integrated both approaches, and few dealt with CBM using sensors for control purposes instead of dedicated sensors.
This paper proposes a method to relate the monitoring data to deterioration and failures and to determine the optimum criteria for PM execution of mechatronics equipment. The rest of the paper is organized as follows: Section 2 explains the procedure for setting the criteria for PM execution. Section 3 presents the application of the proposed procedure to automatic ticket gates installed in train stations to demonstrate its effectiveness. Section 4 concludes the paper.

2. Determination of the optimum criteria for the PM execution
2.1. Proposed procedure of determining the optimum criteria for the PM execution
Mechatronics equipment is commonly equipped with a monitoring system for detecting malfunctions. These monitoring systems usually use sensors installed for control purposes because of the installation cost. However, such sensors do not necessarily monitor the deterioration or failures of the mechanisms directly. They usually detect the behaviors of the mechanisms or of the objects to be handled, such as tickets in the case of automatic ticket gates. Therefore, the monitored data obtained by the sensors for control purposes do not have a one-to-one correspondence with the deterioration and failures of the mechanisms, which is why we need a method to set proper criteria for PM execution.
In this study, we assume that the error messages are generated by the controllers of the mechatronics equipment (hereafter called machines) based on certain logic using the monitoring data. The criteria are expressed as “PM is executed when a certain combination of the error messages is generated more than a certain number of times Ne during a certain period of time Tc”.

[Figure 1: flow of the four steps, illustrated for the failure group related with component A. Step 1: Structural and functional analysis of the machines. Step 2: Analysis of history data. Step 3: Identification of the combinations of the error messages used for the criteria. Step 4: Determination of optimal criteria based on the simulation.]
Fig. 1. Procedure for determining the optimal criteria for the PM executions

Figure 1 shows the proposed procedure for determining the optimal criteria, which comprises four steps. First, the error messages related to failures are identified based on the structural and functional analysis of the machines. Second, the quantitative relationships between the error messages and the failures are evaluated using the historical data. Third, the proper combinations of error messages for effective detection of the failure symptoms are selected. Fourth, the optimal criteria for PM execution are determined based on a simulation using the historical data.
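As a minimal sketch of how a criterion of this form could be evaluated for a single error message, the following Python function scans the dates on which the message was generated and reports the first date on which at least Ne occurrences fall within a window of Tc days. The function name `first_pm_trigger`, the reading of the threshold as "at least Ne", and the inclusive day window are assumptions made for illustration; the paper does not specify these details.

```python
from datetime import date


def first_pm_trigger(error_dates, n_e, t_c_days):
    """Return the first date on which at least n_e error messages fall
    within a window of t_c_days days (assumed interpretation), else None."""
    dates = sorted(error_dates)
    start = 0
    for end, current in enumerate(dates):
        # Drop occurrences that are t_c_days or more older than the current one.
        while (current - dates[start]).days >= t_c_days:
            start += 1
        if end - start + 1 >= n_e:
            return current
    return None


# Hypothetical occurrence dates of one error message (one entry per message).
example_dates = [
    date(2016, 1, 1), date(2016, 1, 2),
    date(2016, 1, 10), date(2016, 1, 11), date(2016, 1, 12),
]
print(first_pm_trigger(example_dates, n_e=3, t_c_days=3))  # prints 2016-01-12
```

A criterion on a combination of error messages could reuse the same check for each message and require all of them to fire, but that combination rule is likewise an assumption.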
2.2. Step 1: Structural and functional analysis of the machines
We first examine the relations between component deterioration and failures and the error messages based on the machine mechanisms. For this purpose, the structural and functional relationships among the machine components are identified. Based on these relations, we can analyze how the deterioration and failures of components affect the behaviors of the machine and of the objects it handles. When a component deteriorates or fails, its effect propagates along its structural and functional relations and induces behavioral changes in the machine and the objects, which are recognized as machine failures. These changes can be detected by the sensors, and error messages are then generated. By analyzing these processes, we can relate the error messages to the component deterioration and failures.
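One possible way to record the outcome of such an analysis is a directed graph from component faults through behavioral changes to error messages. The sketch below is only an illustration of that idea: the node names, the edges, and the function `reachable_errors` are hypothetical and are not taken from the paper or from any real ticket gate design.

```python
from collections import deque

# Hypothetical propagation graph: each key can induce the effects listed for it.
PROPAGATION = {
    "belt_wear": ["ticket_slip"],
    "roller_contamination": ["ticket_slip", "ticket_skew"],
    "ticket_slip": ["late_arrival_at_sensor"],
    "ticket_skew": ["ticket_jam"],
    "late_arrival_at_sensor": ["Error1", "Error3"],
    "ticket_jam": ["Error4"],
}


def reachable_errors(component_fault, graph):
    """Breadth-first search from a component fault to the error messages
    that its propagated effects could generate."""
    seen = {component_fault}
    queue = deque([component_fault])
    errors = set()
    while queue:
        node = queue.popleft()
        for effect in graph.get(node, []):
            if effect in seen:
                continue
            seen.add(effect)
            if effect.startswith("Error"):
                errors.add(effect)  # terminal node: an error message
            else:
                queue.append(effect)  # intermediate behavioral change
    return errors


print(sorted(reachable_errors("belt_wear", PROPAGATION)))  # prints ['Error1', 'Error3']
```

Because several faults can reach the same error message in such a graph, the mapping is many-to-many, which is the qualitative relation that the later steps quantify.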
2.3. Step 2: Analysis of the history data
The structural and functional analysis is effective in identifying the relations between the error messages and the component deterioration and failures. However, it yields only qualitative relations. Analyzing the failures and error messages that actually occurred in the past is effective in evaluating the quantitative relations. We define the increasing ratio of an error message as the ratio of the average number of the error messages during a certain period of time TIR before a failure occurrence to that during normal operation. We then calculate the increasing ratio for each combination of a failure instance and an individual error message (Table 1) and mark the increasing ratios that exceed a certain threshold value, LIR. Failures are grouped in terms of the responsible components. If the marked ratios for a specific error message form the majority within a specific failure group, that error message is identified as a candidate for use in the criterion for predicting the failures of that group.

Table 1. Increasing ratio of the error message (failure group related with component A)
            Error1   Error3   Error4   Error8   Error9   Error13
Failure1      4.92     2.41     2.44     0.00     1.33      3.50
Failure2     12.33     0.00     5.19     0.00     0.00      1.41
Failure3      4.53     3.66     6.71     3.41     0.44     13.77
Failure4      6.73     0.00     0.00     0.00     0.00      4.44
Failure5      0.00     2.33     1.33     0.00     0.20      0.23
Failure6      5.37     0.00     3.33     1.44     0.00      2.44
Failure7     30.46     4.22     6.44     0.45     0.00     12.24

2.4. Step 3: Identification of the combinations of the error messages used for the criteria
As already mentioned, the error messages and the component deterioration and failures do not have a one-to-one correspondence. Therefore, we consider using combinations of error messages to predict failure occurrences. We explain the method for the case of combining two error messages. First, we calculate the moving average of the number of each error message over a period TMA. By multiplying the moving averages of the two error messages, we obtain the change over time in the number of cases where the two messages are generated concurrently (Figure 2).

[Figure 2: the number of error messages versus date for Error1 and Error2, and the product of the two error messages versus date, with the failure occurrence marked.]
Fig. 2. Multiplication of the change in the number of error messages with respect to time

The frequency distributions of such co-occurrences are then created for two mutually exclusive cases. One is the period TFD before the failure occurrences of a certain failure group. The other is the rest of the whole period covered by the historical data used for the analysis. If we set the threshold of the number of co-occurrences to Hco, the dark area in Figure 3(a), where the values are greater than the threshold, shows the cumulative number of successful predictions of failures of the corresponding group, Ns. The dark area in Figure 3(b) shows the cumulative number of false alarms, Nf, which induce unnecessary PM executions. Therefore, the success rate of the PM execution can be expressed as Ns / (Ns + Nf). Accordingly, for each combination of the error messages extracted as candidates in Step 2, and with respect to each failure group, we search for the best success rate by changing the threshold value, Hco. We decide that the corresponding combination of error messages can be used for failure prediction if the best success rate exceeds a predetermined value.

[Figure 3: frequency distributions (ratio versus the number of co-occurrences of two error messages, with the threshold Hco) for (a) the period before the failure occurrence, where the area above Hco gives the number of successful PM executions, and (b) the rest of the period, where it gives the unnecessary PM executions.]
Fig. 3. Frequency distributions of the co-occurrences

2.5. Step 4: Determination of the optimal criteria based on the simulation
In Step 3, we selected the error message combinations that can be used for the PM criteria based only on the success rate of the PM executions. In this step, we optimize the criteria considering the effects of failure occurrences and maintenance executions, including both breakdown and preventive maintenance. We adopt a simulation using the historical data to determine the values of the period, Tc, and the threshold number of error messages, Ne, that minimize the total effect. We assume that the monitoring data history is available for a certain period of time in which only breakdown maintenance (BM) was executed and PM was not adopted. The simulations are conducted for each single error message and each combination of error messages with respect to each failure group, and the numbers of successes, oversights, and false alarms are counted. “Success” means that a failure occurred within a certain period of time after error messages satisfying the criterion were detected. “Oversight” means that a failure occurred without error messages satisfying the criterion being generated. “False alarm” means that error messages satisfying the criterion were generated, but no failure occurred. A tradeoff exists between the number of oversights and that of false alarms. When the criteria are made more stringent, for example by increasing Ne, the false alarms decrease but the oversights increase. In other words, the number of unnecessary PMs falls, but that of BMs rises. Conversely, the situation reverses when the criteria are eased by reducing Ne. The simulations are iterated while changing Tc and Ne, and the best criterion for each error message combination and each failure group is the one that minimizes the total effect during the simulation period.
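The search over Tc and Ne described above could be sketched as follows for one error message and one failure group. This is an illustration under stated assumptions rather than the authors' simulator: the function names, the `horizon` parameter (the period after an alarm within which a failure counts as a success), the rule that no new alarm is raised during that horizon, the effect weights (1 for a PM and 2 for a BM, the ratio assumed later in Section 3.2), and the toy data are all hypothetical.

```python
def simulate(daily_counts, failure_days, t_c, n_e, horizon):
    """Count (successes, oversights, false_alarms) for one criterion.
    daily_counts[d] is the number of error messages on day d."""
    alarms = []
    next_allowed = 0
    for day in range(len(daily_counts)):
        if day < next_allowed:
            continue  # assumption: one PM covers the following horizon
        window = daily_counts[max(0, day - t_c + 1): day + 1]
        if sum(window) >= n_e:
            alarms.append(day)
            next_allowed = day + horizon + 1
    predicted = set()
    successes = false_alarms = 0
    for alarm in alarms:
        hits = [f for f in failure_days if alarm < f <= alarm + horizon]
        if hits:
            successes += 1
            predicted.add(min(hits))  # assumption: one PM prevents one failure
        else:
            false_alarms += 1
    oversights = sum(1 for f in failure_days if f not in predicted)
    return successes, oversights, false_alarms


def total_effect(successes, oversights, false_alarms, pm_effect=1.0, bm_effect=2.0):
    """Every alarm costs one PM; every overlooked failure costs one BM."""
    return pm_effect * (successes + false_alarms) + bm_effect * oversights


def best_criterion(daily_counts, failure_days, t_c_values, n_e_values, horizon):
    """Grid search for the (Tc, Ne) pair with the minimum total effect."""
    best = None
    for t_c in t_c_values:
        for n_e in n_e_values:
            counts = simulate(daily_counts, failure_days, t_c, n_e, horizon)
            effect = total_effect(*counts)
            if best is None or effect < best[0]:
                best = (effect, t_c, n_e)
    return best


# Toy history: 30 days of error counts, failures on days 9 and 22.
daily_counts = [0, 1, 0, 0, 0, 2, 3, 2, 0, 0, 0, 0, 1, 0, 0,
                0, 0, 0, 3, 3, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
failure_days = [9, 22]
print(best_criterion(daily_counts, failure_days, [1, 3], [1, 5, 8], horizon=5))
# prints (2.0, 3, 5)
```

In this toy history, the loose criterion (Tc = 1, Ne = 1) raises two false alarms and the strict one (Tc = 3, Ne = 8) overlooks both failures, which mirrors the tradeoff between unnecessary PMs and BMs.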
The total effect is evaluated based on the numbers of PMs and BMs identified in the simulation. It includes the cost of PMs and BMs and the effects of the service stoppage caused by failures and maintenance executions. The effect of a PM is generally smaller than that of a BM because a PM needs a shorter downtime and can be executed during a time period when a service stoppage has smaller effects.

3. Application to automatic ticket gates
3.1. Basic features of automatic ticket gates
The proposed method is applied to the ATGs installed in train stations in Japan. Tickets are checked before boarding and after alighting at every station. The type of ATG studied in this research consists of four main units: the body, door, ticket transfer, and IC card units. Although various failures occur in each unit, failures of the ticket transfer unit account for 67% of the total failures of the ATGs. The ticket transfer unit is also important from the maintenance point of view because of its complex mechanism. Therefore, the proposed method is applied to the ticket transfer unit to verify its effectiveness.

Fig. 4. Schematic of the ticket transfer unit

Figure 4 shows the schematic of the ticket transfer unit. The ticket is inserted from the right side and transferred along the broken line indicated in the figure. The ticket transfer unit is further divided into ten modules, which mainly consist of belts, rollers, sensors, and solenoids.
The monitoring data of the ATGs are sent through the network and accumulated in a server in the maintenance center. The monitoring data comprise the operation data, which contain the number of input tickets and touched IC cards, and the number of error messages generated from the sensor signals. A total of 77 types of error messages are related to the ticket transfer unit. In addition to the monitoring data, the maintenance history is also accumulated. The history contains the failure data, which record the dates of failure occurrences, ATG numbers, phenomena, causes of failures, and treated components.
The maintenance policy of the ATGs was BM in the past. PM was introduced two years ago. However, the criteria for executing PM were determined empirically. Therefore, we applied the proposed method to optimize the criteria for PM execution.

3.2. Determination of the optimum criteria for PM execution
First, the error messages related to deterioration and failures are identified based on the drawings of the mechanism and the document explaining the logic of error message generation. Second, we identify the error messages related to the failures based on the history data of 252 ATGs. A total of 970 failures occurred in three years. These failures are categorized into groups in terms of the responsible components. Some failure groups contain failures that rarely occur. We consider PM criteria only for failure groups with more than five failures during the analyzed period, because identifying the error messages related to a group with fewer failures is difficult.
Consequently, we deal with 55 failure groups, which include a total of 901 failure occurrences. We set the period of time, TIR, for calculating the increasing ratio of the number of error messages to seven days and the threshold value LIR to 2. As a result, we can identify the error messages related to each failure group.
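The Step 2 computation with these settings (TIR = 7 days, LIR = 2) could look like the sketch below. The function names, the way the normal-operation average is supplied as a parameter, and the example numbers are assumptions for illustration; the paper does not state how the normal-operation average is obtained.

```python
def increasing_ratio(daily_counts, failure_day, t_ir, normal_daily_mean):
    """Average daily error count over the t_ir days before the failure,
    divided by the average daily count during normal operation."""
    window = daily_counts[max(0, failure_day - t_ir):failure_day]
    if not window or normal_daily_mean <= 0:
        return 0.0
    return (sum(window) / len(window)) / normal_daily_mean


def candidate_errors(ratios_by_error, l_ir):
    """Keep the error messages whose increasing ratio exceeds l_ir for the
    majority of the failures in one failure group."""
    return [
        error
        for error, ratios in ratios_by_error.items()
        if sum(ratio > l_ir for ratio in ratios) > len(ratios) / 2
    ]


# Hypothetical history: one message per day normally, three per day in the
# week before a failure on day 27.
counts = [1] * 20 + [3] * 7 + [0] * 3
print(increasing_ratio(counts, failure_day=27, t_ir=7, normal_daily_mean=1.0))  # prints 3.0

# Hypothetical ratios for three failures of one group.
ratios = {"ErrorA": [3.0, 2.5, 0.0], "ErrorB": [0.0, 2.1, 1.0]}
print(candidate_errors(ratios, l_ir=2))  # prints ['ErrorA']
```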
Third, we identify the combinations of error messages used to predict the failure occurrences. The number of error messages varies depending on the number of input tickets. Hence, we use the rate of error messages, defined as the ratio of the number of error messages to the number of input tickets, instead of the absolute number of error messages. TMA and TFD are set to 15 and 60 days, respectively, to calculate the PM success rates. We then select the error message combinations whose best success rate exceeds 0.75. Consequently, 16 pairs of error messages are identified as candidates for predicting the failure occurrences.
Fourth, the optimal PM criteria are determined based on the simulation using the historical data. We conducted the simulation using the data from 145 ATGs. The sensors installed in the ticket transfer unit mainly monitor the behavior of the input tickets. Hence, the rate of error messages changes depending not only on the condition of the mechanism but also on the condition of the input tickets. Furthermore, the average error rate per ATG varies to a certain extent even when the machines are in normal condition. Therefore, the 145 ATGs are classified into five groups depending on the maximum and average values of the rates of the error messages, and the optimum criteria for PM execution are determined for each group. As regards the maintenance execution effects, we assume that the ratio of the PM effect to the BM effect is 1:2.

3.3. Results and discussions
Table 2 shows the simulation result. With the current empirical criteria for PM execution, PM succeeds in only 8 of 49 PM executions. In contrast, PM succeeds in 64 of 66 executions when the optimum criteria based on the proposed method are applied. We next investigate the effectiveness of the criteria in which the error message combinations are considered.
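As a reminder of how a pair of error messages is scored, the Step 3 success rate Ns / (Ns + Nf) from Section 2.4 could be computed as in the following sketch. It is one interpretation only: Ns and Nf are counted here as the numbers of days on which the product of the two moving averages exceeds Hco inside and outside the TFD-day periods before failures. The function names and the toy data are hypothetical, and the toy windows (TMA = 2, TFD = 3) are far shorter than the 15 and 60 days used in the application.

```python
def moving_average(values, window):
    """Trailing moving average; early days use the data available so far."""
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


def success_rate(rate_a, rate_b, failure_days, t_ma, t_fd, h_co):
    """Ns / (Ns + Nf) for one pair of error messages and one threshold."""
    product = [
        a * b
        for a, b in zip(moving_average(rate_a, t_ma), moving_average(rate_b, t_ma))
    ]
    pre_failure = set()
    for failure in failure_days:
        pre_failure.update(range(max(0, failure - t_fd), failure))
    n_s = sum(1 for day, p in enumerate(product) if p > h_co and day in pre_failure)
    n_f = sum(1 for day, p in enumerate(product) if p > h_co and day not in pre_failure)
    return n_s / (n_s + n_f) if n_s + n_f else 0.0


def best_success_rate(rate_a, rate_b, failure_days, t_ma, t_fd, thresholds):
    """Search the candidate thresholds and return (best rate, threshold)."""
    return max(
        (success_rate(rate_a, rate_b, failure_days, t_ma, t_fd, h), h)
        for h in thresholds
    )


# Toy error rates (errors per input ticket) over ten days, failure on day 7.
rate_a = [0.0, 0.0, 0.0, 0.0, 0.5, 0.5, 0.5, 0.0, 0.0, 0.5]
rate_b = [0.0, 0.5, 0.0, 0.0, 0.5, 0.5, 0.5, 0.0, 0.0, 0.5]
print(best_success_rate(rate_a, rate_b, [7], t_ma=2, t_fd=3,
                        thresholds=[0.05, 0.1, 0.3]))  # prints (1.0, 0.1)
```

A pair would then be kept as a candidate when the best rate exceeds the chosen minimum, 0.75 in this application.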
Figure 5 compares the number of PM successes obtained with criteria based on a single error message and with criteria based on both single error messages and pairs of error messages. PM succeeds 16 times if only the criteria with a single error message are used. The number of successful PMs quadruples when the criteria with combinations of error messages are used in addition to those with a single error message. This result shows that using criteria with error message combinations significantly improves the capability of detecting failure symptoms and the PM effectiveness. This finding also suggests that we could obtain better results if we consider criteria with combinations of more than three types of error messages. However, we need to overcome the problem of the simulation time in this case because of the large number of possible combinations.

Table 2. Simulation result
                             Number of PM   Number of successful   Number of BM
                             executions     PM executions          executions
Current empirical criteria   49             8                      829
Optimized criteria           66             64                     773

[Figure 5: bar chart of the number of successful PMs. Criteria based on a single error message: 16. Criteria based on a single and a pair of error messages: 16 + 48 = 64.]
Fig. 5. Comparison of the number of successful PMs

We can improve failure predictability by using criteria with combinations of multiple error messages. However, we could prevent only a small proportion of the ticket transfer unit failures of the ATGs. We need more in-depth analyses of the deterioration of the mechanisms of the unit and of the behaviors of the tickets in the deteriorated mechanisms to improve this situation. At the same time, various techniques developed for big data analysis could be applied to this issue.
The proposed method assumes history data from a period in which PM is not adopted, and the simulation for optimizing the criteria must be performed on such data. However, we may need to revise the criteria after PM is implemented because the operating environment and the mechanisms may change. Therefore, future work includes investigating how to regularly revise the criteria after implementing the PM policy.
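To put the counts of Table 2 on a common scale, the short calculation below derives the PM success rate, the share of failures prevented, and a relative total effect for both sets of criteria. It rests on two assumptions that go beyond the reported figures: each successful PM is taken to stand for one avoided failure, so the number of failures is the sum of successful PMs and BMs, and the total effect is taken as one unit per PM plus two units per BM, following the 1:2 ratio assumed in Section 3.2.

```python
# Counts reported in Table 2.
TABLE_2 = {
    "current empirical criteria": {"pm": 49, "pm_success": 8, "bm": 829},
    "optimized criteria": {"pm": 66, "pm_success": 64, "bm": 773},
}


def summarize(row, pm_effect=1, bm_effect=2):
    """Derive rates and a relative total effect from one row of Table 2."""
    failures = row["pm_success"] + row["bm"]  # assumption: one PM success = one avoided failure
    return {
        "pm_success_rate": row["pm_success"] / row["pm"],
        "prevented_share": row["pm_success"] / failures,
        "total_effect": pm_effect * row["pm"] + bm_effect * row["bm"],
    }


for name, row in TABLE_2.items():
    summary = summarize(row)
    print(
        f"{name}: success rate {summary['pm_success_rate']:.2f}, "
        f"prevented share {summary['prevented_share']:.3f}, "
        f"total effect {summary['total_effect']}"
    )
```

Under these assumptions both rows imply the same 837 failures, the optimized criteria prevent about 7.6% of them, and the relative total effect falls from 1707 to 1612, which is consistent with the remark that only a small proportion of failures is prevented so far.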
4. Conclusion
This paper proposes a method to determine the optimum criteria for executing the PM of mechatronics equipment by integrating a structural and functional analysis with an analysis of the history data. We addressed the issue that the monitoring systems of mechatronics equipment usually use sensors installed for control purposes, which cannot directly detect deterioration and failures. Hence, the error messages generated from the sensor signals do not have a one-to-one correspondence with the deterioration and failures of the mechanisms. To deal with this issue, we propose constituting the criteria from combinations of multiple error messages. The proposed method was applied to the automatic ticket gates installed in stations in Japan to verify its effectiveness.

References
[1] S. Takata, F. Kimura, F.J.A.M. van Houten, E. Westkamper, M. Shpitalni, D. Ceglarek, and J. Lee. Maintenance: Changing role in life cycle management. Annals of the CIRP, 53; 2004.
[2] M. Mori and M. Fujishima. Sustainable service system for machine tools. Procedia CIRP; 2013.
[3] F. Sittner, D. Aschenbrenner, M. Fritscher, A. Kheirkhah, M. Krau, and K. Schilling. Maintenance and telematics for robots (MainTelRob). IFAC Symposium on Telematics Applications; 2013.
[4] A.K.S. Jardine, D. Lin, and D. Banjevic. A review on machinery diagnostics and prognostics implementing condition-based maintenance. Mechanical Systems and Signal Processing, 20; 2006.
[5] S. Yin, S. Ding, and D. Zhou. Diagnosis and prognosis for complicated industrial systems - part 1. IEEE Transactions on Industrial Electronics; 2016.
[6] E. Khoury, E. Deloux, A. Grall, and C. Berenguer. On the use of time-limited information for maintenance decision support: A predictive approach under maintenance constraints. Mathematical Problems in Engineering; 2013.