Group maintenance scheduling for two-component systems with failure interaction


Accepted Manuscript

Li Yang, Yu Zhao, Xiaobing Ma

PII: S0307-904X(19)30065-4
DOI: https://doi.org/10.1016/j.apm.2019.01.036
Reference: APM 12642

To appear in: Applied Mathematical Modelling

Received date: 12 May 2018
Revised date: 9 January 2019
Accepted date: 23 January 2019

Please cite this article as: Li Yang , Yu Zhao , Xiaobing Ma , Group maintenance scheduling for two-component systems with failure interaction, Applied Mathematical Modelling (2019), doi: https://doi.org/10.1016/j.apm.2019.01.036

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Highlights

• A novel failure interaction framework is constructed considering hazard rate increment.
• Environmental uncertainties are incorporated into dependent failure processes.
• A new group maintenance strategy is developed to prevent dependent failures.
• The applicability is validated by a case study on an electrical distribution system.




* Corresponding author. E-mail addresses: [email protected] (L. Yang), [email protected] (X. Ma).

Li Yang a,b, Yu Zhao a, Xiaobing Ma a,*

a School of Reliability and Systems Engineering, Beihang University, Beijing, China
b Department of Mechanical and Industrial Engineering, University of Toronto, Toronto, Ontario, Canada

Abstract

Failure dependence is a common phenomenon in complex, large-scale industrial plants and may cause tremendous economic losses and unacceptable safety risks. This paper investigates an advanced group and opportunistic maintenance policy for a two-component system with failure interaction. The interaction between soft and hard failures is characterized by random hazard rate increments, with full consideration of environmental uncertainties. To fully utilize the economic dependence between components, scheduled maintenance windows are equally spaced for group maintenance of the whole system. Whenever a component is replaced at a maintenance window, the other one is also opportunistically replaced. Furthermore, a failure of component 2 is removed by minimal repair, whereas its replacement is left to the subsequent window. The objective is to jointly optimize the maintenance interval and the number of maintenance windows via the artificial bee colony algorithm. A case study on an electrical distribution system is provided to validate the applicability of the adopted policy.

Keywords: maintenance, multi-component system, failure interaction, scheduled window, cost optimization.


1 Introduction

A majority of industrial devices and equipment undergo multiple failure processes during their lifetime due to complex system structures and harsh environmental stresses [1-2]. These unexpected failures ultimately result in tremendous production losses and hazards to personnel and property safety [4]. For this reason, preventive maintenance is of significant importance due to its capacity to restore system performance and reduce the economic losses caused by failures [5,6]. A well-scheduled maintenance plan can effectively balance the trade-off between failure-induced losses and maintenance-induced costs, and thus enables cost-effective allocation of maintenance resources.

In essence, failure processes of industrial systems can be classified into two categories: (a) hard failure and (b) soft failure [7,8]. The former is generally fatal and self-announcing, whereas the latter only reduces the performance level or production output of the system [9]. Failure dependence between hard and soft failures is very common in industrial systems and has been extensively reviewed in the literature [10,11]. Corresponding works are mainly investigated within two frameworks. The first is the degradation-threshold-shock (DTS) framework, which concentrates mainly on the failure interaction between internal degradation (soft failure) and external shock (hard failure) (e.g., [12,13]). The other is the lifetime dependence framework [14-16], which defines failure interaction mainly from the perspective of the lifetime distribution and/or the system hazard rate. In the past two decades, it has been successfully adopted in various types of multi-component industrial systems [17,18].

Although reliability models within the lifetime dependence framework are rich, we notice that most of them concentrate on only one kind of dependence, either "hard to soft" or "soft to hard" [19]. This may be restrictive or insufficient for characterizing the failure behavior of many complex industrial systems. A typical example is the failure behavior of power systems. In practice, maintainers of electrical distribution systems (EDS) and wind turbines (WT) can usually observe that: (1) hard failures act as shocks and significantly accelerate the soft failure processes (e.g., internal degradation) [20,21]; (2) when entering the soft failure state, the system is much more susceptible to fatal malfunction [22,23]. This motivates us to construct a novel failure interaction framework, which incorporates both types of dependence via the variations of hazard rates. Accordingly, each hard failure adds a certain increment to the hazard rate of a soft failure, whereas the hazard rate of a hard failure increases significantly upon the arrival of a soft failure. This framework is one step closer to reality and has application potential in the maintenance and safety engineering of complex industrial systems, particularly energy and manufacturing systems.

Within this framework, we further construct a new reliability model from the perspective of random hazard rate increments. This is inspired by the definition of the shot-noise process studied in Lemoine and Wenocur [24] and its extended application in Cha et al. [25]. According to our model, the magnitudes of the increments due to both soft and hard failures are random variables. This extends the current literature, which assumes that the hazard rate or probability increment is a constant or a deterministic function (see, e.g., [26]). Our extended model makes two contributions to reliability engineering: (a) it fully considers the impacts of environmental uncertainties on system failures; (b) it allows a more precise evaluation and analysis of system reliability, and therefore a more elaborate maintenance plan.

To deal with such failures, we provide a comprehensive group maintenance strategy, with full consideration of the characteristics of failure interactions. The key step of the strategy is to provide maintenance windows at a constant interval. During these windows, scheduled maintenance actions can be executed to detect the soft failure state and/or renew the system via replacement. A notable feature of this strategy is that, whenever a component is replaced at a window, the other one is always opportunistically replaced. Such opportunistic maintenance can effectively reduce the non-negligible failure risks triggered by failure interactions. Compared with existing maintenance strategies for two-component systems, our strategy makes fuller use of the economic dependence by sharing both the downtime and the set-up cost between components [27]. Furthermore, unlike the simulation methods adopted for most group maintenance strategies, our strategy enables the analytical calculation and analysis of maintenance costs. This facilitates both the practical execution and the theoretical exploration of group maintenance [28,29].

Another advantage of our maintenance strategy over other group maintenance strategies lies in its flexibility, which allows the "postponement" of failure removals. According to our strategy, the lowest level of maintenance (minimal repair) is executed upon the arrival of the first hard failure, and replacement of this component is "postponed" to the subsequent maintenance window. This is motivated by the fact that a well-scheduled preventive replacement incurs much less cost than an unscheduled corrective replacement because maintenance resources can be prepared more thoroughly. It is worth noting that our strategy is a relatively general one, which can be transformed into other classic/advanced strategies under specific parameter settings. Additionally, when the repair cost is not too high, our maintenance strategy outperforms other group maintenance strategies in terms of reducing operational costs. This further enhances its application value in maintenance engineering.

Our reliability models and maintenance strategies can provide new perspectives on advanced operation and management, particularly for managers of manufacturing plants, energy devices and critical infrastructures. Common features of these systems include: (a) they are usually large-scale systems with complicated structures and multiple failure modes [30]; (b) they are usually exposed to harsh and/or uncertain environmental stresses, resulting in relatively high failure risks [31]; (c) maintenance costs of these systems are key components of the total operation costs. On the one hand, our reliability model is well suited to characterizing the inner connections between different failure behaviors, providing managers with a better understanding of failure mechanisms and appropriate diagnosis times. On the other hand, the proposed maintenance strategy enables managers to schedule more flexible and elaborate group maintenance plans to reduce operation costs.

The remainder of the paper is organized as follows. Section 2 introduces the failure behavior and maintenance strategy of the two-component system. Section 3 formulates the reliability and maintenance cost models, and accordingly carries out the optimization analysis. Section 4 introduces some classic/advanced strategies for comparison. A case study on an electrical distribution system is presented in Section 5. Conclusions and future research directions are provided in Section 6.

2 Problem statement

We consider an industrial system composed of two critical components connected in series. Failure of component 1 is soft and hidden, and only reduces its performance level. Failure of component 2 is hard and self-announcing, and immediately results in the breakdown of this component. Such systems are common in actual industrial applications. For instance, an electric power distribution system consists of a capacitor and a transformer [17]. The capacitor is subject to gradual wear, leading to a reduction of capacitance (soft failure), whereas the transformer stops immediately when suffering excessive damage such as the appearance of high-voltage harmonics. Another example is the peristaltic pump used to pump fluids for patients. Generally, failure of the electronic part is fatal and immeasurable, and the battery part undergoes a soft failure when the voltage degradation reaches a certain level.

2.1 Failure characteristics

We assume that failures of both components arrive according to nonhomogeneous Poisson processes (NHPPs), which are widely used in maintenance engineering [4]. Additionally, both components are subject to monotonically increasing hazard rates. The failures of the two components are dependent, as described below.

• Dependence 1. When component 1 enters the soft failure state, component 2 becomes more susceptible to hard failures. This is reflected by an abrupt increase of the hazard rate upon the arrival of the soft failure. Denote the random time to a soft failure by $X_s$. Then the random hazard rate of hard failure, $\tilde{\lambda}_h(t)$, is expressed as

$$\tilde{\lambda}_h(t) = \begin{cases} \lambda_{h1}(t), & t < X_s, \\ \lambda_{h2}(t), & t \ge X_s, \end{cases} \tag{1}$$

where $\lambda_{h2}(t) > \lambda_{h1}(t)$.

• Dependence 2. On the other hand, a hard failure of component 2 causes damage to component 1 and immediately increases its hazard rate. This is motivated by the definition of the shot-noise process (see, e.g., [24]). Denote by $W_i$, $i = 1, 2, \ldots$, the hazard rate increment due to the $i$th hard failure. The $W_i$ are independent and identically distributed random variables with density function $g(w_i)$ and expectation $w$. Then the random hazard rate of soft failure, $\tilde{\lambda}_s(t)$, is expressed as

$$\tilde{\lambda}_s(t) = \lambda_s(t) + \sum_{i=1}^{N_h(t)} W_i, \tag{2}$$

where $N_h(t)$ is the number of hard failures occurring in the interval $(0, t)$.

From Eq. (2), the hazard rate of soft failure is determined by two main factors: (a) the operational age of component 1; (b) the number of hard failures that have occurred. A sample path of such dependence is illustrated in Fig. 1. The baseline hazard rate of component 1, $\lambda_s(t) = t^2/10$, is represented by the dotted line, and the actual hazard rate $\tilde{\lambda}_s(t)$ is represented by the solid line. It can be seen that when the $i$th $(i = 1, 2, \ldots)$ hard failure occurs, $\tilde{\lambda}_s(t)$ increases by a random amount $W_i$, where $W_i \sim E(0.36)$.

Fig. 1. A sample path of hazard rate increment for soft failure.
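A sample path of the kind shown in Fig. 1 can be generated with a short simulation. The sketch below is illustrative only: the hard-failure intensity `hard_failure_intensity` is a hypothetical choice (the text does not state the one used for the figure), the value 0.36 is assumed to be the rate parameter of the exponential distribution, and all function names are invented for this example.

```python
import random

def baseline_soft_hazard(t):
    """Baseline hazard rate of component 1 used for Fig. 1: t^2 / 10."""
    return t * t / 10.0

def hard_failure_intensity(t):
    """HYPOTHETICAL increasing NHPP intensity of hard failures (not from the paper)."""
    return 0.5 * t

def simulate_hard_failures(horizon, rng):
    """Draw NHPP arrival times up to horizon by thinning a homogeneous process."""
    bound = hard_failure_intensity(horizon)  # valid bound because the intensity is increasing
    arrivals, t = [], 0.0
    while True:
        t += rng.expovariate(bound)
        if t > horizon:
            return arrivals
        if rng.random() * bound <= hard_failure_intensity(t):
            arrivals.append(t)

def sample_soft_hazard_path(horizon, increment_rate, rng):
    """Return arrival times, increments W_i and the realised hazard of Eq. (2)."""
    arrivals = simulate_hard_failures(horizon, rng)
    # ASSUMPTION: W_i is exponential and 0.36 is read as its rate parameter
    increments = [rng.expovariate(increment_rate) for _ in arrivals]

    def realised_hazard(t):
        jump_total = sum(w for s, w in zip(arrivals, increments) if s <= t)
        return baseline_soft_hazard(t) + jump_total

    return arrivals, increments, realised_hazard

rng = random.Random(2019)
arrivals, increments, realised_hazard = sample_soft_hazard_path(5.0, 0.36, rng)
for t in (1.0, 2.5, 5.0):
    print(f"t={t:.1f}  baseline={baseline_soft_hazard(t):.3f}  realised={realised_hazard(t):.3f}")
```

The realised hazard is a step-shifted copy of the baseline curve: it coincides with $t^2/10$ until the first hard failure and jumps by $W_i$ at each subsequent arrival.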

2.2 Maintenance strategy

In this section, a novel group maintenance strategy is proposed to capture the above-mentioned failure characteristics and to ensure a more cost-effective allocation and preparation of maintenance resources. Three categories of maintenance action, i.e., inspection, repair and replacement, are comprehensively scheduled. The detailed strategy is described below.

• Scheduled maintenance window. A scheduled maintenance window (also known as a scheduled down, see, e.g., [27,29]) is provided for the two-component system every $z$ time units according to calendar time. Upon the arrival of the maintenance window, the system is shut down for the execution of preventive maintenance actions to restore the state of both components. The advantage of such a window is two-fold: (a) sharing the set-up cost of the whole system; (b) utilizing the economic dependence by sharing the scheduled downtime.

• Periodic inspection. Offline inspections are required to detect the soft failure state of component 1. Such inspections require the shutdown of the system. Therefore, it is reasonable to execute inspections during the scheduled maintenance windows with a constant interval $z$. We assume that inspections are perfect in that they completely reveal the soft failure state. The cost of an inspection during the window is denoted as $c_{in}$.

• Replacement and repair. If both components remain normal at the $m$th maintenance window, then the whole system is preventively replaced. The preventive replacement costs of component 1 and component 2 during this window are $c_{p1}$ and $c_{p2}$, respectively. Otherwise, if a soft failure is triggered in $(z_k, z_{k+1})$, $k = 0, 1, \ldots, m-1$, it can only be revealed at the $(k+1)$th inspection. Here $z_k = kz$ is the time of the $k$th maintenance window. In that case, component 1 is correctively replaced at this window with a cost $c_{c1}$, whereas component 2 is also opportunistically replaced with a cost $c_{p2}$. Furthermore, if a hard failure of component 2 is triggered in $(z_k, z_{k+1})$, it is minimally repaired immediately with a cost $c_m$. The hazard rate of component 2 remains the same after the repair. The duration of a minimal repair is negligible compared with the interval $z$. Afterwards, component 2 is preventively replaced at the $(k+1)$th maintenance window, during which component 1 is also opportunistically replaced.

Remark. The main principle of our maintenance strategy is to renew the system at the scheduled maintenance window. This is motivated by two main facts: (a) a scheduled replacement during the window is much cheaper than an unscheduled replacement because maintenance resources can be prepared more thoroughly; (b) the economic dependence between the two components can be utilized. Therefore, when the system suffers an unscheduled down due to a hard failure of component 2, we tend to provide the minimum level of maintenance to remove the failure, and then wait for the preventive replacement at the subsequent scheduled window [27]. At the same time, component 1 can also be opportunistically replaced to utilize the economic dependence. This opportunistic replacement can effectively reduce the failure risk of component 1, because the hazard rate of component 1 may increase significantly during this waiting time owing to the occurrence of hard failures. For the same reason, it is natural to perform a scheduled replacement if a soft failure of component 1 is detected by periodic inspections. Recall that the hazard rate of component 2 suffers an abrupt increase once component 1 enters the soft failure state. Hence, it is also reasonable to opportunistically replace component 2 at the same time, so as to utilize the economic dependence.

2.3 Nomenclature and notations

EDS                      Electrical Distribution System
NHPP                     Nonhomogeneous Poisson Process
$\lambda_s(t)$           Hazard rate of component 1 in a baseline environment
$\tilde{\lambda}_s(t)$   Hazard rate of component 1 considering hard failure influence
$N_h(t)$                 Number of hard failures in $(0, t)$ for component 2
$\lambda_{h1}(t)$        Hazard rate of component 2 when component 1 is normal
$\lambda_{h2}(t)$        Hazard rate of component 2 when component 1 enters the soft failure state
$W_i$                    Hazard rate increment of soft failure due to the $i$th hard failure
$T_s$                    Random time to soft failure of component 1
$R_s(t)$                 Reliability function of component 1
$f_s(t)$                 Density function of component 1
$T_h$                    Random time to hard failure of component 2
$z$                      Interval between two successive maintenance windows
$m$                      Number of scheduled maintenance windows
$c_m$                    Cost of a minimal repair for component 2
$c_{c1}$                 Cost of a corrective replacement for component 1
$c_{p1}$                 Cost of a preventive replacement for component 1
$c_{p2}$                 Cost of a preventive replacement for component 2
$c_{in}$                 Cost of a periodic inspection
$c_d$                    Economic loss per unit time due to soft failure of component 1

3 Model analysis

In this section, we construct and analyze the maintenance model based on the reliability characteristics of both components. Accordingly, the expected cost per unit time of the system is calculated and optimized.

3.1 Reliability analysis

We start with the reliability of component 2, which is related to the condition of component 1. Denote by $T_h$ and $T_s$ the random time to a hard failure of component 2 and to a soft failure of component 1, respectively. Then we have the following proposition.

Proposition 1. Considering the impact of soft failures, the reliability of component 2 at time $t$ is given as

$$R_h(t) = \bar{F}_s(t) \exp\left(-\int_0^t \lambda_{h1}(u)\,du\right) + \int_0^t f_s(t_s) \exp\left(-\int_0^{t_s} \lambda_{h1}(u_1)\,du_1 - \int_{t_s}^{t} \lambda_{h2}(u_2)\,du_2\right) dt_s. \tag{3}$$

Proof. Consider two possible situations. The first situation is that neither a soft failure nor a hard failure occurs before $t$. Its probability is given as

$$R_h^1(t) = \Pr(T_s > t, T_h > t) = \bar{F}_s(t) \exp\left(-\int_0^t \lambda_{h1}(u)\,du\right), \tag{4}$$

where $\bar{F}_s(t)$ is the survival function given as $\bar{F}_s(t) = \exp\left(-\int_0^t \lambda_s(u)\,du\right)$.

In the second situation, a soft failure occurs at $t_s \in (0, t)$, and the hazard rate of component 2 increases abruptly from $\lambda_{h1}(t_s)$ to $\lambda_{h2}(t_s)$. The corresponding survival function is

$$R_h^2(t) = \int_0^t \Pr(T_s = t_s, T_h > t)\,dt_s = \int_0^t f_s(t_s) \exp\left(-\int_0^{t_s} \lambda_{h1}(u_1)\,du_1 - \int_{t_s}^{t} \lambda_{h2}(u_2)\,du_2\right) dt_s, \tag{5}$$

where $f_s(t) = -d\bar{F}_s(t)/dt$. Summing up the above two scenarios, we have the reliability of component 2 as

$$R_h(t) = R_h^1(t) + R_h^2(t) = \bar{F}_s(t) \exp\left(-\int_0^t \lambda_{h1}(u)\,du\right) + \int_0^t f_s(t_s) \exp\left(-\int_0^{t_s} \lambda_{h1}(u_1)\,du_1 - \int_{t_s}^{t} \lambda_{h2}(u_2)\,du_2\right) dt_s, \tag{6}$$

which concludes the proof.
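Proposition 1 can be evaluated numerically once the hazard rates are specified. The following is a minimal sketch, assuming hypothetical linear hazard rates $\lambda(t) = \text{slope} \cdot t$ for all three processes (these are not the forms used in the paper's case study) and a hand-written Simpson rule. The names `simpson`, `cumulative` and `reliability_component2` are invented for this illustration.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def cumulative(slope, t):
    """Cumulative hazard for a HYPOTHETICAL linear hazard rate slope * t."""
    return 0.5 * slope * t * t

def reliability_component2(t, slope_s, slope_h1, slope_h2):
    """Evaluate Eq. (3) by numerical integration over the soft-failure time t_s."""
    # First term: no soft failure and no hard failure before t
    no_soft_failure = math.exp(-cumulative(slope_s, t) - cumulative(slope_h1, t))

    def integrand(ts):
        soft_density = slope_s * ts * math.exp(-cumulative(slope_s, ts))  # f_s(t_s)
        # Hard-failure exposure: rate h1 on (0, t_s), rate h2 on (t_s, t)
        exposure = cumulative(slope_h1, ts) + cumulative(slope_h2, t) - cumulative(slope_h2, ts)
        return soft_density * math.exp(-exposure)

    return no_soft_failure + simpson(integrand, 0.0, t)

print(reliability_component2(4.0, slope_s=0.1, slope_h1=0.04, slope_h2=0.12))
```

A useful sanity check is the case $\lambda_{h2} = \lambda_{h1}$: the soft failure then has no effect on component 2, and Eq. (3) collapses to $\exp(-\int_0^t \lambda_{h1}(u)\,du)$.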

We proceed with the reliability function of component 1, which is related to the number of minimal repairs of component 2. The following proposition provides its reliability.

Proposition 2. Given that component 2 is minimally repaired upon failure, the reliability of component 1, $R_s(t)$, is calculated as

$$R_s(t) = \exp\left(-\int_0^t \left[\lambda_s(u) + \lambda_{h1}(u) - \lambda_{h1}(u)\hat{G}(t-u)\right] du\right), \tag{7}$$

where $\hat{G}(u)$ denotes the Laplace transform of $g(t)$, which is defined as

$$\hat{G}(u) = \int_0^{\infty} \exp(-ut)\, g(t)\,dt. \tag{8}$$
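Eq. (7) is straightforward to evaluate once $g$ is chosen. The sketch below assumes, purely for illustration, exponentially distributed increments with rate `increment_rate`, for which the Laplace transform is $\hat{G}(u) = \theta/(\theta+u)$, together with hypothetical linear hazard rates. None of these parameter choices or function names come from the paper.

```python
import math

def simpson(f, a, b, n=400):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def reliability_component1(t, slope_s, slope_h1, increment_rate):
    """Eq. (7) under HYPOTHETICAL inputs:
    lambda_s(u) = slope_s * u, lambda_h1(u) = slope_h1 * u, W ~ Exp(rate=increment_rate)."""

    def laplace_g(u):
        # Laplace transform of an exponential density with the given rate
        return increment_rate / (increment_rate + u)

    def integrand(u):
        # lambda_s(u) + lambda_h1(u) - lambda_h1(u) * G_hat(t - u)
        return slope_s * u + slope_h1 * u * (1.0 - laplace_g(t - u))

    return math.exp(-simpson(integrand, 0.0, t))

baseline = math.exp(-0.5 * 0.1 * 4.0 ** 2)          # reliability without any hard failures
with_shocks = reliability_component1(4.0, 0.1, 0.2, 0.36)
print(baseline, with_shocks)
```

Two limits help to check an implementation: with no hard failures ($\lambda_{h1} \equiv 0$) the result is the baseline survival function, and as the increments shrink to zero (very large rate) the result approaches the same baseline.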

Proof. Denote by $N_{h1}(t)$ the number of hard failures occurring in $(0, t)$ when component 1 is normal. Using the law of total probability, the survival function of component 1, $R_s(t)$, can be calculated via the following equation:

$$R_s(t) = \sum_{n=0}^{\infty} P(T_s > t, N_{h1}(t) = n). \tag{9}$$

Denote by $S_i$ $(i = 1, 2, \ldots, n)$ the arrival time of the $i$th hard failure, and by $w_i$ the realization of the hazard rate increment of soft failure due to the $i$th hard failure. Then the following equation holds:

$$\begin{aligned} P(T_s > t, N_{h1}(t) = n) = {} & \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \int_0^{\infty} \cdots \int_0^{\infty} P(T_s > t \mid N_{h1}(t) = n, W_1 = w_1, \ldots, W_n = w_n, S_1 = s_1, \ldots, S_n = s_n) \\ & \times g_{W_1,\ldots,W_n}(w_1, \ldots, w_n)\,dw_1 \cdots dw_n \; f_{S_1,\ldots,S_n,N_{h1}(t)}(s_1, \ldots, s_n, N_{h1}(t) = n)\,ds_1 \cdots ds_n. \end{aligned} \tag{10}$$

The first term of the above integrand denotes the probability that no soft failure occurs before $t$, conditional on $n$ hard failures arriving in $(0, t)$. Using Eq. (2), we can obtain

$$P(T_s > t \mid N_{h1}(t) = n, W_1 = w_1, \ldots, W_n = w_n, S_1 = s_1, \ldots, S_n = s_n) = \exp\left(-\int_0^t \lambda_s(u)\,du - \sum_{i=1}^{n} w_i (t - s_i)\right). \tag{11}$$

The second term in Eq. (10) is the joint density function of the $n$ hazard rate increments of soft failure. Note that $W_i$, $i = 1, 2, \ldots, n$, are independent and identically distributed variables. Thus, the joint probability density function is the product of the individual density functions, i.e.,

$$g_{W_1,\ldots,W_n}(w_1, \ldots, w_n) = \prod_{i=1}^{n} g(w_i). \tag{12}$$

The third term denotes the joint density function of both the number and the occurrence times of hard failures, i.e.,

$$f_{S_1,\ldots,S_n,N_{h1}(t)}(s_1, \ldots, s_n, N_{h1}(t) = n) = \prod_{i=1}^{n} \lambda_{h1}(s_i) \exp\left(-\int_0^t \lambda_{h1}(u)\,du\right). \tag{13}$$

Based on the results obtained in Eqs. (11)-(13), the joint distribution in Eq. (10) can be given as

$$P(T_s > t, N_{h1}(t) = n) = \frac{\exp\left(-\int_0^t \left[\lambda_s(u) + \lambda_{h1}(u)\right] du\right) \left[\int_0^t \lambda_{h1}(u)\hat{G}(t-u)\,du\right]^n}{n!}, \tag{14}$$

where $n! = 1 \times 2 \times \cdots \times n$.

The detailed derivation of Eq. (14) is provided in Appendix A.1. Summing over all possible numbers of hard failures $n = 0, 1, 2, \ldots$, the survival function of component 1, $R_s(t)$, is given as

$$R_s(t) = \sum_{n=0}^{\infty} P(T_s > t, N_{h1}(t) = n) = \exp\left(-\int_0^t \left[\lambda_s(u) + \lambda_{h1}(u) - \lambda_{h1}(u)\hat{G}(t-u)\right] du\right), \tag{15}$$

which concludes the proof.

3.2 Cost model formulation

Based on the reliability characteristics of both components, we are able to formulate the maintenance cost model of the system. Recall that whenever one component is replaced, the other one is always opportunistically replaced. Hence, the system is always renewed at scheduled maintenance windows. Therefore, we can directly use the expected cost rate of the system to evaluate the performance of the maintenance policy.

In order to obtain the expected cost rate, the renewal scenarios of the system require investigation. Based on the replacement types at the component level, there are two categories of renewal scenarios, as follows.

• Scenario A. Both components are simultaneously preventively replaced at a scheduled window.
• Scenario B. Component 1 is correctively replaced at a scheduled window, whereas component 2 is preventively replaced at the same window.

In the following, we separately calculate the expected renewal cycle cost and length due to each renewal scenario, based on the reliability functions obtained in Section 3.1.

• Expected renewal cycle cost and length due to Scenario A

We start from Scenario A. It can be further divided into two sub-scenarios, as shown in Fig. 2. The first sub-scenario is that no failure is triggered throughout the $m$ maintenance windows, and the system is renewed at the $m$th maintenance window. Its probability is given as

$$P_{p1}(z, m) = \Pr(T_s > z_m, T_h > z_m) = \Pr(T_s > z_m)\Pr(T_h > z_m) = \bar{F}_s(z_m)\bar{F}_{h1}(z_m), \tag{16}$$

where $\bar{F}_s(z_m)$ and $\bar{F}_{h1}(z_m)$ are the survival functions calculated as follows:

$$\bar{F}_s(z_m) = \exp\left(-\int_0^{z_m} \lambda_s(u)\,du\right), \qquad \bar{F}_{h1}(z_m) = \exp\left(-\int_0^{z_m} \lambda_{h1}(u)\,du\right). \tag{17}$$

The second sub-scenario is that the first hard failure is triggered at $t_h \in (z_k, z_{k+1})$, $k = 0, 1, \ldots, m-1$, and the system is preventively renewed at $z_{k+1}$. The following proposition is required for calculating the probability of this sub-scenario.

[Figure 2: panels (a) and (b) show timelines of component 1 and component 2 against time, marking the windows $z_k$, $z_{k+1}$, $z_{m-1}$, $z_m$ and the hard failure time $t_h$. Legend: scheduled maintenance window; periodic inspection; soft failure (component 1); hard failure (component 2).]

Fig. 2. Two sub-scenarios for preventive replacement of the whole system.

Proposition 3. Given that the first hard failure occurs at $t_h$, the probability of component 1 being normal at $z_{k+1}$ is

$$\Pr(T_s > z_{k+1} \mid T_h = t_h) = \bar{F}_s(z_{k+1})\, \hat{G}(z_{k+1} - t_h) \exp\left(-\int_{t_h}^{z_{k+1}} \left[\lambda_{h1}(u) - \lambda_{h1}(u)\hat{G}(z_{k+1} - u)\right] du\right). \tag{18}$$

The detailed proof of this proposition is left to Appendix A.2. Based on Proposition 3, the probability of this sub-scenario can be calculated as

$$\begin{aligned} P_{p2}(z, k) &= \Pr\left(T_s > z_{k+1}, T_h \in (z_k, z_{k+1})\right) = \int_{z_k}^{z_{k+1}} \Pr(T_s > z_{k+1} \mid T_h = t_h)\, f_{h1}(t_h)\,dt_h \\ &= \bar{F}_s(z_{k+1}) \int_{z_k}^{z_{k+1}} \lambda_{h1}(t_h) \exp\left(-\int_0^{t_h} \lambda_{h1}(u)\,du\right) \hat{G}(z_{k+1} - t_h) \exp\left(-\int_{t_h}^{z_{k+1}} \left[\lambda_{h1}(u) - \lambda_{h1}(u)\hat{G}(z_{k+1} - u)\right] du\right) dt_h \\ &= \bar{F}_s(z_{k+1})\bar{F}_{h1}(z_{k+1}) \int_{z_k}^{z_{k+1}} \lambda_{h1}(t_h)\, \hat{G}(z_{k+1} - t_h) \exp\left(\int_{t_h}^{z_{k+1}} \lambda_{h1}(u)\hat{G}(z_{k+1} - u)\,du\right) dt_h. \end{aligned} \tag{19}$$

Accordingly, the expected number of minimal repairs triggered under this sub-scenario, $E[N^{p2}_{mini}]$, is

$$E[N^{p2}_{mini}] = \bar{F}_s(z_{k+1})\bar{F}_{h1}(z_{k+1}) \int_{z_k}^{z_{k+1}} \Lambda_{h1}(z_{k+1} - t_h)\, \lambda_{h1}(t_h)\, \hat{G}(z_{k+1} - t_h) \exp\left(\int_{t_h}^{z_{k+1}} \lambda_{h1}(u)\hat{G}(z_{k+1} - u)\,du\right) dt_h, \tag{20}$$

where $\Lambda_{h1}(t_h)$ is the cumulative intensity given as $\Lambda_{h1}(t_h) = \int_0^{t_h} \lambda_{h1}(u)\,du$.

Based on Eqs. (16)-(19), we can obtain the expected renewal cycle length due to Scenario A, $E[L_a(m, z)]$, as

$$E[L_a(m, z)] = z_m P_{p1}(z, m) + \sum_{k=0}^{m-1} z_{k+1} P_{p2}(z, k). \tag{21}$$
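The Scenario A quantities in Eqs. (16), (19) and (21) can be computed by nested numerical integration. The sketch below is a non-authoritative illustration: the `Model` class, its linear hazard rates and the exponential increment distribution are hypothetical choices made for this example, and Simpson's rule stands in for whatever quadrature an actual implementation would use.

```python
import math

def simpson(f, a, b, n=60):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

class Model:
    """HYPOTHETICAL inputs: lambda_s(t) = a_s*t, lambda_h1(t) = a_h*t, W ~ Exp(rate=theta)."""
    def __init__(self, a_s, a_h, theta):
        self.a_s, self.a_h, self.theta = a_s, a_h, theta
    def lam_h1(self, t):
        return self.a_h * t
    def surv_s(self, t):       # baseline survival function of component 1
        return math.exp(-0.5 * self.a_s * t * t)
    def surv_h1(self, t):      # survival function of component 2 while component 1 is normal
        return math.exp(-0.5 * self.a_h * t * t)
    def g_hat(self, u):        # Laplace transform of the increment density
        return self.theta / (self.theta + u)

def p_p1(model, z, m):
    """Eq. (16): no failure up to the m-th window."""
    return model.surv_s(m * z) * model.surv_h1(m * z)

def p_p2(model, z, k):
    """Last line of Eq. (19): first hard failure in (z_k, z_{k+1}), component 1 still normal."""
    z_k, z_next = k * z, (k + 1) * z
    def weighted_rate(u):
        return model.lam_h1(u) * model.g_hat(z_next - u)
    def outer(th):
        return weighted_rate(th) * math.exp(simpson(weighted_rate, th, z_next))
    return model.surv_s(z_next) * model.surv_h1(z_next) * simpson(outer, z_k, z_next)

def expected_length_a(model, m, z):
    """Eq. (21): expected renewal cycle length contributed by Scenario A."""
    return m * z * p_p1(model, z, m) + sum((k + 1) * z * p_p2(model, z, k) for k in range(m))

base = Model(a_s=0.1, a_h=0.2, theta=0.36)
print(p_p1(base, 1.0, 3), [p_p2(base, 1.0, k) for k in range(3)], expected_length_a(base, 3, 1.0))
```

When the increments are negligible (very large `theta`), the two components become independent and $P_{p2}(z,k)$ should approach $\bar{F}_s(z_{k+1})[\bar{F}_{h1}(z_k) - \bar{F}_{h1}(z_{k+1})]$, which gives a convenient check on the nested integral.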

The expected renewal cycle cost due to Scenario A, $E[C_a(m, z)]$, can be derived in a similar way. Recall that the preventive replacement cost of the system is $c_{p1} + c_{p2}$. Additionally, when component 1 is preventively replaced at $z_m$ or opportunistically replaced owing to a hard failure of component 2, inspection is no longer required. Consequently, we have

$$E[C_a(m, z)] = (c_{p1} + c_{p2}) \left[P_{p1}(z, m) + \sum_{k=0}^{m-1} P_{p2}(z, k)\right] + (m-1)\, c_{in} P_{p1}(z, m) + \sum_{k=0}^{m-1} \left[k\, c_{in} P_{p2}(z, k) + c_m E[N^{p2}_{mini}]\right], \tag{22}$$

where $k$ is the number of inspections when the system ends at $z_{k+1}$.

• Expected cycle cost/length due to Scenario B

This scenario consists of three sub-scenarios, as depicted in Fig. 3. One possible sub-scenario is that the first soft failure of component 1 occurs at $t_s \in (z_k, z_{k+1})$, $k = 0, 1, \ldots, m-1$, and component 2 remains normal until $z_{k+1}$. Recall that the hazard rate of hard failure increases abruptly from $\lambda_{h1}(t)$ to $\lambda_{h2}(t)$ upon the arrival of a soft failure. Hence, the probability of this sub-scenario is calculated as

$$\begin{aligned} P_{c1}(z, k) &= \Pr(z_k < T_s < z_{k+1}, T_h > z_{k+1}) \\ &= \int_{z_k}^{z_{k+1}} f_s(t_s) \exp\left(-\int_0^{t_s} \lambda_{h1}(u)\,du\right) \exp\left(-\int_{t_s}^{z_{k+1}} \lambda_{h2}(u)\,du\right) dt_s = \int_{z_k}^{z_{k+1}} f_s(t_s)\, \bar{F}_{h1}(t_s)\, \frac{\bar{F}_{h2}(z_{k+1})}{\bar{F}_{h2}(t_s)}\,dt_s, \end{aligned} \tag{23}$$

where $f_s(t_s) = \lambda_s(t_s) \exp\left(-\int_0^{t_s} \lambda_s(u)\,du\right)$.

Accordingly, the expected downtime of soft failure due to this sub-scenario, $\tau_{c1}(z, k)$, is

$$\tau_{c1}(z, k) = \int_{z_k}^{z_{k+1}} (z_{k+1} - t_s)\, f_s(t_s)\, \bar{F}_{h1}(t_s)\, \frac{\bar{F}_{h2}(z_{k+1})}{\bar{F}_{h2}(t_s)}\,dt_s. \tag{24}$$
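Eqs. (23) and (24) share the same integrand apart from the factor $(z_{k+1} - t_s)$, so both can be obtained from a single weight function. The sketch below assumes hypothetical linear hazard rates for all three processes. The function names and parameter values are illustrative and are not taken from the paper.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def surv(slope, t):
    """Survival function for a HYPOTHETICAL linear hazard rate slope * t."""
    return math.exp(-0.5 * slope * t * t)

def scenario_b1(z, k, slope_s, slope_h1, slope_h2):
    """Return (P_c1, tau_c1) of Eqs. (23)-(24) for the interval (z_k, z_{k+1})."""
    z_k, z_next = k * z, (k + 1) * z

    def weight(ts):
        soft_density = slope_s * ts * surv(slope_s, ts)  # f_s(t_s)
        # Component 2 survives at rate h1 up to t_s and at rate h2 from t_s to z_{k+1}
        return soft_density * surv(slope_h1, ts) * surv(slope_h2, z_next) / surv(slope_h2, ts)

    probability = simpson(weight, z_k, z_next)
    downtime = simpson(lambda ts: (z_next - ts) * weight(ts), z_k, z_next)
    return probability, downtime

prob, downtime = scenario_b1(z=1.0, k=1, slope_s=0.1, slope_h1=0.04, slope_h2=0.12)
print(prob, downtime)
```

If hard failures are switched off (both hard-failure slopes set to zero), $P_{c1}(z,k)$ reduces to $\bar{F}_s(z_k) - \bar{F}_s(z_{k+1})$, and the expected downtime can never exceed $z \cdot P_{c1}(z,k)$.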

We then consider the sub-scenario where both the first soft failure and the first hard failure occur in $(z_k, z_{k+1})$, and the soft failure occurs first. The probability of this sub-scenario is given as

$$\begin{aligned} P_{c2}(z, k) &= \Pr(z_k < T_s < T_h < z_{k+1}) \\ &= \int_{z_k}^{z_{k+1}} \int_{t_s}^{z_{k+1}} f_s(t_s)\, \lambda_{h2}(t_h) \exp\left(-\int_0^{t_s} \lambda_{h1}(u)\,du - \int_{t_s}^{t_h} \lambda_{h2}(u)\,du\right) dt_h\,dt_s \\ &= \int_{z_k}^{z_{k+1}} \int_{t_s}^{z_{k+1}} f_s(t_s)\, \lambda_{h2}(t_h)\, \frac{\bar{F}_{h1}(t_s)\bar{F}_{h2}(t_h)}{\bar{F}_{h2}(t_s)}\,dt_h\,dt_s = \int_{z_k}^{z_{k+1}} \int_{t_s}^{z_{k+1}} f_s(t_s)\, f_{h2}(t_h)\, \frac{\bar{F}_{h1}(t_s)}{\bar{F}_{h2}(t_s)}\,dt_h\,dt_s \\ &= \int_{z_k}^{z_{k+1}} f_s(t_s) \left[\bar{F}_{h2}(t_s) - \bar{F}_{h2}(z_{k+1})\right] \frac{\bar{F}_{h1}(t_s)}{\bar{F}_{h2}(t_s)}\,dt_s = \int_{z_k}^{z_{k+1}} f_s(t_s)\, \bar{F}_{h1}(t_s) \left[1 - \frac{\bar{F}_{h2}(z_{k+1})}{\bar{F}_{h2}(t_s)}\right] dt_s. \end{aligned} \tag{25}$$

Accordingly, the expected downtime $\tau_{c2}(z, k)$ and the expected number of minimal repairs $E[N^{c2}_{mini}]$ due to this sub-scenario are respectively given as

$$\begin{aligned} \tau_{c2}(z, k) &= \int_{z_k}^{z_{k+1}} (z_{k+1} - t_s)\, f_s(t_s)\, \bar{F}_{h1}(t_s) \left[1 - \frac{\bar{F}_{h2}(z_{k+1})}{\bar{F}_{h2}(t_s)}\right] dt_s, \\ E[N^{c2}_{mini}] &= \int_{z_k}^{z_{k+1}} \int_{t_s}^{z_{k+1}} \Lambda_{h2}(z_{k+1} - t_h)\, f_s(t_s)\, f_{h2}(t_h)\, \frac{\bar{F}_{h1}(t_s)}{\bar{F}_{h2}(t_s)}\,dt_h\,dt_s. \end{aligned} \tag{26}$$

The final sub-scenario is that both failures occur in $(z_k, z_{k+1})$ and the hard failure occurs first.

Before the derivation of this probability, we first introduce the following proposition.

Proposition 4. Given that the first hard failure occurs at $t_h$, the density function of soft failure at $t_s$ is calculated as

 ts  f s  ts | Th  th   Fs  t s  exp    h1 (u )  h1 (u )Gˆ (t s  u ) du    t   h  t s      Gˆ (ts  th )  s (ts )  h1 (ts )  h1 (th )   h1 (u ) gˆ (ts  u ) du   gˆ  t s  th   ,     th    






(27)

where $\hat{g}(t_h)$ is the density function given as $\hat{g}(t_h) = d\hat{G}(t_h)/dt_h$.

[Figure 3: panels (a), (b) and (c) show timelines of component 1 and component 2 against time between the windows $z_k$ and $z_{k+1}$, marking the soft failure time $t_s$ and the hard failure time $t_h$. Legend: scheduled maintenance window; periodic inspection; soft failure (component 1); hard failure (component 2).]

Fig. 3. Three sub-scenarios for corrective replacement of component 1 and preventive replacement of component 2.

The detailed proof is left to Appendix A.3. Based on Proposition 4, we can obtain the probability of this sub-scenario as

$$P_{c3}(z, k) = \int_{z_k}^{z_{k+1}} \int_{t_h}^{z_{k+1}} f_s(t_s \mid T_h = t_h)\, f_{h1}(t_h)\,dt_s\,dt_h$$

   ts   Fs  ts  Fh1  ts  h1 (th ) exp   h1 (u )Gˆ (ts  u )du    t  zk 1 zk 1   h   dts dth .    ts    zk th     Gˆ (ts  th )  s (ts )  h1 (ts )   h1 (u ) gˆ (t s  u ) du   gˆ  t s  th        th    

(28)


Hence, the expected downtime due to this sub-scenario is    ts    zk 1  ts  Fs  ts  Fh1  ts  h1 (t s )exp   h1 (u )Gˆ (t s  u )du    t  zk 1 zk 1   h  dts dth ,  c3  z, k      ts    zk t h     Gˆ (ts  th )  s (ts )  h1 (ts )   h1 (u ) gˆ (ts  u )du   gˆ  t s  th        th    

(29)



and the expected number of minimal repairs under this sub-scenario, $E[N_{mini}^{c_3}]$, is

  ts     h1  ts  th    h2  zk 1  ts  Fs  ts  Fh1  ts  h1 (t ) exp   h1 (u )Gˆ (t s  u )du   t   zk 1 zk 1  h   dts dth .    ts      zk t h    Gˆ (ts  th )  s (ts )  h1 (ts )   h1 (u ) gˆ (t s  u )du   gˆ  t s  th          t h      



(30)




Combining Eqs. (23)-(30), the expected renewal cycle length due to Scenario B, $E[L_b(m,z)]$, is given as

$$E[L_b(m,z)] = \sum_{k=0}^{m-1}\sum_{j=1}^{3} z_{k+1}\, P_{c_j}(z,k). \tag{31}$$

Analogously, the expected renewal cycle cost due to Scenario B, E  Cb  m, z   is given as



E  Cb  m, z    cc1  c p2

m 1 3

m 1 3

  P  z, k     k  1 c k  0 j 1

cj

k  0 j 1

m 1

 c  P  z, k   P  z, k    c k 0

in

c2

c3


m

P

in c j

 z, k  

 E  N   E  N . c2 mini

c2 mini

(32)


3.3 Cost optimization

Based on the expected cycle costs and lengths derived in Section 3.2, we are able to calculate the expected cost rate of the system via the renewal-reward theory. Summing up Eqs. (21), (22), (31) and (32), the expected cost rate $C(m,z)$ is given as

$$C(m,z) = \frac{E[C_a(m,z)] + E[C_b(m,z)]}{E[L_a(m,z)] + E[L_b(m,z)]}$$

 c kp Pp2  z , k   cin Pc1  z , k      c3 p2 c2 3 c mp Pp1  z , m  +  k 3   c m E  N mini   E  N mini   E N mini c P z , k  c  z , k     k 0  c  c d  cj  j j 1  j 1   m 1 m 1  3  zm Pp1  z , m     zk 1 Pp2  z , k    zk 1 Pc j  z , k    k 0 k 0  j 1 





(33) ,

where

$$c_p^k = c_{p_1} + c_{p_2} + k c_{in}, \qquad c_c^k = c_{c_1} + c_{p_2} + k c_{in}, \qquad k = 0, 1, \ldots, m-1. \tag{34}$$
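As a minimal sketch of how the renewal-reward ratio in Eq. (33) and the stage costs in Eq. (34) fit together, the snippet below evaluates both. The function names `stage_costs` and `cost_rate` are hypothetical, the cost items in the usage note are those of Table 2, and the expected cycle costs and lengths passed to `cost_rate` are placeholder numbers, not results from this model.

```python
def stage_costs(k, c_p1, c_p2, c_c1, c_in):
    """Eq. (34): costs accumulated when the cycle ends after k inspections.

    Returns (c_p^k, c_c^k): preventive replacement of both components, and
    corrective replacement of component 1 with preventive replacement of
    component 2, each including k inspection costs.
    """
    c_p_k = c_p1 + c_p2 + k * c_in
    c_c_k = c_c1 + c_p2 + k * c_in
    return c_p_k, c_c_k


def cost_rate(exp_cost_a, exp_cost_b, exp_len_a, exp_len_b):
    """Renewal-reward ratio: expected cycle cost over expected cycle length."""
    return (exp_cost_a + exp_cost_b) / (exp_len_a + exp_len_b)
```

For instance, `stage_costs(3, 5000, 11000, 8000, 1000)` uses the Table 2 cost items with three inspections, and `cost_rate(12000.0, 30000.0, 2.0, 5.0)` divides an illustrative total expected cost by an illustrative total expected cycle length.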

The objective of this model is to find the minimal expected cost rate by optimizing the maintenance interval $z$ and the number of maintenance windows $m$. Hence, the optimization problem can be described as

$$\min_{m,z}\; C(m,z) \quad \text{subject to } z > 0,\; m = 1, 2, \ldots, \tag{35}$$

where $C(m,z)$ is the expected cost rate given in Eq. (33).

The analytical optimization of Eq. (35) is almost intractable. Nevertheless, this is a low-dimensional unconstrained optimization problem, which can be solved effectively by random search methods. In this article, we choose a recently developed random search method, namely the artificial bee colony (ABC) algorithm (see, e.g., Karaboga et al. [32]). Compared with other classic random search methods, such as the genetic algorithm and particle swarm optimization, it has a relatively efficient and concise search process (with few control parameters), which shortens the calculation time [33]. Another notable feature of this algorithm is its strong robustness. It performs well in finding globally optimal solutions and in ensuring the precision and stability of the solution [34]. This is realized by avoiding premature convergence via the scout phase.

It is worth noting that the ABC algorithm performs well in both continuous and discrete optimization [32]. For this reason, it has been extensively used for solving various industrial optimization problems, including supply-chain optimization, project scheduling optimization and power system optimization [3534]. In recent years, this algorithm has also been successfully applied to complex maintenance optimization problems. For instance, Wang et al. [36] adopted the ABC algorithm to jointly optimize the inspection interval and the number of inspections for a multi-state system. Yang et al. [37] designed a hybrid maintenance strategy to deal with competing failures, and optimized both the inspection and replacement intervals via the ABC algorithm.
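The search described above can be illustrated by a minimal Python sketch, written under stated assumptions rather than as the authors' implementation. For each candidate number of windows $m$, a basic ABC loop (employed, onlooker and scout phases) searches the continuous interval $z$, and the best pair is kept. The function names, the default control parameters and in particular `toy_cost_rate` are hypothetical; the toy function is only a smooth stand-in for $C(m,z)$ and is not Eq. (33).

```python
import random


def abc_minimize(f, lo, hi, n_sources=10, limit=20, n_iter=300, rng=None):
    """Minimise a non-negative 1-D function f on [lo, hi] with a basic ABC loop."""
    rng = rng or random.Random(0)
    xs = [rng.uniform(lo, hi) for _ in range(n_sources)]   # food sources
    fs = [f(x) for x in xs]
    trials = [0] * n_sources                               # non-improvement counters
    best_f, best_x = min(zip(fs, xs))

    def try_neighbour(i):
        """Greedy move of source i relative to a randomly chosen partner."""
        nonlocal best_f, best_x
        k = rng.choice([j for j in range(n_sources) if j != i])
        cand = xs[i] + rng.uniform(-1.0, 1.0) * (xs[i] - xs[k])
        cand = min(max(cand, lo), hi)                      # keep inside the bounds
        fc = f(cand)
        if fc < fs[i]:
            xs[i], fs[i], trials[i] = cand, fc, 0
            if fc < best_f:
                best_f, best_x = fc, cand
        else:
            trials[i] += 1

    for _ in range(n_iter):
        for i in range(n_sources):                         # employed bee phase
            try_neighbour(i)
        weights = [1.0 / (1.0 + v) for v in fs]            # roulette-wheel fitness
        for _ in range(n_sources):                         # onlooker bee phase
            try_neighbour(rng.choices(range(n_sources), weights=weights)[0])
        for i in range(n_sources):                         # scout phase
            if trials[i] > limit:                          # abandon stagnant source
                xs[i] = rng.uniform(lo, hi)
                fs[i], trials[i] = f(xs[i]), 0
                if fs[i] < best_f:
                    best_f, best_x = fs[i], xs[i]
    return best_x, best_f


def optimise_policy(cost, max_m, z_lo, z_hi, seed=1):
    """Run the 1-D search for each m = 1..max_m and keep the cheapest (m, z)."""
    rng = random.Random(seed)
    best = None
    for m in range(1, max_m + 1):
        z_hat, c_hat = abc_minimize(lambda z: cost(m, z), z_lo, z_hi, rng=rng)
        if best is None or c_hat < best[2]:
            best = (m, z_hat, c_hat)
    return best                                            # (m*, z*, cost rate)


def toy_cost_rate(m, z):
    """Hypothetical stand-in for C(m, z); NOT the cost rate of Eq. (33)."""
    return (5000.0 + 1000.0 * m) / (m * z) + 500.0 * m * z + 2000.0 * z * z
```

A call such as `optimise_policy(toy_cost_rate, max_m=10, z_lo=0.01, z_hi=20.0)` returns the selected triple. Treating $m$ by an outer loop and $z$ by the inner random search mirrors the mixed integer and continuous structure of the problem, and a coarse grid enumeration over the same ranges gives a simple cross-check of the result.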

[Figure 4: flowchart of the optimization procedure. For each inspection number $m = 1, \ldots, M$, an initial population $z_m^j$, $j = 1, \ldots, NP$, is generated (Algorithm 1); the employed and onlooker bee phases (Algorithm 2) and the scout phase (Algorithm 3) are then iterated up to $n_{max}$ times, and the best solution $\hat z_m$ for the current $m$ is recorded. The procedure finally outputs the optimal $m^*$, the corresponding $\hat z_{m^*}$ and the cost rate $R(m^*, \hat z_{m^*})$.]

Fig. 4. Cost optimization flowchart based on the ABC algorithm.

Recall that our joint optimization problem is an unconstrained and simple one, which has only one continuous variable and one integer variable. Hence, it is well suited to the ABC algorithm, whose search process is concise and efficient. The framework of the optimization procedure based on the ABC algorithm is depicted in Fig. 4. To validate that globally optimal solutions are found, we further compare the optimization results obtained by the ABC algorithm and the enumeration algorithm in the numerical example.

The main step of this algorithm is to search for the optimal real variable $\hat z_m$ at the current inspection number $m$, $m \in \{1, 2, \ldots, M\}$, then compare the expected cost rates $R(m, \hat z_m)$ for all $m \in \{1, 2, \ldots, M\}$, and finally choose the optimal $m^*$. The search process of $\hat z_m$ consists of three phases. The first phase is the employed bee phase, whose main purpose is to continuously improve the quality of the current solutions. The pseudo-code of this phase is provided by Algorithm 2 in Appendix C. Afterwards, the onlooker bee phase is executed to avoid premature convergence of solutions. This phase starts by choosing an onlooker bee based on the roulette wheel method [32], and then repeats the procedure in Algorithm 2. The third phase is the scout phase, which aims at removing non-improving solutions; its pseudo-code is provided by Algorithm 3 in Appendix C.

4 Some classic/advanced maintenance strategies

In order to illustrate the superiority of the proposed maintenance strategy, we also introduce two classic/advanced maintenance strategies for comparison. The first is the classic group maintenance strategy, which provides maintenance windows according to the calendar time. The second is an extended group maintenance strategy for multi-component systems, namely the opportunistic group maintenance strategy, where opportunistic maintenance is scheduled to take advantage of the economic dependence. The detailed comparison of the maintenance costs of these strategies is provided in Section 5.

4.1 Strategy 1: Classic group maintenance strategy

This strategy provides scheduled maintenance windows for the overhaul of the whole system. Similar strategies can be found in Elsayed [38]. Specifically, windows are scheduled at calendar times $T, 2T, \ldots$, during which component 1 is preventively replaced with a cost $c_{p_1}$, and component 2 is preventively replaced with a cost $c_{p_2}$. Prior to the arrival of each window, component 1 undergoes $m-1$ periodic inspections to detect the soft failure state, with an interval $z$ ($z = T/m$) and a cost $c_{in}^*$ per inspection. If a soft failure occurs between $(z_k, z_{k+1})$, $k = 0, 1, \ldots$, it is revealed at $z_{k+1}$ and removed by corrective replacement with a cost $c_{c_1}^*$. Additionally, when component 2 fails between $(0, z_m)$, it is correctively replaced immediately with a cost $c_{c_2}^*$. Under this strategy, we have $c_{c_1}^* > c_{c_1}$ because no maintenance window is provided to share the set-up cost and the downtime cost. Notice that the renewal-reward theory is no longer applicable because of the use of calendar time. Nevertheless, the cost can be readily calculated via simulation. The objective of this strategy is to seek the optimal decision variable set $\{z, m\}$, $z > 0$, $m \in \{1, 2, \ldots\}$, such that the expected cost per unit time of the system, $C_{str1}(m,z)$, is minimized. The optimization procedure can also be realized via the ABC algorithm.

4.2 Strategy 2: Opportunistic group maintenance strategy

This strategy is an advanced group maintenance strategy, namely the opportunistic group maintenance strategy, which ensures sufficient utilization of the economic dependence. Similar strategies can be found in Shafiee and Finkelstein [39]. To be specific, if both components remain normal until $z_m$, the whole system is preventively replaced. Prior to this, component 1 is inspected every $z$ time units to detect the soft failure state. If component 1 is found failed at an inspection, it is correctively replaced immediately, whereas component 2 is opportunistically replaced at the same time.

The main difference between this strategy and the strategy in Section 2 lies in the maintenance plan for component 2. In this strategy, once component 2 undergoes a hard failure, it is correctively replaced immediately with a cost $c_{c_2}$ instead of only receiving a minimal repair. At the same time, component 1 is opportunistically replaced with a cost $c_{o_1}$. Here $c_{o_1} > c_{p_1}$ because the downtime incurred by a corrective replacement is larger than that incurred by a scheduled preventive replacement.

Similar to the proposed maintenance strategy, this strategy also allows the system maintenance cost to be calculated directly (via the renewal-reward theory) instead of by simulation, thanks to the involvement of opportunistic maintenance. There are three possible renewal scenarios: (a) both components are preventively replaced at $z_m$; (b) component 1 is correctively replaced at $z_{k+1}$, $k = 0, 1, \ldots, m-1$, whereas component 2 is opportunistically replaced; (c) component 2 is correctively replaced upon the arrival of a hard failure, whereas component 1 is opportunistically replaced.

Let c0k  co1  cc2  kcin , k  0,1, , m  1. Then summing up these three scenarios, the expected cost per unit time of the system is given as 2  k  cc  cin  Pc1  z , k  +cd   c j  z , k      m j 1 c p Pp1  z , m      k 0  k co Pp2  z , k   Pc2  z , k   Pc3  z , k     m 1



m 1

z k 0

m 1

m





Pp1  z , m    zk 1 Pc1  z , k    c1  z , k    c2  z , k    c3  z , k  k 0



(36)

,


Csr 2  m, z  

where the expected cycle lengths due to corrective replacement of component 2 are respectively given as  zk 1  ˆ  c1  z , k   Fs  zk 1  Fh1  zk 1   h1 (th )G ( zk 1  th )exp   h1 ( x)Gˆ ( zk 1  u )du dth , t  zk  h  zk 1  Fh  zk 1    c2  z , k    th f s (ts ) Fh1  ts  1  2  dts ,   F t   h s zk 2  


zk 1

(37)


   ts   th Fs  ts  Fh1  ts  h1 (t )exp   h1 (u )Gˆ (ts  u )du    t  zk 1 zk 1   h   dts dth ,  c3  z , k      ts    zk th     Gˆ (ts  th )  s (ts )  h1 (ts )   h1 (u ) gˆ (ts  u )du   gˆ  t s  th        th    

 c  z, k   1


and the expected downtimes incurred by soft failure of component 1 are respectively given as zk 1

  zk 1  ts  f s (ts )Fh ts  1

Fh2  ts 

dts ,  c2  z, k  

zk 1 zk 1

  t

zk

h

 ts  f s (ts ) f h (th )

ts

Fh1  ts 

Fh2  ts 

dth dts . (38)


zk

Fh2  zk 1 

Thus, the optimization problem can be expressed as


2  k  c  c P z , k + c  c j  z, k         in c1 d  c m j 1 c p Pp1  z , m      k 0  k c P z, k  P z, k  P z, k   o p2   c2   c3   


min Cstr 2  m, z   m , z

m 1



m 1

z k 0

m 1

m



Pp1  z , m    zk 1 Pc1  z , k    c1  z , k    c2  z , k    c3  z , k 

subject to z  0, m  1, 2,

(39)

5 Case study

In this section, we conduct a case study on an electrical distribution system (EDS) to illustrate the applicability of the proposed reliability and maintenance model. The capacitor bank and the transformer are two critical components of an EDS, and both are susceptible to harsh environmental stresses [3130]. Failure of a capacitor bank (e.g., electric leakage, voltage breakdown) is soft and hidden, and does not stop the electrical system. In contrast, failure of a transformer (e.g., discharging, blast) is hard, i.e., whenever it occurs, the system stops operating immediately.

It should be noticed that these two failure modes in an EDS may interact. When a transformer fails, the voltage in the system fluctuates, which causes additional damage to the capacitor bank and results in an increased hazard rate [17]. On the other hand, when the capacitor bank enters the soft failure state, the electrical stress level of the transformer rises significantly, which increases its failure risk. To deal with these dependent failures, maintenance windows are provided at a constant interval, during which inspections can be performed to detect failures of the capacitor bank.

The intensities of both soft failure and hard failure have the same form as follows

$$\lambda_s(t) = a_0 t^{b_0}, \qquad \lambda_h(t) = \begin{cases} \lambda_{h_1}(t) = a_1 t^{b_1}, \\ \lambda_{h_2}(t) = a_2 t^{b_2}. \end{cases} \tag{40}$$

The detailed values for the distribution parameters of the electrical distribution system are presented in Table 1, and the values of the separate maintenance cost items are provided in Table 2.

Table 1. Parameter values for failure distributions of the electrical distribution system

| Parameter | $a_0$ | $b_0$ | $a_1$ | $b_1$ | $a_2$ | $b_2$ | $w$ |
|---|---|---|---|---|---|---|---|
| Value | 0.26 | 0.88 | 0.25 | 0.64 | 0.75 | 0.64 | 0.5 |

Table 2. Parameter values for system availability and optimal inspection policy analysis.

| Item | $c_{c_1}$ | $c_{p_1}$ | $c_{p_2}$ | $c_{in}$ | $c_d$ | $c_m$ |
|---|---|---|---|---|---|---|
| Value | 8000 (yuan) | 5000 (yuan) | 11000 (yuan) | 1000 (yuan) | 3000 (yuan/month) | 2000 (yuan) |

5.1 Reliability evaluation

Under the parameter setting given in Table 1, the reliability and density functions of the lifetime of the capacitor bank can be calculated by Eq. (15), as shown in Fig. 5 and Fig. 6. Notice that $w = 0$ means that failures of the capacitor bank are not affected by failures of the transformer. In addition, we can observe from Fig. 5 that the impact of hard failures on the reliability of the capacitor bank is slight at the beginning of the operating stage. Nevertheless, when the operational time increases to a certain level, the reliability function is affected by hard failures more significantly because of the continually accumulated hazard rate increment. Finally, one can see that the two reliability functions approach zero in the range $t \in [17, 20]$.

[Figure 5: reliability function $R_s(t)$ versus $t$ (0 to 20) for $w = 0$ and $w = 0.5$.]

Fig. 5. Reliability function of the capacitor bank.

[Figure 6: density function $f_s(t)$ versus $t$ (0 to 20) for $w = 0$ and $w = 0.5$.]

Fig. 6. Density function of the capacitor bank.

5.2 Maintenance cost comparison

Based on the algorithm developed in Section 3.3, we are able to seek the optimal maintenance costs of the EDS under the proposed maintenance strategy and the two strategies introduced in Section 4. The values of the control parameters in the ABC algorithm are given in Table 3. The selection of these control parameters requires some explanation. It has been shown that the ABC algorithm is not sensitive to the colony size NP. Hence, a normal size of 20 is enough for the optimization calculation [33]. On the other hand, the setting of the non-improvement number $N_{max}$ has a large impact on the optimization outcome. A large $N_{max}$ reduces the exploration ability of the algorithm, whereas a small one hampers the cooperative search of the whole colony. Generally, it is recommended to set $N_{max}$ to the product of the colony size NP and the solution dimension D [34]. Hence, we set $N_{max} = 20 \times 2 = 40$. Finally, the maximum inspection number $M$ and the upper bound of the inspection interval $z^{up}$ must ensure that the optimal solution is covered. According to our calculation, $M = 30$ and $z^{up} = 20$ can effectively meet this requirement.

Table 3. Values of control parameters in the ABC algorithm.

| Parameter | NP | $M$ | $N_{max}$ | $[z^{dn}, z^{up}]$ (month) |
|---|---|---|---|---|
| Value | 20 | 30 | 40 | [0, 20] |

The optimal maintenance costs under the proposed maintenance strategy and the two compared maintenance strategies are provided in Table 4. To validate the precision and efficiency of the optimization results, we also calculate the optimal solutions using the enumeration method. The results show that the precisions of the optimal $z^*$ under the three strategies are all above 99.7% using ABC, and the precisions of the optimal $m^*$ are all 100%. Moreover, the ABC algorithm saves about 85% of the calculation time.

Table 4. Comparison of the three maintenance strategies in terms of the optimal maintenance cost and the corresponding decision variable

| Maintenance strategy | Optimal decision variables $(z^*, m^*)$ | Optimal expected cost per unit time |
|---|---|---|
| Proposed strategy | (1.12 month, 6) | $C(z^*, m^*) = 5947$ yuan/month |
| Strategy 1 | (1.03 month, 6) | $C_{str1}(z^*, m^*) = 7059$ yuan/month |
| Strategy 2 | (1.34 month, 5) | $C_{str2}(z^*, m^*) = 6512$ yuan/month |

We can conclude from Table 4 that, although Strategy 1 and Strategy 2 are convenient to execute, they are sub-optimal from the perspective of saving maintenance cost. Compared with the proposed strategy, they incur an extra 18.6% and 9.5% of maintenance cost, respectively. This validates the superiority of our policy. It can also be observed that the optimal maintenance interval for our strategy is the smallest. This is because a more frequent replacement/inspection plan is required to mitigate the additional failure risks due to failure interaction.

We then examine how the minimal maintenance cost and the corresponding optimal policy vary with a critical parameter, the increment expectation $w$, as shown in Table 5. It can be seen that when $w$ increases, both the optimal maintenance interval and the number of windows decrease while the average long-run cost rate increases. An explanation for this is that the expected lifetime of the capacitor bank decreases as $w$ increases. Hence, in order to reduce the penalty costs due to downtime, preventive actions including inspection and replacement should be performed more frequently when $w$ is large. Furthermore, notice that the maintenance cost of the transformer is larger than that of the capacitor bank. Consequently, if $w$ is large, the transformer should be replaced more frequently to reduce the impact of the transformer on the capacitor bank.

Table 5. Optimal maintenance cost and the corresponding maintenance policy of the EDS in terms of $w$.

| $w$ | $z^*$ | $m^*$ | $C(z^*, m^*)$ |
|---|---|---|---|
| 0.1 | 2.25 | 7 | 4863 |
| 0.3 | 1.71 | 6 | 5441 |
| 0.5 | 1.12 | 6 | 5947 |
| 0.7 | 0.93 | 5 | 6369 |
| 0.9 | 0.72 | 5 | 6710 |
| 1.1 | 0.61 | 4 | 6988 |

5.3 Sensitivity analysis

To further validate the superiority of the proposed maintenance strategy, we conduct several sensitivity analyses on some critical distribution and cost parameters. We first test the sensitivity to the increment expectation $w$. The comparison results are depicted in Fig. 7, where $w$ ranges over $[0, 1]$ with step size 0.1. It can be seen that the proposed strategy always incurs the minimal cost among the three strategies in this range. Nevertheless, its advantage over Strategy 2 weakens as $w$ increases. This is because a large $w$ indicates a heavy influence of hard failure on soft failure, in which case it is preferable to replace the transformer immediately upon the arrival of a hard failure instead of at the subsequent maintenance window.

[Figure 7: optimal expected cost per unit time versus $w$ (0 to 1) for the proposed strategy, Strategy 1 and Strategy 2.]

Fig. 7. Optimal expected cost per unit time in terms of $w$ under the three strategies (step size 0.1).

We then test the sensitivity of the optimal expected cost per unit time to three critical cost parameters, i.e., (a) the minimal repair cost $c_m$; (b) the preventive replacement cost of the transformer $c_{p_2}$; and (c) the corrective replacement cost of the capacitor bank $c_{f_1}$. These parameters are chosen because they directly reflect the impacts of opportunistic maintenance and group maintenance on the variation of the maintenance cost.

[Figure 8: optimal expected cost per unit time versus $c_m$ (1000 to 4000) for the proposed strategy, Strategy 1 and Strategy 2.]

Fig. 8. Sensitivity of the optimal expected cost per unit time to $c_m$.

The sensitivity of the optimal maintenance cost to $c_m$ is shown in Fig. 8, where $c_m$ takes values from 1000 to 4000 with step size 500. Obviously, neither Strategy 1 nor Strategy 2 is affected by this parameter since no minimal repair is performed. For this reason, we can immediately decide whether minimal repair is worthwhile via this sensitivity analysis. From Fig. 8, the proposed strategy always outperforms Strategy 1, and is better than Strategy 2 unless $c_m$ exceeds 3500. Therefore, we can conclude that the proposed strategy is a cost-effective option when a minimal repair is not very expensive.

[Figure 9: optimal expected cost per unit time versus $c_{p_2}$ (8000 to 14000) for the proposed strategy, Strategy 1 and Strategy 2.]

Fig. 9. Sensitivity of the optimal expected cost per unit time to $c_{p_2}$.

Similar results can be found in Fig. 9, which depicts the variation of the optimal maintenance cost in terms of $c_{p_2}$. Here $c_{p_2}$ takes values from 8000 to 14000 with step size 1000. We can observe that both the proposed strategy and Strategy 2 outperform Strategy 1 from the perspective of saving cost. Additionally, the proposed strategy is more cost-effective than Strategy 2 until $c_{p_2}$ exceeds 13500. This illustrates that maintainers tend to wait for a well-scheduled preventive replacement when the replacement cost is not too high.

[Figure 10: optimal expected cost per unit time versus $c_{f_1}$ (5000 to 11000) for the proposed strategy, Strategy 1 and Strategy 2.]

Fig. 10. Sensitivity of the optimal expected cost per unit time to $c_{f_1}$.

Finally, we investigate the sensitivity of the optimal maintenance cost to $c_{f_1}$, where $c_{f_1}$ takes values from 5000 to 11000. From Fig. 10, we can see that the proposed strategy is always better than the other two strategies regardless of the value of $c_{f_1}$. Additionally, the advantage of the proposed strategy over Strategy 2 is more obvious when $c_{f_1}$ is small. An explanation is that when the consequence of failure is not serious, it is preferable to leave the renewals of both components to a scheduled window, so that maintenance resources can be sufficiently prepared and the remaining useful lifetime utilized.

6 Final remarks

This study presented a group maintenance strategy for a two-component system with failure interaction. Different from most of the existing literature, we simultaneously considered two categories of dependence from the perspective of random hazard rate increment. Renewal of the system is always scheduled at maintenance windows, which enables the maximum utilization of economic dependence by sharing the set-up cost and downtime cost between components. We provided a case study on an electrical distribution system to illustrate the superiority of the maintenance policy. The results show that our strategy is more cost-effective than two classic/advanced group maintenance strategies, particularly when the hazard rate increment or the minimal repair cost is not too large. The optimization results can provide managers of energy devices, manufacturing plants and critical infrastructures with new insights into advanced cost control and maintenance scheduling methods through a full consideration of system structures and failure mechanisms. It is worth noting that our maintenance strategy reduces to several classic/advanced maintenance strategies under specific parameter settings. Hence, managers can flexibly adjust these maintenance parameters according to the properties of the specific repairable system.

As a further extension, it would be interesting to consider an opportunistic condition-based maintenance schedule for a two-component system. The rapid development of sensing technologies enables maintenance actions to be executed based on the current system state. Combining maintenance programs with economic manufacturing quantities (EMQ) [40] or inventory decisions [41] is another interesting extension for future study, since an effective maintenance policy heavily relies on the operating conditions of production systems and the accessibility of spare components [42]. We are also interested in incorporating the failure dependence into warranty menu designs.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China (Grant No. 61473014).

References

[1] Q. Qiu, L. Cui, H. Gao, Availability and maintenance modelling for systems subject to multiple failure modes, COMPUT. IND. ENG. 108 (2017) 192-198.
[2] L. Yang, Y. Zhao, R. Peng, Y. Zhao, Hybrid preventive maintenance of competing failures under random environment, RELIAB. ENG. SYST. SAFE. 174 (2018) 130-140.
[3] L. Yang, Z.S. Ye, C.G. Lee, S.F. Yang, R. Peng, A two-phase preventive maintenance policy considering imperfect repair and postponed replacement, EUR. J. OPER. RES. 274(3) (2019) 966-977.
[4] S. Song, D.W. Coit, Q. Feng, H. Peng, Reliability Analysis for Multi-Component Systems Subject to Multiple Dependent Competing Failure Processes, IEEE. T. RELIAB. 63(1) (2014) 331-345.
[5] P. Li, W. Wang, R. Peng, Age-based replacement policy with consideration of production wait time, IEEE. T. RELIAB. 65(1) (2015) 235-247.
[6] Q. Qiu, L. Cui, Reliability evaluation based on a dependent two-stage failure process with competing failures, APPL. MATH. MODEL. 64 (2018) 699-712.
[7] C.D. Lai, M. Xie, Stochastic ageing and dependence for reliability, Springer, London, 2006.
[8] S. Taghipour, D. Banjevic, Periodic inspection optimization models for a repairable system subject to hidden failures, IEEE. T. RELIAB. 60(1) (2011) 275-285.
[9] B. Liu, R.H. Yeh, M. Xie, W. Kuo, Maintenance Scheduling for Multicomponent Systems with Hidden Failures, IEEE. T. RELIAB. 66(4) (2017) 1280-1292.
[10] M.T. Lai, Y.C. Chen, Optimal periodic replacement policy for a two-unit system with failure rate interaction, INT. J. ADV. MANUF. TECH. 29(3-4) (2006) 367-371.
[11] K.T. Huynh, A. Barros, C. Berenguer, I.T. Castro, A periodic inspection and replacement policy for systems subject to competing failure modes due to degradation and traumatic events, RELIAB. ENG. SYST. SAFE. 96(4) (2011) 497-508.
[12] N.C. Caballé, I.T. Castro, Analysis of the reliability and the maintenance cost for finite life cycle systems subject to degradation and shocks, APPL. MATH. MODEL. 52 (2017) 731-746.
[13] L. Jiang, Q. Feng, D.W. Coit, Reliability and maintenance modeling for dependent competing failure processes with shifting failure thresholds, IEEE. T. RELIAB. 61(4) (2012) 932-948.
[14] D.N.P. Murthy, D.G. Nguyen, Study of two-component system with failure interaction, NAV. RES. LOG. 32(2) (1985) 239-247.
[15] J.P. Jhang, S.H. Sheu, Optimal age and block replacement policies for a multi-component system with failure interaction, INT. J. SYST. SCI. 31(5) (2000) 593-603.
[16] R.I. Zequeira, C. Bérenguer, On the inspection policy of a two-component parallel system with failure interaction, RELIAB. ENG. SYST. SAFE. 88(1) (2005) 99-107.
[17] H.R. Golmakani, H. Moakedi, Periodic inspection optimization model for a two-component repairable system with failure interaction, COMPUT. IND. ENG. 63(3) (2012) 540-545.
[18] I.T. Castro, A model of imperfect preventive maintenance with dependent failure modes, EUR. J. OPER. RES. 196(1) (2009) 217-224.
[19] Q. Qiu, L. Cui, H. Gao, H. Yi, Optimal allocation of units in sequential probability series systems, RELIAB. ENG. SYST. SAFE. 169 (2018) 351-363.
[20] K. Atashgar, H. Abdollahzadeh, Reliability optimization of wind farms considering redundancy and opportunistic maintenance strategy, ENERG. CONVERS. MANAGE. 112 (2016) 445-458.

[21] C. Zhang, W. Gao, S. Guo, Y. Li, T. Yang, Opportunistic maintenance for wind turbines considering imperfect, reliability-based maintenance, RENEW. ENERG. 103 (2017) 606-612.
[22] Q. Qiu, L. Cui, J. Shen, L. Yang, Optimal Maintenance Policy Considering Maintenance Errors for Systems Operating under Performance-based Contracts, COMPUT. IND. ENG. 112 (2017) 147-155.
[23] L.D. Arya, S.C. Choube, R. Arya, Probabilistic reliability indices evaluation of electrical distribution system accounting outage due to overloading and repair time omission, INT. J. ELEC. POWER. 33(2) (2011) 296-302.
[24] A.J. Lemoine, M.L. Wenocur, On failure modeling, NAV. RES. LOG. 32(3) (1985) 497-508.
[25] J.H. Cha, J. Mi, On a Stochastic Survival Model for a System Under Randomly Variable Environment, METHODOL. COMPUT. APPL. 13(3) (2011) 549-561.
[26] J.H. Cha, M. Finkelstein, G. Levitin, On preventive maintenance of systems with lifetimes dependent on a random shock process, RELIAB. ENG. SYST. SAFE. 168 (2017) 90-97.
[27] J. Arts, R. Basten, Design of multi-component periodic maintenance programs with single-component models, IIE. TRANS. 50(7) (2018) 606-615.
[28] L. Wang, H. Hu, Y. Wang, W. Wen, P. He, The availability model and parameters estimation method for the delay time model with imperfect maintenance at inspection, APPL. MATH. MODEL. 35(6) (2011) 2855-2863.
[29] Q. Zhu, H. Peng, B. Timmermans, G.J.V. Houtum, A condition-based maintenance model for a single component in a system with scheduled and unscheduled downs, INT. J. PROD. ECON. 193 (2017) 365-380.
[30] N.C. Sahoo, S. Ganguly, D. Das, Multi-objective planning of electrical distribution systems incorporating sectionalizing switches and tie-lines using particle swarm optimization, SWARM. EVOL. COMPUT. 3 (2012) 15-32.
[31] R.N. Allan, R. Billinton, I. Sjarief, L. Goel, K.S. So, A reliability test system for educational purposes-basic distribution system data and results, IEEE. T. POWER. SYST. 6(2) (1991) 813-820.
[32] D. Karaboga, B. Akay, A modified Artificial Bee Colony (ABC) algorithm for constrained optimization problems, APPL. SOFT. COMPUT. 11(3) (2011) 3021-3031.
[33] D. Karaboga, B. Basturk, A powerful and efficient algorithm for numerical function optimization: Artificial bee colony (ABC) algorithm, J. GLOBAL. OPTIM. 39(3) (2007) 459-471.
[34] B. Akay, D. Karaboga, Artificial bee colony algorithm for large-scale problems and engineering design optimization, J. INTELL. MANUF. 23(4) (2010) 2001-2014.
[35] A. Singh, An artificial bee colony algorithm for the leaf-constrained minimum spanning tree problem, APPL. SOFT. COMPUT. 9(2) (2009) 625-631.
[36] H. Wang, W. Wang, R. Peng, A two-phase inspection model for a single component system with three-stage degradation, RELIAB. ENG. SYST. SAFE. 158 (2017) 31-40.
[37] L. Yang, X. Ma, R. Peng, Q. Zhai, Y. Zhao, A preventive maintenance policy based on dependent two-stage deterioration and external shocks, RELIAB. ENG. SYST. SAFE. 160 (2017) 201-211.
[38] E.A. Elsayed, Reliability Engineering, second ed., Springer, Berlin, 2010.
[39] M. Shafiee, M. Finkelstein, An optimal age-based group maintenance policy for multi-unit degrading systems, RELIAB. ENG. SYST. SAFE. 134 (2015) 230-238.
[40] M.K. Salameh, M.Y. Jaber, Optimal lot sizing with regular maintenance interruptions, APPL. MATH. MODEL. 21(2) (1997) 85-90.
[41] M.Y. Jaber, S.K. Goyal, M. Imran, Economic production quantity model for items with imperfect quality subject to learning effects, INT. J. PROD. ECON. 115(1) (2008) 143-150.
[42] C.H. Glock, M.Y. Jaber, A multi-stage production-inventory model with learning and forgetting effects, rework and scrap, COMPUT. IND. ENG. 64(2) (2013) 708-720.

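Appendix A.

A.1. Derivation of Eq. (14)

The derivation is as follows. Summing up Eqs. (10)-(13), the joint probability of $\{X_s > t, N_h(t) = n\}$ can be calculated as

$$
\begin{aligned}
\Pr\{X_s > t, N_h(t) = n\} = {} & \exp\left(-\int_0^t \left[\lambda_s(u) + \lambda_h(u)\right] du\right) \\
& \times \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \int_0^\infty \cdots \int_0^\infty \prod_{i=1}^{n} \lambda_h(s_i)\, g(w_i) \exp\left(-w_i (t - s_i)\right) dw_1 \cdots dw_n \, ds_1 \cdots ds_n,
\end{aligned}
\tag{41}
$$

which can be further simplified as

$$
\begin{aligned}
P(X_s > t, N_h(t) = n) = {} & \exp\left(-\int_0^t \left[\lambda_s(u) + \lambda_h(u)\right] du\right) \int_0^t \cdots \int_0^t \int_0^\infty \cdots \int_0^\infty I(s_1 < s_2 < \cdots < s_n) \prod_{i=1}^{n} \lambda_h(s_i)\, g(w_i) \exp\left(-w_i (t - s_i)\right) dw_1 \cdots dw_n \, ds_1 \cdots ds_n \\
= {} & \frac{1}{n!} \exp\left(-\int_0^t \left[\lambda_s(u) + \lambda_h(u)\right] du\right) \int_0^t \cdots \int_0^t \int_0^\infty \cdots \int_0^\infty \prod_{i=1}^{n} \lambda_h(s_i)\, g(w_i) \exp\left(-w_i (t - s_i)\right) dw_1 \cdots dw_n \, ds_1 \cdots ds_n \\
= {} & \frac{\exp\left(-\int_0^t \left[\lambda_s(u) + \lambda_h(u)\right] du\right) \left[\int_0^t \lambda_h(u) \int_0^\infty g(v) \exp\left(-v (t - u)\right) dv \, du\right]^n}{n!},
\end{aligned}
\tag{42}
$$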

where $I(\cdot)$ is an indicator function whose value is 1 if the statement is true, and 0 otherwise.

Let $\hat{G}(u)$ denote the Laplace transform of $g(v)$, i.e.,

$$
\hat{G}(u) = \int_0^\infty \exp(-uv)\, g(v)\, dv.
\tag{43}
$$

Then Eq. (42) becomes

$$
P(X_s > t, N_h(t) = n) = \frac{\exp\left(-\int_0^t \left[\lambda_s(u) + \lambda_h(u)\right] du\right) \left[\int_0^t \lambda_h(u)\, \hat{G}(t - u)\, du\right]^n}{n!}.
\tag{44}
$$
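As a numerical sanity check of Eq. (44), the following sketch evaluates the joint probability for each $n$ and confirms that summing over $n$ gives $\exp\left(-\int_0^t \left[\lambda_s(u) + \lambda_h(u) - \lambda_h(u)\hat{G}(t-u)\right] du\right)$, the survival form that Eq. (46) relies on. The hazard functions, the exponential increment distribution with rate `THETA` (so that $\hat{G}(u) = \theta/(\theta+u)$), and all function names are illustrative assumptions, not taken from the paper.

```python
import math

def integrate(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Illustrative assumptions, not values from the paper.
THETA = 2.0                                   # rate of the Exp(THETA) increment pdf g

def lam_s(u):
    return 0.05                               # constant soft-failure hazard

def lam_h(u):
    return 0.3 * u                            # increasing hard-failure intensity

def G_hat(u):
    return THETA / (THETA + u)                # Laplace transform of the Exp(THETA) pdf

def joint_prob(t, n):
    """Eq. (44): Pr{X_s > t, N_h(t) = n}."""
    base = math.exp(-integrate(lambda u: lam_s(u) + lam_h(u), 0.0, t))
    mean = integrate(lambda u: lam_h(u) * G_hat(t - u), 0.0, t)
    return base * mean ** n / math.factorial(n)

def survival(t):
    """Sum of Eq. (44) over all n, written in closed form."""
    expo = integrate(lambda u: lam_s(u) + lam_h(u) * (1.0 - G_hat(t - u)), 0.0, t)
    return math.exp(-expo)

t = 3.0
total = sum(joint_prob(t, n) for n in range(40))
print(total, survival(t))
```

The two printed numbers should agree to within the accuracy of the quadrature, because the terms of Eq. (44) form a Poisson-type series in $n$.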

A.2. Proof of Proposition 3

The detailed proof is as follows. Given that the first hard failure of component 2 occurs at $t_h \in [z_k, z_{k+1})$, the instantaneous hazard rate of component 1 undergoes an abrupt jump:

$$
\lambda_s(t_h) =
\begin{cases}
\lambda_s(t_h), & \text{before the hard failure}, \\
\lambda_s(t_h) + w_1, & \text{after the hard failure},
\end{cases}
\tag{45}
$$

where $w_1$ is the realization of the random hazard rate increment due to the first hard failure. Then, based on the conclusion in Proposition 1, the conditional survival function $\Pr\{T_s > z_{k+1} \mid T_h = t_h\}$ can be calculated as

$$
\begin{aligned}
\Pr\{T_s > z_{k+1} \mid T_h = t_h\} = {} & \int_0^\infty \exp\left(-\int_0^{t_h} \lambda_s(u)\, du - \int_{t_h}^{z_{k+1}} \left[\lambda_s(u) + w_1 + \lambda_{h1}(u) - \lambda_{h1}(u)\, \hat{G}(z_{k+1} - u)\right] du\right) g(w_1)\, dw_1 \\
= {} & F_s(z_{k+1}) \exp\left(-\int_{t_h}^{z_{k+1}} \left[\lambda_{h1}(u) - \lambda_{h1}(u)\, \hat{G}(z_{k+1} - u)\right] du\right) \int_0^\infty \exp\left(-w_1 (z_{k+1} - t_h)\right) g(w_1)\, dw_1,
\end{aligned}
\tag{46}
$$

where $F_s(z_{k+1}) = \exp\left(-\int_0^{z_{k+1}} \lambda_s(u)\, du\right)$ collects the two baseline hazard integrals.

Recall that $\hat{G}(t) = \int_0^\infty \exp(-w_1 t)\, g(w_1)\, dw_1$ is the Laplace transform of $g(w_1)$. Hence, Eq. (46) can be simplified as

$$
\Pr\{T_s > z_{k+1} \mid T_h = t_h\} = F_s(z_{k+1})\, \hat{G}(z_{k+1} - t_h) \exp\left(-\int_{t_h}^{z_{k+1}} \left[\lambda_{h1}(u) - \lambda_{h1}(u)\, \hat{G}(z_{k+1} - u)\right] du\right),
\tag{47}
$$

which concludes the proof.
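The conditional survival probability in Eq. (47) can be cross-checked by simulation. The sketch below is only an illustration under simplifying assumptions that are not from the paper: constant intensities `LAM_S` and `LAM_H1`, exponentially distributed hazard increments with rate `THETA`, and arbitrary values for $t_h$ and $z_{k+1}$. Each simulated path draws the first increment $w_1$, generates the later hard failures on $(t_h, z_{k+1})$, and averages the path-wise survival probability of component 1.

```python
import math
import random

# Illustrative assumptions, not values from the paper.
LAM_S, LAM_H1, THETA = 0.05, 0.8, 2.0
T_H, Z_NEXT = 1.0, 4.0

def g_hat(x):
    """Laplace transform of the Exp(THETA) increment pdf."""
    return THETA / (THETA + x)

def closed_form():
    """Eq. (47) with constant intensities; the inner integral is analytic."""
    d = Z_NEXT - T_H
    integral = LAM_H1 * (d - THETA * math.log((THETA + d) / THETA))
    return math.exp(-LAM_S * Z_NEXT) * g_hat(d) * math.exp(-integral)

def monte_carlo(samples=40000, seed=1):
    """Average exp(-cumulative hazard of component 1) over simulated shock paths."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(samples):
        # Baseline hazard plus the increment w_1 caused by the failure at T_H.
        expo = LAM_S * Z_NEXT + rng.expovariate(THETA) * (Z_NEXT - T_H)
        s = T_H
        while True:
            s += rng.expovariate(LAM_H1)      # next hard failure time
            if s >= Z_NEXT:
                break
            expo += rng.expovariate(THETA) * (Z_NEXT - s)
        acc += math.exp(-expo)
    return acc / samples

estimate = monte_carlo()
exact = closed_form()
print(estimate, exact)
```

With 40,000 paths the two values should differ only by Monte Carlo noise, on the order of a few thousandths.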

A.3. Proof of Proposition 4

The detailed proof is as follows. Based on Proposition 2, the survival function of soft failure at $t_s$, given that the first hard failure occurs at $t_h$, is

$$
R_s(t_s \mid T_h = t_h) = \hat{G}(t_s - t_h)\, F_s(t_s) \exp\left(-\int_{t_h}^{t_s} \left[\lambda_{h1}(u) - \lambda_{h1}(u)\, \hat{G}(t_s - u)\right] du\right).
\tag{48}
$$

Differentiating $R_s(t_s \mid T_h = t_h)$ with respect to $t_s$, and writing $\hat{g}(x) = d\hat{G}(x)/dx$, we obtain

$$
\begin{aligned}
f_s(t_s \mid T_h = t_h) = {} & -\frac{dR_s(t_s \mid T_h = t_h)}{dt_s} \\
= {} & \hat{G}(t_s - t_h)\, F_s(t_s) \exp\left(-\int_{t_h}^{t_s} \left[\lambda_{h1}(u) - \lambda_{h1}(u)\, \hat{G}(t_s - u)\right] du\right) \left[\lambda_s(t_s) + \lambda_{h1}(t_s) - \lambda_{h1}(t_s)\, \hat{G}(0) - \int_{t_h}^{t_s} \lambda_{h1}(u)\, \hat{g}(t_s - u)\, du\right] \\
& - \hat{g}(t_s - t_h)\, F_s(t_s) \exp\left(-\int_{t_h}^{t_s} \left[\lambda_{h1}(u) - \lambda_{h1}(u)\, \hat{G}(t_s - u)\right] du\right) \\
= {} & F_s(t_s) \exp\left(-\int_{t_h}^{t_s} \left[\lambda_{h1}(u) - \lambda_{h1}(u)\, \hat{G}(t_s - u)\right] du\right) \left\{\hat{G}(t_s - t_h) \left[\lambda_s(t_s) - \int_{t_h}^{t_s} \lambda_{h1}(u)\, \hat{g}(t_s - u)\, du\right] - \hat{g}(t_s - t_h)\right\},
\end{aligned}
\tag{49}
$$

where the derivation uses $dF_s(t_s)/dt_s = -\lambda_s(t_s) F_s(t_s)$, and the last step uses $\hat{G}(0) = \int_0^\infty g(v)\, dv = 1$, so that the two boundary terms in $\lambda_{h1}(t_s)$ cancel.

This concludes the proof.
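The differentiation step can be verified numerically by comparing a central finite difference of Eq. (48) with the density obtained from the product rule. The sketch below assumes that $\hat{g}$ denotes the derivative of $\hat{G}$ (a notational assumption), that $F_s(t) = \exp\left(-\int_0^t \lambda_s(u)\, du\right)$, and that increments are exponential with rate `THETA`; the hazard functions and all names are illustrative and not from the paper.

```python
import math

def integrate(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

# Illustrative assumptions, not values from the paper.
THETA, T_H = 2.0, 1.0

def lam_s(t):
    return 0.1 * t                            # baseline soft-failure hazard

def F_s(t):
    return math.exp(-0.05 * t * t)            # exp(-integral of lam_s over [0, t])

def lam_h1(u):
    return 0.3 * u                            # hard-failure intensity after t_h

def G_hat(x):
    return THETA / (THETA + x)                # Laplace transform of Exp(THETA)

def g_hat(x):
    return -THETA / (THETA + x) ** 2          # derivative of G_hat

def shock_factor(ts):
    """exp(-integral of lam_h1(u) * (1 - G_hat(ts - u)) over [T_H, ts])."""
    return math.exp(-integrate(lambda u: lam_h1(u) * (1.0 - G_hat(ts - u)), T_H, ts))

def R_s(ts):
    """Conditional survival function, Eq. (48)."""
    return G_hat(ts - T_H) * F_s(ts) * shock_factor(ts)

def f_s(ts):
    """Conditional density from the product rule, using G_hat(0) = 1."""
    conv = integrate(lambda u: lam_h1(u) * g_hat(ts - u), T_H, ts)
    inner = G_hat(ts - T_H) * (lam_s(ts) - conv) - g_hat(ts - T_H)
    return F_s(ts) * shock_factor(ts) * inner

ts, h = 3.0, 1e-4
numeric = -(R_s(ts + h) - R_s(ts - h)) / (2.0 * h)
print(numeric, f_s(ts))
```

Agreement between the two printed values indicates that no boundary term in $\lambda_{h1}(t_s)$ survives in the density, which follows from $\hat{G}(0) = 1$.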

Appendix B

Algorithm 1: Procedure of generating initial populations.
1: for $j = 1$ to $NP/2$ do
2:    Generate the initial solution $z_{mj}$ according to Eq. (50)

$$
z_{mj} = z^{dn} + \alpha \left(z^{up} - z^{dn}\right),
\tag{50}
$$

      where $\alpha$ is uniformly distributed within the range $[0, 1]$; $z^{dn}$ and $z^{up}$ are the lower and upper bounds of $z_{mj}$, respectively;
3:    Set the non-improvement number $n_{non} = 0$;
4: end for

Algorithm 2: Procedure of executing employed and onlooker bee phase.
1: for $j = 1$ to $NP/2$ do
2:    Generate a new solution $z_{mj}^{new}$ according to Eq. (51)

$$
z_{mj}^{new} = z_{mj} + \beta \left(z_{mj} - z_{mi}\right),
\tag{51}
$$

      where $\beta$ is a uniformly distributed number within $[-1, 1]$; $i \in \{1, \ldots, NP/2\}$ and $i \neq j$;
3:    Calculate the expected cost rates $C(m, z_{mj})$ and $C(m, z_{mj}^{new})$, respectively, based on Eq. (33);
4:    if $C(m, z_{mj}) > C(m, z_{mj}^{new})$ then
5:       $z_{mj} = z_{mj}^{new}$; $n_{non} = 0$;
6:    else
7:       $n_{non} = n_{non} + 1$;
8:    end if
9: end for

Algorithm 3: Procedure of executing scout phase.
1: for $j = 1$ to $NP/2$ do
2:    if $n_{non} > N_{lim}$ then
3:       replace $z_{mj}$ with a new solution generated by Eq. (50);
4:    end if
5: end for
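The three procedures can be combined into a single loop, as in the minimal sketch below. It is an illustration rather than the authors' implementation, and it makes several assumptions that are not in the paper: the decision variable is a scalar, one non-improvement counter is kept per solution, candidates are clamped to the bounds, a single update pass is run per iteration, and `example_cost` is a hypothetical stand-in for the expected cost rate $C(m, z)$ of Eq. (33). The parameter `n_sources` plays the role of $NP/2$ and `n_lim` the role of $N_{lim}$.

```python
import random

def abc_minimize(cost, z_dn, z_up, n_sources=20, n_lim=20, iterations=300, seed=0):
    """Artificial bee colony sketch for a scalar variable on [z_dn, z_up]."""
    rng = random.Random(seed)

    def new_source():
        # Eq. (50): uniform draw between the lower and upper bounds.
        return z_dn + rng.random() * (z_up - z_dn)

    z = [new_source() for _ in range(n_sources)]     # Algorithm 1
    n_non = [0] * n_sources                          # per-solution counters (assumption)
    best = min(z, key=cost)

    for _ in range(iterations):
        # Algorithm 2: perturb each solution relative to a random partner.
        for j in range(n_sources):
            i = rng.choice([k for k in range(n_sources) if k != j])
            beta = rng.uniform(-1.0, 1.0)
            z_new = z[j] + beta * (z[j] - z[i])      # Eq. (51)
            z_new = min(max(z_new, z_dn), z_up)      # clamp to bounds (assumption)
            if cost(z[j]) > cost(z_new):
                z[j], n_non[j] = z_new, 0
            else:
                n_non[j] += 1
        best = min(z + [best], key=cost)             # remember the best solution so far

        # Algorithm 3: scouts replace solutions that stopped improving.
        for j in range(n_sources):
            if n_non[j] > n_lim:
                z[j], n_non[j] = new_source(), 0

    return best, cost(best)

def example_cost(z):
    """Hypothetical stand-in for the expected cost rate of Eq. (33)."""
    return (z - 3.0) ** 2 + 1.0

best_z, best_cost = abc_minimize(example_cost, z_dn=0.0, z_up=10.0)
print(best_z, best_cost)
```

Tracking the best solution separately from the population is a design choice of this sketch, so that a scout replacement cannot discard the incumbent optimum.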
