Computers & Industrial Engineering 66 (2013) 710–719
Contents lists available at ScienceDirect
Computers & Industrial Engineering journal homepage: www.elsevier.com/locate/caie
Optimal replacement policy for a repairable system with multiple vacations and imperfect fault coverage
Madhu Jain (a), Ritu Gupta (b,*)
(a) Department of Mathematics, I.I.T. Roorkee, Roorkee, Haridwar, Uttrakhand 247 667, India
(b) Department of Mathematics, Institute of Basic Science, Khandari, Agra 282 002, India
Article info
Article history: Received 26 November 2012; Received in revised form 9 September 2013; Accepted 12 September 2013; Available online 21 September 2013
Keywords: Reliability; Multiple vacations; Imperfect fault coverage; Optimal replacement policy; Supplementary variable technique
Abstract
The present investigation deals with the reliability analysis of a repairable system with a single repairman who can take multiple vacations. System failure may occur due to two types of faults, termed major and minor. When the system fails due to a minor fault, it is recovered perfectly by the repairman. If the failure is due to a major fault, there are several recovery levels/procedures, each of which may recover the fault imperfectly with some probability; moreover, the system cannot be repaired to an 'as good as new' condition. It is assumed that the repairman can perform other tasks when the system is either idle or waiting for recovery from faults. The lifetime of the system and the vacation time of the repairman are assumed to be exponentially distributed, while the repair time follows a general distribution. By assuming a geometric process for the system working/vacation time, the supplementary variable technique and the Laplace transform approach are employed to derive the reliability indices of the system. We propose a replacement policy that maximizes the expected long-run profit. The validity of the analytical results is illustrated with numerical examples.
© 2013 Elsevier Ltd. All rights reserved.
1. Introduction During recent years, with the growing complexity of the modern embedded applications in many real time systems viz. computer, communication, electrical power, distributed computing, production, transportation, defense systems, etc., maintaining the system has become a great challenge for the system developers and engineers. In dealing with repairable systems, the renewal is taken into account by providing the repair or replacement. The effectiveness or performance of the systems depends on recovery/reconfiguration mechanism including fault detection, location and isolation so that the required level of reliability and availability is achieved. Sometimes, these mechanisms may fail to recover the system from the faults. The undetected/uncovered faults can propagate through the system, and may lead to an overall system failure. This phenomenon is called imperfect fault coverage. Considerable research efforts have been made in the reliability analysis of the repairable systems with imperfect fault coverage. In this context, we present some earlier research works which are worth mentioning here. Pham (1992) studied the reliability analysis of a high voltage system consisting of a power supply and two
The manuscript was handled by the area editor Min Xie.
* Corresponding author. Tel.: +91 1292261719.
E-mail addresses: [email protected] (M. Jain), gupta_ritu84@yahoo.co.in (R. Gupta).
0360-8352/$ - see front matter © 2013 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.cie.2013.09.011
transmitters with imperfect coverage by assuming a constant failure rate of fault coverage. The reliability of an M-out-of-N: G system with imperfect fault coverage, under the provision of single and multiple repair facilities, was examined by Akhtar (1994). Amari, Dugan, and Misra (1999) and Amari, Pham, and Dill (2004) discussed optimal design issues for maximizing the system reliability subject to imperfect fault coverage. An efficient method for examining the reliability of phased-mission systems with imperfect fault coverage, which introduces multi-mode failures, was given by Xing (2007), whereas Myers (2007) modeled the reliability of an M-out-of-N system for four different coverage models. In this direction, the analysis of a repairable system with imperfect coverage by means of binary decision diagrams and by a Bayesian approach was considered by Myers and Rauzy (2008) and Ke, Lee, and Hsu (2008b), respectively, while a repairable system with imperfect coverage in a fuzzy environment was discussed by Ke, Huang, and Lin (2008a). Kuniewski, Weide, and Noortwijk (2009) used sampling inspection for the reliability evaluation of deteriorating systems under imperfect fault detection. Simulation inferences for a repairable system with a general repair distribution and imperfect fault coverage were studied by Ke, Su, Wang, and Hsu (2010). Xing, Amari, and Wang (2012) proposed an efficient method to evaluate the exact reliability of a system under phased-mission requirements and imperfect fault coverage. Recently, Wang, Yen, and Jian (2013) investigated the reliability and sensitivity analysis of a repairable system with imperfect coverage under service pressure
M. Jain, R. Gupta / Computers & Industrial Engineering 66 (2013) 710–719
condition, whereas the reliability analysis for the optimal structure of a multi-state redundant system with multi-fault coverage was examined by Peng, Mo, Xie, and Levitin (2013). Most repairable systems deteriorate, since the age of the system components cannot remain uniform for long, nor can the operating time of the system be continuous. Due to techno-economic constraints, the successive working times of the system after repair become shorter and the consecutive repair times after failures become longer; in this situation, replacement is a better option than repairing the system. Decision makers then have to choose the optimal number N of repairs to carry out on the failed system before replacing it; this type of problem is known as the optimal replacement policy problem. The growing interest of researchers in optimal replacement problems in different scenarios can be seen in the literature. In 1988, Lam first introduced the geometric process to describe the optimal replacement problem. The objective was to replace the failed system depending upon the optimal working age or the time of the Nth failure. Zhang (1999) studied a cold standby repairable system having two identical components and a single repairman. He maximized the long-run expected reward per unit time to design the optimal replacement policy based on the number of repairs of a component. Zhang and Wang (2007) discussed the optimal replacement policy based on the working age of a component in a cold standby repairable system consisting of two dissimilar components, wherein one component is given priority in use. Later, Zhang and Wang (2009) examined the same system to provide the optimal replacement policy based on the number of repairs of a component. Lin, Jian, Zhijun, and Bo (2011) presented a condition-based replacement policy for a cold standby repairable system under the impact of imperfect maintenance actions.
Most recently, an optimal replacement policy for a repairable deteriorating system subject to random shocks following generalized d-shock process has been proposed by Zong, Chai, Zhang, and Zhao (in press). In the studies mentioned above, it is considered that the repairman is always available to repair the failed system but in practice it is not true. It can be seen in many real time repairable systems that the repairman may perform some secondary jobs or might take a sequence of vacations of random durations when he is idle so as to make time utilization or improve the profit of the system. The vacation of the repairman may be of different durations; such a vacation policy is termed as multiple vacation policy and has been considered by many researchers to facilitate the optimal replacement policy concerned with the reliability analysis. Zhang (2008) presented a geometrical process repair model for a simple repairable system with delayed repair. This work has been extended by Zhao and Yue (2011) by incorporating the multiple delayed vacations. A replacement policy for a repairable system where the repairman takes multiple vacations was given by Jia and Wu (2009) and Yuan and Xu (2011a, 2011b). An extended replacement policy for a deteriorating system including catastrophic and noncatastrophic failure modes was suggested by Zhang and Wang (2011). A significant contribution has been made by Cheng, Li, and Tang (2012). They presented the optimal replacement policy for a deteriorating series repairable system. Recently, Yu, Tang, Liu, and Cheng (2013) presented optimal maintenance policy by developing a phase-type geometric process repair model with spare device procurement and repairman’s multiple vacations. In some papers, the concept of random repairman vacation time is also referred to as random inter-inspection time dealing with the repairable systems. 
Naidu and Gopalan (1983) dealt with the analysis of two-unit repairable system with two types of failures subject to random inspection. Badia, Berrade, and Campos (2002) discussed the reliability characteristics of a single unit system with revealed and unrevealed failures subject to optimal inspection and preventive maintenance having the aim of minimizing the cost per
unit of time over an infinite time span by selecting a unique interval for both inspection and maintenance. Sugiura, Mizutani, and Nakagawa (2006) considered the random inspection policy and discussed the optimal checking time which minimizes the expected cost. Replacement and inspection policies for products were considered by Yun and Nakagawa (2010). They obtained the total expected maintenance cost of products with a random life cycle and the optimal replacement/inspection interval minimizing the cost. Huynh, Barros, Berenguer, and Castro (2011) suggested condition-based periodic inspection and replacement policies for single-unit systems subject to competing and dependent failures due to deterioration and traumatic shock events. In this context, Le and Tan (2013) investigated an inspection maintenance scheme for deteriorating systems based on an iterative algorithm that minimizes the mean long-run cost rate. From the literature, it is noticed that the study of repairable systems has mainly focused on the system's behavior as determined by its failure and repair characteristics, without taking the repairman's vacations into account. We have not found any research work on a repairable system with vacations that considers system recovery/repair according to fault severity. If a little effort is sufficient to recover the system, it is pointless to provide a full system repair or replacement, which may ultimately lead to a loss of money for the system organizers. For faults of high severity, several recovery procedures may be provided. It is therefore important to consider the system's failure/repair behavior as a function of fault severity. This paper extends the investigation of Yuan and Xu (2011a) by incorporating system failures due to major or minor faults, whose recovery is carried out according to their level of coverage.
Moreover, the concept of imperfect fault coverage in the system is also taken into consideration since the effect of this factor may reflect many real time applications so as to implement the cost-effective design policies. Practical justification of the model: A number of realistic situations arise in software embedded system that can be modeled and analyzed by using recovery/repair. A well planned recovery/repair process is required to improve the system performance at optimum cost. In such systems more profit can be achieved when the repairman in the system might take some secondary jobs/tasks. For illustration, we cite a telecommunication switching system consisting of a group of processors which is implemented based on Clustered Computing Architecture and that uses the Lucent Technologies Reliable Clustered Computing (RCC) product. All the processors are interconnected through a LAN hub that provides connectivity with other elements of the switching system. In this system, there are power dogs (PD) attached to each processor that can power cycle or power down the processor. A watch dog (WD) connected to all the power dogs monitors the functioning from each processor and gives the alert if it detects a processor failure. If the technical engineer finds an error in the system due to minor fault, for example failure of network connections, he immediately starts recovery and removes the error. When a major fault is detected in a processor, recovery is completed into several levels. The diagnostics program that is responsible for the monitoring, maintaining and recovering/repairing of the fault of the system, may be unable to identify the fault whether it is hardware fault or software fault occurred in the processor. In case of fault, if the technical engineer is available, he tries to apply a sequence of recovery actions from least severe to most severe until recovery is successful. 
At the initial level 1, another working processor switches over the failed processor. If this level does not resolve the fault then the most frequent level 2 ‘process start’ is applied. In case if the problem is not removed at level 2 from the processor, the most severe recovery level 3 is applied that is ‘processor restart after data reload from disk’ which may affect a single processor or all the processors in the cluster. Further, sometimes an engineer
may allow changing the order of recovery procedures or may skip some recovery levels for any specific failure depending on the nature of the failure. The rest of the paper is organized as follows. Section 2 describes the mathematical model by stating the requisite notations and assumptions. Section 3 facilitates the system of governing equations and its analysis to obtain the transient probabilities in terms of Laplace transform. Section 4 is devoted to derive some explicit expressions for the system performance measures. To examine the expected profit per unit time of the system, optimal replacement policy is suggested in Section 5. Finally concluding remarks are given in Section 6.
2. Model description
We consider a repairable system with a single repairman who may take multiple vacations. The system deteriorates with time: it is new initially, but once faults occur it can no longer be repaired to an 'as good as new' condition. System failures are caused by faults that are classified as major or minor. When the system fails due to a minor fault, it is recovered perfectly by the repairman. When it fails due to a major fault, the recovery process is imperfect: the repairman repairs the system in multiple stages, but the system cannot be restored to an 'as good as new' condition. The time interval between the completion of the $(n-1)$th repair and the completion of the $n$th repair of the system is called the $n$th cycle of the system, where $n \ge 1$. After completing a successful repair, the repairman goes on vacation.

The state of the system at time $t$ is described by

\[
I(t) =
\begin{cases}
0, & \text{if the system is in working state at time } t \\
1, & \text{if the system is in failed state due to major faults and the repairman is on vacation} \\
2, & \text{if the system is in failed state due to minor faults and the repairman is on vacation} \\
m, & \text{if the system is failed and the repairman is repairing the failed system for the } m\text{th level recovery},\ 3 \le m \le l
\end{cases}
\]

Let the working state of the system be $W = \{0\}$ and let the set $F = \{1, 2, 3, \ldots, l\}$ represent the failure states. To study the system, the following notations and assumptions are made.

Notations:
$X_n(t)$: Random variable denoting the working time of the system in the $n$th ($n \ge 1$) cycle at time $t$
$Y_m(t)$: Random variable denoting the repair time for the $m$th ($3 \le m \le l$) level recovery at time $t$
$Z_n(t)$: Random variable denoting the vacation time of the repairman in the $n$th ($n \ge 1$) cycle at time $t$
$Z_v^n(t)$: Random variable denoting the $v$th ($v \ge 1$) vacation time of the repairman in the $n$th ($n \ge 1$) cycle at time $t$
$V_n(t)$: Random variable denoting the waiting time of the system after the occurrence of the $n$th ($n \ge 1$) failure at time $t$, s.t. $E(V_n) = \tau < +\infty$
$S(t)$: Number of cycles in the interval $(0, t)$
$F_n(t)$: Distribution function of the working time in the $n$th ($n \ge 1$) cycle
$G_m(t)$: Distribution function of the repair time for the $m$th ($3 \le m \le l$) level recovery; $\bar{G}_m(t) = 1 - G_m(t)$
$H_n(t)$: Distribution function of the vacation time in the $n$th ($n \ge 1$) cycle
$H_v^n(t)$: Distribution function of the $v$th ($v \ge 1$) vacation time in the $n$th ($n \ge 1$) cycle
$W_n(t)$: Distribution function of the waiting time after the occurrence of the $n$th ($n \ge 1$) failure at time $t$
$g_m(t)$: PDF of the repair time for the $m$th ($3 \le m \le l$) level recovery
$g_m^*(s)$: Laplace transform of $g_m(t)$
$a_1$ ($a_2$): Ratio of the geometric process of the system working time distribution for major (minor) faults
$b_1$ ($b_2$): Ratio of the geometric process of the repairman's vacation time distribution for major (minor) faults
$\mu_m(y)$: Repair rate to complete the $m$th ($3 \le m \le l$) level recovery
$\lambda_1$ ($\lambda_2$): Mean failure rate of the system due to major (minor) faults
$\theta_1$ ($\theta_2$): Mean vacation rate of the repairman recovering a system that has failed due to major (minor) faults
$C_1$: Repair cost per unit time of the system
$C_2$: Reward per unit time when the system is working
$C_3$: Reward per unit time when the repairman is on vacation
$C_4$: Cost per unit time incurred in waiting for the repair after the system has failed
$C_5$: Cost of replacement by a new system
$N$: Number of repairs before replacement
$P_{i,n}(t)$: Probability that the system is in state $i$ ($i = 0, 1, 2, 3, \ldots, l$) at time $t$ in the $n$th ($n \ge 1$) cycle
$Q_{i,n}(t)$: Probability that the system is in state $i$ ($i = 0, 1, 2, 3, \ldots, l$) at time $t$ in the $n$th ($n \ge 1$) cycle without reaching the absorbing states
$P_{j,n}(t, y)\,dy$: Joint probability that the system is in state $j$ ($j = 3, 4, 5, \ldots, l$) at time $t$ and the elapsed repair time lies in $(y, y + dy)$ in the $n$th ($n \ge 1$) cycle

Assumptions:
- At time $t = 0$, the system is new and starts to work.
- The repairman recovers the system perfectly if the system fails due to minor faults.
- In case of major faults, the recovery levels $m$ ($3 \le m \le l$) are applied sequentially by the repairman to recover the system. If the recovery procedure at the $m$th ($3 \le m \le l-1$) level is imperfect (i.e. unsuccessful), which happens with probability $p_{m-2}$, then the next, $(m+1)$th, level recovery is performed by the repairman to repair the system.
- If the repairman identifies a specific failure due to major faults, he applies the recovery procedure of the $m$th ($3 \le m \le l$) level directly with probability $q_{m-2}$, without following the sequential recovery levels, s.t. $q = \sum_{i=1}^{l-2} q_i$.
- The repairman takes his first vacation after the system has started. On returning from a vacation, if the repairman finds the system failed and waiting for repair, he immediately starts the repair to recover from the faults, which may be either major or minor, and continues until the recovery is completed. If there is no fault in the system when he returns from the first vacation, the repairman immediately takes his second vacation, and so on.
- The working time of the system and the vacation time of the repairman are considered to be exponentially distributed. The repair times are assumed to be independent and identically distributed (i.i.d.) with a general distribution.
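The sequential recovery rule for major faults can be illustrated with a short calculation. The following Python sketch is a minimal illustration and not the authors' method: it assumes that recovery always starts at the first level (the level-skipping probabilities $q_m$ are ignored) and that the final level always succeeds. The function name `completion_level_distribution` and the example probabilities are hypothetical.

```python
from typing import List


def completion_level_distribution(p_imperfect: List[float]) -> List[float]:
    """Probability that a sequential recovery finishes at each level.

    p_imperfect[i] is the (assumed) probability that recovery level i+1 is
    imperfect, so that the next level has to be tried. One more level than
    len(p_imperfect) exists, and it is assumed to always succeed.
    """
    distribution = []
    reach = 1.0  # probability that the recovery reaches the current level
    for p in p_imperfect:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
        distribution.append(reach * (1.0 - p))  # recovery succeeds at this level
        reach *= p                              # otherwise escalate to the next level
    distribution.append(reach)  # the last level absorbs the remaining probability
    return distribution


if __name__ == "__main__":
    # Hypothetical values: three recovery levels, two imperfection probabilities.
    dist = completion_level_distribution([0.2, 0.5])
    for level, prob in enumerate(dist, start=1):
        print(f"recovery completes at level {level}: {prob:.3f}")
```

The returned probabilities always sum to one, which gives a quick sanity check on any chosen set of coverage probabilities.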
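The working and vacation times of the system are described by continuous-state stochastic processes that form geometric processes. The cumulative distribution functions of the working time and the vacation time in the $n$th cycle are given by

\[ F_n(t) = F(a_i^{n-1} t) = 1 - e^{-a_i^{n-1}\lambda_i t}, \qquad a_i \ge 1,\ n \ge 1,\ i = 1, 2 \]

\[ H_n(t) = H(b_i^{n-1} t) = 1 - e^{-b_i^{n-1}\theta_i t}, \qquad b_i \ge 1,\ n \ge 1,\ i = 1, 2 \]

The repair time distribution of the system is given by

\[ G_m(t) = \int_0^t g_m(y)\,dy = 1 - \exp\left(-\int_0^t \mu_m(y)\,dy\right), \qquad 3 \le m \le l \]

and $\int_0^\infty t\,g_m(t)\,dt = \mu_m^{-1}$.

3. Governing equations and analysis
The stochastic process $\{I(t),\ t \ge 0\}$ is non-Markovian because the repair time follows a general distribution. The supplementary variable technique is employed so that the process $\{I(t),\ y \le Y_m(t) \le y + dy,\ S(t) = k;\ t \ge 0,\ k \ge 1\}$ constitutes a generalized Markov process. We now construct the differential equations governing the model, taking the elapsed repair time as the supplementary variable.

For the first cycle, i.e. $k = 1$:

\[ \left(\frac{d}{dt} + \lambda_1 + \lambda_2 + \theta_1 + \theta_2\right) P_{0,1}(t) = 0 \tag{1} \]

\[ \left(\frac{d}{dt} + \theta_1 q_1 + \theta_1 q_2 + \cdots + \theta_1 q_{l-2}\right) P_{1,1}(t) = \lambda_1 P_{0,1}(t) \tag{2} \]

\[ \left(\frac{d}{dt} + \theta_2\right) P_{2,1}(t) = \lambda_2 P_{0,1}(t) \tag{3} \]

\[ \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial y} + \mu_1(y)\right) P_{3,1}(t, y) = \theta_1 q_1 P_{1,1}(t) + \theta_2 P_{2,1}(t) \tag{4} \]

\[ \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial y} + \mu_{j-2}(y)\right) P_{j,1}(t, y) = 0, \qquad j = 4, 5, \ldots, l \tag{5} \]

For the remaining cycles, i.e. $k \ge 2$:

\[ \left(\frac{d}{dt} + a_1^{k-1}\lambda_1 + a_2^{k-1}\lambda_2 + b_1^{k-1}\theta_1 + b_2^{k-1}\theta_2\right) P_{0,k}(t) = \sum_{i=1}^{l-3} \bar{p}_i \int_0^\infty \mu_i(y) P_{i+2,k-1}(t, y)\,dy + \int_0^\infty \mu_{l-2}(y) P_{l,k-1}(t, y)\,dy \tag{6} \]

\[ \left(\frac{d}{dt} + b_1^{k-1}\theta_1 q\right) P_{1,k}(t) = a_1^{k-1}\lambda_1 P_{0,k}(t) \tag{7} \]

\[ \left(\frac{d}{dt} + b_2^{k-1}\theta_2\right) P_{2,k}(t) = a_2^{k-1}\lambda_2 P_{0,k}(t) \tag{8} \]

\[ \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial y} + \mu_1(y)\right) P_{3,k}(t, y) = b_1^{k-1}\theta_1 q_1 P_{1,k}(t) + b_2^{k-1}\theta_2 P_{2,k}(t) \tag{9} \]

\[ \left(\frac{\partial}{\partial t} + \frac{\partial}{\partial y} + \mu_{j-2}(y)\right) P_{j,k}(t, y) = 0, \qquad j = 4, 5, \ldots, l \tag{10} \]

The boundary conditions are

\[ P_{3,k}(t, 0) = b_1^{k-1}\theta_1 q_1 P_{1,k}(t) + b_2^{k-1}\theta_2 P_{2,k}(t), \qquad k \ge 1 \tag{11} \]

\[ P_{j,k}(t, 0) = p_{j-3} \int_0^\infty \mu_{j-3}(y) P_{j-1,k}(t, y)\,dy + b_1^{k-1}\theta_1 q_{j-2} P_{1,k}(t), \qquad j = 4, 5, \ldots, l;\ k \ge 1 \tag{12} \]

The initial conditions are defined as

\[ P_{i,k}(0) = \begin{cases} 1, & \text{when } i = 0 \text{ and } k = 1 \\ 0, & \text{otherwise} \end{cases} \]

Taking Laplace transforms of Eqs. (1)–(12), respectively, we get

\[ (s + \lambda_1 + \lambda_2 + \theta_1 + \theta_2) P^*_{0,1}(s) = 1 \tag{13} \]

\[ (s + \theta_1 q) P^*_{1,1}(s) = \lambda_1 P^*_{0,1}(s) \tag{14} \]

\[ (s + \theta_2) P^*_{2,1}(s) = \lambda_2 P^*_{0,1}(s) \tag{15} \]

\[ \left(s + \frac{d}{dy} + \mu_1(y)\right) P^*_{3,1}(s, y) = \theta_1 q_1 P^*_{1,1}(s) + \theta_2 P^*_{2,1}(s) \tag{16} \]

\[ \left(s + \frac{d}{dy} + \mu_{j-2}(y)\right) P^*_{j,1}(s, y) = 0, \qquad j = 4, 5, \ldots, l \tag{17} \]

\[ (s + a_1^{k-1}\lambda_1 + a_2^{k-1}\lambda_2 + b_1^{k-1}\theta_1 + b_2^{k-1}\theta_2) P^*_{0,k}(s) = \sum_{i=1}^{l-3} \bar{p}_i \int_0^\infty \mu_i(y) P^*_{i+2,k-1}(s, y)\,dy + \int_0^\infty \mu_{l-2}(y) P^*_{l,k-1}(s, y)\,dy \tag{18} \]

\[ (s + b_1^{k-1}\theta_1 q) P^*_{1,k}(s) = a_1^{k-1}\lambda_1 P^*_{0,k}(s) \tag{19} \]

\[ (s + b_2^{k-1}\theta_2) P^*_{2,k}(s) = a_2^{k-1}\lambda_2 P^*_{0,k}(s) \tag{20} \]

\[ \left(s + \frac{d}{dy} + \mu_1(y)\right) P^*_{3,k}(s, y) = b_1^{k-1}\theta_1 q_1 P^*_{1,k}(s) + b_2^{k-1}\theta_2 P^*_{2,k}(s) \tag{21} \]

\[ \left(s + \frac{d}{dy} + \mu_{j-2}(y)\right) P^*_{j,k}(s, y) = 0, \qquad j = 4, 5, \ldots, l \tag{22} \]

\[ P^*_{3,k}(s, 0) = b_1^{k-1}\theta_1 q_1 P^*_{1,k}(s) + b_2^{k-1}\theta_2 P^*_{2,k}(s), \qquad k \ge 1 \tag{23} \]

\[ P^*_{j,k}(s, 0) = p_{j-3} \int_0^\infty \mu_{j-3}(y) P^*_{j-1,k}(s, y)\,dy + b_1^{k-1}\theta_1 q_{j-2} P^*_{1,k}(s), \qquad j = 4, 5, \ldots, l;\ k \ge 1 \tag{24} \]

Solving Eqs. (13)–(24) with the help of the initial conditions, we obtain

\[ P^*_{0,k}(s) = C_k(s) P^*_{0,1}(s) \tag{25} \]

\[ P^*_{1,k}(s) = \left(\frac{a_1^{k-1}\lambda_1}{s + b_1^{k-1}\theta_1 q}\right) C_k(s) P^*_{0,1}(s) \tag{26} \]

\[ P^*_{2,k}(s) = \left(\frac{a_2^{k-1}\lambda_2}{s + b_2^{k-1}\theta_2}\right) C_k(s) P^*_{0,1}(s) \tag{27} \]

\[ P^*_{3,k}(s, y) = \left(\frac{b_1^{k-1}\theta_1 q_1 a_1^{k-1}\lambda_1}{s + b_1^{k-1}\theta_1 q} + \frac{b_2^{k-1}\theta_2 a_2^{k-1}\lambda_2}{s + b_2^{k-1}\theta_2}\right) \left(\frac{1}{\bar{G}_1^*(s)} + 1\right) e^{-sy}\,\bar{G}_1(y)\, C_k(s) P^*_{0,1}(s) \tag{28} \]

\[
\begin{aligned}
P^*_{j,k}(s, y) = \Bigg[ & \left(\frac{b_1^{k-1}\theta_1 q_1 a_1^{k-1}\lambda_1}{s + b_1^{k-1}\theta_1 q} + \frac{b_2^{k-1}\theta_2 a_2^{k-1}\lambda_2}{s + b_2^{k-1}\theta_2}\right) \left(\frac{1}{\bar{G}_1^*(s)} + 1\right) \prod_{m=1}^{j-3} p_m g_m^*(s) \\
& + \left(\frac{b_1^{k-1}\theta_1 a_1^{k-1}\lambda_1}{s + b_1^{k-1}\theta_1 q}\right) \left(\sum_{r=2}^{j-3} q_r \prod_{m=r}^{j-3} p_m g_m^*(s) + q_{j-2}\right) \Bigg] e^{-sy}\,\bar{G}_{j-2}(y)\, C_k(s) P^*_{0,1}(s), \qquad j = 4, 5, \ldots, l
\end{aligned}
\tag{29}
\]

where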
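\[
\begin{aligned}
C_k(s) = \prod_{n=2}^{k} \Bigg[ \Bigg[ & \left(\frac{b_1^{n-2}\theta_1 q_1 a_1^{n-2}\lambda_1}{s + b_1^{n-2}\theta_1 q} + \frac{b_2^{n-2}\theta_2 a_2^{n-2}\lambda_2}{s + b_2^{n-2}\theta_2}\right) \left(\frac{1}{\bar{G}_1^*(s)} + 1\right) \bar{p}_1 g_1^*(s) \\
& + \sum_{i=2}^{l-3} \bar{p}_i \Bigg\{ \left(\frac{b_1^{n-2}\theta_1 q_1 a_1^{n-2}\lambda_1}{s + b_1^{n-2}\theta_1 q} + \frac{b_2^{n-2}\theta_2 a_2^{n-2}\lambda_2}{s + b_2^{n-2}\theta_2}\right) \left(\frac{1}{\bar{G}_1^*(s)} + 1\right) \prod_{m=1}^{i-1} p_m g_m^*(s) \\
& \qquad + \left(\frac{b_1^{n-2}\theta_1 a_1^{n-2}\lambda_1}{s + b_1^{n-2}\theta_1 q}\right) \left(\sum_{r=2}^{i-1} q_r \prod_{m=r}^{i-1} p_m g_m^*(s) + q_i\right) \Bigg\} g_i^*(s) \\
& + \Bigg\{ \left(\frac{b_1^{n-2}\theta_1 q_1 a_1^{n-2}\lambda_1}{s + b_1^{n-2}\theta_1 q} + \frac{b_2^{n-2}\theta_2 a_2^{n-2}\lambda_2}{s + b_2^{n-2}\theta_2}\right) \left(\frac{1}{\bar{G}_1^*(s)} + 1\right) \prod_{m=1}^{l-3} p_m g_m^*(s) \\
& \qquad + \left(\frac{b_1^{n-2}\theta_1 a_1^{n-2}\lambda_1}{s + b_1^{n-2}\theta_1 q}\right) \left(\sum_{r=2}^{l-3} q_r \prod_{m=r}^{l-3} p_m g_m^*(s) + q_{l-2}\right) \Bigg\} g_{l-2}^*(s) \Bigg] \\
& \times \frac{1}{s + a_1^{n-1}\lambda_1 + a_2^{n-1}\lambda_2 + b_1^{n-1}\theta_1 + b_2^{n-1}\theta_2} \Bigg]
\end{aligned}
\]

In particular, for $k = 1$ we get

\[ P^*_{0,1}(s) = \frac{1}{s + \lambda_1 + \lambda_2 + \theta_1 + \theta_2} \tag{30} \]

\[ P^*_{1,1}(s) = \frac{\lambda_1}{(s + \theta_1 q)(s + \lambda_1 + \lambda_2 + \theta_1 + \theta_2)} \tag{31} \]

\[ P^*_{2,1}(s) = \frac{\lambda_2}{(s + \theta_2)(s + \lambda_1 + \lambda_2 + \theta_1 + \theta_2)} \tag{32} \]

\[ P^*_{3,1}(s, y) = \frac{1}{s + \lambda_1 + \lambda_2 + \theta_1 + \theta_2} \left(\frac{\theta_1 q_1 \lambda_1}{s + \theta_1 q} + \frac{\theta_2 \lambda_2}{s + \theta_2}\right) \left(\frac{1}{\bar{G}_1^*(s)} + 1\right) e^{-sy}\,\bar{G}_1(y) \tag{33} \]

\[
\begin{aligned}
P^*_{j,1}(s, y) = \frac{1}{s + \lambda_1 + \lambda_2 + \theta_1 + \theta_2} \Bigg\{ & \left(\frac{\theta_1 q_1 \lambda_1}{s + \theta_1 q} + \frac{\theta_2 \lambda_2}{s + \theta_2}\right) \left(\frac{1}{\bar{G}_1^*(s)} + 1\right) \prod_{m=1}^{j-3} p_m g_m^*(s) \\
& + \left(\frac{\theta_1 \lambda_1}{s + \theta_1 q}\right) \left(\sum_{r=2}^{j-3} q_r \prod_{m=r}^{j-3} p_m g_m^*(s) + q_{j-2}\right) \Bigg\} e^{-sy}\,\bar{G}_{j-2}(y), \qquad j = 4, 5, \ldots, l
\end{aligned}
\tag{34}
\]

Taking inverse Laplace transforms of Eqs. (30)–(32), we obtain

\[ P_{0,1}(t) = e^{-(\lambda_1 + \lambda_2 + \theta_1 + \theta_2)t} \tag{35} \]

\[ P_{1,1}(t) = \frac{\lambda_1 \left\{ e^{-q\theta_1 t} - e^{-(\lambda_1 + \lambda_2 + \theta_1 + \theta_2)t} \right\}}{\lambda_1 + \lambda_2 + (1 - q)\theta_1 + \theta_2} \tag{36} \]

\[ P_{2,1}(t) = \frac{\lambda_2 \left\{ e^{-\theta_2 t} - e^{-(\lambda_1 + \lambda_2 + \theta_1 + \theta_2)t} \right\}}{\lambda_1 + \lambda_2 + \theta_1} \tag{37} \]

The reliability indices in Laplace transform form may still be useful for various practical applications, and they can be evaluated by selecting particular parameter values as required by the application. The solutions of (33) and (34) can be obtained for the case $k = 1$ by taking the inverse Laplace transform once a particular repair time distribution (e.g. exponential, gamma, deterministic or Pareto) is selected according to the practical situation. For $k > 1$, the inverse Laplace transforms of (25)–(29) are too tedious to obtain, so an explicit solution is not feasible by this approach.

4. Performance measures
In the previous section, we established the system state probabilities in the form of Laplace transforms. Obtaining the transient results for the reliability indices, which are of great interest to reliability engineers for evaluating the system performance, is quite complicated when it requires inverting the Laplace transforms. We therefore derive expressions for some performance characteristics in Laplace transform form to predict the behavior of the system, as follows.

4.1. System reliability
Since all the system states except 'state 0' are considered as absorbing states, the stochastic process $\{\tilde{I}(t),\ Y_m(t),\ S(t);\ t \ge 0\}$ constitutes a new generalized Markov process which has a total of $l$ absorbing states. Let $Q_{i,k}(t) = \Pr\{\tilde{I}(t) = i,\ S(t) = k\}$ for $i = 0, 1, 2, 3, \ldots, l$ and $k \ge 1$. Then the system reliability is given by

\[ R(t) = \Pr\{\text{the working time of the system} > t\} = \sum_{k=1}^{\infty} Q_{0,k}(t) \tag{38} \]

The Laplace transforms of the differential equations of the system are

\[ (s + \lambda_1 + \lambda_2 + \theta_1 + \theta_2) Q^*_{0,1}(s) = 1 \tag{39} \]

\[ (s + a_1^{k-1}\lambda_1 + a_2^{k-1}\lambda_2 + b_1^{k-1}\theta_1 + b_2^{k-1}\theta_2) Q^*_{0,k}(s) = 0, \qquad k \ge 2 \tag{40} \]

The Laplace transform of the system reliability is obtained as

\[ R^*(s) = \sum_{k=1}^{\infty} Q^*_{0,k}(s) = \frac{1}{s + \lambda_1 + \lambda_2 + \theta_1 + \theta_2} \tag{41.a} \]

Taking the inverse Laplace transform, we get

\[ R(t) = e^{-(\lambda_1 + \lambda_2 + \theta_1 + \theta_2)t} \tag{41.b} \]

4.2. System MTTF
The system mean time-to-failure (MTTF) is

\[ \mathrm{MTTF} = \lim_{s \to 0} R^*(s) = \frac{1}{\lambda_1 + \lambda_2 + \theta_1 + \theta_2} \tag{42} \]

4.3. System availability
The availability of the system in terms of the Laplace transform is

\[ A^*(s) = \sum_{k=1}^{\infty} P^*_{0,k}(s) = P^*_{0,1}(s) + \sum_{k=2}^{\infty} C_k(s) P^*_{0,1}(s) \tag{43} \]

4.4. System failure frequency
The Laplace transform of the failure frequency of the system is

\[ \omega_f^*(s) = \sum_{k=1}^{\infty} \left(a_1^{k-1}\lambda_1 + a_2^{k-1}\lambda_2\right) P^*_{0,k}(s) = (\lambda_1 + \lambda_2) P^*_{0,1}(s) + \sum_{k=2}^{\infty} \left(a_1^{k-1}\lambda_1 + a_2^{k-1}\lambda_2\right) C_k(s) P^*_{0,1}(s) \tag{44} \]

4.5. Working probability of the repairman
The Laplace transform of the probability of the repairman being in the working state can be obtained as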
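\[
\begin{aligned}
P_W^*(s) = \sum_{k=1}^{\infty} \sum_{i=3}^{l} P^*_{i,k}(s) = \sum_{k=1}^{\infty} \Bigg[ & \left(\frac{b_1^{k-1}\theta_1 q_1 a_1^{k-1}\lambda_1}{s + b_1^{k-1}\theta_1 q} + \frac{b_2^{k-1}\theta_2 a_2^{k-1}\lambda_2}{s + b_2^{k-1}\theta_2}\right) \left(\frac{1}{\bar{G}_1^*(s)} + 1\right) \int_0^\infty e^{-sy}\,\bar{G}_1(y)\,dy \\
& + \sum_{i=4}^{l} \Bigg\{ \left(\frac{b_1^{k-1}\theta_1 q_1 a_1^{k-1}\lambda_1}{s + b_1^{k-1}\theta_1 q} + \frac{b_2^{k-1}\theta_2 a_2^{k-1}\lambda_2}{s + b_2^{k-1}\theta_2}\right) \left(\frac{1}{\bar{G}_1^*(s)} + 1\right) \prod_{m=1}^{i-3} p_m g_m^*(s) \\
& \qquad + \left(\frac{b_1^{k-1}\theta_1 a_1^{k-1}\lambda_1}{s + b_1^{k-1}\theta_1 q}\right) \left(\sum_{r=2}^{i-3} q_r \prod_{m=r}^{i-3} p_m g_m^*(s) + q_{i-2}\right) \Bigg\} \int_0^\infty e^{-sy}\,\bar{G}_{i-2}(y)\,dy \Bigg] C_k(s) P^*_{0,1}(s)
\end{aligned}
\tag{45}
\]

5. Optimal replacement policy
In this section, we determine the optimal value of the parameter $N$ by maximizing the expected profit per unit time at equilibrium. To design the optimal policy, we make some further assumptions, as follows (cf. Jia & Wu, 2009):

(i) The waiting times of the system after the occurrence of the $n$th failure are independent and identically distributed (i.i.d.).
(ii) We replace the system by a new one when it has failed for the $N$th time, instead of repairing it. The time to replace the failed system by a new one is assumed to be negligible.
(iii) When the repairman returns from his vacation, he starts the recovery of faults immediately with probability $\alpha$, whether the faults are major or minor.

The expected long-run profit for the model under assumptions (i)–(iii) is evaluated as follows.

Theorem. The expected long-run profit per unit time is given by

\[
EC(N) = \frac{(C_2 + C_3) \sum_{n=1}^{N} \left(\dfrac{1}{a_1^{n-1}\lambda_1} + \dfrac{1}{a_2^{n-1}\lambda_2}\right) - C_1 \sum_{n=1}^{N-1} \sum_{m=1}^{l} \dfrac{1}{\mu_m} - C_4 (N-1)(1-\alpha)\tau - C_5}{\sum_{n=1}^{N} \left(\dfrac{1}{a_1^{n-1}\lambda_1} + \dfrac{1}{a_2^{n-1}\lambda_2}\right) + \sum_{n=1}^{N-1} \sum_{m=1}^{l} \dfrac{1}{\mu_m} + (N-1)(1-\alpha)\tau}
\tag{46}
\]

Proof. The expected vacation time of the repairman in a cycle is given by (cf. Jia & Wu, 2009)

\[ EV = \sum_{n=1}^{N} \int_0^\infty \frac{1}{b_i^{n-1}\theta_i} \left(\sum_{v=1}^{\infty} H_v^n(t)\right) dF(a_i^{n-1} t) \tag{47} \]

where the subscript $i \in \{1, 2\}$ denotes the fault type. Now consider the probability distribution of $\sum_{j=1}^{v} Z_j^n$, which follows a gamma distribution with scale parameter $1/(b_i^{n-1}\theta_i)$ and shape parameter $v$. Since the $Z_j^n$ are independent and identically exponentially distributed, we have

\[ H_v^n(t) = \int_0^t \frac{(b_i^{n-1}\theta_i)^{v}\, x^{v-1}}{(v-1)!} \exp(-b_i^{n-1}\theta_i x)\,dx \tag{48} \]

Therefore,

\[
\begin{aligned}
\sum_{v=1}^{\infty} H_v^n(t) &= \int_0^t \left\{ \sum_{v=1}^{\infty} \frac{(b_i^{n-1}\theta_i)^{v}\, x^{v-1}}{(v-1)!} \exp(-b_i^{n-1}\theta_i x) \right\} dx \\
&= \int_0^t \left\{ \left(\sum_{v=1}^{\infty} \frac{(b_i^{n-1}\theta_i x)^{v-1}}{(v-1)!}\right) b_i^{n-1}\theta_i \exp(-b_i^{n-1}\theta_i x) \right\} dx \\
&= \int_0^t b_i^{n-1}\theta_i\,dx = b_i^{n-1}\theta_i t
\end{aligned}
\tag{49}
\]

Using Eq. (49), Eq. (47) becomes

\[ EV = \sum_{n=1}^{N} \int_0^\infty t\,dF(a_i^{n-1} t) = \sum_{n=1}^{N} E X_n = \sum_{n=1}^{N} \left(\frac{1}{a_1^{n-1}\lambda_1} + \frac{1}{a_2^{n-1}\lambda_2}\right) \tag{50} \]

Now, using the renewal reward theorem (cf. Ross, 1996), we obtain

\[
EC(N) = \frac{\text{The expected profit in a renewal cycle}}{\text{The expected length of a renewal cycle}}
= \frac{(C_2 + C_3) \sum_{n=1}^{N} \left(\dfrac{1}{a_1^{n-1}\lambda_1} + \dfrac{1}{a_2^{n-1}\lambda_2}\right) - C_1 \sum_{n=1}^{N-1} \sum_{m=1}^{l} \dfrac{1}{\mu_m} - C_4 (N-1)(1-\alpha)\tau - C_5}{\sum_{n=1}^{N} \left(\dfrac{1}{a_1^{n-1}\lambda_1} + \dfrac{1}{a_2^{n-1}\lambda_2}\right) + \sum_{n=1}^{N-1} \sum_{m=1}^{l} \dfrac{1}{\mu_m} + (N-1)(1-\alpha)\tau}
\]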
Since $a_1 > 1$, $a_2 > 1$ and $0 < b_1, b_2 < 1$, the expected long-run profit per unit time is monotonically increasing (decreasing) when the number $N$ is small (large). Thus there exists an optimal replacement policy $N^*$ which maximizes the profit $EC(N^*)$. □

In particular, if we set the parameters $a_1 = a_2 = a$, $b_1 = b_2 = b$, $\lambda_1 + \lambda_2 = \lambda$, $\theta_1 + \theta_2 = \theta$, $\mu_i(y) = \mu(y)$ for $1 \le i \le l$, and take the coverage as perfect, i.e. $p = \sum_{i=1}^{l-3} p_i = 1$ and $q = \sum_{i=1}^{l-2} q_i = 1$, then the results established in Sections 4 and 5 match the earlier existing results, and our model coincides with the model studied by Jia and Wu (2009) and Yuan and Xu (2011a,b).

6. Numerical results
In this section, we present a sensitivity analysis that illustrates the behavior of the system through a numerical example. The numerical study carried out here supports the proposed model and helps decision makers examine the quantitative effects of varying the basic input parameters on the system performance measures, which is often needed for improving the efficiency of redundant systems. We compute the performance metrics of a repairable system having a single repairman under imperfect fault coverage. The effects of different parameters on the expected long-run profit, the reliability and the mean time-to-failure of the system are explored. The numerical results are summarized in Table 1 and visualized in Figs. 1–3. To obtain the optimal replacement policy and evaluate the other performance measures of the system, we set the default parameters as $\lambda_1 = 0.01$, $\lambda_2 = 0.05$, $\theta_1 = 0.1$, $\theta_2 = 0.2$, $\mu_1 = 1.5$, $\mu_2 = 0.1$, $\mu_3 = 0.2$, $a_1 = 1.001$, $a_2 = 1.002$, $\alpha = 0.05$, $\tau = 0.02$, $C_1$ = $50, $C_2$ = $1,000,000, $C_3$ = $1,000,000, $C_4$ = $1000 and $C_5$ = $100,000. The optimal number of repairs before replacement, i.e. $N^*$, and the corresponding maximum expected long-run profit per unit time of the system, $EC(N^*)$, are computed by varying different cost elements and failure/repair parameters; the numerical results are summarized in Tables 1 and 2, respectively.

Fig. 1(i–iv) illustrates how the expected long-run profit varies with the number of repairs before replacement ($N$) for different values of the parameters $a_1$, $a_2$, $\lambda_1$ and $\lambda_2$, respectively. We observe from Fig. 1(i–ii) that as $N$ increases, $EC(N)$ increases rapidly for initial values of $N$ up to $N = 4$ and thereafter starts to decrease gradually. It is also seen that as we increase the values of $a_1$ and $a_2$, the expected profit of the system decreases. Thus, it is recommended that the system be replaced when $N = 4$. In Fig. 1(iii–iv), it is noticed that $EC(N)$ increases sharply up to $N = 4$ and thereafter decreases slowly before reaching an asymptotically constant value.

Fig. 2(i–iv) displays the trend of the system reliability against time for different values of the parameters $\theta_1$, $\theta_2$, $\lambda_1$ and $\lambda_2$, respectively. It is clear from these figures that the reliability of the system decreases sharply as time increases and then attains an almost constant value. It is noticed from Fig. 2(i–ii) that as the vacation rate of the repairman decreases, the system reliability increases; on the contrary, the system reliability decreases as $\lambda_1$ and $\lambda_2$ increase, which matches our expectation.
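The reliability trends described above follow directly from the closed forms in Eqs. (41.b) and (42). As a minimal sketch (the function names are illustrative and do not come from the paper), the following Python code evaluates $R(t) = e^{-(\lambda_1 + \lambda_2 + \theta_1 + \theta_2)t}$ and the MTTF at the default parameter values listed in this section, and checks the effect of a smaller vacation rate.

```python
import math


def reliability(t, lam1, lam2, theta1, theta2):
    """R(t) = exp(-(lam1 + lam2 + theta1 + theta2) * t), as in Eq. (41.b)."""
    return math.exp(-(lam1 + lam2 + theta1 + theta2) * t)


def mttf(lam1, lam2, theta1, theta2):
    """MTTF = 1 / (lam1 + lam2 + theta1 + theta2), as in Eq. (42)."""
    return 1.0 / (lam1 + lam2 + theta1 + theta2)


if __name__ == "__main__":
    # Default failure and vacation rates from Section 6.
    defaults = dict(lam1=0.01, lam2=0.05, theta1=0.1, theta2=0.2)
    for t in (0.0, 1.0, 5.0, 10.0):
        print(f"R({t:4.1f}) = {reliability(t, **defaults):.4f}")
    print(f"MTTF = {mttf(**defaults):.4f}")

    # Illustrative comparison: halve theta1 and compare R(5) with the default case.
    slower = dict(defaults, theta1=0.05)
    print(reliability(5.0, **slower) > reliability(5.0, **defaults))
```

Because both expressions depend only on the sum of the four rates, lowering any one of them raises the reliability at a fixed time and lengthens the MTTF.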
Table 1
The optimal threshold parameter (N) and corresponding EC(N) for different cost elements (C1–C5 in thousands of $, EC(N) in $).

C1     C2     C3     C4    C5    N     EC(N)
0.05   1000   1000   1.0   100   5     1480.20
0.10   1000   1000   1.0   100   6     1476.70
0.15   1000   1000   1.0   100   6     1473.60
0.05   700    1000   1.0   100   10    1070.50
0.05   800    1000   1.0   100   9     1204.10
0.05   900    1000   1.0   100   8     1339.80
0.05   1000   1200   1.0   100   9     3029.70
0.05   1000   1400   1.0   100   7     3426.80
0.05   1000   1600   1.0   100   5     3834.10
0.05   1000   1000   1.3   100   11    2353.30
0.05   1000   1000   1.5   100   10    2164.60
0.05   1000   1000   1.7   100   8     1977.10
0.05   1000   1000   1.0   120   11    1425.30
0.05   1000   1000   1.0   130   14    1407.30
0.05   1000   1000   1.0   140   15    1391.90
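As a reading aid for Table 1, the rows can be grouped by the single cost element that differs from the default setting (the first row). The sketch below is only an illustration: the helper name `sweep` and the tuple layout are assumptions, while the numbers are copied from Table 1.

```python
# Rows of Table 1 as (C1, C2, C3, C4, C5, N, EC); costs in thousands of $, EC in $.
TABLE_1 = [
    (0.05, 1000, 1000, 1.0, 100, 5, 1480.20),
    (0.10, 1000, 1000, 1.0, 100, 6, 1476.70),
    (0.15, 1000, 1000, 1.0, 100, 6, 1473.60),
    (0.05, 700, 1000, 1.0, 100, 10, 1070.50),
    (0.05, 800, 1000, 1.0, 100, 9, 1204.10),
    (0.05, 900, 1000, 1.0, 100, 8, 1339.80),
    (0.05, 1000, 1200, 1.0, 100, 9, 3029.70),
    (0.05, 1000, 1400, 1.0, 100, 7, 3426.80),
    (0.05, 1000, 1600, 1.0, 100, 5, 3834.10),
    (0.05, 1000, 1000, 1.3, 100, 11, 2353.30),
    (0.05, 1000, 1000, 1.5, 100, 10, 2164.60),
    (0.05, 1000, 1000, 1.7, 100, 8, 1977.10),
    (0.05, 1000, 1000, 1.0, 120, 11, 1425.30),
    (0.05, 1000, 1000, 1.0, 130, 14, 1407.30),
    (0.05, 1000, 1000, 1.0, 140, 15, 1391.90),
]


def sweep(rows, col):
    """Return sorted (cost value, N, EC) for rows that vary only cost column `col`.

    The first row is treated as the default setting; a row is kept when all
    other cost columns (indices 0..4) match that default.
    """
    base = rows[0]
    others = [i for i in range(5) if i != col]
    picked = [r for r in rows if all(r[i] == base[i] for i in others)]
    return sorted((r[col], r[5], r[6]) for r in picked)


if __name__ == "__main__":
    for col, name in enumerate(["C1", "C2", "C3", "C4", "C5"]):
        print(name, sweep(TABLE_1, col))
```

For example, `sweep(TABLE_1, 1)` lists the rows in which only C2 changes, so the trend of N and EC(N) with C2 can be read off directly.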
Fig. 1. Effect of (i) a1, (ii) a2, (iii) λ1, and (iv) λ2 on the expected long run profit.
Fig. 3(i–ii) and Fig. 3(iii–iv) display the effect of the failure rates λ1 and λ2 on the system MTTF for varying values of θ1 and θ2. Fig. 3(i–ii) reveals that as λ1 increases, the system MTTF decreases rapidly for lower values of λ1 and then decreases slowly until it attains an almost constant value. Fig. 3(iii–iv) exhibits a gradual decrease in the system MTTF as λ2 increases; for higher values of λ2, however, it ultimately becomes almost constant. Further, the system MTTF increases for lower values of θ1 and θ2. Overall, we conclude the following. The expected long-run profit per unit time is highly sensitive to small changes in the parameters a1, a2, λ1 and λ2. Small values of a1 and a2 give high profit, as they require fewer replacements. Moreover, it is recommended that the system not be replaced until the number of repairs before replacement reaches its optimal value N. On increasing the parameters θ1 and θ2, the system reliability decreases. The system reliability is higher for smaller values of the failure rates λ1 and λ2, which is what we expect in realistic situations. The system mean time-to-failure is remarkably affected by changes in the parameters λ1 and λ2.

Our observations based on the numerical results may be useful to system analysts for improving the system reliability effectively under the care of a single repairman. The system organizers can achieve greater profit even when the repairman is engaged in other tasks.

Fig. 2. Effect of (i) θ1, (ii) θ2, (iii) λ1, and (iv) λ2 on the system reliability.

Fig. 3. Effect of (i) θ1, (ii) θ2, (iii) λ1, and (iv) λ2 on the system MTTF.

Table 2
The optimal threshold parameter (N) and corresponding EC(N) for different failure and repair rates (EC(N) in $).

λ1      λ2     μ1    μ2     μ3    N     EC(N)
0.008   0.05   1.5   0.1    0.2   7     1385.80
0.009   0.05   1.5   0.1    0.2   6     1432.70
0.01    0.05   1.5   0.1    0.2   5     1480.00
0.01    0.03   1.5   0.1    0.2   16    594.973
0.01    0.04   1.5   0.1    0.2   11    1027.30
0.01    0.05   1.5   0.1    0.2   5     1480.00
0.01    0.05   1.2   0.1    0.2   6     1478.80
0.01    0.05   1.4   0.1    0.2   5     1479.70
0.01    0.05   1.6   0.1    0.2   5     1480.40
0.01    0.05   1.5   0.04   0.2   12    1421.40
0.01    0.05   1.5   0.06   0.2   10    1447.00
0.01    0.05   1.5   0.08   0.2   7     1465.50
0.01    0.05   1.5   0.1    0.2   5     1480.00
0.01    0.05   1.5   0.1    0.3   4     1494.00
0.01    0.05   1.5   0.1    0.4   3     1504.70

7. Conclusion

The provision of repair according to the need of the system, depending upon fault severity, may be one way to reduce the maintenance cost of the system. In this paper, we have proposed a replacement policy for a deteriorating repairable system with a single repairman and imperfect fault coverage. With the help of the geometric process, a non-Markovian model has been developed by incorporating a supplementary variable for the elapsed time of the repair/recovery procedure. The repair of the system is carried out by the server according to the severity of the faults that have occurred and the corresponding recovery levels. The incorporation of the repairman's vacations in our model may help system organizers obtain more profit, as the repairman might take a sequence of vacations. The model presented appears more realistic than other existing models because it incorporates several concepts that were not considered simultaneously in previous studies available in the literature. The cost analysis demonstrates that these measures may further be used to suggest the optimal replacement policy to system developers and analysts for improving the system efficiency and availability to a desired level. The expression derived for the expected profit function of the system can be easily implemented to determine the optimal value of the replacement time, as demonstrated by an illustration.

References

Akhtar, S. (1994). Reliability of k-out of-n: G systems with imperfect fault-coverage. IEEE Transactions on Reliability, 43, 101–106.
Amari, S. V., Dugan, J. B., & Misra, R. B. (1999). Optimal reliability of systems subject to imperfect fault-coverage. IEEE Transactions on Reliability, 48, 275–284.
Amari, S. V., Pham, H., & Dill, G. (2004). Optimal design of k-out of-n: G subsystems subjected to imperfect fault coverage. IEEE Transactions on Reliability, 53, 567–573.
Badia, F. G., Berrade, M. D., & Campos, C. A. (2002). Optimal inspection and preventive maintenance of units with revealed and unrevealed failures. Reliability Engineering & System Safety, 78(2), 157–163.
Cheng, G. Q., Li, L., & Tang, Y. H. (2012). Optimal replacement policy for a deteriorating series repairable system with multi-state. Systems Engineering – Theory & Practice, 32, 1118–1123.
Huynh, K. T., Barros, A., Berenguer, C., & Castro, I. T. (2011). A periodic inspection and replacement policy for systems subject to competing failure modes due to degradation and traumatic events. Reliability Engineering & System Safety, 96(4), 497–508.
Jia, J., & Wu, S. (2009). A replacement policy for a repairable system with its repairman having multiple vacations. Computers & Industrial Engineering, 57, 156–160.
Ke, J. C., Huang, H. I., & Lin, C. H. (2008a). A redundant repairable system with imperfect coverage and fuzzy parameters. Applied Mathematical Modelling, 32, 2839–2850.
Ke, J. C., Lee, S. L., & Hsu, Y. L. (2008b). On a repairable system with detection, imperfect coverage and reboot: Bayesian approach. Simulation Modelling Practice and Theory, 16, 353–367.
Ke, J. C., Su, Z. L., Wang, K. H., & Hsu, Y. L. (2010). Simulation inferences for an availability system with general repair distribution and imperfect fault coverage. Simulation Modelling Practice and Theory, 18, 338–347.
Kuniewski, S. P., Weide, J. A. M., & Noortwijk (2009). Sampling inspection for the evaluation of time-dependent reliability of deteriorating systems under imperfect defect detection. Reliability Engineering & System Safety, 94, 1480–1490.
Le, M. D., & Tan, C. M. (2013). Optimal maintenance strategy of deteriorating system under imperfect maintenance and inspection using mixed inspection scheduling. Reliability Engineering & System Safety, 113, 21–29.
Lin, T., Jian, Y., Zhijun, C., & Bo, G. (2011). Optimal replacement policy for cold standby system. Chinese Journal of Mechanical Engineering, 24, 1–7.
Myers, A. F. (2007). K-out of-n: G system reliability with imperfect fault coverage. IEEE Transactions on Reliability, 56, 464–473.
Myers, A. F., & Rauzy, A. (2008). Assessment of redundant systems with imperfect coverage by means of binary decision diagrams. Reliability Engineering & System Safety, 93, 1025–1035.
Naidu, R. S., & Gopalan, M. N. (1983). Analysis of a two-unit repairable system with random inspection subject to two types of failure. Microelectronics Reliability, 23(3), 449–451.
Peng, R., Mo, H., Xie, M., & Levitin, G. (2013). Optimal structure of multi-state systems with multi-fault coverage. Reliability Engineering & System Safety, 119, 18–25.
Pham, H. (1992). Reliability analysis of a high voltage system with dependent failures and imperfect coverage. Reliability Engineering & System Safety, 37, 25–28.
Ross, S. M. (1996). Stochastic processes (2nd ed.). New York: Wiley.
Sugiura, T., Mizutani, S., & Nakagawa, T. (2006). Optimal random and periodic inspection policies. In H. Pham (Ed.), Reliability modeling in analysis and optimization (9th ed., pp. 393–403). Singapore: World Scientific Publishing Co. Pvt. Ltd.
Wang, K. H., Yen, T. C., & Jian, J. J. (2013). Reliability and sensitivity analysis of a repairable system with imperfect coverage under service pressure condition. Journal of Manufacturing Systems, 32, 357–363.
Xing, L. (2007). Reliability evaluation of phased-mission systems with imperfect fault coverage and common cause failures. IEEE Transactions on Reliability, 56, 58–68.
Xing, L., Amari, S. V., & Wang, C. (2012). Reliability of k-out of-n systems with phased-mission requirements and imperfect fault coverage. Reliability Engineering & System Safety, 94, 1480–1490.
Yu, M., Tang, Y., Liu, L., & Cheng, J. (2013). A phased-type geometric process repair model with spare device procurement and repairman’s multiple vacations. European Journal of Operational Research, 225, 310–323.
Yuan, L., & Xu, J. (2011a). A deteriorating system with its repairman having multiple vacations. Applied Mathematics and Computation, 217, 4980–4989.
Yuan, L., & Xu, J. (2011b). An optimal replacement policy for a repairable system based on its repairman having vacations. Reliability Engineering & System Safety, 96, 868–875.
Yun, W. Y., & Nakagawa, T. (2010). Replacement and inspection policies for products with random life cycle. Reliability Engineering & System Safety, 95(3), 161–165.
Zhang, Y. L. (1999). An optimal geometric process model for a cold standby repairable system. Reliability Engineering and System Safety, 63, 107–110.
Zhang, Y. L. (2008). A geometrical process repair model for a repairable system with delayed repair. Computers and Mathematics with Applications, 55, 1629–1643.
Zhang, Y. L., & Wang, G. J. (2007). A deteriorating cold standby repairable system with priority in use. European Journal of Operational Research, 183, 278–295.
Zhang, Y. L., & Wang, G. J. (2009). A geometric process repair model for a repairable cold standby system with priority in use and repair. Reliability Engineering & System Safety, 94, 1782–1787.
Zhang, Y. L., & Wang, G. J. (2011). An extended replacement policy for a deteriorating system with multi-failure modes. Applied Mathematics and Computation, 218, 1820–1830.
Zhao, B., & Yue, D. (2011). The optimal replacement problem of a repairable system with its repairman having multiple delayed vacations. Journal of Information & Computational Science, 8, 2155–2163.
Zong, S., Chai, G., Zhang, Z. G., & Zhao, L. (2013). Optimal replacement policy for a deteriorating system with increasing repair times. Applied Mathematical Modelling (in press).