Microelectron. Reliab., Vol. 36, No. 10, pp. 1565-1568, 1996
Copyright © 1996 Elsevier Science Ltd
Printed in Great Britain. All rights reserved
0026-2714/96 $9.50+.00
Pergamon
0026-2714(95)00181-6
PROBABILISTIC ANALYSIS OF A TWO-UNIT WARM STANDBY SYSTEM SUBJECT TO HARDWARE AND HUMAN ERROR FAILURES
M. A. W. MAHMOUD
Mathematics Department, Faculty of Science, Al Azhar University, Nasr City, 11884, Cairo, Egypt
and
M. A. ESMAIL
Mathematics Department, Teachers College Riyadh, 11491, P.O. Box 4341, Saudi Arabia
(Received for publication 18 September 1995)

Abstract--Reliability analysis of a two-unit warm standby redundant system under hardware and human error failures is studied. The regenerative point technique is used. Various measures of reliability of the system are derived assuming all time distributions are general. Copyright © 1996 Elsevier Science Ltd.
INTRODUCTION

Various authors [1-4] have studied the stochastic behaviour of some standby redundant systems and derived various measures of their reliability. These papers did not take into account the second type of failure, human error failure, even though 20-30% of failures are due to it (see [5]). Many authors [1, 6-12, 14] have studied the reliability of some standby systems taking into account both hardware and human error failures. All of these papers assume constant failure rates for the hardware and human error time distributions. Recently Mahmoud [13] studied a two-unit cold standby redundant system subject to hardware and human error failures and derived various reliability measures for it. The aim of this paper is a probabilistic analysis of the same system as in [13] under warm standby. All time distributions are assumed to be arbitrary. Reliability measures such as the mean time to system failure, the availability and steady state availability, the expected busy periods, and the net gain of the system are derived. Finally, particular cases are stated.

ASSUMPTIONS

1. The system consists of two identical units.
2. Initially one unit is operating and the other is in standby (warm standby).
3. The online unit suffers two types of failure, namely, hardware and human error failures.
4. The switch is perfect and instantaneous.
5. The hardware failure, human error failure, standby failure and repair times are all arbitrary.
6. After repair, the unit is as good as new.
7. There is one repair facility.

NOTATION

$E_0$    state of the system at $t = 0$
$E$    set of regenerative states
$\bar{E}$    set of non-regenerative states
$F_i(t)$ $(i = 1, 2)$    c.d.f. of hardware and human error failure times, respectively
$F_s(t)$    c.d.f. of standby failure time
$G_i(t)$ $(i = 1, 2)$    c.d.f. of repair times due to hardware and human error failures, respectively
$G_s(t)$    c.d.f. of repair time of standby failure
$q_{ij}(t)$, $Q_{ij}(t)$    p.d.f. and c.d.f. of the time for the system to transit from regenerative state $S_i$ to $S_j$
$q_{ij}^{(k)}(t)$, $Q_{ij}^{(k)}(t)$    p.d.f. and c.d.f. of the time for the system to transit from regenerative state $S_i$ to $S_j$ via the non-regenerative state $S_k \in \bar{E}$
$\Pi_i(t)$    c.d.f. of time to system failure starting from state $S_i \in E$
$m_{ij}$    contribution to mean sojourn time in state $S_i$, when the system transits directly to $S_j$
$m_{ij}^{(k)}$    contribution to mean sojourn time in state $S_i$, when the system transits to $S_j$ via $S_k \in \bar{E}$
$\mu_i$    $\int P[\text{system sojourns in } S_i \text{ for at least time } t]\,dt$
$M_i(t)$    P[system is up initially in state $S_i \in E$ at time $t$ without passing through any other regenerative state or returning to itself through one or more states $\in \bar{E}$]
$AV_i(t)$    P[the system is up at time $t \mid E_0 = S_i \in E$]
$R_i(t)$    P[repairman is busy with repair due to hardware failure, human error failure or standby failure, at time $t$ starting from state $S_i \in E$]
$H(t)$    the expected profit incurred in $[0, t]$
$s$    dummy variable in Laplace transform (LT)
$*$    symbol for LT
©    symbol for ordinary convolution
$\int$    implies $\int_0^\infty$ unless otherwise stated
$u_i$ $(i = 1, 2)$    the mean repair time due to hardware failure and human error failure
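The ASSUMPTIONS listed earlier can be made concrete with a small Monte Carlo sketch of the time to system failure. This is only an illustration under stated assumptions: exponential distributions stand in for the arbitrary distributions of the paper, the rate names (`lam_hw`, `lam_he`, `lam_sb`, `mu_hw`, `mu_he`, `mu_sb`) are hypothetical, and the handling of a failed standby (it is repaired and returns to standby while the online unit keeps running) is a modelling choice of this sketch, not a statement about the analytical model.

```python
import random


def simulate_tsf(rng, lam_hw, lam_he, lam_sb, mu_hw, mu_he, mu_sb):
    """Return one simulated time to system failure (both units down)."""

    def new_operating_life(now):
        # Competing risks for the online unit: hardware (type 1) vs human error (type 2).
        t_hw = rng.expovariate(lam_hw)
        t_he = rng.expovariate(lam_he)
        return (now + t_hw, 1) if t_hw <= t_he else (now + t_he, 2)

    t = 0.0
    op_fail, op_type = new_operating_life(t)   # failure epoch and type of online unit
    sb_fail = t + rng.expovariate(lam_sb)      # failure epoch of the warm standby
    repair_done = None                         # None means the repair facility is idle

    while True:
        if repair_done is None:
            # One unit online, the other in warm standby.
            if op_fail <= sb_fail:
                # Online unit fails; perfect, instantaneous switch to the standby.
                t = op_fail
                rate = mu_hw if op_type == 1 else mu_he
                repair_done = t + rng.expovariate(rate)
                op_fail, op_type = new_operating_life(t)
            else:
                # Standby fails first; the online unit keeps running.
                t = sb_fail
                repair_done = t + rng.expovariate(mu_sb)
        else:
            # One unit online, the other under repair (single repair facility).
            if op_fail <= repair_done:
                return op_fail                 # no spare available: system failure
            # Repair finishes; the repaired unit is as good as new and goes to standby.
            t = repair_done
            repair_done = None
            sb_fail = t + rng.expovariate(lam_sb)


def estimate_mtsf(n_runs=20000, seed=1, **rates):
    """Average many simulated times to system failure."""
    rng = random.Random(seed)
    return sum(simulate_tsf(rng, **rates) for _ in range(n_runs)) / n_runs
```

A call such as `estimate_mtsf(lam_hw=1.0, lam_he=0.5, lam_sb=0.2, mu_hw=10.0, mu_he=10.0, mu_sb=10.0)` gives a rough numerical value against which an analytical mean time to system failure for the same (exponential) special case could be compared.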
Fig. 1. Transition diagram, distinguishing up states, down states and regenerative states.

Symbols for states of the system
$O$ / $Oc_1$ / $Oc_2$    unit in normal mode / in normal mode continued from state $S_1$ / in normal mode continued from state $S_2$.
$Or_1$ / $Or_2$ / $str$    unit is in failure mode due to hardware failure and under repair of type 1 / unit is in failure mode due to human error failure and under repair of type 2 / unit is in failure mode due to standby failure and under repair.
$Owr_1$ / $Owr_2$    waiting for repair of type 1 / waiting for repair of type 2.
$R_1$ / $R_2$ / $stR$    unit is under repair of type 1 continued from earlier state / unit is under repair of type 2 continued from earlier state / unit is under repair due to standby failure continued from earlier state.

Considering these symbols, the system can be in any one of the following states.

$S_0 = (O, s)$    $S_1 = (Or_1, O)$    $S_{c_1} = (Oc_1, s)$
$S_2 = (Or_2, O)$    $S_{c_2} = (Oc_2, s)$    $S_3 = (O, str)$
$S_4 = (Owr_1, R_1)$    $S_5 = (Owr_2, R_1)$    $S_6 = (Owr_1, R_2)$
$S_7 = (Owr_2, R_2)$    $S_8 = (Owr_1, stR)$    $S_9 = (Owr_2, stR)$
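Up states: $S_0$, $S_{c_1}$, $S_{c_2}$, $S_1$, $S_2$ and $S_3$. Down states: $S_4$, $S_5$, $S_6$, $S_7$, $S_8$ and $S_9$. One can see that $S_0$, $S_1$ and $S_2$ are regenerative states, but $S_{c_1}$, $S_{c_2}$, $S_3$, $S_4$, $S_5$, $S_6$, $S_7$, $S_8$ and $S_9$ are non-regenerative states.

TRANSITION PROBABILITIES AND SOJOURN TIMES

Since the transition probabilities are $P_{ij} = Q_{ij}(\infty)$, the non-zero $P_{ij}$'s are (with $\bar{F}(t) = 1 - F(t)$ denoting a survival function)

\[ P_{01} = \int \bar{F}_s(t)\,\bar{F}_2(t)\,dF_1(t), \qquad P_{02} = \int \bar{F}_s(t)\,\bar{F}_1(t)\,dF_2(t), \qquad P_{03} = \int \bar{F}_1(t)\,\bar{F}_2(t)\,dF_s(t), \]

\[ P_{02}^{(3,9)} = \int \Bigl( \int_0^t F_s(x)\,\bar{F}_1(x)\,dF_2(x) \Bigr)\,dG_s(t), \qquad P_{27} = \int \bar{G}_2(t)\,\bar{F}_1(t)\,dF_2(t), \]

\[ P_{11}^{(4)} = P_{14}, \qquad P_{12}^{(5)} = P_{15}, \qquad P_{21}^{(6)} = P_{26}, \qquad P_{22}^{(7)} = P_{27}. \]

[The remaining transition-probability integrals are illegible in this copy.]

It is easy to prove that:

\[ P_{01} + P_{02} + P_{03} = P_{01} + P_{02} + P_{01}^{(3,8)} + P_{02}^{(3,9)} = 1, \]
\[ P_{1c_1} + P_{14} + P_{15} = P_{11}^{(c_1)} + P_{12}^{(c_1)} + P_{11}^{(4)} + P_{12}^{(5)} = 1, \]
\[ P_{2c_2} + P_{26} + P_{27} = P_{21}^{(c_2)} + P_{22}^{(c_2)} + P_{21}^{(6)} + P_{22}^{(7)} = 1, \]
\[ P_{01}^{(3,8)} + P_{02}^{(3,9)} = P_{03}. \]

The mean sojourn times are:

\[ \mu_0 = m_{01} + m_{02} + m_{03} = \int \bar{F}_1(t)\,\bar{F}_2(t)\,\bar{F}_s(t)\,dt, \]
\[ \mu_1 = m_{1c_1} + m_{14} + m_{15} = \int \bar{F}_1(t)\,\bar{F}_2(t)\,\bar{G}_1(t)\,dt, \]
\[ \mu_2 = m_{2c_2} + m_{26} + m_{27} = \int \bar{F}_1(t)\,\bar{F}_2(t)\,\bar{G}_2(t)\,dt. \]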
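As a quick sanity check on the transitions out of $S_0$, the integrals for $P_{01}$, $P_{02}$, $P_{03}$ and $\mu_0$ can be evaluated numerically. The sketch below assumes exponential failure laws purely for illustration (the paper keeps them arbitrary); the rate names and the truncation point `upper` are hypothetical choices. For exponential laws the results should agree with the closed forms $\lambda_i / (\lambda_1 + \lambda_2 + \lambda_s)$ and $1 / (\lambda_1 + \lambda_2 + \lambda_s)$.

```python
import math


def simpson(f, upper, n=20000):
    """Composite Simpson rule for the integral of f over [0, upper]; n must be even."""
    h = upper / n
    total = f(0.0) + f(upper)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0


def transitions_from_s0(lam1, lam2, lam_s, upper=60.0):
    """Return (P01, P02, P03, mu0) for exponential F1, F2, Fs (an illustrative assumption)."""
    def survival(lam):
        return lambda t: math.exp(-lam * t)          # survival function 1 - F(t)

    def density(lam):
        return lambda t: lam * math.exp(-lam * t)    # dF(t) / dt

    s1, s2, ss = survival(lam1), survival(lam2), survival(lam_s)
    f1, f2, fs = density(lam1), density(lam2), density(lam_s)

    p01 = simpson(lambda t: ss(t) * s2(t) * f1(t), upper)   # hardware failure occurs first
    p02 = simpson(lambda t: ss(t) * s1(t) * f2(t), upper)   # human error failure occurs first
    p03 = simpson(lambda t: s1(t) * s2(t) * fs(t), upper)   # standby failure occurs first
    mu0 = simpson(lambda t: s1(t) * s2(t) * ss(t), upper)   # mean sojourn time in S0
    return p01, p02, p03, mu0
```

The three probabilities sum to one up to quadrature error, which mirrors the identity $P_{01} + P_{02} + P_{03} = 1$.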
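MEAN TIME TO SYSTEM FAILURE

The time to system failure (TSF) can be regarded as the first passage to any of the failure states. Employing the arguments of the theory of regenerative processes, we obtain the following recursive relations.

\[ \Pi_0(t) = \bar{F}_1(t)\bar{F}_2(t)\bar{F}_s(t) + q_{03}(t) + q_{01}(t)\,©\,\Pi_1(t) + q_{02}(t)\,©\,\Pi_2(t) \tag{1} \]
\[ \Pi_1(t) = \bar{F}_1(t)\bar{F}_2(t)\bar{G}_1(t) + q_{13}^{(c_1)}(t) + q_{11}^{(c_1)}(t)\,©\,\Pi_1(t) + q_{12}^{(c_1)}(t)\,©\,\Pi_2(t) \tag{2} \]
\[ \Pi_2(t) = \bar{F}_1(t)\bar{F}_2(t)\bar{G}_2(t) + q_{23}^{(c_2)}(t) + q_{21}^{(c_2)}(t)\,©\,\Pi_1(t) + q_{22}^{(c_2)}(t)\,©\,\Pi_2(t). \tag{3} \]

Taking LT for eqns (1-3) and solving for $\Pi_0^*(s)$, we get

\[ \Pi_0^*(s) = N_0(s)/D_0(s), \]

where

\[ D_0(s) = \bigl(1 - q_{11}^{*(c_1)}(s)\bigr)\bigl(1 - q_{22}^{*(c_2)}(s)\bigr) - q_{12}^{*(c_1)}(s)\,q_{21}^{*(c_2)}(s), \]
\[ N_0(s) = (\alpha_1(s) + \alpha(s))\,D_0(s) + (\alpha_2(s) + \beta(s))\bigl\{ q_{01}^*(s)\bigl(1 - q_{22}^{*(c_2)}(s)\bigr) + q_{02}^*(s)\,q_{21}^{*(c_2)}(s) \bigr\} + (\alpha_3(s) + \gamma(s))\bigl\{ q_{02}^*(s)\bigl(1 - q_{11}^{*(c_1)}(s)\bigr) + q_{01}^*(s)\,q_{12}^{*(c_1)}(s) \bigr\}, \]

and

\[ \alpha_1(s) = \int e^{-st}\bar{F}_1(t)\bar{F}_2(t)\bar{F}_s(t)\,dt, \quad \alpha_2(s) = \int e^{-st}\bar{F}_1(t)\bar{F}_2(t)\bar{G}_1(t)\,dt, \quad \alpha_3(s) = \int e^{-st}\bar{F}_1(t)\bar{F}_2(t)\bar{G}_2(t)\,dt, \]
\[ \alpha(s) = q_{03}^*(s), \qquad \beta(s) = q_{13}^{*(c_1)}(s), \qquad \gamma(s) = q_{23}^{*(c_2)}(s). \]

Then

\[ \mathrm{MTSF} = N_0(0)/D_0(0), \tag{4} \]

where

\[ N_0(0) = (\mu_0 + P_{03})\,D_0(0) + \bigl(\mu_1 + P_{13}^{(c_1)}\bigr)\bigl\{ P_{01}\bigl(1 - P_{22}^{(c_2)}\bigr) + P_{02}P_{21}^{(c_2)} \bigr\} + \bigl(\mu_2 + P_{23}^{(c_2)}\bigr)\bigl\{ P_{02}\bigl(1 - P_{11}^{(c_1)}\bigr) + P_{01}P_{12}^{(c_1)} \bigr\} \]

and

\[ D_0(0) = \bigl(1 - P_{11}^{(c_1)}\bigr)\bigl(1 - P_{22}^{(c_2)}\bigr) - P_{12}^{(c_1)}P_{21}^{(c_2)}. \]

AVAILABILITY ANALYSIS

By probabilistic arguments, we get

\[ M_0(t) = \bar{F}_1(t)\bar{F}_2(t)\bar{F}_s(t), \qquad M_1(t) = \bar{F}_1(t)\bar{F}_2(t)\bar{G}_1(t), \qquad M_2(t) = \bar{F}_1(t)\bar{F}_2(t)\bar{G}_2(t). \]

Also, the following recursive relations can be obtained:

\[ AV_0(t) = M_0(t) + q_{03}(t) + \bigl(q_{01}(t) + q_{01}^{(3,8)}(t)\bigr)\,©\,AV_1(t) + \bigl(q_{02}(t) + q_{02}^{(3,9)}(t)\bigr)\,©\,AV_2(t) \tag{5} \]
\[ AV_1(t) = M_1(t) + q_{13}^{(c_1)}(t) + \bigl(q_{11}^{(c_1)}(t) + q_{11}^{(4)}(t)\bigr)\,©\,AV_1(t) + \bigl(q_{12}^{(c_1)}(t) + q_{12}^{(5)}(t)\bigr)\,©\,AV_2(t) \tag{6} \]
\[ AV_2(t) = M_2(t) + q_{23}^{(c_2)}(t) + \bigl(q_{21}^{(c_2)}(t) + q_{21}^{(6)}(t)\bigr)\,©\,AV_1(t) + \bigl(q_{22}^{(c_2)}(t) + q_{22}^{(7)}(t)\bigr)\,©\,AV_2(t). \tag{7} \]

Taking LT for eqns (5-7) and solving for $AV_0^*(s)$, we obtain

\[ AV_0^*(s) = N_1(s)/D_1(s), \]

where

\[ D_1(s) = \bigl(1 - q_{11}^{*(c_1)}(s) - q_{11}^{*(4)}(s)\bigr)\bigl(1 - q_{22}^{*(c_2)}(s) - q_{22}^{*(7)}(s)\bigr) - \bigl(q_{12}^{*(c_1)}(s) + q_{12}^{*(5)}(s)\bigr)\bigl(q_{21}^{*(c_2)}(s) + q_{21}^{*(6)}(s)\bigr), \]

and

\[ N_1(s) = \bigl(M_0^*(s) + q_{03}^*(s)\bigr)D_1(s) + \bigl(M_1^*(s) + q_{13}^{*(c_1)}(s)\bigr)\bigl\{ \bigl(q_{01}^*(s) + q_{01}^{*(3,8)}(s)\bigr)\bigl(1 - q_{22}^{*(c_2)}(s) - q_{22}^{*(7)}(s)\bigr) + \bigl(q_{21}^{*(6)}(s) + q_{21}^{*(c_2)}(s)\bigr)\bigl(q_{02}^*(s) + q_{02}^{*(3,9)}(s)\bigr) \bigr\} + \bigl(M_2^*(s) + q_{23}^{*(c_2)}(s)\bigr)\bigl\{ \bigl(q_{02}^*(s) + q_{02}^{*(3,9)}(s)\bigr)\bigl(1 - q_{11}^{*(c_1)}(s) - q_{11}^{*(4)}(s)\bigr) + \bigl(q_{12}^{*(5)}(s) + q_{12}^{*(c_1)}(s)\bigr)\bigl(q_{01}^*(s) + q_{01}^{*(3,8)}(s)\bigr) \bigr\}. \]

Note that $D_1(0) = 0$. In steady state

\[ AV_0(\infty) = \lim_{s \to 0} s\,AV_0^*(s) = N_1(0)/D_1'(0), \tag{8} \]

where

\[ N_1(0) = \bigl(\mu_1 + P_{13}^{(c_1)}\bigr)\bigl(P_{21}^{(c_2)} + P_{21}^{(6)}\bigr) + \bigl(\mu_2 + P_{23}^{(c_2)}\bigr)\bigl(P_{12}^{(c_1)} + P_{12}^{(5)}\bigr), \]
\[ D_1'(0) = \bigl(P_{12}^{(c_1)} + P_{12}^{(5)}\bigr)\bigl(m_{21}^{(c_2)} + m_{21}^{(6)} + m_{22}^{(c_2)} + m_{22}^{(7)}\bigr) + \bigl(P_{21}^{(c_2)} + P_{21}^{(6)}\bigr)\bigl(m_{11}^{(c_1)} + m_{11}^{(4)} + m_{12}^{(c_1)} + m_{12}^{(5)}\bigr), \]

and

\[ m_{11}^{(c_1)} = \int t\,G_1(t)\,\bar{F}_2(t)\,dF_1(t), \qquad m_{12}^{(c_1)} = \int t\,G_1(t)\,\bar{F}_1(t)\,dF_2(t), \]
\[ m_{11}^{(4)} = \int t \Bigl( \int_0^t \bar{F}_2(x)\,dF_1(x) \Bigr) dG_1(t), \qquad m_{12}^{(5)} = \int t \Bigl( \int_0^t \bar{F}_1(x)\,dF_2(x) \Bigr) dG_1(t), \]
\[ m_{21}^{(c_2)} = \int t\,G_2(t)\,\bar{F}_2(t)\,dF_1(t), \qquad m_{22}^{(c_2)} = \int t\,G_2(t)\,\bar{F}_1(t)\,dF_2(t), \]
\[ m_{21}^{(6)} = \int t \Bigl( \int_0^t \bar{F}_2(x)\,dF_1(x) \Bigr) dG_2(t), \qquad m_{22}^{(7)} = \int t \Bigl( \int_0^t \bar{F}_1(x)\,dF_2(x) \Bigr) dG_2(t). \]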
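After the Laplace transform, eqns (5-7) become three linear equations in $AV_0^*$, $AV_1^*$, $AV_2^*$ at each fixed $s$, and the ratio $N_1(s)/D_1(s)$ is what Cramer's rule gives for $AV_0^*$. The sketch below checks this structure numerically. The short names are hypothetical stand-ins introduced only here: `A0`, `A1`, `A2` for the free terms, `a`, `b` for the coefficients of $AV_1^*$ and $AV_2^*$ in eqn (5), and `c11`, `c12`, `c21`, `c22` for the bracketed coefficients in eqns (6) and (7), all evaluated at one fixed $s > 0$.

```python
def av0_closed_form(A0, A1, A2, a, b, c11, c12, c21, c22):
    """AV0* written as N1 / D1, i.e. Cramer's rule applied to the transformed system."""
    d1 = (1 - c11) * (1 - c22) - c12 * c21
    n1 = (A0 * d1
          + A1 * (a * (1 - c22) + b * c21)
          + A2 * (b * (1 - c11) + a * c12))
    return n1 / d1


def av0_by_iteration(A0, A1, A2, a, b, c11, c12, c21, c22, n_iter=400):
    """Independent check: iterate the transformed equations (6)-(7) to a fixed point.

    Converges when the coefficient rows sum to less than one in absolute value,
    which is assumed for the illustrative numbers used with this sketch.
    """
    av1, av2 = 0.0, 0.0
    for _ in range(n_iter):
        av1, av2 = (A1 + c11 * av1 + c12 * av2,
                    A2 + c21 * av1 + c22 * av2)
    return A0 + a * av1 + b * av2
```

With arbitrary illustrative values such as `A0=0.4, A1=0.3, A2=0.5, a=0.6, b=0.3, c11=0.2, c12=0.3, c21=0.25, c22=0.35`, the two functions agree to within floating-point rounding.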
BUSY PERIOD ANALYSIS

The following recursive relations can be obtained for $R_i(t)$:

\[ R_0(t) = q_{01}(t)\,©\,R_1(t) + q_{02}(t)\,©\,R_2(t) + q_{03}(t), \tag{9} \]
\[ R_1(t) = \bar{G}_1(t) + q_{13}^{(c_1)}(t) + \bigl(q_{11}^{(c_1)}(t) + q_{11}^{(4)}(t)\bigr)\,©\,R_1(t) + \bigl(q_{12}^{(c_1)}(t) + q_{12}^{(5)}(t)\bigr)\,©\,R_2(t), \tag{10} \]
\[ R_2(t) = \bar{G}_2(t) + q_{23}^{(c_2)}(t) + \bigl(q_{21}^{(c_2)}(t) + q_{21}^{(6)}(t)\bigr)\,©\,R_1(t) + \bigl(q_{22}^{(c_2)}(t) + q_{22}^{(7)}(t)\bigr)\,©\,R_2(t). \tag{11} \]

Taking LT for eqns (9-11) and solving for $R_0^*(s)$, we get

\[ R_0^*(s) = N_2(s)/D_1(s), \]

where

\[ N_2(s) = q_{03}^*(s)\,D_1(s) + \bar{G}_1^*(s)\bigl\{ q_{01}^*(s)\bigl(1 - q_{22}^{*(c_2)}(s) - q_{22}^{*(7)}(s)\bigr) + q_{02}^*(s)\bigl(q_{21}^{*(c_2)}(s) + q_{21}^{*(6)}(s)\bigr) \bigr\} + \bar{G}_2^*(s)\bigl\{ q_{02}^*(s)\bigl(1 - q_{11}^{*(c_1)}(s) - q_{11}^{*(4)}(s)\bigr) + q_{01}^*(s)\bigl(q_{12}^{*(c_1)}(s) + q_{12}^{*(5)}(s)\bigr) \bigr\}. \]

In the long run, the fraction of time for which the system is under repair is given by:

\[ R_0 = R_0(\infty) = \lim_{s \to 0} s\,R_0^*(s) = N_2(0)/D_1'(0), \tag{12} \]

where

\[ N_2(0) = u_1\bigl\{ P_{01}\bigl(1 - P_{22}^{(c_2)} - P_{22}^{(7)}\bigr) + P_{02}\bigl(P_{21}^{(c_2)} + P_{21}^{(6)}\bigr) \bigr\} + u_2\bigl\{ P_{02}\bigl(1 - P_{11}^{(c_1)} - P_{11}^{(4)}\bigr) + P_{01}\bigl(P_{12}^{(c_1)} + P_{12}^{(5)}\bigr) \bigr\}. \]

COST BENEFIT ANALYSIS

\[ H(t) = K\mu_{up}(t) - C\mu_b(t), \tag{13} \]

where $K$ is the revenue per unit of up time, $C$ is the cost per unit time of repair, $\mu_{up}(t) = \int_0^t AV_0(u)\,du$, and $\mu_b(t) = \int_0^t R_0(u)\,du$. Then, from eqn (13), we have

\[ H^*(s) = K\mu_{up}^*(s) - C\mu_b^*(s). \]

The expected profit per unit time in steady state is given as follows:

\[ H = \lim_{t \to \infty} H(t)/t = \lim_{s \to 0} s^2 H^*(s) = K\,AV_0(\infty) - C\,R_0 = \bigl[ K N_1(0) - C N_2(0) \bigr] / D_1'(0). \]

Particular cases:
(1) Taking $\bar{F}_s(t) = 1\ \forall t$, we get the results of [13].
(2) If we take $F_1(t) = 1 - e^{-\lambda t}$, $F_2(t) = G_2(t) = 0$ and $G_1(t) = G(t)\ \forall t$, then $S_3$ in this case becomes a regenerative state, and after some calculations we obtain the results of [1].

REFERENCES
1. D. P. Gaver, Jr, Time to failure and availability of paralleled-system with repair, IEEE Trans. Reliab. R-12, 30-38 (1963).
2. M. A. W. Mahmoud, M. M. Mohie El-Din and M. El-Said Moshref, Optimum preventative maintenance for a 2-unit priority-standby system with patience-time for repair, Optimization 29, 261-279 (1994).
3. S. K. Singh, R. P. Singh and Sindhu Shukla, Cost-benefit analysis of a 2-unit priority-standby system with patience-time for repair, IEEE Trans. Reliab. 40(1), 11-14 (1991).
4. R. Gara, Busy period analysis of a 2-unit warm standby system with imperfect switch, Microelectron. Reliab. 25(4), 675-393 (1979).
5. D. Meister, The problem of human-initiated failures, Eighth National Symposium on Reliability and Quality Control (1962).
6. B. S. Dhillon, On human reliability bibliography, Microelectron. Reliab. 20, 371-373 (1980).
7. B. S. Dhillon, Stochastic models for predicting human reliability, Microelectron. Reliab. 22, 491-496 (1982).
8. B. S. Dhillon and R. B. Mishra, Reliability evaluation of system with critical human error, Microelectron. Reliab. 24, 743-759 (1984).
9. B. S. Dhillon and S. N. Rayapati, Reliability analysis of non-maintained parallel system subject to hardware failure and human error, Microelectron. Reliab. 25, 111-122 (1985).
10. P. P. Gupta and R. K. Sharma, Reliability and MTTF analysis of non-repairable parallel redundant complex system under hardware and human failures, Microelectron. Reliab. 26, 229-234 (1986).
11. P. P. Gupta and R. K. Sharma, Reliability analysis of a two state repairable parallel redundant system under human failure, Microelectron. Reliab. 26, 221-224 (1986).
12. Y. Hatoyama, Reliability analysis of a three state system, IEEE Trans. Reliab. R-28, 386-393 (1979).
13. M. A. W. Mahmoud, Probabilistic analysis of a standby system under hardware and human error failures, AMSE Periodicals (in press).
14. E. W. Hogen (Ed.), Human reliability analysis, Nuclear Safety 17, 315-326 (1976).