Reliability Engineering and System Safety 45 (1994) 225-234 © 1994 Elsevier Science Limited. Printed in Northern Ireland. All rights reserved. 0951-8320/94/$7.00
Risk analysis of surveillance requirements including their adverse effects*

I. S. Kim (a), S. A. Martorell (b), W. E. Vesely (c) & P. K. Samanta (a)

(a) Department of Advanced Technology, Brookhaven National Laboratory, Upton, New York 11973, USA
(b) Universidad Politecnica of Valencia, Valencia, Spain
(c) Science Applications International Corporation, 655 Metro Place South, Suite 745, Dublin, Ohio 43017, USA
(Received 27 May 1993; accepted 3 December 1993)
Technical specifications for nuclear power plants require periodic surveillance testing of the standby systems important to safety. This regulatory requirement is imposed to assure that the systems will start and perform their intended functions in the event of plant abnormality. However, operating experience suggests that, in addition to the beneficial effects of detecting latent faults, the tests may have adverse effects on the plant's operation or equipment. This paper defines those adverse effects of testing from a risk perspective, and then presents a method to quantify their associated risk impact, focusing on plant transients and the wear-out of safety systems. The method, based on probabilistic safety assessment, is demonstrated by applying it to several surveillance tests conducted at boiling water reactors. The insights from this evaluation can be used to determine risk-effective intervals for surveillance tests.
1 INTRODUCTION
Surveillance tests are required in nuclear power plants to detect failures in standby equipment as a means of assuring its availability in case of an accident. However, operating experience suggests that some tests may have an adverse impact on safety [1]. This potential for adverse impact is aggravated by the large amount of testing presently required by technical specifications.

To address the problem of surveillance test requirements, i.e. the adverse effect on safety exacerbated by the large amount of testing, the US Nuclear Regulatory Commission (NRC) carried out a series of studies. NUREG-1024 made recommendations to enhance the safety impact of surveillance requirements [2]. NUREG-1366 implemented the recommendations by examining all technical specifications surveillance requirements to identify those that should be improved [3]. Four different types of adverse effects were used in this study as screening criteria: (1) those potentially leading to a plant transient, (2) those causing unnecessary wear to equipment, (3) those causing unnecessary radiation exposure to plant personnel, and (4) those placing an unnecessary burden on plant personnel.

This paper presents a methodology to evaluate quantitatively the risk impact of adverse effects of surveillance testing in the framework of a probabilistic safety assessment (PSA) [5,6], focusing on plant transients initiated during testing, and on the wear-out of safety systems due to testing. These effects generate significant safety concerns because of: (1) plant abnormality, which may challenge safety systems and the plant operators; and (2) equipment degradation, which increases the unavailability of safety systems or functions and thereby reduces the plant's capability to mitigate accidents.

The quantitative risk methodology is demonstrated with several surveillance tests conducted at boiling water reactors (BWRs), such as tests of the main steam isolation valves, the turbine overspeed protection system and the emergency diesel generators. The evaluations of the risk-effectiveness of the tests are presented, along with insights from sensitivity analyses of risk impact versus test interval.

Section 2 defines the adverse effects of testing from a risk perspective. Sections 3 and 4 describe the methodology to evaluate the risk impacts of testing associated with transients and equipment degradations, along with the evaluations of risk effectiveness and sensitivity analysis. Section 5 gives our conclusions.

*The submitted manuscript has been authored under Contract No. DE-AC02-76CH00016 with the US Department of Energy. Accordingly, the US Government retains a nonexclusive, royalty-free license to publish or reproduce the published form of this contribution, or allow others to do so, for US Government purposes.
2 RISKS ASSOCIATED WITH A TEST
Surveillance testing may have two types of risk impact on the plant: a beneficial impact, i.e. a reduction in risk, and an adverse impact, i.e. an increase in risk. The detection of failures through surveillance testing is the beneficial impact of the testing; this risk contribution 'detected' by a test will be called $R_D$. The adverse contribution results from degradations or failures that are due to or related to the test, and from the component unavailability during or as a result of the test; this contribution 'caused' by a test will be called $R_C$.

2.1 Test-detected risk

The test-detected risk contribution, $R_D$, has been studied previously, and can be quantified by the following formula, using a PSA model [7]:

$$R_D = \tfrac{1}{2}\,\lambda\,T\,(R_1 - R_0) \tag{1}$$

where $\lambda$ = the component failure rate; $T$ = the test interval of the component; $R_1$ = the core-damage frequency evaluated with the component assumed to be down (i.e. with the component unavailability set to one); $R_0$ = the core-damage frequency evaluated with the component assumed to be up (i.e. with the component unavailability set to zero).

This formula defines the benefit of testing relevant to the detection of standby time-related failures. In many cases, all component failures are included in defining standby failure rates; i.e. demand failure contributions are not separated. Under this assumption and the assumption that the component operability is completely restored following the test (i.e. availability of unity), eqn (1) calculates the maximum risk benefit that can be associated with the test.

The expression $(R_1 - R_0)$ in eqn (1) is known in reliability literature as the risk importance or Birnbaum importance, which indicates the sensitivity of the core-damage frequency to the basic event associated with the tested component. The expression $\tfrac{1}{2}\lambda T$ gives the average unavailability when the time-dependent unavailability, $u(t)$, is represented by a linear model; namely, $u(t) = \lambda t$.
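As a minimal sketch of how eqn (1) can be evaluated, the Python fragment below multiplies the average unavailability of the linear model by the Birnbaum importance. The function names and all numerical inputs are assumptions made for illustration; they are not taken from the paper or from any PSA model.

```python
def average_unavailability(failure_rate: float, test_interval: float) -> float:
    """Average of the linear model u(t) = failure_rate * t over one test interval."""
    return 0.5 * failure_rate * test_interval


def birnbaum_importance(cdf_down: float, cdf_up: float) -> float:
    """R1 - R0: core-damage frequency with the component down minus with it up."""
    return cdf_down - cdf_up


def detected_risk(failure_rate: float, test_interval: float,
                  cdf_down: float, cdf_up: float) -> float:
    """Test-detected risk contribution R_D of eqn (1), per reactor-year."""
    return (average_unavailability(failure_rate, test_interval)
            * birnbaum_importance(cdf_down, cdf_up))


# Hypothetical inputs, chosen only for illustration (not from any PSA):
# failure rate per hour, test interval in hours, frequencies per reactor-year.
r_d = detected_risk(failure_rate=1.0e-6, test_interval=2190.0,
                    cdf_down=1.0e-4, cdf_up=1.0e-5)
print(f"R_D = {r_d:.3e} per reactor-year")
```

The failure rate and the test interval must share the same time unit so that their product is a dimensionless unavailability.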
Surveillance tests may have additional benefits which are not directly quantified:

• They may detect failure mechanisms requiring repair at an early stage and thus remove or significantly decrease the possibility of failure. Potential common-cause failures also may be detected.
• In much standby mechanical equipment, the testing may help to prevent corrosion or accumulation of impurities, and lubricate various piece-parts, contributing to their reliable performance.

These aspects are, however, indirectly included through the component failure rate, $\lambda$, which may decrease because of these beneficial influences of surveillance tests.

2.2 Test-caused risk

The test-caused contribution, $R_C$, may have several different risk contributions; these are listed in Table 1, along with their causes of risk. This risk can be expressed in a general form as

$$R_C = R_{trip} + R_{wear} + R_{state} + R_{down} \tag{2}$$
where, for any specific test, some contributions may be irrelevant or insignificant compared to the others. When the risk-effectiveness of a test program or procedure involving several individual components is evaluated, the contributions for each test, plus those from any interactions, must be considered.

Table 1. Risk contributions that may be caused by a test and their root causes

| Identifier | Risk contribution | Causes of risk |
|---|---|---|
| $R_{trip}$ | Risk from transients or plant trips | Human error, equipment failure, procedure inadequacy |
| $R_{wear}$ | Risk from equipment wear | Inherent characteristics of the test, procedure inadequacy, human error |
| $R_{state}$ | Risk from misconfigurations or errors in component restoration | Human error, procedure inadequacy |
| $R_{down}$ | Risk associated with downtime in carrying out the test | Unavailability of the component during the test; affected by the test override capability |

Besides those adverse effects contributing to the plant risk in Table 1 (i.e. plant transients, equipment wear, misconfigurations and test downtime), two more may sometimes be encountered: (1) unjustified radiation exposure to plant personnel, and (2) unnecessary burden of work on plant personnel. Unlike the adverse effects in the table, these two are not generally subject to a risk analysis based on the risk measure of core-damage frequency. However, they can be considered qualitatively, together with the results of quantitative risk analysis, in evaluating surveillance requirements.

Among the causes of risk in Table 1, previous studies [8-10] concentrated on human errors in restoring component status following a test, resulting in system misconfiguration. In terms of the risk contributions in the table, these studies mainly focused on $R_{state}$, which is most likely to be caused by human errors, with some consideration of $R_{down}$.

2.3 Risk-effectiveness and total risk

Once risk contributions associated with a test (or a group of tests) are quantified, then the test can be evaluated from a risk perspective. One way is to compare the test-detected contribution, $R_D$, with the test-caused contribution, $R_C$, or its specific contribution, such as $R_{trip}$ or $R_{wear}$. When $R_C$ is assessed by considering all significant contributions constituting $R_C$, we can define the risk-effectiveness of a test as follows: a test is risk-effective if $R_D > R_C$; otherwise it is risk-ineffective. If only specific contributions are considered, then the risk-effectiveness of the test is evaluated with regard to the specific test-caused contributors. For example, if only the test-caused risk contributions due to trips, $R_{trip}$, are considered and we assess that $R_D > R_{trip}$, then we can say that the test is risk-effective with regard to test-caused trips. When more test-caused contributions are considered, then broader conclusions can be reached.

Another way of evaluating the test from a risk perspective is in terms of its total risk impact, $R_T$, which can be obtained by adding $R_D$ and $R_C$. $R_T$ is the contribution standardly computed in PSAs. Often, $R_C$ is assumed to be zero.
However, in many cases, $R_C$ can be significant, as evident from the operating experience. Note that the total risk impact is defined as $R_D + R_C$, as opposed to $R_D - R_C$, because, although we call $R_D$ a beneficial risk impact, it also is a kind of risk.

An advantage of using this total risk as a measure of evaluating testing is that it helps to define an optimal test interval. The risk-based, optimal test interval, $T_{opt}$, can be taken from the value where $R_T$ reaches its minimum, because we should minimize the total risk impact associated with the test. However, this calculational minimum should not be taken too strictly in view of uncertainties in the risk quantification, and also other operational or practical factors.
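The two evaluation routes described in this section (comparison of the detected and caused contributions, and the total risk impact) can be sketched in Python as follows. The class and function names are hypothetical, and the numerical values are placeholders chosen only to exercise the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CausedRisk:
    """Test-caused contributions of eqn (2), each per reactor-year."""
    trip: float = 0.0
    wear: float = 0.0
    state: float = 0.0
    down: float = 0.0

    def total(self) -> float:
        """R_C: sum of the four contributions."""
        return self.trip + self.wear + self.state + self.down


def is_risk_effective(r_detected: float, r_caused: float) -> bool:
    """A test is risk-effective when the detected risk exceeds the caused risk."""
    return r_detected > r_caused


def total_risk(r_detected: float, r_caused: float) -> float:
    """Total risk impact: the sum (not the difference) of both contributions."""
    return r_detected + r_caused


# Hypothetical values for illustration only.
r_d = 4.0e-7
r_c = CausedRisk(trip=1.5e-7, wear=5.0e-8)

print(is_risk_effective(r_d, r_c.total()))  # judged against all contributions
print(is_risk_effective(r_d, r_c.trip))     # judged against trips only
print(total_risk(r_d, r_c.total()))
```

Passing a single contribution, such as `r_c.trip`, gives the narrower statement that the test is risk-effective with regard to that contributor only.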
3 TRANSIENTS DUE TO TESTING

3.1 PSA-based methodology

The operating history of nuclear power plants indicates that conducting a surveillance test at power may cause a transient that will lead to or require a reactor trip. An example involves the testing of a main steam isolation valve (MSIV): while performing the MSIV operability surveillance, the outboard MSIV went to the fully closed position instead of stopping at the 10% closure limit due to failure of a limit switch on the MSIV [3,11].

The risk impact of such transients depends on the responses of the plant's safety systems and the operators. These are typically considered in PSAs [5,6], in which the various plant and operator responses that affect the plant risk are taken into account, using event trees for delineating the progression of accident sequences, and system fault trees for identifying the failure modes and their effects on the system unavailabilities.

From a PSA model of the plant, the risk impact of a test-caused transient, $R_{trip}$, can be evaluated through the initiating event group associated with the transient:

$$R_{trip} = \phi\,R_{IE\text{-}j} \tag{3}$$

where $R_{IE\text{-}j}$ denotes the risk impact of the $j$th initiating event group (e.g. transients with loss of the power conversion system), which is assumed to be associated with the test-caused transient, and $\phi$ is the proportion by which the frequency of the PSA initiating event group is attributable to the test-caused transients. The proportion $\phi$ can be estimated from the analysis of plant operating data as follows:

$$\phi = \frac{N_{test}}{N_{IE\text{-}j}} \tag{4}$$

where $N_{test}$ = the number of test-caused transient events; $N_{IE\text{-}j}$ = the number of transient events belonging to the initiating event group associated with the test-caused transient.

To obtain $\phi$, the test-caused transients must be associated with the relevant initiating event groups. For this purpose, we can use the Electric Power Research Institute transient categories [12] that were originally developed to analyze the historical transient events in the anticipated transients without scram (ATWS) study. The use of the transient categories will facilitate and improve the accuracy of the data analysis, because the levels of detail of the test-caused transients and of the PSA initiating event groups are usually quite different. The ATWS study defined 37 BWR and 41 pressurized water reactor categories based on the different characteristics of the variety of transient events that had occurred or might occur in the plants.

To analyze the sensitivity of the test-caused risk due to transients to the variation in the test interval, $T$, we can first derive the following equation for the probability, $P_{trip}$, that a transient will occur during or as a result of a test [11]:

$$P_{trip} = I_j\,T\,\phi \tag{5}$$

where $I_j$ denotes the frequency of the $j$th initiating event group used in the PSA model. From eqns (3) and (5), we obtain the following equation:
$$R_{trip} = \frac{P_{trip}}{I_j\,T}\,R_{IE\text{-}j} \tag{6}$$

This equation can be used to analyze the sensitivity of $R_{trip}$ to the variation of $T$ when a reasonable estimate of $P_{trip}$ is available.

From eqns (1) and (6), we can establish the criteria for risk-effectiveness of surveillance testing with regard to test-caused transients. The test is risk-effective with regard to test-caused transients if:

$$T > \sqrt{\frac{2\,P_{trip}}{\lambda\,I_j}\,\frac{R_{IE\text{-}j}}{R_1 - R_0}} \tag{7}$$

The test is risk-ineffective if $T$ is smaller than the value of the right-hand side of eqn (7). The risk-effectiveness criterion on the test interval should be used only when a reasonable estimate can be made of the probability, $P_{trip}$, that a transient will occur during a test. In general, the less likely a test-caused transient will occur, the more operating data are necessary.

Let $T_{min}$ = the minimum test interval, such that the test is risk-effective with regard to test-caused transients, as long as the test interval is greater than the minimum. We then can derive the following expression by setting $R_D$ equal to $R_{trip}$:

$$T_{min} = \sqrt{\frac{2\,P_{trip}}{\lambda\,I_j}\,\frac{R_{IE\text{-}j}}{R_1 - R_0}} \tag{8}$$

The risk-effectiveness criterion can be stated in terms of $T_{min}$ as follows: if $T > T_{min}$, the test is risk-effective, otherwise it is risk-ineffective.

To evaluate the test in terms of the total risk impact, we first need an expression for $R_T$. By adding $R_D$ and $R_{trip}$, we have

$$R_T = \tfrac{1}{2}\,\lambda\,T\,(R_1 - R_0) + \frac{P_{trip}}{I_j\,T}\,R_{IE\text{-}j} \tag{9}$$

Differentiating eqn (9) with respect to $T$ and setting to zero, we obtain the following equation for the optimal test interval, $T_{opt}$, discussed in Section 2.3:

$$\tfrac{1}{2}\,\lambda\,(R_1 - R_0) - \frac{P_{trip}}{I_j\,T_{opt}^2}\,R_{IE\text{-}j} = 0 \tag{10}$$

Solving eqn (10) for $T_{opt}$ gives

$$T_{opt} = \sqrt{\frac{2\,P_{trip}}{\lambda\,I_j}\,\frac{R_{IE\text{-}j}}{R_1 - R_0}} \tag{11}$$

which is identical to $T_{min}$, eqn (8). However, in general, $T_{min}$ may differ from $T_{opt}$; e.g. when a nonlinear unavailability model is used for $R_D$ (see Appendix).

3.2 Application to MSIV and TOPS tests

The formulas discussed above were used in the framework of a NUREG-1150 PSA for a BWR [6] to evaluate the following tests: (a) quarterly test of the MSIV operability, and (b) weekly test of the turbine overspeed protection system (TOPS). Table 2 shows the categories of BWR transients that were identified as being associated with the tests, based on a consideration of the test characteristics and the effects of the test-caused transients on the plant. For example, the TOPS test may cause the turbine control valve to fail closed, resulting in high steam pressure in the main steam system, and consequently, in a turbine trip. Hence, the transient due to the TOPS test falls into Category 3, 'Turbine trip,' and Category 13, 'Turbine bypass or control valves cause increased pressure (closed).'

To use the transient categories in the context of the PSA model, they then were associated with the initiating event groups modeled in the plant-specific PSA, based on the characteristics of the categories and the initiating event groups. Table 2 also shows the initiating event groups identified as being associated with the categories. Categories 6 and 7, into which the transient events from the MSIV testing can be classified, are associated with initiating event group T2, which incorporates transients with the power conversion system unavailable. Categories 3 and 13 of the TOPS test are associated with initiating event group T3A, which encompasses transients with the power conversion system initially available, except those due to an inadvertent open relief valve in the primary system and those involving loss of feedwater.

Table 2. Association of test-caused transients, transient categories and PSA initiating event groups

| Test | Transient category | Description | PSA initiating event group |
|---|---|---|---|
| MSIV | 6 | Inadvertent closure of one MSIV | T2 |
| MSIV | 7 | Partial MSIV closure | T2 |
| TOPS | 3 | Turbine trip | T3A |
| TOPS | 13 | Turbine bypass or control valves cause increased pressure (closed) | T3A |

Using the transient categories and the risk impacts of associated initiating event groups, we obtain the risk impact of a test-caused transient and the probability of a transient occurring during a test: $R_{trip}$ = 1.8E-7 per reactor-year, and $P_{trip}$ = 6.7E-2 per test for the quarterly MSIV testing; $R_{trip}$ = 3.7E-8 per reactor-year, and $P_{trip}$ = 1.7E-3 per test for the weekly TOPS testing. Thus, a transient is more likely (by approximately a factor of 40) to occur during an MSIV test than during a TOPS test; accordingly, the MSIV test incurs larger adverse risk from test-caused transients than does the TOPS test (by a factor of 5).

Using eqn (1) in the framework of a computerized IRRAS (integrated reliability and risk analysis system) [13] model of a NUREG-1150 PSA, we obtain the beneficial risk impact of the quarterly MSIV test: $R_D$ = 5.2E-7 per reactor-year. This test is risk-effective because the beneficial impact is about three times larger than the adverse impact. The $R_D$ for the TOPS test cannot be evaluated from the NUREG-1150 PSA, because this PSA does not model the turbine control valves; thus, the risk-effectiveness cannot be evaluated. Therefore, only the quantitative values of $R_{trip}$ and $P_{trip}$ can be taken into account in evaluating the TOPS test, unless the PSA model is modified to give a value for $R_D$.

Sensitivity analysis was performed for the MSIV testing (Fig. 1). This figure shows the sensitivity of three different kinds of risk impacts to variation of the test interval, $T$: $R_{trip}$, $R_D$ and the total risk impact of the test, $R_T$.

3.3 Insights from sensitivity analysis
From the results of the sensitivity analysis depicted in Fig. 1, we can obtain the following insights:

(1) $R_{trip}$ decreases as $T$ is increased, because fewer transients are expected as the test is conducted less frequently (see eqn (6)). However, $R_D$ increases with the increasing test interval because then, the test is more likely to detect a failure (refer to eqn (1)).

(2) The curves for $R_D$ and $R_{trip}$ intersect when the test interval is approximately 54 days. This value is the minimum test interval, $T_{min}$, which was previously discussed (see eqn (8)). The test interval must be longer than 54 days for the MSIV testing to become risk-effective, otherwise it is risk-ineffective.

(3) The risk-effectiveness of the test with regard to test-caused transients also can be seen by comparing the test-detected risk contribution to the test-caused risk contribution due to transients. In the region where $T > 54$ days, $R_D$ is larger than $R_{trip}$, and thus, the test is risk-effective. Where $T < 54$ days, the test is risk-ineffective.

(4) In this study, we used the database of licensee event reports (LERs) for 30 BWRs for 1985, assuming that the operability of MSIVs is tested quarterly at all the plants. However, some plants were found to test the MSIVs more frequently; e.g. the operators were performing a biweekly surveillance when the test failure occurred. If we assume that the minimum test interval of 54 days is applicable to this plant, we can say that the biweekly test is risk-ineffective with regard to test-caused transients, because the interval is shorter than 54 days. Even if we consider other types of adverse risk impacts that are not negligible compared to $R_{trip}$, the test will be risk-ineffective.

(5) Sensitivity analyses, as shown in Fig. 1, can be very useful in defining a test interval or evaluating a potential modification to it. The calculated, optimal test interval, $T_{opt}$, is 54 days, the same as $T_{min}$; however, the curve of $R_T$ indicates that the total risk impact only marginally increases when $T$ is changed, for example, from 54 or 91 days (i.e. the current quarterly testing) to 150 days. Note that the more the interval is extended beyond $T_{opt}$, the more the test will become risk-effective at the expense of a greater total risk impact. In addition, we also should take into consideration that the sensitivity curves of $R_{trip}$ and $R_T$ to the variation of $T$ depend on the parameters $\lambda$, the equipment failure rate, and $P_{trip}$, the probability of a transient occurring during testing, which are assumed constant with respect to the test interval. The values of $\lambda$ and $P_{trip}$ may change, especially when the test is conducted less frequently than formerly. The failure rate, $\lambda$, may decrease if the test had an adverse effect on the equipment, or it may increase if the test had a positive effect in terms of lubrication, cleanup, required movement of piece-parts, etc. The value of $P_{trip}$ may increase because the operators are more likely to make errors when testing less frequently. Taking these aspects into account in the decision process, the extension of the test interval, based on the sensitivity analyses, should be limited to a reasonable value, i.e. by no more than a factor of two. When experience is gained with this extension, further modifications may be justified.

[Figure 1: logarithmic plot of core-damage frequency impact (1.0E-10 to 1.0E-03) against Test Interval for MSIV Testing (days), 0 to 200; curves $R_T$, $R_D$ and $R_{trip}$.]

Fig. 1. Sensitivity of the core-damage frequency impact to the test interval for the MSIV testing ($R_D$ = test-detected risk impact; $R_{trip}$ = test-caused risk impact due to transients; $R_T$ = total risk impact of the test).
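The 54-day value quoted in insights (2) and (5) can be checked approximately from the quarterly MSIV results of Section 3.2. The Python sketch below assumes only the proportionalities stated in eqns (1) and (6), namely that $R_D$ scales with $T$ and $R_{trip}$ scales with $1/T$, and rescales the published quarterly values accordingly. The 91-day length of a quarter, the search bounds and all function names are assumptions of this sketch, not inputs of the PSA model.

```python
import math

T_REF_DAYS = 91.0     # assumed length of the quarterly test interval
R_D_REF = 5.2e-7      # test-detected risk at the quarterly interval (Section 3.2)
R_TRIP_REF = 1.8e-7   # test-caused transient risk at the quarterly interval


def r_detected(t_days: float) -> float:
    """R_D grows linearly with the test interval, as in eqn (1)."""
    return R_D_REF * t_days / T_REF_DAYS


def r_trip(t_days: float) -> float:
    """R_trip falls as 1/T, as in eqn (6)."""
    return R_TRIP_REF * T_REF_DAYS / t_days


def r_total(t_days: float) -> float:
    """Total risk impact R_T, as in eqn (9)."""
    return r_detected(t_days) + r_trip(t_days)


def minimum_interval() -> float:
    """Interval at which R_D equals R_trip (closed form of eqn (8))."""
    return T_REF_DAYS * math.sqrt(R_TRIP_REF / R_D_REF)


def optimal_interval(lo: float = 1.0, hi: float = 365.0,
                     tol: float = 1e-6) -> float:
    """Ternary search for the minimum of r_total, which is convex for T > 0."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if r_total(m1) < r_total(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


print(f"T_min = {minimum_interval():.1f} days")
print(f"T_opt = {optimal_interval():.1f} days")
for t in (54.0, 91.0, 150.0):
    print(f"T = {t:5.1f} d  R_D = {r_detected(t):.2e}  "
          f"R_trip = {r_trip(t):.2e}  R_T = {r_total(t):.2e}")
```

Under these assumptions the closed form and the numerical search agree at roughly 53.5 days, consistent with the approximately 54 days read from Fig. 1.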
4 EQUIPMENT WEAR FROM TESTING

4.1 Test-caused degradation model

Some safety-significant components of nuclear power plants, such as a diesel generator or an auxiliary feedwater pump, are tested so often (generally monthly, and sometimes more often) that there may be progressive wear-out of the equipment due to the accumulation of test-caused degradation. Furthermore, as time passes, the component also will show aging effects, such as corrosion or erosion. Together, these will increase the unavailability of the component, and thereby, the unavailability of the associated safety system and function, which will, in turn, reduce the plant's capability of preventing or mitigating an accident.

The degradation from testing and aging effects are induced by two kinds of stresses, i.e. demand and standby stresses. Demand stress acts only when the equipment is asked to function or is operating. Standby stress acts while the equipment is in the standby state. For standby components that are periodically tested, generally the combination of both stresses causes the equipment to degrade, and ultimately, to fail.

Based on the concept of stress on equipment and the characteristics of the degradation mechanisms caused by testing and aging, we can formulate the following test-caused component degradation model:

$$q(n,t) = p(n) + \int_{nT}^{nT+t} \lambda(n,t')\,dt' \quad \text{for } t \in [0, T] \tag{12}$$

$$p(n) = p_0 + p_0\,p_1\,n \tag{13}$$

$$\lambda(n,t) = \lambda_0 + \lambda_0\,p_2\,n + \alpha u \quad \text{for } u \in [0, nT + t] \tag{14}$$

where $n$ = number of tests performed on the equipment; $t$ = time elapsed since the last test; $q(n,t)$ = component unavailability as a function of the number of tests performed and the elapsed time; $p(n)$ = failure probability for demand-caused failures; $T$ = test interval; $nT + t$ = time since the last renewal point; $\lambda(n,t)$ = standby failure rate (per unit time) for failures occurring between tests; $p_0$ = residual demand-failure probability; $p_1$ = test degradation factor associated with demand failures; $\lambda_0$ = residual standby time-related failure rate; $p_2$ = test degradation factor for standby time-related failures; $\alpha$ = aging factor associated with aging alone.

Equations (12) to (14) represent a model that has been linearized from the original, nonlinear test-caused degradation model [11]. This linear model can be used for most purposes and is used here.

In eqn (12), the unavailability, $q(n,t)$, and the standby failure rate, $\lambda(n,t)$, are represented as a function of $n$ and $t$. The reason for this functional notation is that the standby failure rate is assumed to be affected not only by the standby time, but also by test-caused degradation. Therefore, component unavailability becomes a function of the number of tests performed on the component and the time elapsed since the last renewal. However, the demand failure probability, $p(n)$, is represented in eqn (12) as a function of only the number of tests, $n$, i.e. it is assumed that the demand-failure probability depends only on how many tests were conducted on the component.

The expressions for the two basic degradation parameters, $p(n)$ and $\lambda(n,t)$, are formulated in eqns (13) and (14) in terms of their variables $n$ and $t$. In eqn (14) the time-dependent aging mechanism on the standby failure rate is represented by a Weibull distribution.

4.2 Formulas for risk-impact analysis
The test-caused component degradation model, presented above, provides a means to estimate the time-dependent component unavailability and its resultant risk impact as a function of the number of tests on the component and the time elapsed since the last overhaul. Let $\bar{R}_{C,n}$ be the average increase in core-damage frequency or test-caused risk contribution resulting from test-caused degradations of $n$ tests on the equipment. We can evaluate $\bar{R}_{C,n}$ using the following formula:

$$\bar{R}_{C,n} = (\text{the average risk level between } [nT, nT+T]) - (\text{the average risk level between } [0, T]) = \overline{\Delta q}_n\,[R_1 - R_0] = \left(p_1 p_0 n + \tfrac{1}{2}\,p_2 \lambda_0 T n\right)[R_1 - R_0] \tag{15}$$

where $\overline{\Delta q}_n$ denotes the average increase in component unavailability that results from $n$ tests, and only the test-caused degradation effect is taken into account without considering the aging effect, i.e. $\alpha = 0$.

Based on these formulas, we can establish the following criterion on the number of tests for risk-effectiveness with regard to test-caused degradation. The $n$th test is risk-effective with regard to test-caused degradation if:

$$n < \frac{\tfrac{1}{2}\,\lambda_0 T}{p_1 p_0 + \tfrac{1}{2}\,p_2 \lambda_0 T}$$

The test-caused component degradation model not only incorporates aging effects, but separately takes into account test-caused degradation. However, the degradation model and the formulas for evaluating the risk impact associated with such degradations are based on the following assumptions that should be considered in using the approaches:

(1) Test-caused component degradations affect demand failure probability, and also standby failure rate, i.e. the component will be more vulnerable to both demand and standby time-related failures as more tests are performed on it.
(2) The standby time-related failure rate increases because of test-caused degradation effects, as well as aging effects.
(3) The time-dependent aging mechanism on the standby failure rate can be represented by a Weibull distribution.
(4) The demand degradation or failure mechanism is not affected by time. In other words, the demand failure probability depends on only the number of tests performed on the equipment, but not on the idle or dormant time.

4.4 Application to diesel-generator test
Using the model, a sensitivity study was made on the risk impact of test-caused equipment degradations versus test interval in the framework of a N U R E G 1150 PSA. We chose to study the emergency diesel generator, because of the concern about test-caused degradations on this component and because of the availability of the reliability data to estimate the degradation parameters of the model. The method presented here can also be applied to other components. The values of the degradation parameters, such as Pl and P2, were estimated for diesel generators under the following assumption: When the number of tests is large, the average increase in component unavailability, which is evaluated by the test-caused component degradation model, is the same as that estimated by the aging model. 14 Figures 2 and 3 show the sensitivity to monthly and quarterly testing of the diesel generators, respectively, of three different kinds of core-damage frequency impacts: (1) the test-detected core-damage frequency contribution, /~D, (2) the test-caused core-damage frequency contribution due to equipment wear, /~c,n, and (3) the total core-damage frequency impact of the test,/~T,,. The results show that the test is risk-effective for monthly testing up to 61 tests i.e. approximately 5 years after the last overhaul, and for quarterly testing up to 111 tests i.e. about 28 years after the last
I.S. Kim et al.
232 1.0E-04
o ¢-
1.0E-05
== cr U.
O~
E ca ¢3 P O O
1.0E-06
i
1.0E-07 i 0
L
.1-
J
J
i
60
80
100
120
140
i
20
40
N u m b e r of T e s t s for M o n t h l y D i e s e l G e n e r a t o r
160
Testing
Fig. 2. Evaluation of risk-effectiveness for monthly diesel-generator testing (/~o = test-detected risk impact; Rc., = test-caused risk impact due to equipment wear; RT., = total risk impact of the test).
renewal time. However, when the test is no longer risk-effective, the total impact for quarterly testing is greater than that for monthly testing by about a factor of three. Figure 4 shows the risk-effective lifetime for diesel-generator testing as a function of test interval. The lifetime increases with increasing test interval because the test-caused degradation effects accumulate more slowly. However, the total risk at the end of the lifetime also increases when the test interval is increased. For example, if the test interval of 1 month is extended to 3 months, then the risk-effective lifetime will increase from 5 to 28 years, i.e. by a factor of 5.6. However, as indicated above,
"~ ¢0
the total risk at the end of the lifetime for quarterly testing will be about three times higher than that for monthly testing. (The total risk at the end of the lifetime is 3-5E-5 per reactor-year for monthly testing, and 1-1E-4 per reactor-year for quarterly testing.) The numerical results from this analysis should be interpreted cautiously. The data available to estimate the degradation parameters are sparse, which significantly influences the results. In this study, these parameters were estimated from reliability studies of several diesel generators. For specific applications, we recommend combining data from diesels with similar design, test, maintenance and overhaul characteristics to estimate these parameters, and consequently, to
[Figure 3 plot: horizontal axis, number of tests for quarterly diesel generator testing (0 to 160); vertical axis, logarithmic scale from 1.0E-07 to 1.0E-04.]
Fig. 3. Evaluation of risk-effectiveness for quarterly diesel-generator testing ($R_D$ = test-detected risk impact; $R_{C,n}$ = test-caused risk impact due to equipment wear; $R_{T,n}$ = total risk impact of the test).
[Figure 4 plot: horizontal axis, test interval for diesel generators (months); one vertical axis on a logarithmic scale from 1.0E-06 to 1.0E-03; a second vertical axis with ticks at 20, 40 and 80 (label not recoverable).]
Fig. 4. Evaluation of risk-effective lifetime and the total test risk at the end of lifetime versus test interval of diesel generators. (Dotted lines indicate the lifetime and the total risk for monthly and quarterly testing.)
determine the effective test and overhaul requirements.
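The lifetime and risk figures quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the original analysis: the function name `lifetime_years` and the variable names are illustrative, and the end-of-lifetime risks are read from the text as 3.5E-5 and 1.1E-4 per reactor-year.

```python
# Figures quoted in the text for diesel-generator testing.
MONTHLY_TESTS = 61           # risk-effective number of monthly tests
QUARTERLY_TESTS = 111        # risk-effective number of quarterly tests
RISK_END_MONTHLY = 3.5e-5    # total risk at end of lifetime, per reactor-year
RISK_END_QUARTERLY = 1.1e-4  # total risk at end of lifetime, per reactor-year


def lifetime_years(number_of_tests, interval_months):
    """Convert a count of tests at a fixed interval into elapsed years."""
    return number_of_tests * interval_months / 12.0


monthly_life = lifetime_years(MONTHLY_TESTS, 1)       # roughly 5 years
quarterly_life = lifetime_years(QUARTERLY_TESTS, 3)   # roughly 28 years
lifetime_ratio = quarterly_life / monthly_life
risk_ratio = RISK_END_QUARTERLY / RISK_END_MONTHLY

print(f"monthly lifetime:   {monthly_life:.1f} years")
print(f"quarterly lifetime: {quarterly_life:.1f} years")
print(f"lifetime ratio:     {lifetime_ratio:.2f}")
print(f"end-of-life risk ratio: {risk_ratio:.2f}")
```

With unrounded lifetimes the ratio comes out slightly below the factor of 5.6 quoted in the text, which follows from the rounded values of 5 and 28 years. The risk ratio is consistent with the stated factor of about three.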
5 CONCLUDING REMARKS
The safety significance and risk-effectiveness of surveillance test requirements can be evaluated by explicitly considering the adverse effects of testing, based on the concepts and methods discussed in this paper. The results of quantitative risk evaluation can be used in the decision-making process to establish the safety significance of the surveillance testing and to screen surveillance requirements. This methodology should be used for those components where surveillance tests are judged to cause adverse effects, and in conjunction with qualitative evaluations from engineering considerations and operating experience, such as the burden of work and radiation exposure to plant personnel caused by the tests. Reducing the amount of testing will, in general, also reduce the adverse effects of testing.
The discussion here focused on the risk impact of testing with respect to single components, because surveillance requirements are generally imposed on each single component. However, a test may sometimes have another type of impact, especially with regard to other components. For example, testing a diesel generator may help to detect common cause failures in the redundant diesel generators. This beneficial impact can also be considered qualitatively in the decision-making process until a method is developed to quantify its risk impact through further research.
ACKNOWLEDGMENTS
The authors would like to acknowledge C. Johnson, Jr and M. Wohl of the US Nuclear Regulatory Commission, and T. Mankamo of Avaplan Oy, Finland, for their insightful comments. The work was performed under the auspices of the US Nuclear Regulatory Commission. The views expressed in the paper are those of the authors and do not necessarily reflect any position or policy of the USNRC.
REFERENCES
1. USNRC, A survey by senior NRC management to obtain viewpoints on the safety impact of regulatory activities from representative utilities operating and constructing nuclear power plants. NUREG-0839, Aug. 1981.
2. USNRC, Technical specifications: Enhancing the safety impact. NUREG-1024, Nov. 1983.
3. Lobel, R. & Tjader, T. R., Improvements to technical specifications surveillance requirements. NUREG-1366, Aug. 1990.
4. Mankamo, T. & Pulkkinen, U., Test interval optimization of standby equipment. Technical Research Centre of Finland Notes 892, Sept. 1988.
5. USNRC, PRA procedures guide. NUREG/CR-2300, 1-2, Final Report, Jan. 1983.
6. USNRC, Severe accident risks: An assessment for five US nuclear power plants. NUREG-1150, 1-2, Final Report, Dec. 1990.
7. Samanta, P. K., Wong, S. M. & Carbonaro, J., Evaluation of risks associated with AOT and STI requirements at the ANO-1 nuclear power plant. NUREG/CR-5200, BNL-NUREG-52024, Aug. 1988.
8. Apostolakis, G. E. & Bansal, P. P., Effect of human error on the availability of periodically inspected redundant systems. IEEE Trans. Reliability, R-26(3) (1977) 220-5.
9. McWilliams, T. P. & Martz, H. F., Human error considerations in determining the optimum test interval for periodically inspected standby systems. IEEE Trans. Reliability, R-29(4) (1980) 305-10.
10. Lee, J. H., Chang, S. H., Yoon, W. H. & Hong, S. Y., Optimal test interval modeling of the nuclear safety system using the inherent unavailability and human error. Nucl. Engng Design, 122 (1990) 339-48.
11. Kim, I. S., Martorell, S., Vesely, W. E. & Samanta, P. K., Quantitative evaluation of surveillance test intervals including test-caused risks. NUREG/CR-5775, BNL-NUREG-52296, Feb. 1992.
12. McClymont, A. S. & Poehlman, B. W., ATWS: A reappraisal. Part 3: Frequency of anticipated transients. EPRI Report, Electric Power Research Institute, Palo Alto, CA, NP-2230, Jan. 1982.
13. Russel, K. D., McKay, M. K., Sattison, M. B., Skinner, N. L., Wood, S. T. & Rasmuson, D. M., Integrated reliability and risk analysis system (IRRAS): Version 2.5, 1 (Reference Manual). NUREG/CR-5300, EGG-2613, Feb. 1991.
14. Vesely, W. E., Kurth, R. E. & Scalzo, S. M., Evaluation of core melt frequency effects due to component aging and maintenance. NUREG/CR-5510, June 1990.
APPENDIX: PROOF OF $T_{\min} \neq T_{\mathrm{opt}}$ WHEN USING NONLINEAR UNAVAILABILITY FOR $R_D$
Assume, as an example, the following nonlinear model for component unavailability:

$$u(t) = At^2 \tag{A.1}$$

The model accommodates the effect of accelerating failure mechanisms on the unavailability, e.g. due to the accumulation of impurities or corrosion. We then have the following formula for $R_D$ in place of eqn (1):

$$R_D = \frac{1}{T}\int_0^T At^2\,dt\,(R_1 - R_0) = \frac{1}{3}AT^2(R_1 - R_0) \tag{A.2}$$

Setting $R_D$ from eqn (A.2) equal to $R_{\mathrm{trip}}$ from eqn (6), we obtain the equation for the minimum test interval, $T_{\min}$, for risk-effectiveness evaluation:

$$\frac{1}{3}AT_{\min}^2(R_1 - R_0) = \frac{p_{\mathrm{trip}}}{T_{\min}}R_{IE\text{-}j} \tag{A.3}$$

Solving eqn (A.3) for $T_{\min}$, we have

$$T_{\min} = \sqrt[3]{\frac{3p_{\mathrm{trip}}R_{IE\text{-}j}}{A(R_1 - R_0)}} \tag{A.4}$$

With the nonlinear model, we now have the following formula for $R_T$:

$$R_T = \frac{1}{3}AT^2(R_1 - R_0) + \frac{p_{\mathrm{trip}}}{T}R_{IE\text{-}j} \tag{A.5}$$

Differentiating eqn (A.5) with respect to $T$ and setting the result to zero, we can obtain $T_{\mathrm{opt}}$ as

$$T_{\mathrm{opt}} = \sqrt[3]{\frac{3p_{\mathrm{trip}}R_{IE\text{-}j}}{2A(R_1 - R_0)}} \tag{A.6}$$

Comparing eqn (A.6) with eqn (A.4), we can see that, in general, $T_{\min}$ differs from $T_{\mathrm{opt}}$.
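The relationship between eqns (A.4) and (A.6) can be illustrated numerically. The sketch below is not from the original study: the function names are illustrative, `delta_r` stands for $R_1 - R_0$, `r_ie` stands for $R_{IE\text{-}j}$, and all parameter values are hypothetical numbers chosen only to exercise the formulas in consistent time units.

```python
import math


def r_detected(T, A, delta_r):
    """Eqn (A.2): test-detected risk contribution for u(t) = A*t**2."""
    return A * T**2 * delta_r / 3.0


def r_trip(T, p_trip, r_ie):
    """Test-caused trip contribution, in the form used in eqn (A.3)."""
    return p_trip * r_ie / T


def t_min(A, delta_r, p_trip, r_ie):
    """Eqn (A.4): interval at which R_D equals R_trip."""
    return (3.0 * p_trip * r_ie / (A * delta_r)) ** (1.0 / 3.0)


def t_opt(A, delta_r, p_trip, r_ie):
    """Eqn (A.6): interval that minimizes R_T = R_D + R_trip."""
    return (3.0 * p_trip * r_ie / (2.0 * A * delta_r)) ** (1.0 / 3.0)


# Hypothetical parameter values (assumptions, not data from the paper).
A = 0.5          # coefficient of the quadratic unavailability model
DELTA_R = 1e-4   # R1 - R0
P_TRIP = 1e-3    # probability of a test-caused trip per test
R_IE = 1e-4      # R_IE-j

tmin = t_min(A, DELTA_R, P_TRIP, R_IE)
topt = t_opt(A, DELTA_R, P_TRIP, R_IE)


def r_total(T):
    """Eqn (A.5): total risk impact of the test at interval T."""
    return r_detected(T, A, DELTA_R) + r_trip(T, P_TRIP, R_IE)


# At T_min the two contributions balance; T_opt is smaller by 2**(-1/3).
print(math.isclose(r_detected(tmin, A, DELTA_R), r_trip(tmin, P_TRIP, R_IE)))
print(topt / tmin, 2.0 ** (-1.0 / 3.0))
print(r_total(topt) < r_total(tmin))
```

Under this quadratic model the ratio $T_{\mathrm{opt}}/T_{\min}$ equals $2^{-1/3}$ regardless of the parameter values, because the two expressions differ only by the factor of 2 in the denominator.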