Modelling software reliability prediction with optimal estimation techniques
D Christodoulakis and G Panziou
Software reliability prediction is as vital to the software industry as to any other industry that relies on the quality of software. Reliability prediction, however, requires a solid model of the behaviour of the software system and a well-suited prediction methodology and algorithms. The paper focuses on the second task. Optimal estimation techniques are applied to predict the future behaviour of software systems. An initial implementation of the proposed methodology has shown encouraging results. The observed discrepancy between the real and estimated values of the mean time to failure of software systems was low and sometimes around zero.

software reliability, reliability prediction, modelling, estimation techniques
Computer Technology Institute and Computer Engineering Department, University of Patras, 26500 Patras, Greece. Part of this work was presented at the 3rd International Conference on Fault-Tolerant Computing-Systems, Bremerhaven, FRG (1987)

Since the software crisis in the late 1960s, much effort has been spent on the study of software reliability modelling and analysis of the behaviour of software systems under operational conditions. The outcome of these studies is a large number of software reliability models 1,2. Little work has been done, however, to establish concrete software reliability prediction tools. A major criticism of existing models is that most are not supported by appropriate prediction algorithms, and therefore they cannot be reliably consolidated into software reliability prediction tools 3.

The aim of this paper is to present a new model for software reliability prediction, which, besides its ability to describe the stochastic behaviour of software failures, is particularly useful for software reliability monitoring and prediction during software testing and maintenance. In addition, as the proposed model is supported by prediction algorithms that have been successfully applied in a broad range of problem areas (e.g., estimation of river flows, chemical process control, nuclear reactor parameter estimation, etc.), their usability and applicability are guaranteed.

The framework of the approach is the theory of optimal estimation. An optimal estimator is usually defined as an algorithm that processes measurements to deduce a minimum error estimate of the state of a 'physical' system by using knowledge of the system and
measurement dynamics, assumed statistics of measurement errors, and initial condition information 4. The 'physical system' in this approach is a software system that is considered under regression-testing conditions. According to the US Department of Defense standards and contractual requirements for software development (DOD-STD-480), regression testing is defined as 'that kind of testing which is not simply performed at the end of coding, but rather must be planned and developed in parallel with the software system itself'. For every item in the system specification, a corresponding test must be performed to check that the software meets each specification. Even after the system is installed the set of tests should be kept and updated as well, so that, as the system is modified during maintenance, previous tests can be re-run to check that the system still meets the imposed specifications 5.

The usual way to describe physical systems in the framework of optimal estimation is the state space model. The dynamics of the system are represented by a set of first-order differential equations that describe the system forcing and control functions. The variables involved in these equations represent the state of the system at a particular point. Given the system description and initial values of the state variables, from that point forward the state of the system can be computed for any time-instance in the future.

The state space model discussed here is based on results obtained by Currit et al. 6. The state space variable of the model is the so-called mean time to failure (MTTF). The MTTF of software systems is defined as the mean operation time between successive system failures. As has been shown 7,8, the MTTF is highly correlated with the amount of testing of software systems. The evolution of the MTTF during the testing process is an indicator of the work done to improve the system's reliability.
The prediction model discussed in this paper is mathematically formulated in two basic equations: the first describes the software system during regression testing, the second represents the process of the MTTF measurement. Both equations are based on the assumption that software errors and failures observed during regression testing are repaired when engineering changes take place. An engineering change is performed during regression testing and aims to correct
0950-5849/90/0100884)5© 1990 Butterworth & Co (Publishers) Ltd
information and software technology
those software errors that are observed before it. After each engineering change the set of tests should be rerun to check that the errors are corrected. The quantity measured between two successive engineering changes is the MTTF. In case the errors and failures are repaired immediately after their occurrence, the MTTFs coincide with the operation times between successive engineering changes.

This paper mainly focuses on the prediction methodology. The realities of the model construction have been adopted from Currit et al. 6, where the interested reader can find basic ideas about software certification, relationships between certification and the software life-cycle, and, finally, how those ideas are reflected in the discussed model.

The second section of this paper introduces the state space model and the prediction methodology. The third section presents the results obtained from an attempt to verify the model with test data found in the current literature. The paper concludes by discussing some application perspectives and future plans.
vol 32 no 1 january/february 1990

MATHEMATICAL MODEL AND PREDICTION METHODOLOGY
Let $EC(t_i)$ denote the engineering change applied on a software product at time $t_i$ and let $EC(t_0), EC(t_1), \dots, EC(t_m)$ be a list of successive engineering changes. $EC(t_0)$ denotes the initiation of the regression-testing process and $EC(t_m)$ denotes the last applied engineering change. The time interval $[t_{i-1}, t_i]$ between the engineering changes $EC(t_{i-1})$ and $EC(t_i)$ is the time where those system failures occur (e.g., system crashes, software does not meet the requirements, etc.) that lead to the engineering change $EC(t_i)$.

Now let $h_i$ be the failure rate at the time an engineering change $EC(t_i)$ takes place and $h_A$ be the residual failure rate after the $m$th engineering change, i.e., $h_A$ is the total failure rate from the $m$th engineering change until the software is retired. If $h_0$ is the initial failure rate then

$$h_0 = h_A + h_1 + h_2 + \dots + h_m$$

and the failure rate after the engineering change $EC(t_i)$, $h_{i+1}$, is

$$h_{i+1} = h_0 - h_1 - h_2 - \dots - h_i$$

The probabilities for failures at the engineering changes $EC(t_i)$ are given by the formula

$$p_i = \frac{h_i}{h_0}, \quad i = 1, 2, \dots, m$$

where $p_A + p_1 + p_2 + \dots + p_m = 1$.

As mentioned in the introduction, for each time interval $[t_{i-1}, t_i]$ a corresponding MTTF $x(t_i)$ is assumed that can be defined as the reciprocal of the failure rate at that time 6. $x(t_0)$ represents the mean time to failure, which corresponds to $h_0$. If it is assumed that at each engineering change $EC(t_i)$ a number of errors is removed and that the error removal at the engineering change $EC(t_i)$ has a direct impact on the failure rate $h_{i+1}$, it is obvious that for the initial MTTF this yields

$$x(t_0) = (1 - p_1 - p_2 - \dots - p_v)\,x(t_v), \quad v = 1, 2, \dots, m \qquad (1)$$

With additional assumptions, it has been shown 6 that the failure probability $p_i$ has the distribution of geometrically decreasing terms

$$p_i = \pi (1 - \alpha)^{i-1}, \quad 0 < \pi < 1, \quad 0 < \alpha < 1, \quad i \ge 1$$

Thus from equation (1)

$$x(t_0) = \left(1 - \frac{\pi\,(1 - (1 - \alpha)^{v})}{\alpha}\right) x(t_v), \quad v = 1, 2, \dots, m \qquad (2)$$

From equation (2) it is easy to imply

$$x(t_v) = \frac{\alpha - \pi\,(1 - (1 - \alpha)^{v-1})}{\alpha - \pi\,(1 - (1 - \alpha)^{v})}\,x(t_{v-1}), \quad v = 1, 2, \dots, m \qquad (3)$$

The factor $F(t_v, t_{v-1})$ represented by the fractional part of equation (3)

$$F(t_v, t_{v-1}) = \frac{\alpha - \pi\,(1 - (1 - \alpha)^{v-1})}{\alpha - \pi\,(1 - (1 - \alpha)^{v})} \qquad (4)$$
is always greater than one, for $0 < \alpha < 1$ and $v = 1, 2, \dots, m$. This means that the MTTF at time $t_v$ is always greater than the MTTF at $t_{v-1}$, which seems natural, for as regression testing proceeds, the software probably includes fewer errors. Equation (3) represents the process of MTTF improvements during regression testing.

But this is not the whole truth. Until now no consideration has been given to the case where errors are introduced at engineering changes. Software developers know very well that testing or maintenance of software products is itself a frequent source of errors. Myers, for example, states that experience has shown that correction of software errors has a high probability (usually from 20 to 50 per cent) of introducing a new error into a program 9. Taking this observation into account, a new variable $w(t_{v-1})$, which represents the way the measured MTTF is influenced by the software errors introduced at the engineering change $EC(t_{v-1})$, is introduced into the model. Thus from equations (3) and (4)

$$x(t_v) = F(t_v, t_{v-1})\,x(t_{v-1}) + w(t_{v-1}) \qquad (5)$$
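The geometric failure probabilities and the factor of equation (4) can be illustrated with a short Python sketch. It is not the authors' implementation: the function names and the parameter values $\pi = 0.05$ and $\alpha = 0.1$ are hypothetical, chosen only so that the denominator of equation (4) stays positive. The sketch propagates the noise-free part of equation (5), i.e. equation (3), from an assumed initial MTTF.

```python
def failure_probability(i, pi_, alpha):
    """p_i = pi * (1 - alpha)**(i - 1), the geometrically decreasing terms."""
    return pi_ * (1.0 - alpha) ** (i - 1)


def transition_factor(v, pi_, alpha):
    """F(t_v, t_{v-1}) as written in equation (4)."""
    numerator = alpha - pi_ * (1.0 - (1.0 - alpha) ** (v - 1))
    denominator = alpha - pi_ * (1.0 - (1.0 - alpha) ** v)
    if denominator <= 0.0:
        # Happens when p_1 + ... + p_v >= 1, which the model excludes.
        raise ValueError("pi and alpha give a non-positive denominator")
    return numerator / denominator


def noise_free_mttf(x0, m, pi_, alpha):
    """Apply equation (3) for v = 1..m, starting from x(t_0) = x0."""
    values = [x0]
    for v in range(1, m + 1):
        values.append(transition_factor(v, pi_, alpha) * values[-1])
    return values


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    for v, x in enumerate(noise_free_mttf(10.0, 5, pi_=0.05, alpha=0.1)):
        print(v, round(x, 4))
```

With these assumed parameters every factor exceeds one, so the noise-free MTTF grows at each engineering change, and each value agrees with the closed form of equation (2).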
The state variable $x(t_0)$ at time $t_0$ is a Gaussian random variable with mean value $\hat{x}(t_0)$ and covariance $P_0$, which can be estimated. $w(t)$ is a Gaussian, white noise stochastic process; $w(t_v)$ at each time $t_v$ has zero mean and covariance $Q(t_v)$, which is calculated by the formula 4

$$Q(t_v) = q(t_v)\,(t_{v+1} - t_v)$$

After some investigations it was found that for

$$F(t_{v+1}, t_v)\,q(t_v) = 0.4\,\hat{x}(t_0)$$

the prediction algorithms behaved satisfactorily.

Equation (5) is one of the two basic equations in the proposed state space model. The second one represents the process of MTTF measurements during regression testing and has the form

$$z(t_{v-1}) = x(t_{v-1}) + y(t_{v-1}) \qquad (6)$$
Variable $z(t_{v-1})$ represents the actual measurement of the MTTF at $t_{v-1}$ and variable $y(t_{v-1})$ corresponds to the so-called measurement noise. In the approach here the measurement noise is the discrepancy between the real MTTF and the measured one. $y(t)$ is a Gaussian, white noise stochastic process; $y(t_v)$ at each time $t_v$ has zero mean and covariance equal to one. $y(t)$ and $w(t)$ are stochastic processes independent of each other.

Now the prediction problem, stated in broad terms, is the computation of the conditional mean values $E[x(t_i) \mid z(t_0), z(t_1), \dots, z(t_v)]$ for $i = v+1, v+2, \dots, v+N$, where $x(t_i)$ and $z(t_i)$ denote the real and measured values of the MTTF, respectively. For convenience let $E[x(t_i) \mid z(t_0), z(t_1), \dots, z(t_v)]$ be denoted by $\hat{x}(t_i/t_v)$. The measurements $z(t_0), z(t_1), \dots, z(t_v)$ are used to predict the state variable $x$, $N$ time units ahead of the last time of measurement. The prediction equation used for the computation of $\hat{x}(t_{v+N}/t_v)$ is given by the formula

$$\hat{x}(t_{v+N}/t_v) = F(t_{v+N}, t_v)\,\hat{x}(t_v/t_v)$$

where

$$F(t_{v+N}, t_v) = F(t_{v+N}, t_{v+N-1})\,F(t_{v+N-1}, t_{v+N-2}) \cdots F(t_{v+1}, t_v)$$

$F(t_i, t_{i-1})$ is the factor defined in equation (4), and $\hat{x}(t_v/t_v)$ denotes the estimate of the MTTF at time $t_v$, given the measurements $z(t_0), z(t_1), \dots, z(t_v)$. The estimation error for $\hat{x}(t_{v+N}/t_v)$ is given by the formula

$$P(t_{v+N}/t_v) = F(t_{v+N}, t_v)\,P(t_v/t_v)\,F'(t_{v+N}, t_v) + \sum_{j=v}^{v+N-1} F(t_{v+N}, t_{j+1})\,Q(t_j)\,F'(t_{v+N}, t_{j+1})$$

where $P(t_v/t_v)$ denotes the error covariance 10.

This completes the presentation of the model and the prediction methodology. As the prediction algorithms can be easily found in any textbook on optimal estimation 4,10, discussion of the estimation algorithms is skipped and instead some interesting results from an attempt to implement and test the methodology are presented.
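The two model equations and the prediction formulas can be sketched in Python as a scalar Kalman filter. This is a textbook recursion written for illustration, not the authors' code. The measurement noise covariance of one comes from the text; a constant process noise covariance `q`, the values of `pi_` and `alpha`, measurements taken at $t_1, \dots, t_m$ only, and all function names are assumptions made here. The covariance recursion in `predict_ahead` is the iterative form of the $N$-step error formula.

```python
import random


def transition_factor(v, pi_, alpha):
    """F(t_v, t_{v-1}) from equation (4)."""
    numerator = alpha - pi_ * (1.0 - (1.0 - alpha) ** (v - 1))
    denominator = alpha - pi_ * (1.0 - (1.0 - alpha) ** v)
    return numerator / denominator


def simulate(x0, m, pi_, alpha, q, r, rng):
    """Draw states from equation (5) and measurements from equation (6)."""
    states, measurements = [], []
    x = x0
    for v in range(1, m + 1):
        x = transition_factor(v, pi_, alpha) * x + rng.gauss(0.0, q ** 0.5)
        states.append(x)
        measurements.append(x + rng.gauss(0.0, r ** 0.5))
    return states, measurements


def kalman_filter(measurements, x0_hat, p0, pi_, alpha, q, r=1.0):
    """Return the filtered pairs (x_hat(t_v/t_v), P(t_v/t_v)) for v = 1..m."""
    x_hat, p = x0_hat, p0
    estimates = []
    for v, z in enumerate(measurements, start=1):
        f = transition_factor(v, pi_, alpha)
        x_pred = f * x_hat              # time update of the estimate
        p_pred = f * p * f + q          # time update of the covariance
        gain = p_pred / (p_pred + r)    # scalar Kalman gain
        x_hat = x_pred + gain * (z - x_pred)
        p = (1.0 - gain) * p_pred
        estimates.append((x_hat, p))
    return estimates


def predict_ahead(x_hat, p, v, n, pi_, alpha, q):
    """N-step prediction x_hat(t_{v+N}/t_v) with its error covariance."""
    for j in range(v + 1, v + n + 1):
        f = transition_factor(j, pi_, alpha)
        x_hat = f * x_hat
        p = f * p * f + q
    return x_hat, p


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    states, zs = simulate(10.0, 20, pi_=0.05, alpha=0.1, q=0.5, r=1.0, rng=rng)
    estimates = kalman_filter(zs, x0_hat=10.0, p0=1.0, pi_=0.05, alpha=0.1, q=0.5)
    x_hat, p = estimates[-1]
    print(predict_ahead(x_hat, p, v=len(zs), n=5, pi_=0.05, alpha=0.1, q=0.5))
```

Because every factor $F$ exceeds one, the predicted MTTF grows with the horizon $N$, and so does the error covariance, which reflects the decreasing confidence of predictions made further ahead.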
TESTING RESULTS
The model was implemented using standard algorithms for optimal estimation (Kalman filtering algorithms). For test data, use was made of artificial data generated from a random number generator and real data found in the literature. The real data 3 are two lists of observed MTTFs, containing 85 measurements in the first list and 135 in the second.

Running the developed system with the artificial data gave some surprising results. The estimated MTTFs were close to the measured values and the estimation error was around zero. The prediction algorithms and the model cooperated perfectly with each other. These results are shown in the Appendix, which contains three Tables and three Figures. Table 1 shows the measured and estimated MTTF for the artificial data. This is represented graphically in Figure 1. As both the Table and the Figure indicate, the estimation error is nearly zero.

The same is also true for the real data mentioned before. Tables 2 and 3 represent the measured and estimated values of the MTTF, shown graphically in Figures 2 and 3. For both lists of real data, the prediction methodology reflects well the evolution of the MTTF.

CONCLUSIONS
Optimal estimation has been successfully applied in many areas of physical systems 4. In this paper it has been shown that it can be successfully applied also to software systems. The authors believe that this result opens up a new application domain for optimal estimation, which should be explored further to make software systems more secure than they are today.

Finally, mention should be made of two main open problems that the authors intend to consider in the near future. It is well known that the main advantage of optimal estimation is that, if the mathematical model of a physical system is present, then in most cases there are appropriate prediction algorithms. The only restriction for their use is that the model must fulfil a set of assumptions about the statistical behaviour of the stochastic processes and the random variables. The statistical properties of the model as described previously assure that the model cooperates harmoniously with the algorithms. Nevertheless, effort is needed to determine the best suited prediction algorithms. Finally, work is necessary to check the performance of the prediction algorithms when they are applied to real-time systems and in particular to safety-critical ones.

ACKNOWLEDGEMENT
Many thanks to Mr P Soupos, who implemented the algorithms and corrected (some) mistakes. The authors have benefited greatly from discussions with him.

REFERENCES
1 Hoecker, H, Itzfeldt, W D, Schmidt, M and Timm, M 'Comparative description of software quality measures' GMD-Studien Nr. 81 GMD, St. Augustin, FRG (March 1984)
2 Dale, C J 'Software reliability evaluation methods' British Aerospace Dynamics Group Report ST26750 (1982)
3 Abdel-Ghaly, A, Chan, P and Littlewood, B 'Evaluation of competing software reliability predictions' IEEE Trans. Soft. Eng. Vol 12 No 9 (September 1986)
4 Gelb, A (ed) Applied optimal estimation MIT Press, Cambridge, MA, USA (1984)
5 Borning, A 'Computer system reliability and nuclear war' Commun. ACM Vol 30 No 2 (February 1987) pp 112-131
6 Currit, P A, Dyer, M and Mills, H D 'Certifying the reliability of software' IEEE Trans. Soft. Eng. Vol 12 No 1 (January 1986) pp 3-11
7 Littlewood, B 'What makes a reliable program few bugs or a small failure rate?' in Proc. National Computer Conf. (1980) pp 707-713
8 Musa, J 'The measurement and management of software reliability' Proc. IEEE Vol 68 No 9 (September 1980) pp 1131-1143
9 Myers, G J Software reliability: principles and practices John Wiley, New York, NY, USA (1976)
10 Anderson, B and Moore, J Optimal filtering Prentice Hall, Englewood Cliffs, NJ, USA (1979)
Iannino, A et al. 'Criteria for software reliability model comparisons' IEEE Trans. Soft. Eng. Vol 10 No 6 (November 1984) pp 687-691
Jelinski, Z and Moranda, P B 'Software reliability research' in Freiburger, W (ed) Statistical computer performance evaluation Academic Press, New York, NY, USA (1972) pp 465-484
APPENDIX
Table 1. Test results received from artificial data (read left to right)
estim-MTTF
8.65986685 36.220374 52.8593494 68.2005581 80.0600021 92.5175993 91.61 :~015 99.2821765 109.924696 99.8I~4082 106.4114111 114. 119418 117.552773 I 11.4'~3576 118.3 J,5013 I I 1.76657 II 1.2,1018 I21.57354I 122.9'91t845 124.~92289 118.3,11t645 123.7:,7584 116.390614 114.8r:,1592 120.120456 II6.8~2784 113.712897 107.2",16999
8.57567981 36.2449383 52.9977928 68.2280454 80.0894178 92.3921751 91.5038509 99.3253918 1119.94111168 99.7061484 1116.419326 114.143682 117.641205 I I 1.492256 118.211381t9 I 11.7113123 I 11.275721 I21.527494 122.957774 124.684137 118.319651 123.82482 116.47548 114.844814 120.185883 116.788298 113.9157112 1(17.391/317
MTI'F 21.0667309 42.0366198 62.519839 71.7778323 82.0054315 92.55684112 98.980635 106.231874 107.547189 95.7957817 107.103101 114.415934 115.07431tl 116.632514 113.055991 108.136152 118.705253 120.948262 121.655733 121.273361 127.975931 119.59.2(184 115.692323 116.349612 1(19.845557 112.873615 110.554038 106.034555
MTTF
estim-MTTF
MTTF
estim-MTTF
MTTF
estim-MTTF
4.79 5.54 6,93 1.7 4.69 18.08 5,96 22.3 4,095 3.63 2.77 2.13 2,98 24.4 10.34 5.65 9.27 1.81 31.54 20,37 4.9 0.85 18.66 43.22 54.9 27.16 7.25 10.9 9.94
4.78521956 5.53850133 6.933%507 1.69956709 4.69861326 18.0690868 5.95687126 22.2821708 4.0493858 3.62916371 2.77338302 2.13610217 2.99304571 24.3818406 10.3311683 5.64897641 9.26511973 1.81536044 31.516008 20.3585077 4.90069895 0.866797293 18.6435293 43.1911993 54.8554237 27.1654226 7.27773236 10.9288132 9.94198676
2.66 10.34 5.97 1.17 11.74 1.35 7.57 4.37 5.35 5.22 13 16.2 18,74 0.05 24.41 11.19 44.62 14.85 21.25 14.81 5.93 28.36 4.9 14.18 15,2 21.75 19.63 2.45
2.62515458 10,336768 5.97176773 1.17063791 11.733161 1.36715975 7.56849668 4.38815305 5.34873751 5.21843502 12.9898063 16.1859711 18.7242915 0.0743243637 24.3959619 11.1844751 44.5847217 14.8369922 21.1603562 14.8155425 5.92897365 28.3325618 4.91371791 14.2089564 15.2395791 21.7554058 19.617626 2.45846212
2.77 9.49 1.17 12.74 6.93 2.77 4.37 3.4 2.77 6.13 8.21 16.01 6.18 1.49 4.6 4.37 7.14 7.57 8.84 5.59 17.69 2.13 14.37 10.23 32.81 35.05 39.79 11.94
2.77253761 9.49288527 1.17535078 12.7285188 6.93519903 2.76863153 4.37331169 3,40103466 2.77261537 6.1291194 8.21482234 16.0102224 6.19255838 1.48858727 4.61978329 4.37680982 7.17739593 7.57725879 8.8523023 5.59921047 17.6782634 2.15615366 14.360563 10.2339719 32.7924653 35.0367327 39.7698684 11.9305375
Table 3. Test results received from real data (in hundredths of seconds)
BIBLIOGRAPHY
MTTF
Table 2. Test results received from real data (in ten thousandths of seconds)
estim-MTTF
MTTF
21.00166557 28.0688237 42.1322626 49.3291965 62.645023 65.774997 71.8377335 75.3415672 81.8921789 88.4155911 92.54268118 94.2148584 99.1546243 101.876859 106.169931 103.594136 107.614422 104.378348 95.7753567 102.311381 107.032228 112.580274 114.545159 115.016108 115.052083 112.2206 116.527355 12/I.696464 113.106435 112.325132 108.176129 109.596645 118.647617 118.898893 120.753177 125.169823 121.722773 122.175994 121.235{183 120.693158 127.811768 126.716803 119.604632 120.737881 115.716433 114.4(t7657 116.138442 121.448628 1119.872745 107.524605 112.586948 112.330048 110.609921 107.020571 106.018138 109.0119599
estim-MT1T 28,0212245 49.3084984 66,0078103 75.5243256 88.3095361 94.2257842 101.886411 1113.559354 104.407689 102.31282 112.597275 115.085871 112.487397 120.881476 112.368529 1119.593498 119.04505l 125.302012 122.234526 1211.825078 126.763827 120.665335 114.444519 121.599088 107.414921 112.161005 107.119869 109.123692
MTTF
estim-MTTF
MTTF
estim-MTTF
MTTF
estim-MTTF
.03 .81 .02 .15 .77 .88 .26 .55 4.22 11.46 .36 .08 1.76 3.0 4.52 1.93 8.16 .21 3.57 .31 11.0 3.65 .1 379 81 5,29 8.28 2.96 17.83 7.07 7.25 14.61 2.61 14.35 1.118 12.47 8.75 18.97 4.46 9.48 .75 1.0 3.71 33.21 54.85
.0297032641 .818029972 .020560286 .160329164 .767677899 .882197135 .27000034 .5767088412 4.18553287 11.348807 .358522039 .079219844 1.74928841 3.01509437 4.50134679 1.931146927 8.08767631 .223612399 3.54823365 .3311066533 .0729874395 3.64647678 .152910427 3.8(1421813 8.03315113 5.26754335 8.21462268 2.97515258 17.761/1553 7.119695295 7.25332338 14.7534367 2.5863911157 14.2949969 1.0833363 12.6496764 8.733117823 18.85511199 4.45418931 9.48328669 .745821198 1.52550868 3.77761542 33.482223 54.3761421
.3 1.15 .91 1.38 .24 6.7 1.14 2.42 1.8 6 ,04 2.27 .58 .97 2.55 .06 13.51 2.33 1.93 3.69 2.32 12.22 .16 .44 2.9 2.81 10.11 17.55 8.6 .33 23.23 8.43 18.0 .3 0.0 9.43 2.45 4.47 1.12 10.82 4.82 .1 7.9 10.45 11.6
.296614615 1.14901182 .901327742 1.36815808 .245470911 6.64321447 1.13152019 2.40200161 1.82374632 6.05313398 .043138556 2.24852356 .591504228 .990101201 2.56918639 .0783562555 13,4568869 2.30934838 1.94587841 3.65705728 2.29796854 12.1359427 .159930633 .472988288 2.95033385 2.8340978 10.0914183 17.4070962 8.68981659 .396349777 23.0733514 8.492~1125 17.8488718 .437219457 .0106219879 9.46156869 2.51160483 4.61104332 1.25171081 10.8068938 4.78005323 .113676923 7.85958059 10.6758279 12.0194143
1.13 1.09 1.12 .5 1.08 1.2 3.25 .68 .1 ,15 0.0 .65 4.57 2.63 1.97 .79 1.48 1.34 2.36 7.48 3.3 5.43 5.29 1.29 3.0 1.6 4.45 10.64 9.83 8.68 29.3 .12 8.65 1.43 31.1 7.() 7.29 3.86 9.9 .22 55.119 10.71 61.5 6.48 18.64
1.12496735 1.102559787 1.1186883 .509234132 1.07190651 1.2548494 3.22941077 .697141118 .117038659 .208205172 .000424634 .66573633 4.53100514 2.61393224 1.9759026 .783023155 1.59750735 1.34951386 2.35594552 7.44252481 3.29017894 5.49576492 5.23970059 1.28198962 2.99951437 1.61210109 4.50531574 10.7063535 9.81882179 8.59878999 29.2389501 .2(1208675 8.74019433 1.42026593 30.7951729 7.(12413548 7.24314852 3.86736391 9.81520471 .323803097 54.5967178 10.6061074 60.974063 6.52113954 18.5750861
Figure 1. Graphical representation of Table 1. (a) MTTF (b) estimated MTTF (c) estimation error
Figure 2. Graphical representation of Table 2. (a) MTTF (b) estimated MTTF
Figure 3. Graphical representation of Table 3. (a) MTTF (b) estimated MTTF