Testing digital safety system software with a testability measure based on a software fault tree

Se Do Sohn*, Poong Hyun Seong
Department of Nuclear and Quantum Engineering, Korea Advanced Institute of Science and Technology, 373-1, Kusong-dong, Yusung-gu, Daejeon 305-701, South Korea

Reliability Engineering and System Safety 91 (2006) 44–52, www.elsevier.com/locate/ress

Accepted 27 November 2004; available online 1 February 2005

Abstract

Using predeveloped software, a digital safety system is designed that meets the quality standards of a safety system. To demonstrate the quality, the design process and operating history of the product are reviewed along with configuration management practices. The application software of the safety system is developed in accordance with the planned life cycle. Testing, which is a major phase that takes a significant time in the overall life cycle, can be optimized if the testability of the software can be evaluated. The proposed testability measure of the software is based on the entropy of the importance of basic statements and the failure probability from a software fault tree. To calculate testability, a fault tree is used in the analysis of the source code. With a quantitative measure of testability, testing can be optimized. The proposed testability can also be used to demonstrate whether test cases based on uniform partitions, such as branch coverage criteria, result in homogeneous partitions, which are known to be more effective than random testing. In this paper, the testability measure is calculated for the modules of a nuclear power plant's safety software. The module testing with branch coverage criteria required fewer test cases if the module had higher testability. The result shows that the testability measure can be used to evaluate whether partitions have homogeneous characteristics.
© 2005 Elsevier Ltd. All rights reserved.

Keywords: Digital safety system; Testability; Entropy; Fault tree; Homogeneous partition

1. Introduction

The protection system in a nuclear power plant stops the production of nuclear energy if abnormal conditions occur. With a single failure design criterion, the safety system should withstand the failure of a single component and have on-line testing capability. Such features require the system to be built in multiple redundant channels. The safety systems were designed with analog devices, but these systems are being replaced with digital systems and new issues are arising. In the analog system, each function was implemented with separate hardware components. The digital system is implemented by sharing hardware and software components among the functions. With this sharing, the digital safety system is more vulnerable to common mode failures.

* Corresponding author. Tel.: +82 42 868 4330; fax: +82 42 861 1488.
doi:10.1016/j.ress.2004.11.015

Thus, for the digital protection system, verification is needed that the plant is safe against common mode failures. The digital system is designed using commercial-grade software and hardware. Verifying the quality of these commercial-grade products is required for nuclear safety systems. This process is defined as a commercial dedication process, and detailed guidelines for the process have been provided [1]. In addition to having seismic and environmental qualifications, the digital system is also required to have immunity from electromagnetic interference and radio frequency interference. Software design for a digital safety system, along with its verification and validation, is a major area of research. Several regulatory guidelines have been issued, including guidelines for the software life cycle, configuration management, software requirement specification, software verification and validation, data communications, and software testing. When developing a safety system, software testing takes a significant effort.

The testing is performed in a very specific manner for each system. Although various testing methods exist, there are no unique criteria to prove whether enough testing has been performed. This paper describes a software testability measure based on the importance of each statement and the failure probability from a software fault tree. The following section describes the design of the safety system with the use of predeveloped software. The proposed testability measure and its application to modules of the safety software are described in Section 3.

2. Safety system with commercial-grade equipment

The Core Protection Calculator System (CPCS) is a safety system that calculates the departure from nucleate boiling ratio and the local power density (LPD) based on the conditions of the reactor coolant system. The system consists of four redundant channels from the sensors to the processing units, with the exception of the control element assembly (CEA) position signals, which are configured in two channels. The CPCS has a four-channel configuration that shares the CEA position signals using data communication. Serial data communication is the main method of sending safety information between redundant channels. A serial data link with a fiber optic modem isolates the shared signals, and the data communication is unidirectional. Data sharing between redundant channels used to be conducted with isolation devices such as a transformer or an analog fiber optic modem, and it was vulnerable to noise signals. Isolation was performed for each shared signal, thereby limiting the signal sharing. The data link, however, has the capability of overcoming the noise by error detection and correction checks; no accuracy is lost, and one data link can share multiple signals. By using a serial data link to share data, each channel can access the redundant CEA signals. A human–machine interface is implemented with a color flat panel display and a network interface that connects all the processors in a channel. Besides enhancing data communication between devices, the network connection for the display device simplifies the interfaces within the system. With a high-speed data network, more data on the status of the system is fed to the operator. The information, including trend charts for important variables, is provided in a user-friendly manner. The system is implemented with commercial-grade equipment such as a programmable logic controller, which has been used extensively in the protection systems of fossil fuel plants.

2.1. Software design

Software development is based on the model of a waterfall life cycle; the actual development process might adopt something like a spiral model.

The development cycle fits well with a waterfall model because the software design process is based on documentation. Application software is developed using a functional graphical language. The software development environment contains predeveloped program elements. These elements are library routines of commonly used functions. Application software is built using the predeveloped function blocks and database elements of a function chart. Software is designed by selecting appropriate function blocks and making logical connections among these predeveloped function blocks. Because the function blocks have already been certified through previous design and operating experience, the software testing only needs to focus on the intended functions of the safety system software. Thus, module level software testing is either eliminated or can be verified easily. The execution sequence is fixed with a top-down, left-to-right execution of the function blocks in the function chart. In the program, the processing sequence has a fixed order: a module reads inputs from database elements and other modules, executes its body, and sends outputs to database elements and other modules. With the program's fixed execution sequence, its deterministic characteristic can be easily demonstrated. In the case of the CPCS, the algorithm is very calculation intensive; designing exclusively with basic function blocks makes the software very bulky. Custom function blocks are developed for the special functions, and these are used together with the predeveloped function blocks to make the application software. Part of the application program in a function chart is shown in Fig. 1. The function blocks are represented as rectangular boxes with inputs on the left side and outputs on the right side. The PCPGM, CONTRM, and FUNCM blocks are control structures used to specify the execution cycle times, priorities, and similar settings. The application program blocks are FLOWSENS and FLOWMOD1. The numbers on each function block are simply the terminal numbers of its inputs and outputs. A custom function block is programmed as a block entity to perform a specific function with specified inputs and outputs. As noted at the bottom of Fig. 1, the execution order runs from top to bottom and from left to right. All the inputs and outputs of a function block are shown on the block, and they should be connected to other function blocks. The custom function blocks are treated in the same way as the predeveloped function blocks. Verification and validation are performed throughout the development life cycle. The verification process is performed for each design phase to confirm whether the output of each upstream design phase is well implemented in the downstream design. Besides verification and validation, a preliminary hazard analysis for the software is performed. The design information and the failure modes and effects analysis for the system are used in this analysis. Any failure in a system component, including an input or output failure and abnormal behavior of the processor module, is considered for its effect on the performance of the system. The identified hazard should be reflected in the design so that either the hazard is removed or a method is implemented to prevent the hazard conditions from being reached.

Fig. 1. Program in function chart.
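The fixed execution order of the function chart can be illustrated with a small sketch. This is a minimal model under assumed behavior: the block names FLOWSENS and FLOWMOD1 come from Fig. 1, but their internal logic and the database signal names here are invented for illustration. The sketch only shows the pattern of reading inputs, executing the block body, and writing outputs in a fixed top-down, left-to-right order on every scan.

```python
# Minimal sketch of a fixed-order function-chart scan cycle (illustrative only).
# Each block reads its inputs from a shared signal database, executes its body,
# and writes its outputs back; the blocks run in a fixed top-down, left-to-right
# order, so every scan cycle is deterministic.

database = {"FLOW_RAW": 100.0, "FLOW_FILTERED": 0.0, "FLOW_ALARM": False}

def flowsens(db):
    # Hypothetical signal-conditioning block: read the raw input, write a filtered value.
    db["FLOW_FILTERED"] = 0.9 * db["FLOW_FILTERED"] + 0.1 * db["FLOW_RAW"]

def flowmod1(db):
    # Hypothetical monitoring block: read the filtered value, write an alarm flag.
    db["FLOW_ALARM"] = db["FLOW_FILTERED"] > 90.0

EXECUTION_ORDER = [flowsens, flowmod1]   # fixed at design time

def scan_cycle(db):
    for block in EXECUTION_ORDER:
        block(db)

for _ in range(3):                       # three consecutive scan cycles
    scan_cycle(database)
print(database)
```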

2.2. Use of predeveloped software

As with the operating system, the predeveloped function blocks are commercial software. Commercial software is developed for a general purpose and is used in a variety of applications. As such, there exists commercial software of established quality that has been in operation for a long period. This kind of large-volume usage of commercial software provides a reliability that cannot be matched by software customized for a specific purpose. Furthermore, because the market for real-time software and hardware is mature, many reliable products are available. One of the problems with these commercial products is that the advancement of technology is so fast that long-term support for the product is not easily obtained.

Some companies provide a minimum obsolescence time to customers. This kind of long-term support is essential for products to be used in a nuclear power plant. Commercial off-the-shelf software to be used in applications related to nuclear safety must go through the commercial-grade dedication process described elsewhere [1]. If the commercial off-the-shelf software used in a nuclear safety system changes after the commercial-grade dedication, the changes should be reviewed to determine their impact on the system, and the reported errors should be evaluated for new releases. The impact of the changes on the application should also be determined. When the impact has been identified, tests should be conducted and verification should be given that the changes to the commercial off-the-shelf software were performed in accordance with acceptable nuclear standards. The commercial dedication is performed with an evaluation of the vendor's software development life cycle and an evaluation of the configuration management, including error corrections.

If any deficiencies do not meet the required quality, the process is supplemented by the operating experience of the product. In the CPCS design, an evaluation was performed on the design life cycle and the operating experience. The existing documents were reviewed to verify the existence of the following: (1) well-defined system software requirements, (2) a comprehensive software development methodology, (3) a comprehensive test procedure, (4) a strict configuration management and maintenance procedure, and (5) complete and comprehensive documentation [1]. After verifying that the remaining discrepancies and associated open issues do not impact nuclear applications, the software can be used in safety applications. The operating history of each product is evaluated by reviewing the hardware and software problems that have arisen since the product's release and how the problems have been resolved. Either the errors are corrected or verification is given that the specific software and hardware modules do not contain these reported errors.

3. Software testing

In software development, testing requires large resources. Software testing is classified as either structural testing or functional testing. Structural testing is based on the structure of the software and depends on the data flow or control flow of the software. Functional testing is based on the functional requirements of the software and is used for integrated software testing. Testing can also be classified as debug testing or operational testing, depending on the main purpose of the testing. Debug testing is performed to find more faults within limited resources, while operational testing is oriented more towards reliability estimation. Testing is also classified as partition testing or random testing, depending on the method of selecting the test cases. Comparisons have been made of the effectiveness of partition testing and random testing. Although some observations show that random testing is more effective than partition testing, many researchers say that partition testing can be made more effective [2]. The selection of homogeneous test cases from each partition was conceived to make partition testing more effective than random testing [3]. In reliability calculations, the effect of the code coverage of the testing has been considered. Chen proposed to adjust software reliability using the results of the coverage of the software testing [4]. Software reliability is based on the periods during which the software executes without failure. A test case is considered ineffective if it does not increase the code coverage and does not cause the program to fail on execution; as a result, ineffective test cases lead to an overestimation of reliability. Comparisons have also been made between operational testing and debug testing with respect to reliability improvements in software.

Variation is expected in the selection of testing methods not only between different projects but also during the evolution of a single program [5]; in addition, the effectiveness of a testing method depends on many other factors such as the organization, the stage of a project, and differences between projects [6]. In the Federal Aviation Administration, software is classified as level A, B, or C, depending on its criticality. Level A software should satisfy modified condition and decision coverage based on the software structure. Structural coverage analysis is performed to reveal whether the code structure has been executed with requirement-based test cases. This structural testing can indicate whether the code structure has been verified to the degree required for the level of the application software; whether it produces any unintended functions; and whether it has the thoroughness of requirement-based testing [7]. By selecting test cases on the basis of modified condition and decision coverage, every possible branch condition is covered. Similarly, structural testing and functional testing are both performed for the CPCS software. For structural testing, the branch coverage criteria are applied, and every branch is covered for the software modules of the CPCS. Partitions based on the specification or the software itself divide the testing space uniformly but not homogeneously. As a result, test cases are selected uniformly from each partition instead of being selected on the basis of proportional partitions, which are known to be safe [13].

3.1. Measure of software testability based on a fault tree

Software testability is defined as the degree to which a system or component facilitates the establishment of test criteria and the performance of tests to determine whether those criteria have been met. In this definition, testability is focused on the ease with which a test may be performed and evaluated. Freedman proposed a testability measure based on the observability and controllability of software [8]. The observability of a program is defined as the ease of determining whether a specified input affects the output; controllability is defined as the ease of producing a specified output from a specified input. Programs can be made more observable by introducing more outputs based on implicit states; they can be made more controllable by restricting the output range. Voas defines software testability as the probability that a piece of software will fail on its next execution during testing if the software includes a fault [9]. In such a definition, software testability is predicted by sensitivity analysis that seeds faults and observes their propagation into failures. As design heuristics for testability, Voas suggests the following methods: specification decomposition to reduce error cancellation, minimization of variable reuse, and an increase of exclusive output variables for testing.
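Voas's definition can be illustrated with a small sensitivity-style experiment. This is a sketch only, not the authors' procedure or Voas's tooling: a hypothetical toy module is seeded with a single hypothetical fault, random inputs are executed, and testability is estimated as the fraction of executions on which the seeded fault propagates to a different output.

```python
import random

# Illustrative estimate of Voas-style testability: the probability that the
# software fails on an execution during testing, given that it contains a fault.
# The module and the seeded fault below are hypothetical.

def module(x):
    return max(x, 0.0) * 2.0             # correct toy module

def module_with_fault(x):
    return max(x, 0.1) * 2.0             # seeded fault in the lower limit

random.seed(1)
runs = 100_000
failures = 0
for _ in range(runs):
    x = random.uniform(-1.0, 1.0)
    if module(x) != module_with_fault(x):   # fault propagated to the output
        failures += 1

print(f"estimated testability: {failures / runs:.3f}")
```

A module whose internal faults rarely reach the output would score low under this definition, which is the same intuition that the fault-tree-based measure below quantifies.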

Software fault tree analysis has been used in the safety analysis of software to investigate its safety aspects. Fault tree templates for software structures such as if-then-else, function call, and while statements have been described elsewhere [10,11]. For Wolsong Nuclear units 1 and 2, software fault tree analysis was used for the safety analysis of software hazards. Starting with the top events identified in the system level analysis, detailed software fault trees were constructed [12]. Although no software faults or programming errors were identified in the construction and analysis of the fault trees, the examination of the cut sets identified global functional block failures as a potential source of common mode failures, and a limited diagnostic check was incorporated into the system. Consequently, safety analysis of the software is performed for the design and verification of safety critical software in nuclear power plants.

With the Voas [9] testability definition, testability can be calculated from a fault tree evaluation. The probability of an output failure from a statement fault can be evaluated by fault tree analysis. The top event of an output statement failure is traced back through the statements that affect the output statement until the basic statement faults are reached. The result of the fault tree indicates that the probability of output failure is affected by the fault probability of each basic statement. With test cases selected on coverage criteria such as "define-use" or "branch", each statement is executed by at least one test case. If a location rarely affects the output, the detection of a fault at that location is difficult; this situation results in low testability. If the fault characteristic of a partition is inhomogeneous, the partition should be divided to make the partitions homogeneous. Thus, when test cases are selected evenly with coverage criteria, faults can be detected more easily if the statements have an even probability of affecting the output. The evenness of probability is the degree to which the probability is evenly distributed over the states. Entropy, which can be used to measure the evenness of the probability that each statement fault will lead to output failure, has been widely used as a quantitative measure of uncertainty in many areas, including thermodynamics, information theory, biology, decision theory, and sociology. Entropy represents uncertainty and randomness: greater entropy means more randomness, and entropy is largest when the states are equally probable [15]. With this similarity, entropy can be used to measure how evenly the statement faults contribute to output failure. In information theory, entropy is a measure of uncertainty; it is defined as follows:

H(X) = -Σ_i Pr(x_i) log(Pr(x_i))    (1)

where H(X) is the information entropy of the random variable X, which can take on the values x_1, ..., x_n with respective probabilities Pr(x_1), ..., Pr(x_n). Eq. (1) can be interpreted as a quantitative measure of the amount of uncertainty associated with the probability distribution Pr(x). The top event probability of the fault tree is evaluated using Boolean algebra. The entropy calculation is based on the importance of each basic statement. The importance of a statement refers to the extent of its contribution to a failure when a fault occurs in that statement. Wolfe describes several methods of computing the importance of basic events in a fault tree [16]. In this study, the Vesely–Fussell measure of importance is used. As the number of states increases, the entropy value increases. Thus the entropy increase due to the number of states is compensated for by dividing by the logarithm of the number of basic statements. This results in a modified entropy that measures the evenness of the contribution of a basic statement regardless of the number of basic statements. A basic statement refers to a statement that is unaffected by other statements, such as a statement that defines variables. The variables of a basic statement refer to the variables used in the basic statement. The modified entropy, ENT, is calculated as follows:

ENT = -(1/log N) Σ_i Ī(x_i) log(Ī(x_i))    (2)

where Ī(x_i) is the normalized importance of statement x_i and N is the number of basic statements or variables. The Vesely–Fussell measure of importance for a basic statement is calculated as follows:

I(x) = Σ_c Pr(x_c) / Pr_TE    (3)

where Pr(x_c) is the probability of cut set c containing x and Pr_TE is the failure probability of the top event. The normalized importance is calculated as follows:

Ī(x_i) = I(x_i) / Σ_i I(x_i)    (4)

Detecting faults in a module is easier if the output failure rate of the module is higher. The failure rate increases, however, as the number of basic statements in the module increases. The module's normalized failure rate, λ_fail, which compensates for the number of basic statements, is calculated as follows:

λ_fail = Pr_TE / (N × Pr_e)    (6)

where Pr_TE is the failure probability of the top event, Pr_e is the failure rate of the basic statements or variables, and N is the number of basic statements or variables. Testability can be calculated as the product of the failure rate and the entropy. The testability, Pr_TEST, is calculated as follows:

Pr_TEST = λ_fail × ENT    (7)
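A minimal sketch of the calculation in Eqs. (2)-(7) follows. It is illustrative rather than the authors' tool: the minimal cut sets of the software fault tree are assumed to be given for a hypothetical module, every basic statement is assigned the same fault probability Pr_e = 0.001 (the value assumed for the VOPT analysis in Section 3.2), and the rare-event approximation is used for the top event probability instead of full Boolean evaluation.

```python
import math

# Sketch of the testability calculation of Eqs. (2)-(7) for a hypothetical module.
PR_E = 0.001                                 # fault probability of a basic statement
cut_sets = [{"s1"}, {"s2"}, {"s3", "s4"}]    # assumed minimal cut sets of the fault tree
statements = sorted(set().union(*cut_sets))  # basic statements, N of them
N = len(statements)

def pr(cut_set):
    # Probability of a cut set: product of its basic-statement fault probabilities.
    return PR_E ** len(cut_set)

pr_te = sum(pr(c) for c in cut_sets)         # top event probability (rare-event approx.)

# Vesely-Fussell importance, Eq. (3): contribution of the cut sets containing x.
importance = {x: sum(pr(c) for c in cut_sets if x in c) / pr_te for x in statements}

# Normalized importance, Eq. (4).
total = sum(importance.values())
norm_importance = {x: i / total for x, i in importance.items()}

# Modified entropy, Eq. (2): evenness of the normalized importances.
ent = -sum(p * math.log(p) for p in norm_importance.values() if p > 0) / math.log(N)

lam_fail = pr_te / (N * PR_E)                # normalized output failure rate, Eq. (6)
pr_test = lam_fail * ent                     # testability, Eq. (7)

print(f"Pr_TE={pr_te:.6f}  ENT={ent:.3f}  lambda_fail={lam_fail:.3f}  Pr_TEST={pr_test:.3f}")
```

Exact Boolean evaluation of the top event would change the numbers slightly, but the structure of the calculation is the same.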

The proposed testability measure combines the top event (output) failure probability and the entropy of the importance of the basic statements [14]. Thus, it considers the output effect of a fault in each statement of a module. A module with high testability can be tested with test cases that are selected uniformly.

If the testability of a certain module is lower than that of other modules, a statement fault might not appear as an output failure, and the addition of special output variables could be necessary. At a minimum, modules with low testability need more test cases than modules with high testability under a uniform test case selection method. Software testing based on coverage criteria such as code coverage, branch coverage, and data flow coverage produces partitions that execute every statement, branch, and data definition. If the fault characteristic of a partition is inhomogeneous, the partition should be divided to make the partitions homogeneous. The proposed testability can evaluate the effectiveness of testing based on uniform partitions by evaluating the homogeneity of the partitions.

3.2. Testing and the testability measure

A test plan describes the approaches, test requirements, test management, the procedures for hardware control, and the qualification and use of software tools. CPCS software is classified as safety critical and is tested at the module level, along with the unit level and integration level. Besides being tested by these dynamic types of testing, the software is formally reviewed. For each test, the test cases and test results are documented and put under configuration management control. Each module is tested with test cases that are designed to execute every branch. The coverage is checked, and any branch not executed is either justified or covered by new test cases designed to execute it. After successful module testing, modules are combined together and tested in units. A unit test verifies the initialization capability for the entire input space of the software; dynamic test cases that vary with time are then examined. Test cases that simulate the anticipated operational occurrences of the plant are exercised, and the output results and response times are compared to the expected results and response times [17]. Software testing is complicated because neither the control flow nor the data flow is sequential. The software design can be simplified with sequential control flow, though the flexibility of the design is sacrificed to some degree. This kind of flexibility restriction can be found in the guidelines for software languages; the guidelines recommend using only safe subsets of language features [18]. The function chart is widely used in the programming of PLCs. For the function chart, the control flow is fixed: its execution sequence for the function blocks is from top to bottom and from left to right. The data flow is also fixed: data flow between the modules of the function blocks is only allowed if the connection is made between specific points of the function blocks. With this fixed control flow and data flow, module testing is performed to verify the functions performed by the individual function blocks. Each module is tested with branch coverage criteria. The test cases are prepared to execute every branch in a module, and module coverage is verified with a tool that checks whether all branches have been executed.
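The branch coverage check can be sketched as follows. The module, its instrumentation, and the test cases are hypothetical and only stand in for the function-chart modules and the dedicated coverage tool used for the CPCS; the sketch shows the idea of preparing test cases that execute every branch and then verifying that none was missed.

```python
# Illustrative branch-coverage check for a hypothetical toy module.
executed = set()

def limiter(x, setpoint):
    if x > setpoint:                 # branches B1 (taken) / B2 (not taken)
        executed.add("B1")
        return setpoint
    executed.add("B2")
    if x < 0:                        # branches B3 (taken) / B4 (not taken)
        executed.add("B3")
        return 0.0
    executed.add("B4")
    return x

test_cases = [(120.0, 100.0), (50.0, 100.0), (-5.0, 100.0)]   # chosen to hit every branch
for x, sp in test_cases:
    limiter(x, sp)

all_branches = {"B1", "B2", "B3", "B4"}
missed = all_branches - executed
print("branch coverage complete" if not missed else f"uncovered branches: {missed}")
```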

In branch testing, the testing space is divided uniformly on the basis of the branch spaces. A homogeneous partition of the input space is more efficient than random testing. The homogeneity of branch testing can be verified with the proposed testability. Fault trees were drawn using the templates proposed by Leveson [10] and Cha [11] for two modules of the CPCS software, the Variable Overpower Trip (VOPT) module and the Reactor Power Cutback (RPC) module. The VOPT module monitors the plant power; if the monitored power increases at a rate greater than the pre-defined rate or exceeds the pre-defined maximum value, it generates a signal to stop the power generation by setting the output variable JTRP. Fig. 2 shows part of the fault tree for the VOPT module. A rectangle defines an event as the output of a logic gate, where the AND and OR gates are the fundamental logic gates of the fault tree. A circle represents a basic statement fault. A triangle represents the transfer of an event to another page. An oval represents a condition that should be met for an event. For the VOPT module, the normalized output failure probability, λ_fail, was 0.533; this calculation assumes a basic variable fault rate of 0.001, as shown in the circle notation of INPUT JRPT. The entropy value, ENT, was 0.992, and the testability measure, Pr_TEST, was 0.530. This module was modified to generate an additional output for the current power, PMAX, in addition to JTRP. The testability of this modified module was calculated to be 0.786. The VOPT module was also modified to generate an additional output for the current permissible power level, SPVOPT, in addition to JTRP. The testability of this modified module was calculated as 0.993. The testability calculation results are summarized in Table 1. These calculations show that the addition of an output results in an increase in testability. The VOPT modules were tested with branch coverage criteria and random testing. The branch coverage testing required five test cases to cover all the branches, and these test cases were run over 14 fault-seeded programs. The numbers of faults detected in the VOPT module and the two modified modules were 1, 7, and 7, respectively. For the same modules, testing was performed with 10,000 randomly selected test cases, with the same faults seeded as in the branch coverage testing. Among the 14 fault-seeded programs, the numbers of faults detected were 5, 8, and 13 in the three modules. The results are summarized in Table 2. The results are consistent with the calculated testability: more faults were detected in the modules with higher testability. The RPC module monitors the two RPC event flags coming from the two data links, sets the event flag IRPC, and tracks the time after the RPC event. It resets the event flag IRPC after a pre-defined time elapses or if a mismatch between the two flags continues beyond the allowed duration. For the RPC module, the normalized output failure probability, λ_fail, was 0.175, the entropy, ENT, was 0.708, and the testability measure, Pr_TEST, was 0.124.

Fig. 2. Fault tree for part of the program.

Table 1. Testability for VOPT modules

Module          λ_fail   ENT     Pr_TEST
JTRP            0.533    0.992   0.530
JTRP, PMAX      0.8      0.982   0.786
JTRP, SPVOPT    0.993    1.0     0.993

Table 2. Number of faults detected in VOPT modules

Module          Branch   Random
JTRP            1        5
JTRP, PMAX      7        8
JTRP, SPVOPT    7        13

The RPC module was modified to generate an additional output for the flag of the first data link, ICB1, in addition to IRPC. The testability of this modified module was calculated to be 0.156. The RPC module was also modified to generate an additional output for the time after the event, TCB, in addition to IRPC. The testability of this modified module was calculated as 0.195. The testability calculation results are summarized in Table 3. Again, the modifications that add an output result in an increase in testability.

Table 3. Testability for RPC modules

Module          λ_fail   ENT     Pr_TEST
IRPC            0.175    0.708   0.124
IRPC, ICB1      0.2      0.779   0.156
IRPC, TCB       0.231    0.845   0.195

The RPC module and the modified modules were tested with branch coverage criteria and random testing. The branch coverage testing required 12 test cases to cover all the branches, and these test cases were run over 12 fault-seeded programs. The numbers of faults detected in the three RPC modules were 6, 8, and 10, respectively. For the same modules, testing was repeated with 10,000 randomly selected test cases, with the same faults seeded as in the branch coverage testing. Among the 12 fault-seeded programs, all 12 faults were detected in each of the three modules. The results are summarized in Table 4. The results are consistent with the measured testability: more faults were detected by the branch coverage test cases in the modules with higher testability.

Table 4. Number of faults detected in RPC modules

Module          Branch   Random
IRPC            6        12
IRPC, ICB1      8        12
IRPC, TCB       10       12

When comparing the VOPT and RPC modules, the VOPT module has higher testability than the RPC module. Based on the testability measure, more test cases should be exercised for the RPC module than for the VOPT module for effective fault detection within limited resources. A fault in a statement of the RPC module has a low probability of appearing as an output failure, and the importance of the statement faults is not as even as in the VOPT module. The RPC module consequently needs more test cases to reach the same degree of fault coverage. With the branch coverage test criteria, the VOPT module required five test cases whereas the RPC module required 12 test cases. The number of test cases required for branch coverage testing is consistent with the outcome of the testability measure; namely, more test cases are required for the module with lower testability. Even though the test cases from the branch coverage testing are selected uniformly from the branch conditions, the selection can be evaluated as proportional by showing that more test cases are selected for the module with lower testability.

The fault-seeding tests of VOPT and RPC, however, show that more faults were detected in the RPC module. Among the 14 fault-seeded programs of the VOPT module, one fault was detected by the five branch coverage test cases and five faults were detected by the 10,000 random test cases. In the RPC module, six faults were detected by the 12 branch coverage test cases and all 12 faults were detected by the 10,000 random test cases. More faults were therefore detected in the module with lower testability. This was due to the fault sizes used in the experiment. The fault-seeded variables in the VOPT module were floating point values, and small faults, 0.01% of the nominal value, were seeded. The variables in the RPC module were mostly integer variables, and relatively large faults, 100% of the nominal value, were seeded. The fault size is considered to have a large effect on fault detection, so a direct comparison of two experiments with very different variable characteristics is not meaningful. Our experience shows that branch coverage testing is adequate for detecting these software errors. The proposed testability measure can be used to assess the homogeneity of partitions that are based on other uniform coverage testing criteria. The testability measure can help optimize the testing by allotting more resources to the modules that have a higher probability of hiding faults; that is, software with lower testability.
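As a quick consistency check of Eq. (7) against the tabulated values, Pr_TEST = λ_fail × ENT reproduces the reported testabilities: 0.533 × 0.992 ≈ 0.53 for the base VOPT module in Table 1 and 0.175 × 0.708 ≈ 0.124 for the base RPC module in Table 3, consistent with the reported 0.530 and 0.124 to the precision of the rounded inputs.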

4. Conclusion

The design of the CPCS digital safety system involves the use of standard off-the-shelf hardware and software components. These commercial products require special procedures to verify that they meet the quality standard for a nuclear safety system. The existence of a rigorous design process and related documents for these products, along with a reliable operating history, has been confirmed. Appropriate configuration management has been performed, and all the reported errors have been corrected. The application software should be designed with a well-defined design process and with verification and validation activities, including testing. In addition to unit testing and integration testing, module testing is performed on safety system software. Module testing is performed with branch coverage criteria, which results in uniform partitions. The effectiveness of the partition testing depends on the homogeneity of the partitions from which the test cases are drawn. The software testability can be measured based on the entropy of the importance of basic statements and the failure probability of a software fault tree. The testability measure can be applied to the optimization of the software testing, enabling a test engineer to focus on software that has lower testability. The testability measure can also be used to assess the homogeneity of partitions; this measure can show whether uniform partitions have a homogeneous probability of containing faults. In the module testing of the CPCS with branch coverage criteria, more test cases were required for the module with lower testability, which shows that the proposed testability measure can help assess the homogeneity of partitions.

References

[1] EPRI. Guideline on evaluation and acceptance of commercial grade digital equipment for nuclear safety applications. EPRI TR-106439, Electric Power Research Institute; 1996.
[2] Gutjahr WJ. Partition testing vs random testing: the influence of uncertainty. IEEE Trans Software Eng 1999;25(5):661–74.
[3] Chen TY, Yu YT. Constraints for safe partition testing strategies. Comput J 1996;39(7):619–25.
[4] Chen MH, et al. Effect of code coverage on software reliability measurement. IEEE Trans Reliab 2001;50(2).

[5] Pizza M, Strigini L. Comparing the effectiveness of testing methods in improving programs: the effect of variations in program quality. Ninth international symposium on software reliability engineering, ISSRE '98, Paderborn, Germany; 1998.
[6] Frankl PG, et al. Evaluating testing methods by delivered reliability. IEEE Trans Software Eng 1998;24(8):586–601.
[7] NASA/TM-2001-210876. A practical tutorial on modified condition/decision coverage; 2001.
[8] Freedman RS. Testability of software components. IEEE Trans Software Eng 1991;17(6):553–63.
[9] Voas JM, Miller KW. Software testability: the new verification. IEEE Software 1995;12(3):17–28.
[10] Leveson NG, Harvey PR. Analyzing software safety. IEEE Trans Software Eng 1983;SE-9(5):569–79.
[11] Cha SS, et al. Safety verification in Murphy using fault tree analysis. Tenth international conference on software engineering; 1988. p. 377–87.
[12] AECL. SDS1 software hazards analysis report for Wolsong nuclear power plant; 1994.
[13] Weyuker EJ, Jeng B. Analyzing partition testing strategies. IEEE Trans Software Eng 1991;17:703–11.
[14] Sohn SD, et al. Quantitative evaluation of safety critical software testability based on fault tree analysis and entropy. J Systems and Software 2004;73:351–60.
[15] Seong PH. A methodology for simplification of light water reactor system design. PhD Thesis, Massachusetts Institute of Technology, Boston, Massachusetts; 1987.
[16] Wolfe WA. Fault tree analysis. Chalk River, Ontario: Atomic Energy of Canada Limited; 1978.
[17] Sohn SD, et al. Safety computer system, CPCS design in nuclear power plant. J KNS 1994;26(4):502–6.
[18] EPRI. Review guidelines on software languages for use in nuclear power plant safety systems. EPRI NUREG/CR-6463, Electric Power Research Institute; 1995.

[12] AECL. SDS1 software hazards analysis report for Wolsong nuclear power plant; 1994. [13] Weyuker EJ, Jeng B. Analyzing partition testing strategies. IEEE Trans Software Eng 1991;17:703–11. [14] Sohn SD, et al. Quantitative evaluation of safety critical software testability based on fault tree analysis and entropy. J Systems and Software 2004;73:351–60. [15] Seong PH. A methodology for simplification of light water reactor system design. PhD Thesis, Massachusetts Institute of Technology, Boston, Massachusetts; 1987. [16] Wolfe WA. Fault tree analysis. Chalk River, Ontario: Atomic Energy Canada Limited; 1978. [17] Sohn SD, et al. Safety computer system, cpcs design in nuclear power plant. J KNS 1994;26(4):502–6. [18] EPRI. Review guidelines on software languages for use in nuclear power plant safety systems. EPRI NUREG/CR-6463, Electric Power Research Institute; 1995.