Reliability Engineering 15 (1986) 307-317
Life Cycle Costing Considerations in Reliability Centered Maintenance: An Application to Maritime Equipment

N. Jambulingam† and A. K. S. Jardine

Department of Engineering Management, Royal Military College of Canada, Kingston, Ontario K7L 2W3, Canada

(Received: 3 December 1985)
ABSTRACT

This paper presents a brief survey of both reliability centered maintenance (RCM) and life cycle cost (LCC). The integration of these two concepts is demonstrated with reference to a 75-ton chiller unit (CU) on board a destroyer. The CU consists of six subsystems and is considered by experienced naval personnel to be a highly reliable equipment. The objective of the analysis is to determine whether the CU requires preventive maintenance (PM) (inspection, adjustments, etc.) and, if so, to find the optimal PM interval between CU major overhauls (called refits). The optimal interval is that which minimizes the expected maintenance manpower cost over the refit period. The optimal inspection interval obtained reduces the maintenance cost while keeping the equipment safe. A Weibull analysis of the CU using both failure and censored observations is considered. Many difficulties encountered in collecting data from a ship maintenance management information system (SMMIS), and in interpreting them, are also addressed in this paper.

† Present address: Reliability Engineering, Satellite and Aerospace Systems Division, SPAR Aerospace Ltd, 21025 Trans Canada Highway, Ste Anne de Bellevue, Quebec, Canada H9X 3R2.

A version of this paper was presented at the 5th National Reliability Engineering Conference--Reliability '85, 10-12 July 1985, Birmingham, UK, and is reproduced by kind permission of the organisers.

Reliability Engineering 0143-8174/86/$03.50 © Elsevier Applied Science Publishers Ltd, England, 1986. Printed in Great Britain
1 INTRODUCTION

The maintenance costs of naval systems in recent years have emphasized the requirement that the optimum benefit be realized from the maintenance effort and that availability goals set by operational commanders be achieved.¹ It is important to establish a maintenance policy that balances the resources available (personnel, material and financial) against the degree of operational readiness desired. It is the policy of the Department of National Defence¹ that the requirements for PM will be determined by use of the RCM approach, which establishes: (a) whether PM will be done at all; (b) if so, whether it will be time-based or condition-based; and (c) what the PM tasks will be. The challenge lies in achieving successful implementation of RCM and the anticipated potential for significant reduction of equipment life cycle costs. The prime sources of data and information used in this paper are SMMIS and an ORAE (Operational Research Analysis Establishment) project report.² A brief history of RCM and LCC follows.

1.1 Reliability centered maintenance³,⁴
The objective of RCM is to maintain the inherent reliability which was designed into the system. Several RCM-related programs have been initiated in the past which had goals in common with RCM (e.g. reducing the cost of maintenance while retaining equipment inherent reliability) or which revised a maintenance activity through similar processing methods. Some of these were implemented prior to RCM, such as the Army Oil Analysis Program; on-condition maintenance program for aircraft; three-level maintenance for aircraft; preventive maintenance and checks services; and so on. Before 1960, there is no record of any effort to look deeply into the effectiveness of preventive maintenance as a process for avoiding failure. In the late 1950's, the presence of jet aircraft fleets stimulated airline interest in improving the PM for commercial aircraft. In 1967, airlines first applied the decision tree logic technique to the problem of identifying PM task needs. It provided an efficient approach, since it directly faced the primary question of the impact of unreliability on operations. In 1968, a decision tree logic formed the basis for a maintenance program for the Boeing 747. Similar methods
have been used on DC10, L-1011, Concorde, A-300, Boeing 767 and Boeing 757. In the early 1970's, this work attracted the attention of the United States Navy, which applied the new method to both newly designed and in-service aircraft (S-3, P-3 and F-4). This method is now called RCM. In mid-1979, the RCM procedure was adopted for new and in-service naval ships. In the Canadian Navy, RCM is an integral part of the design and construction of new patrol frigates.⁵

1.2 Life cycle cost (LCC)

In recent years, the state of the economy, budget limitations, etc., have created an increasing awareness of system and maintenance costs. System cost must be viewed from the total life cycle standpoint. This trend towards cost consciousness has resulted in increased emphasis on LCC.⁶ LCC analysis has become a vital factor in industry and defence. The Department of National Defence (DND) places emphasis on the importance of manpower as a primary cost factor in the LCC analysis.⁷ It was therefore felt useful to conduct an LCC study based on the maintenance manpower expended on naval equipment. The application of an LCC analysis is given in the latter part of this paper.
2 75-TON CHILLER UNIT (CU)
There are 32 chiller units in service on 18 Canadian Forces ships. The distribution of the chiller units is given in Table 1.

TABLE 1
75-ton Chiller Units in Service

Class of ship                        Number of ships   Units per ship   Total per class
ISL (Improved St. Laurent)           6                 1                6
MKE (Mackenzie)                      4                 2                8
AOR (Auxiliary Oil Replenishment)    2                 1                2
ANS (Annapolis)                      2                 2                4
TRL (Tribal)                         4                 3                12
Total units in service                                                  32

As part of the air conditioning system, the CU is considered to be a highly reliable equipment. The CU consists of six subsystems: namely, (A) compressor, (B) controller, (C) condenser/cooler, (D) 75-ton chiller unit (main), (E) 75 horsepower motor and (F) shell package. The CUs on the Improved St. Laurent (ISL) class destroyers are considered for this study. The reasons for considering ISL class destroyers are as follows: (a) the repair actions against the individual equipment could be identified since each ship was fitted with one unit; and (b) it is possible to consider operating time as a function of calendar time if one assumes that the ships operate continuously in warm climates, and therefore the CU is in use.

Equipment renewal is defined as the 'replacement of all wearing parts, adjustment and tuning'.⁸ The assumptions used to identify equipment renewal are: (i) the equipment is assumed to be renewed after ship refit unless stated otherwise, and (ii) the equipment is assumed to be renewed if the Ship Repair Unit (SRU) carried out major repair work.

The CU is subject to failure and can be repaired at one of three 'lines'. If the failure is minor it can be repaired on board by ship's staff (first line); otherwise, it would be repaired by the SRU/Fleet Maintenance Group (second line) and/or by civilian contractors/other DND facilities (third line). The maintenance action form (MAF) is completed by ship's staff whenever a PM or corrective maintenance (CM) action is performed and is later input to the computer data file called SMMIS. The way in which the data required for failure analysis are extracted from SMMIS will now be considered.

2.1 Data collection and difficulties encountered
The required failure information was collected from SMMIS in an 'EXTRACTO' report, which summarized the CM actions for the CU by ship and date. A computer 'dump' file and unsatisfactory condition reports (UCRs) were requested to check all of the SMMIS records for the 75-ton CU. However, only 13 UCRs were available and no relationship between the UCRs and the SMMIS records was found. Thus, SMMIS data were used as the primary source of information. Equipment failure occurs when the equipment is no longer performing within specified limits. The CM action codes 'P--parts replaced due to failure' and 'E--parts repaired' were used to identify equipment failure. The date the ship went into refit was used as a censoring point.
Similarly, 1 January 1981 was used as a censoring point because that was the last day of recording for our study. Some of the difficulties encountered in collecting and interpreting data are as follows:

(a) Some equipments are not assigned a naval equipment index (NEI) number, so that ship's staff were misled into recording failure information against some other equipment. This factor was identified in 1980 when the SMMIS 'clean up' took place.
(b) The equipment serial number has not been used in SMMIS. Thus, if more than one equipment with the same NEI number is on board ship (e.g. Tribal class ships have three CUs) there is no sure way to identify which equipment failed.
(c) The maintainer's CM action code often appears as '0--parts replaced--no failure' to avoid the extra work involved in a 'failure remarks' column.⁹ However, this possibility was not taken into account in this study.
(d) It was very difficult to identify the exact operating time of the CU from existing SMMIS computer files.
(e) To carry out reliability, availability and maintainability (RAM) analysis, it is necessary to identify not just the equipment failure but the particular failure mode. The MAF does contain a 'how malfunctioned' code column, but most of the MAFs have the 'none of the above' code marked. This would imply that the codes are not appropriate or that the maintainers have difficulty understanding their meaning. One could overcome this difficulty by removing the 'none of the above' code from the MAF.

There was insufficient data on the 75 horsepower motor and the shell package. It was later investigated and found that these equipments were relatively trouble-free. The failure and suspension times for the remaining subsystems were collected separately. The ordered times in days are given in Table 2 by subsystem, each followed by F (failure) or C (censored). The analyses of these subsystems will now be considered.
TABLE 2
75-ton Chiller Unit Ordered Failure Data (in days)

Subsystem               Ordered failure times (days)
A. Compressor           224F 287F 320C 351F 402C 473C 527F 551F 655F 847C 942C
B. Controller           18F 62F 210F 284F 320C 402C 463C 473C 548F 563C 567F 869F 962C 1330C 1567C
C. Condenser/cooler     45F 103F 202F 230F 320C 333F 402C 473C 485F 564C 736C 901C 1135C 1138F 1578C
D. 75-ton CU (main)     50F 74F 106F 115F 117F 117F 124F 182F 196C 199F 210F 294F 307F 364F 365F 390F 424F 568F 573C 659F 660F 790F 943C

2.2 Data analysis

It is assumed that each system follows a three-parameter Weibull distribution because of its flexibility. It is also assumed that the interarrival times or failure times are iid (independently and identically distributed) random variables. This assumption was verified by a trend test.¹⁰ The probability density function is

\[
f(t) = \frac{\beta}{\eta-\gamma}\left(\frac{t-\gamma}{\eta-\gamma}\right)^{\beta-1}
\exp\left[-\left(\frac{t-\gamma}{\eta-\gamma}\right)^{\beta}\right], \qquad t > \gamma \tag{1}
\]
with $\gamma \ge 0$, $\beta > 0$ and $\eta > 0$, where $\gamma$ is the minimum life, $\beta$ the shape parameter and $\eta$ the characteristic life. The increment method¹¹ has been used to handle the censored data. The steps are as follows:

(1) Arrange the observed failure and suspension data in ascending order. If a failure and a suspension are tied, the failure time precedes the suspension time.
(2) Compute a 'new increment', $I$, for all failure data (not for suspensions) using the formula

\[
I = \frac{(n + 1) - \text{previous order number}}{1 + \text{number of items following suspended set}} \tag{2}
\]

where $n$ denotes the total number of ordered data.
(3) Compute the order number for failures as given in the Appendix of ref. 12. Using the order numbers, compute the median ranks (MR), in percent, for all failures using the formula

\[
\mathrm{MR} = \frac{\text{order number} - 0.3}{n + 0.4} \times 100 \tag{3}
\]

where $n$ is as defined above.
(4) Plot these median ranks (assuming $\gamma = 0$) versus failure time on Weibull probability paper, and fit the best straight line to the plotted points. The plotted points do not always fall on a straight line; they may instead follow a curve, which indicates a non-zero minimum life parameter, $\gamma$. One can find this $\gamma$ either by trial and error or by an analytical method, which chooses the $\gamma$ that gives maximum correlation between the $(t - \gamma)$'s and the MR's. After finding a suitable $\gamma$, subtract the minimum life from every failure and suspension time. Then replot the same median ranks versus the modified failure times on Weibull paper to find the $\beta$'s and $\eta$'s.
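Steps (1) to (3) can be sketched in Python using the subsystem A data of Table 2. This is a minimal sketch under two assumptions that the text does not spell out: the order number of a failure is taken to be the previous order number plus the increment of eqn (2) (the rule actually used is in the Appendix of ref. 12 and is not reproduced here), and 'number of items following suspended set' is read as the number of observations from the current failure onward. The names `SUBSYSTEM_A` and `median_ranks` are illustrative only.

```python
# Subsystem A (compressor) observations from Table 2: (time in days, status),
# where "F" marks a failure and "C" a censored (suspended) observation.
SUBSYSTEM_A = [
    (224, "F"), (287, "F"), (320, "C"), (351, "F"), (402, "C"), (473, "C"),
    (527, "F"), (551, "F"), (655, "F"), (847, "C"), (942, "C"),
]


def median_ranks(observations):
    """Return (time, order number, median rank in %) for every failure."""
    # Step (1): ascending order; on a tie the failure precedes the suspension.
    ordered = sorted(observations, key=lambda obs: (obs[0], obs[1] == "C"))
    n = len(ordered)
    previous_order = 0.0
    results = []
    for index, (time, status) in enumerate(ordered):
        if status != "F":
            continue  # suspensions receive no order number
        # Step (2), eqn (2). Assumption: "items following" counts the
        # observations from this failure to the end of the ordered list.
        remaining = n - index
        increment = (n + 1 - previous_order) / (1 + remaining)
        # Assumed order-number rule: previous order number plus increment.
        previous_order += increment
        # Step (3), eqn (3): median rank in percent.
        median_rank = (previous_order - 0.3) / (n + 0.4) * 100.0
        results.append((time, previous_order, median_rank))
    return results


for time, order, rank in median_ranks(SUBSYSTEM_A):
    print(f"{time:5d} days  order number = {order:6.3f}  MR = {rank:6.2f} %")
```

Step (4), the plotting on Weibull paper and the search for the minimum life, is graphical and is not attempted in this sketch.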
The analyses for the main system and the subsystems are carried out separately through steps (1) to (4). The results obtained are given in Table 3. It is evident from Table 3 that the main system and subsystem D are exponentially distributed because the shape parameter $\beta$ is close to unity. Subsystems B and C follow a Weibull distribution with decreasing hazard rate. Subsystem A follows a Weibull distribution with increasing hazard rate; that is, subsystem A is in wearout mode. By making use of these failure characteristics, one can use the RCM approach for PM action.

TABLE 3
Weibull Parameters for the Systems

System         Failures   Suspensions   Sample size, n   β      γ (days)   η (days)
Main system    40         24            64               1.03   7.02       757.8
Subsystem A    6          5             11               1.27   164.6      693.3
Subsystem B    7          8             15               0.92   0          1 187.2
Subsystem C    7          8             15               0.83   18.22      1 143.8
Subsystem D    20         3             23               1.06   38.41      387.7
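The link between the shape parameter and the hazard trend can be illustrated with a short Python sketch based on the Table 3 values. It assumes the three-parameter Weibull form in which the scale is $\eta - \gamma$, so that the hazard rate is $h(t) = \frac{\beta}{\eta-\gamma}\left(\frac{t-\gamma}{\eta-\gamma}\right)^{\beta-1}$ for $t > \gamma$. The tolerance of 0.07 used to call $\beta$ 'close to unity' is a hypothetical threshold chosen only so that the classification agrees with the text; the paper states no such threshold. The names `TABLE_3`, `hazard` and `hazard_trend` are illustrative.

```python
# (beta, gamma, eta) taken from Table 3; gamma and eta are in days.
TABLE_3 = {
    "Main system": (1.03, 7.02, 757.8),
    "Subsystem A": (1.27, 164.6, 693.3),
    "Subsystem B": (0.92, 0.0, 1187.2),
    "Subsystem C": (0.83, 18.22, 1143.8),
    "Subsystem D": (1.06, 38.41, 387.7),
}


def hazard(t, beta, gamma, eta):
    """Hazard rate h(t) = f(t)/R(t) for the assumed Weibull form."""
    if t <= gamma:
        return 0.0  # no failures are modelled before the minimum life
    scale = eta - gamma  # assumption: eta is measured from t = 0
    return (beta / scale) * ((t - gamma) / scale) ** (beta - 1.0)


def hazard_trend(beta, tolerance=0.07):
    """Classify the hazard trend; the tolerance is a hypothetical choice."""
    if abs(beta - 1.0) <= tolerance:
        return "roughly constant"
    return "increasing" if beta > 1.0 else "decreasing"


for name, (beta, gamma, eta) in TABLE_3.items():
    early = hazard(200.0, beta, gamma, eta)
    late = hazard(1000.0, beta, gamma, eta)
    print(f"{name}: {hazard_trend(beta)}; "
          f"h(200) = {early:.5f}, h(1000) = {late:.5f} per day")
```

With only 6 to 20 failures per subsystem, values of $\beta$ this close to unity carry considerable sampling uncertainty, so the classification should be read as indicative.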
2.3 RCM approach

The heart of RCM is decision logic which links FMECA (failure mode, effects and criticality analysis) data or field maintenance records to maintenance tasks in order to retain equipment inherent reliability and safety levels at the lowest cost in the life cycle. The purpose of a decision logic process is to reduce a complicated problem to a number of simple questions with definite answers that lead to a reasonable and justifiable resolution. A typical decision logic diagram, used by the US Army,³ is given in Fig. 1. For the 75-ton CU, the decision path is indicated by the thick lines in Fig. 1. This path analysis applies to subsystem A only, because the other subsystems do not require any PM action. Subsystem A failures are mainly due to breakage or leakage.

Fig. 1. Typical decision logic diagram (figure not reproduced).

For safety considerations, an inspection interval (on-condition inspection) must be established which gives an acceptably low probability of failure during the time period when an impending failure would go undetected. This time period is the time from the end of the last inspection to the time of the next inspection minus the time from detectable failure onset to failure.³ Symbolically, we denote the period to be:

\[
N T_I - \mathrm{TOS} - (N - 1) T_I \tag{4}
\]
where $T_I$ = time between inspections, TOS = time from failure onset to failure and $N$ = a positive integer. It has been derived (ref. 3, p. 108) that the probability of a failure occurring and not being detected during the time between any two inspections can be expressed as follows:

\[
P_{NT_I} = \frac{\displaystyle\int_{(N-1)T_I}^{NT_I - \mathrm{TOS}} f(t)\,dt}{\displaystyle\int_{(N-1)T_I}^{\infty} f(t)\,dt} \tag{5}
\]
where $f(t)$ is the failure density of the equipment. We know that subsystem A follows a Weibull distribution with $\beta = 1.27$, $\gamma = 164.6$ and $\eta = 693.3$. The service life of subsystem A is taken to be the refit cycle length (1460 days) and TOS is 10 h. The analysis was carried out using eqn (5) for $T_I$ varying from 30 to 160 days. For each $T_I$, $k$ values are calculated and the average of these $k$ values is given against each $T_I$, where $k$ is the integral part of $(1460 - \gamma)/T_I$. The results are given in Table 4.

TABLE 4
Probability of Going to Failure without Detection during Specific Intervals

T_I (days)   P_NT_I (%)      T_I (days)   P_NT_I (%)
30           6.99            100          21.42
40           9.29            110          23.59
50           11.49           120          25.64
60           13.47           130          27.57
70           15.61           140          29.37
80           17.94           150          31.03
90           19.48           160          33.32

In this case, if the maximum allowable $P_{NT_I}$ were set at 10%, then an inspection interval of 40 days should be scheduled for oil sampling or vibration testing. It is important to note that this inspection interval of 40 days should be implemented only after about 164 days (the minimum life parameter).
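The averaging procedure behind Table 4 can be sketched in Python. The sketch rests on several assumptions that are not confirmed by the text: eqn (5) is read as the probability of failing in the undetected window, conditional on survival to the previous inspection; the Weibull scale is taken as $\eta - \gamma$; inspections are assumed to start at the minimum life $\gamma$; and TOS is converted from 10 h to days. Because the exact interval bookkeeping of the original calculation is not stated, the printed values need not agree with Table 4 to the last digit. All function names are illustrative.

```python
import math

BETA, GAMMA, ETA = 1.27, 164.6, 693.3  # subsystem A, Table 3
REFIT_DAYS = 1460.0                    # refit cycle length
TOS_DAYS = 10.0 / 24.0                 # 10 h expressed in days


def reliability(t):
    """R(t) for the assumed three-parameter Weibull (scale = ETA - GAMMA)."""
    if t <= GAMMA:
        return 1.0
    return math.exp(-(((t - GAMMA) / (ETA - GAMMA)) ** BETA))


def undetected_failure_prob(n, t_i):
    """Assumed conditional reading of eqn (5) for the n-th inspection interval."""
    start = GAMMA + (n - 1) * t_i          # assumption: inspections start at GAMMA
    end = GAMMA + n * t_i - TOS_DAYS
    return (reliability(start) - reliability(end)) / reliability(start)


def average_prob(t_i):
    """Average of the k interval probabilities, k = int((1460 - GAMMA) / t_i)."""
    k = int((REFIT_DAYS - GAMMA) // t_i)
    return sum(undetected_failure_prob(n, t_i) for n in range(1, k + 1)) / k


def longest_interval(limit, candidates):
    """Longest candidate interval whose average probability is within the limit."""
    acceptable = [t for t in candidates if average_prob(t) <= limit]
    return max(acceptable) if acceptable else None


candidates = range(30, 170, 10)
for t_i in candidates:
    print(f"T_I = {t_i:3d} days  average P = {100 * average_prob(t_i):5.2f} %")
print("longest interval within 10 %:", longest_interval(0.10, candidates))
```

Under these assumptions the 10% criterion again selects a 40-day interval from the candidate list, which is consistent with the recommendation above.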
2.4 LCC approach

The present policy is that a 60-day inspection check is in force throughout the refit cycle, and that this check covers the whole system. Our analysis shows that there is no need to inspect all the subsystems on a 60-day cycle between refits. Instead, a 40-day inspection interval is needed, starting from the 165th day after refit, and only for subsystem A (compressor). This recommended policy will keep the system availability high and reduce the inspection cost over the period between refits.

The cost data used for LCC were gathered from the ship repair unit and from engineering officers who had on-board ship experience. The average number of man-hours needed to undertake a complete refit of a 75-ton CU is 800, and to repair a compressor is 300. The average man-hour pay is $25. According to our analysis, only subsystem A needs overhauling, which requires 300 man-hours, whereas 800 man-hours are currently spent on the whole system. One can therefore save (800 - 300) x $25 = $12 500/CU over a period of 4 years. In other words, the savings for the fleet of 32 units would be $400 000 over a 4-year refit cycle.
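The manpower saving quoted above follows directly from the stated figures, as the short Python sketch below shows. Only the man-hours, the hourly pay and the fleet size come from the text; the constant and function names are illustrative, and no discounting or other cost elements are considered.

```python
FULL_REFIT_MAN_HOURS = 800   # complete refit of a 75-ton CU
COMPRESSOR_MAN_HOURS = 300   # repair of the compressor (subsystem A) only
PAY_PER_MAN_HOUR = 25        # dollars
UNITS_IN_FLEET = 32          # chiller units in service (Table 1)


def saving_per_unit():
    """Manpower cost saved per CU over one 4-year refit cycle."""
    return (FULL_REFIT_MAN_HOURS - COMPRESSOR_MAN_HOURS) * PAY_PER_MAN_HOUR


def saving_for_fleet():
    """Manpower cost saved over one refit cycle for all units in service."""
    return saving_per_unit() * UNITS_IN_FLEET


print(f"saving per CU:    ${saving_per_unit()}")
print(f"saving for fleet: ${saving_for_fleet()}")
```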
3 CONCLUSIONS

The RCM approach is an effective tool for moving from traditional hard-time maintenance to on-condition or health-monitoring maintenance in order to minimize maintenance costs. Failure analysis should be done prior to taking any maintenance action or doing cost analysis. This analysis needs to be done not only at the system level but also at the subsystem level. The three-parameter Weibull distribution fits the data well and the 'increment method' is found to be an asset in handling censored data. Only the compressor subsystem needs PM; the others do not. This leads to the conclusion that a considerable saving on maintenance is possible. The 40-day routine inspection, starting 164 days after refit, is important for system safety. The optimal inspection interval approach is simple to understand and easy to implement. Some advanced techniques such as 'mixed-mode failure analysis' and 'trend testing' could be used when necessary. This study could be extended to other maritime equipment on board ship. The results and techniques used in this paper should be very useful to maritime maintenance engineers and life cycle material managers.
REFERENCES

1. DMES, NaMMS Newsletter, 6 (1985), p. A.1.
2. Taylor, I. W. Reliability analysis of maritime equipment: Hazard plotting using SMMIS data, ORAE Report PR 188, 1982.
3. US Army, Reliability centered maintenance reference book, ALM 31-0454RB, 1980.
4. US Navy, Naval Sea Systems Command, Reliability centered maintenance handbook, 1983.
5. Gebbie, R. J. Reliability centred maintenance, or, how to put the method into PM madness, unpublished note, Royal Military College of Canada, Kingston, Ontario, 1985.
6. Blanchard, B. S. Design and Manage to Life Cycle Cost, M/A Press, Portland, Oregon, 1978.
7. Altinay, N. and Taylor, I. W. Life cycle cost impact modelling and DND data sources, ORAE Report PR 200, 1982.
8. Tullier, P. Determining reliability and degradation of shipboard machinery, Proc. Ann. Reliability and Maintainability Symposium, Las Vegas, 1976.
9. Canadian Navy, C-03-005-012/AM-002: Naval maintenance management systems manual, Vol. 2, SMMIS.
10. Ascher, H. and Feingold, H. Repairable Systems Reliability, Marcel Dekker, New York, 1984.
11. Kapur, K. C. and Lamberson, L. R. Reliability in Engineering Design, John Wiley, New York, 1977.
12. Natesan, J. and Jardine, A. K. S. Graphical estimation of mixed Weibull parameters for ungrouped multicensored data: its application to failure data, Maintenance Management International, 6 (1986), 115-27.