Analysis of trends in aviation maintenance risk: An empirical approach




Reliability Engineering and System Safety 106 (2012) 104–118

Contents lists available at SciVerse ScienceDirect

Reliability Engineering and System Safety journal homepage: www.elsevier.com/locate/ress

Karen B. Marais *, Matthew R. Robichaud

School of Aeronautics and Astronautics, Purdue University, West Lafayette, IN 47907, USA

Article info

Article history: Received 11 October 2011; received in revised form 24 May 2012; accepted 1 June 2012; available online 15 June 2012

Abstract

Safety is paramount in the airline industry. A significant amount of effort has been devoted to reducing mechanical failures and pilot errors. Recently, more attention has been devoted to the contribution of maintenance to accidents and incidents. This study investigates and quantifies the contribution of maintenance, in terms of both frequency and severity, to passenger airline risk by analyzing three different sources of data from 1999 to 2008: 769 NTSB accident reports, 3242 FAA incident reports, and 7478 FAA records of fines and other legal actions taken against airlines and associated organizations. We analyze several safety-related metrics and develop an aviation maintenance risk scorecard that collects these metrics to synthesize a comprehensive track record of maintenance contribution to airline accidents and incidents. We found, for example, that maintenance-related accidents are approximately 6.5 times more likely to be fatal than accidents in general, and that when fatalities do occur, maintenance accidents result in approximately 3.6 times more fatalities on average. Our analysis of accident trends indicates that this contribution to accident risk has remained fairly constant over the past decade. Our analysis of incidents and FAA fines and legal actions revealed similar trends. We found that at least 10% of incidents involving mechanical failures such as ruptured hydraulic lines can be attributed to maintenance, suggesting that there may be issues surrounding both the design of and compliance with maintenance plans. Similarly, 36% of FAA fines and legal actions involve inadequate maintenance, with recent years showing a decline to about 20%, which may reflect improved maintenance practices. Our results can aid industry and government in focusing resources to continue improving aviation safety. © 2012 Elsevier Ltd. All rights reserved.

Keywords: Aircraft maintenance; Aviation safety; Aviation risk

* Corresponding author. E-mail address: [email protected] (K.B. Marais).

0951-8320/$ - see front matter © 2012 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.ress.2012.06.003

Nomenclature

AC&MF  Air Carriers and Maintenance Facilities
AD     Airworthiness Directive
ASRS   Aviation Safety Reporting System
ATC    Air Traffic Control

1. Introduction

Maintenance plays an important role in the aviation industry. In the absence of maintenance, most system parts deteriorate due to use or age, which results in wear and eventually failure of the part, which may compromise system safety. The impact of unaddressed deterioration has been shown in dramatic fashion in many accidents, one of which is the Japan Airlines Flight 123 tragedy. The Boeing 747 of JAL123 suffered a catastrophic failure of the rear bulkhead on August 12, 1985, just moments after taking off from Tokyo International Airport [2]. The explosive decompression was found to be caused by metal fatigue and, more specifically, by the inadequate repair of the rear bulkhead after a tailstrike incident in 1978. The repair made the damaged area more prone to metal fatigue and, after years of constant pressurization and depressurization, the bulkhead finally failed. This accident remains the deadliest single-aircraft accident in history. Fifteen crew members and 505 of the 509 passengers were killed.

Many accident models have been suggested, but the underlying concept is the same: accidents result from a combination of factors, such as design errors, mechanical failures, software errors, user errors, and organizational or regulatory factors [22]. These factors can also be classified into immediate causes (e.g., failure of the landing gear) and underlying or root causes (e.g., inadequate inspection program), or into active failures (e.g., engine fire) and latent failures (e.g., improperly trained maintenance technicians). Designers, operators, and regulators use accident investigations to identify and track these failures and to control them or reduce their likelihood in the future.

These same factors can also lead to less serious events, referred to as incidents. While incidents usually do not receive the same degree of attention as accidents, incident investigations can still be very helpful in identifying unsafe conditions. Self-reporting and government inspections can also reveal latent failures and trends before they result in incidents or accidents. For example, in late 2011 Airbus discovered that the manufacturing process could introduce cracks inside the wings of the A380, and eventually required operators to perform emergency inspections of the entire fleet [11]. American Airlines recently found cracks on the engine pylons of two of its Boeing 767s, one of which had reached only a third of the flight hours required before a pylon inspection needed to be performed [23].

A significant body of research has been done on improving the reliability of components and reducing the occurrence of mechanical failures. More recently, the human factors aspects of aviation maintenance have also received increasing attention. But few efforts have been made to quantify the contribution of maintenance to aviation safety. In particular, the following questions have not been addressed:

1. What is the extent of maintenance’s contribution to commercial passenger aviation risk, and what is the trend in this contribution?
2. How effective are the mechanisms used to ensure safety at reducing maintenance’s contribution to aviation risk?
3. Are there opportunities for improving these mechanisms?

Our objective in this paper is to answer the first question by quantifying the impact of maintenance on aviation safety. We analyze several safety-related metrics and develop an aviation maintenance risk scorecard that collects these metrics to synthesize a comprehensive track record of maintenance contribution to airline accidents and incidents. We perform an in-depth analysis of aviation accidents, incidents, and FAA fines and other legal actions in the United States from 1999 to 2008.

Section 2 discusses risk metrics and proposes a “risk scorecard” of maintenance risk metrics that we use to summarize the impact of maintenance on safety. Next, we fill in this scorecard by considering the contribution of maintenance to the most serious adverse events (accidents), to less serious incidents, and to the precursors of adverse events. Section 3 uses accident data from the NTSB to assess the contribution of maintenance to accidents, while Section 4 uses FAA incident data to assess the contribution of maintenance to incidents. Section 5 uses FAA records of fines and other legal actions (“enforcement actions”) to identify other potential maintenance errors. Section 6 briefly reviews other potential data sources. Section 7 presents the maintenance risk scorecard for 1999–2008, and Section 8 concludes the paper.

Nomenclature (continued)

BTS    Bureau of Transportation Statistics
CFR    Code of Federal Regulations
FAA    Federal Aviation Administration
NTSB   National Transportation Safety Board
SAIB   Special Airworthiness Information Bulletin
SDR    Service Difficulty Report

2. Aviation maintenance, aviation safety metrics and the maintenance risk scorecard

In this section we briefly review the literature on aviation maintenance safety and discuss several safety-related metrics, their relevance, and their limitations. We then present a “risk scorecard” that can be used to analyze and present trends in aviation accidents. In the following sections we proceed to fill out this scorecard.

2.1. Aviation maintenance safety empirical research

Several studies on the impact of maintenance errors on commercial aviation accidents have been performed in the past. Most of these studies (see Ref. [10] for an overview) look only at limited datasets or classify the results into broad categories that make it hard to draw meaningful conclusions about passenger safety. Graeber and Marx [9] classified 122 maintenance errors that occurred over a two-year period into the following four categories: omission (56%), incorrect installation (30%), wrong part (8%), and other (6%). The Civil Aviation Authority (CAA) of the United Kingdom found the following maintenance errors to be the most common during a three-year study: incorrect installation of components, the fitting of wrong parts, electrical wiring discrepancies, loose objects left in the aircraft, inadequate lubrication, cowlings, access panels and fairings not secured, fuel caps and refuel panels not secured, and landing gear ground lock pins not removed before departure [10]. A more detailed study has been done for general aviation (GA), as shown in Tables 1 and 2 [8].

The majority of maintenance errors discussed above are those that occur when an Aircraft Maintenance Technician (AMT) actually performs a particular task. Maintenance errors can also occur when designing the overall maintenance plan or the individual procedures. Various statistics regarding the significance and prevalence of these errors have been published. For example, Graeber and Marx [9] found that flawed maintenance practices were the major factor in 3% of all maintenance-related international aviation accidents between 1959 and 1983. A study of more recent data by Boeing found that changes in maintenance and inspection procedures could have prevented approximately 20% of all accidents between 1982 and 1991 [10]. This illustrates the importance of the maintenance procedures, as the reliability and precision of an AMT are of no value if the procedures themselves are flawed.

Maintenance inspection errors play a significant role in both GA and commercial maintenance-related accidents. The environmental factors surrounding the AMT are important since approximately 90% of all maintenance inspection is visual in nature [10]. A study by Prabhu and Drury [19] identified several major error categories within maintenance and inspection tasks, as shown in Table 3. Poor lighting conditions in the workplace can make inspection difficult and can cause damaged or worn components to go unnoticed. The layout of the workspace can also be a contributing factor to maintenance errors. For example, when vertical restrictions were imposed on an AMT during an inspection task, the AMT’s attentiveness decreased [10]. These environmental factors are being addressed by both airlines and aircraft manufacturers. Boeing, for instance, created a design framework for the B-777 that facilitates access and maintenance [25]. The Human Factors Analysis and Classification System (HFACS) was developed in order to include these types of contributing factors, among others, when analyzing risk [24].

Table 1
Frequency of maintenance activity for all GA maintenance-related accidents, 1988–1997.

Maintenance activity       Frequency   Percent
Installation                     295      20.0
Maintenance**                    217      14.7
Maintenance inspection           202      13.7
Annual inspection                124       8.4
Service of aircraft               91       6.1
Adjustment                        82       5.5
Modification                      62       4.2
Overhaul                          59       4.0
Other                            312      21.1
Non-maintenance*                  30       2.0
Total                           1474     100.0

* Non-maintenance refers to codes used in the NTSB accident reports that are not labeled as ‘maintenance’. Some examples include landing gear, tailwheel lock, flight manuals, and radar assistance to VFR aircraft.
** Most likely maintenance-related activities that could not be classified in any other category.


Table 2
Distribution of type of installation error [8].

Type of installation error   Frequency   Percent
Incorrect attachment*               83      28.6
Incorrect connection*               64      22.1
Omission                            63      21.7
Wrong part                          49      16.9
Reversed installation               29      10.0
Total                              290      99.3

* The category incorrect attachment refers to the incorrect installation of a fastener or harness, while the category incorrect connection refers to the incorrect installation of a component that has a function other than simply connecting two components together, such as a fuel line or a hydraulic line.

Table 3
List of major categories within maintenance and inspection tasks.

Category                       Example(s)
Defective components           Cracked pylon, worn cables, fluid leakage
Missing component              Bolt-nut not secured
Wrong component                Incorrect pitot static probes installed
Incorrect configuration        Valve inserted backwards
Incorrect assembly sequence    Incorrect sequence of inner cylinder spacer and lock ring assembly
Functional defects             Wrong tire pressure, over-tightening nuts
Tactile defects                Seat not locking in position
Procedural defects             Nose landing gear door not closed

2.2. Aviation safety metrics

Table 4 summarizes the nine most commonly used safety metrics in the aviation industry [3,4]. Different groups and organizations require different metrics to guide their actions. For example, Barnett [3,4] argues that the passenger mortality risk metric is the soundest metric from the passenger’s perspective. However, it does not work well for year-to-year comparisons due to the relatively low number of aviation accidents each year. Government agencies and regulatory bodies require metrics that respond more rapidly to year-on-year changes, and that indicate whether safety is decreasing as traffic levels increase. Therefore metrics such as accidents/year and accidents/departure may also be useful, especially if they are grouped by severity. Aircraft manufacturers, airlines, and regulatory bodies also need to track less serious accidents and incidents, enforcement actions, airworthiness directives, and service difficulty reports to provide early warning of risk factors.

2.3. The aviation maintenance risk scorecard

We therefore propose an aviation maintenance risk scorecard, shown in Table 5, to assess the contribution of maintenance to aviation risk. The first column describes the different metrics we use, while the second and third columns show how they are calculated for all accidents and for accidents involving maintenance. The last column uses conditional probabilities or ratios to assess the relative contribution to risk of maintenance compared to all possible causes. Thus $p(m \mid a)$ is the probability that maintenance was involved given that an accident occurred, $p(m \mid f)$ is the probability that maintenance was involved given that a fatality occurred, and $p(m \mid af)$ is the probability that maintenance was involved given that a fatal accident occurred.

The risk magnifiers ($af_{\mathrm{magnifier}}$, $f_{\mathrm{magnifier}}$, $g_{\mathrm{magnifier}}$) indicate whether maintenance accidents are more likely to be fatal or involve higher rates of fatalities when compared to accidents in general. The fatal accident magnifier, $af_{\mathrm{magnifier}}$, indicates how much more likely maintenance accidents are to be fatal:

$$af_{\mathrm{magnifier}} = \frac{n_{af_m}/n_{a_m}}{n_{af}/n_{a}} \qquad (1)$$

The magnifier works as follows: if $af_{\mathrm{magnifier}}$ is less than one, then maintenance accidents are less likely to be fatal than accidents in general; if it is greater than one, then maintenance accidents are more likely to be fatal than accidents in general.

The fatality magnifier, $f_{\mathrm{magnifier}}$, indicates how much more deadly maintenance accidents are:

$$f_{\mathrm{magnifier}} = \frac{n_{f_m}/n_{a_m}}{n_{f}/n_{a}} \qquad (2)$$

The magnifier works as follows: if $f_{\mathrm{magnifier}}$ is less than one, then maintenance accidents are less deadly than accidents in general; if it is greater than one, then maintenance accidents are more deadly than accidents in general.
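As a minimal sketch of how Eqs. (1) and (2) might be evaluated, the following Python functions compute the two magnifiers from event counts. The function names and the counts are hypothetical and chosen only to illustrate the arithmetic; they are not values from the NTSB data.

```python
def af_magnifier(n_af_m, n_a_m, n_af, n_a):
    """Fatal accident magnifier, Eq. (1): share of maintenance accidents
    that are fatal, divided by the share of all accidents that are fatal."""
    return (n_af_m / n_a_m) / (n_af / n_a)


def f_magnifier(n_f_m, n_a_m, n_f, n_a):
    """Fatality magnifier, Eq. (2): fatalities per maintenance accident,
    divided by fatalities per accident overall."""
    return (n_f_m / n_a_m) / (n_f / n_a)


# Hypothetical counts (assumptions for illustration, not the paper's data).
n_a, n_a_m = 200, 10    # all accidents, maintenance-related accidents
n_af, n_af_m = 8, 2     # fatal accidents, maintenance-related fatal accidents
n_f, n_f_m = 300, 90    # fatalities, maintenance-related fatalities

af = af_magnifier(n_af_m, n_a_m, n_af, n_a)   # (2/10) / (8/200)
f = f_magnifier(n_f_m, n_a_m, n_f, n_a)       # (90/10) / (300/200)
print(f"af_magnifier = {af:.2f}")
print(f"f_magnifier  = {f:.2f}")
```

With these illustrative counts both magnifiers exceed one, which under the definitions above would mean that maintenance accidents are both more likely to be fatal and more deadly than accidents in general.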

Table 4
Common aviation risk metrics.

Fatal accidents per 100,000 flight hours
  Advantages: Relationship between number of fatalities and amount of travel. Provides a “cost-benefit” ratio.
  Disadvantages: Does not distinguish between high and low fatality accidents. Does not convey the benefit of changes such as fire retardants that reduce but do not prevent fatalities. Accidents are not correlated with flight hours since most accidents occur during non-cruise phases.

Fatal accidents per million departures
  Advantages: Is not biased towards longer flights.
  Disadvantages: Does not distinguish between different fatality levels.

Hull losses per million departures
  Advantages: Salient to airframe manufacturers. Is not biased towards longer flights.
  Disadvantages: Does not distinguish between different fatality levels.

Passenger deaths per 100 million passenger miles
  Advantages: Relationship between number of fatalities and amount of travel. Provides a “cost-benefit” ratio.
  Disadvantages: Accidents are not correlated with distance travelled since most accidents occur during non-cruise phases.

Passenger deaths per million passengers carried
  Advantages: Provides some indication of relative risk to a given passenger.
  Disadvantages: Places greater weight on accidents with full aircraft. Does not distinguish between accidents with different proportions of fatalities.

Passenger death risk per randomly chosen flight
  Advantages: Distinguishes between high and low fatality accidents. Is not biased by flight length.
  Disadvantages: Does not work well for year-to-year comparisons due to the relatively low number of aviation accidents each year.

Annual aviation death risk per million citizens
  Advantages: Easy to calculate.
  Disadvantages: Defined as the ratio of a region’s number of passengers killed in aviation accidents to its total population. Does not distinguish, for example, between many low fatality accidents and a single high fatality accident.

Accidents per 100,000 flight hours
  Advantages: Easy to calculate.
  Disadvantages: Does not distinguish between accidents of differing severity. Accidents are not correlated with flight hours since most accidents occur during non-cruise phases.

Accidents per million departures
  Advantages: Is not biased towards longer flights.
  Disadvantages: Does not distinguish between accidents of differing severity.
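Most of the metrics in Table 4 share one form: an event count divided by an exposure measure and scaled to a convenient base. A minimal Python sketch of that normalization follows; the helper name `rate` and all the numbers are hypothetical and are used only to show the calculation.

```python
def rate(events, exposure, per):
    """Return the number of events per `per` units of exposure,
    e.g. accidents per 1,000,000 departures."""
    if exposure <= 0:
        raise ValueError("exposure must be positive")
    return events * per / exposure


# Hypothetical yearly totals (assumptions, not BTS or NTSB figures).
accidents = 40
departures = 10_000_000
flight_hours = 18_000_000

per_million_departures = rate(accidents, departures, 1_000_000)
per_100k_flight_hours = rate(accidents, flight_hours, 100_000)
print(f"{per_million_departures:.2f} accidents per million departures")
print(f"{per_100k_flight_hours:.3f} accidents per 100,000 flight hours")
```

The same event count gives different pictures depending on the exposure measure, which is why the table lists flight-hour, departure, and passenger-based variants separately.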


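Table 5
Aviation safety scorecard.

Average number of accidents per year
  All events: $n_a/\mathrm{years}$
  Maintenance events: $n_{a_m}/\mathrm{years}$
  Maintenance vs. all events: $p(m \mid a) = n_{a_m}/n_a$

Average number of accidents per domestic revenue departure
  All events: $n_a/\mathrm{departures}$
  Maintenance events: $n_{a_m}/\mathrm{departures}$
  Maintenance vs. all events: $p(m \mid a)$, as in the previous row

Average number of fatalities per year
  All events: $n_f/\mathrm{years}$
  Maintenance events: $n_{f_m}/\mathrm{years}$
  Maintenance vs. all events: $p(m \mid f) = n_{f_m}/n_f$

Average number of fatalities per domestic passenger enplanement
  All events: $n_f/\mathrm{departures}$
  Maintenance events: $n_{f_m}/\mathrm{departures}$
  Maintenance vs. all events: $p(m \mid f)$, as in the previous row

Average number of fatal accidents per year
  All events: $n_{af}/\mathrm{years}$
  Maintenance events: $n_{af_m}/\mathrm{years}$
  Maintenance vs. all events: $p(m \mid af) = n_{af_m}/n_{af}$

Proportion of fatal accidents
  All events: $n_{af}/n_a$
  Maintenance events: $n_{af_m}/n_{a_m}$
  Maintenance vs. all events: $af_{\mathrm{magnifier}} = (n_{af_m}/n_{a_m})/(n_{af}/n_a)$

Average number of fatalities per accident
  All events: $n_f/n_a$
  Maintenance events: $n_{f_m}/n_{a_m}$
  Maintenance vs. all events: $f_{\mathrm{magnifier}} = (n_{f_m}/n_{a_m})/(n_f/n_a)$

Average percentage of fatalities per fatal accident
  All events: $(n_{f|af}/\mathrm{PAX})/n_{af}$
  Maintenance events: $(n_{f|af_m}/\mathrm{PAX})/n_{af_m}$
  Maintenance vs. all events: $g_{\mathrm{magnifier}} = \left((n_{f|af_m}/\mathrm{PAX})/n_{af_m}\right) / \left((n_{f|af}/\mathrm{PAX})/n_{af}\right)$

Passenger mortality risk*
  All events: $Q_a = \sum_{i=1}^{\mathrm{departures}} x_{ia} / \mathrm{departures}$
  Maintenance events: $Q_{a_m} = \sum_{i=1}^{\mathrm{departures}} x_{ia_m} / \mathrm{departures}$
  Maintenance vs. all events: $Q(m \mid a) = Q_{a_m}/Q_a$

Average number of incidents per year
  All events: $n_i/\mathrm{years}$
  Maintenance events: $n_{i_m}/\mathrm{years}$
  Maintenance vs. all events: $p(m \mid i) = n_{i_m}/n_i$

Average number of enforcement actions per year
  All events: $n_e/\mathrm{years}$
  Maintenance events: $n_{e_m}/\mathrm{years}$
  Maintenance vs. all events: $p(m \mid e) = n_{e_m}/n_e$

* $x_{ia}$ is the fraction of fatalities in accident $i$ and $x_{ia_m}$ is the fraction of fatalities in maintenance-involving accident $i$. $x$ is calculated by dividing the number of fatalities on the aircraft by the total number of passengers and crew. Thus in an accident with zero fatalities $x$ is zero; an accident where everyone perishes has $x = 1$.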
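The passenger mortality risk row of Table 5 can be sketched in a few lines of Python. Flights without fatalities contribute $x = 0$ to the sum, so only the fatality fractions of accident flights need to be listed. The function name and the sample values below are hypothetical and are not taken from the accident data.

```python
def passenger_mortality_risk(fatality_fractions, departures):
    """Q = (sum of per-flight fatality fractions x_i) / departures.

    `fatality_fractions` lists x_i for accident flights only; every other
    flight has x_i = 0 and adds nothing to the sum.
    """
    if departures <= 0:
        raise ValueError("departures must be positive")
    for x in fatality_fractions:
        if not 0.0 <= x <= 1.0:
            raise ValueError("each fatality fraction must lie in [0, 1]")
    return sum(fatality_fractions) / departures


# Hypothetical example (assumed values, for illustration only).
departures = 10_000_000
x_all = [1.0, 0.25, 0.0, 0.5]   # fatality fractions for all accidents
x_maint = [1.0, 0.0]            # subset that involved maintenance

q_a = passenger_mortality_risk(x_all, departures)
q_a_m = passenger_mortality_risk(x_maint, departures)
q_m_given_a = q_a_m / q_a       # share of mortality risk involving maintenance

print(f"Q_a        = 1 in {1 / q_a:,.0f} flights")
print(f"Q_a_m      = 1 in {1 / q_a_m:,.0f} flights")
print(f"Q(m|a)     = {q_m_given_a:.1%}")
```

Because both risks share the same denominator, the ratio $Q(m \mid a)$ depends only on the summed fatality fractions, not on the number of departures.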

An alternative fatality magnifier, $g_{\mathrm{magnifier}}$, also indicates how much more deadly maintenance accidents are, by considering the percentage of fatalities:

$$g_{\mathrm{magnifier}} = \frac{(n_{f|af_m}/\mathrm{PAX})/n_{af_m}}{(n_{f|af}/\mathrm{PAX})/n_{af}} \qquad (3)$$

The magnifier works as follows: if $g_{\mathrm{magnifier}}$ is less than one, then maintenance accidents are less deadly than accidents in general; if it is greater than one, then maintenance accidents are more deadly than accidents in general.

$Q(a)$ is the passenger mortality risk as defined by Barnett [3,4] and is the probability that a passenger will perish on a randomly chosen flight. $Q(m \mid a)$ is the fraction of passenger mortality risk that involves maintenance. Finally, $p(m \mid i)$ is the probability that maintenance was involved given that an incident occurred, and $p(m \mid e)$ is the probability that maintenance was involved given that an enforcement action occurred. We used airline traffic data from the US Bureau of Transportation Statistics (BTS) to calculate the metrics that are normalized by traffic level [5].

Next, we fill in this scorecard by considering the contribution of maintenance to the most serious adverse events (accidents), to less serious incidents, and to the precursors of adverse events. We use three primary data sources:

(1) The National Transportation Safety Board accident database: this database includes detailed reports on all accidents involving U.S. carriers from 1962 to the present.
(2) The FAA incident database (this database includes incident reports of varying detail) and the FAA database of enforcement actions.
(3) The Bureau of Transportation Statistics, which provides airline-reported data including numbers of flights and passengers for all flights originating or terminating in the United States.

3. Accident analysis

There are many examples that illustrate the importance of maintenance to safety.
For instance, United Airlines Flight 232 crashed on July 19, 1989 as a result of a failure to detect a fatigue crack in one of the engine’s fan disks.¹ The investigation found the inspection and quality control procedures were inadequate [14]. The crack led to the uncontained destruction of the engine during flight. Of the 285 passengers, 110 lost their lives, along with one of the 11 crewmembers. Statistical analyses focusing on the contribution of maintenance are rare. This section presents an empirical analysis of commercial passenger aviation accidents in the United States using the scorecard presented in Section 2. We show trends for the full period; safety metrics are given for 1999–2008.

¹ This section is based on and extends a conference paper presented at the European Safety and Reliability Conference (ESREL) in 2010 [21].

3.1. NTSB accident database

The analysis of accident trends is based on Part 121 passenger operations accidents investigated by the National Transportation Safety Board (NTSB) between 1962 and 2008. The NTSB defines an accident as “an occurrence associated with the operation of an aircraft where as a result of the operation of an aircraft, any person (either inside or outside the aircraft) receives fatal or serious injury or any aircraft receives substantial damage”. Part 121 is the section of the Code of Federal Regulations (CFR) that refers to “scheduled or nonscheduled passenger-carrying operations that adhere to regulations that limit operations to controlled airspace and controlled airports for which specific weather, navigational, operational, and maintenance support are available” [16]. Scheduled Part 121 includes most passenger airline operations with more than 10 seats, and nonscheduled Part 121 includes cargo operations in large aircraft [17].² Only scheduled passenger operations are included in this study.

Table 6 shows the four primary categories used by the NTSB to classify accidents [15]. It is usually impossible to assign exact importance levels to all the factors involved in an accident. Experts often disagree as to the relative importance of one particular cause over another. NTSB reports reflect this difficulty in that they do not assign relative importance to multiple causes and factors. Therefore we considered an accident to be maintenance-related if it had at least one maintenance-related cause or factor.
The pre-1982 database describes events in a similar fashion, but with less detail. The transition from the old to the new database in 1982 is indicated on the graphs with a vertical black bar. Direct comparisons between the two periods should be made with care.

² Before March 1997, scheduled Part 121 included only passenger aircraft with more than 30 seats.

Table 6
NTSB accident classification categories.

Category                                   Example(s)
IA: Primary non-person-related findings
  Aircraft structure                       Control surfaces, rudder, fuselage, landing gear
  Aircraft system                          Autopilot, hydraulic systems
  Powerplant                               Bleed air system, compressor assembly, fuel system
  Miscellaneous aircraft/equipment         Lights, coolant, fuel, lavatory
  ATC/weather/airport facility/equipment   Approach aids, radar, meteorological services
  Miscellaneous/publication                Aircraft manuals, charts and other manuals
IB: Primary person-related findings
  Aircraft/equipment performance           Autopilot, communication equipment, navigation instruments
  Operations/ATC/maintenance               Missed approach, aircraft control, compensation for wind
II: Direct underlying events               Inadequate design, inadequate training, physiological conditions
III: Indirect underlying events            Inadequate surveillance of operation, insufficient standards

3.2. Trends in aviation accidents

We first determined how often the different NTSB subject codes were assigned from 1999 to 2008, as shown in Fig. 1. For example, Aircraft Structure was listed as a cause or contributing factor in about 7% of 1095 investigations. Operations/ATC/Maintenance is often reported as a cause or factor, because it encompasses a broad range of factors including pilot error and communication. Fig. 2 therefore shows a breakdown of the 460 accidents in the ‘Operations/ATC/Maintenance’ category. Maintenance accounted for approximately 4.3% of all subject codes listed in Operations/ATC/Maintenance.

The detail with which causes are placed in different categories affects how much information we can obtain from the analysis. For example, in the maintenance category, the general maintenance and inspection codes are listed most frequently, as shown in Fig. 3. The maintenance category (24100) is ambiguous and was most likely used to classify either routine maintenance activities or causes/factors that did not fall into any other category [8]. Thus, while it is clear that inadequate inspection is a common problem, we cannot determine from the subject codes whether particular kinds of maintenance errors are more prevalent than others.

Fig. 4 shows the trends in the number of accidents. As mentioned earlier, we considered an accident to be maintenance-related if it had at least one maintenance-related cause or factor. Aircraft technology and safety as well as airline traffic have generally increased over time, but over the last five decades the number of maintenance-related accidents did not decrease commensurately, as is apparent from the graph and as indicated by the Pearson’s correlation coefficient of 0.50 for the five decades. More recently, maintenance accidents have tracked accidents more closely, with a Pearson’s correlation coefficient of 0.70 from 1999 to 2008. We found that maintenance has, on average, been a cause of or contributing factor to 4.1% of all accidents between 1999 and 2008. That is, in our maintenance risk scorecard, $p(m \mid a)$ is 4.1%, as shown in Table 8.
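The two quantities used in this paragraph, the Pearson correlation between yearly total and maintenance-related accident counts and the conditional probability that maintenance was involved given an accident, can be sketched as follows. The yearly counts are hypothetical placeholders, not the NTSB series, and `pearson_r` is an illustrative helper written with the standard library only.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical yearly counts for a ten-year window (assumed values).
total_accidents = [50, 45, 40, 42, 35, 30, 33, 28, 26, 25]
maint_accidents = [3, 2, 2, 3, 1, 1, 2, 1, 0, 1]

r = pearson_r(total_accidents, maint_accidents)
# Share of accidents with at least one maintenance-related cause or factor.
p_m_given_a = sum(maint_accidents) / sum(total_accidents)

print(f"Pearson r = {r:.2f}")
print(f"p(m|a)    = {p_m_given_a:.1%}")
```

With only ten yearly points and very small maintenance counts, a coefficient of this kind is sensitive to single-year changes, which is consistent with the caution the text applies to the fatal-accident series.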

[Figure omitted. Horizontal axis: Percentage of Reports. Categories: Aircraft Structure; Aircraft System; Powerplant; Miscellaneous Aircraft/Equipment; ATC/Meteorological/Airport Facility/Equipment; Terrain/Runway Conditions; Object; Weather Conditions; Light Conditions; Aircraft/Equipment Performance; Operations/ATC/Maintenance; Direct Underlying; Indirect Underlying.]

Fig. 1. Proportions of subject codes listed as contributions to aircraft accidents from 1999 to 2008. Note that since multiple subject codes can be listed for each accident, the relative frequencies do not sum to one hundred percent.

[Figure omitted. Horizontal axis: Proportion of Reports. Categories: Aircraft Handling; Airport; Communications/Information/ATC; Dispatch; Maintenance; Meteorological Service; Planning/Decision; Rotorcraft Operations; Miscellaneous; Undetermined.]

Fig. 2. Detailed breakdown of the 460 accidents in ‘Operations/ATC/Maintenance’ category 1999–2008.

[Figure omitted. Axes: Maintenance Code vs. Count. Series: Cause, Factor, N/A. Codes shown: 24116 Major Repair; 24113 Modification; 24103 Compliance with AD; 24122 Major Overhaul (engine); 24118 Record Keeping; 24112 Lubrication; 24106 AAIP/Progressive Program; 24121 Overhaul; 24115 Replacement; 24107 Adjustment; 24111 Installation; 24119 Service Bulletin/Letter; 24101 Service of Aircraft/Equipment; 24100 Maintenance; 24102 Inspection.]

Fig. 3. Frequency distribution of maintenance codes 1999–2008.

[Figure omitted. Axes: Year vs. Number of Accidents. Series: Total Accidents, Maintenance-Related Accidents.]

Fig. 4. Contribution of maintenance to the number of accidents. The vertical black bar at 1982 indicates the change in classification scheme.

[Figure omitted. Axes: Year vs. Accidents / Million Revenue Departures. Series: Total Accidents, Maintenance-Related Accidents.]

Fig. 5. Trends in the number of accidents per million revenue departures.

Since 1999 the number of accidents per revenue departure (both overall and maintenance-related) has declined, as shown in Fig. 5. From 1999 to 2008 the number of accidents per revenue departure declined by about 43% per year, though with a low r² value of 0.76. Total accidents and maintenance-related accidents have declined in a similar manner; from 1999 to 2008 the Pearson’s correlation coefficient between the total accidents and maintenance-related accidents is 0.80. An average of 3.8 accidents occurred per million revenue departures between 1999 and 2008. Further, 0.16 maintenance-related accidents occurred per million revenue departures. As before, we found that, given that an accident occurred, there is a 4.1% probability that maintenance was involved.

Next we determined the number of fatal accidents per year and how many are linked to maintenance, as shown in Fig. 6. Here the Pearson’s correlation coefficient for 1999–2008 is zero, indicating that the relationship is indeterminate, due, in part, to the small numbers involved. Maintenance is linked to between 0 and 2 fatal accidents per year. In some years, maintenance contributes to a significant percentage of all fatal accidents. Between 1999 and 2008, 26.7% of all fatal accidents were maintenance-related. Further, while only 4.4% of all accidents were fatal, 28.6% of maintenance accidents were fatal. That is, a maintenance-related accident was 6.5 times more likely to involve at least one fatality than accidents in general.

[Figure omitted. Axes: Year vs. Number of Fatal Accidents. Series: Total, Maintenance Related.]

Fig. 6. Contribution of maintenance to the number of fatal accidents.

The peaks in fatalities during 1979, 1988, 1989, and 1996 are caused by a few high-fatality accidents. The peaks in fatalities due to maintenance are the result of a single accident in every instance except for 1989, 1996 and 2000, where they are the result of two accidents in each case. Maintenance accidents, if fatal, tend to result in many fatalities. To investigate this possibility further, we compared the total fatalities associated with aviation accidents with those associated with maintenance in particular, as shown in Figs. 7 and 8.

[Figure omitted. Axes: Year vs. Number of Fatalities. Series: Total Fatalities, Maintenance-Related Fatalities.]

Fig. 7. Contribution of maintenance to the number of fatalities.

[Figure omitted. Axes: Year vs. Average Number of Fatalities per Accident. Series: Total, Maintenance Related.]

Fig. 8. Average number of fatalities per accident. The outlier in 1979 is the 273 fatality American Airlines accident at O’Hare International Airport in Chicago.

From 1999 to 2008, maintenance was linked to 27.4% of all aviation fatalities. Recalling that maintenance was involved in only 4.1% of accidents, this number suggests that maintenance-related accidents do tend to have a higher fatality rate than accidents overall. Indeed, between 1999 and 2008, the average number of fatalities for all accidents was 1.3, while for maintenance-related accidents it was 7.9, suggesting that the “fatality risk magnifier” for maintenance accidents is 6.13. Here the Pearson’s correlation coefficient is 0.13, indicating there is little relationship between the number of fatalities and the number of maintenance-related fatalities. This finding is expected given the high fatality risk magnifier.

As shown in Fig. 9, the number of fatalities relative to flights is low. Between 1999 and 2008, an average of 0.77 fatalities occurred per 10 million passenger departures. The rate of maintenance-related fatalities was 0.21 in 10 million departures. Again, given that a fatality has occurred, there is a 27.4% probability that maintenance was involved.

[Figure omitted. Axes: Year vs. Fatalities / 10 Million Revenue Passenger Enplanements. Series as labeled in the original legend: Total Accidents, Maintenance-Related Accidents.]

Fig. 9. Trend in the fatalities per 10 million passenger enplanements.

The significant impact of maintenance on safety is also shown by the average proportion of fatalities per fatal accident. Fatal accidents in general experienced 22.4% fatalities, while maintenance-related accidents experienced 49.2% fatalities. That is, maintenance-related accidents experienced 2.2 times more fatalities.

Finally, we calculated the last set of metrics in our scorecard: the passenger mortality risk per randomly chosen flight. Recall that, as discussed in Section 2, the passenger mortality risk was defined by Barnett [3,4] as the probability of a passenger dying on a randomly chosen flight. Combining all the events from 1999 to 2008 we obtain death risks of 1 in 14.6 million (total) and 1 in 21.9 million (maintenance-related). These results compare well with those found by Barnett [3]. Therefore, maintenance is related to 14.6/21.9 = 66.5% of the passenger mortality risk in commercial aviation. That is, in our maintenance risk scorecard, $Q(m \mid a)$ for 1999–2008 is 66.5%, as shown in Table 8. Proper maintenance is an essential contributor to the high levels of safety we experience today; in contrast, improper maintenance can have tragic effects.
In this section we have quantified the impact of improper maintenance on aviation accident risk and found it to be small but significant. While the data does not indicate that maintenance is becoming a larger contributor, it also does not show any convincing trends of decline in the contribution of maintenance. Therefore it is essential that we continue to improve maintenance practices.
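The ratio-style metrics in this section can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration built from the rounded values quoted in the text; the variable and function names are our own assumptions, not the authors' notation. Because the inputs are rounded, the mortality-risk share comes out at roughly two thirds rather than exactly the 66.5% reported.

```python
# Rounded 1999-2008 values quoted in the text (percentages as fractions).
p_fatal_all = 0.044        # share of all accidents that were fatal
p_fatal_maint = 0.286      # share of maintenance-related accidents that were fatal
frac_killed_all = 0.224    # average proportion of fatalities per fatal accident (all)
frac_killed_maint = 0.492  # same metric for maintenance-related accidents

risk_all = 1 / 14.6e6      # passenger mortality risk, all events
risk_maint = 1 / 21.9e6    # passenger mortality risk, maintenance-related events


def magnifier(maintenance_value, overall_value):
    """Ratio of a maintenance-related metric to the same metric for all events."""
    return maintenance_value / overall_value


# How much more likely a maintenance-related accident is to be fatal.
af_magnifier = magnifier(p_fatal_maint, p_fatal_all)

# How much larger the proportion of fatalities is in maintenance-related fatal accidents.
g_magnifier = magnifier(frac_killed_maint, frac_killed_all)

# Share of the passenger mortality risk related to maintenance (14.6 / 21.9).
mortality_share = risk_maint / risk_all

print(f"fatal-accident magnifier: {af_magnifier:.2f}")
print(f"fatality-proportion magnifier: {g_magnifier:.2f}")
print(f"maintenance share of mortality risk: {mortality_share:.1%}")
```

Using one helper for every ratio makes it explicit that each "magnifier" compares the same metric across the two populations.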

4. Incident analysis

Many maintenance errors do not result in catastrophic accidents but have the potential to do so. For example, on June 17th, 2004, the air data computers on American Airlines Flight 44 failed during takeoff, resulting in the loss of both airspeed indicators, both altimeters and several other instruments. The flight crew was forced to rely on standby instrumentation. The improper securing of the air data computers was determined to be the cause of the failure [6]. Another incident involving maintenance occurred during a USA 3000 flight on May 17th, 2008. The pilots experienced problems with the flight controls shortly after takeoff from Miami International Airport. The aircraft returned to Miami and experienced difficulties with the spoilers during landing, which caused the left wing tip to contact the ground. The pilots managed to recover, abort the landing, and make a successful second attempt. The investigation revealed that the left wing spoilers were left in the maintenance mode, causing the aircraft to handle in an unexpected manner [7]. Thus incidents can aid in identifying leading indicators of future accidents. We therefore analyzed all commercial aviation incidents recorded by the FAA between 1999 and 2008.³ The NTSB and FAA define an incident as “an occurrence other than an accident, associated with the operation of an aircraft, which affects or could affect the safety of operations” [17].⁴

4.1. Incident recording and classification system

Each report in the FAA incident database provides information regarding the event such as the date, time, location, and type of aircraft, and provides a short narrative of the event. The database is not structured in a way that lends itself well to an empirical analysis. Although some of the pertinent information is summarized, it is administrative in nature. We therefore developed the incident classification scheme shown in Table 7. The scheme attempts to strike a balance between offering sufficient categories to distinguish between incidents and offering so many categories that it merely reiterates the reports. We reviewed the 3242 incident reports filed between 1999 and 2008 and classified each incident into one of the categories listed in Table 7. Each incident was also classified as maintenance-related or non-maintenance-related, depending on the particular details of the case.
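The classification step can be pictured as a small tallying routine. The following is a minimal sketch, not the authors' tooling: the `Incident` record, its field names, and the sample data are hypothetical, and it assumes that reports which do not clearly cite maintenance are counted as non-maintenance, which makes the resulting maintenance share a lower bound.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Incident:
    """Hypothetical record for one classified incident report."""
    category: str                       # top-level Table 7 category
    maintenance_cited: Optional[bool]   # True/False if explicit, None if unclear


def is_maintenance_related(incident):
    # Conservative rule: unclear reports (None) are not counted as maintenance.
    return incident.maintenance_cited is True


def maintenance_share_by_category(incidents):
    """Fraction of incidents in each category classified as maintenance-related."""
    totals = Counter(i.category for i in incidents)
    maint = Counter(i.category for i in incidents if is_maintenance_related(i))
    return {category: maint[category] / n for category, n in totals.items()}


# Hypothetical sample, for illustration only (not data from the FAA database).
sample = [
    Incident("Mechanical failure/malfunction", True),
    Incident("Mechanical failure/malfunction", None),
    Incident("Mechanical failure/malfunction", False),
    Incident("Mechanical failure/malfunction", None),
    Incident("Miscellaneous", False),
]
shares = maintenance_share_by_category(sample)
print(shares)
```

Keeping "unclear" as a separate value rather than forcing a yes/no answer makes it easy to see how much the estimate could rise if ambiguous reports were investigated further.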
Some incidents such as disruptive passengers are clearly not maintenance-related, while other reports explicitly mention inadequate maintenance. If the distinction was less clear (e.g., failed bearings with no further information), we did not classify the incident as maintenance-related; thus the impact of maintenance is underestimated. We also assume that not all incidents are reported. While the ASRS is anonymous, it is voluntary, and as a result, underreporting is likely to occur. In contrast, the FAA incident database records all incidents that were investigated by the FAA. However, not all incidents are reported to the FAA, so we assume that here, too, underreporting occurs.

³ Since the FAA incident database does not distinguish between different parts of commercial aviation, our investigation of FAA incidents includes all commercial aviation activities and not just those operating under Part 121.

⁴ Recall that an accident is defined as “an occurrence associated with the operation of an aircraft where as a result of the operation of an aircraft, any person (either inside or outside the aircraft) receives fatal or serious injury or any aircraft receives substantial damage”.

Table 7
Incident cause classification scheme.

Category | Description
Mechanical failure/malfunction | Incidents caused by component/system failures or malfunctions
  Engine | Includes the engine and all auxiliary systems
  Landing gear | Includes the landing gear and auxiliary systems (except for the tires)
  Control surfaces | Includes the flaps and/or slats and all auxiliary systems
  Tire(s) | Includes all tire failures
  Other | Includes all other components or systems
Pilot error | Incidents caused by actions or inactions of the flight crew
  Taxi | Flight segment in which incident occurred
  Takeoff | Flight segment in which incident occurred
  Climb | Flight segment in which incident occurred
  Cruise | Flight segment in which incident occurred
  Approach | Flight segment in which incident occurred
  Landing | Flight segment in which incident occurred
Smoke/fire | Incidents caused by smoke and/or fire
Ground crew | Incidents caused by actions of the ground crew
  Pushback/tug | Includes all incidents caused by a tug (i.e., during pushback or towing)
  De-icing | Includes all incidents caused by a de-icing truck
  Other airport vehicles | Includes all incidents caused by other airport vehicles
  Marshal/wing walker | Includes all incidents caused by either the marshal or a wing walker
FOD | Incidents caused by FOD (Foreign Object Damage)
  Bird(s) | FOD incidents caused by birds
  Other | All other FOD incidents
Miscellaneous | Various other incidents
  Turbulence | Incidents caused by turbulence
  Intoxicated/violent PAXs | Incidents caused by intoxicated, violent and/or disruptive passengers
  Smoking PAXs | Incidents caused by smoking passengers
  Other | All other incidents

Fig. 10. Trends in maintenance-related incidents. [Plot: Number of Incidents and Percentage by Year, 1999–2008; series: All Reports, Maintenance Incidents, Percentage Related to Maintenance.]

Fig. 11. Distribution of all maintenance-related failures 1999–2008. [Bar chart: Percentage of Incidents by category (Mech. Failure/Malfunction, Smoke/Fire, Ground Crew, FOD, Pilot Error, Miscellaneous); series: Distribution of FAA Incidents, Percentage Related to Maintenance.]

4.2. Trends in aviation incidents

At least 6.8% of all commercial aviation incidents in the FAA database between 1999 and 2008 are maintenance-related. Fig. 10 shows the trends in incidents and maintenance incidents. While the number of reported incidents has declined from 1999 to 2008, the trend in the percentage of maintenance-related incidents is less clear. Fig. 11 summarizes all the FAA incidents by category as well as the fraction of those incidents that we identified as being maintenance-related from 1999 to 2008. Note that 58% of incidents involve mechanical malfunction or failure, and maintenance was involved in 10% of these incidents.

In most years, mechanical failure (where a component breaks, for example a tire blows) or malfunction (where a component does not operate correctly, for example, a sensor gives an incorrect indication) is the primary type of maintenance incident. Over the ten-year period, an average of 85% of maintenance incidents were related to component or structure failures. We further investigated the distribution of these types of incidents, as shown in Figs. 12 and 13. Engines and landing gear are the most likely to result in a component failure incident, with inadequate maintenance contributing to 11% and 9% of these incidents, respectively. Further, if a maintenance incident occurs, it is most likely to occur on the landing gear or the engine: 41% of maintenance component failures involved the landing gear (35%) or tires (6%), and 31% involved the engines. In addition, 42% of engine-related failures occurred during the initial phases of flight, from engine start to climb. To what extent these failures are preventable is an important topic for future work. For example, oil and hydraulic leaks are frequently cited as reasons for engine failure or shutdown; such problems should be detected prior to takeoff.

Fig. 12. Distribution of mechanical failures and malfunctions 1999–2008. [Bar chart: Percentage of Incidents by component (Engine, Landing Gear, Control Surfaces, Tire(s), Other); series: Distribution of Mech. Failures/Malfunctions, Percentage Related to Maintenance.]

Fig. 13. Distribution of maintenance-related mechanical failures and malfunctions 1999–2008.

While Part 121 aviation accidents are fortunately rare, incidents occur much more often and can provide additional insight into the industry’s risk. In Section 3 we found that the probability of maintenance being cited as a factor in an accident from 1999 to 2008 was 4.1%. Our analysis of incidents over the same time period revealed a similar level of contribution: we found that maintenance was involved in at least 6.8% of incidents. The slightly higher number may be due in part to the wider range of operations covered by our incident analysis; if so, it would indicate that maintenance problems are more likely outside Part 121. The slightly higher number may also be an indication of crews’ and maintenance technicians’ skill in recovering from incidents; for example, crews are trained and airports equipped for landings where the nose gear is not deployed.

5. Analysis of fines and other penalties

Accidents and incidents are the unfortunate outcomes of errors and failures. There are several ways that civil aviation entities can prevent or reduce the likelihood of these errors and failures. In this section, we analyze the punitive actions that the FAA has taken against airlines and associated organizations. When air carriers and maintenance facilities (AC&MF) do not perform maintenance as required by federal regulations, they run the risk of incurring costs in the form of monetary fines. These fines range widely in magnitude but can be substantial. For example, in 2007, Southwest Airlines was fined $7.5 M for failing to inspect the fuselage of 46 of their aircraft for potential cracks. The problem persisted for almost 60,000 flights between 2006 and 2007. Another, more recent example of a considerable fine is the $24.4 M fine imposed on American Airlines in 2010 for failing to adhere to an airworthiness directive. American Airlines failed to inspect and repair wires in the wheel wells of its MD-80 fleet as instructed by the FAA. American Airlines’ lack of maintenance could have caused a fire to occur and could possibly have led to a fuel tank explosion [12]. Had the FAA imposed the maximum penalty per occurrence, the fine could have been as high as $713.9 M, a crippling amount. Here we complement our analysis of accidents and incidents with an analysis of the penalties imposed by the FAA on aviation entities between 1999 and 2008 to estimate the extent to which maintenance contributes to operator actions that may contribute to risk.


Fig. 14. Enforcement activities per year. The black lines show the total numbers for all operators, while the red lines show the numbers for only air carriers and maintenance facilities (AC&MF). [Plot: Number of Legal Enforcement Actions by Year of Closing Date, 1999–2008; series: All Case Types, All Case Types [AC & MF], Maintenance, Maintenance [AC & MF].]

Fig. 15. Fraction of enforcement actions related to maintenance. [Plot: Percentage of Enforcement Actions due to Maintenance by Year of Closing Date, 1999–2008; series: All Aviation Entities, AC & MF.]

5.1. FAA enforcement reports

The FAA releases quarterly reports summarizing all the enforcement actions that were closed against aviation entities during that quarter. The reports list each action and identify the type of aviation entity involved, the type of legal action taken, and the case type, as well as other pertinent information such as the magnitude of the fine imposed or the duration of a license suspension. Aviation entities can be fined multiple times for the same violation. For many types of infractions, aviation entities can be fined once for each day the problem persists or once for each occurrence of the violation. The FAA usually consolidates these violations into one large fine. Since the quarterly reports provide only the number of legal actions taken, we assume that we are undercounting the number of violations. The total number of enforcement actions taken by the FAA every year is probably lower than the number of violations that occurred since not all violations are detected. Further, the large peak in the number of enforcement actions taken in 2001 suggests that the level of policing does not necessarily remain constant and may increase after major aviation events.

5.2. Trends and results

Our analysis shows that 35% of all enforcement actions closed between 1999 and 2008 were the result of maintenance violations.⁵ When we restrict the analysis to air carriers and maintenance facilities (AC&MF), thus excluding aircraft manufacturers, component manufacturers, agricultural operators and other types of aviation entities, we find a similar result of 36%. Figs. 14 and 15 illustrate the number of enforcement actions closed each year as well as the fraction that are maintenance-related. Note that enforcement was increased in 2001, probably as a result of the terrorist attacks. On average, the FAA closed 519 cases per year against all types of aviation entities and 438 per year against air carriers and maintenance facilities. Of these, 180 and 159, respectively, were the result of inadequate maintenance. On average, 36% of enforcement actions are related to maintenance, with an apparent decline for 2006–2008. On average, 6.0 penalties are imposed per 100,000 revenue departures, of which 2.1 are maintenance-related. When considering only AC&MF, these figures drop slightly to 5.1 and 1.9, respectively. In other words, most enforcement actions are taken against AC&MF, with other organizations such as aircraft manufacturers, component manufacturers, and agricultural operators incurring relatively few actions. Further, the number of enforcements is uncorrelated with the number of departures (r² = 0.46), that is, as airlines fly more, enforcement does not increase.

⁵ Since we are concerned with the risk of aviation from an engineering and operational standpoint, we omit fines and violations related to security.

Different types of enforcement actions can be taken by the FAA. Whether we consider all aviation entities or just air carriers and maintenance facilities, the most common type of legal action is the civil penalty, which is essentially a fine. The other enforcement actions are certificate revocations or suspensions, which mean the airline cannot operate, and consent orders, which are agreements between the FAA and the aviation entities being prosecuted where the aviation entities agree to take steps to enhance the safety of their operations. We found that 85% of enforcement actions are filed against either air carriers or maintenance facilities, which is expected as these two types of aviation entities account for a large portion of the aviation industry (other entities include aircraft manufacturers, component manufacturers, and agricultural operators). 94% of all enforcement actions are orders assessing a civil penalty (95% for AC&MF only). Fig.
16 illustrates the total monetary amount of fines paid by aviation entities each year. Aviation entities paid a total of approximately $8.6 M (USD) in fines each year, of which $2.7 M (USD) were due to maintenance violations. Air carriers and maintenance facilities contribute, on average, a collective total of $7.4 M (USD) towards the total amount paid each year. Of this figure, $2.4 M (USD) was paid as a result of maintenance violations. Fines of more than a million dollars are rarely levied or collected. During the ten-year period eleven fines of a million dollars or more (including a fine of $999,999 levied against American Airlines in 2007) were levied, with two of these fines being waived. Fig. 17 illustrates the average monetary cost of an infraction each year. On average, AC&MF pay approximately $18,000 (USD) per infraction. Maintenance infractions are, on average, slightly less costly at $16,000 (USD).

The consistency in the number of enforcements coupled with the surge in 2001 suggests that there may be an opportunity to improve inspection; this remains an important area for further investigation. Further, the relatively low average fine is unlikely to be a significant deterrent to aviation entities. Even the headline-grabbing million-dollar fines are usually negotiated down. The most recent three years analyzed show an apparent decline in the percentage of enforcement actions relating to maintenance. Further analysis is needed to determine whether this trend is real, and, if so, whether it can be attributed to improved maintenance practices.

Fig. 16. Annual monetary cost of penalties. [Plot: Total amount of Civil Penalties [$] by Year of Closing Date, 1999–2008; series: Total, Maintenance, Total [AC & MF], Maintenance [AC & MF].]

Fig. 17. Average monetary cost of an infraction. [Plot: Average Civil Penalties [$] by Year of Closing Date, 1999–2008; series: Total, Maintenance, Total [AC & MF], Maintenance [AC & MF].]
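The enforcement shares reported in Section 5.2 can be approximately reproduced from the yearly averages quoted there. This is only a rough cross-check under the assumption that the rounded averages are representative; the variable names are ours, and small differences from the published percentages (for example about 84% here versus the 85% reported for the AC&MF share) are to be expected because the paper presumably worked from the underlying records.

```python
# Average enforcement actions closed per year, as quoted in Section 5.2.
all_actions = 519    # all aviation entities
all_maint = 180      # of which maintenance-related
acmf_actions = 438   # air carriers and maintenance facilities (AC&MF) only
acmf_maint = 159     # of which maintenance-related


def share(part, whole):
    """Fraction of `whole` accounted for by `part`."""
    return part / whole


maint_share_all = share(all_maint, all_actions)     # maintenance share, all entities
maint_share_acmf = share(acmf_maint, acmf_actions)  # maintenance share, AC&MF only
acmf_share = share(acmf_actions, all_actions)       # AC&MF share of all actions

print(f"maintenance share (all entities): {maint_share_all:.1%}")
print(f"maintenance share (AC&MF): {maint_share_acmf:.1%}")
print(f"AC&MF share of all actions: {acmf_share:.1%}")
```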

6. Other data sources

This section briefly reviews four other data sources: the aviation safety reporting system (ASRS), airworthiness directives (AD), special airworthiness information bulletins (SAIB), and Service Difficulty Reports (SDR).

The ASRS adds two interesting perspectives to the analysis (NASA, 2011). First, since the ASRS is voluntary and penalty-free, it promotes the reporting of incidents that are more serious or that break federal regulations. Second, while the FAA database records only incidents associated with the operation of an aircraft, the ASRS contains many reports that describe a variety of purely maintenance-related incidents. Some reports state that a component or system may have been improperly serviced or inspected. Other reports describe how a previously maintained component was discovered to have been improperly serviced. Our initial analysis of the ASRS indicates that there were 36,021 ASRS incident reports for passenger operations operating under Part 121 between 1999 and 2008. Since the database does not lend itself to a full-text search, we used the number of reports submitted by maintenance personnel as a proxy for the number of maintenance-related incidents. We assume that maintenance personnel are unlikely to report non-maintenance issues. Of all reports, 7.2% (2585 reports) were submitted by maintenance personnel. This number is slightly larger than the 6.8% of FAA incidents that we classified as maintenance-related but is most likely still conservative because it does not include maintenance incidents reported by non-maintenance personnel.

To improve the safety of aviation operations, the FAA can issue airworthiness directives (ADs). Airworthiness directives are defined by the FAA as “legally enforceable rules that apply to the following products: aircraft, aircraft engines, propellers and appliances” ([1] Pt. 1-59). The FAA will issue an AD when it discovers that “an unsafe condition exists in the product and the condition is likely to exist or develop in other products of the same type design” ([1] Pt. 1-59). Each operation of the aviation product that does not comply with the rules of the AD results in a violation, and if caught, the operator is subject to a fine for each occurrence, as described in the previous section.
An AD can address an airworthiness condition in several different ways. For example, it may demand a one-time inspection to be performed on a particular component or system. It may also require changes to the maintenance plan. Changes may have to be made to the inspection interval, the inspection process or the procedures surrounding the service of a particular component or system. Further, many ADs are the result of mandatory continuing airworthiness information (MCAI) issued by foreign governments.

If the concern has not yet been determined to be hazardous or if the issue does not warrant an AD, the directorates have the option to issue a different type of advisory called a special airworthiness information bulletin (SAIB). The SAIB can be used to advise the aviation community about methods to improve safety. SAIBs, however, are not mandatory or enforced like ADs are. The FAA can issue an SAIB for any airworthiness concern that does not, at the time of the issuance, pose a safety concern that would warrant the issue of a formal AD [26].

Not all maintenance issues warrant the issuance of an airworthiness directive or SAIB. Such less serious issues are recorded in the Service Difficulty Reporting System (SDRS). This electronic database is operated by the FAA and consists of a collection of service difficulty reports (SDRs) that are voluntarily submitted by the aviation community. The FAA recommends that a report be filed whenever a system or component of an aircraft fails to function properly [7]. The FAA uses the SDRS to help identify important and/or widespread maintenance issues. These issues may then result in a more in-depth investigation and/or the issuance of an SAIB or AD.

ADs and SAIBs are available as PDF documents online but have not been consolidated into a text-searchable database. The SDR database is also still incomplete, which precluded us from collecting statistically meaningful data. Our initial investigation did find that the number of ADs and SAIBs issued is not correlated with either the number of aircraft or number of aircraft models in service. This finding suggests that the issuance of these bulletins is constrained by resource limitations within the FAA.

7. Discussion

Table 8 summarizes the results of the accident, incident, and enforcement actions we analyzed for 1999–2008. Maintenance-related accidents have, in general, had a higher fatality rate than accidents overall. Maintenance accounts for between 0 and 2 fatal accidents per year and these accidents tend to have high numbers of fatalities. In other words, when the kinds of extensive or critical failures that lead to fatal accidents occur, it is likely that maintenance was involved. Maintenance accidents are approximately 6.5 times more likely to involve fatalities than accidents in general. Further, these accidents tend to result in more fatalities, with approximately 3.6 times more fatalities on average. Similarly, using the passenger mortality risk metric, maintenance is related to 50.7% of the risk. Our analysis of trends indicates that this contribution has remained fairly constant over the past decade.

Maintenance errors are significant and potentially devastating. Our analyses of incidents, AD, and SAIB indicate that maintenance errors occur frequently, both in terms of failure to adhere to maintenance plans, and in terms of inadequate maintenance plans. Our findings therefore suggest that there is significant potential to improve aviation safety by reducing maintenance errors.

Between 1999 and 2008, maintenance was the cause of approximately 6.8% of all commercial aviation incidents recorded by the FAA. Of all incidents, 58% involved mechanical failure or malfunction. It is likely that these failures are preventable. For example, oil and hydraulic leaks are frequently cited as reasons for engine failure or shutdown; such problems may be detectable prior to takeoff.

The prevalence of maintenance-related fines and legal actions suggests that many air carriers and maintenance facilities are not always taking the required preventive measures to ensure the safe operation of their aircraft. On average, a significant 438 penalties are imposed on airlines and maintenance facilities each year. Of these actions, 36% involve inadequate maintenance, with recent years showing a decline to about 20%. The coming years may show whether this decline is a reflection of improved maintenance practices.

Figs. 18 and 19 summarize the number of events, ranges in severity and the level of investigation for the risk categories. Different categories of risk are treated in different ways. Accidents are investigated in great depth and the investigations often yield useful guidance on improving safety. Since accidents are fortunately rare, such learning opportunities are infrequent. Other aviation events are investigated in much less depth. Our review of incident reports, in particular, found a wide range in investigative depth.

To continue to improve safety, we recommend increasing the level of investigation surrounding aviation incidents. Many of the FAA incident reports simply state that a particular component failed. More detailed investigations would reveal the root causes of component failures and would, we suspect, identify inadequate maintenance to be an important factor. A higher level of investigation would also reveal how maintenance contributes to component failures (e.g., through improper installation or inadequate inspection). More generally, increased incident investigation would also shed light on other risk drivers such as pilot error. This statistical information could then be used to improve safety. For example, investigations could reveal common maintenance problems across airlines. The costs and benefits of such increased investigation will likely have to be traded off against other risk management efforts; an interesting avenue for future work is estimating the cost of reducing risk through different approaches.

Table 8
Aviation safety scorecard 1999–2008.

Metric | All events | Maintenance events | Maintenance vs. all events
Average number of accidents per year | 34 | 1.4 | p(m|a) = 4.1%
Average number of accidents per domestic revenue departure | 3.8 per million | 0.16 per million |
Average number of fatalities per year | 127.7 | 18.8 | p(m|f) = 14.7%
Average number of fatalities per domestic passenger enplanement | 2.1 per 10 million | 0.31 per 10 million |
Average number of fatal accidents per year | 1.4 | 0.4 | p(m|af) = 26.7%
Proportion of fatal accidents | 4.4% | 28.6% | af magnifier = 6.48
Average number of fatalities per accident | 3.76 | 13.43 | f magnifier = 3.57
Average proportion of fatalities per fatal accident | 22.4% | 49.2% | g magnifier = 2.20
Passenger mortality risk | 1 in 14.6 million | 1 in 21.9 million | Q(m|a) = 66.5%
Average number of incidents per year | 308 | 21 | p(m|i) = 6.8%
Average number of enforcement actions per year | 438 | 159 | p(m|e) = 36%

Fig. 18. Ranges in the severity and the level of investigation of maintenance events. [Diagram: Severity (Low to High) against Level of Investigation (Low to High) for Accidents, FAA Incidents, ASRS Incidents, ADs, SDRs, and SAIBs.]

Fig. 19. Ranges in the number of maintenance-related events and the level of investigation. [Diagram: Number of Maintenance Events per Year (Low to High) against Level of Investigation (Low to High) for SDRs, ASRS Incidents, ADs, SAIBs, FAA Incidents, and Accidents.]

The wealth of information present within the ASRS should also be exploited. The ASRS contains many incidents that illustrate the role of human factors in aviation maintenance. An analysis of these incidents would allow the key human-factors issues to be identified. Improving the quality and utility of the Service Difficulty Reporting System (SDRS) would reduce the time needed to identify and resolve potential safety issues. Improving the structure of the SDRS would provide the FAA with a more powerful tool that could be used to proactively combat risk. Finally, exploiting and improving the AD database, ASRS, and SDRS could also reveal any underlying design and manufacturing flaws that lead to inadequate maintenance.

As a final note, we attempted to perform correlation analyses between the accidents, incidents, and enforcement actions. Here we were stymied by several limiting factors, both in terms of the data and in terms of the nature of aviation safety. While ten years of aviation provides a wealth of data, it is insufficient to determine with statistical certainty the effects of actions in one domain (for example, enforcements) on aviation risk, because these effects may be subject to significant time delays. Further, we were not always able to restrict our analysis to Part 121. We hope to expand our data coverage to enable such analysis in future work.
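The difficulty with time delays can be illustrated with a minimal sketch of a lagged correlation. It is not the analysis attempted in this study: the two yearly series are hypothetical placeholders, and the function names are illustrative. The point it shows is that with ten yearly observations, each additional year of assumed delay removes one of the already few overlapping pairs.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def lagged_pairs(cause, effect, lag):
    """Pair cause[t] with effect[t + lag] for a delay of `lag` years."""
    if lag < 0 or lag >= len(cause):
        raise ValueError("lag must satisfy 0 <= lag < len(cause)")
    return cause[: len(cause) - lag], effect[lag:]


# Hypothetical placeholder values for ten years; NOT data from this study.
enforcements = [12, 15, 11, 18, 14, 16, 13, 17, 15, 19]
incidents = [30, 28, 31, 27, 29, 26, 30, 27, 28, 25]

for lag in range(4):
    x, y = lagged_pairs(enforcements, incidents, lag)
    # The number of usable pairs shrinks from 10 to 7 as the lag grows.
    print(f"lag={lag} years, pairs={len(x)}, r={pearson(x, y):.2f}")
```

With so few pairs, a correlation coefficient at any single lag carries wide uncertainty, and testing several candidate lags compounds the problem.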

8. Conclusions

We began this paper by posing three questions about aviation maintenance:

1. What is the extent of maintenance's contribution to commercial passenger aviation risk, and what is the trend in this contribution?
2. How effective are the mechanisms used to ensure safety at reducing maintenance's contribution to aviation risk?
3. Are there mechanisms that can be improved?

To help answer the first question, we analyzed several safety-related metrics and developed an aviation maintenance risk scorecard that collects these metrics to synthesize a comprehensive track record of maintenance contribution to airline accidents and incidents. Our findings echo those of others in that aviation risk has decreased substantially since 1962, although it has leveled off somewhat in recent years. The percentage of maintenance-related accidents has remained close to constant. Unfortunately, the remaining maintenance accidents tend to be more serious than accidents in general, and our analysis of incidents and enforcement actions suggests that the prevalence of maintenance problems is not declining.

In a safety-critical industry, inspection and maintenance schedules and practices should be designed in such a way that safety-critical component failures are minimized. The high proportion of maintenance-related enforcements and incidents involving mechanical failures suggests that there may be issues surrounding both the enforcement and design of maintenance plans.

While accident investigations can provide a wealth of information to improve safety, accidents are fortunately rare. Incidents should be investigated in more depth; here the ASRS could provide useful additional information. Similarly, the SDRS provides a potentially effective way of reducing risk, an opportunity that is frequently missed at present because the system is difficult to use.

More finely grained data, for example by airline or by geographic region, would be useful and is a subject of ongoing work. Here our goal was to provide a quantitative basis for estimating the contribution of maintenance to risk. In future work we hope to engage operators and regulators in more finely grained analyses of maintenance and its contribution to aviation risk.

References

[1] 14 CFR, Pt. 1-59. Retrieved 15 March 2011 from Electronic Code of Federal Regulations: http://ecfr.gpoaccess.gov/.
[2] Aviation Safety Network. Aircraft accident Boeing 747SR-46 JA8119. Retrieved 12 August 2009 from aviation safety network database: http://aviation-safety.net/database/record.php?id=19850812-1.
[3] Barnett A. Measure for measure. Aerosafety World Magazine; 2007. 48–52, November 2007.
[4] Barnett A. Aviation safety and security. In: Belobaba Peter, Odoni Amedeo, Barnhart Cynthia, editors. In The Global Airline Industry. Wiley; 2009. p. 2009.
[5] BTS. About the BTS. Retrieved 30 March 2011 from Bureau of transportation statistics: 2011 http://www.bts.gov/about/.
[6] FAA. ASIAS brief report—20040617009879C. Retrieved 21 January 2011 from aviation safety information analysis and sharing (ASIAS) system: 2004a http://www.asias.faa.gov.
[7] FAA. ASIAS brief report—20080517835079C. Retrieved 12 February 2011 from aviation safety information analysis and sharing (ASIAS) system: 2008 http://www.asias.faa.gov.
[8] Goldman SM, Fiedler ER, King RE. General aviation maintenance related accidents: A review of ten years of NTSB data. Office of Aerospace Medicine. 2002; Springfield, VA: Federal Aviation Administration.
[9] Graeber RC, Marx DA. Reducing human error in aircraft maintenance operations. Proceedings of the flight safety foundation international federation of airworthiness 46th annual international air safety seminar (pp. 147–60). 1993; Arlington, VA: Flight Safety Foundation.
[10] Latorella KA, Prabhu PV. A review of human error in aviation maintenance and inspection. International Journal of Industrial Ergonomics 2000;26(2):133–61.
[11] Milmo. Airbus launches investigation into A380 cracks. The Guardian, 15 February 2012. Retrieved 09 March 2012 from 2011 www.guardian.co.uk.
[12] Mouawad J. FAA proposes record $24 million fine against American Airlines. Retrieved 9 December 2010 from New York Times: 2010 http://www.nytimes.com/2010/08/27/business/27fine.html.
[13] NASA. ASRS—aviation safety reporting system. Retrieved 30 March 2011 from Program Briefing: 2011 http://asrs.arc.nasa.gov/overview/summary.html.
[14] NTSB. Aircraft accident report: United Airlines Flight 232 (NTSB/AAR-90/06). 1990; Washington, DC: National Transportation Safety Board (NTSB).
[15] NTSB. Avdata. Retrieved 11 January 2009 from national transportation safety board—aviation. 1999 http://www.ntsb.gov/avdata/codman.pdf.
[16] NTSB. Annual review of aircraft accident data—2005 (NTSB/ARC-09/01). 2009a; Washington, DC: National Transportation Safety Board (NTSB).
[17] NTSB. NTSB aviation. Retrieved 17 February 2010 from aviation accident database: 2009b http://www.ntsb.gov/aviationquery/AviationQueryHelp.html.
[19] Prabhu P, Drury CG. A framework for the design of the aircraft inspection information environment. Proceedings of the seventh FAA meeting on human factors issues in aircraft maintenance and inspection. 1992; Washington, DC: Federal Aviation Administration.
[21] Robichaud MR, Arnac SR, Marais KB. The impact of maintenance on passenger airline safety, European safety and reliability conference (ESREL 2010), 2010; Rhodes, Greece, September 2010.
[22] Saleh JH, Marais KB, Bakolas E, Cowlagi RV. Highlights from the literature on accident causation and system safety: review of major ideas, recent contributions, and challenges. Reliability Engineering and System Safety 2010.
[23] Schlangenstein M. Boeing calls for more inspections after 767 engine-pylon cracks. Retrieved 19 August 2010 from Bloomberg. 2010 http://www.bloomberg.com/news/2010-06-22/american-air-awaiting-test-results-on-cracks-in-boeing-767-engine-pylons.html.
[24] Wiegmann DA, Shappell SA. Human error analysis of commercial aviation accidents: application of the Human Factors Analysis and Classification System (HFACS). Aviation, Space, and Environmental Medicine 2001;72(11):1006–16.
[25] Marx D. Looking towards 2000: the evolution of human factors in maintenance. Proceedings of the sixth FAA meeting on human factors issues in aircraft maintenance and inspection (pp. 64–76). 1992; Springfield, VA: National Technical Information Service.
[26] FAA. Order 8110.100A. Washington, DC; September 18, 2010.