Reliability Engineering and System Safety 124 (2014) 201–206
Reliability Engineering and System Safety journal homepage: www.elsevier.com/locate/ress
Proportional hazards models of infrastructure system recovery
Kash Barker*, Hiba Baroud
School of Industrial and Systems Engineering, University of Oklahoma, USA
Article info
Article history: Received 25 July 2013; Received in revised form 18 December 2013; Accepted 21 December 2013; Available online 30 December 2013
Keywords: Infrastructure systems; Recovery; Proportional hazards

Abstract
As emphasis is being placed on a system's ability to withstand and to recover from a disruptive event, collectively referred to as dynamic resilience, there exists a need to quantify a system's ability to bounce back after a disruptive event. This work applies a statistical technique from biostatistics, the proportional hazards model, to describe (i) the instantaneous rate of recovery of an infrastructure system and (ii) the likelihood that recovery occurs prior to a given point in time. A major benefit of the proportional hazards model is its ability to describe a recovery event as a function of time as well as covariates describing the infrastructure system or disruptive event, among others, which can also vary with time. The proportional hazards approach is illustrated with a publicly available electric power outage data set. © 2013 Elsevier Ltd. All rights reserved.
Correspondence to: School of Industrial and Systems Engineering, University of Oklahoma, 202 West Boyd Street, Room 124, Norman, OK 73019, USA. Tel.: +1 405 325 3721; fax: +1 405 325 7555. E-mail address: [email protected] (K. Barker).
0951-8320/$ - see front matter © 2013 Elsevier Ltd. All rights reserved. http://dx.doi.org/10.1016/j.ress.2013.12.004

1. Introduction
The resilience of infrastructure systems, both in the US and globally, is of significant concern. Following the events of September 11, 2001, the primary planning concern revolved around the protection from and prevention of terrorist attacks. However, as accidents and natural disasters become more prevalent and more impactful (e.g., Hurricanes Katrina and Rita in 2005, the Deepwater Horizon oil spill in 2010, the Japanese earthquake and tsunami in 2011), recent efforts have focused on resilience, or the ability to “bounce back” from such disruptions. Several works on resilience [1–6] demonstrate the significance of accepting that disruptions will indeed occur and focusing on two aspects: (i) lessening the impact of such disruptions, and (ii) improving the speed with which recovery occurs. As such, the US Department of Homeland Security [7] emphasizes “strengthen national preparedness, timely response, and rapid recovery” of US infrastructure, particularly due to its interconnectedness with other infrastructure, industries, and the workforce.

Resilience is often described as a function of robustness, or the ability of a system to resist the initial adverse effects of a disruptive event, and rapidity, or the rate or speed at which a system is able to return to an appropriate operability following the disruption [8]. Modeling the rapidity aspect of resilience, particularly as a rate of recovery, is addressed in this work.

Introduced here is a data-driven statistical technique to model the recovery of infrastructure systems following a disruptive event. Guikema [9] provides an introduction to the use of statistical methods (e.g., generalized linear models) for performing probabilistic risk analysis, with several applications in modeling and estimating the initial impacts of natural disasters to electric power service [10,11], among other infrastructures [12,13]. MacKenzie and Barker [14] apply regression to the study of interdependent recovery following an electric power outage to populate parameters in a multi-industry interdependency model.

This work describes the proportional hazards model (PHM) [15], a standard survival analysis technique in biostatistics with applications found in reliability engineering, as a means to derive a temporal, condition-based rate of recovery for infrastructures impacted by a disruptive event. Reliability growth can be described as a function of the dynamic assumptions of the underlying system [16]. Reliability applications of the PHM include analyzing factors that impact the reliability behavior of systems and components [17] and planning for preventive maintenance repairs due to degrading condition variables [18].

Similar to reliability analysis, policymakers should strive for improved resilience analysis encompassing the recovery dynamics of a disrupted system. Preparedness activities should enhance not only the reliability of the system but also its resilience (e.g., its strength in responding to and recovering from a disruption [19]). Further, decisions should be made prior to and in the aftermath of a disruption to effectively allocate resources to reduce impacts [20] and enhance recovery activities [21]. Resilience modeling so far has been concerned with the optimization of preparedness and resource allocation strategies under different circumstances, but no research exists that models the trajectory of recovery over time given external and internal impacts. Different systems have different recovery paths due to their structure, the nature of the disruption, and the surrounding environment, among many other factors.
A model that can capture this relationship and describe the evolution of recovery over time is the PHM. We extend the use of the PHM, primarily used to date to model failure rate or infection rate (with sparse applications in modeling repair rates [22,23]), to model recovery rate and supplement the rapidity component of existing resilience paradigms. As will be depicted later, we model time-to-recover and its associated covariates rather than the time-to-failure of traditional reliability analysis. Section 2 provides background on the PHM, while Section 3 discusses its use in the infrastructure recovery context, with an application in recovery from electric power outages from publicly available data. Concluding remarks are provided in Section 4.
2. Methodological background
Survival analysis is a technique used to describe the duration between events. Many phenomena in medical research, engineering, and economics can be described using survival analysis techniques. For example, medical histories [24–26] and the failure of engineered systems [27–29] have been described using survival analysis. Survival distributions can be described by four functions at time $t$: (i) the probability density function $f(t)$, (ii) the cumulative distribution function $F(t)$, (iii) the survivor (or reliability [30]) function $S(t) = 1 - F(t)$, and (iv) the hazard function $h(t) = f(t)/S(t)$.

One means to estimate the survival function with the incorporation of covariate effects, or the risk factors influencing the occurrence of and duration between events, is the proportional hazards model (PHM) [15]. Provided in Eq. (1), the PHM hazard function describing the rate at which failures occur is a function of a time-driven baseline hazard function, $h_0(t)$, and the state/condition of the system (or covariates influencing the hazard), vector $x(t)$. $\beta$ is a vector of regression coefficients reflecting the effect of the state of the system on the hazard function.

$$h(t, x(t)) = h_0(t)\exp\left(\beta^{T} x(t)\right) \tag{1}$$
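To make the relationships among $f(t)$, $F(t)$, $S(t)$, and $h(t)$ and the form of Eq. (1) concrete, the following is a minimal Python sketch. It assumes a Weibull time-to-event distribution as the baseline, which is an illustrative choice only; the shape and scale values, the coefficient vector, and all function names are hypothetical and are not taken from the source.

```python
import math

def weibull_pdf(t, k, lam):
    """Probability density f(t) of a Weibull distribution (shape k, scale lam)."""
    return (k / lam) * (t / lam) ** (k - 1) * math.exp(-((t / lam) ** k))

def weibull_cdf(t, k, lam):
    """Cumulative distribution function F(t)."""
    return 1.0 - math.exp(-((t / lam) ** k))

def weibull_survivor(t, k, lam):
    """Survivor (reliability) function S(t) = 1 - F(t)."""
    return 1.0 - weibull_cdf(t, k, lam)

def weibull_hazard(t, k, lam):
    """Hazard function h(t) = f(t) / S(t)."""
    return weibull_pdf(t, k, lam) / weibull_survivor(t, k, lam)

def ph_hazard(t, x, beta, baseline):
    """Eq. (1): h(t, x) = h0(t) * exp(beta^T x) for a covariate vector x."""
    linear_predictor = sum(b * xi for b, xi in zip(beta, x))
    return baseline(t) * math.exp(linear_predictor)

# Hypothetical values, chosen only for illustration.
K, LAM = 1.5, 100.0
BETA = [0.5, -0.2]

def h0(t):
    """Baseline hazard: the Weibull hazard with the hypothetical K and LAM."""
    return weibull_hazard(t, K, LAM)

# A unit increase in the first covariate scales the hazard by exp(0.5),
# regardless of t; this is the "proportional" part of the model.
ratio = ph_hazard(50.0, [1, 0], BETA, h0) / ph_hazard(50.0, [0, 0], BETA, h0)
```

The ratio computed at the end does not depend on the chosen time, which is the property that lets $\beta$ be interpreted without reference to the baseline.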
Generally speaking, in a reliability engineering context, $h_0(t)$ can be derived from a pdf fit to time-to-failure data. A system is viewed periodically during inspection, and at each inspection time $t_i$, the covariates $x(t_i)$ are recorded, as well as a 0/1 indicator of “no failure” or “failure.” Regression parameters are then fit using the method of maximum likelihood. Likewise for a biostatistics application, patients are observed at $t_i$, physical characteristics $x(t_i)$ of the patient are recorded, and the presence or absence of an ailment is noted.

A primary reason for the popularity of the PHM is that it allows the effect of covariates on the hazard function, and ultimately on the likelihood of the event, to be assessed with a semi-parametric approach (i.e., solving for $\beta$ without specifying a baseline hazard function, $h_0(t)$). As such, the PHM is typically used for descriptive purposes (i.e., identifying factors that significantly impact hazard/survival and interpreting the elements of $\beta$), not prescriptive purposes (i.e., actually estimating hazard/survival given a set of covariates) (e.g., [31]).

Another reason for the popularity of the PHM in certain contexts is that observations can be censored, a term describing an observation for which the event is not observed during the observation period (e.g., when applying the PHM to model the occurrence of cancer in patients, some patients may leave the study without cancer appearing). Such censored observations are no less important than those for which observation is complete, yet they would not be included in otherwise similar statistical methods (e.g., logistic regression). The survival function is estimated using the Breslow estimator, where the function takes the form of a step function due to the assumption that hazard between distinct failure points is constant [26].

Other approaches can incorporate the effects of covariates on the hazard function or the probability of event occurrence.
Generalized linear models can determine the likelihood or rate of occurrence of an event given covariates with a wide range of distributions for the link function, though their ability to capture time-varying conditions is lacking [32]. The accelerated failure time model is used especially when an underlying hazard function is known, or when the baseline hazard function would vary from observation to observation [33]. A more general representation of the relationship between the survivor function $S(t, x)$ and the baseline survival function $S_0(t)$ is the Royston–Parmar family of models, which rely on the transformation $g(\cdot)$ such that $g(S(t, x)) = g(S_0(t)) + \beta^{T} x$, where $g(\cdot)$ can represent the proportional hazard, proportional odds, or probit families [34]. However, the PHM can account for the effect of time-varying covariates, $x(t)$, which is particularly useful for post-disruption decision making as the state of the system can vary over time. Allison [35] points out that the approach given time-varying covariates is technically a nonproportional hazards model. More specifically, the PHM assesses time-independent covariates, while extensions of the PHM (e.g., Cox regression, extended Cox regression) model time-varying covariates or a combination thereof [22,35–37].
3. Infrastructure recovery with proportional hazards models
The use of PHMs has mostly been limited to describing hazard (or failure) rates, though many other important data-driven rates of occurrence can be described as a function of (i) time, and (ii) covariates. As such, the idea of modeling the time-dependent, state-of-the-system-dependent evolution of occurrence rates is extended in this work to modeling the rate of recovery of infrastructure systems following a disruptive event. As the parameter $\mu$ is often used in a maintenance context to describe repair, such notation is adopted in Eq. (2). The usefulness of $\mu(t, x(t))$ lies in modeling how the recovery rate changes over time (e.g., the recovery rate would likely decrease over time after the initial disruption), perhaps more so when accounting for $x(t)$. Recalling that resilience is a function of robustness and rapidity, this extension of the PHM, when paired with a measure of robustness, helps derive the concept of rapidity from data sources. Eq. (2) accounts for both time-independent and time-varying covariates, $x$ and $x(t)$, respectively.
$$\mu(t, x(t)) = \mu_0(t)\exp\left(\beta^{T} x + \beta^{T} x(t)\right) \tag{2}$$
Further, the likelihood $V(t, x(t))$ of full recovery before time $t$ and under condition $x(t)$, shown in Eq. (3), is extended from the reliability literature: $V(t, x(t))$ would be the equivalent of the cumulative distribution function for failure, or the probability that the event occurs prior to time $t$ when $x(t)$ is exhibited at $t$. This measure would provide a decision maker with an idea of how likely it is that recovery will occur by a given point in time and for a given set of covariates (potentially time-varying covariates representing state variables at time $t$).

$$V(t, x(t)) = 1 - \exp\left(-\int_0^t \mu(y, x(y))\,dy\right) \tag{3}$$
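When covariates vary with time, the integral in Eq. (3) generally has to be evaluated numerically. The following is a minimal Python sketch under stated assumptions: a constant baseline recovery rate, a single step-function covariate (a crew count), and a midpoint-rule integration. The values `MU0` and `BETA`, the covariate `crews`, and the function names are hypothetical and serve only to illustrate the calculation.

```python
import math

def recovery_likelihood(t, rate, steps=1000):
    """Eq. (3): V(t) = 1 - exp(-integral_0^t rate(y) dy), using the midpoint rule."""
    if t <= 0:
        return 0.0
    dy = t / steps
    integral = sum(rate((i + 0.5) * dy) for i in range(steps)) * dy
    return 1.0 - math.exp(-integral)

MU0 = 0.02   # hypothetical constant baseline recovery rate (recoveries/hour)
BETA = 0.1   # hypothetical coefficient on the number of repair crews

def crews(y):
    """Hypothetical time-varying covariate: 2 crews for the first 10 h, then 5."""
    return 2 if y < 10 else 5

def rate(y):
    """Recovery rate mu(y, x(y)) = mu0 * exp(beta * x(y))."""
    return MU0 * math.exp(BETA * crews(y))

# Likelihood of full recovery within 24 h under the assumed crew schedule.
# 2400 steps make the step boundaries line up with the change at y = 10 h.
v24 = recovery_likelihood(24.0, rate, steps=2400)
```

Because the assumed rate is piecewise constant, the result can be checked against the closed form $1 - \exp\{-\mu_0[10e^{2\beta} + 14e^{5\beta}]\}$; for a general $x(t)$ no such check is available and the step count becomes a tuning choice.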
A qualitative illustration of Eqs. (2) and (3) could include the response to and recovery of disrupted highway segments by dispatching emergency assistance vehicles. Time-invariant covariates could include the particular highway segment, number of vehicles involved, response category issued, and the location from which emergency assistance was transmitted. Time-varying covariates could include the number of responders expediting the cleanup. Periodic review of the accident could result in a 0/1 outcome of “cleanup still underway” or “cleanup complete and traffic resumed.” $\mu(t, x(t))$ would provide decision makers with an idea of how the recovery rate progresses over time with certain covariate values. Similarly, $V(t, x(t))$ provides the likelihood that recovery will occur prior to time $t$ under certain conditions modeled with the covariates.
Note that the model portrayed in Eqs. (2) and (3) assumes that all covariates vary with time, thus presenting the PHM in its broadest form. Some combination of covariates could vary with time or be independent of time. However, due to the nature of the data collected in the illustrative example that follows, covariates are assumed to be time independent.
$$\mu(t, x) = \mu_0(t)\exp\left(\beta^{T} x\right) \tag{4}$$
3.1. Illustrative example: electric power recovery
Because of their importance to the economy and the availability of data, power outages have been frequently examined in the risk analysis and reliability literature [10,38,39], with primary focus given to anticipating the impacts of disruptive events (e.g., number of customers affected). Interest here lies in using the statistical proportional hazards model to describe the trajectory of recovery of power outages.

The PHM approach to infrastructure recovery is illustrated with a case study of electric power outages in the United States. The set of data used describes US power outages from January 2002 to June 2009 [40–47]. Several data fields describing these outages are publicly available, including the time and date of a power outage, the power company that suffers the outage, the state or states affected by the power outage, the number of customers who lost power, the type or cause of disturbance, the loss in megawatts, and the date and time that power was restored. From these data fields, the variables chosen for the model are described in Table 1. Regional aspects of recovery were included in the model with the variable region as certain communities more prone to disruptions may have electric service providers more adept at quicker recovery (or vice versa). Further, the cause of the disturbance may impact recovery rate, and may be of interest in developing a prescriptive model of recovery. Thus, the variable cause was also included.

Unlike the reliability and biostatistics applications noted previously, the data available from the U.S. Energy Information Administration (EIA) do not contain periodic inspections (i.e., periodic observations of the progress of recovery).
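Since the EIA records give only the start and restoration times of each outage, inspection-style records have to be constructed from the outage duration. The following is a minimal Python sketch of one such construction, assuming a fixed inspection interval and a recovery flag that is 1 only on the final interval. The function name `expand_outage` and the row layout are hypothetical; the sample values are those of the example outage in Table 2.

```python
def expand_outage(outage_id, duration_h, interval_h=10.0, **covariates):
    """Split one outage into inspection intervals (TL, TU] with a 0/1 recovery flag.

    duration_h is the time to full restoration in hours; covariates are
    time-independent descriptors copied onto every row.
    """
    if duration_h <= 0 or interval_h <= 0:
        raise ValueError("duration_h and interval_h must be positive")
    rows = []
    lower = 0.0
    while lower < duration_h:
        # The last interval is truncated at the restoration time.
        upper = min(lower + interval_h, duration_h)
        rows.append({
            "ID": outage_id,
            **covariates,
            "TU": upper,
            "TL": lower,
            "Recovery": 1 if upper >= duration_h else 0,
        })
        lower = upper
    return rows

# Example outage from Table 2: 48.5 h, Midwest, thunderstorm, AM.
rows = expand_outage(
    147, 48.5,
    Region="midwest", Cause="thunderstorm", AM=1, Loss=168, Customers=184000,
)
for row in rows:
    print(row["TL"], row["TU"], row["Recovery"])
```

Changing `interval_h` is all that is needed to rerun the coding at a finer or coarser inspection interval, which is one way to set up a sensitivity analysis on that choice.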
As such, we artificially impose periodic inspection every 10 h: if the power outage was still ongoing at the observed time, the dependent variable recovery was recorded as 0 (indicating that the recovery event had not yet occurred), and recovery was recorded as 1 when the outage was recovered at the observed time. The 10-hour observation interval was chosen arbitrarily and could be reduced to better gauge recovery in a real-time situation. The periodic inspections would be more necessary in a situation where time-varying covariates are collected. A sensitivity analysis could be performed to assess the implications of the observation interval as covariates change over time.

Table 1 Variables used in the proportional hazards infrastructure recovery model.

| Variable | Description |
|---|---|
| Region | Author-defined geographical region categories in which the outage took place: Pacific, Southwest, Midwest, Northeast, East, Gulf |
| Cause | Author-defined broad categories based on the Energy Information Administration (EIA) specific causes: equipment failure, loading issues, fire, earthquake, thunderstorm, winter storm, hurricane, tornado |
| Time of day | Categorical variable describing whether the outage occurred in the AM or PM |
| Loss | Continuous variable quantifying the loss associated with the outage (in megawatts) |
| Customers | Continuous variable quantifying the peak number of customers without power during the outage |
| Duration | Continuous variable quantifying the time required for full power restoration (in hours) |

An example outage, which lasted 48.5 h, is shown in Table 2. Variable region was coded as a dummy variable relative to a Gulf baseline, and cause was coded relative to a thunderstorm baseline. TU and TL represent the current and previous inspection times, respectively, at each observation.

Table 2 Example coded power outage.

| ID | Region | Cause | AM | Loss | Customers | TU | TL | Recovery |
|---|---|---|---|---|---|---|---|---|
| 147 | midwest | thunderstorm | 1 | 168 | 184,000 | 10 | 0 | 0 |
| 147 | midwest | thunderstorm | 1 | 168 | 184,000 | 20 | 10 | 0 |
| 147 | midwest | thunderstorm | 1 | 168 | 184,000 | 30 | 20 | 0 |
| 147 | midwest | thunderstorm | 1 | 168 | 184,000 | 40 | 30 | 0 |
| 147 | midwest | thunderstorm | 1 | 168 | 184,000 | 48.5 | 40 | 1 |

The data set consists of 162 power outage observations, with descriptive statistics provided in Table 3. A histogram of the recovery times of the 162 observations is found in Fig. 1.

Table 3 Descriptive statistics.

| Descriptor | Sample size | Loss (MW), mean | Loss (MW), sd | Customers, mean | Customers, sd | Recovery time (h), mean | Recovery time (h), sd |
|---|---|---|---|---|---|---|---|
| All | 162 | 493.3 | 1236.1 | 233,339.2 | 411,374 | 82.7 | 86.9 |
| Region | | | | | | | |
| Pacific | 30 | 238.9 | 235.8 | 373,313.3 | 444,970 | 80.5 | 83.3 |
| Southwest | 7 | 322.3 | 127.5 | 339,191.1 | 682,690 | 179.0 | 113.5 |
| Midwest | 44 | 595.8 | 1634.8 | 214,384.3 | 366,580 | 92.2 | 64.1 |
| Northeast | 23 | 262.6 | 244.8 | 105,689.5 | 69,850 | 68.1 | 99.5 |
| East | 25 | 411.9 | 402.6 | 130,988.2 | 75,540 | 62.5 | 60.1 |
| Gulf | 33 | 846.9 | 1904.3 | 275,416.4 | 594,442.6 | 77.3 | 107.7 |
| Cause | | | | | | | |
| Equipment | 6 | 219.2 | 128.1 | 79,765.7 | 52,702.4 | 13.2 | 9.8 |
| Loading | 12 | 192.3 | 193.0 | 177,969.4 | 359,133.2 | 26.8 | 41.5 |
| Fire | 2 | 246.5 | 44.6 | 69,577.0 | 8,594.2 | 6.5 | 1.4 |
| Earthquake | 1 | 110.0 | – | 59,886.0 | – | 31.8 | – |
| Thunderstorm | 88 | 435.1 | 1174.0 | 184,201.9 | 371,661.6 | 70.7 | 70.0 |
| Winter storm | 34 | 297.7 | 260.2 | 326,983.1 | 397,401.9 | 107.4 | 74.4 |
| Hurricane | 18 | 1460.2 | 2460.1 | 415,249.6 | 784,101.7 | 168.5 | 141.3 |
| Tornado | 1 | 1000.0 | – | 186,000.0 | – | 55.4 | – |
| Time of day | | | | | | | |
| AM | 68 | 552.3 | 1456.6 | 232,296.1 | 349,940 | 88.0 | 88.8 |
| PM | 94 | 450.7 | 1054.7 | 234,093.8 | 452,480 | 79.0 | 85.8 |

Fig. 1. Frequency distribution of the recovery time (in hours).

Several distributions to fit these data were examined, thus providing the time-dependent distribution for use in the baseline hazard function, $\mu_0(t)$. The results of the goodness-of-fit tests for these distributions are found in Table 4.

Table 4 Anderson–Darling goodness-of-fit results for several distributions of recovery time.

| Distribution | Test statistic | p-value | Log-likelihood |
|---|---|---|---|
| Normal | 3.59 | <0.005 | −895 |
| Lognormal | 8.86 | <0.005 | −952 |
| Gamma | 0.34 | >0.25 | −875 |
| Weibull | 0.39 | >0.25 | −876 |
| Exponential | 0.81 | 0.203 | −877 |

Table 5 Proportional hazards model statistical results.

| Variable | Parameter estimate | Standard error | p-value | Hazard ratio |
|---|---|---|---|---|
| Pacific | −1.76 | 0.33 | <0.0001 | 0.17 |
| Southwest | −2.53 | 0.49 | <0.0001 | 0.08 |
| Midwest | −1.73 | 0.31 | <0.0001 | 0.18 |
| Northeast | −1.51 | 0.35 | <0.0001 | 0.22 |
| East | −1.03 | 0.32 | 0.0014 | 0.36 |
| Equipment | 1.54 | 0.46 | 0.0008 | 4.67 |
| Loading | 1.17 | 0.37 | 0.0015 | 3.23 |
| Fire | 2.60 | 0.80 | 0.0012 | 13.44 |
| Earthquake | 1.47 | 1.04 | 0.1561 | 4.36 |
| Winter storm | −0.34 | 0.21 | 0.1080 | 0.71 |
| Hurricane | −2.15 | 0.38 | <0.0001 | 0.12 |
| Tornado | 0.70 | 1.02 | 0.4933 | 2.01 |
| AM | −0.04 | 0.17 | 0.8291 | 0.96 |
| Loss | 1.18E−4 | 1.10E−4 | 0.2826 | 1.00 |
| Customers | −4.16E−7 | 2.39E−7 | 0.0811 | 1.00 |

The Anderson–Darling goodness-of-fit test was first employed. As some of the p-values are represented by a range instead of an exact value, we use the test statistic to draw inferences. The Anderson–Darling test is one-sided with a null hypothesis that the data follow the distribution in question; therefore, a large test statistic (or equivalently a small p-value) would reject the null hypothesis. From Table 4, the data are certainly not normally or lognormally distributed, but gamma, Weibull, or exponential distributions are candidates. The parameters of those distributions were computed using maximum likelihood estimation (MLE), and, consistent with the Anderson–Darling result, the gamma, Weibull, and exponential distributions had the best fit, with the largest log-likelihood values. Although the Weibull distribution did not score best in terms of the test statistic or the log-likelihood function, it is more suitable to represent a baseline recovery function as it produces a more interesting baseline recovery (i.e., hazard) function (nonconstant, unlike that of the exponential distribution), as shown in Eq. (5). The gamma distribution is typically used to model the aggregation of failure times for n components, or in this application, the aggregation of the times to recovery for several consecutive disruptive events. Various explorations into baseline hazard functions in reliability applications include Weibull [48], log-logistic [49] and constant [50].

The best fit parameters for the Weibull distribution were shape parameter $k = 0.9102$ and scale parameter $\lambda = 79.2683$. The time-dependent, state-dependent recovery rate, given the Weibull baseline, is then provided in Eq. (6). Note that for the EIA data set analyzed here, there were no time-varying covariates, $x(t)$, only constant descriptors of the outages, $x$, though Eq. (6) is written generally. A shape parameter of 0.9102 (i.e., less than 1) depicts a decreasing shape for the distribution of the recovery time, suggesting that most of the disruptions require a short time to be repaired and fully recovered. Drawing an analogy to decreasing failure rate, which is described by $0 < k < 1$ for the Weibull distribution applied in a reliability context, most recoveries take a short amount of time, but as recovery takes longer, the recovery rate gets smaller. This suggests that as time passes, recovery can take much longer. Note that this observation is particular to the power outage recovery example, as other recovery contexts might result in different shapes for the recovery rate trajectory.
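$$\mu_0(t) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1} \tag{5}$$

$$\mu(t, x(t)) = \frac{k}{\lambda}\left(\frac{t}{\lambda}\right)^{k-1}\exp\left(\beta^{T} x(t)\right) \tag{6}$$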
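The decreasing shape implied by a Weibull shape parameter below 1 can be checked directly from Eq. (5). The following is a minimal Python sketch using the shape and scale values reported in the text; the function names and the evaluation times are hypothetical choices made for illustration.

```python
K = 0.9102      # Weibull shape parameter reported in the text
LAM = 79.2683   # Weibull scale parameter reported in the text (hours)

def baseline_recovery_rate(t, k=K, lam=LAM):
    """Eq. (5): mu0(t) = (k / lam) * (t / lam) ** (k - 1), defined for t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    return (k / lam) * (t / lam) ** (k - 1)

# Evaluate the baseline at a few (arbitrarily chosen) times after the disruption.
times = (1.0, 10.0, 50.0, 100.0)
rates = [baseline_recovery_rate(t) for t in times]
for t, r in zip(times, rates):
    print(f"t = {t:6.1f} h  mu0 = {r:.5f} recoveries/hour")
```

With $k < 1$ the exponent $k - 1$ is negative, so the printed rates shrink as $t$ grows; with $k = 1$ the same function would reduce to the constant rate $1/\lambda$ of the exponential distribution.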
The regression coefficients, their statistical significance, and the hazard ratios for each variable are provided in Table 5. The Efron method [51] was used for handling tied recovery times in the calculations of the models, providing an approximation of the exact marginal log-likelihood. The coefficient estimates are indicators of the magnitude of the impact of the factors affecting the rate of recovery. Whether a factor has a positive or negative effect on the rate of recovery can be determined from the sign of the coefficient or from the hazard ratio. Inferences can be drawn from these estimates, provided they are sufficiently accurate (i.e., have a small standard error).

As mentioned previously, most of the covariates are dummy variables and are assumed to be time-invariant. For example, for the covariate describing the time of day when the power outage occurred, there are two possibilities: AM and PM. For an AM incident, $X_{AM} = 1$, and for a PM incident, $X_{AM} = 0$. The only two covariates that are not binary are (i) the loss of power in megawatts and (ii) the number of customers without power; both are constant over time for each incident.

The hazard ratio of a covariate, $c$, is a representation of the recovery (hazard) rate relative to the baseline covariate, $b$. It is computed as the ratio of the hazard for the covariate to the hazard of the baseline, shown in Eq. (7). Vector $x_c$ contains zero entries except for the corresponding covariate, which takes a value of 1 if it is a dummy variable or a constant value otherwise.

$$HR_c = \frac{\mu_0(t)\exp\left(\beta^{T} x_c\right)}{\mu_0(t)\exp\left(\beta^{T} x_b\right)} = \exp\left(\beta^{T} x_c - \beta^{T} x_b\right) \tag{7}$$
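For a single dummy variable, Eq. (7) reduces to the exponential of that variable's coefficient. The following is a minimal Python sketch that recomputes a few hazard ratios from the parameter estimates in Table 5. It assumes that the estimates for these variables are negative, an inference from their reported hazard ratios being below one; the function name `hazard_ratio` and the selection of variables are hypothetical.

```python
import math

def hazard_ratio(beta_c, x_c=1.0, x_b=0.0):
    """Eq. (7) for one covariate: exp(beta_c * x_c - beta_c * x_b)."""
    return math.exp(beta_c * x_c - beta_c * x_b)

# Parameter estimates from Table 5; the negative signs are an assumption
# inferred from the reported hazard ratios being smaller than one.
coefficients = {
    "Pacific": -1.76,
    "Southwest": -2.53,
    "Midwest": -1.73,
    "Hurricane": -2.15,
}

# Hazard ratio of each dummy variable relative to its baseline
# (Gulf for the regions, thunderstorm for the cause).
ratios = {name: round(hazard_ratio(b), 2) for name, b in coefficients.items()}
print(ratios)
```

The baseline $\mu_0(t)$ cancels in the ratio, so no distributional choice is needed for this step. Recomputing other rows from the two-decimal estimates can differ from the printed ratios in the last digit because the printed coefficients are themselves rounded.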
All region variables are statistically significant, suggesting that recovery rates from power outages vary across geographic locations (especially with respect to the Gulf baseline). All hazard ratios for the regional variables are smaller than one, indicating slower recovery rates for the regions relative to the Gulf. The Southwest region appears to have the largest impact on the recovery rate among the regions, with a parameter estimate of −2.53. Most cause variables were statistically significant relative to the thunderstorm initiating event, with the exceptions being earthquakes, winter storms, and tornados (there was only one occurrence each for the earthquake and tornado initiating events). Equipment, loading, and fire initiating events had hazard ratios greater than one, suggesting that those events increase the rate of recovery relative to the thunderstorm baseline: power is restored fastest when the outage is the result of a fire. Hurricanes, with a coefficient of −2.15, result in slower recovery rates relative to thunderstorms. The time of day in which the outage occurred, as well as the loss in megawatts incurred during the outage, were not statistically significant. The number of customers, while marginally significant, appears to have a small effect on slowing the rate of recovery.

The standard error of the estimates is directly linked to the number of observations. The estimates of the coefficients of the loss in megawatts and the time of the outage have small standard errors, suggesting a high accuracy in the estimation. For other covariates, such as the tornado, earthquake, and fire, the standard error is higher due to a much smaller number of observations in the data. A larger set of observations could impact the estimation of those covariates, their significance, and ultimately their hazard ratio.

Fig. 2. Trajectories of (a) recovery rate and (b) likelihood of recovery over time for example data in Table 2.

Applying the Weibull baseline recovery function to the likelihood of recovery prior to time $t$ in Eq. (3) results in Eq. (8). Note that a closed form solution is possible when covariates are not time-varying, as is the case with this illustration. Otherwise, a functional form of $x(t)$ is necessary.

$$V(t, x) = 1 - \left[\exp\left(-\left(\frac{t}{\lambda}\right)^{k}\right)\right]^{\exp\left(\beta^{T} x\right)} \tag{8}$$
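The step from Eq. (3) to Eq. (8) can be made explicit. The following short derivation assumes time-independent covariates $x$ and the Weibull baseline of Eq. (5), so that the factor $\exp(\beta^{T} x)$ can be taken outside the integral:

$$
\begin{aligned}
\int_0^t \mu(y, x)\,dy
  &= \exp\left(\beta^{T} x\right)\int_0^t \frac{k}{\lambda}\left(\frac{y}{\lambda}\right)^{k-1} dy
   = \exp\left(\beta^{T} x\right)\left(\frac{t}{\lambda}\right)^{k}, \\
V(t, x)
  &= 1 - \exp\left(-\left(\frac{t}{\lambda}\right)^{k}\exp\left(\beta^{T} x\right)\right)
   = 1 - \left[\exp\left(-\left(\frac{t}{\lambda}\right)^{k}\right)\right]^{\exp\left(\beta^{T} x\right)}.
\end{aligned}
$$

The bracketed term is the baseline survivor function of the Weibull distribution, so the covariates act as a power applied to the baseline probability that recovery has not yet occurred.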
To visualize the trajectory of the recovery rate and the likelihood of recovery, Eqs. (6) and (8) are plotted in Fig. 2 for the example outage described in Table 2, using the estimated values of the coefficients found in Table 5. The plots help decision makers envision how these two metrics change as a function of time while taking into account the impact of all covariates through the regression model. Fig. 2(a) suggests that the recovery rate (that is, recoveries/hour) decreases after the initial impact, as would be expected for recovery situations where many outages are repaired quickly. The cdf for recovery, shown in Fig. 2(b), shows that the likelihood of infrastructure recovery prior to $t$ understandably increases with increasing $t$.
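The two trajectories can be tabulated numerically from Eqs. (6) and (8). The following is a minimal Python sketch for the example outage of Table 2 (Midwest, thunderstorm baseline, AM, 168 MW, 184,000 customers), using the Weibull parameters from the text and the estimates from Table 5. Two assumptions are made and should be read as such: the Midwest, AM, and Customers coefficients are taken as negative, consistent with their hazard ratios and the discussion of results, and the Loss coefficient is taken as positive because its sign cannot be inferred from a hazard ratio of 1.00. Function names are hypothetical.

```python
import math

K, LAM = 0.9102, 79.2683  # Weibull shape and scale reported in the text

# Table 5 estimates; signs are assumptions as described in the lead-in.
BETA = {"midwest": -1.73, "AM": -0.04, "loss": 1.18e-4, "customers": -4.16e-7}
# Covariate values of the example outage in Table 2.
X = {"midwest": 1, "AM": 1, "loss": 168, "customers": 184000}

# Linear predictor beta^T x for the example outage.
LIN_PRED = sum(BETA[name] * X[name] for name in BETA)

def recovery_rate(t, lin_pred=LIN_PRED):
    """Eq. (6) with time-independent covariates, for t > 0."""
    return (K / LAM) * (t / LAM) ** (K - 1) * math.exp(lin_pred)

def recovery_likelihood(t, lin_pred=LIN_PRED):
    """Eq. (8): V(t, x) = 1 - exp(-(t / lam)^k * exp(beta^T x)), for t >= 0."""
    return 1.0 - math.exp(-((t / LAM) ** K) * math.exp(lin_pred))

# Tabulate both trajectories on a coarse grid of hours.
hours = [10.0 * i for i in range(1, 11)]
trajectory = [(t, recovery_rate(t), recovery_likelihood(t)) for t in hours]
for t, mu, v in trajectory:
    print(f"t = {t:5.1f} h  rate = {mu:.5f}  V = {v:.3f}")
```

The tabulated rate falls and the likelihood rises with $t$, matching the qualitative behavior described for Fig. 2; the specific numbers depend on the sign assumptions above and are not a reproduction of the published figure.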
4. Concluding Remarks
The proportional hazards model is a useful tool to model hazard functions in biostatistics and reliability engineering applications. However, as this work demonstrates, the PHM can more generally model rates of occurrence over time, with the added benefit of modeling the effect of covariates (including those that vary over time) on the rate of occurrence. One such rate illustrated here is infrastructure recovery following a disruptive event. The use of the PHM for modeling recovery can provide risk managers with a means to estimate the trajectory of the recovery rate and the likelihood of recovery for different scenarios described by the covariates (e.g., initiating disruptive event, geographic explicitness).

An illustrative example uses real, publicly available power outage data. According to statistical tests, the region and the cause of a power outage have a great impact on the recovery rate after the event has occurred. As a consequence, decision makers should take such results into consideration when making decisions regarding the recovery of power. Further, industries suffering from interdependent effects due to the power outage could make use of the output of the PHM in estimating the likelihood of recovery and plan accordingly to minimize losses.

The use of the PHM for describing infrastructure recovery would be most beneficial in situations where time-varying covariates are involved (e.g., regular updates of customers without power). Such a model would allow for a real-time update on the likelihood of recovery, particularly when a predictive model for the time-varying covariates can be estimated. Given the results of a time-varying PHM, decision makers are able to update the recovery activities accordingly. Consequently, the recovery process can become more efficient, with an improvement in the time to full recovery of the infrastructure system.

Future work will use the PHM approach to compare different recovery strategies that impact the state variables, thereby changing the trajectory and likelihood of recovery over time (e.g., certain strategies may have a higher likelihood of recovery immediately following the disruption rather than later). As such, the PHM approach can be used in a recovery decision making framework, where covariates are treated as decision variables.

References
[1] Holling CS. Engineering Resilience Versus Ecological Resilience. Engineering with Ecological Constraints. Washington, DC: National Academy Press; 1996. p. 31–44.
[2] Bruneau M, Chang SE, Eguchi RT, Lee GC, O'Rourke TD, Reinhorn AM, et al. A framework to quantitatively assess and enhance the seismic resilience of communities. Earthq Spectra 2003;19(4):733–52.
[3] Whitson JC, Ramirez-Marquez JE. Resiliency as a component importance measure in network reliability. Reliab Eng Syst Saf 2009;94(10):1685–93.
[4] Rose A. Economic Resilience to Disasters, Community and Regional Resilience Institute, Oakridge, TN, 2009.
[5] Reed DA, Kapur KC, Christie RD. Methodology for assessing the resilience of networked infrastructure. IEEE Syst J 2009;3(2):174–80.
[6] Zobel CW. Representing perceived tradeoffs in defining disaster resilience. Decis Support Syst 2011;50(2):394–403.
[7] Department of Homeland Security. National Infrastructure Protection Plan. Washington, DC: Office of the Secretary of Homeland Security; 2009.
[8] McDaniels T, Chang S, Cole D, Mikawoz J, Longstaff H. Fostering resilience to extreme events within infrastructure systems: characterizing decision contexts for mitigation and adaptation. Glob Environ Change 2008;18(2):310–8.
[9] Guikema SD. Natural disaster risk analysis for critical infrastructure systems: an approach based on statistical learning theory. Reliab Eng Syst Saf 2009;94(4):855–60.
[10] Han S-R, Guikema SD, Quiring SM. Improving the predictive accuracy of hurricane power outage forecasts using generalized additive models. Risk Anal 2009;29(10):1443–53.
[11] Guikema SD, Quiring SM, Han S-R. Prestorm estimation of hurricane damage to electric power distribution systems. Risk Anal 2010;29(10):1744–52.
[12] Guikema SD, Coffelt JP. Practical considerations in statistical modeling of count data for infrastructure systems. J Infrastruct Syst 2009;15(3):172–8.
[13] Guikema SD, Quiring SM. Hybrid data mining-regression for infrastructure risk assessment based on zero-inflated data. Reliab Eng Syst Saf 2012;99(1):178–82.
[14] MacKenzie CA, Barker K. Empirical data and regression analysis for estimation of infrastructure resilience, with application to electric power outages. J Infrastruct Syst 2013;19(1):25–35.
[15] Cox DR. Regression models and life-tables. J R Stat Soc B 1972;34(2):187–220.
[16] Evanco WM. Using a proportional hazards model to analyze software reliability. In: Proceedings of the IEEE Computer Society Software Technology and Engineering Practice (STEP '99), Washington, DC, 1999.
[17] Kumar ES, Sarkar B. Proportional hazards modeling of environmental impacts on reliability of photovoltaic modules. Int J Eng Adv Technol 2012;2(2):110–5.
[18] Marquez AC, Gomez JF, De Leon PM. Modelling on-line reliability and risk to schedule the preventive maintenance of repairable assets in network utilities. IMA J Manage Math 2013;24(4):437–50.
[19] Haimes YY. Strategic Preparedness for Recovery from Catastrophic Risks to Communities and Infrastructure Systems of Systems. Risk Anal 2012;32(11):1834–45.
[20] MacKenzie CA, Baroud H, Barker K. Static and dynamic resource allocation models for recovery of interdependent systems: application to the Deepwater Horizon oil spill. Ann Oper Res 2014 (in preparation).
[21] Baroud H, Ramirez-Marquez JE, Barker K, Rocco CM. Measuring and planning for stochastic network resilience: application to waterway commodity flows. Risk Anal 2014 (in preparation).
[22] Barabadi A, Barabady J, Markeset T. Maintainability analysis considering time-dependent and time-independent covariates. Reliab Eng Syst Saf 2011;96(1):210–7.
[23] Gao X, Barabady J, Markeset T. An approach for prediction of petroleum production facility performance considering arctic influence factors. Reliab Eng Syst Saf 2010;95(8):837–46.
[24] Crowley J, Hu M. Covariance analysis of heart transplant survival data. J Am Stat Assoc 1977;72(357):27–36.
[25] Prentice RL, Williams BM, Peterson AV. On the regression analysis of multivariate failure time data. Biometrika 1981;68(2):373–9.
[26] Lee E, Wang JW. Statistical Methods for Survival Data Analysis. 3rd edition. Hoboken, NJ: John Wiley and Sons; 2003.
[27] Dale CJ. Application of the proportional hazards model in the reliability field. Reliab Eng 1985;10(1):1–14.
[28] Kumar D, Klefsjo B. Proportional hazards model: a review. Reliab Syst Saf 1993;44(2):177–88.
[29] Ansell JI, Phillips MJ. Practical aspects of modelling of repairable systems data using proportional hazards models. Reliab Eng Syst Saf 1997;58(2):165–71.
[30] Leemis LM. Reliability: Probabilistic Models and Statistical Methods. 2nd edition. Williamsburg, VA: Lawrence Leemis; 2009.
[31] Krivtsov VV, Tananko DE, Davis TP. Regression approach to tire reliability analysis. Reliab Eng Syst Saf 2002;78(3):267–73.
[32] McCullagh P, Nelder JA. Generalized Linear Models, 2nd edition. London: Chapman and Hall; 1989.
[33] Bradburn MJ, Clark TG, Love SB, Altman DG. Survival analysis part II: Multivariate data analysis – an introduction to concepts and methods. Br J Cancer 2003;89(3):431–6.
[34] Royston P, Parmar MK. Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Stat Med 2002;21(15):2175–97.
[35] Allison PD. Survival Analysis Using the SAS System: A Practical Guide. Cary, NC: SAS Institute Inc; 1995.
[36] Kumar D, Klefsjo B. Proportional hazards model: a review. Reliab Eng Syst Saf 1993;44(2):177–88.
[37] Barabadi A, Barabady J, Markeset T. Application of reliability models with covariates in spare part prediction and optimization: a case study. Reliab Eng Syst Saf 2014;123(1):1–7.
[38] Liu H, Davidson RA, Rosowsky DV, Stedinger JR. Negative binomial regression of electric power outages in hurricanes. J Infrastruct Syst 2005;11(4):258–67.
[39] Liu H, Davidson RA, Apanasovich TV. Spatial generalized linear mixed models of electric power outages due to hurricanes and ice storms. Reliab Eng Syst Saf 2008;93(6):897–912.
[40] U.S. Energy Information Administration. B2: major disturbances and unusual occurrences, year to date through December 2002, 〈http://www.eia.doe.gov/cneaf/electricity/page/disturb_events_archive.html〉 [accessed October 2009], 2003.
[41] U.S. Energy Information Administration. B2: major disturbances and unusual occurrences, year to date through December 2003, 〈http://www.eia.doe.gov/cneaf/electricity/page/disturb_events_archive.html〉 [accessed October 2009], 2004.
[42] U.S. Energy Information Administration. B2: major disturbances and unusual occurrences, year to date through December 2004, 〈http://www.eia.doe.gov/cneaf/electricity/page/disturb_events_archive.html〉 [accessed October 2009], 2005.
[43] U.S. Energy Information Administration. B2: major disturbances and unusual occurrences, year to date through December 2005, 〈http://www.eia.doe.gov/cneaf/electricity/page/disturb_events_archive.html〉 [accessed October 2009], 2006.
[44] U.S. Energy Information Administration. B2: major disturbances and unusual occurrences, year to date through December 2006, 〈http://www.eia.doe.gov/cneaf/electricity/page/disturb_events_archive.html〉 [accessed October 2009], 2007.
[45] U.S. Energy Information Administration. B2: major disturbances and unusual occurrences, year to date through December 2007, 〈http://www.eia.doe.gov/cneaf/electricity/page/disturb_events_archive.html〉 [accessed October 2009], 2008.
[46] U.S. Energy Information Administration. B2: major disturbances and unusual occurrences, year to date through December 2008, 〈http://www.eia.doe.gov/cneaf/electricity/page/disturb_events.html〉 [accessed October 2009], 2009a.
[47] U.S. Energy Information Administration. B1: major disturbances and unusual occurrences, year to date through June 2009, 〈http://www.eia.doe.gov/cneaf/electricity/page/disturb_events.html〉 [accessed October 2009], 2009b.
[48] Newby M. Perspective on Weibull proportional-hazards models. IEEE Trans Reliab 1994;43(2):217–23.
[49] Hutton JL, Solomon PJ. Parameter orthogonality in mixed regression models for survival data. J R Stat Soc, Ser B 1997;59(1):125–36.
[50] Kalbfleisch JD, Prentice RL. Marginal likelihood based on Cox's Regression and Life Model. Biometrika 1973;60(2):267–78.
[51] Efron B. The efficiency of Cox's likelihood function for censored data. J Am Stat Assoc 1977;72(359):557–65.