Estimating spatio-temporal variations of taxi ridership caused by Hurricanes Irene and Sandy: A case study of New York City


Transportation Research Part D xxx (xxxx) xxx–xxx

Contents lists available at ScienceDirect

Transportation Research Part D journal homepage: www.elsevier.com/locate/trd

Ruijie (Rebecca) Bian a,⁎, Chester G. Wilmot b, Ling Wang c

a The Glenn Department of Civil Engineering, Clemson University, 125 Lowry Hall, Clemson, SC 29634, United States
b 3240R Patrick F. Taylor Hall, Department of Civil and Environmental Engineering, Louisiana State University, Baton Rouge, LA 70803, United States
c Key Laboratory of Road and Traffic Engineering of the Ministry of Education, College of Transportation Engineering, Tongji University, Shanghai 201804, China

ARTICLE INFO

ABSTRACT

Keywords: Taxi ridership; Extreme weather; Hurricane; New York City

Extreme weather days were usually removed as outliers in past studies of taxi ridership. However, taxis play an important role during extreme weather events when other public transportation services are suspended. Thus, estimating spatio-temporal variations of taxi ridership during extreme weather conditions can provide valuable information on the heretofore relatively unknown behavior of taxi riders and help identify areas with unusual taxi demand. In this study, New York City (NYC) taxi ridership shortly before the landfall of Hurricanes Irene and Sandy was analyzed. It was found that taxi ridership began to drop about 24 h before each hurricane made landfall. Six multisource regression models were estimated to explain the variation of taxi ridership in the last 24 h. Characteristics of the approaching hurricane, local weather conditions, and zonal socio-demographic variables were entered as explanatory variables. It was found that (1) taxi ridership during hurricane-affected periods has a strong linear association with ridership in unaffected periods, but the proportion decreases as the storm approaches; (2) a storm has the greatest impact on taxi ridership on weekends and at night, and the least impact on weekdays during the day; and (3) taxi users make fewer trips during heavy rain or strong wind.

1. Introduction

Taxis are an indispensable component of the public transportation system in dense metropolitan areas such as New York City (NYC). According to data in the Census Transportation Planning Package (CTPP), commuting trips by taxi account for 2% of all commuting trips by public transportation within NYC. According to a report published by the Taxi and Limousine Commission (TLC) in 2006, NYC taxi cabs carry 11% of passengers who travel by public transportation (Schaller Consulting, 2006). The most recent TLC report, from 2016, states that yellow and green taxis together serve about 474,000 trips per day (NYCTLC, 2016). This statistic can be roughly validated with data from other sources: the population of NYC is about 8.5 million, the daily person trip rate is about 2.5 trips/person for NYC residents, and about 3% of all-purpose trips use taxi or another similar shared-ride service, which gives about 0.64 million taxi trips per day (8.5 million × 2.5 × 3%) (PSB, 2017; U.S. Census Bureau, 2018). The significance of taxis for NYC is also revealed by the number of active taxi drivers (over 50,000) and the millions of taxi passengers they serve per year (NYCTLC, 2016).

According to the rules of the TLC, all licensed taxis must install the Taxicab Technology System, and taxi owners must ensure that this system is constantly maintained (NYCTLC, 2017). The Taxicab Technology System automatically records, collects, and transmits trip



Corresponding author. E-mail addresses: [email protected] (R.R. Bian), [email protected] (C.G. Wilmot), [email protected] (L. Wang).

https://doi.org/10.1016/j.trd.2019.10.009

1361-9209/ © 2019 Elsevier Ltd. All rights reserved.

Please cite this article as: Ruijie (Rebecca) Bian, Chester G. Wilmot and Ling Wang, Transportation Research Part D, https://doi.org/10.1016/j.trd.2019.10.009


record data. Based on the multi-year dataset, holidays and extreme weather generally reduce taxi trips to fewer than 350,000 a day (NYCTLC, 2014). For example, the daily taxi ridership in NYC was only 29,000 on the day that Hurricane Irene (2011) made landfall.

Although extreme weather events have a large impact on taxi ridership, this impact has not been well documented or analyzed in the past. Days with extreme weather events were usually treated as outliers and removed from further analysis. However, understanding taxi ridership during extreme weather conditions is important because taxis can support other public transportation systems in an emergency, and can even function in their place if those systems cease to operate. This study selected Hurricanes Irene and Sandy and analyzed the variation of taxi ridership in the period leading up to hurricane landfall.

Despite an overall drop, taxi ridership in some areas was higher than under normal conditions, or higher than in other areas, because of the following activities. First, some households depend on taxis as a means of evacuation. According to a post-Irene household behavioral survey conducted for the U.S. Army Corps of Engineers (USACE) in 2013, over 6% of the respondents reported that they took a taxi for evacuation. A similar statistic was found in another survey, which was conducted after Hurricane Sandy (2012), funded by the New York City Emergency Management Department, and completed in 2014. In addition, taxis accounted for 39% and 43% of the reported evacuation trips made by public transit during the two hurricanes, respectively. Second, those who depend on mass transit service under normal circumstances have to shift to another mode if the Metropolitan Transportation Authority (MTA) suspends its services when a hurricane is approaching. Under these circumstances, it is likely that some of them choose a taxi as a substitute.
An analysis of taxi ridership during affected periods can help determine the characteristics of variation in travel and identify areas with unusual demand. The objective of this paper is to answer the following questions: (1) When did taxi ridership begin to deviate significantly from normal days? What was the temporal distribution of taxi ridership in affected periods? (2) What factors affect taxi ridership during those periods? Who is more or less likely to be a taxi user at an aggregate level? Do the characteristics of a hurricane affect taxi ridership?

2. Literature review

2.1. Estimate taxi ridership

Several studies have been conducted to estimate NYC taxi ridership on normal days. Yang and Gonzales used multiple linear regression models to estimate hourly taxi ridership (2014). Candidate explanatory variables included transit access time, the socioeconomic and demographic characteristics of the population, and employment status. Qian and Ukkusuri studied the same topic, but they focused on estimating daily trips by weekday and weekend (2015). Geographically weighted regression models were used to account for spatial correlation. They considered additional explanatory variables such as land use composition (e.g., the proportion of commercial land use in an area) and travel characteristics (e.g., commuting time). Hochmair analyzed the spatio-temporal pattern of taxi trips using a spatially filtered negative binomial model to account for spatial effects in estimating weekday daily taxi trips (2016). Besides the aforementioned variables, this study also included characteristics of the built environment (such as road density), the level of public transportation service (such as the number of bus stations), and the presence of airports. These explanatory variables may also be useful in estimating taxi ridership during hurricane-affected periods because regular users may continue to take taxis even in an emergency.
Overall, a common characteristic of all the above studies is that they did not include days with extreme weather events in estimating taxi ridership.

2.2. Impact of weather on taxi

Only a limited amount of research has investigated the impact of weather on taxi ridership. One of the few exceptions is the research of Kamga and Yazici, who investigated the travel time characteristics of NYC taxi trips under different weather conditions, including clear weather, rain, and snow (2014). It was found that adverse weather resulted in longer travel times but also higher travel time reliability. Later, they investigated the variation of taxi pickups and average hourly revenue during clear and rainy conditions (Kamga et al., 2015). Increased pickup frequency and decreased trip length under rainy conditions led to an increase in hourly revenue. However, neither study included extreme weather conditions.

Some researchers have analyzed taxi trips during extreme weather conditions either descriptively or quantitatively. Ferreira et al. visually explored the distribution of NYC taxi trips during Hurricanes Irene and Sandy (2013). They found that taxi trips decreased more significantly during Irene, but Sandy had a longer affected period. As their objective was visual exploration only, they did not quantify their output. Using the same dataset, Donovan and Work quantitatively measured the resilience of transportation networks (2015). The selected indicator of resilience was pace, defined as travel time per mile. Based on this indicator, they found that the length of disruption caused by Irene was 43 h, while Sandy caused a longer disruption of 132 h. This finding is consistent with the visual analysis conducted by Ferreira et al. In addition, Donovan and Work found that the maximum pace in the case of Irene and Sandy was 0.64 min/mile and 2.25 min/mile, respectively.
Although the two studies did not agree on which hurricane caused the more significant peak disruption, the disagreement stems from the difference in the selected measures, i.e., the number of taxi trips in the former study and pace in the latter. Later, Zhu et al. investigated the recovery patterns of roadways in NYC during post-hurricane periods (2016). Roadway recovery was measured by the zonal recovery rate, defined as the number of trips during a certain hurricane period divided by the number of trips during a corresponding control period. Based on the idea of evacuation response curves, they used logistic functions to fit the recovery curves by evacuation zone. In keeping with other studies, they found that the recovery rate after Hurricane Sandy was slower than that after Hurricane Irene. Although the study by Zhu et al. can be


Fig. 1. Study area.

considered to have conducted an indirect estimation of taxi ridership, it focused on post-hurricane periods rather than pre-hurricane periods.

3. Data description

Trip data for yellow taxis have been collected since 2009, and data for Street Hail Liveries (SHLs, known as green taxis) have been collected since the start of their operation in 2013. Both datasets provide pick-up and drop-off dates, times, and locations; trip distances; itemized fares; rate types; payment types; and driver-reported passenger counts. Another important component of the NYC taxi market is For-Hire Vehicles (FHVs, such as Uber/Lyft). FHV trip data have been collected since 2015, but they are reported in a different format with fewer variables because they are collected and submitted by different bases. Specifically, the FHV dataset only provides the dispatching base license number, pick-up date/time, and taxi zone location ID. The two hurricanes under study occurred in 2011 and 2012, when trip records were collected for yellow taxis only. Therefore, the format and quality of data are relatively consistent in this study.

The study area of NYC includes five boroughs: Manhattan, the Bronx, Queens, Brooklyn, and Staten Island. Each borough shares its boundary with a county of New York State: New York, Bronx, Queens, Kings, and Richmond counties, respectively. According to the 2016 TLC report, over 90% of yellow taxi pick-ups originate in Manhattan (i.e., New York County) (NYCTLC, 2016). Thus, part of the analysis in this study may largely reflect taxi trips made in Manhattan, even though the dataset contains taxi trips made in all zones of NYC. The chosen geographic unit in this study is the ZIP Code Tabulation Area (ZCTA). Generally, a ZCTA code is the same as the mail ZIP code for an area (US Census Bureau, 2015). Fig. 1 shows the boundaries of the five counties, 214 ZCTAs, and six levels of evacuation zones.
Areas within Evacuation Zone One are the most vulnerable to hurricane threats. For example, mandatory evacuation orders were issued only to Evacuation Zone One during the two hurricanes analyzed in this study.

Regarding the temporal scope, the study period was set to start three days before an evacuation order was issued and to end on the day of hurricane landfall. As shown in the next section, this study period is long enough to capture affected travel before landfall. For each hurricane, two control groups were established to help identify affected periods. Data in the first control group are from the same days of the week in different weeks before the hurricane strike. Data in the second control group are from the same days of the week in a different year. Data from 2010 were used in the second control group to reduce irrelevant impacts. For example, green taxis entered the NYC taxi market in 2013, which may affect trips made with yellow taxis. Table 1 presents the study periods used in the different groups. The length of the study period is six days for the Irene groups and five days for the Sandy groups.

Data cleaning of the taxi trip records included removing records without valid longitude or latitude, removing records with a trip length of zero, and removing records with a pick-up point outside NYC. Average daily ridership in each study period was derived from the data and is presented in the last column. As shown, average daily taxi ridership decreased significantly in the two hurricane study groups. Taxi ridership during truly affected periods was even lower, which will be presented


Table 1. Study periods in different groups.

Name of study group | Start date | Evacuation order issued | MTA closure | End date (hurricane landfall) | Average daily ridership
Hurricane Irene | Aug 23, 2011 (Tuesday) | Aug 26, 2011 (Friday) | Aug 27, 2011 (Saturday) | Aug 28, 2011 (Sunday) | 331,565
Irene Control Group 1 | Aug 2, 2011; Aug 9, 2011; Aug 16, 2011 (Tuesdays) | (na) | (na) | Aug 7, 2011; Aug 14, 2011; Aug 21, 2011 (Sundays) | 427,629
Irene Control Group 2 | Aug 3, 2010; Aug 10, 2010; Aug 17, 2010; Aug 24, 2010 (Tuesdays) | (na) | (na) | Aug 8, 2010; Aug 15, 2010; Aug 22, 2010; Aug 29, 2010 (Sundays) | 387,082
Hurricane Sandy | Oct 25, 2012 (Thursday) | Oct 28, 2012 (Sunday) | Oct 28, 2012 (Sunday) | Oct 29, 2012 (Monday) | 397,846
Sandy Control Group 1 | Sep 27, 2012; Oct 11, 2012; Oct 18, 2012 (Thursdays) | (na) | (na) | Oct 1, 2012; Oct 15, 2012; Oct 22, 2012 (Mondays) | 473,609
Sandy Control Group 2 | Sep 30, 2010; Oct 14, 2010; Oct 21, 2010; Oct 28, 2010 (Thursdays) | (na) | (na) | Oct 4, 2010; Oct 18, 2010; Oct 25, 2010; Nov 1, 2010 (Mondays) | 424,304

Notation: ‘na’ means there is no applicable data.

in the next section. In addition, average daily ridership in 2010 is lower than that in 2011 or 2012 in the control groups. The period best suited to serve as a baseline group for further analysis is discussed next.

4. Temporal analysis

Temporal analysis was conducted before model estimation to (1) identify the periods in which hurricanes significantly affected taxi ridership and (2) select the periods that can best serve as baselines for model estimation. In this section, temporal analysis is made for NYC as a whole (i.e., spatial aspects have not been incorporated).

4.1. Taxi ridership in the case of Irene

Fig. 2 shows the distribution of hourly taxi ridership in the case of Irene. Generally, taxi ridership presents diurnal regularity on days without adverse weather events. Taxi ridership follows a similar pattern across workdays: a morning peak around 8–9 AM followed by a larger afternoon peak around 6 PM. This pattern changes over weekends. On Saturdays, taxi ridership climbs to its first peak after noon and reaches its highest peak around midnight. Taxi ridership during peak hours on Saturdays approximately equals that on workdays. However, taxi ridership on Sundays is much lower than on Saturdays, and its peak hours are generally between noon and 6 PM.

Fig. 2(a) shows the hourly ridership during Irene and the average hourly ridership in its two control groups. As shown, average hourly ridership in 2010 is generally lower than that in 2011 for most of the days. However, the case is different for Sundays, which suggests that some major events had a significant effect on Sunday taxi ridership in 2011. The reason is explained in the next paragraph.

Fig. 2(b) plots the hourly ridership during Irene and the hourly ridership of each week in its first control group, all of which are weeks in 2011. Within the first control group, the curve colored in yellow is significantly different from the other two curves on Sunday.
At first glance, one might conclude that a special event must have occurred on that day (i.e., Aug 7, 2011). As it happens, the reverse is true: ridership on the other two Sundays (i.e., Aug 14 and Aug 21, 2011) was affected by adverse weather conditions. According to notifications published by the Office of Emergency Management (OEM), a Flash Flood Warning was issued on Aug 14 and a Severe Thunderstorm Warning was issued on Aug 21. This finding answers the question raised in the previous paragraph about the unusual ridership pattern. It also indicates that taxi ridership can be sensitive to weather conditions. Based on the above analysis, the hourly ridership of the first week in the first Irene control group was chosen as the baseline representing taxi travel under normal conditions.

Comparing ridership during Irene with the baseline, there is no clear difference on the first four days in Fig. 2, even though the mandatory evacuation order was issued on Friday. The small variation on Friday may be due to the specific issuing time, which was 5:50 PM. The afternoon peak appears to have occurred as usual because there was enough time for people to go back home before hurricane landfall. In addition, there were only limited impacts on activities on Friday night. This is perhaps due to the fact


Fig. 2. Comparisons of hourly taxi ridership in the case of Irene: (a) comparing with average hourly ridership across years; (b) comparing with hourly ridership of the same year.

that the storm was still far away and had thus not raised enough concern at that time. Significant deviations began to appear on Saturday, about 24 h before hurricane landfall. The following is a summary of the temporal characteristics of taxi ridership by 6-h intervals in the last 24 h before Hurricane Irene. The 6-h interval was chosen because the National Hurricane Center (NHC) updates information on an approaching storm at 6-h intervals.

• Landfall-24 (Saturday 6 AM–12 PM): taxi ridership was less than that of normal days. However, there was still a morning peak on the Saturday, which occurred earlier than on the other Saturdays;
• Landfall-18 (Saturday 12 PM–6 PM): taxi ridership gradually decreased. The time when it began to drop coincided with the time of MTA closure;
• Landfall-12 (Saturday 6 PM–0 AM): taxi ridership kept decreasing;
• Landfall-6 (Sunday 0 AM–6 AM): taxi ridership was close to zero in this period;
• Landfall (Sunday 6 AM–12 PM): taxi ridership increased slightly, especially about 10 h after the landfall of Irene. The maximum ridership was around 2500 trips per hour.
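As an illustration only, the 6-h labeling used in the summary above can be sketched in Python. The names interval_label and IRENE_LANDFALL_START are hypothetical, the pick-up times are invented, and the start of the landfall interval (Sunday, Aug 28, 2011, 6 AM) is taken from the list above; this is a minimal sketch, not the processing code used in the study.

```python
from collections import Counter
from datetime import datetime, timedelta

INTERVAL = timedelta(hours=6)
# Assumption: the "Landfall" interval for Irene starts Sunday, Aug 28, 2011, 6 AM.
IRENE_LANDFALL_START = datetime(2011, 8, 28, 6)


def interval_label(pickup, landfall_start):
    """Label a pick-up time as 'Landfall-24' ... 'Landfall-6' or 'Landfall'.

    Returns None when the pick-up falls outside the five 6-h intervals.
    """
    # Floor division of two timedeltas gives the signed interval index.
    k = (pickup - landfall_start) // INTERVAL
    if k == 0:
        return "Landfall"
    if -4 <= k < 0:
        return f"Landfall-{-6 * k}"
    return None


# Invented pick-up times, for illustration only.
pickups = [
    datetime(2011, 8, 27, 7, 30),   # Saturday morning
    datetime(2011, 8, 27, 13, 0),   # Saturday afternoon
    datetime(2011, 8, 27, 19, 15),  # Saturday evening
    datetime(2011, 8, 27, 23, 59),  # Saturday, just before midnight
    datetime(2011, 8, 28, 3, 0),    # Sunday, early morning
    datetime(2011, 8, 28, 9, 0),    # Sunday morning
    datetime(2011, 8, 26, 22, 0),   # Friday night, outside the window
]
labels = [interval_label(p, IRENE_LANDFALL_START) for p in pickups]
counts = Counter(label for label in labels if label is not None)
```

Aggregating such labels by ZCTA would give counts in the same 6-h units as the intervals listed above.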

4.2. Taxi ridership in the case of Sandy

Fig. 3 shows the hourly taxi ridership in the case of Sandy. It was plotted in a similar manner to Fig. 2. As in the case described in the previous section, a similar diurnal regularity appears on days without extreme weather events in October. Regarding the difference between years, workday ridership in the month of October does not present such a clear difference as


Fig. 3. Comparisons of hourly taxi ridership in the case of Sandy: (a) comparing with average hourly ridership across years; (b) comparing with hourly ridership of the same year.

that within the month of August (i.e., the case of Irene). However, ridership on Fridays and weekends in 2012 is still higher than that in 2010. Regarding hourly ridership in the same year, Fig. 3(b) does not present any clear difference among the three weeks in the first Sandy control group. Therefore, the average hourly ridership of the first Sandy control group can serve well as the baseline.

The mandatory evacuation order was issued at 10:45 AM on Sunday. In contrast to the case of Irene, taxi riders reacted more quickly this time, as evidenced by the drop in ridership in the next time period. The similarity to the case of Irene is that taxi ridership did not display significant variations in the daily pattern until the day before hurricane landfall. The following text summarizes the temporal distribution of taxi ridership by 6-h intervals in the 24 h prior to the landfall of Hurricane Sandy:

• Landfall-24 (Sunday 6 PM–0 AM): the nighttime activities were significantly affected, probably because the time to landfall was shorter than in Irene. Taxi ridership kept decreasing before and after MTA closure, but the rate of decrease was slightly lower after MTA closure, which is reflected by the change of the slope;
• Landfall-18 (Monday 0 AM–6 AM): ridership was lower than on normal days, but it did not decrease as close to zero as in the case of Irene;
• Landfall-12 (Monday 6 AM–12 PM): there were two morning peaks. The first peak was similar to other Mondays, which was around 8 AM. The second one was around 11 AM. Ridership began to decrease in the next period;
• Landfall-6 (Monday 12 PM–6 PM): ridership kept decreasing in this period and the afternoon peak around 6 PM did not show up as usual;
• Landfall (Monday 6 PM–0 AM): ridership kept decreasing and almost touched zero. Knowing the expected time of landfall was


Fig. 4. Taxi ridership of the affected periods and their baselines.

later that night might have allowed committed activities such as going to work to happen as usual on the Monday morning.

5. Estimating taxi ridership of affected periods

Spatial factors are brought into consideration in this section. The purpose of the estimation is to find explanatory variables that significantly affect taxi ridership of each ZCTA within the 24 h prior to hurricane landfall. The response variable is taxi ridership of each ZCTA in each 6-h time interval during affected periods, which is called “Taxi_hurricane” in the following text.

5.1. Taxi ridership of the affected periods and their baselines

Fig. 4 is a plot of taxi ridership in the five 6-h periods leading up to the landfalls (i.e., Taxi_hurricane; on the Y-axis) as a function of the ridership in the corresponding baselines (i.e., Taxi_normal; on the X-axis). Each symbol stands for taxi ridership of a ZCTA in a specific time interval under the two conditions: hurricane and normal. The shape of the symbol distinguishes the two hurricanes, i.e., dots for Irene and triangles for Sandy. Time periods are distinguished by different colors. First, there is a strong linear association between the two conditions in each time interval, suggesting that Taxi_normal is likely to be a good explanatory variable in describing Taxi_hurricane. Second, the slope decreases as a hurricane approaches, which represents a significant temporal change. Thus, time-dependent variables play a role in explaining the changing slopes across time intervals. Third, ridership during Irene and Sandy generally follows two distinct trends in the same time interval, i.e., different slopes. This indicates that characteristics of hurricanes, weather-related factors, and other case-specific variables can potentially explain the difference in slopes.
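To make the slope comparison concrete, the sketch below fits a simple least-squares line of Taxi_hurricane on Taxi_normal separately for each time interval. The helper names and the ZCTA counts are hypothetical and invented for illustration; the sketch is not the estimation procedure reported in this paper.

```python
from collections import defaultdict


def ols_slope_intercept(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def slopes_by_interval(records):
    """records: iterable of (interval_label, taxi_normal, taxi_hurricane)."""
    groups = defaultdict(lambda: ([], []))
    for label, normal, hurricane in records:
        groups[label][0].append(normal)
        groups[label][1].append(hurricane)
    return {label: ols_slope_intercept(xs, ys)
            for label, (xs, ys) in groups.items()}


# Invented ZCTA-level counts, for illustration only.
records = [
    ("Landfall-24", 100, 60), ("Landfall-24", 500, 300), ("Landfall-24", 1000, 600),
    ("Landfall-6", 100, 10), ("Landfall-6", 500, 50), ("Landfall-6", 1000, 100),
]
fits = slopes_by_interval(records)  # maps label -> (slope, intercept)
```

In this toy data the fitted slope is larger for the earlier interval than for the later one, which mirrors the pattern described for Fig. 4.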
Based on the above findings, there are two approaches to producing different slopes in the model estimation: (1) estimate multiple linear regression models separately for each time interval, or (2) construct interaction terms to estimate a multisource regression model for all time intervals simultaneously. If interaction terms are constructed for all explanatory variables, the second approach is equivalent to the first. The question then becomes whether the explanatory variables have significant temporal effects across time intervals, which can be tested by the significance of the associated interaction terms during model estimation. Therefore, multisource regression models better serve the study objective. The following is a simple example of the model formulation: (1) assume there are three time intervals, expressed by two dummy variables {T1, T2}, and (2) assume only two explanatory variables {x1, x2} are found significant in the model estimation, of which x1 has a significant temporal effect on the response variable and x2 does not.

$$y = a_1 x_1 T_1 + a_2 x_1 T_2 + a_3 x_1 + b x_2 + c$$

where

y is the time-dependent response variable;
x1 is one of the selected explanatory variables, which is found to have a significant temporal effect on y across time intervals;
x2 is the other selected explanatory variable, which does not have a significant temporal effect on y across time intervals;
T1 and T2 are dummy variables expressing different time intervals. That is, T1 = 1 and T2 = 0 stand for the first time interval, T1 = 0 and T2 = 1 stand for the second time interval, and T1 = 0 and T2 = 0 stand for the third time interval;
a1, a2, a3, b, and c are the estimated parameters and the constant.
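A minimal Python sketch of this example formulation is given below. The function names and all coefficient values are hypothetical and chosen only for illustration. Under the dummy coding defined above, the slope on x1 works out to a1 + a3 in the first interval, a2 + a3 in the second, and a3 in the third (reference) interval.

```python
def dummies(interval):
    """Map an interval index (1, 2, or 3) to (T1, T2); interval 3 is the reference."""
    if interval not in (1, 2, 3):
        raise ValueError("interval must be 1, 2, or 3")
    return (1 if interval == 1 else 0, 1 if interval == 2 else 0)


def predict(x1, x2, interval, a1, a2, a3, b, c):
    """Evaluate y = a1*x1*T1 + a2*x1*T2 + a3*x1 + b*x2 + c."""
    t1, t2 = dummies(interval)
    return a1 * x1 * t1 + a2 * x1 * t2 + a3 * x1 + b * x2 + c


def effective_slope(interval, a1, a2, a3):
    """Slope on x1 within one interval: a1*T1 + a2*T2 + a3."""
    t1, t2 = dummies(interval)
    return a1 * t1 + a2 * t2 + a3


# Hypothetical coefficients, not estimates from this study.
A1, A2, A3, B, C = 0.5, 0.25, 0.125, 2.0, 10.0

slopes = [effective_slope(i, A1, A2, A3) for i in (1, 2, 3)]
y_first = predict(100, 3, 1, A1, A2, A3, B, C)
y_third = predict(100, 3, 3, A1, A2, A3, B, C)
```

Because x2 carries no interaction term, its contribution b*x2 is the same in every interval; only the slope on x1 changes.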


Fig. 5. Spatio-temporal distribution of the ratio. (Panels: columns for Irene and Sandy; rows for Landfall-24, Landfall-18, Landfall-12, Landfall-6, and Landfall.)

5.1.1. Spatio-temporal analysis

Fig. 5 presents the ratio between affected ridership and normal ridership (i.e., Taxi_hurricane/Taxi_normal) across zones and time intervals. Some zones do not appear in the figure because both their affected ridership and their normal ridership are zero in that specific time interval. As shown, the ratio generally decreases as the time of landfall gets closer, since fewer people make their usual trips. However, some zones had unusually high taxi ridership (i.e., ratio > 2) across time intervals. Those zones are outside New York


Table 2. Description of variables.

Time-dependent variables

Name and description | Time interval | Irene: Mean | Irene: Std. dev. | Sandy: Mean | Sandy: Std. dev.
Taxi_hurricane (Taxi ridership in a 6-h interval by ZCTA during the period in which taxi travel is affected by an approaching hurricane. This is the response variable) | Landfall: 24 (i.e., the fourth 6-h interval before hurricane landfall) | 351.80 | 733.60 | 345.40 | 740.91
Taxi_hurricane | Landfall: 18 (i.e., the third 6-h interval before hurricane landfall) | 204.82 | 409.10 | 101.91 | 188.82
Taxi_hurricane | Landfall: 12 (i.e., the second 6-h interval before hurricane landfall) | 82.83 | 147.32 | 371.78 | 761.24
Taxi_hurricane | Landfall: 6 (i.e., the first 6-h interval before hurricane landfall) | 11.92 | 13.81 | 274.5 | 548.04
Taxi_hurricane | Landfall: 0 (i.e., the time interval in which the hurricane made landfall) | 26.55 | 39.78 | 72.19 | 130.10
Taxi_normal (Taxi ridership in a corresponding 6-h interval by ZCTA on a normal day) | Corresponds to Landfall: 24 | 536.30 | 1134.11 | 969.80 | 2177.07
Taxi_normal | Corresponds to Landfall: 18 | 871.67 | 1908.39 | 345.36 | 663.09
Taxi_normal | Corresponds to Landfall: 12 | 1476.00 | 2714.71 | 1341.90 | 2823.69
Taxi_normal | Corresponds to Landfall: 6 | 1381.29 | 1880.27 | 1540.10 | 3156.94
Taxi_normal | Corresponds to Landfall: 0 | 694.70 | 1162.46 | 1951.10 | 3407.59

Static variables

Name and description | Irene: Mean | Irene: Std. dev. | Sandy: Mean | Sandy: Std. dev.
Income_Mean (Average household income by ZCTA in $1,000) | 93.88 | 59.38 | 93.06 | 60.40
High_Education (Percentage of population by ZCTA who have a bachelor degree or higher) | 41.61 | 24.15 | 40.98 | 24.31
EvacZone1 (Percentage of ZCTA area that received a mandatory evacuation order) | 9.30 | 18.90 | 8.53 | 17.26
Commute_Transit (Percentage of workers in each ZCTA who commute by transit (except taxi)) | 57.64 | 11.85 | 58.52 | 11.41

County and are more likely to appear in Bronx County. In addition, zones covered by Evacuation Zone One are more likely to have unusually high taxi ridership 24 h before storm landfall. Therefore, zonal characteristics were considered as explanatory variables in estimating taxi ridership (shown in Section 5.2). Besides, taxi ridership dropped more significantly during Hurricane Irene. None of the zones had greater taxi ridership than under normal conditions (i.e., the ratio was below 1 in every zone) six hours before the landfall of Irene. However, this phenomenon did not occur in the case of Sandy. Therefore, storm-related and other time-dependent variables were used in explaining the ratio changes (shown in Section 5.3).

5.2. Results of model estimations

Separate models were estimated for the two storms (i.e., Irene and Sandy) and covered three different spatial ranges of the study area: (1) all ZCTAs in NYC, (2) ZCTAs in which 10% or more of the area is covered by Evacuation Zone One (called “within evacuation zones”), and (3) ZCTAs in which less than 10% of the area is covered by Evacuation Zone One (called “out of evacuation zones”). Therefore, six multisource regression models were estimated. Records with zero taxi ridership in the affected condition (i.e., Taxi_hurricane equals 0) were removed from model estimation to avoid the problem of over-dispersion. Selected explanatory variables and their statistics are shown in Table 2. The standard deviation of Taxi_normal is quite large in each time interval because over 90% of taxi pick-ups originate from areas within Manhattan. That is, the highly skewed spatial distribution of taxi ridership leads to the large standard deviations presented. As shown, the standard deviation of Taxi_hurricane generally decreases over time. This means that the difference in taxi ridership among ZCTAs tends to become smaller as a hurricane approaches.
This can be explained by the fact that demand for taxis in Manhattan dropped significantly over successive time intervals during the affected period. The decreasing demand is due to people’s general reluctance to travel in extreme weather conditions. Other candidate explanatory variables include characteristics of an approaching hurricane (such as storm intensity and forward speed), local weather conditions (such as humidity and temperature), and some other socio-demographic variables used in estimating taxi ridership on normal days (such as the percentage of households without any vehicle, average commuting time, and unemployment rate). The data sources are hurricane tracks from the National Hurricane Center, historical weather reports from the Weather Underground (www.wunderground.com), and datasets of household characteristics from the American Community Survey (ACS). All candidate explanatory variables with their interaction terms were entered into the model estimation process. The best model


Table 3. Estimated models for taxi ridership of affected periods (p-values in parentheses).

| Variable | Model 1: Irene (All Zones) | Model 2: Irene (Within Evac Zones) | Model 3: Irene (Out of Evac Zones) | Model 4: Sandy (All Zones) | Model 5: Sandy (Within Evac Zones) | Model 6: Sandy (Out of Evac Zones) |
|---|---|---|---|---|---|---|
| Constant | −49.699 | 4.223 | −106.600 | −67.448 | 45.047 | −92.311 |
| Taxi_normal*Landfall: 24 | 0.612 (0.00) | 0.625 (0.00) | 0.604 (0.00) | 0.304 (0.00) | 0.298 (0.00) | 0.340 (0.00) |
| Taxi_normal*Landfall: 18 | 0.182 (0.00) | 0.179 (0.00) | 0.185 (0.00) | 0.236 (0.00) | 0.207 (0.00) | 0.267 (0.00) |
| Taxi_normal*Landfall: 12 | 0.023 (0.00) | 0.013 (0.25) | 0.031 (0.00) | 0.228 (0.00) | 0.202 (0.00) | 0.244 (0.00) |
| Taxi_normal*Landfall: 6 | −0.024 (0.00) | −0.028 (0.03) | −0.021 (0.00) | 0.135 (0.00) | 0.123 (0.00) | 0.141 (0.00) |
| Taxi_normal | 0.025 (0.00) | 0.032 (0.00) | 0.019 (0.00) | 0.034 (0.00) | 0.029 (0.00) | 0.038 (0.00) |
| Income_Mean | (na) | (na) | 0.589 (0.00) | −0.337 (0.02) | (na) | (na) |
| High_Education | 0.531 (0.00) | (na) | (na) | 1.111 (0.01) | (na) | (na) |
| EvacZone1 | −0.403 (0.01) | (na) | −3.715 (0.00) | −1.156 (0.00) | −1.074 (0.02) | (na) |
| Commute_Transit | 0.776 (0.00) | (na) | 1.381 (0.00) | 1.301 (0.00) | (na) | 1.780 (0.00) |
| R Squared | 0.980 | 0.978 | 0.983 | 0.969 | 0.967 | 0.975 |
| AIC | 5988.917 | 1558.404 | 4369.835 | 7301.424 | 1827.257 | 5389.363 |
| Number of observations | 534 | 133 | 401 | 602 | 147 | 455 |

Notation: ‘na’ means this variable was not significant and was not selected.

was selected using the Akaike Information Criterion (AIC) in a stepwise selection process. Variable selection involves adding or dropping variables from the set of candidate explanatory variables until the lowest AIC is achieved, since a smaller AIC value indicates a better goodness-of-fit. However, the variables selected based on AIC are not always significant, so the model formulation required further attention. In this situation, insignificant variables were removed from the model estimation and the remaining explanatory variables were entered into the variable selection process again. This process was repeated until all the selected variables were significant.

Table 3 presents the model estimation results. Beyond the AIC values, Table 3 also presents another goodness-of-fit (GOF) statistic, R squared. All the estimated models have large R squared values, indicating that a large portion of the variance in the data is explained by the estimated models. Each value in parentheses in Table 3 is the p-value of a t-test with the null hypothesis that the estimated parameter equals zero. As shown in Table 3, almost all of the selected explanatory variables are significantly different from zero at the 95% confidence level. Parameters for “Taxi_normal*Landfall: x” change significantly across most of the time intervals in the two storms. The only exception is in Model 2, where “Taxi_normal*Landfall: 12” has a p-value of 0.25. However, it cannot be removed or merged because the estimated parameters for the preceding and following periods are all significant.

The estimated parameters are interpreted as follows. The parameters of the interaction terms (Taxi_normal*Landfall: x) stand for the proportion of taxi riders who kept using taxis in each 6-h interval as a storm was approaching (i.e., Taxi_hurricane/Taxi_normal).

• First, look at the estimated parameters vertically. They dropped significantly from Landfall: 24 to Landfall: 18 in the first three models of Irene. They reached the lowest value in Landfall: 6 and bounced back in Landfall: 0. However, values of the estimated parameters decreased steadily in the models of Sandy. This may reflect that residents were more anxious during Irene but felt more experienced with the second storm.

• Now look at the estimated parameters horizontally. In the case of Irene, parameters estimated for Model 2 are generally greater than those estimated for both Model 1 and Model 3. That is, taxi ridership declined less within evacuation zones during Irene, which may relate to residents’ sudden demand for taxis for evacuation or as a substitute for other transit services. However, this did not happen in Sandy. A possible explanation is that residents living within evacuation zones became less risk averse after experiencing one hurricane and thus did not cause a sudden or significant rise in taxi trips.

The average household income (Income_Mean) was significant, with a positive effect, only in the model estimated for areas out of evacuation zones in Irene (i.e., Model 3). In the case of Sandy, it was significant only in the model for all areas (i.e., Model 4), where its effect was negative, which indicates a possible behavioral change among high-income residents. They tended to generate more taxi trips during Irene outside evacuation zones but generally made fewer trips during Sandy in all zones.

The percentage of the population with a bachelor’s degree or higher (High_Education) was significant and had a positive effect in the two models estimated for all zones. This indicates that this group is more likely to take taxis during hurricanes. It also suggests that they are more likely than other people to keep making trips during hurricanes.

The sign of the estimated parameter for the percentage of area covered by Evacuation Zone One (EvacZone1) suggests that a ZCTA with a larger area covered by Evacuation Zone One (i.e., a larger area receiving a mandatory evacuation order) generated fewer taxi trips in the affected condition. This variable was not significant in the model estimated for areas within evacuation zones (Model 2), but it had a significant and negative effect for areas out of evacuation zones (Model 3). A possible explanation is that, in the case of Irene, areas within evacuation zones were equally affected no matter how large the percentage of coverage was. Meanwhile, areas that were closer to but out of evacuation zones generated fewer taxi trips. In contrast, the situation reversed during Sandy: the variable


Table 4. Parameters of Taxi_normal and other time-dependent variables.

| Storm | Hour to landfall | Parameters of Taxi_normal (All zones) | Barometric pressure (inHg) | Wind speed (mph) | Weekday and time | Weather conditions |
|---|---|---|---|---|---|---|
| Irene | −24 | 0.612 | 29.91 | 5.53 | Weekend daytime | Overcast |
| Irene | −18 | 0.182 | 29.83 | 7.53 | Weekend daytime | Light Rain |
| Irene | −12 | 0.023 | 29.64 | 14 | Weekend nighttime | Heavy Rain |
| Irene | −6 | −0.024 | 29.26 | 15.78 | Weekend nighttime | Heavy Rain |
| Irene | 0 | 0.025 | 28.77 | 11.75 | Weekend daytime | Heavy Rain |
| Sandy | −24 | 0.304 | 29.69 | 14.38 | Weekend nighttime | Overcast |
| Sandy | −18 | 0.236 | 29.54 | 14.4 | Weekday nighttime | Overcast |
| Sandy | −12 | 0.228 | 29.25 | 17.04 | Weekday daytime | Light Rain |
| Sandy | −6 | 0.135 | 28.79 | 22.54 | Weekday daytime | Light Rain |
| Sandy | 0 | 0.034 | 28.68 | 19.91 | Weekday nighttime | Light Rain |
| Column | (2) | (3) | (4) | (5) | (6) | (7) |

was found significant in the model estimated for areas within evacuation zones (Model 5) but not for areas out of evacuation zones (Model 6). This means that areas within evacuation zones were less likely to generate taxi trips during Sandy and that taxi ridership decreased further as the percentage of coverage became larger. Overall, this suggests that residents felt more endangered during Irene than during Sandy, which is in line with the previous analysis of the interaction terms in the model.

Lastly, zones with more commuters depending on public transit (excluding taxis) generated a greater number of taxi trips under affected conditions. That is, the suspension of public transit service might indeed have forced some transit users to take taxis. It was also found that this variable had a more significant effect on areas outside evacuation zones. Moreover, its impact was greater in the case of Sandy.

5.3. Analyzing interaction terms

The following analysis was based on Models 1 and 4, which were the models estimated for all zones in NYC. The third column in Table 4 shows the parameters for Taxi_normal in each time interval and storm. The other weather-related and time-dependent variables are also reported in this table for context and further analysis.

First, the correlation of these parameters with continuous factors reflecting local weather conditions was calculated. Barometric pressure (in column 4) and local wind speed (in column 5) were found to have a larger correlation with these parameters than other weather-related factors (such as temperature or humidity) or storm-related factors (such as forward speed). The correlations are 0.61 and −0.54, respectively. The moderate correlation suggests that taxi riders are less likely to make trips as usual under falling barometric pressure or increasing local wind speed.
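The two correlations reported above can be checked directly against the ten rows of Table 4. The following is a minimal sketch in plain Python; the values are transcribed from Table 4, while the variable names and the helper `pearson` are introduced here for illustration only.

```python
import math

# Values transcribed from Table 4 (Irene rows first, then Sandy; -24 h to 0 h).
ratio = [0.612, 0.182, 0.023, -0.024, 0.025,
         0.304, 0.236, 0.228, 0.135, 0.034]      # column 3
pressure = [29.91, 29.83, 29.64, 29.26, 28.77,
            29.69, 29.54, 29.25, 28.79, 28.68]   # column 4
wind = [5.53, 7.53, 14, 15.78, 11.75,
        14.38, 14.4, 17.04, 22.54, 19.91]        # column 5


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


r_pressure = pearson(ratio, pressure)
r_wind = pearson(ratio, wind)
print(f"pressure: {r_pressure:.2f}, wind speed: {r_wind:.2f}")
```

With only ten observations, these coefficients are best read as rough indications of direction and strength rather than precise estimates.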
Second, note the relationship of the parameters of Taxi_normal with the following categorical variables: weekday and time (in column 6) and weather conditions (in column 7). It was found that a storm had the greatest impact on taxi ridership during weekend nighttime and the least impact on weekday daytime; and taxi users traveled less during heavy rain conditions. A simple linear regression model was estimated based on the ten observations in Table 4 for a rough estimation of the relationship. The following model has an R squared value of 0.71. Both estimated parameters are significant at the 95% confidence level. The other weather-related or storm-related factors were also tested during the model estimation, but they are not as significant as the two that are selected.
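The rough regression described above can be reproduced from the ten rows of Table 4. The sketch below is illustrative rather than the authors' code: the values are transcribed from Table 4, the heavy-rain dummy is coded from column 7, and the helper `ols` (a plain normal-equations solver) is an assumption introduced here so that the example needs no external libraries.

```python
# Values transcribed from Table 4 (Irene rows first, then Sandy; -24 h to 0 h).
ratio = [0.612, 0.182, 0.023, -0.024, 0.025,
         0.304, 0.236, 0.228, 0.135, 0.034]   # column 3
wind = [5.53, 7.53, 14, 15.78, 11.75,
        14.38, 14.4, 17.04, 22.54, 19.91]     # column 5
# Dummy coded from column 7: 1 where the weather condition is "Heavy Rain".
heavy_rain = [0, 0, 1, 1, 1, 0, 0, 0, 0, 0]


def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


# Design matrix: intercept, wind speed, heavy-rain dummy.
X = [[1.0, w, h] for w, h in zip(wind, heavy_rain)]
const, b_wind, b_rain = ols(X, ratio)

# Coefficient of determination of the fitted model.
fitted = [const + b_wind * w + b_rain * h for w, h in zip(wind, heavy_rain)]
mean_y = sum(ratio) / len(ratio)
ss_res = sum((y - f) ** 2 for y, f in zip(ratio, fitted))
ss_tot = sum((y - mean_y) ** 2 for y in ratio)
r_squared = 1 - ss_res / ss_tot

print(f"constant = {const:.3f}, WindSpeed = {b_wind:.3f}, "
      f"HeavyRain = {b_rain:.3f}, R squared = {r_squared:.2f}")
```

Both slopes come out negative, which agrees with the interpretation that the retained proportion of ridership falls as wind speed increases or rain intensifies. Because the fit rests on ten observations, it should be treated as a rough summary rather than a predictive model.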

$$\mathrm{para}_{\mathrm{Taxi\_normal}} = 0.553 - 0.021\,\mathrm{WindSpeed} - 0.253\,\mathrm{HeavyRain}$$

where

$\mathrm{para}_{\mathrm{Taxi\_normal}}$ are the parameters of Taxi_normal, as presented in column 3 of Table 4; WindSpeed is the local wind speed in miles/hour, as presented in column 5 of Table 4; HeavyRain is a dummy variable (shown in column 7 of Table 4), which equals 1 if the weather condition is heavy rain, and is 0 otherwise. The constant in the model is 0.553, which means that, under this rough model, ridership in hurricane conditions peaks at about 55.3% of the ridership in the corresponding normal condition (at zero wind speed and without heavy rain). The proportion decreases as wind speed increases or rain intensifies.

6. Conclusions

This paper estimated spatio-temporal variations of taxi ridership before the arrival of two hurricanes in NYC. Based on the analysis, it was found that Hurricanes Irene (2011) and Sandy (2012) began to have significant impacts on taxi ridership about one day ahead of hurricane landfall. The greatest reduction in taxi ridership occurred within areas receiving an evacuation order. In all areas, factors that tended to mitigate a reduction in taxi use were a high level of education, the percentage of regular transit


commuters, and the level of taxi ridership under normal conditions. Considering the temporal aspect of taxi use in the face of an oncoming storm, a storm has different impacts on taxi ridership at different times of the day and days of the week. Ranked from the greatest impact to the least: weekend nighttime > weekday nighttime > weekend daytime > weekday daytime. The presence of adverse weather conditions, such as heavy rain or strong wind, reduces taxi ridership.

The dataset used in this study records actual taxi ridership, which could be restricted by supply, such as the number of taxi drivers available at the time. In future research it will be interesting to investigate how many taxi drivers were working immediately prior to landfall and, for those drivers who stopped working, what their concerns were (such as safety or decreasing ridership). For those drivers who kept working, were they aware of additional demand in areas which otherwise would have been under-served?

The contribution of this study has been in providing initial results on where taxi demand has manifested itself spatially and temporally, which may assist government agencies or the taxi industry in dispatching taxis during disasters to serve as an alternative mode for evacuation purposes. The other contributions of this study include identifying periods when taxi ridership significantly deviates from normal conditions in NYC, estimating the magnitude of the deviation, and building a model that estimates taxi ridership in the face of a hurricane. The information can help managers of the Taxi and Limousine Commission in NYC estimate the overall effect of hurricanes on the taxi market. The information can also benefit emergency managers, who would like to find out which areas are more likely to depend on taxis during hurricanes.
For example, areas with a large proportion of transit-dependent commuters or a large demand for taxis under normal circumstances are more likely to have a greater demand for taxis with an approaching storm.

The method used in this study can be applied in other cases with similar datasets. First, extensive taxi service is available in other major metropolitan areas in the U.S., such as Chicago and Los Angeles. Extreme weather or natural disasters also affect these cities, with events such as severe winter storms and wildfires. Therefore, the method of this research could be applied there to find out the impact of such events on taxi ridership. Second, the emergence of transportation network companies (such as Uber/Lyft) in recent years has made them an alternative travel mode in some cities where regular taxi service may or may not be active. The method of this research can be applied to those data to find out the role of transportation network companies in disasters.