Mapping the popularity of urban restaurants using social media data


Applied Geography 63 (2015) 113–120


Applied Geography journal homepage: www.elsevier.com/locate/apgeog

Shixiao Zhai a, b, Xiaolin Xu a, b, *, Lanrong Yang a, b, Min Zhou a, b, Lu Zhang a, b, Bingkui Qiu c

a Non-traditional Security Center of Huazhong University of Science and Technology, Wuhan, China
b College of Public Administration, Huazhong University of Science and Technology, Wuhan, China
c College of Public Administration, Jinzhong University, Jinzhong, China

Article info

Article history: Received 31 March 2015; Received in revised form 15 June 2015; Accepted 15 June 2015; Available online xxx

Keywords: Social media; GIS; Restaurant reviews; Local Moran's I; Spatial analysis

Abstract

Nowadays, geographers show growing interest in providing location-based services for urban residents. It is of great practical significance to screen and recommend the most popular restaurants to consumers, as dining is important to every urban dweller. Consumer review websites (CRWs) have emerged as an active social media platform in the catering industry. This paper demonstrates how to quantify the popularity of urban restaurants (PUR) using a CRW, namely Dianping.com. An applied popularity index (PI) was developed to quantify PUR, based on consumer review scores (food, service, and decoration) and physical data (evaluation frequency and restaurant grade) for 8259 restaurants within Hangzhou city, China. All the information, together with the geographic location data, was harvested from the corresponding Application Programming Interface (API) platform. PUR was then mapped using a geographic information system (GIS). Results showed that restaurants of high popularity were generally concentrated in old urban districts, whereas those in new urban districts presented low popularity. The kernel density distribution of PI also showed that PI values declined from the central city toward the outskirts. Locational associations between PUR and urban functional units (bank, shopping mall, school, cinema, hotel, bar, and scenic spot) generally presented a similar tendency. Restaurants with high PI values were in close proximity to urban functional units, whereas those with low PI values were located far from them. This implies that restaurants with high popularity tend to be located where urban functional units are highly mixed. Although we used Dianping.com to perform the analysis, the presented methodology can be extended to other types of CRWs. Our study is believed to provide new insights into applied geographic sciences. © 2015 Elsevier Ltd. All rights reserved.

* Corresponding author. College of Public Administration, Huazhong University of Science and Technology, No. 1037, Luoyu Road, Wuhan 430074, China. E-mail address: [email protected] (X. Xu). http://dx.doi.org/10.1016/j.apgeog.2015.06.006 0143-6228/© 2015 Elsevier Ltd. All rights reserved.

1. Introduction

In recent years, there has been a remarkable surge in the usage of social media among the general public (Kaplan & Haenlein, 2010). In China, for example, more than 65% of the netizen population participate in social media, and there are 100 million active daily microblog users (Xiong & Lv, 2013). The adoption and usage of social media will continue to increase as technological progress empowers more contributions and interactions among the general public (Hollis, 2011). Social media introduces a new bottom-up form of communication, which enables peer-to-peer, business-to-business, and consumer-to-business information offering and sharing (Park & Nicolau, 2015). This has not only changed the traditional form of news reporting but also provided new opportunities for geographic science (Sui & Goodchild, 2011), given the rich geographic information attached to social media data, often known as “geotags,” in the form of longitude and latitude coordinates (Croitoru, Wayant, Crooks, Radzikowski, & Stefanidis, 2014; Lin & Cromley, 2015; Shelton, Poorthuis, Graham, & Zook, 2014). Scholars have applied social media data (from Facebook and microblogging platforms, for example) to many fields of applied geographic studies, including population migration, urban space pattern, commuting behaviors, environmental event reactions, pandemics and disaster predictions, and crime occurrence (Cao et al., 2015; Char & Stow, 2015; Chunara, Andrews, & Brownstein, 2012; Croitoru et al., 2014; Gerber, 2014; Jang & Hart, 2015; Kounadi, Lampoltshammer, Groff, Sitko, & Leitner, 2015; Lampoltshammer, Kounadi, Sitko, & Hawelka, 2014; Lin & Cromley, 2015; Shelton


et al., 2014; Patel & Jermacane, 2015; Widener & Li, 2014).

In the recent past, geographers have shown growing interest in providing location-based services for urban residents (Croitoru et al., 2014). It is of great practical significance to screen and recommend the most popular restaurants to consumers, as dining is important to every urban dweller. Consumer review websites (CRWs) have emerged as an active social media platform in the catering and tourism industries (Liu, Su, Gan, & Chou, 2014; Ogut & Tas, 2012; Park & Nicolau, 2015). It is estimated that 60% of consumers rely on online reviews and ratings when making purchase decisions (Smith, 2013). CRWs provide a large volume of evaluation data for quantifying the popularity of urban restaurants (PUR).

The Zagat Restaurant Survey in the USA and Dianping.com in China are noteworthy examples of CRWs. The Zagat Survey (first conducted in 1979) has become one of the most trusted restaurant rating systems globally (Liu et al., 2014; Zagat, 2007). Dianping.com was established in 2003 following the model of the Zagat Survey. By 2013, Dianping.com had recorded over 15 million restaurants in 2300 cities of China, with 48 million active users. Dianping.com provides a rating system for consumers to evaluate restaurants on three aspects (food, service, and decoration). Higher rating scores reflect higher consumer satisfaction and higher popularity among the general public (Parasuraman, Zeithaml, & Malhotra, 2005; Park & Nicolau, 2015). The spatial pattern of PUR can be visualized based on rating scores and geographic location information. Mapping PUR using the big data from Dianping.com therefore offers a typical case of applying social media data from CRWs to geographic science.

Using Hangzhou city in China as a case, this paper aims to demonstrate how to map PUR by using social media data from a CRW.
The specific objectives are to (1) develop an applied index to measure PUR using the big data from Dianping.com; (2) characterize the geographic variations of PUR within Hangzhou city; and (3) analyze the locational associations between PUR and other functional units in the urban system.

2. Literature review

Recognizing the importance of CRWs, scholars have conducted a number of case studies (Liu et al., 2014; Lu, Ba, Huang, & Feng, 2013; Park & Nicolau, 2015; Vermeulen & Seegers, 2009; Ye, Law, Gu, & Chen, 2011; Zhang, Ye, Law, & Li, 2010). Prior studies generally focused on the fields of tourism and hospitality, and discussed the effect of online reviews from two aspects: the consumer decision-making process and product sales. It was consistently found that purchase decisions and company revenues were both significantly influenced by the characteristics of review providers (e.g., expertise level and identity disclosure) and those of online reviews (e.g., information richness and star ratings) (Duan, Gu, & Whinston, 2008; Litvin, Goldsmith, & Pan, 2008; Liu et al., 2014; Zhang et al., 2010). For example, consumers usually perceive extremely negative or positive ratings as more useful than moderate ratings (Zhang et al., 2010). In the dining segment, past literature generally agreed that food, service, price, location, and environment were the key aspects in restaurant rating (Zhang, Zhang, & Law, 2014). Some studies also discussed the limitations of the world-renowned restaurant rating guides and developed a more effective rating scale (Liu et al., 2014). Evidence has revealed that online reviews are significant contributors to PUR, as potential consumers are connected with many other diners by CRWs (Zhang et al., 2010). In particular, PUR is positively correlated with the volume of reviews, the service, and the quality of food (Zhang et al., 2010).
These studies have advanced the application of CRWs to analyze PUR; however, no studies have employed applied geographic approaches to characterize the spatial variations of PUR based on CRWs.

A majority of previous geographic studies collected their data from the social media platform Twitter (Chen & Yang, 2014; Gerber, 2014; Goodchild & Glennon, 2010; Kirilenko, Molodtsova, & Stepchenkova, 2015; Xu, Wong, & Yang, 2013). They can be divided into two categories: applied research using social media as a data source and general exploratory data analysis (Widener & Li, 2014). Applied geographers are particularly interested in relating the geolocated property of tweets to users' personal characteristics and behaviors. For example, Ghosh and Guha (2013) employed point density analysis to map tweets related to the topic of obesity. Widener and Li (2014) mapped the prevalence of unhealthy food references within the United States by mining geolocated Twitter data. Yang and Mu (2015) detected depressed tweets and analyzed the spatial pattern of depression by calculating Moran's I index. Kounadi et al. (2015) used “Nearest Neighbor Hierarchical Clustering” to examine the proximity dependency of Twitter responses to homicides. In another application, geolocated nighttime tweet data were used as ancillary information to perform the areal interpolation of population (Lin & Cromley, 2015). All these cases demonstrated that combining spatial analysis techniques with social media data can effectively support knowledge discovery in a wide variety of fields. However, data from CRWs have rarely been applied in geographic research.

3. Methodology and data

3.1. Study area

Hangzhou, located along the Chinese southeastern coast (Fig. 1), is a key historical, cultural, and tourism city. It covers an area of approximately 4881 km² and has a population of 7.8 million. In terms of space, Hangzhou can be divided into two parts: the old urban districts and the new urban districts (Fig. 1).
Given its superior geographic advantage and rich historical environment, food culture in Hangzhou is diversified and vibrant, and the catering industry is highly developed. Hangzhou was one of the first Chinese cities covered by Dianping.com. It therefore serves as a typical case to map PUR based on Dianping.com.

3.2. Developing an applied index

Dianping.com provides three aspects (food, service, and decoration) to evaluate restaurants, with a maximum score of 30 points for each criterion. The literature review indicates that, in addition to the environment, food, and service, PUR is also correlated with the volume of reviews and the scale of operation (Liu et al., 2014; Park & Nicolau, 2015; Zhang et al., 2014). Consequently, the number of reviews, the number of recommendations, the evaluation frequency, and the size and grade of the restaurants should be taken into consideration when using the big data from Dianping.com to measure PUR.

Eighteen indices were first selected based on data availability and were then subjected to expert panel evaluation. Ten experts participated in the consulting procedure. Three indices were revised according to the comments of the experts. Two pairs of indices were judged to be different expressions of an identical nature, and therefore one index from each pair (two in total) was discarded. The experts pointed out that two further indices had shortcomings, and these indices were also discarded. The final comprehensive index system to measure PUR is shown in Table 1. Data for these indices for a total of 8259 restaurants within Hangzhou (Fig. 2), as well as their geographic location information, were collected using the Search Application Programming Interface (API) developed by Dianping.com, which allows an authorized third party to run queries and access the data. The standard deviation method (Eq. (1)) was used to standardize the indices. After performing the


Fig. 1. Location and administrative divisions of Hangzhou city, China.

Table 1 Index system to measure the popularity of urban restaurants.

Dimension | Index | Name | Explanation
Evaluation frequency | X1 | Number of web page views | Number of times that consumers view the homepage of a certain restaurant
Evaluation frequency | X2 | Number of evaluations | Number of times that consumers evaluate a restaurant
Evaluation frequency | X3 | Number of coupon evaluations | Number of times that consumers evaluate the coupon of a restaurant
Evaluation frequency | X4 | Number of web collections | Number of consumers who show interest in a restaurant
Evaluation frequency | X5 | Number of check-ins | Number of consumers who visit a restaurant through Dianping.com
Food quality | X6 | Total evaluation points of food quality | Total evaluation points by consumers on food quality of a restaurant
Food quality | X7 | Number of recommendations | Number of times that consumers recommend the food of a restaurant
Decoration | X8 | Total evaluation points of decoration | Total evaluation points by consumers on decoration of a restaurant
Decoration | X9 | Number of evaluations on atmosphere | Number of times that consumers evaluate the dining atmosphere of a restaurant
Service quality | X10 | Total evaluation points of service quality | Total evaluation points by consumers on service quality of a restaurant
Service quality | X11 | Number of evaluations on special service | Number of times that consumers evaluate the special service of a restaurant (e.g., free parking, credit card promotion)
Restaurant size and grade | X12 | Overall evaluation star rating | Average star rating level of a restaurant by the total consumers
Restaurant size and grade | X13 | Consumption per capita | Consumption per capita of a restaurant
Restaurant size and grade | X14 | Number of branches | Number of branch stores of a restaurant within the city

Kaiser–Meyer–Olkin (KMO) test and Bartlett's sphericity test, we used principal component analysis (PCA), with varimax rotation in particular, to obtain the components' eigenvalues and the indices' loadings. PCA was applied to avoid the arbitrariness associated with subjectively assigned weights. The KMO test and Bartlett's sphericity test were performed because they examine the validity of PCA: if KMO exceeds 0.7 and Bartlett's sphericity test is significant, PCA is regarded as valid. The varimax rotation was applied because it generates a simple and structured factor-loading matrix. We retained the high-scoring components (eigenvalues >1.0) and indices (loadings >0.75) (Su, Wang, Luo, Mai, & Pu, 2014). Finally, an applied popularity index (PI) was formulated using the eigenvalues and loading scores (Eq. (2)):

$$x_i^{*} = \frac{x_i - \mu}{\sigma} \tag{1}$$

where $x_i^{*}$ is the standardized value of $x_i$, $\mu$ is the mean value of $x$, and $\sigma$ is the standard deviation.
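The standardization in Eq. (1) can be illustrated with a minimal Python sketch. The function name `standardize` and the sample values are hypothetical and are not taken from the paper. The paper does not state whether the population or the sample standard deviation was used, so the population form is assumed here.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return (x - mean) / standard deviation for each value, as in Eq. (1)."""
    m = mean(values)
    s = pstdev(values)  # assumption: population standard deviation
    if s == 0:
        raise ValueError("standard deviation is zero; the index cannot be standardized")
    return [(x - m) / s for x in values]


# Hypothetical review counts for five restaurants (not data from the paper).
review_counts = [10, 20, 30, 40, 50]
z_scores = standardize(review_counts)
print([round(z, 3) for z in z_scores])
```

After standardization, each index has zero mean and unit standard deviation, so indices measured on very different scales (page views versus star ratings, for instance) become comparable.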

$$PI = \frac{E_i}{\sum_{i=1}^{n} E_i} \left( \sum_{j=1}^{k} L_j \times M_j \right) \tag{2}$$

where $E_i$ is the eigenvalue of component $i$, $L_j$ is the loading score for index $j$, and $M_j$ is the standardized value for index $j$.


3.3. Spatial analysis

On the basis of the PI and geographic location information, PUR was mapped using a geographic information system (GIS). To visually characterize the geographic variations of PUR, the restaurants were divided into six groups based on the rankings of PI: extremely high (group 6), very high (group 5), high (group 4), low (group 3), very low (group 2), and extremely low (group 1). In particular, the box plot was employed to determine the range of each group: (1) the thresholds separating groups 2, 3, 4, and 5 were 25%, 50%, and 75% of the PI ranges, respectively; (2) the threshold for group 1 (extremely low values) was (25% − 3 × (75% − 25%)) of the PI ranges; and (3) the threshold for group 6 (extremely high values) was (75% + 3 × (75% − 25%)) of the PI ranges (McGill, Tukey, & Larsen, 1978).

Kernel density estimation (Silverman, 1998) and local Moran's I (Anselin, 1988) were employed to statistically characterize the geographic variations of PUR. Previous studies demonstrated that these approaches were effective in analyzing geolocated data from social media (Ghosh & Guha, 2013; Yang & Mu, 2015). Kernel density estimation helps investigate the continuous variation of PI-weighted restaurant points in space, whereas the local Moran's I index helps identify local spatial clusters or spatial outliers of PI-weighted points. Spatial clusters fall into two categories (Su, Jiang, Zhang, & Zhang, 2011): high–high clusters (restaurants with high PI values surrounded by restaurants with high PI values) and low–low clusters (clustering of restaurants with low PI values). Spatial outliers are located within a mixture of restaurants with low and high PI values and include high–low outliers (a high PI value restaurant surrounded by low PI value restaurants) and low–high outliers (a low PI value restaurant surrounded by high PI value restaurants). All the spatial analyses were performed using ArcGIS 10.2.
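The box-plot grouping described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' ArcGIS workflow: the thresholds are read as the quartiles of the PI values plus fences at three interquartile ranges beyond the first and third quartiles, a value that falls exactly on a threshold is placed in the higher group, and the function names and sample PI values are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def pi_group_thresholds(pi_values):
    """Return the five cut points that separate the six popularity groups."""
    q1, q2, q3 = quantiles(pi_values, n=4, method="inclusive")
    iqr = q3 - q1
    # Lower fence, 25%, 50%, 75%, upper fence.
    return [q1 - 3 * iqr, q1, q2, q3, q3 + 3 * iqr]


def assign_group(pi, thresholds):
    """Map a PI value to a group from 1 (extremely low) to 6 (extremely high)."""
    # Assumption: a value equal to a threshold belongs to the higher group.
    return bisect_right(thresholds, pi) + 1


# Hypothetical PI values (not data from the paper).
pi_values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
thresholds = pi_group_thresholds(pi_values)
groups = [assign_group(pi, thresholds) for pi in pi_values]
print(thresholds)
print(groups)
```

Because the fences sit three interquartile ranges away from the quartiles, groups 1 and 6 capture only extreme values, which is consistent with their small shares in Table 4.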
In particular, the search radius of kernel density estimation was determined by the range of the semivariogram for PI. A preliminary experiment showed that the mean integrated squared error of the quartic kernel function was significantly lower than that of the uniform, Gaussian, and Epanechnikov kernel functions. We therefore chose the quartic kernel function. The weight matrix of local Moran's I for PI was determined by the nearest neighbor distance method. The nearest neighbor distance was used to generate the spatial weight matrix because it sufficiently incorporates autocorrelation information into the spatial analysis (Su et al., 2013).

3.4. Statistical analysis

The economic well-being of economic activities depends on a central location, where population demand is high (King & Golledge, 1978). Consequently, driven by market forces and zoning, spatial clustering and co-occurrence patterns are always found among different functional units (Leslie, Frankenfeld, & Makara, 2012). A functional unit refers to the individual presence of any living-supporting activity in an urban area (Myint, 2008). Functional units typically include hotels, banks, grocery stores, schools, and singing and dancing bars (Myint, 2008). Examining how PUR is locationally related to other functions is therefore an essential step toward understanding the PUR variations in space. In response, we further analyzed the locational associations between PUR and other functional units in the urban system. Given empirical and theoretical considerations, we selected eight categories of functional units for locational association analysis: bank, hotel, singing and dancing bar, shopping mall, elementary and secondary school, university, cinema, and scenic spot. One-way analysis of variance (ANOVA) was applied to compare the distance to different functional units from the six groups of restaurants (extremely high, very high, high, low, very low, and extremely

Fig. 2. Location of restaurants within Hangzhou city, China.

low). We then ranked the distance to different functional units from the six groups of restaurants, in order to quantify the locational association between PUR and other functional units.

4. Results

4.1. The integrated index to measure PUR

KMO was 0.825, and the p-value for Bartlett's sphericity test was 0.002, indicating the validity of PCA. Three components were extracted by PCA and they explained 82.67% of the total variance (Table 2). Component 1 had strong positive loadings on the number of evaluations, number of coupon evaluations, and overall evaluation star rating (Table 3). It represented the evaluation frequency and grade of the restaurants. Component 2 reflected the food and service quality, given the strong positive loadings of the number of recommendations, total evaluation points of food quality, and total evaluation points of service quality (Table 3). It can be inferred that Component 3 denoted the decoration, as it had high positive loadings on the total evaluation points of decoration and number of evaluations on atmosphere. Eigenvalues of the three components and the loading scores for the corresponding indices were then used to formulate PI as follows:

$$PI = 3.69 \times (0.894 X_2 + 0.765 X_3 + 0.813 X_{12}) + 2.546 \times (0.912 X_6 + 0.803 X_7 + 0.892 X_{10}) + 1.214 \times (0.876 X_8 + 0.771 X_9) \tag{3}$$
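As a hedged illustration, Eq. (3) can be evaluated for a single restaurant with a few lines of Python. The coefficients are those printed in Eq. (3). The function name `popularity_index`, the dictionary layout, and the sample inputs are hypothetical, and the sketch assumes that each X value has already been standardized with Eq. (1).

```python
# Eigenvalues of the three retained components, as printed in Eq. (3).
EIGENVALUES = (3.69, 2.546, 1.214)

# Loading scores of the retained indices for each component, as printed in Eq. (3).
LOADINGS = (
    {"X2": 0.894, "X3": 0.765, "X12": 0.813},
    {"X6": 0.912, "X7": 0.803, "X10": 0.892},
    {"X8": 0.876, "X9": 0.771},
)


def popularity_index(x):
    """Evaluate Eq. (3) for one restaurant given its index values by name."""
    return sum(
        eigenvalue * sum(loading * x[name] for name, loading in component.items())
        for eigenvalue, component in zip(EIGENVALUES, LOADINGS)
    )


# Hypothetical restaurant whose retained indices all equal 1.0 after standardization.
index_names = [name for component in LOADINGS for name in component]
example = {name: 1.0 for name in index_names}
print(round(popularity_index(example), 5))
```

Only the eight indices that passed the loading threshold enter the index, so the remaining six indices of Table 1 do not affect PI.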

Table 2 Eigenvalues and cumulative variances extracted by varimax rotated principal component analysis.

Component | Eigenvalue (total) | % of variance | Cumulative %
1 | 3.690 | 36.127 | 36.127
2 | 2.546 | 29.313 | 66.440
3 | 1.214 | 16.221 | 82.661
4 | 0.827 | 8.911 | 91.572
5 | 0.540 | 4.371 | 95.943
6 | 0.270 | 2.673 | 98.616
7 | 0.103 | 1.384 | 100.000

Table 3 Loading scores for indices in each component.

No. | Indices | Component 1 | Component 2 | Component 3
X1 | Number of web page views | .573 | .201 | .319
X2 | Number of evaluations | .894 | .218 | .304
X3 | Number of coupon evaluations | .765 | .197 | .168
X4 | Number of web collections | .612 | .176 | .221
X5 | Number of check-ins | .475 | .188 | .297
X6 | Total evaluation points of food quality | .112 | .912 | .418
X7 | Number of recommendation | .145 | .803 | .465
X8 | Total evaluation points of decoration | .235 | .347 | .876
X9 | Number of evaluations on atmosphere | .098 | .452 | .771
X10 | Total evaluation points of service quality | .191 | .892 | .187
X11 | Number of evaluations on special service | .246 | .731 | .201
X12 | Overall evaluation star rating | .813 | .141 | .076
X13 | Consumption per capita | .223 | .103 | .095
X14 | Number of branch restaurants | .312 | .087 | .067
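The retention rule stated in Section 3.2 (eigenvalues >1.0 and loadings >0.75) can be checked against Tables 2 and 3 with a short Python sketch. The variable names are hypothetical; the numbers are copied from the two tables.

```python
# Eigenvalues of the seven components listed in Table 2.
EIGENVALUES = [3.690, 2.546, 1.214, 0.827, 0.540, 0.270, 0.103]

# Loading scores from Table 3: index -> (component 1, component 2, component 3).
LOADINGS = {
    "X1": (0.573, 0.201, 0.319),
    "X2": (0.894, 0.218, 0.304),
    "X3": (0.765, 0.197, 0.168),
    "X4": (0.612, 0.176, 0.221),
    "X5": (0.475, 0.188, 0.297),
    "X6": (0.112, 0.912, 0.418),
    "X7": (0.145, 0.803, 0.465),
    "X8": (0.235, 0.347, 0.876),
    "X9": (0.098, 0.452, 0.771),
    "X10": (0.191, 0.892, 0.187),
    "X11": (0.246, 0.731, 0.201),
    "X12": (0.813, 0.141, 0.076),
    "X13": (0.223, 0.103, 0.095),
    "X14": (0.312, 0.087, 0.067),
}

# Keep components with eigenvalue > 1.0, then indices with loading > 0.75.
retained_components = [i for i, e in enumerate(EIGENVALUES, start=1) if e > 1.0]
retained_indices = {
    c: [name for name, row in LOADINGS.items() if row[c - 1] > 0.75]
    for c in retained_components
}
print(retained_indices)
```

The retained indices are the same eight that appear in Eq. (3); X11 narrowly misses the cutoff with a loading of .731 on Component 2.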

4.2. Geographic variations of PUR

Fig. 3 shows the great geographic heterogeneity of PUR within Hangzhou. Restaurants with high PI values were generally concentrated in old urban districts, whereas those in new urban districts presented low PI values. Statistics (Table 4) showed that restaurants with high PI values accounted for a large proportion in the old urban districts. For example, restaurants of groups 4, 5, and 6 accounted for 70.7% and 64.1% of all restaurants in Shangcheng District and Xiacheng District, respectively. In contrast, the majority of the restaurants in the new urban districts belonged to the low-PI groups. In Xiaoshan District and Yuhang District, 63.1% and 66.1% of the restaurants, respectively, belonged to groups 1, 2, and 3. In addition, the mean value of PI for restaurants in the old urban districts was much higher than that for restaurants in the new urban districts (Table 4). Among the old urban districts, the mean value of PI was highest for Shangcheng District, followed by Xiacheng District and Gongshu District, all of which were much higher than those for Jianggan District and Binjiang District. These results also indicated the great geographic variations of PUR. The kernel density distribution of PI (Fig. 4) showed that PI values declined from the central city toward the outskirts. In particular, the core of PI for the whole city was located in Shangcheng District. Fig. 5 shows the local clusters

Fig. 3. Distributions of the six categories of restaurants regarding their popularity within Hangzhou city, China: 1 (extremely low), 2 (very low), 3 (low), 4 (high), 5 (very high), 6 (extremely high).


and outliers of PI identified by the local Moran's I index. The high–high clusters were mainly distributed in Shangcheng District and Xiacheng District. The high–low outliers were also located within the old urban districts. These results indicated that the most popular restaurants were clustered in the central urban areas. The low–low clusters were roughly concentrated in the new urban districts, near the border of the old urban districts. This suggested that restaurants with low popularity were clustered in the junction zone between the new and old urban districts. Results of the kernel density estimation and local Moran's I analysis provided statistical evidence that PUR presented significant spatial clusters and geographic variations.

4.3. Locational associations with other functional units

Table 5 shows the distance to different functional units from the six groups of restaurants. For the distance to banks, there was no significant difference between group 1 and group 2, but the distance decreased steadily from group 2 to group 6. This indicated that restaurants with high popularity were located nearer to banks than those with low popularity. Locational associations between PUR and shopping mall, as well as cinema, were the same as those between PUR and bank. Distance to shopping mall and cinema declined from the extremely low group to the extremely high group. Locational associations between PUR and the other functional units, although not identical, presented a trend similar to that between PUR and bank. Generally, restaurants with high popularity were in close proximity to urban functional units, whereas those with low popularity were far away from them.

5. Discussion and conclusions

Online popularity is particularly important in the catering industry, given that the gastronomic experience cannot be known before consumption (Zhang et al., 2010).
However, PUR cannot be directly characterized or measured in an efficient and accurate manner, as the star rating systems of CRWs provide indices in various formats (Liu et al., 2014). Moreover, a thematic map, which exhibits the spatial patterns of PUR and provides location-based service for the general public, is generally unavailable in CRWs. This paper demonstrated how to map PUR by using the big data from the social media platform Dianping.com. As CRWs provide big data on various aspects, the integration of consumer review data (food, service, and decoration) and physical data (evaluation frequency and restaurant grade) is essential for quantifying PUR. We addressed this challenge by developing an applied index that integrated data of multiple dimensions. By using geospatial and statistical analyses, PUR patterns and their locational associations with other functional units were characterized.

It was found that PUR presented great geographic heterogeneity. Popularity of restaurants in the old urban districts was generally higher than that of restaurants in the new urban districts. Specifically, restaurants with high popularity were clustered in the old urban districts, whereas those with low popularity were clustered in the junction zone between the new and old urban districts. Population density is the key influential factor of restaurant location (Dock, Song, & Lu, 2014). The old urban districts in Hangzhou are densely populated. Residents' social activities are frequent, and the demand for food service is high. The catering industry in the old urban districts therefore started earlier and developed fairly well, given the high market potential and customer volume. A longer period of operation and promotion helped the restaurants earn a better reputation among the general public. Accessibility is another critical factor determining restaurant competitiveness (Talaga,


Table 4 Number of the six categories of restaurants and the mean value of popularity index in different districts within Hangzhou city, China.

District type | District | Group 1 a | Group 2 | Group 3 | Group 4 | Group 5 | Group 6 | Mean_PI
Old urban districts | Shangcheng | 12 (2.0%) | 76 (12.6%) | 88 (14.6%) | 146 (24.3%) | 189 (31.4%) | 90 (15.1%) | 1800.5
Old urban districts | Xiacheng | 11 (2.0%) | 93 (17.2%) | 90 (16.6%) | 154 (28.5%) | 141 (26.1%) | 52 (9.6%) | 1079.2
Old urban districts | Jianggan | 63 (3.9%) | 328 (20.1%) | 448 (27.3%) | 411 (25.1%) | 332 (20.3%) | 54 (3.3%) | 447.1
Old urban districts | Gongshu | 32 (4.2%) | 128 (16.8%) | 156 (20.6%) | 172 (22.7%) | 204 (26.9%) | 67 (8.8%) | 959.9
Old urban districts | Xihu | 53 (3.0%) | 368 (20.5%) | 425 (23.7%) | 451 (25.2%) | 395 (22.0%) | 101 (5.6%) | 750.4
Old urban districts | Binjiang | 26 (3.5%) | 138 (18.5%) | 176 (23.6%) | 211 (28.4%) | 170 (22.8%) | 24 (3.2%) | 446.6
Old urban districts | Total | 197 (3.2%) | 1131 (18.6%) | 1383 (22.8%) | 1545 (25.4%) | 1431 (23.6%) | 388 (6.4%) | 914.0
New urban districts | Xiaoshan | 44 (4.3%) | 311 (30.2%) | 296 (28.7%) | 244 (23.7%) | 126 (12.2%) | 10 (1.0%) | 154.1
New urban districts | Yuhang | 55 (4.8%) | 326 (28.3%) | 381 (33.0%) | 281 (24.3%) | 96 (8.3%) | 14 (1.2%) | 171.5
New urban districts | Total | 99 (4.5%) | 637 (29.2%) | 677 (31.0%) | 525 (24.0%) | 222 (10.2%) | 24 (1.1%) | 126.8

a 1, extremely low; 2, very low; 3, low; 4, high; 5, very high; 6, extremely high.
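The district shares quoted in Section 4.2 follow directly from the counts in Table 4. A minimal Python sketch of that arithmetic is given here; the helper name `share` and the data layout are hypothetical, and the counts are copied from the table.

```python
# Restaurant counts per group (1 to 6) for four districts, copied from Table 4.
COUNTS = {
    "Shangcheng": (12, 76, 88, 146, 189, 90),
    "Xiacheng": (11, 93, 90, 154, 141, 52),
    "Xiaoshan": (44, 311, 296, 244, 126, 10),
    "Yuhang": (55, 326, 381, 281, 96, 14),
}

HIGH_GROUPS = (4, 5, 6)
LOW_GROUPS = (1, 2, 3)


def share(counts, groups):
    """Percentage of a district's restaurants that fall in the given groups."""
    return 100 * sum(counts[g - 1] for g in groups) / sum(counts)


for district in ("Shangcheng", "Xiacheng"):
    print(district, "groups 4-6:", round(share(COUNTS[district], HIGH_GROUPS), 1))
for district in ("Xiaoshan", "Yuhang"):
    print(district, "groups 1-3:", round(share(COUNTS[district], LOW_GROUPS), 1))
```

The computed values reproduce the 70.7%, 64.1%, 63.1%, and 66.1% reported in the text.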

2010). Restaurants in the old urban districts are easily accessible because of convenient transportation. These restaurants therefore have a large potential consumer base. All these factors make the restaurants in the old urban districts more attractive and help them gain more popularity.

To identify the contributors to the spatial variations of PI values, we further drew thematic maps (Fig. 6) for the three components extracted by PCA. Restaurants in the new districts presented low values of Component 1 and Component 2. This implied that the low PUR values for restaurants in Xiaoshan and Yuhang were related to the low evaluation frequency, small business scale, and poor food and service quality. A similar analysis demonstrated that low PUR values for restaurants in Binjiang and Jianggan were associated with their low evaluation frequency and small business scale. When overlaying these thematic maps with Fig. 5, we found that low–high outliers in the old districts generally presented low values of Component 2 and Component 3. This suggested that food and service quality as well as decoration were critical influential factors of PUR in the old districts. In addition, the high–low outliers in the new districts roughly had high values of Component 3, indicating that decoration was an important contributor to PUR in the new districts. These results not only supported previous findings that food, service, and environment were the key aspects in restaurant rating (Zhang et al., 2014), but also provided evidence that the influence of these factors varied in space.

Results demonstrated that locational associations between PUR and the eight categories of functional units generally presented a similar tendency. Restaurants with high PI values were in high

proximity to other functional units, whereas those with low values were located far away from other functional units. Such results implied that restaurants with high popularity tended to be located where other functional units are highly mixed. This finding is consistent with typical urban geographic research, which emphasizes accessibility to various types of public facilities (Apparicio, Abdelmajid, Riva, & Shearmur, 2008; Lamichhane et al., 2013). Urban functional units always present spatial co-occurrence and clustering patterns (Lamichhane et al., 2013; Myint, 2008) because of a combination of zoning and market forces (Leslie et al., 2012). Restaurants need a strong consumer base and residential buying power to survive. Functional units such as shopping malls, schools, cinemas, and scenic spots provide a large consumer base for restaurants in close proximity. These restaurants can earn more business income, which in turn can be used to improve food quality and decoration. Hence, restaurants with high popularity can be expected in areas with a high mixture of functional units.

The produced thematic maps of PUR have useful implications. First, they provide the general public with references for choosing a restaurant. Consumers can easily find the locations of restaurants with high scores in food quality, service, decoration, and overall popularity. Second, they offer officials spatial insights into the local catering industry and provide benchmarking statistics that help unpopular restaurants improve their service. Low-scoring restaurants can identify their gaps in different aspects of PUR.

The primary contributions of this study are twofold. First, we presented a pioneering, but practical, approach to apply the social

Fig. 4. Kernel density distribution of the popularity index for restaurants within Hangzhou city, China.
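A density surface like the one in Fig. 4 can be approximated by summing a distance-decaying kernel around each restaurant, weighted by its PI value. The sketch below is only an illustration under stated assumptions: a Gaussian kernel, planar coordinates, and a hand-picked bandwidth. The coordinates, PI values, and the function name `pi_density` are hypothetical and do not reproduce the GIS workflow used in the study.

```python
import math

def pi_density(x, y, restaurants, bandwidth):
    """PI-weighted Gaussian kernel density at location (x, y).

    restaurants: list of (x_i, y_i, pi_i) tuples in planar coordinates.
    bandwidth:   kernel bandwidth h, in the same units as the coordinates
                 (an assumed value, not taken from the paper).
    """
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)  # 2-D Gaussian normaliser
    total = 0.0
    for xi, yi, pi_i in restaurants:
        d2 = (x - xi) ** 2 + (y - yi) ** 2  # squared distance to restaurant i
        total += pi_i * norm * math.exp(-d2 / (2.0 * bandwidth ** 2))
    return total

# Hypothetical data: three high-PI restaurants near the centre,
# one low-PI restaurant far away on the outskirts.
restaurants = [(0, 0, 0.9), (100, 50, 0.8), (-80, 40, 0.85), (5000, 5000, 0.2)]

centre = pi_density(0, 0, restaurants, bandwidth=500)
outskirts = pi_density(5000, 5000, restaurants, bandwidth=500)
print(centre > outskirts)
```

Evaluating `pi_density` on a regular grid of points would give a raster comparable in spirit to a kernel density map; the bandwidth controls how smooth that surface is.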

Fig. 5. Patterns of spatial clusters and outliers of the popularity index identified by local Moran's I within Hangzhou city, China.
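The cluster and outlier labels in Fig. 5 (for example, the low-high and high-low outliers discussed in the text) follow from the sign of each restaurant's standardized PI value and the sign of its neighbours' average. The following is a minimal sketch of that classification, assuming row-standardized weights and a hypothetical neighbour list; the significance testing that normally accompanies local Moran's I is omitted, and the data are invented for illustration.

```python
import math

def local_moran(values, neighbors):
    """Return (I_i, quadrant) per observation using row-standardised weights.

    values:    list of PI values.
    neighbors: dict mapping index -> list of neighbour indices
               (hypothetical neighbourhood definition).
    """
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    z = [(v - mean) / sd for v in values]  # standardised PI values

    results = []
    for i in range(n):
        nbrs = neighbors[i]
        lag = sum(z[j] for j in nbrs) / len(nbrs)  # mean z-score of neighbours
        if z[i] >= 0 and lag >= 0:
            quadrant = "high-high"
        elif z[i] < 0 and lag < 0:
            quadrant = "low-low"
        elif z[i] >= 0:
            quadrant = "high-low"   # high value surrounded by low values
        else:
            quadrant = "low-high"   # low value surrounded by high values
        results.append((z[i] * lag, quadrant))
    return results

# Hypothetical PI values: indices 0-3 form an "old district" group with one
# unpopular restaurant (index 2); indices 4-7 form a "new district" group
# with one popular restaurant (index 6).
values = [0.9, 0.85, 0.2, 0.8, 0.15, 0.1, 0.7, 0.12]
neighbors = {
    0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2],
    4: [5, 6, 7], 5: [4, 6, 7], 6: [4, 5, 7], 7: [4, 5, 6],
}
results = local_moran(values, neighbors)
for i, (stat, quadrant) in enumerate(results):
    print(i, round(stat, 3), quadrant)
```

In this toy example, restaurant 2 is labelled a low-high outlier and restaurant 6 a high-low outlier, mirroring the two outlier types interpreted in the discussion.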


Table 5. Comparisons of the distance to different functional units from the six groups of restaurants^a (mean difference).

| Group | Bank | Hotel | Singing and dancing bar | Shopping mall | Elementary and secondary school | University | Cinema | Scenic spot |
|---|---|---|---|---|---|---|---|---|
| 1–2 | 14.3 | 100.3** | 26.1 | 106.1 | 44.7 | 4.8 | 7.8 | 91.7 |
| 1–3 | 36.7** | 103.7** | 103.4** | 211.8** | 41.5 | 267.5** | 27.0 | 272.6** |
| 1–4 | 135.4** | 110.1** | 158.6** | 617.1** | 71.65** | 349.98** | 117.8** | 423.9** |
| 1–5 | 308.8** | 238.1** | 290.5** | 1071.1** | 158.6** | 778.6** | 288.5** | 661.1** |
| 1–6 | 406.1** | 382.0** | 332.51** | 1513.3** | 280.9** | 1041.3** | 619.8** | 784.5** |
| 2–3 | 51.1** | 6.6 | 57.3** | 1105.6** | 3.17 | 272.3** | 34.8 | 280.9** |
| 2–4 | 149.7** | 99.8** | 132.4** | 511.1** | 66.9** | 354.7** | 125.6** | 332.26** |
| 2–5 | 323.2** | 227.8** | 264.3** | 964.9** | 113.8** | 783.5** | 296.3** | 569.4** |
| 2–6 | 420.4** | 371.7** | 306.3** | 1407.1** | 236.2** | 1046.1** | 627.6** | 692.7** |
| 3–4 | 98.6** | 106.4** | 75.1** | 405.3** | 70.1** | 282.4** | 190.7** | 251.3** |
| 3–5 | 272.1** | 234.5** | 207.1** | 859.2** | 117.0** | 711.1** | 261.5** | 488.5** |
| 3–6 | 369.3** | 378.3** | 249.2** | 1301.4** | 239.4** | 973.8** | 592.8** | 611.8** |
| 4–5 | 173.5** | 128.1** | 131.8** | 453.9** | 86.9** | 428.7** | 170.7** | 237.2** |
| 4–6 | 270.7** | 271.9** | 173.8** | 896.1** | 209.32** | 691.42** | 502.0** | 360.5** |
| 5–6 | 97.2** | 143.8** | 42.0 | 442.2** | 122.3** | 262.6** | 331.2** | 123.3 |
| Rankings | 1,2 > 3 > 4 > 5 > 6 | 1 > 2,3 > 4 > 5 > 6 | 1,2 > 3 > 4 > 5,6 | 1,2 > 3 > 4 > 5 > 6 | 1,2,3 > 4 > 5 > 6 | 1,2 > 3 > 4 > 5 > 6 | 1,2,3 > 4 > 5 > 6 | 1,2 > 3 > 4 > 5,6 |

** p < 0.01.
a 1, extremely low; 2, very low; 3, low; 4, high; 5, very high; 6, extremely high.
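Each cell in Table 5 is the difference in mean distance to a functional unit between two popularity groups, so six groups yield 15 pairwise comparisons per functional unit. The sketch below shows only that arithmetic, assuming absolute differences and invented distances; the significance test behind the ** markers is not reproduced, and the function name `pairwise_mean_differences` is hypothetical.

```python
from itertools import combinations
from statistics import mean

def pairwise_mean_differences(distances_by_group):
    """Absolute difference in mean distance for every pair of groups."""
    group_means = {g: mean(d) for g, d in distances_by_group.items()}
    return {
        (a, b): abs(group_means[a] - group_means[b])
        for a, b in combinations(sorted(group_means), 2)
    }

# Hypothetical distances to the nearest bank for three popularity groups
# (1 = extremely low, 2 = very low, 6 = extremely high).
distances = {
    1: [900.0, 1100.0, 1000.0],
    2: [950.0, 1050.0, 970.0],
    6: [550.0, 600.0, 650.0],
}
diffs = pairwise_mean_differences(distances)
for pair, diff in diffs.items():
    print(pair, round(diff, 1))

# Six groups give 15 unordered pairs, matching the 15 rows of Table 5.
n_pairs = len(list(combinations(range(1, 7), 2)))
```

A small difference between two groups, such as groups 1 and 2 in this toy data, corresponds to groups that share a rank in the "Rankings" row (for example, "1,2 > 3").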

media data from CRW into geographic science. Second, we described the spatial tendency for the locational associations between PUR and other functional units. Although we used Dianping.com to perform the analysis, the presented methodology can be extended to other types of social media platforms (e.g., Amazon and eBay) for wide geographic applications (e.g., consumer evaluation in e-commerce, hospitality, tourism, and the entertainment industry).

This study also has several limitations. First, the majority of Dianping.com users are young people, whose preferences may differ from those of middle-aged and older people. The problem of common method variance may also exist when using the review data. Second, we did not analyze the influence of the

Fig. 6. Distributions of the six categories of restaurants regarding their values in Component 1 (a), Component 2 (b) and Component 3 (c) within Hangzhou city, China: 1 (extremely low), 2 (very low), 3 (low), 4 (high), 5 (very high), 6 (extremely high).


restaurant typology. Further studies should compare the PUR of different restaurant types. Third, PCA was used to integrate the indices; other methods for index integration should be applied and compared. Last, the influential factors of PUR were not systematically analyzed. More effort should be devoted to quantifying the spatial determinants of PUR.

Acknowledgments

We would like to thank the anonymous reviewers for their helpful comments. This study is supported by the National 985 Project of Non-traditional Security at Huazhong University of Science and Technology (2010–2020) and Research on the Internet Virtual Social Risk Governance from the perspective of National Political Security (NSSF No. 11&ZD033).

References

Anselin, L. (1988). Spatial econometrics: Methods and models. Dordrecht, The Netherlands: Kluwer Academic Publishers.
Apparicio, P., Abdelmajid, M., Riva, M., & Shearmur, R. (2008). Comparing alternative approaches to measuring the geographical accessibility of urban health services: distance types and aggregation-error issues. International Journal of Health Geographics, 7, 7.
Cao, G., Wang, S., Hwang, M., Padmanabhan, A., Zhang, Z., & Soltani, K. (2015). A scalable framework for spatiotemporal analysis of location-based social media data. Computers, Environment and Urban Systems, 51, 70–82.
Char, Y., & Stow, C. A. (2015). Mining web-based data to assess public response to environmental events. Environmental Pollution, 198, 97–99.
Chen, X., & Yang, X. (2014). Does food environment influence food choices? A geographical analysis through “tweets”. Applied Geography, 51, 82–89.
Chunara, R., Andrews, J. R., & Brownstein, J. S. (2012). Social and news media enable estimation of epidemiological patterns early in the 2010 Haitian cholera outbreak. The American Journal of Tropical Medicine and Hygiene, 86, 39–45.
Croitoru, A., Wayant, N., Crooks, A., Radzikowski, J., & Stefanidis, A. (2014). Linking cyber and physical spaces through community detection and clustering in social media feeds. Computers, Environment and Urban Systems. http://dx.doi.org/10.1016/j.compenvurbsys.2014.11.002. Accepted manuscript.
Dock, J. P., Song, W., & Lu, J. (2014). Evaluation of dine-in restaurant location and competitiveness: applications of gravity modeling in Jefferson County, Kentucky. Applied Geography. http://dx.doi.org/10.1016/j.apgeog.2014.11.008. Accepted manuscript.
Duan, W., Gu, B., & Whinston, A. B. (2008). The dynamics of online word-of-mouth and product sales – An empirical investigation of the movie industry. Journal of Retailing, 84, 233–242.
Gerber, M. S. (2014). Predicting crime using twitter and kernel density estimation. Decision Support Systems, 61, 115–125.
Ghosh, D., & Guha, R. (2013). What are we ‘tweeting’ about obesity? mapping tweets with topic modeling and geographic information System. Cartography and Geographic Information Science, 40, 90–102.
Goodchild, M. F., & Glennon, J. A. (2010). Crowd sourcing geographic information for disaster response: a research frontier. International Journal of Digital Earth, 3, 231–241.
Hollis, C. (2011). IDC digital universe study: Big data is here, now what? (p. 2011). http://bit.ly/kouTgc.
Jang, S. M., & Hart, P. S. (2015). Polarized frames on “climate change” and “global warming” across countries and states: evidence from twitter big data. Global Environmental Change, 32, 11–17.
Kaplan, A. M., & Haenlein, M. (2010). Users of the world unite! the challenges and opportunities of social media. Business Horizons, 53, 59–68.
King, L. J., & Golledge, R. G. (1978). Cities, space, and behavior: The elements of urban geography. New Jersey: Prentice-Hall.
Kirilenko, A. P., Molodtsova, T., & Stepchenkov, S. O. (2015). People as sensors: mass media and local temperature influence climate change discussion on Twitter. Global Environmental Change, 30, 92–100.
Kounadi, O., Lampoltshammer, T. J., Groff, E., Sitko, I., & Leitner, M. (2015). Exploring twitter to analyze the public's reaction patterns to recently reported homicides in London. PLoS One, 10, e0121848.
Lamichhane, O. P., Warren, J., Puett, R., Porter, D. E., Bottai, M., Mayer-Davis, E. J., et al. (2013). Spatial patterning of supermarkets and fast food outlets with respect to neighborhood characteristics. Health & Place, 23, 157–164.
Lampoltshammer, T. J., Kounadi, O., Sitko, I., & Hawelka, B. (2014). Sensing the public's reaction to crime news using the ‘Links correspondence method’. Applied Geography, 52, 57–66.
Leslie, T. F., Frankenfeld, C. L., & Makara, M. A. (2012). The spatial food environment of the DC metropolitan area: clustering, co-location, and categorical differentiation. Applied Geography, 35, 300–307.
Lin, J., & Cromley, R. G. (2015). Evaluating geo-located twitter data as a control layer for areal interpolation of population. Applied Geography, 58, 41–47.
Litvin, S. W., Goldsmith, R. E., & Pan, B. (2008). Electronic word-of-mouth in hospitality and tourism management. Tourism Management, 29, 458–468.
Liu, C., Su, C., Gan, B., & Chou, S. (2014). Effective restaurant rating scale development and a mystery shopper evaluation approach. International Journal of Hospitality Management, 43, 53–64.
Lu, X., Ba, S., Huang, L., & Feng, Y. (2013). Promotional marketing or Word-of-Mouth? evidence from online restaurant reviews. Information Systems Research, 24, 596–612.
McGill, R., Tukey, J. W., & Larsen, W. A. (1978). Variations of box plots. The American Statistician, 32, 12–16.
Myint, S. W. (2008). An exploration of spatial dispersion, pattern, and association of socio-economic functional units in an urban system. Applied Geography, 28, 168–188.
Ogut, H., & Tas, B. K. O. (2012). The influence of internet customer reviews on online sales and prices in hotel industry. The Service Industries Journal, 32, 197–214.
Parasuraman, A., Zeithaml, V. A., & Malhotra, A. (2005). E-S-QUAL: a multiple-item scale for assessing electronic service quality. Journal of Service Research, 7, 213–233.
Park, S., & Nicolau, J. L. (2015). Asymmetric effects of online consumer reviews. Annals of Tourism Research, 50, 67–83.
Patel, D., & Jermacane, D. (2015). Social media in travel medicine: a review. Travel Medicine and Infectious Disease. http://dx.doi.org/10.1016/j.tmaid.2015.03.006. Accepted manuscript.
Shelton, T., Poorthuis, A., Graham, M., & Zook, M. (2014). Mapping the data shadows of Hurricane Sandy: uncovering the socio spatial dimensions of ‘big data’. Geoforum, 52, 167–179.
Silverman, B. W. (1998). Density estimation for statistics and data analysis (p. 48). London: Chapman & Hall/CRC.
Smith, A. (2013). Civic engagement in the digital age. Pew Research Center. http://www.pewinternet.org/2013/04/25/civic-engagement-in-the-digital-age/.
Sui, D., & Goodchild, M. F. (2011). The convergence of GIS and social media: challenges for GIScience. International Journal of Geographical Information Science, 25, 1737–1748.
Su, S., Jiang, Z., Zhang, Q., & Zhang, Y. (2011). Transformation of agricultural landscapes under rapid urbanization: a threat to sustainability in Hang-Jia-Hu region, China. Applied Geography, 31, 439–449.
Su, S., Wang, Y., Luo, F., Mai, G., & Pu, J. (2014). Peri-urban vegetated landscape pattern changes in relation to socioeconomic development. Ecological Indicators, 46, 477–486.
Su, S., Xiao, R., Xu, X., Zhang, Z., Mi, X., & Wu, X. (2013). Multi-scale spatial determinants of dissolved oxygen and nutrients in Qiantang River, China. Regional Environmental Change, 13, 77–89.
Talaga, P. (2010). Location, location, location: An econometric analysis of restaurant location selection in Connecticut (Thesis). Maryland, and New Jersey: The College of New Jersey.
Vermeulen, I. E., & Seegers, D. (2009). Tried and tested: the impact of online hotel reviews on consumer consideration. Tourism Management, 30, 123–127.
Widener, M. J., & Li, W. (2014). Using geolocated twitter data to monitor the prevalence of healthy and unhealthy food references across the US. Applied Geography, 54, 189–197.
Xiong, C., & Lv, Y. (2013). Social network service and social development in China. Studies in Communication Sciences, 13, 133–138.
Xu, C., Wong, D. W., & Yang, C. (2013). Evaluating the “geographical awareness” of individuals: an exploratory analysis of twitter data. Cartography and Geographic Information Science, 40, 103–115.
Yang, W., & Mu, L. (2015). GIS analysis of depression among twitter users. Applied Geography, 60, 217–223.
Ye, Q., Law, R., Gu, B., & Chen, W. (2011). The influence of user-generated content on traveler behavior: an empirical investigation on the effects of e-word-of-mouth to hotel online bookings. Computers in Human Behavior, 27, 634–639.
Zagat, S. (2007). New York city restaurants 2007. New York: Zagat Survey, LLC.
Zhang, Z., Ye, Q., Law, R., & Li, Y. (2010). The impact of e-word-of-mouth on the online popularity of restaurants: a comparison of consumer reviews and editor reviews. International Journal of Hospitality Management, 29, 694–700.
Zhang, Z., Zhang, Z., & Law, R. (2014). Positive and negative word of mouth about restaurants: exploring the asymmetric impact of the performance of attributes. Asia Pacific Journal of Tourism Research, 19, 162–180.