A testbed for evaluating network construction algorithms from GPS traces


Computers, Environment and Urban Systems 66 (2017) 96–109

Contents lists available at ScienceDirect

Computers, Environment and Urban Systems journal homepage: www.elsevier.com/locate/ceus

Mahdi Hashemi

School of Computing and Information, University of Pittsburgh, 135 North Bellefield Avenue, Pittsburgh, PA 15260, USA

Article info

Article history: Received 3 January 2017; received in revised form 3 June 2017; accepted 11 August 2017; available online xxxx.

Keywords: Network construction; Street network; Pedestrian network; GPS data; Spatial data mining; Artificial intelligence

Abstract

Developing algorithms that construct street and pedestrian networks from crowd-sourced GPS traces has been an ongoing research topic since inexpensive GPS receivers became widespread on mobile devices. Although the proposed algorithms are evaluated by their developers, the evaluation results cannot be used to compare their accuracy because: (a) different algorithms target different types of networks, some designed for complicated networks and others for simple ones, (b) the GPS traces used in different studies are not the same, some being more accurate and denser than others, and (c) the constructed networks are evaluated either qualitatively or with different quantitative metrics. The lack of a comprehensive testbed for evaluating network construction algorithms has made it difficult for authors, reviewers, and readers to assess the effectiveness of such algorithms. This study establishes a testbed for evaluating network construction algorithms containing three components: (a) street and pedestrian networks with different densities and complexities as the baseline, (b) collections of GPS traces with different accuracies and sampling rates to be used by algorithms to construct those networks, and (c) three quantitative metrics to indicate the completeness, precision, and topology correctness of the constructed network, in addition to the algorithm's time complexity, conventionally used to indicate its time performance. This testbed not only paves the way for comparing network construction algorithms but also allows researchers to focus on their algorithms rather than on collecting data for testing them or looking for ways to describe their accuracy. © 2017 Elsevier Ltd. All rights reserved.

http://dx.doi.org/10.1016/j.compenvurbsys.2017.08.003
0198-9715/© 2017 Elsevier Ltd. All rights reserved.

1. Introduction

Equipping mobile devices with GPS receivers has resulted in a tremendous amount of crowd-sourced vehicle and pedestrian GPS traces (Hashemi & Karimi, 2014, 2016a, 2016b, 2016c; Hashemi & Malek, 2012). This has created a new research field for developing algorithms to construct street and pedestrian networks from GPS traces (Ahmed, Karagiorgou, Pfoser, & Wenk, 2015). Although the proposed algorithms are evaluated by their developers, the evaluation results cannot be used to judge which algorithm is more effective because they are evaluated independently on different datasets.

There are three reasons why it is difficult to compare the effectiveness of network construction algorithms based only on the independently presented evaluation results. First, the underlying network in GPS traces is sometimes very complicated and dense with many close intersections (Wang et al., 2015) and sometimes very simple with distinct intersections (Blanke, Guldener, Feese, & Tröster, 2014; Xie, Philips, Veelaert, & Aghajan, 2014). It is more difficult to construct the former than the latter. Second, some studies use a large amount of GPS traces to construct the network (Xie et al., 2014) while others use much less (Kasemsuppakorn & Karimi, 2013). With more GPS traces, the algorithm is expected to construct a more complete and precise network (Kasemsuppakorn & Karimi, 2013; Zhang, Thiemann, & Sester, 2010). GPS traces differ not only in their quantity but also in their accuracy and sampling rate. More accurate GPS traces with low sampling rates (i.e., short intervals between consecutive points, e.g., 1 s) are more desirable for constructing networks. Third, a qualitative description of the constructed network (Bruntrup, Edelkamp, Jabbar, & Scholz, 2005) or overlaying it on a satellite image (Wu, Zhu, Ku, & Wang, 2013) is not enough to compare it with networks constructed by other algorithms. Although some researchers have used quantitative metrics to indicate the constructed network's accuracy (Wang et al., 2015), these metrics are different and measure various aspects of the constructed network. In addition, the running time of the algorithm, reported by some researchers (Kasemsuppakorn & Karimi, 2013), is useful for comparison only if the computing platform and datasets are the same.

According to Biagioni and Eriksson (2012), out of eleven network construction algorithms proposed by 2010, only one compared its results to others, and it was only a qualitative comparison. Most studies settled for a qualitative description of the constructed network and overlaying it on the ground truth network. Biagioni and Eriksson (2012) also highlighted the problem regarding the diversity of GPS traces and networks used in different studies for evaluation. In an attempt to offer a solution, they provided 118 h of shuttle GPS traces at the University of Illinois, Chicago campus as well as their underlying street network (http://www.cs.uic.edu/bin/view/Bits/Software). Although a step forward, this is not enough because: (a) it is not useful for algorithms constructing other types of networks such as pedestrian networks, (b) statistics about the accuracy of GPS traces, indicating what challenges they pose to algorithms, are not reported, and (c) a testbed must include networks with different complexities and GPS traces with different accuracies and sampling rates to assess the algorithm's performance under different circumstances.

In short, there is a need for a comprehensive and diverse baseline with more statistics about the accuracy of GPS traces and metrics for measuring the accuracy and performance of algorithms. This work addresses this problem by providing a testbed for evaluating network construction algorithms containing different networks and GPS trace datasets, three quantitative metrics (completeness, precision, and topology correctness) to evaluate the accuracy of algorithms, and one quantitative metric (time complexity) to measure the time performance of algorithms.

2. Background

In this work, as depicted in Fig. 1, we define:
• shape point as a point defining the geometry of segments,
• node as a shape point connected to exactly one or to more than two segments,
• segment as a polyline connecting two nodes without any intermediate node, and
• across-track GPS error as the shortest distance between a GPS point and the segment on which the object is moving.

Fig. 1. Shape point, node, segment, and across-track GPS error.
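The across-track GPS error defined above can be illustrated with a minimal sketch. It assumes planar (projected) coordinates in meters and represents a segment as an ordered list of shape points; the function names are hypothetical and are not taken from the testbed.

```python
import math

def point_to_edge_distance(p, a, b):
    """Shortest planar distance from point p to the straight edge a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:  # degenerate edge: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Parameter of the orthogonal projection of p, clamped to the edge.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def across_track_error(gps_point, segment):
    """Shortest distance between a GPS point and a polyline segment."""
    return min(point_to_edge_distance(gps_point, a, b)
               for a, b in zip(segment, segment[1:]))

# Hypothetical L-shaped segment with three shape points (meters).
segment = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
error_a = across_track_error((4.0, 3.0), segment)   # 3 m from the first edge
error_b = across_track_error((12.0, 5.0), segment)  # 2 m from the second edge
print(error_a, error_b)
```

Geographic coordinates (longitude, latitude) would first need to be projected to a metric coordinate system for the result to be in meters.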


Table 1 summarizes how different studies have described the constructed network's accuracy. Table 2 lists only those studies from Table 1 that have evaluated the accuracy of their constructed network quantitatively, along with their metrics and the associated shortcomings.

Table 2 reveals that the most popular metric for quantitatively evaluating the constructed network's completeness is the total length of segments in the constructed network falling in a buffer of the ground truth network's segments. The major problem with this approach is that it ignores those segments in the constructed network which represent a segment in the ground truth network but do not completely fall in this buffer. Not falling inside the buffer in this case is the result of imprecision, not incompleteness, and should not degrade the completeness metric. Besides, duplicate segments in the constructed network will falsely increase the completeness metric's value.

To address this problem, we find a ground truth match for every segment in the constructed network and define the completeness as the total length of matched segments in the ground truth network. The major differences between our completeness metric (discussed in Section 3.2.1) and others are: (a) we find a ground truth match for every constructed segment rather than using a buffer around the ground truth network and (b) we measure the total length of matched segments in the ground truth network rather than in the constructed network. This way, every segment in the constructed network has a ground truth match and none is left out, and duplicate segments in the constructed network, if any, are matched with a single segment in the ground truth network and thus do not affect the completeness metric (although they do affect the precision and topology correctness metrics). However, spurious segments in the constructed network (constructed segments that do not represent any segment in the ground truth network) remain a source of bias in our completeness metric.

The most common metric for evaluating the constructed network's precision, according to Table 2, is the average or root mean square of distances between the matching segments in the two networks. This is a reasonable approach and we keep it, with small modifications discussed in Section 3.2.2, to measure the constructed network's precision.

Table 1 Different network construction algorithms and their evaluation strategies.

| Algorithm | Overlaying the constructed network on a satellite image | Overlaying the constructed network on the ground truth network | Qualitative description such as good or very good | Qualitatively and visually comparing the shortest routes in the constructed network and the ground truth network | Quantitative metrics | Comparison to other algorithms |
| Bruntrup et al. (2005) | × | × | ✓ | × | × | × |
| Cao and Krumm (2009) | ✓ | × | ✓ | ✓ | × | × |
| Zhang et al. (2010) | × | × | ✓ | × | ✓ | × |
| Fathi and Krumm (2010) | × | ✓ | × | × | ✓ | × |
| Ahmed and Wenk (2012) | × | ✓ | ✓ | × | ✓ | × |
| Biagioni and Eriksson (2012) | × | ✓ | ✓ | × | ✓ | ✓ |
| Liu et al. (2012) | × | ✓ | ✓ | × | ✓ | ✓ |
| Liu et al. (2012) | × | ✓ | ✓ | × | ✓ | ✓ |
| Wu et al. (2013) | ✓ | × | ✓ | × | × | × |
| Kasemsuppakorn and Karimi (2013) | ✓ | × | ✓ | × | ✓ | × |
| Blanke et al. (2014) | × | ✓ | ✓ | × | ✓ | × |
| Qiu, Wang, and Wang (2014) | × | ✓ | ✓ | × | × | × |
| Xie et al. (2014) | × | ✓ | ✓ | × | × | ✓ |
| Wang et al. (2015) | ✓ | × | ✓ | ✓ | ✓ | × |


Table 2 Quantitative metrics for evaluating the constructed network's accuracy in different studies and their shortcomings.

Zhang et al. (2010)
Quantitative metrics: Number (and the percentage) of segments in the constructed network that are completely within 2, 5, and 7 m buffers of the ground truth network.
Major shortcomings: (a) This metric does not reflect the topology correctness of the constructed network. (b) The entire length of a segment must be inside the buffer to be considered as matched, or it will be entirely dismissed.

Fathi and Krumm (2010)
Quantitative metrics: (a) Percentage of nodes in the constructed network which are within 20 m of nodes in the ground truth network. (b) Average distance between the nodes in the constructed network and their counterparts in the ground truth network. The counterpart of a node in the constructed network is the one in the ground truth network within its 20 m buffer.
Major shortcomings: (a) These metrics do not reflect the topology correctness of the constructed network. (b) Fake nodes in the constructed network may spuriously increase the value of the first metric. (c) The approach to finding the counterpart of nodes might return no node or more than one node.

Ahmed and Wenk (2012)
Quantitative metrics: Number of nodes, number of segments, and length of good sections in G and GO (for more details refer to the original paper).
Major shortcomings: (a) These metrics do not reflect the topology correctness of the constructed network. (b) Fake nodes and segments in the constructed network may cause these metrics to overestimate the accuracy while imprecise nodes and segments in the constructed network may cause these metrics to underestimate the accuracy.

Biagioni and Eriksson (2012)
Quantitative metrics: Start from a random location on the ground truth network and put flags at constant distances while moving away from the start location in different directions until a maximum radius from the start location is reached or a previously followed segment is encountered. Do the same process on the constructed network with the same start location as in the ground truth network. Repeat the entire process many times while changing the start location randomly. Count the number of matching and unmatching flags between the ground truth network and the constructed network. This metric reflects the precision and topology correctness of the constructed network simultaneously.
Major shortcomings: (a) This metric does not reflect the completeness of the constructed network. (b) It is not mentioned how close two flags in the constructed network and the ground truth network should be to be considered a match. (c) There is no discussion on how many times the process of selecting a random start location and putting flags needs to be repeated. (d) If the process is repeated too many times, the overwhelming number of flags will result in many matches and consequently a large value for this metric which overestimates the topology correctness. If the process is repeated very few times, the low number of flags will result in very few matches and consequently a small value for this metric which underestimates the precision.

Liu et al. (2012)
Quantitative metrics: Total length of segments in the constructed network that are completely within a 20 m buffer of the ground truth network and whose direction differs by no more than 60° from the direction of their counterpart in the ground truth network.
Major shortcomings: (a) This metric does not reflect the topology correctness of the constructed network. (b) The entire length of a segment must be inside the buffer to be considered as matched or it will be entirely dismissed.

Liu et al. (2012)
Quantitative metrics: (a) Total length of segments in the constructed network that are matched with segments in the ground truth network. Matching happens if the Hausdorff distance between the two segments is less than 50 m. (b) Average distance between the matching segments in the constructed network and the ground truth network.
Major shortcomings: (a) These metrics do not reflect the topology correctness of the constructed network. (b) Hausdorff distance is the distance between two segments where they deviate most from each other, which will prevent many segments from being matched just because they are more than 50 m apart at one point. (c) Any mistakes in matching the segments to calculate the first metric will directly contaminate the second metric.

Kasemsuppakorn and Karimi (2013)
Quantitative metrics: (a) Total length of segments in the constructed network that are completely within a 1.83 m buffer of the ground truth network. (b) The root mean square of distances between the matching segments in the constructed network and the ground truth network. (c) Number of pairs of nodes that are connected, without intermediate nodes, in both constructed and ground truth networks divided by the number of pairs of nodes that are connected in the constructed network.
Major shortcomings: (a) The entire length of a segment must be inside the buffer to be considered as matched or it will be entirely dismissed. (b) Any mistakes in matching the segments to calculate the first metric will directly contaminate the second metric. (c) Finding out whether or not a pair of nodes are connected in both constructed and ground truth networks means we already know which pair of nodes in the ground truth network corresponds to each pair of nodes in the constructed network. There is no discussion of how this is known.

Blanke et al. (2014)
Quantitative metrics: Percentage of segments and nodes in the constructed network that are completely within a buffer of segments and nodes in the ground truth network.
Major shortcomings: (a) This metric does not reflect the topology correctness of the constructed network. (b) The entire length of a segment must be inside the buffer to be counted or it will be entirely dismissed.

Wang et al. (2015)
Quantitative metrics: (a) Number of correctly constructed nodes. (b) Average and median of distances between the correctly constructed nodes and their ground truth counterparts.
Major shortcomings: (a) These metrics do not reflect the topology correctness of the constructed network. (b) It is not mentioned how it is decided that a node in the constructed network is correct or not. (c) It is not mentioned how the ground truth counterpart of a node in the constructed network is known.

The two aforementioned metrics reflect the completeness and precision of the constructed network but are not informative of its topology correctness. Kasemsuppakorn and Karimi (2013) were the only ones who offered a sound topology correctness metric: the number of pairs of nodes that are directly (with no intermediate nodes) connected in both constructed and ground truth networks divided by the number of pairs of nodes that are connected in the constructed network. Although it is a valid metric for indicating the constructed network's topology correctness, it assumes that every node in the constructed network has a unique counterpart in the ground truth network and that this unique counterpart is somehow known. The source of this knowledge (which node in the ground truth network is the counterpart of which node in the constructed network) is not mentioned in their work, which leads to the speculation that it was obtained visually. We propose a topology correctness metric, in Section 3.2.3, which can be calculated automatically.

3. Testbed

The proposed testbed for network construction algorithms has three components: GPS trace datasets to be used by algorithms for constructing networks, ground truth networks for the areas covered by the GPS traces, and quantitative metrics to measure the constructed network's quality with regard to the ground truth network. Section 3.1 provides the GPS trace datasets and ground truth networks and Section 3.2 describes the quantitative metrics.


Table 3 GPS trace datasets and their characteristics.

| Code | Number of points | Average sampling rate (s) | Minimum sampling rate (s) | Average across-track GPS error (m) | Standard deviation of across-track GPS error (m) | Length (m) | Duration (s) | Average speed (m/s) | Source | Ground truth network |
| GPS1 | 4833 | 9 | 1 | 7.39 | 5.30 | 348,528 | 42,434 | 8.2 | OSM | NET1 |
| GPS2 | 4833 | 9 | 1 | 3.70 | 2.65 | 348,528 | 42,434 | 8.2 | OSM (manually improved accuracy) | NET1 |
| GPS3 | 2432 | 18 | 10 | 8.06 | 5.35 | 348,528 | 42,434 | 8.2 | OSM (manually increased sampling rate) | NET1 |
| GPS4 | 2432 | 18 | 10 | 4.03 | 2.68 | 348,528 | 42,434 | 8.2 | OSM (manually improved accuracy and increased sampling rate) | NET1 |
| GPS5 | 1170 | 40 | 30 | 8.37 | 5.39 | 348,528 | 42,434 | 8.2 | OSM (manually increased sampling rate) | NET1 |
| GPS6 | 1170 | 40 | 30 | 4.19 | 2.70 | 348,528 | 42,434 | 8.2 | OSM (manually improved accuracy and increased sampling rate) | NET1 |
| GPS7 | 7745 | 9 | 1 | 5.10 | 4.01 | 367,143 | 64,213 | 5.7 | OSM | NET2 |
| GPS8 | 7745 | 9 | 1 | 2.55 | 2.01 | 367,143 | 64,213 | 5.7 | OSM (manually improved accuracy) | NET2 |
| GPS9 | 3953 | 17 | 10 | 5.56 | 4.53 | 367,143 | 64,213 | 5.7 | OSM (manually increased sampling rate) | NET2 |
| GPS10 | 3953 | 17 | 10 | 2.78 | 2.27 | 367,143 | 64,213 | 5.7 | OSM (manually improved accuracy and increased sampling rate) | NET2 |
| GPS11 | 1866 | 38 | 30 | 5.96 | 4.97 | 367,143 | 64,213 | 5.7 | OSM (manually increased sampling rate) | NET2 |
| GPS12 | 1866 | 38 | 30 | 2.98 | 2.49 | 367,143 | 64,213 | 5.7 | OSM (manually improved accuracy and increased sampling rate) | NET2 |
| GPS13 | 4693 | 2 | 1 | 5.81 | 5.96 | 60,363 | 10,288 | 5.9 | GeoLife | NET3 |
| GPS14 | 4693 | 2 | 1 | 2.91 | 2.98 | 60,363 | 10,288 | 5.9 | GeoLife (manually improved accuracy) | NET3 |
| GPS15 | 1030 | 10 | 10 | 5.97 | 5.95 | 60,363 | 10,288 | 5.9 | GeoLife (manually increased sampling rate) | NET3 |
| GPS16 | 1030 | 10 | 10 | 2.99 | 2.98 | 60,363 | 10,288 | 5.9 | GeoLife (manually improved accuracy and increased sampling rate) | NET3 |
| GPS17 | 358 | 30 | 30 | 6.05 | 6.14 | 60,363 | 10,288 | 5.9 | GeoLife (manually increased sampling rate) | NET3 |
| GPS18 | 358 | 30 | 30 | 3.03 | 3.07 | 60,363 | 10,288 | 5.9 | GeoLife (manually improved accuracy and increased sampling rate) | NET3 |
| GPS19 | 24,314 | 2 | 1 | 4.57 | 3.85 | 689,204 | 57,441 | 12.0 | GeoLife | NET4 |
| GPS20 | 24,314 | 2 | 1 | 2.29 | 1.93 | 689,204 | 57,441 | 12.0 | GeoLife (manually improved accuracy) | NET4 |
| GPS21 | 5789 | 10 | 10 | 4.56 | 3.85 | 689,204 | 57,441 | 12.0 | GeoLife (manually increased sampling rate) | NET4 |
| GPS22 | 5789 | 10 | 10 | 2.28 | 1.93 | 689,204 | 57,441 | 12.0 | GeoLife (manually improved accuracy and increased sampling rate) | NET4 |
| GPS23 | 2016 | 30 | 30 | 4.52 | 3.79 | 689,204 | 57,441 | 12.0 | GeoLife (manually increased sampling rate) | NET4 |
| GPS24 | 2016 | 30 | 30 | 2.26 | 1.90 | 689,204 | 57,441 | 12.0 | GeoLife (manually improved accuracy and increased sampling rate) | NET4 |
| GPS25 | 9862 | 5 | 1 | 4.81 | 3.96 | 53,798 | 52,681 | 1.0 | Self-collected | NET5 |
| GPS26 | 9862 | 5 | 1 | 2.41 | 1.98 | 53,798 | 52,681 | 1.0 | Self-collected (manually improved accuracy) | NET5 |
| GPS27 | 9862 | 5 | 1 | 1.00 | 0.82 | 53,798 | 52,681 | 1.0 | Self-collected (manually improved accuracy) | NET5 |
| GPS28 | 4275 | 12 | 10 | 4.77 | 3.95 | 53,798 | 52,681 | 1.0 | Self-collected (manually increased sampling rate) | NET5 |
| GPS29 | 4275 | 12 | 10 | 2.39 | 1.98 | 53,798 | 52,681 | 1.0 | Self-collected (manually improved accuracy and increased sampling rate) | NET5 |
| GPS30 | 4275 | 12 | 10 | 1.00 | 0.83 | 53,798 | 52,681 | 1.0 | Self-collected (manually improved accuracy and increased sampling rate) | NET5 |
| GPS31 | 1620 | 33 | 30 | 4.71 | 3.97 | 53,798 | 52,681 | 1.0 | Self-collected (manually increased sampling rate) | NET5 |
| GPS32 | 1620 | 33 | 30 | 2.36 | 1.99 | 53,798 | 52,681 | 1.0 | Self-collected (manually improved accuracy and increased sampling rate) | NET5 |
| GPS33 | 1620 | 33 | 30 | 1.00 | 0.84 | 53,798 | 52,681 | 1.0 | Self-collected (manually improved accuracy and increased sampling rate) | NET5 |


Table 4 Ground truth networks and their characteristics.

| Code | $a_t$ | Total length (m) | Number of segments | Area of the convex hull of the network (km²) | Number of shape points (including nodes) | Number of nodes | Average number of shape points per km² | Average number of nodes per km² | Average degree of nodes | Type | Location |
| NET1 | 5.036005 | 32,245 | 47 | 19.348 | 771 | 42 | 40 | 2 | 2.238 | Suburban streets (simple) | Cary, NC, USA |
| NET2 | 12.975221 | 56,669 | 409 | 6.506 | 1585 | 355 | 244 | 55 | 2.276 | Suburban streets (complex) | Cary, NC, USA |
| NET3 | 5.423558 | 34,595 | 84 | 6.979 | 564 | 65 | 81 | 9 | 2.585 | Urban streets (simple) | Beijing, China |
| NET4 | 6.439557 | 73,380 | 98 | 28.411 | 1410 | 80 | 50 | 3 | 2.450 | Urban streets (complex) | Beijing, China |
| NET5 | 8.056071 | 11,695 | 243 | 0.297 | 216 | 151 | 727 | 508 | 3.205 | Pedestrian | Pittsburgh, PA, USA |

3.1. GPS traces and ground truth networks

This section provides collections of GPS traces with different quantities and qualities tracked over networks with different complexities and densities. There are 33 GPS trace datasets in Table 3, each with a different accuracy, sampling rate, and/or underlying network. Each row in Table 3 shows the characteristics of a GPS trace dataset which can be used to construct its underlying network, named in the last column of the table. The source of each GPS trace dataset is also given in this table. Our street GPS traces are from either Open Street Map (OSM; www.openstreetmap.org) or the GeoLife project (Zheng, Chen, Li, Xie, & Ma, 2010). However, because OSM and GeoLife do not provide sufficient GPS traces for pedestrian networks, we collected the GPS traces for the last 9 rows in Table 3 using GPS Kit, a mobile application on iPhone, while walking in the area. This diversity in complexity, density, sampling rate, accuracy, speed, type of the underlying network, and source helps to pose various challenges to network construction algorithms, which is more important to our testbed than the size of each dataset, despite the evident necessity of being representative. All GPS traces can be downloaded at https://drive.google.com/file/d/0Bw2ujsj7Jl3XTGxCVkszVWNmc0E/view?usp=sharing.

A ground truth network, the most important part of our testbed, provides the baseline for evaluating the constructed network's accuracy. Even if two algorithms use the same set of GPS traces to construct the same network, there is no way to decide which one has been more effective in the absence of a ground truth network. Table 4 provides more details about the ground truth networks named in the last column of Table 3. We digitized the ground truth networks using high-resolution satellite images, supplemented by field surveys only for the pedestrian network (NET5 in Table 4). Figs. 2 to 11 show five of the GPS trace datasets in Table 3 along with their underlying networks overlaid on satellite images.

Fig. 2. GPS trace dataset GPS1 in Table 3.

Fig. 3. Ground truth network NET1 in Table 4.


Fig. 4. GPS trace dataset GPS7 in Table 3.

All ground truth networks can be downloaded at https://drive.google.com/file/d/0Bw2ujsj7Jl3XUE5IS1ZUaHVIaDQ/view?usp=sharing. Each ground truth network is provided for download in two tables: an adjacency matrix and a network geometry table. Cell i,j in the adjacency matrix shows the Id of the segment connecting node i to node j if there is such a segment and 0 otherwise. The adjacency matrix (or graph) determines which nodes are connected to each other. In other words, it determines the network's topology as a graph. However, the adjacency matrix is not informative of whether the segment connecting two nodes is straight or curved on the ground. Therefore, we need a second source providing information about each segment's geometry. The network geometry table provides the coordinates of shape points on each segment Id, determining the exact geometry of that segment. The network geometry table has three columns: longitude, latitude, and segment Id. The first two columns show the coordinates of shape points on each segment. The shape points' task is merely to determine the shape or geometry of segments. The adjacency matrix along with the information about the segments' shape (provided in the geometry table) can fully represent a street/pedestrian network.
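As an illustration of how the two tables complement each other, the following sketch combines a toy adjacency matrix and geometry table into a single structure. The in-memory layout (nested lists and tuples), the function name `build_network`, and the toy coordinates are assumptions for illustration only; the file format of the downloadable tables is not specified here. The sketch also assumes an undirected network (symmetric matrix) and geometry rows listed in order along each segment.

```python
from collections import defaultdict

def build_network(adjacency, geometry_rows):
    """Return {segment_id: (node_i, node_j, [shape points])}.

    adjacency[i][j] holds the Id of the segment joining nodes i and j, or 0.
    geometry_rows is an iterable of (longitude, latitude, segment_id).
    """
    shapes = defaultdict(list)
    for lon, lat, seg_id in geometry_rows:
        shapes[seg_id].append((lon, lat))

    network = {}
    n = len(adjacency)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only: matrix assumed symmetric
            seg_id = adjacency[i][j]
            if seg_id != 0:
                network[seg_id] = (i, j, shapes[seg_id])
    return network

# Toy example: three nodes joined by segment 1 (straight) and segment 2 (bent).
adjacency = [
    [0, 1, 0],
    [1, 0, 2],
    [0, 2, 0],
]
geometry_rows = [
    (-78.80, 35.78, 1), (-78.79, 35.78, 1),
    (-78.79, 35.78, 2), (-78.79, 35.79, 2), (-78.78, 35.79, 2),
]
network = build_network(adjacency, geometry_rows)
for seg_id, (i, j, points) in sorted(network.items()):
    print(seg_id, i, j, len(points))
```

Here the adjacency matrix alone says that nodes 1 and 2 are connected, while the geometry table reveals that the connecting segment has an intermediate shape point and is therefore not straight.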

3.2. Quantitative metrics

Three quantitative metrics are proposed here for evaluating the constructed network's quality: completeness, precision, and topology correctness. These metrics cover different aspects of the constructed network, are independent of the network type (street or pedestrian), and can be calculated automatically. Fig. 12 schematically depicts a ground truth network, along with three constructed networks, each satisfying only two out of the three aforementioned metrics, to highlight the third metric's contribution to the constructed network's overall quality. We also suggest time complexity to measure the algorithm's time performance because it is independent of the computation platform and input data. However, we provide no further comments on time complexity as it is extensively elaborated in the foundational literature (Cormen, Leiserson, Rivest, & Stein, 2009; Sedgewick & Wayne, 2011).

3.2.1. Completeness

This metric shows what percentage of the ground truth network is constructed by the algorithm, regardless of the precision and topology correctness of the constructed network. Here is how this metric is calculated. Distance is used to match segments in the constructed network with segments in the ground truth network. In other words, each segment in the constructed network is matched with the closest segment in the ground truth network. The distance between a segment in the constructed network (e.g. segment A in Fig. 13) and a segment in the ground truth network (e.g. segment B in Fig. 13) is calculated as follows. The shortest distance from each shape point on segment A to segment B (shown by $d_i$ in Fig. 13) is measured and stored in an array. The size of the array is $n$, the number of shape points on segment A. The average of this array is taken as the distance between the two segments (shown by $d$ in Eq. 1).

$$ d = \frac{\sum_{i=1}^{n} d_i}{n} \qquad (1) $$

To find the shortest distance between a shape point on segment A (shown by s1 in Fig. 14) and segment B, first the closest shape point on segment B to s1 is found (p1 in Fig. 14). Then the shortest distance between s1 and each of the two straight lines extruded from p1 is calculated (shown by d1 and d2 in Fig. 14). The smaller of d1 and d2 is taken as the distance between s1 and segment B.

Each segment in the constructed network has a match in the ground truth network, but some segments in the ground truth network might not have been matched with any segment in the constructed network. We denote the total length of matched segments in the ground truth network by $l$ and the total length of segments in the ground truth network by $L$. Completeness is calculated using Eq. 2 and lies between 0 and 1, where greater values indicate more completely constructed networks. The pseudo-code for calculating the completeness is provided in Appendix A.

$$ \text{Completeness} = l / L \qquad (2) $$

The reason completeness is calculated based on the ground truth network rather than the constructed network is that, although more than one segment in the constructed network may be matched with a single segment in the ground truth network (Fig. 15), that single segment in the ground truth network is counted only once when calculating $l$.

Fig. 5. Ground truth network NET2 in Table 4.
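The matching procedure and Eqs. 1 and 2 can be sketched as follows. This is a minimal illustration, not the pseudo-code of Appendix A: it assumes planar coordinates in meters, segments given as lists of at least two shape points, and it interprets the "two straight lines extruded from p1" as the finite edges joining p1 to its neighbouring shape points. All function names are hypothetical.

```python
import math

def point_to_edge_distance(p, a, b):
    """Shortest planar distance from point p to the straight edge a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0.0:
        return math.hypot(px - ax, py - ay)
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def point_to_segment_distance(s, seg_b):
    """Find the closest shape point p1 on B, then test the edges leaving p1."""
    k = min(range(len(seg_b)), key=lambda i: math.dist(s, seg_b[i]))
    candidates = []
    if k > 0:
        candidates.append(point_to_edge_distance(s, seg_b[k - 1], seg_b[k]))
    if k < len(seg_b) - 1:
        candidates.append(point_to_edge_distance(s, seg_b[k], seg_b[k + 1]))
    return min(candidates)

def segment_distance(seg_a, seg_b):
    """Eq. 1: mean distance from the shape points of A to segment B."""
    return sum(point_to_segment_distance(s, seg_b) for s in seg_a) / len(seg_a)

def polyline_length(seg):
    return sum(math.dist(a, b) for a, b in zip(seg, seg[1:]))

def completeness_and_distances(constructed, ground_truth):
    """Eq. 2 plus the per-segment distances to the matched ground truth."""
    matched, distances = set(), []
    for seg_a in constructed:
        dists = [segment_distance(seg_a, seg_b) for seg_b in ground_truth]
        best = min(range(len(ground_truth)), key=dists.__getitem__)
        matched.add(best)  # a ground truth segment is counted only once
        distances.append(dists[best])
    l = sum(polyline_length(ground_truth[i]) for i in matched)
    L = sum(polyline_length(seg) for seg in ground_truth)
    return l / L, distances

# Toy networks: three 10 m ground truth segments; the first two constructed
# segments are duplicates of the same ground truth segment.
ground_truth = [[(0, 0), (10, 0)], [(10, 0), (10, 10)], [(0, 0), (0, 10)]]
constructed = [[(0, 1), (5, 1), (10, 1)], [(0, -1), (10, -1)], [(11, 0), (11, 10)]]
completeness, distances = completeness_and_distances(constructed, ground_truth)
print(completeness, distances)
```

In this toy case two of the three ground truth segments are matched, so the completeness is 2/3 even though three segments were constructed, which shows why duplicates do not inflate the metric. The returned per-segment distances are the values that Section 3.2.2 averages to obtain the precision.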

3.2.2. Precision The completeness metric indicates how complete the constructed network is, though it does not reflect how precisely the segments are

Fig. 6. GPS trace dataset GPS13 in Table 3.

M. Hashemi / Computers, Environment and Urban Systems 66 (2017) 96–109

103

Fig. 7. Ground truth network NET3 in Table 4.

constructed. The precision metric shows how close the constructed segments are to their ground truth counterparts, regardless of topological errors. From the previous step, we have the distance between each segment in the constructed network and its match in the ground truth network (shown by d in Eq. 1). The average of all these distances and their standard deviation are used to indicate how precisely the network is constructed. The pseudo-code for calculating the precision is provided in Appendix A. 3.2.3. Topology correctness An algorithm may construct a quite complete and precise network but with many topological errors (e.g. Fig. 16). Topological errors cause the constructed network to perform poorly in computations, such as wayfinding. We apply the Floyd's algorithm (Floyd, 1962) to measure the constructed network's topology correctness. Floyd's algorithm receives as input the adjacency matrix (An × n where n is the number of nodes) of a graph. The element i,j in the adjacency matrix is 0 if i = j, 1 if there

is a segment between nodes i and j, and ∞ otherwise. Floyd's algorithm returns an n × n matrix where the element i,j represents the length of the shortest path from node i to node j. The pseudo-code for Floyd's algorithm is provided in Appendix A. The adjacency matrices of the ground truth network and the constructed network (which are not necessarily the same size) are input separately to Floyd's algorithm. We denote the average of the non-diagonal elements of the matrix returned by Floyd's algorithm by a_t for the ground truth network and by a_c for the constructed network. The value of a_t for each ground truth network in our testbed is included in Table 4. Topology correctness is calculated as:

\[ \text{Topology correctness} = \frac{a_t}{a_c} \tag{3} \]
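As an illustration of Eq. (3), the following is a minimal Python sketch rather than the pseudo-code of Appendix A. It assumes an undirected network given as a list of node-index pairs; the function names (`adjacency`, `floyd`, `mean_off_diagonal`, `topology_correctness`) are hypothetical, and returning 0 whenever a_c is infinite follows the convention described in the text.

```python
import math

INF = math.inf


def adjacency(n, segments):
    """Build the n x n adjacency matrix: 0 on the diagonal, 1 for a
    segment between two nodes, infinity otherwise (undirected: assumption)."""
    adj = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for i, j in segments:
        adj[i][j] = adj[j][i] = 1
    return adj


def floyd(adj):
    """Floyd's algorithm: all-pairs shortest path lengths."""
    n = len(adj)
    dist = [row[:] for row in adj]  # copy so the input is not modified
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def mean_off_diagonal(dist):
    """Average of all non-diagonal elements (infinite if any pair is unreachable)."""
    n = len(dist)
    values = [dist[i][j] for i in range(n) for j in range(n) if i != j]
    return sum(values) / len(values)


def topology_correctness(truth_adj, built_adj):
    """Eq. (3): a_t / a_c, taken as 0 when the constructed network is unconnected."""
    a_t = mean_off_diagonal(floyd(truth_adj))
    a_c = mean_off_diagonal(floyd(built_adj))
    return 0.0 if math.isinf(a_c) else a_t / a_c


# Toy networks on four nodes (made-up data, not from the testbed).
path = adjacency(4, [(0, 1), (1, 2), (2, 3)])           # chain 0-1-2-3
cycle = adjacency(4, [(0, 1), (1, 2), (2, 3), (0, 3)])  # chain plus one extra segment
split = adjacency(4, [(0, 1), (2, 3)])                  # two isolated parts

print(topology_correctness(path, cycle))  # above 1: over-connected
print(topology_correctness(cycle, path))  # below 1: under-connected
print(topology_correctness(path, split))  # 0: unconnected
```

The triple loop runs in O(n³) time, which is acceptable for small networks; for large ones a library implementation of all-pairs shortest paths would likely be preferable.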

The pseudo-code for calculating the topology correctness is provided in Appendix A. The value of the topology correctness is between 0 and a_t. Topology correctness will be 0 when a_c has its maximum value

Fig. 8. GPS trace dataset GPS19 in Table 3.


Fig. 9. Ground truth network NET4 in Table 4.

which is ∞. This happens if the constructed network is unconnected (Fig. 17); an unconnected graph has one or more isolated parts. Topology correctness will be equal to a_t when a_c has its minimum value, which is 1. This happens when there is a segment between every pair of nodes in the constructed network. The desired value for the topology correctness is 1. A topology correctness of less than 1 means that nodes in the constructed network are under-connected compared to the ground truth network. In other words, nodes that are connected in reality are not connected in the constructed network (e.g. Fig. 16). A topology correctness of greater than 1

means that nodes in the constructed network are over-connected compared to the ground truth network. In other words, nodes that are not connected in reality are connected in the constructed network (e.g. Fig. 18).

4. Benchmarking the testbed

Kernel density estimation (KDE), a common approach for constructing networks from GPS traces (Chen & Cheng, 2008; Davies, Beresford, & Hopper, 2006), is used here to establish a simple comparison

Fig. 10. GPS trace dataset GPS25 in Table 3.


Fig. 11. Ground truth network NET5 in Table 4.

Fig. 12. Contribution of different accuracy metrics in the constructed network's overall quality.

benchmark for future network construction algorithms. In this method, a grid of cells is overlaid on the GPS points. A threshold on the frequency of GPS points falling in a circle around the center of each cell determines whether or not that cell is part of the network. The network is then constructed by connecting those cells that are part of the network. Despite its sensitivity to the threshold on the frequency of GPS points, the KDE method is popular for its stability against GPS error and its time performance (Biagioni & Eriksson, 2012). Table 5 reports the quantitative metrics for the constructed networks along with the settings used for the KDE method. Figs. 19–21 show the sensitivity of the network constructed from the GPS1 dataset to each of the three KDE method parameters. The topology correctness of the network constructed from the GPS1 dataset is zero in all cases (and thus not shown in Figs. 19–21), which highlights the KDE method's weakness in handling topological errors, such as undershoots and overshoots. On the other hand, high values of completeness (close to one) underscore the KDE method's strength in detecting most segments. In other words, while the KDE method constructs most segments, it fails to fill the fractures in the constructed segments and remedy the topological shortcomings. The constructed networks' precision is also poor (ranging from 7 to 23 m),

which is mainly because GPS points are replaced by square cells to construct the network.

5. Summary and future directions

The heterogeneity of datasets and inadequacy of metrics used to evaluate network construction algorithms create barriers in

Fig. 13. Calculating the distance between segment A in the constructed network and segment B in the ground truth network.


Fig. 14. Calculating the shortest distance between a shape point on segment A and segment B.

Fig. 15. The two dashed segments in the constructed network match with the single dashed segment in the ground truth network.

Fig. 16. A completely and precisely constructed network with topological errors highlighted in circles.

Fig. 17. The constructed network is unconnected and the topology correctness is 0.

Fig. 18. An example where the topology correctness of the constructed network is greater than 1.


Table 5. Quantitative metrics for the constructed network from different GPS trace datasets using the KDE method.

| Dataset | Cell size (m) | Search radius (m) | Threshold on the frequency of GPS points | Completeness | Precision, average (m) | Precision, STD (m) | Topology correctness |
|---|---|---|---|---|---|---|---|
| GPS1 | 5 | 50 | 1 | 0.87 | 19.54 | 19.31 | 0 |
| GPS2 | 5 | 50 | 1 | 0.87 | 19.54 | 19.31 | 0 |
| GPS3 | 5 | 60 | 1 | 0.83 | 17.10 | 17.35 | 0 |
| GPS4 | 5 | 60 | 1 | 0.83 | 18.01 | 19.09 | 0 |
| GPS5 | Failed to construct the network | | | | | | |
| GPS6 | Failed to construct the network | | | | | | |
| GPS7 | 5 | 30 | 1 | 0.83 | 15.96 | 13.03 | 0 |
| GPS8 | 5 | 30 | 1 | 0.84 | 15.22 | 12.17 | 0 |
| GPS9 | 5 | 30 | 1 | 0.76 | 16.24 | 11.67 | 0 |
| GPS10 | 5 | 30 | 1 | 0.77 | 15.51 | 11.20 | 0 |
| GPS11 | Failed to construct the network | | | | | | |
| GPS12 | Failed to construct the network | | | | | | |
| GPS13 | 5 | 15 | 1 | 0.67 | 12.04 | 17.29 | 0 |
| GPS14 | 5 | 15 | 1 | 0.65 | 19.45 | 55.47 | 0 |
| GPS15 | 5 | 45 | 1 | 0.57 | 24.82 | 50.06 | 0 |
| GPS16 | 5 | 45 | 1 | 0.57 | 24.82 | 50.06 | 0 |
| GPS17 | Failed to construct the network | | | | | | |
| GPS18 | Failed to construct the network | | | | | | |
| GPS19 | 5 | 15 | 1 | 0.65 | 17.17 | 35.15 | 1.1113 |
| GPS20 | 5 | 15 | 1 | 0.67 | 16.13 | 33.47 | 1.0551 |
| GPS21 | 5 | 25 | 1 | 0.33 | 61.31 | 94.84 | 2.1534 |
| GPS22 | 5 | 25 | 1 | 0.33 | 61.31 | 94.84 | 2.1534 |
| GPS23 | Failed to construct the network | | | | | | |
| GPS24 | Failed to construct the network | | | | | | |
| GPS25 | 2 | 5 | 2 | 0.46 | 11.29 | 8.08 | 1.7026 |
| GPS26 | 2 | 5 | 4 | 0.84 | 3.74 | 3.83 | 1.1195 |
| GPS27 | 1 | 4 | 4 | 0.98 | 1.10 | 1.06 | 1.0151 |
| GPS28 | 2 | 10 | 2 | 0.43 | 12.62 | 9.76 | 1.8370 |
| GPS29 | 2 | 8 | 2 | 0.62 | 6.32 | 6.33 | 1.3963 |
| GPS30 | 1 | 8 | 2 | 0.78 | 3.50 | 5.30 | 0 |
| GPS31 | 2 | 12 | 1 | 0.39 | 15.44 | 10.48 | 2.2995 |
| GPS32 | 2 | 12 | 1 | 0.39 | 15.44 | 10.48 | 2.2995 |
| GPS33 | 2 | 10 | 1 | 0.47 | 10.93 | 9.44 | 1.8561 |

Fig. 19. Precision and completeness of the constructed network from GPS1 dataset for different values of cell size (search radius = 50 m and threshold = 1).

comparing their effectiveness. Density and accuracy of GPS traces and complexity of their underlying networks play important roles in the quality of networks constructed by these algorithms. Additionally, a visual or verbal description of the constructed network's quality is fuzzy and subjective. In an attempt to remove these comparison barriers, this work provided a comprehensive testbed for evaluating such algorithms, containing: (a) street and pedestrian networks with different densities and complexities, (b) datasets of GPS traces with different accuracies and sampling rates on each network, and (c) quantitative metrics for evaluating the completeness, precision, and topology correctness of constructed networks. As future network construction algorithms adopt this testbed for evaluation, comparing them will become much easier and fairer. A future research direction is to apply this testbed to state-of-the-art network construction algorithms and set the bar for future algorithms.

Fig. 20. Precision and completeness of the constructed network from GPS1 dataset for different values of search radius (cell size = 5 m and threshold = 1).

Fig. 21. Precision and completeness of the constructed network from GPS1 dataset for different values of threshold (cell size = 5 m and search radius = 50 m).
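The three KDE parameters varied in Figs. 19–21 (cell size, search radius, and threshold) can be illustrated with a small Python sketch of the cell-selection step only. This is a hedged illustration, not the implementation used to produce Table 5: the function name `kde_network_cells`, the use of planar coordinates in meters, the grid origin at the minimum x and y of the points, and the choice of "count ≥ threshold" as the inclusion test are all assumptions, and the step that connects the selected cells into a network is omitted.

```python
def kde_network_cells(points, cell_size, search_radius, threshold):
    """Return the set of (column, row) grid cells considered part of the network.

    points: list of (x, y) GPS points in planar coordinates (meters, assumed).
    A cell is kept when at least `threshold` points fall within
    `search_radius` of the cell center (the >= comparison is an assumption).
    """
    if not points:
        return set()
    min_x = min(x for x, _ in points)
    max_x = max(x for x, _ in points)
    min_y = min(y for _, y in points)
    max_y = max(y for _, y in points)
    # Overlay a grid that covers the bounding box of the points.
    cols = int((max_x - min_x) // cell_size) + 1
    rows = int((max_y - min_y) // cell_size) + 1
    radius_sq = search_radius ** 2
    cells = set()
    for c in range(cols):
        for r in range(rows):
            cx = min_x + (c + 0.5) * cell_size  # cell center
            cy = min_y + (r + 0.5) * cell_size
            count = sum(
                1 for x, y in points
                if (x - cx) ** 2 + (y - cy) ** 2 <= radius_sq
            )
            if count >= threshold:
                cells.add((c, r))
    return cells


# Made-up points along a straight east-west path, 10 m apart.
trace = [(0.0, 0.0), (10.0, 0.0), (20.0, 0.0)]
print(sorted(kde_network_cells(trace, cell_size=5, search_radius=4, threshold=1)))
print(sorted(kde_network_cells(trace, cell_size=5, search_radius=4, threshold=2)))
```

With a threshold of 1 every cell along the path is selected, whereas raising the threshold to 2 leaves none, which gives a small-scale picture of the sensitivity to the threshold noted in Section 4. The brute-force point count is quadratic in practice; a spatial index would be the natural refinement for real datasets.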


Appendix A

References

Ahmed, M., Karagiorgou, S., Pfoser, D., & Wenk, C. (2015). Map construction algorithms. Map construction algorithms (pp. 1–14). Springer.
Ahmed, M., & Wenk, C. (2012). Constructing street networks from GPS trajectories. Algorithms–ESA (pp. 60–71). Springer.
Biagioni, J., & Eriksson, J. (2012). Inferring road maps from global positioning system traces. Transportation Research Record: Journal of the Transportation Research Board, 2291(1), 61–71.
Blanke, U., Guldener, R., Feese, S., & Tröster, G. (2014). Crowdsourced pedestrian map construction for short-term city-scale events. Proceedings of the First International Conference on IoT in Urban Space (pp. 25–31) (Rome, Italy).
Bruntrup, R., Edelkamp, S., Jabbar, S., & Scholz, B. (2005). Incremental map generation with GPS traces. Proceedings of the 8th international conference on intelligent transportation systems (pp. 574–579). Vienna, Austria: IEEE.
Cao, L., & Krumm, J. (2009). From GPS traces to a routable road map. Proceedings of the 17th ACM SIGSPATIAL international conference on advances in geographic information systems (pp. 3–12). Seattle, Washington: ACM.
Chen, C., & Cheng, Y. (2008). Roads digital map generation with multi-track GPS data. Proceedings of international workshop on education technology and training and international workshop on geoscience and remote sensing. Vol. 1. (pp. 508–511). Shanghai, China: IEEE.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., & Stein, C. (2009). Introduction to algorithms (3rd ed.). Cambridge, Massachusetts: MIT Press.
Davies, J. J., Beresford, A. R., & Hopper, A. (2006). Scalable, distributed, real-time map generation. IEEE Pervasive Computing, 5(4), 47–54.
Fathi, A., & Krumm, J. (2010). Detecting road intersections from GPS traces. Proceedings of the sixth international conference on geographic information science (pp. 56–69). Zurich: Springer.
Floyd, R. W. (1962). Algorithm 97: Shortest path. Communications of the ACM, 5(6), 345.
Hashemi, M., & Karimi, H. A. (2014). A critical review of real-time map-matching algorithms: Current issues and future directions. Computers, Environment and Urban Systems, 48, 153–165.
Hashemi, M., & Karimi, H. A. (2016a). A machine learning approach to improve the accuracy of GPS-based map-matching algorithms. Proceedings of the IEEE 17th international conference on information reuse and integration (pp. 77–86). Pittsburgh, PA: IEEE.
Hashemi, M., & Karimi, H. A. (2016b). A weight-based map-matching algorithm for vehicle navigation in complex urban networks. Journal of Intelligent Transportation Systems, 20(6), 573–590.
Hashemi, M., & Karimi, H. A. (2016c). Collaborative personalized multi-criteria wayfinding for wheelchair users in outdoors. Transactions in GIS. http://dx.doi.org/10.1111/tgis.12230.
Hashemi, M., & Malek, M. R. (2012). Protecting location privacy in mobile geoservices using fuzzy inference systems. Computers, Environment and Urban Systems, 36(4), 311–320.
Kasemsuppakorn, P., & Karimi, H. A. (2013). A pedestrian network construction algorithm based on multiple GPS traces. Transportation Research Part C: Emerging Technologies, 26, 285–300.
Liu, X., Biagioni, J., Eriksson, J., Wang, Y., Forman, G., & Zhu, Y. (2012). Mining large-scale, sparse GPS traces for map inference: Comparison of approaches. Proceedings of the 18th ACM SIGKDD international conference on knowledge discovery and data mining (pp. 669–677). Beijing, China: ACM.
Liu, X. Y., Wang, Y., Forman, G., Ni, L. M., Fang, Y., & Li, M. (2012). Road recognition using coarse-grained vehicular traces. (HP Labs).
Qiu, J., Wang, R., & Wang, X. (2014). Inferring road maps from sparsely-sampled GPS traces. Advances in artificial intelligence (pp. 339–344). Springer.
Sedgewick, R., & Wayne, K. (2011). Algorithms (4th ed.). Boston: Pearson.
Wang, J., Rui, X., Song, X., Tan, X., Wang, C., & Raghavan, V. (2015). A novel approach for generating routable road maps from vehicle GPS traces. International Journal of Geographical Information Science, 29(1), 69–91.
Wu, J., Zhu, Y., Ku, T., & Wang, L. (2013). Detecting road intersections from coarse-gained GPS traces based on clustering. Journal of Computers, 8(11), 2959–2965.
Xie, X., Philips, W., Veelaert, P., & Aghajan, H. (2014). Road network inference from GPS traces using DTW algorithm. Proceedings of the 17th International Conference on Intelligent Transportation Systems (ITSC) (pp. 906–911). Qingdao, China: IEEE.
Zhang, L., Thiemann, F., & Sester, M. (2010). Integration of GPS traces with road map. Proceedings of the second international workshop on computational transportation science (pp. 17–22). ACM.
Zheng, Y., Chen, Y., Li, Q., Xie, X., & Ma, W.-Y. (2010). Understanding transportation modes based on GPS data for web applications. ACM Transactions on the Web, 4(1), 1–36.