Transportation Research Part B 70 (2014) 65–89
Assessing partial observability in network sensor location problems
Francesco Viti a,⇑, Marco Rinaldi b, Francesco Corman c, Chris M.J. Tampère b
a University of Luxembourg – Research Unit of Engineering Science, Faculty of Science, Communication and Technology, 6 Rue Coudenhove-Kalergi, L-1359 Luxembourg, Luxembourg
b KU Leuven, Center for Industrial Management/Traffic & Infrastructure, Celestijnenlaan 300, B-3001 Heverlee, Belgium
c Delft University of Technology, Transport Engineering and Logistics – Dept. of Maritime and Transport Technology, Mekelweg 2, 2628 CD Delft, The Netherlands
Article info
Article history: Received 1 November 2013; received in revised form 7 August 2014; accepted 7 August 2014
Keywords: Network sensor location problem; Partial observability; Null space; Pivoting; Under-determinedness
Abstract
The quality of information on a network is crucial for different transportation planning and management applications. Problems focusing on where to strategically extract this information can be broadly subdivided into observability problems, which rely on the topological properties of the network, and flow-estimation problems, where (prior) information on observed flows is needed to identify optimal sensor locations. This paper contributes mainly to the first category: more specifically, it presents a new methodology and an intuitive metric able to quantify the quality of a solution in case of partial observability, i.e. when not all flow variables are observed or can be uniquely determined from the observed flows. This methodology is based on existing approaches that can efficiently find solutions for full observability (i.e., the set of sensors needed to make the system fully determined), and exploits only the algebraic relations between link, route and origin–destination flow variables to quantify the information contained in any arbitrary subset of these variables. Adopted within simple search algorithms, the new metric allows sensor locations to be selected efficiently when the number of available sensors is limited by, for example, budget constraints and is less than the number needed to guarantee full observability. The chosen positions are those that contain the largest information content on the whole network. This is an important contribution in this field, since even in small-sized networks the solution for full observability requires an exceedingly large number of sensors. The assessment of partial observability solutions, based on explicit route enumeration, allows one to categorize families of full observability solutions, and shows that these contain different information potential.
This way, it is possible to rank solutions requiring a lower number of sensors while containing the same information content. We tested this new methodology on toy networks, in order to analyse the properties of the metric and illustrate its logic, and on a real-sized network, in order to explain and test heuristic search algorithms for optimal sensor positioning. Analysis of partial observability solutions shows that the basic search algorithms succeed in finding the links that contain the largest amount of information in a network. © 2014 Elsevier Ltd. All rights reserved.
⇑ Corresponding author. Tel.: +352 4666445352; fax: +352 4666445200.
E-mail addresses: [email protected] (F. Viti), [email protected] (M. Rinaldi), [email protected] (F. Corman), [email protected] (C.M.J. Tampère).
http://dx.doi.org/10.1016/j.trb.2014.08.002
0191-2615/© 2014 Elsevier Ltd. All rights reserved.
1. Introduction
Information from traffic sensors (such as traffic flows, vehicle counts, travel times, etc.) is crucial for many transportation engineering applications, e.g. state estimation, model calibration, travellers’ information services and traffic management. It is however not possible to give general recommendations on where to extract this information in a network in order to obtain a reliable picture of the underlying traffic patterns. The quality of traffic information depends on how many and which types of sensors are installed, and on the complexity of the network layout (i.e. how each traffic variable is related to any other variable). Moreover, different weights and priorities may be attributed to the state variables, for example to account for different road classes, spatial coverage, higher-capacity or busier roads, etc.
The problem of where to strategically position traffic sensors in order to obtain the largest amount of information on a network is traditionally referred to as the network sensor location problem (NSLP). NSLPs can be divided into those focusing on the algebraic and topological properties of the network structure and connections (observability problems), and those relating observed traffic states (usually flows) to the ones derived using estimation techniques (flow-estimation problems). This paper focuses mainly on the first category, although extension to the second is possible, as will be detailed in Section 6.
In order to formally introduce the observability problem, we define the following terms, which will be used throughout this paper:
Definitions about observability: by observed flows we denote those variables which are measured by the installed sensors. When a variable is fully described by measured ones, we define it as observable, and as a consequence we talk about observable flows.
Conversely, if not all variables needed to derive a flow are measured, we talk about unobservable flows. A solution of an observability problem is therefore any set of observed flows such that all other, unmeasured variables are completely derivable. An observed variable is always observable. A system is fully observable if all its variables are observable.
In observability problems, the goal is to find the minimum set of observed link (or route, OD) flows that makes the system of equations describing all traffic states uniquely determined, i.e. such that all unobserved flows are nevertheless observable. Closely related to this more general goal is that of finding the tight upper bound of the minimum number of observed flows such that the system becomes fully observable while, at the same time, the observed flows are linearly independent, i.e. no redundant information is taken.
Existing approaches already provide simple and elegant exact solutions for full observability (e.g., Hu et al., 2009; Castillo et al., 2009; Ng, 2012), i.e. they allow one to find the set of sensors able to completely explain the system, but these solutions are characterized by a very large number of variables to be observed. Empirical analysis and analytical methods showed that up to 60–70% of the links in a network would have to be observed in order to attain full system observability (Hu et al., 2009; Ng, 2012; Castillo et al., 2013b), which is economically infeasible for any real-sized network. The focus of this paper is therefore to develop methodologies for partial observability, i.e. when not all variables are observed or observable.
The main contribution of this paper is the development and appraisal of a new metric to assess partial observability solutions, which considers the information that observed flows provide on the whole network, i.e. including the unobservable flows, and which lends itself to a natural interpretation.
The metric, which can be seen as a measure of the maximum error on all the unobserved flows, whether fully observable or not, relative to the set of observed ones, can be estimated and used in an optimization setup to identify the variables (whether link, route or OD) that minimize this error. This measure is then used to develop simple heuristic algorithms, which are shown to identify efficient solutions (i.e. minimizing a measure of information loss) of partial observability. These solutions are based on topological information only, i.e. they depend solely on the structure of the network.
This paper is organised as follows. Section 2 provides a more elaborate and formal mathematical description of the network sensor location problem and an extensive literature review, which focuses particularly on observability problems and on the existing metrics and location rules. Section 3 introduces the methodology, together with the derivation and analysis of the new metric. Section 4 introduces the local search heuristics, shows the properties of the introduced methodology on different toy networks and studies the interaction between the behaviours of different full and partial observability solutions, while Section 5 presents an application on a real-sized network and compares our metric with other metrics for assessing partial observability solutions. Section 6 sketches potential further applications and extensions, and finally Section 7 provides conclusions and future research directions.
2. Literature review
2.1. The basic linear system
In traffic networks, we generally distinguish three basic flow vectors: route flows h, origin–destination (OD) flows f and link flows v. The (static) relations between these flows are simply given by the following equations (see e.g. Cascetta, 2009):
\[
\sum_{r \in R} d_{lr} h_r = v_l \qquad \forall l \in L \tag{1}
\]
\[
\sum_{r \in R} q_{wr} h_r = f_w \qquad \forall w \in W \tag{2}
\]
with f_w the OD flow for an origin–destination pair w in the set of OD pairs W, h_r the route flow on a route r belonging to the route set R, and v_l the link flow on link l belonging to the link set L. The matrices d and q, consisting of the elements d_lr and q_wr, have values 0 or 1: d_lr = 1 if route r contains link l, and q_wr = 1 if route r connects OD pair w (0 otherwise).
In this paper we deal with the problem of locating a number of sensors, each corresponding to the observation of one state variable. In particular we focus on traffic counting devices, which are often used to estimate the link flow variables. Other sensors may be deployed to observe link flow variables (e.g. fixed cameras, toll gantries) as well as route and OD variables (e.g. scanning devices, tagged vehicles). One can refer to e.g. Viti et al. (2008), Castillo et al. (2008b) and Gentili and Mirchandani (2012b) for a discussion of how different sensors can be used to estimate each variable in Eqs. (1) and (2). In the paper, we consider observing a variable equivalent to locating a sensor, i.e. we do not specify which state estimation process is used to translate sensor information into traffic state. Furthermore, we focus on the static case, i.e. the state variables in Eqs. (1) and (2) are time-independent, but we argue that generalizing the results to the dynamic case is rather straightforward once the functional relationships of matrices d and q are specified (for instance with a dynamic network loading model, see e.g. Viti et al., 2008).
Alternative and (partly) analogous topological relations can be defined by introducing node-based relations instead of route-based ones, and by replacing Eqs. (1) and (2) with equivalent flow conservation equations at nodes. One can refer to e.g. Ng (2012) for the mathematical definition of node-based approaches.
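As a minimal illustration of Eqs. (1) and (2), the sketch below computes link and OD flows from given route flows as two matrix-vector products. The three-link, two-route network and the route flow values are hypothetical and serve only to show the mechanics; the function name flows_from_routes is ours.

```python
def flows_from_routes(d, q, h):
    """Apply Eqs. (1) and (2): link flows v = d h and OD flows f = q h."""
    v = [sum(d_lr * h_r for d_lr, h_r in zip(row, h)) for row in d]
    f = [sum(q_wr * h_r for q_wr, h_r in zip(row, h)) for row in q]
    return v, f


# Hypothetical network: 3 links, 2 routes, 1 OD pair.
d = [
    [1, 0],  # link 1 is used by route 1 only
    [0, 1],  # link 2 is used by route 2 only
    [1, 1],  # link 3 is shared by both routes
]
q = [[1, 1]]   # both routes connect the single OD pair
h = [30, 70]   # assumed route flows

v, f = flows_from_routes(d, q, h)
print(v, f)
```

The shared link carries the sum of both route flows, which equals the single OD flow in this example.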
The methodology developed in this paper applies to both modelling paradigms, and in general to any method seeking full observability solutions. For the sake of clarity and illustration, but without loss of generality, we will refer mainly to the route-based modelling approach defined by Eqs. (1) and (2). It is possible to rewrite Eqs. (1) and (2) in the following compact matrix form:
\[
\begin{bmatrix} v \\ f \end{bmatrix} = \begin{bmatrix} d \\ q \end{bmatrix} h \tag{3}
\]
We make use of the toy example in Fig. 1 to explain the basic ideas in this paper. The network in Fig. 1 has 5 links, 4 OD pairs {(A, C), (A, D), (B, C), (B, D)} and 4 routes, specified in the attached table. Looking at the example, even if all links in this simple network were equipped with flow sensors, route flows could not be determined uniquely, because the rank of the matrix d is 3 instead of 4 (see Fig. 2(a) in Section 2.2). In fact, at most three links contain linearly independent flow information, and these make the other two unobserved flows fully observable (e.g. link flows v1, v2 and v4 uniquely determine the flows v3 and v5). To determine the route flows without resorting to modelling or behavioural assumptions, it is therefore sufficient to complement link flow information with, for instance, information on the split rates at nodes, or with route or OD flow data. It can be easily verified, for instance, that adding information on just one route would make the system fully determined.
Following this reasoning, the approach described in this paper uses a full observability solution, in terms of variables, as a starting point, such as the one provided for the network in Fig. 1. This solution can be obtained through link flows, route flows and/or OD flows indifferently, if one regards the state variables in Eqs. (1) and (2) equally. As long as a matrix representing a full observability relationship between the available variables can be obtained, the partial observability metric developed in this study and the greedy heuristic approach can be fully applied, with no loss of generality.
The solution under-determinedness that typically characterises sensor locations not guaranteeing full observability, and the way it has been addressed, has shaped a large part of the research on the NSLP, and the various possibilities to refine the information given by Eqs. (1) and (2) have resulted in different problem types.
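The rank statements made for the toy network of Fig. 1 can be checked numerically. The sketch below is not the pivoting procedure of the cited works: it uses plain exact Gaussian elimination, with the matrix d transcribed from the route table of Fig. 1 (rows v1 to v5, columns h1 to h4). The helper names rank and independent_rows are ours.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def independent_rows(rows):
    """Scan rows in order and keep each one that increases the rank."""
    chosen = []
    for i, row in enumerate(rows):
        if rank([rows[j] for j in chosen] + [row]) > len(chosen):
            chosen.append(i)
    return chosen


# Link-route incidence matrix d of Fig. 1.
d = [
    [1, 0, 1, 0],  # v1: routes h1, h3
    [0, 1, 0, 1],  # v2: routes h2, h4
    [1, 1, 1, 1],  # v3: all routes
    [1, 1, 0, 0],  # v4: routes h1, h2
    [0, 0, 1, 1],  # v5: routes h3, h4
]

rank_links = rank(d)                       # all five links observed
basis = independent_rows(d)                # one set of independent links
rank_plus_route = rank(d + [[1, 0, 0, 0]]) # links plus a direct observation of h1
print(rank_links, basis, rank_plus_route)
```

Scanning the links in the order v1 to v5 returns the indices of v1, v2 and v4; a different scanning order would return a different, equally valid set of three independent links.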
These are described, analysed and classified in the remainder of this section.
2.2. Categorization of network sensor location problems
Gentili and Mirchandani (2012a,b) classified sensor location problems into two distinct categories: (a) observability (or coverage) problems and (b) flow-estimation problems. The first deals mainly with the problem of solution determinedness
routes    links
h1        {v1, v3, v4}
h2        {v2, v3, v4}
h3        {v1, v3, v5}
h4        {v2, v3, v5}
Fig. 1. Simple network with only one alternative link between the two inner nodes.
Fig. 2. Initial matrix relation and three iterations of the pivoting procedure on the network in Fig. 1.
of the linear system (3); the second deals with the relationship between sensor locations and the quality of a specific flow estimation problem relying on data from the sensors (e.g., OD estimation, state estimation). An analogous classification was proposed earlier by Hodgson (1990) for general supply chain network problems. More specifically, Hodgson identified set-covering solutions, i.e. those providing the minimum number and locations of sensors that guarantee full coverage, and maximal covering solutions (i.e. maximising the number of variables covered), which aim at finding the maximum flow captured by a specified number of locations, while choosing locations where the information obtained is independent from all other sensors. Solutions were thus those containing the smallest possible information redundancy, referred to, in his terms, as non-cannibalizing solutions. Both observability and flow-estimation problems look for solutions maximising coverage while minimising the number of sensors to install, which implicitly requires every sensor location to provide (linearly) independent information.
2.2.1. Observability problems
The literature on observability problems has a broad application domain beyond transportation networks (e.g., electrical networks, supply chain networks, telecommunications). Observability problems focus on the network structure, i.e. they exploit the basic relationships between link, route and OD variables expressed by the matrices d and q in Eq. (3). These relationships are often used in combination with (i) an a priori matrix relating link and OD flows, (ii) route choice probabilities given by some traffic assignment rules (e.g., to solve the flow-estimation NSLP using traffic counts only (Castillo et al., 2009)), (iii) route or OD-based data such as license plate recognition (Castillo et al., 2008b) and scanned links (Gentili and Mirchandani, 2005; Castillo et al., 2012, 2013a), or (iv) heuristic rules and metrics targeting specific applications (e.g. OD estimation), as described in detail in Section 2.2.2.
One approach to solve observability problems is to perform appropriate matrix manipulations, i.e. swapping or sorting rows or columns, while carefully maintaining the relations between the variables as in the original set of relations (3). This approach (which we will often call pivoting in this paper) was proposed by Castillo et al. (2000) and applied first to electrical and power networks (Castillo et al., 2006, 2007) and more recently to traffic networks (Castillo et al., 2008a, 2009). One of the main advantages of the pivoting procedure is that in any solution the observed variables are linearly independent. Clearly, multiple solutions exist, depending on the sequence in which rows and columns are swapped. A similar procedure was proposed by Hu et al. (2009), who focused on the conversion of the matrix d into its "reduced row echelon form" through the Gaussian elimination algorithm. One of the main contributions of Hu et al.
(2009) is the focus on the full observability problem for link flow inference, i.e. the problem of finding the minimum set of links to equip with sensors (basis links) such that all links in the network are observable. Sensitivity analysis using different toy networks suggested that an upper bound for the number of linearly independent links to be observed is likely to exist. Ng (2012) showed that the Gaussian elimination method can also be applied to link-node matrices, and that through this reformulation an analytical expression for the upper bound could be found: it equals the difference between the number of links m and the number of non-centroid nodes n. Recently, a graphical approach based on a spanning-tree method was proposed by He (2013) as an alternative to the pivoting and the Gaussian elimination methods. Similar approaches were proposed previously in power systems (Mori and Tsuzuki, 1995) and hydraulics (Rahal, 1995; Gupta and Prasad, 2000). By using this technique, He (2013) pointed out that the relation found by Ng (2012) does not consider all information contained in a network, as it neglects the dependency of routes through the OD relations. The analytical expression provided by Ng (2012) was also discussed by Castillo et al. (2013b), who argued that the minimum number of linearly independent links to be observed can be improved if a set of linearly independent paths (in terms of links) is identified first. This also obviates the full route enumeration requirement in link-path matrix-based
approaches. Nevertheless, the set of linearly independent paths is not unique, so an exact expression using this approach could not be provided yet. Moreover, Castillo et al. (2014) show that solutions from node-based approaches can only provide an upper bound for the minimum number of observed variables, while a lower number can be achieved if link-node information is complemented with information on (linearly independent) paths. This statement is confirmed in the case study presented in Section 5 of this paper.
2.2.1.1. Partial observability: definitions and metrics. The above methods provide elegant solutions in case of full system observability. However, these solutions may be economically impractical in real networks, as they may require an exceedingly large number of sensors to be placed. Problems addressing partial observability, i.e. where part of the information remains unknown, are more suited for realistic applications.
There is no generally accepted way of assessing partial observability. While it is rather straightforward to check whether a system is fully observable or not, it is hard to quantify the information missing on the unobservable flow variables. If a variable is not observable, there may still be some information provided by the observed and observable links: their observability may still limit the uncertainty on the unobservable flow values. Quantification of the missing information, or equivalently of the (maximum) error that can be made on the unobserved variables, can rather easily be provided by adding extra information or assumptions on the flows in the network. This is discussed in Section 2.2.2, as the problem then falls into the class of flow-estimation problems.
Gentili and Mirchandani (2012b) provided an intuitive definition of partial observability problems as to "find the (minimum) subset of state variables such that the system is partially observable at level h".
In this definition, the number of observable variables defines the level h. This definition is also used by e.g. Castillo et al. (2012) and He (2013) within the problem of "determining the observable links from a pre-specified set of installed sensors". To understand this definition, let us reconsider the example in Fig. 1. If we use for instance Castillo’s pivoting procedure and focus only on link-route relations (i.e. v = dh, specified in Fig. 2(a)), we select in sequence a number of observed links and, on the basis of the links observed, we may find that some of the unobserved links become observable. Fig. 2(c) shows for instance that link 3 becomes observable by simply observing links 1 and 2. The relation is indicated in bold-red in the figure, i.e. v3 = v1 + v2. Therefore this partial observability solution is of level h = 1. Full observability is then reached by observing v4, as v5 then also becomes observable (thus the full observability solution is reached in this case at level h = 2). For the details of the pivoting procedure one can refer to e.g. Castillo et al. (2008a).
An analogous partial observability definition was proposed by Castillo et al. (2012), where those h links that should, at least, be observable are specified a priori. The problem is thus reformulated as to "find the minimum number of observed links such that a pre-selected set of h link flows becomes observable". This means that any solution of this problem would be at least of level h according to Gentili and Mirchandani (2012b).
Another intuitive definition of partial observability is given by the problem of "determining the optimal locations for given number of sensors" (see e.g. Hu et al., 2009; Ng, 2012; He, 2013). Weights or priorities must in this case be pre-specified to indicate the different importance that some links can have with respect to the others (e.g., motorways that are likely to carry larger flows).
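The level h of the Fig. 1 example can be reproduced with a simple rank test, given here as a sketch rather than as the pivoting procedure itself: an unobserved link is observable exactly when its row of d is a linear combination of the observed rows, i.e. when appending it leaves the rank unchanged. The matrix d is transcribed from the route table of Fig. 1; the function names are ours.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def observable_links(d, observed):
    """Indices of unobserved links whose flow follows from the observed ones."""
    obs_rows = [d[i] for i in observed]
    base = rank(obs_rows)
    return [
        i for i in range(len(d))
        if i not in observed and rank(obs_rows + [d[i]]) == base
    ]


# Link-route incidence matrix d of Fig. 1 (rows v1..v5, columns h1..h4).
d = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

level_1 = observable_links(d, [0, 1])     # observe v1 and v2
level_2 = observable_links(d, [0, 1, 3])  # additionally observe v4
print(level_1, level_2)
```

Observing v1 and v2 makes only v3 observable (level h = 1), while adding v4 also makes v5 observable (level h = 2), in agreement with the example above.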
Looking at the example in Fig. 2, one may for instance set a high priority on observing link 3 instead of one of links 1 and 2, therefore yielding a different full observability solution than the one obtained from the pivoting in Fig. 2(d). One may also be interested in studying a system where one or more sensors are already installed, and to "find the minimum number of extra sensors in order to achieve full observability, which complements an existing set of sensors". This instance of the NSLP was solved using different algorithms by Hu et al. (2009), Castillo et al. (2009) and Ng (2013).
The above definitions of partial observability focus on finding the set of observed and observable variables satisfying certain criteria or properties. They do not however give any indication of what can still be said about the unobservable variables. One can easily observe, for instance, that in Fig. 2(c), the sum of all possible values of v4 and v5 must be equal to the sum of the measured flows v1 and v2. The solution space for v4 and v5 is therefore reduced with respect to a completely unmeasured scenario. We do not want to neglect this information, since it reduces the potential error in any estimation problem of the unobservable variables. It is therefore important to define a metric that is sensitive to a change in the solution space due to an added (or, inversely, a missing) sensor. A definition of partial observability which can take into account the partial information that observed flows give on the set of unobservable variables is proposed here:
Definitions about partial observability: We say that a (non-observed, non-observable) variable is partially observable when it is functionally related to at least one of the observed flows. In turn, by partial observability we mean any scenario where at least one of the observed flows of a full observability solution is no longer measured.
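The remark on v4 and v5 can be verified with the same kind of rank test. The sketch below, which is only an illustration and not the metric developed in this paper, checks that neither row of d is in the span of the observed rows v1 and v2, while their sum is. The matrix d is transcribed from the route table of Fig. 1; the helper in_span is ours.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def in_span(rows, target):
    """True if target is a linear combination of the given rows."""
    return rank(rows + [target]) == rank(rows)


# Rows of d for the Fig. 1 network (columns h1..h4).
v1, v2 = [1, 0, 1, 0], [0, 1, 0, 1]
v4, v5 = [1, 1, 0, 0], [0, 0, 1, 1]

observed = [v1, v2]
sum_45 = [a + b for a, b in zip(v4, v5)]

v4_observable = in_span(observed, v4)
v5_observable = in_span(observed, v5)
sum_observable = in_span(observed, sum_45)
print(v4_observable, v5_observable, sum_observable)
```

Since v4 + v5 = v1 + v2 and flows are non-negative, each of v4 and v5 is confined to the interval [0, v1 + v2] even though neither is observable on its own.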
An alternative, analogous definition is that a subset of the observable flows of a full observability solution becomes unobservable due to one or more missing measurements. Linking this definition to the one by Gentili and Mirchandani (2012b), a suitable metric that can tell something about the unobservable variables could help in assessing and ranking partial observability solutions having the same level h, which may be numerous. This was for instance the motivation in Viti and Corman (2012), where a new metric was introduced to quantify the maximum relative error in total network flow variability (Maximum Unobserved Variability, MUV), defined as the sum of the maximum estimation errors on the unobservable flows. Castillo et al. (2012) and
He (2013) proposed empirical metrics that comply with the above definition of partial observability; the first used a weighted sum of observed and unobserved variables, while the second proposed to manually assign weights to all variables before applying a graphical version of the Gaussian elimination method, and to use these to calculate a total weight value related to the unobservable variables, which needs to be minimised.
2.2.2. Flow estimation problems
The solutions found in observability problems are purely topological, i.e. they require only information on the network connectivity and (possibly) an explicit enumeration of routes. Flow-estimation problems need instead extra information or assumptions on the flows, since they aim at identifying a solution in relation to the expected/predicted traffic patterns, and the solution will therefore depend on the estimated (or observed) flows in the network. Traditionally, these problems have been formulated as the inverse of OD estimation problems from traffic counts, where sensor locations are given as input, and the measure of the reliability of information is determined in a final task of the procedure, i.e. it has been used as an a posteriori assessment measure (e.g., Lam and Lo, 1990; Yang et al., 1991). Sensor locations have also often been assessed in relation to travel time reliability (e.g., Chen et al., 2002; Li and Ouyang, 2011) or network variability studies (Viti et al., 2008; Viti and Corman, 2012). Flow-estimation NSLP formulations typically assume the availability of a prior OD matrix (e.g., Yang and Zhou, 1998), prior knowledge and reliability of the route flows (e.g., Wang et al., 2012), known turning proportions at nodes (e.g. Bianco et al., 2001), information on link usage proportions (e.g.
Gan et al., 2005) or a prior distribution of the demand matrix (Simonelli et al., 2012). These are used in combination with specific offline estimation methods, e.g., Generalized Least Squares (e.g., Yang and Zhou, 1998; Chung, 2001; Wang et al., 2012), or online methods, e.g., Kalman filtering (Fei et al., 2007), to identify the optimal locations maximizing the reliability of estimations, or to account for the effect of incidents (Fei et al., 2013).
2.2.2.1. Metrics, location rules and solution algorithms. Because flow-estimation NSLP solutions must often be found without any a priori information on the actual information that will be obtained from the installed sensors, a number of heuristic rules and practical algorithms have been introduced. Different metrics have been proposed to evaluate the expected quality of NSLP solutions. Among others, a well-known measure used in the NSLP literature is the Maximum Possible Relative Error (MPRE, Yang et al., 1991). In essence, the MPRE value represents the largest possible error that can be made when estimating OD flows, given the number and position of traffic counts (assumed error-free), and assuming a specific estimation method. Gan et al. (2005) extended the MPRE concept to account for the expectation value of the error, the expected relative error (ERE), which is calculated using Monte-Carlo simulations. Viti et al. (2008) derived an extension of MPRE that can be used to minimize the error on any type of flow (link, route, OD) and also on other performance measures such as speeds, densities or travel times. Their method is based on the estimation of correlations between traffic variables, also through Monte-Carlo simulations.
Conceptually different metrics were proposed in the literature. Chen et al. (2005) applied the Total Demand Scale (TDS) metric (Bierlaire, 2002), which is calculated as the difference between the maximum and minimum possible total OD demand estimates in a polyhedron constrained by traffic measurements.
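As a sketch only, the verbal description of the TDS can be written in the route-based notation of Eqs. (1) and (2). The formulation below is our assumption and not necessarily the one used in the cited works: L_o denotes a hypothetical set of observed links, \hat{v}_l the corresponding error-free measured flows, and route flows are taken as non-negative.

```latex
\[
\mathrm{TDS} \;=\; \max_{h \in \Omega} \sum_{w \in W} f_w \;-\; \min_{h \in \Omega} \sum_{w \in W} f_w ,
\qquad f_w = \sum_{r \in R} q_{wr} h_r ,
\]
\[
\Omega \;=\; \Big\{\, h \ge 0 \;:\; \sum_{r \in R} d_{lr} h_r = \hat{v}_l \quad \forall l \in L_o \,\Big\}
\]
```

Under these assumptions, observing v1 and v2 in the network of Fig. 1 gives TDS = 0, since every route uses exactly one of these two links and the total demand therefore equals v1 + v2.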
Gan et al. (2005), and later Yang et al. (2006), used information on shortest routes and on the geographical and spatial separation between sensors to identify optimal screenlines over the network. More recently, Zhou and List (2010) and Simonelli et al. (2012) introduced measures of OD matrix estimation variability which are found by using the trace of the covariance matrix of the posterior OD estimate. This is somewhat related to the online approaches based on Kalman filtering of Eisenman et al. (2006) and Fei et al. (2007). A more extensive discussion and comparison of heuristic methods and algorithms based on the rules discussed in this section can be found in Montoya-Zamora et al. (2011).
All performance indices introduced in this literature review can be expressed as deviations between actual and estimated OD demand, route or link flows, and they assume sensor locations to be an input parameter. However, the link flow observations are not available a priori. Alternatively, heuristic rules have been introduced. These heuristic evaluation rules try to indirectly capture the quality of the flow estimates before the sensors are actually installed. Therefore, they do not explicitly optimize an estimation function or a metric, but define (generally rather intuitive) guidelines. Inspired by the seminal works of Lam and Lo (1990) and Yang et al. (1991) on estimation reliability of OD flows using traffic counts, Yang and Zhou (1998) proposed four rules:
1. OD-coverage: the solution must guarantee that all OD pairs are observed (at least for a small portion). Therefore, for any OD pair there must be at least one sensor installed on any of the routes connecting the OD pair.
2. Maximum flow fraction: the portion of OD flow measured by a sensor w.r.t. all other OD pairs measured by the same sensor is maximized. Thus it is beneficial to give priority to locations containing the largest information distributed over fewer OD pairs.
3. Maximum flow intercepting: given a set of locations, sensors should be positioned to maximise the total flow captured. Those locations whose sum of flows from all sensors maximizes the total network flow captured are therefore more appealing.
4. Link independence: locations should be chosen such that the information extracted is linearly independent. This rule is meant to avoid that two or more sensors contain duplicate information.
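The two rules that need no prior flow information (OD-coverage and link independence) can be checked directly on the incidence matrices. The sketch below does so for the network of Fig. 1, reading the OD-coverage rule as requiring at least one sensor on at least one route of each OD pair. The assignment of routes to OD pairs in q is our assumption (one route per OD pair, with v1 leaving A, v2 leaving B, v4 entering C and v5 entering D), and the function names are ours.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def od_covered(d, q, sensors):
    """Rule 1: every OD pair has at least one route crossing a sensor link."""
    for od_row in q:
        routes = [r for r, used in enumerate(od_row) if used]
        if not any(d[l][r] for l in sensors for r in routes):
            return False
    return True


def links_independent(d, sensors):
    """Rule 4: the rows of d for the sensor links are linearly independent."""
    return rank([d[l] for l in sensors]) == len(sensors)


# Fig. 1 network: rows v1..v5, columns h1..h4.
d = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]
# Assumed route-OD incidence: rows (A,C), (A,D), (B,C), (B,D).
q = [
    [1, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
]

print(od_covered(d, q, [2]), od_covered(d, q, [0]))
print(links_independent(d, [0, 1, 2]), links_independent(d, [0, 1, 3]))
```

A single sensor on link 3 covers all OD pairs, whereas a sensor on link 1 alone does not; sensors on links 1, 2 and 3 violate link independence because v3 = v1 + v2.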
F. Viti et al. / Transportation Research Part B 70 (2014) 65–89
Yang and Zhou (1998) use these rules to obtain an efficient greedy algorithm seeking to minimize the MPRE metric. By assuming both the information from sensors and the route fractions to be known and error-free, the solution of the OD estimation for a specified number of sensors lies on the affine space defined by the corresponding subset of the linear equations in Eq. (3). The OD coverage rule must be satisfied to guarantee that the solution space is compact and that the MPRE metric takes only finite values. Note that, to apply the OD coverage rule, no prior information is needed on the OD flows, on the route fractions, or on the estimation method. While coverage and link independence do not necessarily depend on any prior information, the maximum flow fraction and maximum flow intercepting rules cannot be applied if no assumption on the traffic patterns is specified. Larsson et al. (2010), Cipriani et al. (2006) and Yang et al. (2006) introduced new rules to complement Yang and Zhou's rules:
1. Route covering/OD separation: for each OD pair, all the routes connecting this pair should be covered. This rule extends the OD covering rule by imposing coverage of each route connecting any OD pair.
2. Maximal OD demand fraction: the sum of the intercepted demand fractions captured by a sensor should be maximal. The fraction is computed as the ratio between the route flow on link a of the OD pair w and the total flow of the OD pair itself.
3. Maximal net OD flow captured: given a number of counting sensors, the best location is the one that captures the largest net OD trips, i.e. 'double' counting of OD flows measured by different link flows is not considered.
4. Maximal net route flow captured: given a number of counting sensors, the best location set is the one that captures the largest net route flows.
Gentili and Mirchandani (2012a) formulated analytically the eight rules listed above and tested them using synthetic examples, showing that no rule could always dominate the others under any scenario.
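The difference between total and net captured flow in the last two rules can be illustrated with a few lines of code. This is only a sketch: the routes, the route flows and the function names below are hypothetical values chosen for the example, not data from the cited studies.

```python
# Hypothetical data (assumed): each route is a list of links with a flow in veh/h.
ROUTES = {
    "r1": ([1, 3, 4], 300.0),
    "r2": ([1, 3, 5], 200.0),
    "r3": ([2, 3, 4], 100.0),
    "r4": ([2, 3, 5], 400.0),
}


def total_flow_captured(sensor_links):
    """Sum of the flows seen by each sensor; a route crossing two sensors counts twice."""
    return sum(
        flow
        for links, flow in ROUTES.values()
        for link in sensor_links
        if link in links
    )


def net_route_flow_captured(sensor_links):
    """Net captured flow: each route counts once if at least one sensor lies on it."""
    sensors = set(sensor_links)
    return sum(flow for links, flow in ROUTES.values() if sensors & set(links))
```

With sensors on links 1 and 4, route r1 is seen by both sensors, so the total captured flow exceeds the net captured flow by exactly the flow of r1.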
Different solution algorithms have been based on the above rules. Yim and Lam (1998) proposed assigning weights to the link/OD proportions using the results of a traffic assignment model and a prior OD matrix, and formulated a linear programming model to maximize both net and total captured flows. Bianco et al. (2001) developed a two-stage procedure that aims at determining the sensor locations that best estimate turning fractions. The objective is to maximize the OD coverage and reduce the estimation error on the turning flows. Chung (2001) adopted specific weights to extend the maximum flow intercepting rule and account for the information contained in prior OD matrices. A similar approach was proposed by Ehlert et al. (2006) for finding a second-best solution, given a set of existing sensors already installed on the network.
2.2.3. Concluding remarks on the state-of-the-art
Table 1 provides an overview of the relevant literature on the NSLP and specifies the main modelling characteristics, which have been discussed in this section. In the table, we specify alternative ways to distinguish NSLP approaches, following again Gentili and Mirchandani (2012a), by considering e.g. the sensor technology adopted (traffic counts, floating car data, cameras, etc.). In addition, a distinction can be made, for instance, on the metric adopted, or on the variables (whether link, route and/or OD flows are sought) and constraints (budget, maximum number of sensors to install, etc.) used.
To conclude this literature survey we offer a number of considerations. First, observability and flow-estimation problems, though clearly distinct concepts, are often intermixed in NSLP solution algorithms. Observability/coverage rules, such as the OD or route coverage rules or the link independence rule, normally form the basis of flow-estimation NSLP methods (e.g. they are important elements in the greedy algorithm developed by Yang and Zhou, 1998). Analysing the system observability helps to quantify the "level of under-determinedness" in Eqs. (1) and (2) before any a priori information or estimation method is included in the problem, and to identify those variables that contain original (linearly independent) information in the solution space. Once the observability of the system has been characterised, flow-estimation techniques, and in particular the location rules and metrics described in Section 2.2.2, are used to quantify the estimation error that can be made on the unobservable variables. Developing a methodology able to assess in general the quality of partial solutions, without requiring extra information or assumptions on the flows, is, on the other hand, desirable in several respects.
First of all, starting from full observability solutions may not be practical, as already in relatively small-sized networks one needs to consider a large share of state variables in order to guarantee full network observability. However, it is reasonable to think that part of these links may not contain much information. Is it therefore possible, given a certain budget in terms of, e.g., the maximum number of sensors to install, to identify those containing the largest information content?¹ Or, analogously, what would be the number of sensors needed that would still guarantee an acceptable level of under-determinedness? Second, full observability solutions do not tell much about how much information is lost if one or more of the state variables in this solution is in the end not observed, or provides faulty information. Is it therefore possible to develop a metric able to quantify the gained or lost value of adding/removing a sensor? This is also related to a third research question: given the existence of multiple possible solutions guaranteeing full observability, which ones are the most robust in terms of sensor failure?
¹ We stress here that information is, in this paper, not intended as any traffic variable (flow, travel time, etc.) but is related to the level of under-determinedness of Eqs. (1) and (2). A formal definition of information is given later in Section 3.2.
Table 1. Summary and categorization of literature on the NSLP.
Paper | NSLP | Sensor type | Objective | Rules | Metric | Variables | Constraints
Yang and Zhou (1998)
Flow Traffic counts estimation + coverage
OD estimation reliability
OD coverage; # links selected
Flow estimation
Traffic counts
OD estimation reliability
Maximum possible relative error (MPRE) % captured flows
Error bounds
Yim and Lam (1998)
OD flows
# sensors
Bianco et al. (2001)
Flow estimation
Traffic counts
Maximize flow monitored
Coverage; independence; capturing; intercepting OD coverage; maximum flow intercepting Flow coverage
Turning fractions
# nodes selected
Chen (2001)
Flow estimation
Traffic counts
OD flows
Budget
Gan et al. (2005)
Flow estimation
Cordon screen lines
Maximize OD flows covered Maximize captured flow
% net flow captured
# sensors; full coverage
Gentili (2002)
Coverage
Chen et al. (2005) Flow estimation
Link + node sensors Traffic counts
Gan et al. (2005)
Traffic counts
Total unmonitored link flows OD flow coverage % captured flows Total flow OD separation; captured maximum captured flow Maximum Matrix rank captured flows Maximum Total Demand captured flows Scale (TDS) OD covering rule; MPRE; TDS; OD separation Expected Relative Error (ERE) Path covering RMSE rule OD Flow capturing High-volume links
Flow estimation
Sensor type
Flow coverage OD estimation reliability OD estimation reliability
Coverage Gentili and Mirchandani (2005) Ehlert et al. (2006) Flow estimation
Active sensors
Travel time estimation
Traffic counts
Eisenman et al. (2006)
State estimation
Link sensors
Cipriani et al. (2006)
Flow estimation
Traffic counts
Second-best solutions Estimation and prediction quality State estimation
Yang et al. (2006) Flow estimation Castillo et al. (2008a)
Coverage
Castillo et al. (2008b)
Coverage
Viti et al. (2008)
State estimation
Mirchandani et al. Coverage (2009)
Flow fraction; OD and path net flow captured Cordon screen OD estimation OD coverage; OD lines reliability flow separation Link sensors Observability Link independence; flow coverage Link State estimation Link sensors + AVI independence; flow coverage State estimation Maximum flow; Link link sensors + travel independence times AVI Travel Flow coverage; time + reliability travel time coverage Link sensors Link flow Link coverage inference Loop OD estimation Flow coverage; detectors + AVI reliability flow separation
Hu et al. (2009)
Coverage
Zhou and List (2010)
Flow estimation
Li and Ouyang (2011)
Coverage
Link + node sensors
State estimation Flow & travel time coverage
Fei and Mahmassani (2011) Castillo et al. (2012)
Flow estimation
Link sensors
OD estimation
Coverage
Link sensors; scanned links
Link and path observability
Gentili & Mirchandani (2012a) Ng (2012)
Coverage
Scanned links
Path flow reconstruction
Coverage
Link sensors
Link flow inference
Flow intercepting; OD coverage Link independence; flow coverage Flow coverage
Link coverage
Link flows; path # sensors; full flows coverage OD flows # links selected Error bounds
# links selected
Travel times estimated
# sensors
Mean Squared Error (MSE) RMSE
OD flow estimated Travel times simulated
# sensors
MSE
Simulated flows (Assignment) OD flows
MPRE Matrix rank
Link-OD incidence matrix Trip matrix; path flows
Budget
# sensors
# of cordon lines; full OD coverage Full coverage
Root Mean square error (RMSE) Captured flows; Simulated Covariance flows (Monte Carlo) Vehicle miles % travel time covered; travel covered;% time variance variance Matrix rank # unmonitored links RMSE; entropy; Mean OD flows + error trace of variance posterior estimates RMSE Lagrange multipliers; link flows % flow covered # flow estimated
# sensors
Heuristic based Trip matrix; path flows on # links & flow info Matrix rank; Path flows RMSE
# traffic counts + scanned links # scanned links
Matrix rank
# sensors, % flow captured # tag readers installed Full coverage Budget
# sensors
Budget
# unmonitored Full coverage links
Table 1 (continued) Paper
NSLP
Sensor type
Objective
Simonelli et al. (2012)
Flow estimation
Traffic counts
OD estimation reliability
Viti and Corman (2012)
Flow estimation
Link sensors
Wang et al. (2012) Flow estimation
Traffic counts
OD estimation
Castillo et al. (2013a)
Coverage
Link sensors; scanned links
Link flow estimation
Fei et al. (2013)
Flow estimation
Link sensors
He (2013)
Coverage
Link sensors
Castillo et al. (2013b) Ng (2013)
Coverage
Link sensors
Coverage
Link sensors
Coverage
Link sensors
Castillo et al. (2014)
Rules
Minimum estimation variability Flow estimation Minimum variability estimation variability Minimum route flow uncertainty
Link independence; flow coverage OD coverage rule; Flow estimation + OD max flow captured coverage Link flow Minimum inference spanning trees Link flow Upper bound for inference full observability Link flow Upper bound for inference full observability Link flow Minimum inference number of links using l.i. paths
Metric
Variables
Constraints
Trace of posterior demand matrix Max Unobserved Variability (MUV) Uncertainty in route flow estimated Flow amount of information;
OD flows
# link counts; budget
Link flow variations
# sensors;% unobs. variability
OD flows
# sensors
Trip matrix; path flows
OD demand uncertainty
Incident scenarios
# sensors + scanned links # sensors; budget
Matrix rank
Link flows
Full observability
Minimum # of sensors Minimum # of sensors Minimum # of sensors
Link and path flows Link flows
# existing sensors
Link flows
Partial observability Full observability
Finally, are full observability solutions equally valuable, or is it possible, through some rules or criteria, to rank them against one another? Drawing inspiration from the metrics previously developed in flow-estimation problems, we introduce here a new metric that is able to quantify the marginal value of observing a certain state variable. This metric lends itself to an intuitive interpretation and can clearly distinguish between partial observability solutions. The adoption of this metric, in combination with any full observability solution, allows the development of efficient algorithms able to find efficient partial observability solutions on networks of any size. It also shows that not all full observability solutions are the same, and therefore that one has to be selected carefully if it is needed, for example, to solve flow-estimation problems.
3. Methodology
In this section we describe the method developed to assess partial observability solutions. To understand the properties of the methodology we start from full observability solutions obtained through Castillo's pivoting approach, as we did in Section 2.2.1. Later, in Section 3.2, we introduce the metric we propose for assessing partial observability solutions. We then illustrate the logic of the metric and show its properties in Section 3.3 through a numerical example.
3.1. The pivoting process
Eq. (3) takes all link-route-OD relations into account; it therefore also contains duplicate information, which is reflected by the presence of linearly dependent rows. The largest number of linearly independent relations in (3) is the matrix rank. The linear relation in (3) can be further described by a basis (of size equal to the rank), which can then be used to represent any solution of Eq. (3) through a suitable linear transformation. These basic linear algebra concepts are at the core of solution algorithms for full observability such as the pivoting procedure proposed by Castillo et al. (2000, 2008a). A pivoting procedure enables one to formulate link and route variables as linear combinations of a subset of independent variables, by suitably swapping the elements of the link-path-OD matrix $(d, q)^T$. For the sake of illustration, we will from now on drop all relations involving OD flows f, i.e. we only consider the link-route matrix d:
$$ v = d\,h \qquad (4) $$
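As a concrete instance of Eq. (4), the sketch below evaluates the link flows of a small network from its route flows. The topology is an assumption made for illustration (links 1 and 2 feed link 3, which feeds links 4 and 5); it is chosen to be consistent with the relations discussed for the toy network but may differ in detail from Fig. 1. The route flows and the function name are hypothetical.

```python
# Assumed link-route incidence matrix d: rows = links 1..5,
# columns = routes (1-3-4, 1-3-5, 2-3-4, 2-3-5).
d = [
    [1, 1, 0, 0],  # link 1
    [0, 0, 1, 1],  # link 2
    [1, 1, 1, 1],  # link 3
    [1, 0, 1, 0],  # link 4
    [0, 1, 0, 1],  # link 5
]


def link_flows(d, h):
    """Eq. (4): each link flow is the sum of the flows of the routes using that link."""
    return [sum(d_ar * h_r for d_ar, h_r in zip(row, h)) for row in d]


h = [300, 200, 100, 400]  # hypothetical route flows
v = link_flows(d, h)      # link flows v1..v5

# Rows 3 and 5 of d are linear combinations of rows 1, 2 and 4,
# so the corresponding link flows carry no additional information.
assert v[2] == v[0] + v[1]
assert v[4] == v[0] + v[1] - v[3]
```

The two assertions make the duplicate information in d explicit: only three of the five link flows are linearly independent in this assumed network.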
An extension of the following relations to account for the full expression (3) is straightforward. By manipulating rows and columns as in the example in Fig. 2, we can put link flows in direct relation with other link flows, as well as with route flows, while respecting the fundamental relations expressed by the matrix d. A solution of full observability for the link inference problem is obtained when link flows are functions of other link flows only, as was the case for the expression in Fig. 2(d). We now introduce matrix P in the example as the solution to the link inference problem, i.e. it represents the selection of observed state variables (link flows) that allows for full observability:
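$$ P:\quad \begin{pmatrix} v_3 \\ v_5 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 1 & -1 \end{pmatrix} \begin{pmatrix} v_1 \\ v_2 \\ v_4 \end{pmatrix} \qquad (5) $$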
where, by construction, we separated the linearly independent link flows on the right-hand side from the linearly dependent ones on the left-hand side. We stress here that matrix P is a full observability solution that can be derived by any method in the literature (e.g., Castillo et al., 2008b; Hu et al., 2009; Ng, 2012; He, 2013). Therefore, in general, we can write for any solution of full observability that:
$$ v_{dep} = P\, v_{indep} \qquad (6) $$
where by $v_{dep}$ we denote a vector of linearly dependent link flows (or route and OD flows in case the full Eq. (3) is used) and by $v_{indep}$ the linearly independent flows. Eq. (6) suggests that if all elements in $v_{indep}$ are observed, then all elements in $v_{dep}$ are uniquely determined and the system is fully observable. In general, links (and routes and OD variables in the complete case) can be among both the dependent and the independent elements. We therefore expand rows and columns of (6) to keep all link flow variables on both sides of the equation:
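$$ \begin{pmatrix} v_{indep} \\ v_{dep} \end{pmatrix} = \begin{pmatrix} I & 0 \\ P & 0 \end{pmatrix} \begin{pmatrix} v_{indep} \\ v_{dep} \end{pmatrix} = X \begin{pmatrix} v_{indep} \\ v_{dep} \end{pmatrix} \qquad (7) $$
For readability's sake, we will from now on denote the augmented matrix $\begin{pmatrix} I & 0 \\ P & 0 \end{pmatrix}$ as X.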
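The chain from the link-route matrix d to P in Eq. (6) and to the augmented matrix X in Eq. (7) can be sketched as follows. This is a hedged illustration, not the pivoting rules of Castillo et al.: it uses ordinary Gauss-Jordan elimination on the transpose of d with exact rational arithmetic, and the bow-tie topology, the route flows and the names `pivot` and `augmented_matrix` are assumptions introduced for the example.

```python
from fractions import Fraction

# Assumed toy network: rows = links 1..5, columns = routes
# (1-3-4, 1-3-5, 2-3-4, 2-3-5). The topology is an assumption.
d = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]


def pivot(d):
    """Gauss-Jordan elimination on d^T.

    Returns (independent, dependent, P) with 0-based link indices such that
    v[dependent] = P v[independent] for any link flows v = d h.
    """
    n_links, n_routes = len(d), len(d[0])
    # Transpose: one row per route, one column per link.
    m = [[Fraction(d[a][r]) for a in range(n_links)] for r in range(n_routes)]
    independent = []
    row = 0
    for col in range(n_links):
        if row == n_routes:
            break
        p = next((i for i in range(row, n_routes) if m[i][col] != 0), None)
        if p is None:
            continue  # this link is a combination of earlier pivot links
        m[row], m[p] = m[p], m[row]
        scale = m[row][col]
        m[row] = [x / scale for x in m[row]]
        for i in range(n_routes):
            if i != row and m[i][col] != 0:
                factor = m[i][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[row])]
        independent.append(col)
        row += 1
    dependent = [a for a in range(n_links) if a not in independent]
    # In reduced form, a non-pivot column holds the coefficients that express
    # that link as a combination of the pivot (independent) links.
    P = [[m[k][a] for k in range(len(independent))] for a in dependent]
    return independent, dependent, P


def augmented_matrix(P):
    """Build X = [[I, 0], [P, 0]] of Eq. (7), ordered as (v_indep, v_dep)."""
    n_dep, n_ind = len(P), len(P[0])
    n = n_ind + n_dep
    X = [[0] * n for _ in range(n)]
    for i in range(n_ind):
        X[i][i] = 1
    for i, row in enumerate(P):
        X[n_ind + i][:n_ind] = list(row)
    return X


independent, dependent, P = pivot(d)
X = augmented_matrix(P)

# Check Eq. (7) on hypothetical route flows.
h = [300, 200, 100, 400]
v = [sum(a * b for a, b in zip(row, h)) for row in d]
stacked = [v[a] for a in independent + dependent]
assert [sum(x * s for x, s in zip(row, stacked)) for row in X] == stacked
```

Under these assumptions the elimination selects links 1, 2 and 4 as independent and expresses links 3 and 5 through them. Processing the links in a different order would generally return a different, equally valid, full observability solution.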
To illustrate the procedure we refer again to the toy network in Fig. 1. Fig. 3 illustrates one possible solution for Eq. (7), found by expanding the relations in (5). The selected dependent and independent relations are again indicated in bold for the sake of illustration. Once more, the extension of the solution by adding also route and OD variables is straightforward. It can be easily verified that the link independence rule, specified in the previous section, is satisfied by any solution of the pivoting procedure. Any set of links (or routes, or ODs) chosen from the columns of a pivoted matrix is by construction made of linearly independent variables. The pivoting procedure may have many different solutions in the form of Eq. (6) depending on the variables chosen for performing the pivoting operations, as also explained in, e.g., Castillo et al. (2012). For example, another solution of full observability is obtained by observing link 3 in the example in Fig. 1, together with any pair of links taken on its opposite sides. We will show in the next section that solutions containing link 3 or not containing it are very different, and because of symmetry in the example, they are the only two solutions showing topological differences. One of the aims of our paper is to find a metric that is able to distinguish these two ‘‘families’’ of solutions. In particular, we look for a way to rank solutions based on the information contained by each single observed variable. This would also be beneficial to identify partial observability solutions having the largest information content. In case of large-scale networks, exhaustive route enumeration is impractical, so if one uses a route-based approach, this would likely start from an incomplete set of relations between links and routes. Therefore, there might be different solutions with full observability that have different amounts of links measured, depending on the chosen explicit route enumeration. 
On the other hand, node-based approaches do not equally contain all information in the system, as He (2013) pointed out. In general, including more or fewer (OD-link or route-link) relations in the description of a network will yield full observability solutions with different numbers of measured variables.
3.2. Null space (NSP) metric
Full observability solutions are rather impractical in real-sized networks. In Viti and Corman (2012) it was observed that a full observability solution for link flow inference on the Sioux Falls network (76 links, 596 routes, 24 nodes, among which 14 OD pairs) requires 65 link sensors, which means that 85% of the links must be equipped with a sensor. Tests on larger networks, such as that of Rotterdam, the Netherlands (reprinted later in Fig. 7), show that one needs 281 sensors, i.e. 60% of all links, to guarantee full observability. These percentages are consistent with the analytical derivations of the upper bound for the minimum number of sensors provided by Hu et al. (2009), Ng (2012), He (2013) and Castillo et al. (2013b, 2014).
Fig. 3. Expanded matrix representation of full observability solution (5).
Partial observability solutions therefore become useful since, normally, the available budget of sensors is limited. To assess such solutions we need, however, to clarify the definition of information used in this paper, which is based on traditional information theory:
Definitions about Information: In linear equation systems, the solution space pertaining to a situation in which a subset of variables is not uniquely determined is an affine space spanned by the combination of the domains of the unknown variables. We define information acquisition as the process by which the size of this solution space is reduced, either by measuring some of these variables or by exploiting existing relations between them. The largest information is therefore obtained when measuring the set of variables that results in the highest possible reduction in solution space size, ultimately leading it to collapse into a single point, which corresponds to full information. This is the case for any full observability solution.
We therefore aim at developing a metric that has three main characteristics, which summarise the above observations:
1. It allows one to assess the quality of a partial observability solution (in relation to a full observability solution) if only a subset of the links of a full observability solution is taken; quality is measured on the basis of the above definition of information;
2. It quantifies the maximum relative error that can occur if not all sensors characterizing a full observability solution are installed and/or working; this error ranges between 0 (full observability) and 1 (no sensor in the network);
3. It can be used in an optimization framework to identify the sensor locations that potentially contain the largest information about the network, given a certain budget (e.g. maximum number of sensors). By selecting such locations it allows one to rank full observability solutions based on all their partial observability solutions.
Property 1 would allow us to indicate which solution of partial observability at level h, as defined by Gentili and Mirchandani (2012b), would also contain the largest amount of information on those links that are not fully observable. Property 2 would enable one to find the vector of links which, taken together, would minimize the uncertainty in transferring the information extracted from the sensors to the unobservable link variables. Moreover, it would enable one to calculate the utility of adding an extra sensor to the network, so it would also be applicable to find where to add extra sensors in scenarios where a set of sensors is already installed. Although full observability solutions may not be realistic in mid- or large-sized networks, they are likely to be a good starting point to identify efficient solutions for the partial observability problem, as we would limit our search space to matrices of reduced dimensions (as rank(P) ≤ rank(d)), where linearly dependent information has been removed. Moreover, if we focus on the link inference problem, any observable variable will not be a function of the routes, but only of the set of linearly independent links (as shown in the example in Fig. 3). Therefore, we focus on the Null space of the pivoted matrix P. This Null space is, by definition, the vector space of any $x \in \mathbb{R}^l$ (with l the number of links), such that
$$ \mathrm{Null}(P) \overset{\mathrm{def}}{=} \{\, x \in \mathbb{R}^l : P x = 0 \,\} \qquad (8) $$
i.e. the vector of variables x that result in a Null contribution towards the dependent variables, as well as to the independent and unobserved ones.² In other words, the Null space describes the degrees of freedom resulting from under-determinedness of the subspace of $\mathbb{R}^l$ of solutions that are possible given P, which means that in case of full observability there are no degrees of freedom. For all other cases, the solution space of x is clearly an uncertainty measure for the choice of not measuring one or more independent variables. The goal of the metric is to assign a single scalar value to the vector space defined by any possible x in the Null space. Based on elementary matrix algebra, we describe the Null space (itself a vector space) with a basis, i.e. a set of vectors that act as building blocks for all vectors in the vector space. Formally, the basis of a vector space (such as Null(P)) is a collection B of linearly independent vectors such that every vector in the space can be expressed as a linear combination (say by coefficients a) of the columns of B:
$$ x = aB, \quad \forall x \in \mathrm{Null}(P) \qquad (9) $$
Basic algebraic properties further state that
$$ P x = P\,aB = 0 \qquad (10) $$
We consider an orthonormal basis, in order to embed information with respect to how the Null space dimensions are oriented, while neglecting any influence of their extent. This property will be of key importance when dealing with partial observability, as will be shown later.
² The relationship between a given vector space and its Null space is described by the fundamental theorem of linear algebra (see for instance Strang, 1993): any matrix $A \in \mathbb{R}^{m \times n}$ induces four fundamental subspaces (image, Null space, co-image and left Null space), and orthogonal bases for these spaces can be extracted from the singular value decomposition of the original matrix.
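Eqs. (8)-(10) can be illustrated numerically. The sketch below builds an orthonormal basis of Null(P) as the orthogonal complement of the row space of P. It uses Gram-Schmidt orthogonalisation rather than the singular value decomposition mentioned in the footnote, purely to keep the example free of external dependencies. The matrix P is the toy example under the assumed reading $v_3 = v_1 + v_2$, $v_5 = v_1 + v_2 - v_4$, and the function names are hypothetical. Basis vectors are stored as a list, one vector per entry.

```python
import math


def gram_schmidt(vectors, start=(), tol=1e-10):
    """Return the new orthonormal directions that `vectors` add to `start`.

    `start` must already be an orthonormal list of vectors.
    """
    basis = [list(q) for q in start]
    added = []
    for v in vectors:
        w = [float(x) for x in v]
        for q in basis:
            dot = sum(a * b for a, b in zip(q, w))
            w = [a - dot * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > tol:  # skip directions already spanned
            w = [a / norm for a in w]
            basis.append(w)
            added.append(w)
    return added


def null_space(P):
    """Orthonormal basis of Null(P), the orthogonal complement of the row space."""
    n = len(P[0])
    row_basis = gram_schmidt(P)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    return gram_schmidt(identity, start=row_basis)


P = [[1, 1, 0], [1, 1, -1]]  # assumed toy matrix
B = null_space(P)

# Eq. (10): every basis vector (hence every x = aB) is mapped to zero by P.
for b in B:
    assert all(abs(sum(p * x for p, x in zip(row, b))) < 1e-12 for row in P)
```

Because the basis is orthonormal, each vector in B has unit length and carries only directional information, which is the property the text relies on later.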
Assume now that we do not observe part of the independent variables set, i.e. we no longer know their value. This leads to a set of variables (rows in the vector of variables) that are independent but unobserved, and a set of missing columns in the matrix explaining the true relationships between unobserved and dependent variables. We mark those missing columns and rows in red in the following equation:
(11)
where $I'$ and $I''$ are reduced identity matrices of suitable dimensions. Matrix P is considered to be structured as $[P' \; P'']$, where $P'$ corresponds to a reduced P matrix with some columns removed. We claim that this new matrix $P'$ can describe the information lost when some variable is unobserved. In fact, a Null space can be associated with $P'$ that by definition includes all variables that result in a null contribution towards the observed variables. Therefore, the Null space of $P'$ can describe quantitatively the error introduced by not observing all independent variables. We can compute a vector belonging to the Null space of $\begin{pmatrix} I' & 0 \\ P' & 0 \end{pmatrix}^T = X'^T$, and a basis $B'$ describing this Null space, similarly to Eqs. (8) and (9). We are interested in the basis of the Null space pertaining to the inverse relationship between the vector space spanned by the dependent variables and that spanned by the independent variables. As in general matrix $X'$ is not square and thus not invertible, we could employ the full pseudo-inverse matrix $X'^{+}$; from a practical point of view, the Null space of such a matrix is equivalent to that of the transpose matrix $X'^T$, which is computationally easier to determine. Every variable pertaining to the independent variables that can be described in the form $aB'$ will not be measurable by the chosen set of (independent and observed) and (dependent) variables, thus yielding a measurement error. Clearly, it can be stated that
$$ X'^T B' = 0 \qquad (12) $$
It is then interesting to relate a vector belonging to the Null space of $X'^T$ (i.e. described by the basis $B'$) to the original matrix X, through the product $X^T B'$. This represents the transformation of all vectors belonging to the Null space of $X'^T$ onto the space of the variables considered in the original full set of dependent and independent variables. Since the orthonormal basis $B'$ of the Null space is composed of vectors that hold no information other than the orientation of the space's dimensions, such a transformation will yield the extent of the error that affects the unmeasured variables (related to a full observability solution represented by P) due to those that are independent and not observed (represented by $B'$). We consider the Frobenius norm $\lVert X^T B' \rVert_F$ of the product $X^T B'$ to express the magnitude of the transformation from an unobserved variable into the measured variables. The Frobenius norm is described mathematically in Appendix A. By normalizing the above norm against the Frobenius norm of the initial matrix X, we can finally define the Null space relative to the matrix P (NSP) metric:
$$ \mathrm{NSP} = \frac{\lVert X^T B' \rVert_F}{\lVert X \rVert_F} \qquad (13) $$
This formula allows us to always obtain finite values when computing the metric without placing any constraint on the flow variables. In Appendix B we analyse the dimensions of the above matrices and demonstrate that the metric equals 1 when no variable is observed, therefore showing that the metric takes values within the interval [0,1]. Obvious extra constraints for (13) would be, for example, non-negativity and link capacity constraints. However, the values of the metric would, especially in the latter case, depend on the assumed capacities, which can be determined analytically or, for example, through heuristic procedures. In this paper we opt for the more generic case of unconstrained variables. For the same reason we do not use non-negativity constraints, as the metric can also be adopted to quantify the loss of information for flow variations rather than their absolute values (as was assumed in Viti and Corman, 2012). Moreover, excluding non-negativity allows for a more general setup and an immediate translation to relative values. Nevertheless, analysing the solutions under non-negativity constraints remains a very interesting topic for further research. Castillo et al. (2014) have discussed its implications, and how non-negativity constraints trace a polyhedral convex cone in the solution space rather than a linear subspace. In this respect, however, it would then be impossible to refer to the Null space and basis with respect to the full solution space, but one can still apply the concept of basis to the vectors that generate the convex cone, and to the Null space associated with them. A very interesting direction of research is then to assess whether the pseudo-inverse projection we employ as the fundamental operation in our metric could be extended to the convex cone case, but this goes beyond the scope of this work. We note that the adoption of the Frobenius norm has analogies with Zhou and List (2010) and Simonelli et al. (2012); both studies employed the trace of the a posteriori demand matrix as a metric for the variance of the estimation given a certain set of sensors. In our case, the trace is used as a scalar measure of the maximum extent of the Null space. We should point out that the adopted metric focuses on the norm of the matrices. Considering also the orientation (which is lost by applying the Frobenius norm) in the metric would be analogous to associating a higher cost or a lower preference on
having a particular variable unobserved. This information can be included back in the metric through a suitable weight matrix, as elaborated further in the sections about further extensions of the approach.
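A minimal numerical sketch of Eq. (13) is given below. It follows the construction in the text: the columns of X belonging to unobserved independent variables are removed to form $X'$, an orthonormal basis $B'$ of the Null space of $X'^T$ is computed, and the Frobenius norm of $X^T B'$ is normalised by that of X. Several elements are assumptions made for illustration: the toy matrix P (read as $v_3 = v_1 + v_2$, $v_5 = v_1 + v_2 - v_4$), the use of Gram-Schmidt instead of a singular value decomposition, and all function names.

```python
import math


def gram_schmidt(vectors, start=(), tol=1e-10):
    """Return the new orthonormal directions that `vectors` add to `start`."""
    basis = [list(q) for q in start]
    added = []
    for v in vectors:
        w = [float(x) for x in v]
        for q in basis:
            dot = sum(a * b for a, b in zip(q, w))
            w = [a - dot * b for a, b in zip(w, q)]
        norm = math.sqrt(sum(a * a for a in w))
        if norm > tol:
            w = [a / norm for a in w]
            basis.append(w)
            added.append(w)
    return added


def augmented_matrix(P):
    """X = [[I, 0], [P, 0]] of Eq. (7), ordered as (v_indep, v_dep)."""
    n_dep, n_ind = len(P), len(P[0])
    n = n_ind + n_dep
    X = [[0.0] * n for _ in range(n)]
    for i in range(n_ind):
        X[i][i] = 1.0
    for i, row in enumerate(P):
        X[n_ind + i][:n_ind] = [float(x) for x in row]
    return X


def nsp(P, observed):
    """NSP of Eq. (13); `observed` lists 0-based positions within v_indep."""
    X = augmented_matrix(P)
    n = len(X)
    columns = [[X[i][j] for i in range(n)] for j in range(n)]
    # X' keeps only the columns of observed independent variables
    # (the all-zero columns of the dependent variables add no constraint).
    kept = [columns[j] for j in observed]
    col_basis = gram_schmidt(kept)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    # B' spans the orthogonal complement of the columns of X', i.e. Null(X'^T).
    B = gram_schmidt(identity, start=col_basis)
    # Squared Frobenius norm of X^T B': all inner products (column of X, b).
    num = sum(sum(c_i * b_i for c_i, b_i in zip(c, b)) ** 2
              for c in columns for b in B)
    den = sum(x * x for row in X for x in row)
    return math.sqrt(num / den)


P = [[1, 1, 0], [1, 1, -1]]          # assumed toy matrix
nsp_full = nsp(P, [0, 1, 2])         # all independent links observed
nsp_none = nsp(P, [])                # no sensor installed
nsp_without_v4 = nsp(P, [0, 1])      # v4 not observed
```

Under these assumptions the metric vanishes (up to rounding) when all three independent links are observed, equals 1 when none is observed, and lies strictly in between when link 4 is dropped.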
3.2.1. Graphical explanation The NSP metric is, in general, dependent on the starting matrix P chosen. The metric defines the normalized error in measuring the network variables selected with matrix P; due to the normalization, the metric ranges from 0 to 1; a value of 0 is found only in a full observability case; a value of 1 means that the error is complete, and is found when no independent variable is being observed. Intermediate values indicate the largest error that can be done when estimating the unobserved link flow variables on the basis of the observed ones, in a normalized form. In the effort of providing an interpretation of our metric, we return again to the network in Fig. 1. We first give a graphical interpretation of the steps followed in the derivation of the metric, and later we show how the values of each partial observability solution may be interpreted as a measure of the ‘degree of under-determinedness’ in the system. Eq. (7) presented one possible full observability solution of the network in Fig. 1, namely observing the flow on links 1, 2 and 4. This means that in this solution, every triple (v1, v2, v4) uniquely determines a couple (v3, v5). Note that each full observability solution identifies a set of linearly independent variables, but not necessarily orthogonal. For sake of illustration, however, we represent the triple (v1, v2, v4) as orthogonal. Fig. 4(a) shows the mapping of a triple (v1, v2, v4) onto the solution space of all possible pairs (v3, v5). Recall that v3 and v5 are not supposed to be linearly independent, therefore in the illustration they are not represented as orthogonal. The dotted polyhedron on the left figure, represents any combination of the triple (v1, v2, v4). The points inside the dotted area are representing any value of the triple (v1, v2, v4) where each variable may range between 0 and 1. 
All these points map onto the solution space for the dependent variables (v3, v5), illustrated by the dotted region in the shape of a skewed rectangle in the right part of Fig. 4(a). Assume now that v4 is no longer observed: we must then represent the solution space related to a choice of (v1, v2) with respect to all triples of unobserved variables (v3, v4, v5). This is shown in Fig. 4(b). Looking at Eq. (5) we note that v3 is still fully described by v1 and v2 (therefore v3 and v4 should be orthogonal according to matrix P), v5 is only partially observable and unbounded, while v4 is fully unbounded. The space spanned by the triple (v3, v4, v5) for a given choice of (v1, v2) is now an unbounded prism; the shape and area of the cutting surface orthogonal to v4 depend on the surface (v3, v5), i.e. on the shape of the dotted skewed rectangle in the right part of Fig. 4(a). We also characterize this surface by an orthonormal basis (represented as a unit circle plus two orthogonal vectors, all lying on the surface), which in this particular case has two dimensions. In particular, we focus on its projection onto (v3, v5), which corresponds in the figure to the transformed (distorted) ellipse plus the transformed (distorted) vectors that lie on the plane (v3, v5). Finally, we linearly transform this basis back into the original solution space (v1, v2, v4) through the product X^T B′. This way, we are able to identify a basis of the solution space of the unobserved variables in the original coordinate system of linearly independent variables. In Fig. 4(c), at the bottom, an ellipsoid graphically representing the extent of the uncertainty within this basis is shown. The Frobenius norm can intuitively be considered to be related to the main diagonal of its bounding parallelotope. The transformation onto the original space (v1, v2, v4) is needed in order to normalize the uncertainty onto the dimension of the full observability solution.
The greater the measure of the uncertain region, the greater the fraction calculated in Eq. (13). The full numerical elaboration of this graphical example is provided in Appendix C.
3.2.2. Analysis of different full observability solutions
We continue analysing the properties of our metric by studying the maximum relative error on the whole network in Fig. 1 when dealing with partial observations, but starting from different full observability solutions. A further advantage of using the example in Fig. 1 is that, on the basis of our metric (13), only two “families” of pivoting giving full observability solutions are possible, i.e. those taking three links where one is the middle link 3 (we will denote them as “Centre” solutions), and those triples not containing the middle link (i.e. any combination of three of (v1, v2, v4, v5), called the “Star” family). By starting from a specific full observability solution, some partial observability solutions may not appear and thus cannot be evaluated. For example, we cannot evaluate the loss of information when we do not measure the middle link 3 if we evaluate the “Star” family. Note that we do not assume any constraint for the error values, which means that errors can potentially take any positive (overestimation) or negative (underestimation) value, and they can be of any order of magnitude. Let us now analyse the behaviour when dealing with the “Star” family, specifically in the case of observing only one of the three link flows v1, v4 and v5; the three possible combinations are expressed in Table 2. Apart from the single observed link, it is clear that the maximum possible estimation error on the other links can still vary over the full [-1, 1] interval. It is interesting to point out the difference between choosing v1 and choosing v4 or v5: when choosing v1 as the link to be observed, the resulting uncertainty measure on v2 and v3 is the sum of the uncertainties on the topologically connected links v4 and v5, both unknown and unbounded.
It would be preferable to avoid such “higher-order” uncertainties; when choosing either v4 or v5 as the first observed link, this “higher-order” uncertainty is limited to v2 alone. A metric able to correctly identify and react to this behaviour should then yield a smaller error value when choosing either v4 or v5, compared to choosing v1.
Fig. 4. Geometric interpretation of the NSP metric.

Table 2
Maximum estimation error when observing one link between v1, v4 and v5.
Observed | v1 | v2 | v3 | v5 | NSP
v1 | 0 | [-1, 1] + [-1, 1] + 0 | [-1, 1] + [-1, 1] | [-1, 1] | 0.79
v4 | [-1, 1] | [-1, 1] + 0 + [-1, 1] | [-1, 1] + 0 | [-1, 1] | 0.64
v5 | [-1, 1] | [-1, 1] + 0 + [-1, 1] | 0 + [-1, 1] | 0 | 0.64

Table 3
Maximum estimation error when observing two links between v1, v4 and v5.
Observed | v1 | v2 | v3 | v4 | v5 | NSP
v1 & v4 | 0 | 0 + [-1, 1] + 0 | 0 + [-1, 1] | 0 | [-1, 1] | 0.44
v1 & v5 | 0 | [-1, 1] + 0 + 0 | [-1, 1] + 0 | [-1, 1] | 0 | 0.44
v4 & v5 | [-1, 1] | 0 + 0 + [-1, 1] | 0 | 0 | 0 | 0.44

Table 4
Maximum estimation error when observing one link between v1, v3 and v5.
Observed | v1 | v2 | v3 | v4 | v5 | NSP
v1 | 0 | [-1, 1] + 0 | [-1, 1] | [-1, 1] + [-1, 1] | [-1, 1] | 0.8
v3 | [-1, 1] | 0 + [-1, 1] | 0 | 0 + [-1, 1] | [-1, 1] | 0.69
v5 | [-1, 1] | [-1, 1] + [-1, 1] | [-1, 1] | [-1, 1] + 0 | 0 | 0.8

Table 5
Maximum estimation error when observing two links between v1, v3 and v5.
Observed | v1 | v2 | v3 | v4 | v5 | NSP
v1 & v5 | 0 | 0 + [-1, 1] | [-1, 1] | 0 + [-1, 1] | 0 | 0.53
v1 & v3 | 0 | 0 | 0 | 0 + [-1, 1] | [-1, 1] | 0.47
v3 & v5 | [-1, 1] | 0 + [-1, 1] | 0 | 0 | 0 | 0.47
When observing two links, rather than one, the possible combinations are as in Table 3. In this case, there are no “higher-order” uncertainties and, among the three options, the third is the one that minimizes the error extent, as it leaves only two of the five variables unobserved. However, all three options receive the same NSP value, so in this case the NSP metric is not able to identify a better partial observability solution. One may then think of using, when such cases occur, other criteria to rank the solutions. For instance, one may focus on the “lower-order” uncertainties or, equivalently, on the number of variables that are observable, i.e. the level h according to the definition of Gentili and Mirchandani (2012b). We show in Table 4 how the NSP metric assesses the various solutions analysed for the “Star” family. It is apparent that our metric is again able to recognize the instances of “higher-order” uncertainties, recognizing in all three cases that there is only one source of uncertainty left in the observations. Repeating the same procedure for the “Centre” example, the 3 combinations, along with the values obtained from our metric, are shown in Table 5. It is interesting to see that, in this case, the values given by our metric to the solutions when observing the combinations of two links are different; this shows that, when dealing with this peculiar family of pivots, uncertainty on the central link generally has a greater impact than uncertainty on the lateral ones. This small example shows how our metric can be interpreted, while also helping to realize that, when dealing with partial observability, the initially provided full observability solution can and will influence the characteristics of the resulting solutions. It is now clear that we need a way to classify solutions a priori, since an a posteriori classification such as the one performed above needs all combinations to be enumerated and then analysed.
This suggests developing a search algorithm able to find efficient solutions, which is elaborated in the next section.
4. Solution algorithm
The new metric proposed in Section 3 allows one to assess the information content of a given set of sensor locations. However, one may also be interested in finding the vector(s) of variables to equip with sensors such that this information content is maximized or, conversely, such that the extent of the solution space of, e.g., a related estimation problem is minimized. This means that we want a method that, given a budget of n sensors, identifies the n variables for which the NSP value is minimal among all possible combinations of locations. So far, we have made some observations that may be relevant for developing an efficient search algorithm. First, we have chosen to lay the foundations of our metric on a full observability solution and, in particular, on any solution that contains only linearly independent variables among the observed ones. This allows one to seek solutions that are in line with the link independence rule of Yang and Zhou (1998). This choice also has practical and computational advantages, since the number of possible pivoted matrices is limited by all possible matrix row permutations. The set of all possible solutions without pivoting in fact quickly becomes too vast to be investigated, with 2^|L| possible solutions for the |L| link variables that can each be observed or not. This way, we discard a great number of “uninteresting” solutions containing linearly dependent relations, which are thus redundant in our definition of information. This can lead to a decrease in the number of possible solutions, although the problem might still be intractable for large networks, since the computational complexity of enumerating all possible full observability solutions is O(2^n), i.e. it depends exponentially on n, the number of independent variables. We cannot prove at this stage that this choice does not exclude partial observability solutions that cannot be found starting from a full observability solution. This issue will be investigated in future works.
4.1. A simple local search heuristic
We introduce a simple heuristic to investigate the possible solutions.
The Add heuristic starts from an empty set of observed variables and, at each iteration, adds to this set the variable that results in the largest increase of information. Alternatively, a Remove heuristic removes variables from the full observability solution in order of least information content. Tests on different networks showed the two approaches to be fully analogous and comparable in result and computational speed. We therefore discuss only the Add heuristic from now on. The algorithm underlying the Add heuristic is given here in pseudo-code form. A MATLAB® implementation of this code has been employed for the tests presented in this paper:

Pseudocode of procedure NSP_add (network N, amount of sensors to be placed n)
    Based on topology of N, determine link-route-OD relationships (d, q) of (Eq. (4))
    Compute matrix P of Eq. (6): perform Pivoting (Castillo et al., 2008a, 2012) on d defining xdep and xindep
    independent_var = xindep
    observed_var = ø
    for i = 1 to n
        NSP_best = 1
        foreach l in (independent_var \ observed_var)
            incumbent_observed_var = (l ∪ observed_var)
            Determine matrix X′ = subset of X removing columns in incumbent_observed_var (Eq. (11))
            Obtain B′, basis of the Null space of matrix X′, via singular value decomposition (Eq. (12))
            Compute the NSP(X′) metric by (Eq. (13))
            if NSP(X′) < NSP_best:
                observed_var_best = incumbent_observed_var
                NSP_best = NSP(observed_var_best)
            endif
        endfor
        observed_var = observed_var_best
    endfor
    return observed_var; NSP_best
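The Add procedure can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the metric is passed in as a callable, so Eqs. (11)–(13) are not re-implemented. The names `nsp_add` and `STAR_NSP` are hypothetical. The lookup table holds the NSP values reported in Tables 2 and 3 for the “Star” family, plus the limiting values 1 (nothing observed) and 0 (full observability). Unlike the pseudocode, the incumbent value is initialised to infinity so that a best candidate is always defined.

```python
from typing import Callable, FrozenSet, Sequence, Tuple

# Hypothetical lookup standing in for Eqs. (11)-(13): NSP values of the
# "Star" family (pivot v1, v4, v5) as reported in Tables 2 and 3.
STAR_NSP = {
    frozenset(): 1.0,                      # nothing observed
    frozenset({"v1"}): 0.79,
    frozenset({"v4"}): 0.64,
    frozenset({"v5"}): 0.64,
    frozenset({"v1", "v4"}): 0.44,
    frozenset({"v1", "v5"}): 0.44,
    frozenset({"v4", "v5"}): 0.44,
    frozenset({"v1", "v4", "v5"}): 0.0,    # full observability
}


def nsp_add(
    independent_vars: Sequence[str],
    n: int,
    metric: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedy Add search: observe one more independent variable per iteration."""
    observed: FrozenSet[str] = frozenset()
    best_value = metric(observed)
    for _ in range(n):
        step_best, step_value = None, float("inf")
        for var in independent_vars:
            if var in observed:
                continue
            incumbent = observed | {var}
            value = metric(incumbent)
            # Strict "<" keeps the first candidate found when values tie.
            if value < step_value:
                step_best, step_value = incumbent, value
        if step_best is None:  # no independent variable left to add
            break
        observed, best_value = step_best, step_value
    return observed, best_value
```

Called as `nsp_add(["v1", "v4", "v5"], 2, STAR_NSP.__getitem__)`, the sketch selects v4 first and then adds v1, ending at an NSP value of 0.44. The pair (v4, v5) has the same value and is passed over only because of the tie-breaking order.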
In case multiple solutions exist sharing the same value of the NSP metric, the local search will keep only one of the possible solutions, either randomly chosen or following other criteria, for example keeping those having the highest level h according to the definition of Gentili and Mirchandani (2012b). A Beam search procedure has also been implemented. According to such a procedure, in case multiple solutions exist leading to the same value of the NSP metric, the complete set of best solutions is kept for the next steps of heuristics. Practically, a pool of candidate solutions is associated to a certain amount of sensors to be observed. In case of an Add procedure (respectively Remove), all possible variables to be observed are added (respectively removed), one at a time, to each candidate solution in the pool. A Beam search avoids the drawback of breaking ties between solutions that are locally the same, but can lead to better solutions (or, worst case, same solutions), at the expense of additional computational and memory requirements.
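The Beam search variant described above can be sketched in the same spirit. Again this is only an illustration under assumptions: `nsp_add_beam`, the tolerance `tol` used to decide when two metric values tie, and the lookup table `STAR_NSP` (filled with the “Star” family values of Tables 2 and 3) are hypothetical names, and the metric is supplied as a callable rather than computed from Eq. (13).

```python
from typing import Callable, Dict, FrozenSet, Sequence, Set, Tuple

# Hypothetical lookup with the "Star" family NSP values of Tables 2 and 3.
STAR_NSP = {
    frozenset(): 1.0,
    frozenset({"v1"}): 0.79,
    frozenset({"v4"}): 0.64,
    frozenset({"v5"}): 0.64,
    frozenset({"v1", "v4"}): 0.44,
    frozenset({"v1", "v5"}): 0.44,
    frozenset({"v4", "v5"}): 0.44,
    frozenset({"v1", "v4", "v5"}): 0.0,
}


def nsp_add_beam(
    independent_vars: Sequence[str],
    n: int,
    metric: Callable[[FrozenSet[str]], float],
    tol: float = 1e-9,
) -> Tuple[Set[FrozenSet[str]], float]:
    """Add search that keeps every tied-best solution at each level."""
    pool: Set[FrozenSet[str]] = {frozenset()}
    best_value = metric(frozenset())
    for _ in range(n):
        scored: Dict[FrozenSet[str], float] = {}
        # Extend every solution in the pool by one unobserved variable.
        for solution in pool:
            for var in independent_vars:
                if var not in solution:
                    candidate = solution | {var}
                    if candidate not in scored:
                        scored[candidate] = metric(candidate)
        if not scored:  # every variable is already observed
            break
        best_value = min(scored.values())
        # Keep the complete set of best solutions instead of breaking ties.
        pool = {c for c, v in scored.items() if v - best_value <= tol}
    return pool, best_value
```

With one sensor the pool contains both (v4) and (v5); with two sensors it contains (v1, v4), (v1, v5) and (v4, v5). The pool can grow quickly when many solutions tie, which is the additional computational and memory cost mentioned above.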
4.2. Ranking full observability solutions
The example in Section 3 has shown that even in a very small network we could identify two families of pivots, which produced different partial observability solutions according to our metric. It is therefore interesting to understand how many different pivoting ‘families’ can be found for a given network, or to define a method to rank pivots based on the information contained in (all) their partial observability instances. As shown in the example, our definitions of information, partial observability and metric indicate that the choice of the full observability pivot (family) from which we start the search for partial observability solutions is of crucial importance. It is clear that performing an analysis of all possible pivots (families) following the procedure adopted in the aforementioned example, i.e. enumerating all possible combinations of observed links that provide full information, is infeasible for bigger networks. We therefore develop a way of evaluating the information content of different full observability pivots based on the singular value decomposition (SVD) procedure. We implicitly employed SVD when determining a matrix’s Null space (see also Appendix B), but in doing so we disregarded some information resulting from this procedure, including the singular values. These can, on the other hand, provide an indication of the magnitude of information present in the system. The singular values describe a scaling transformation of vector spaces: they represent the main components of the scaled vector space along different orthogonal dimensions. When computed for the full observability pivot matrix P, the singular values then represent the relation between different variables, i.e. the information that a separate variable carries about the others; we therefore consider that a ranking of those values would roughly describe the ranking of the matrices against the NSP metric.
As described mathematically in Appendix A, the Frobenius norm of a given matrix can be computed as the square root of the sum of its squared singular values. We therefore adopt this norm as a way to rank pivoted matrices; it coincides with the denominator of our NSP metric (13). Note that this is only a heuristic, approximate relation. We leave the in-depth analysis of the algebraic relationship between NSP and the Frobenius norm of the initial pivot as future research, while in the following we present some empirical results on selected networks.
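As a small illustration of this ranking step, the sketch below computes the Frobenius norm entry-wise, which gives the same value as the square root of the sum of the squared singular values, and sorts candidate pivot matrices by it. The function names and the two matrices `P_a` and `P_b` are hypothetical and not taken from the paper. The sketch only orders the matrices by norm and makes no claim about which end of the ordering leads to better partial observability solutions.

```python
import math
from typing import Dict, List, Sequence

Matrix = Sequence[Sequence[float]]


def frobenius_norm(matrix: Matrix) -> float:
    """Square root of the sum of the squared entries of the matrix."""
    return math.sqrt(sum(entry * entry for row in matrix for entry in row))


def rank_pivots(pivots: Dict[str, Matrix]) -> List[str]:
    """Return the pivot names sorted by increasing Frobenius norm."""
    return sorted(pivots, key=lambda name: frobenius_norm(pivots[name]))


# Hypothetical pivot matrices (rows: dependent variables,
# columns: independent variables); illustrative values only.
PIVOTS = {
    "P_a": [[-1, 1, 1], [0, 1, 1]],
    "P_b": [[-1, 1, 0], [0, 1, -1]],
}
```

Here `frobenius_norm(PIVOTS["P_b"])` is 2 (four entries of magnitude 1) and the norm of `P_a` is the square root of 5, so `rank_pivots(PIVOTS)` lists `P_b` before `P_a`.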
4.3. Analysis on toy networks
We show the possible advantage of ranking full observability solutions before applying the local search heuristics, using small, mid-sized and large networks. Fig. 5 shows the solutions for the “centre” and the “star” families of the toy network example in Fig. 1; the path of the Add local search is represented in bold. The graphical representation shows the whole set of solutions. By grouping them by the number of variables measured, we can organize them as a lattice, where every layer has a constant number of variables measured, and connected solutions in adjacent layers differ by the addition of one variable to the measured set.
Fig. 5. Lattice of the solutions for the reduced network example of Fig. 1 for the centre family (left) and the star family (right).
Continuous lines are all those solutions evaluated by the local search. In this figure o represents the set of observed independent variables and e the corresponding error as computed by our metric. As said in Section 3.3, we can decide to further rank the solutions having the same NSP value by selecting the solution that has the highest partial observability level following the definition of Gentili and Mirchandani (2012b). In the star family case (right picture), this means choosing the couple (v4, v5) instead of (v1, v4). For this example with limited complexity, the beam search would deliver the optimal solutions in both cases. Specifically, an Add Beam search would visit the nodes labelled []; [3]; [1 3] and [5 3]; [1 3 5] in the left-hand plot, and the nodes labelled []; [4] and [5]; [1 4], [1 5] and [4 5]; [1 4 5] in the right-hand plot. A slightly more complicated network to analyse is the Parallel Highway network, also used by Hu et al. (2009) and by Castillo et al. (2012), and reported in Fig. 6(a). It should be stressed that already for this case (14 links, 4 OD pairs) enumerating all combinations of independent variables yielding full observability through Castillo’s pivoting procedure was impractical. We therefore generated 600 random combinations of such independent variables. Here, full observability for the link flows is obtained by locating 9 sensors. In this case (which presents clear symmetries) we can observe six families
Fig. 6. Parallel Highway network (Hu et al., 2009), families of pivots using the NSP metric (b) and comparison with partial observability solutions using Ng’s (2012) and He’s (2013) methods.
of combinations, i.e. combinations exhibiting equal rank, ranging from a low value (the norm of P is equal to 4.1231) to a high rank value (5.1962), respectively, as shown in Fig. 6(b). The results of these comparisons can be interpreted as follows: the faster the descent at the beginning, the higher the amount of information captured by the first sensors, according to the employed metric. In fact, returning to our definition of the metric, a stronger decrease (or increase, if one considers the Remove algorithm) of the NSP value means that the solution space is reduced more significantly. Therefore, a faster descent means a stronger impact on the reduction of the size of the solution space in case of partial observability, i.e. a stronger reduction in the system’s degree of under-determinedness. Finally, Fig. 6(c) shows a comparison between the descents obtained from the highest and lowest ranking Pivot-based full observability solutions and those that can instead be obtained through the node-based methods of He and Ng. It is interesting to see that, according to our metric, the highest ranking Pivot-based full observability solution outperforms those resulting from the node-based approaches by remaining systematically below their curves. Not shown in these figures is the descent obtained through the Add Beam Search procedure, as, in all instances, it yielded exactly the same results as the Add local search.
5. Case study: Rotterdam network
We finally test our methodology on a relatively large-sized network. We consider an extract of the road network of Rotterdam, in the Netherlands, consisting of 476 links, 239 nodes, 5668 routes and 1890 OD pairs (Fig. 7). We first obtain the full observability solutions PPivot, PNg and PHe by employing, respectively, Castillo’s pivoting procedure (resulting in 281 independent links), Ng’s Node-based procedure (284 independent links) and He’s spanning tree procedure (322 independent links).
It should be stressed that the number of sensors necessary to achieve full observability differs between the methods, since they are derived from different algebraic relations. The network used in He’s procedure includes extra links used to transform the original network into a graph with one centroid, in order to apply the minimum spanning tree procedure; the total number of links in the resulting network is therefore larger than in the case of Ng’s procedure and, accordingly, so is the number of links required to obtain full observability. If one excludes all the connectors from this count, the two methodologies are indeed equivalent in terms of produced solutions. Castillo’s pivoting procedure is instead based on a given set of routes on the network: in this case study this was obtained through a K-shortest path enumeration procedure. This introduces additional information with respect to the relations used by Ng and He, since it provides an extra layer of topological connectivity beyond that provided by link/node relationships alone. In Castillo et al. (2014) the authors fully exploit the information given by routes in the network to construct a route-only dependent full observability procedure, and claim that this yields a lower bound to the minimum number of variables necessary to fully characterize the system. In order to better validate our approach, we compare three different partial observability metrics: NSP, LSC and He’s link weight based approach. LSC (Largest Sum of Coefficients) is a simple metric based on summing up the coefficients (taken in absolute value) relating the measured variable with all non-measured ones; this intuitively represents the extent of the influence that a measured variable has on the others, and captures in a very simple way both “higher-order” and “lower-order” uncertainties: each non-zero value in the pivoted matrix can in fact be easily interpreted as a dependence of the uncertainty of an unobservable variable on an independent one.
An analogous metric was proposed in Viti and Corman (2012) to quantify the total network state estimation uncertainty given a pre-specified number of sensors. He’s link weight based approach has been suggested in He (2013); we implemented it in order to perform a thorough evaluation of the different approaches available in the literature.
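One possible reading of the LSC description is sketched below. It is an assumption rather than the authors' implementation: for each independent variable (a column of the pivoted matrix P) the score is the sum of the absolute values of the coefficients linking it to the dependent variables (the rows), and the variable with the largest score is selected first. The function names, the example matrix and the variable labels are hypothetical.

```python
from typing import Dict, Sequence


def lsc_scores(P: Sequence[Sequence[float]], indep_names: Sequence[str]) -> Dict[str, float]:
    """Sum of absolute coefficients per independent variable (column of P)."""
    if any(len(row) != len(indep_names) for row in P):
        raise ValueError("each row of P needs one coefficient per independent variable")
    return {
        name: sum(abs(row[j]) for row in P)
        for j, name in enumerate(indep_names)
    }


def lsc_pick(P: Sequence[Sequence[float]], indep_names: Sequence[str]) -> str:
    """Return the independent variable with the largest sum of coefficients."""
    scores = lsc_scores(P, indep_names)
    # max() returns the first maximal element, so ties follow the input order.
    return max(indep_names, key=scores.__getitem__)


# Hypothetical pivoted matrix: two dependent variables (rows) expressed
# through three independent variables (columns); illustrative values only.
P_EXAMPLE = [[-1, 1, 1], [0, 1, 1]]
NAMES = ["a", "b", "c"]
```

For this example the scores are 1 for `a` and 2 for both `b` and `c`, so `lsc_pick` returns `b`. Extending the sketch to several sensors would require restricting the sum to the variables that are still unmeasured at each step, which is left out here.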
Fig. 7. Extract of the Rotterdam network.
Table 6
Summary of tests performed on the Rotterdam network (Fig. 7).
 | Evaluation metric: He | Evaluation metric: NSP | Evaluation metric: LSC
Descent metric: He | Experiment 1 | Experiment 4 | Experiment 7
Descent metric: NSP | Experiment 2 | Experiment 5 | Experiment 8
Descent metric: LSC | Experiment 3 | Experiment 6 | Experiment 9
We performed 9 experiments in order to explore all combinations of the three selected metrics, summarized in Table 6. All tests are based on the same initial full observability solution XHe. Fig. 8 shows the results of these 9 experiments; Fig. 8(a) shows the results of Experiments 1–3, Fig. 8(b) those of Experiments 4–6 and Fig. 8(c) those of Experiments 7–9. These three groups show how the combinations of sensors, chosen by the three different metrics, are evaluated, separately, by the three different metrics.
Fig. 8. Results of cross-comparison of metrics and descent algorithms on the Rotterdam network.
One might intuitively expect that, for a given metric, the descent performed according to that same metric will always be best; as can be seen in the figures, however, this is not the case. What the results exhibit, instead, is that the choices performed by NSP, especially for the first, crucial sensors, are deemed best by all three metrics simultaneously. Experiment 5 has also been performed with the Add Beam Search, in order to assess the incidence of same-level, same-value combinations of sensors in a network bigger than Parallel Highway. This test yielded more than 4200 evaluated combinations of sensors (more precisely, 4265), for a total of 9 h of computational time (in comparison, Experiments 1–9 evaluated around 300 configurations and took about 30 min each), and the final resulting solution is exactly the same as the one obtained with the Local Search. We would like to point out that the full observability solution on the Rotterdam network using Castillo’s pivoting procedure has been found through an explicit route enumeration using the k shortest routes for each OD pair. In Rinaldi et al. (2013) we analysed these results, and observed that this resulted in an interesting geographical spread of the sensors, favouring links near the centroids and at the major junctions (e.g. crossings between motorways). Since the pivoting procedure requires a lower number of sensors (281) than Ng’s and He’s methods, we may also observe that route-based approaches may potentially start from richer topological relations than node-based ones. A well-chosen set of paths (e.g. those linearly independent, as indicated by Castillo et al., 2013b) may help in finding a lower number of sensors guaranteeing full observability than the number indicated analytically by Ng (2012). We will investigate in the future the relation between the quality of the chosen routes and the NSP metric, but this goes beyond the scope of this paper.
6. Further application opportunities and extensions
The methodology proposed in this paper offers several scientific contributions. First of all, the new metric allows one to assess partial observability solutions in terms of “degree of determinedness”, i.e. given the position of sensors, it gives a quantitative evaluation of what remains unknown. Therefore it gives an indication of what is left as uncertainty in an estimation process aimed at completing the information from these sensors through, e.g., a model. Further, through the development of a local search, we have shown that not all full observability solutions have the same relevance for identifying partial observability solutions, including those identified by state-of-the-art approaches, and that a ranking can be made based on some of the ideas underlying our metric. It should be stressed that, by keeping the assumptions in our study very generic, i.e. only the fundamental relations (1), (2) and, eventually, a generated route set are necessary input, many refinements and extensions are possible. The most straightforward extension is to relax the full a priori setting, for instance by adopting weights, such as in He (2013), that take into account the different importance that each variable may have. For example, one may impose higher weights on links with higher capacity (e.g. motorways), or on those that are expected to carry the largest flows. This could be implemented, for instance, by introducing a diagonal weight matrix W in Eq. (13), pre-multiplying X^T B′ and X. A matrix W that would privilege information on a link would have larger weights on the diagonal element corresponding to that link. A rather different way of making a distinction between solutions of the proposed methodology could be to associate a utility that takes into account the spatial distribution of the sensors in the network. Many solutions can show large clusters of sensors and areas where no sensor is present.
As pointed out by Gan et al. (2005) and Yang et al. (2006), a specific structure can be associated to sensor locations through a shortest-path-based procedure. Using this method they could identify solutions where OD flow separability is maximized and, as a result, sensor locations are distributed more evenly in space. This idea could be adopted in our method to distinguish different solutions and identify those able to separate sensors more uniformly in space. A second interesting research direction is to combine the proposed methodology with flow-estimation approaches. A first step could be to adopt some of the rules proposed for these problems and described in Section 2.3. For instance, one could seek the partial observability solution that at the same time satisfies the OD coverage or the route coverage rule; in case a traffic model is adopted, as is normally the case in flow-estimation problems, the maximum flow fraction and flow capturing rules could also be checked and used to identify the most interesting solutions. Vice versa, the method described in this paper can be adopted in a preliminary step of a flow-estimation problem to evaluate the level of under-determinedness, and therefore provide expectations of how reliable an estimated solution can be, independently of the estimation technique adopted. A third direction, which was already introduced and partly explored in Viti and Corman (2012), is the exploitation of our approach for capturing traffic variations: sensor locations can be chosen so that the net flow variability of the network captured is maximal, i.e. the flow variations observed in the monitored flows explain a large share of the flow variations of the unmonitored ones. This can be treated as a third category of NSLP (network sensor location flow-reliability problems). The approach developed in this study is a natural extension of the approach proposed in Viti and Corman (2012).
One can in fact aim at minimizing the estimation errors on the net flows, as well as on the flow variations. This problem has particularly relevant meaning in reliability and robustness studies; knowing how much a certain flow (link, route or OD) varies depending on the variation of other flows is in fact fundamental to identify the robustness of a network or to estimate the reliability of a certain state estimation (travel times, flows, etc.). In addition, many problems require the knowledge of the sensitivity of estimated flow parameters from existing flow information, which can also be derived from the information of the network
flow variations (for example, optimization problems where the solution is found by studying the variation, or change in the gradient, of the network flows by a change of a control variable).
7. Conclusions
We study the network sensor location problem (NSLP) under partial network observability. This is a step forward in understanding the information available when only a subset of variables (i.e. link, route or OD flows) in a network can be observed. We proposed a new methodology that: (1) exploits the sole connectivity of the network, regardless of any measured value; (2) restricts its focus to the independent variables only, through a pivoting procedure; (3) uses an innovative metric that associates an error to a partially observed network, based on the Null space of the reduced matrix defining the topology of the network; (4) allows one to rank the different full observability solutions on the basis of our metric; and (5) uses this metric to define simple local search heuristics able to find the solution with the locally minimal error, when a maximum number of sensors is given. This last step also allowed us to identify a criterion to rank the different full observability solutions, which, once fixed, may or may not lead to the partial observability solution maximizing the information on the network. We tested the new methodology on different toy networks and a real-sized network, showing that even a few links might give a great deal of information, according to our metric, and therefore reduce the estimation uncertainty. Another important conclusion drawn from the analysis of different networks is that the adopted metric gives consistent rankings of the different full observability solutions, i.e. different solutions contain different information to be exploited in partial observability solutions.
Therefore, not all full observability solutions behave equally when seeking the largest information given a sub-set of all observed variables characterizing a full observability solution. Future research directions include combining our methodology with flow-estimation methods, guided by additional external rules such as the ones found in the NSLP literature. Other extensions address a shift of paradigm from a purely topological approach to an approach where some extra information is also given, such as the most likely route, the shortest path for given OD pairs, the amount of demand for given OD pairs, link capacities, etc., as outlined in Section 6. Finally, the proposed approach can be generalized and applied to other typologies of networks, such as power networks or information networks, where knowledge of the network status is crucial for its management and full observability is very expensive, if not impossible, to achieve.
Acknowledgements
The authors acknowledge the Research Fund of the KU Leuven, which partially financially supported this research under Grant OT/11/068. We are also indebted to the three anonymous reviewers, who helped us improve the paper significantly.
Appendix A. Frobenius norm
The Frobenius or Hilbert–Schmidt norm of a matrix A of dimensions m × n is expressed as
$$
\|A\|_F = \sqrt{\sum_{i=1}^{m} \sum_{j=1}^{n} |a_{i,j}|^2} = \sqrt{\operatorname{trace}(A^{*} A)} = \sqrt{\sum_{k=1}^{\min\{m,n\}} \sigma_k^2}
$$

with $A^{*}$ the conjugate transpose of A, $a_{i,j}$ a generic element of A, and $\sigma_k$ the singular values of A. The last expression indicates that the norm is rooted in the singular value decomposition, a well-known factorization method used in uncertainty analysis, pattern recognition, machine learning applications, etc. This connection is exploited when we discuss the relationship between the full and partial observability solutions in Section 4.

Appendix B. NSP metric – definition of matrices and dimensions

Let |x| be the cardinality of x. The link-to-link relationship matrix P, resulting for example from a pivoting procedure, has the form:
$$
v_{\mathrm{dep}} = P \, v_{\mathrm{indep}}, \qquad P \in \mathbb{R}^{|\mathrm{dep}| \times |\mathrm{indep}|}
$$
The augmented relationship matrix X is defined as follows:
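$$
X = \begin{pmatrix}
I \in \mathbb{R}^{|\mathrm{indep}| \times |\mathrm{indep}|} & 0 \in \mathbb{R}^{|\mathrm{indep}| \times |\mathrm{dep}|} \\
P \in \mathbb{R}^{|\mathrm{dep}| \times |\mathrm{indep}|} & 0 \in \mathbb{R}^{|\mathrm{dep}| \times |\mathrm{dep}|}
\end{pmatrix}
\in \mathbb{R}^{(|\mathrm{indep}|+|\mathrm{dep}|) \times (|\mathrm{indep}|+|\mathrm{dep}|)}
$$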
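As an illustration of the block structure of X, the following minimal sketch assembles the augmented matrix from a given relationship matrix P. The function name `augmented_matrix` and the 1 × 2 matrix `P_demo` are hypothetical and serve only as an example; they are not part of the original methodology.

```python
import numpy as np

def augmented_matrix(P):
    """Assemble X = [[I, 0], [P, 0]] from a |dep| x |indep| matrix P."""
    P = np.asarray(P, dtype=float)
    n_dep, n_indep = P.shape
    return np.block([
        [np.eye(n_indep), np.zeros((n_indep, n_dep))],  # independent variables
        [P,               np.zeros((n_dep, n_dep))],    # dependent variables
    ])

# Hypothetical example: one dependent variable equal to the sum of
# two independent variables.
P_demo = np.array([[1.0, 1.0]])
X_demo = augmented_matrix(P_demo)
print(X_demo)
```

The result is square, with |indep| + |dep| rows and columns, and its last |dep| columns are identically zero.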
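Removing an independent variable from the matrix P yields a partial observability matrix $P' \in \mathbb{R}^{|\mathrm{dep}| \times (|\mathrm{indep}|-1)}$ and, subsequently, the following augmented partial observability matrix:

$$
X' = \begin{pmatrix}
I' \in \mathbb{R}^{|\mathrm{indep}| \times (|\mathrm{indep}|-1)} & 0 \in \mathbb{R}^{|\mathrm{indep}| \times |\mathrm{dep}|} \\
P' \in \mathbb{R}^{|\mathrm{dep}| \times (|\mathrm{indep}|-1)} & 0 \in \mathbb{R}^{|\mathrm{dep}| \times |\mathrm{dep}|}
\end{pmatrix}
\in \mathbb{R}^{(|\mathrm{indep}|+|\mathrm{dep}|) \times (|\mathrm{indep}|-1+|\mathrm{dep}|)}
$$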
We define $B'$ as the basis of the null space of the matrix $X'^{T}$ (or, equivalently, of the pseudo-inverse matrix $X'^{+}$, since $\mathrm{Null}(X'^{T}) = \mathrm{Null}(X'^{+})$):

$$
B' = \mathrm{null}(X'^{T})
$$
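A minimal sketch of how such a basis can be computed is given below. It uses the singular value decomposition, which is one common choice and not necessarily the routine used for the results in this paper. The helper `null_space_basis`, the absolute tolerance `tol` and the small matrix `X_partial_demo` are assumptions made for illustration.

```python
import numpy as np

def null_space_basis(M, tol=1e-10):
    """Orthonormal basis of Null(M), taken from the right singular
    vectors whose singular values are numerically zero."""
    M = np.atleast_2d(np.asarray(M, dtype=float))
    _, s, vt = np.linalg.svd(M)      # vt is square: (n_columns, n_columns)
    rank = int(np.sum(s > tol))      # absolute tolerance, fine for small 0/1 matrices
    return vt[rank:].T               # shape: (n_columns, n_columns - rank)

# Hypothetical partially observed matrix: the column of the second
# independent variable has been set to zero.
X_partial_demo = np.array([[1.0, 0.0, 0.0],
                           [0.0, 0.0, 0.0],
                           [1.0, 0.0, 0.0]])
B_demo = null_space_basis(X_partial_demo.T)

# The same vectors are annihilated by the transpose and by the pseudo-inverse.
in_null_of_transpose = np.allclose(X_partial_demo.T @ B_demo, 0.0)
in_null_of_pinv = np.allclose(np.linalg.pinv(X_partial_demo) @ B_demo, 0.0)
print(B_demo.shape, in_null_of_transpose, in_null_of_pinv)
```

Any other orthonormal basis of the same subspace would serve equally well, since the metric defined below depends only on the subspace and not on the particular basis.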
As per the definition of null space resulting from the fundamental theorem of linear algebra (see for instance Strang, 1993), this basis has dimensions $B' \in \mathbb{R}^{(|\mathrm{indep}|+|\mathrm{dep}|) \times (|\mathrm{indep}|+|\mathrm{dep}|-r)}$, with r the rank of the matrix $X'^{T}$. To define our metric, we compute the projection of this basis onto the original matrix X as follows:

$$
X^{T} B'
$$

The dimensions of the multiplicands are:

$$
X^{T} \in \mathbb{R}^{(|\mathrm{indep}|+|\mathrm{dep}|) \times (|\mathrm{indep}|+|\mathrm{dep}|)}, \qquad
B' \in \mathbb{R}^{(|\mathrm{indep}|+|\mathrm{dep}|) \times (|\mathrm{indep}|+|\mathrm{dep}|-r)}
$$

and the resulting product has dimensions:

$$
X^{T} B' \in \mathbb{R}^{(|\mathrm{indep}|+|\mathrm{dep}|) \times (|\mathrm{indep}|+|\mathrm{dep}|-r)}
$$

Following from above, the augmented partial observability matrix resulting from observing no independent variable has the following form:
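$$
X' = \begin{pmatrix}
0 \in \mathbb{R}^{|\mathrm{indep}| \times |\mathrm{dep}|} \\
0 \in \mathbb{R}^{|\mathrm{dep}| \times |\mathrm{dep}|}
\end{pmatrix}
\in \mathbb{R}^{(|\mathrm{indep}|+|\mathrm{dep}|) \times |\mathrm{dep}|}
$$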
It is trivial to state that, in this case, $\mathrm{null}(X'^{T}) = \mathbb{R}^{|\mathrm{indep}|+|\mathrm{dep}|}$, for which an orthonormal basis can also be trivially defined as follows:

$$
B' = I \in \mathbb{R}^{(|\mathrm{indep}|+|\mathrm{dep}|) \times (|\mathrm{indep}|+|\mathrm{dep}|)}
$$

Finally, it is clear from the definition of our metric that, when no variable is observed, the metric value is exactly 1:

$$
\mathrm{NSP} = \frac{\|X^{T} B'\|_F}{\|X^{T}\|_F} = \frac{\|X^{T} I\|_F}{\|X^{T}\|_F} = \frac{\|X^{T}\|_F}{\|X^{T}\|_F} = 1.
$$
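The two limiting cases of the metric can be checked numerically with the sketch below. The helper names `null_space_basis` and `nsp_metric`, and the 1 × 2 matrix P, are hypothetical and chosen only for illustration. The value 0 for the fully observed case follows directly from the definition, because the basis then lies in the null space of $X^{T}$ itself.

```python
import numpy as np

def null_space_basis(M, tol=1e-10):
    """Orthonormal basis of Null(M) from the SVD (illustrative helper)."""
    M = np.atleast_2d(np.asarray(M, dtype=float))
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

def nsp_metric(X, X_partial):
    """||X^T B'||_F / ||X^T||_F, with B' a basis of Null(X_partial^T)."""
    B = null_space_basis(X_partial.T)
    return np.linalg.norm(X.T @ B, "fro") / np.linalg.norm(X.T, "fro")

# Hypothetical relationship matrix: one dependent, two independent variables.
P = np.array([[1.0, 1.0]])
n_dep, n_indep = P.shape
n = n_dep + n_indep
X = np.block([[np.eye(n_indep), np.zeros((n_indep, n_dep))],
              [P,               np.zeros((n_dep, n_dep))]])

X_none = np.zeros((n, n_dep))     # no independent variable observed
nsp_none = nsp_metric(X, X_none)  # the null space is the whole space
nsp_full = nsp_metric(X, X)       # all independent variables observed
print(nsp_none, nsp_full)
```

The first value equals 1 up to floating-point rounding because multiplying by an orthogonal basis of the whole space leaves the Frobenius norm unchanged; the second equals 0 up to rounding.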
Appendix C. Numerical example for the network in Fig. 1

We begin from a link-route matrix as follows:

$$
\begin{pmatrix} v_1 \\ v_2 \\ v_3 \\ v_4 \\ v_5 \end{pmatrix}
=
\begin{pmatrix}
1 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 \\
1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1
\end{pmatrix}
\begin{pmatrix} h_1 \\ h_2 \\ h_3 \\ h_4 \end{pmatrix}
$$
Through the pivoting procedure, one obtains the following linearly independent relationships:

$$
\begin{pmatrix} h_1 \\ h_2 \\ v_3 \\ h_3 \\ v_5 \end{pmatrix}
=
\begin{pmatrix}
0 & -1 & 1 & 1 \\
0 & 1 & 0 & -1 \\
1 & 1 & 0 & 0 \\
1 & 1 & -1 & -1 \\
1 & 1 & -1 & 0
\end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \\ v_4 \\ h_4 \end{pmatrix}
$$
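The pivoted relationships can be checked against the link-route matrix with a short numerical test. The sketch below assumes a network in which links 1 and 2 merge into link 3, which then splits into links 4 and 5, with routes ordered so that $h_1$ and $h_3$ use link 1 and $h_1$ and $h_2$ use link 4. Both matrices, including the signs of their entries, are assumptions of this example and should be compared with Fig. 1.

```python
import numpy as np

# Assumed link-route incidence matrix (rows v1..v5, columns h1..h4).
A = np.array([[1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 1, 1, 1],
              [1, 1, 0, 0],
              [0, 0, 1, 1]], dtype=float)

# Assumed pivoted relationships: (h1, h2, v3, h3, v5) = M (v1, v2, v4, h4).
M = np.array([[0, -1,  1,  1],
              [0,  1,  0, -1],
              [1,  1,  0,  0],
              [1,  1, -1, -1],
              [1,  1, -1,  0]], dtype=float)

rng = np.random.default_rng(0)
h = rng.uniform(0.0, 100.0, size=4)   # arbitrary route flows
v = A @ h                             # resulting link flows

dependent = np.array([h[0], h[1], v[2], h[2], v[4]])
independent = np.array([v[0], v[1], v[3], h[3]])
relations_hold = np.allclose(dependent, M @ independent)

# Number of linearly independent link flows.
link_rank = np.linalg.matrix_rank(A)
print(relations_hold, link_rank)
```

Under these assumptions three link flows are independent, which matches the three independent links $v_1$, $v_2$ and $v_4$ used in the remainder of the example.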
We are then interested in the independent-to-dependent link relationship matrix P:

$$
\begin{pmatrix} v_3 \\ v_5 \end{pmatrix}
=
\begin{pmatrix}
1 & 1 & 0 \\
1 & 1 & -1
\end{pmatrix}
\begin{pmatrix} v_1 \\ v_2 \\ v_4 \end{pmatrix}
$$
Using the augmented matrix form X:

$$
X = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 \\
1 & 1 & -1 & 0 & 0
\end{pmatrix}
$$
If, as we did in the example in Fig. 4, we stop measuring $v_4$, the resulting augmented matrix becomes:

$$
X' = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0 \\
1 & 1 & 0 & 0 & 0
\end{pmatrix}
$$
If we then calculate the basis $B'$ of the null space of $X'^{T}$ (each basis vector being defined up to its sign):

$$
B' = \begin{pmatrix}
-0.4472 & 0 & 0.4472 \\
-0.4472 & 0 & 0.4472 \\
0 & 1 & 0 \\
0.7236 & 0 & 0.2764 \\
-0.2764 & 0 & -0.7236
\end{pmatrix}
$$
the projection of the original matrix $X^{T}$ onto this basis becomes

$$
X^{T} B' = \begin{pmatrix}
0 & 0 & 0 \\
0 & 0 & 0 \\
0.2764 & 1 & 0.7236 \\
0 & 0 & 0 \\
0 & 0 & 0
\end{pmatrix}
$$

Finally, the ratio of the norm of this projection to the norm of the original matrix, which defines our NSP metric, is:

$$
\mathrm{NSP} = \frac{\|X^{T} B'\|_F}{\|X^{T}\|_F} = 0.4472
$$
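The value above can be reproduced with the following minimal sketch. The helper `null_space_basis` is an illustrative SVD-based routine, and the entry −1 in P (i.e. $v_5 = v_1 + v_2 - v_4$) is an assumption of this example; the resulting metric value is the same for either sign of that entry. Setting the column of $v_4$ to zero, as done here, or deleting it, as in Appendix B, leads to the same null space.

```python
import numpy as np

def null_space_basis(M, tol=1e-10):
    """Orthonormal basis of Null(M) from the SVD (illustrative helper)."""
    M = np.atleast_2d(np.asarray(M, dtype=float))
    _, s, vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return vt[rank:].T

# Assumed relationship matrix: (v3, v5) = P (v1, v2, v4).
P = np.array([[1.0, 1.0,  0.0],
              [1.0, 1.0, -1.0]])
n_dep, n_indep = P.shape
X = np.block([[np.eye(n_indep), np.zeros((n_indep, n_dep))],
              [P,               np.zeros((n_dep, n_dep))]])

# Stop measuring v4, the third independent variable.
X_partial = X.copy()
X_partial[:, 2] = 0.0

B = null_space_basis(X_partial.T)
nsp = np.linalg.norm(X.T @ B, "fro") / np.linalg.norm(X.T, "fro")
print(round(nsp, 4))  # 0.4472
```

The individual entries of the computed basis may differ from those listed above, since any orthonormal basis of the same null space gives the same Frobenius norm and hence the same NSP value.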
References

Bianco, L., Confessore, G., Reverberi, P., 2001. A network based model for traffic sensor location with implications on O/D matrix estimates. Transportation Science 35 (1), 50–60.
Bierlaire, M., 2002. The total demand scale: a new measure of quality for static and dynamic origin-destination trip tables. Transportation Research Part B: Methodological 36, 755–851.
Cascetta, E., 2009. Transportation Systems Analysis. Springer, New York, US.
Castillo, E., Cobo, A., Jubete, F., Pruneda, R.E., Castillo, C., 2000. An orthogonally based pivoting transformation of matrices and some applications. Journal of Matrix Analysis and Applications 22 (3), 666–681.
Castillo, E., Conejo, A.J., Pruneda, R.E., Solares, C., 2006. Observability analysis in state estimation: a unified numerical approach. IEEE Transactions on Power Systems 21 (2), 877–886.
Castillo, E., Conejo, A.J., Pruneda, R.E., Solares, C., 2007. Observability in linear systems of equations and inequalities: applications. Computers & Operations Research 34 (6), 1708–1720.
Castillo, E., Conejo, A., Menendez, J.M., Jimenez, P., 2008a. The observability problem in traffic network models. Computer Aided Civil and Infrastructure Engineering 23, 208–222.
Castillo, E., Menendez, J.M., Jimenez, P., 2008b. Trip matrix and path flow reconstruction and estimation based on plate scanning and link observations. Transportation Research Part B: Methodological 42, 455–481.
Castillo, E., Jimenez, P., Menendez, J.M., Conejo, J.A., 2009. The observability problem in traffic models: algebraic and topological methods. IEEE Transactions on Intelligent Transportation Systems 9 (2), 275–287.
Castillo, E., Gallego, I., Menéndez, J.M., Jiménez, P., 2012. Link flow estimation in traffic networks on the basis of link flow observations. Journal of Intelligent Transportation Systems 15 (4), 205–222.
Castillo, E., Nogal, M., Rivas, A., Sanchez-Cambronero, S., 2013a. Observability in traffic networks. Optimal location of counting and scanning devices. Transportmetrica B: Transport Dynamics 1 (1), 68–102.
Castillo, E., Calviño, A., Menendez, J.M., Jimenez, P., Rivas, A., 2013b. Deriving the upper bound of the number of sensors required to know all link flows in a traffic network. IEEE Transactions on Intelligent Transportation Systems 14 (2), 761–771.
Castillo, E., Calviño, A., Lo, H.K., Menendez, J.M., Grande, Z., 2014. Non-planar hole-generated networks and link flow observability based on link counters. Transportation Research Part B: Methodological 68, 239–261.
Chen, A., Yang, H., Lo, H.K., Tang, W.H., 2002. Capacity reliability of a road network: an assessment methodology and numerical results. Transportation Research Part B: Methodological 36 (3), 225–252.
Chen, A., Chootinan, P., Recker, W., 2005. Examining the quality of synthetic O–D trip table estimated by path flow estimator. Journal of Transportation Engineering 131, 506–513.
Chung, I.-H., 2001. An Optimum Sampling Framework for Estimating Trip Matrices from Day-to-day Traffic Counts. Unpublished doctoral dissertation, University of Leeds, Leeds, UK.
Cipriani, E., Fusco, G., Gori, S., Petrelli, M., 2006. Heuristic Methods for the Optimal Location of Road Traffic Monitoring Stations. Proceedings of the IEEE-ITS Conference, September 2006, Toronto, Canada.
Ehlert, A., Bell, M.G.H., Grosso, S., 2006. The optimization of traffic count locations in road networks. Transportation Research Part B: Methodological 40, 460–479.
Eisenman, S.M., Fei, X., Zhou, X., Mahmassani, H.S., 2006. Number and location of sensors for real-time traffic network estimation and prediction: a sensitivity analysis. Transportation Research Record 1964, 260–269.
Fei, X., Mahmassani, H.S., Eisenman, S.M., 2007. Sensor coverage and location for real-time traffic prediction in large-scale networks. Transportation Research Record 2039, 1–15.
Fei, X., Mahmassani, H.S., 2011. Structural analysis of near-optimal sensor locations for a stochastic large-scale network. Transportation Research Part C: Emerging Technologies 19, 440–453.
Fei, X., Mahmassani, H.S., Murray-Tuite, P., 2013. Vehicular network sensor placement optimization under uncertainty. Transportation Research Part C: Emerging Technologies 29, 14–31.
Gan, L., Yang, H., Wong, S.C., 2005. Traffic counting location and error bound in origin-destination matrix estimation problems. Journal of Transportation Engineering 131 (7), 524–534.
Gentili, M., Mirchandani, P.B., 2005. Locating active sensors on traffic networks. Annals of Operations Research 136 (1), 229–257.
Gentili, M., Mirchandani, P.B., 2012a. Survey of models to locate sensors to estimate traffic flows. Transportation Research Record 2243, 108–116.
Gentili, M., Mirchandani, P.B., 2012b. Locating sensors on traffic networks: models, challenges and research opportunities. Transportation Research Part C: Emerging Technologies 24, 227–255.
Gupta, R., Prasad, T.D., 2000. Extended use of linear graph theory for analysis of pipe networks. Journal of Hydraulic Engineering 126, 56–62.
Hodgson, M.J., 1990. A flow-capturing location-allocation model. Geographical Analysis 22 (3), 270–279.
He, S., 2013. A graphical approach to identify sensor locations for link flow inference. Transportation Research Part B: Methodological 51, 65–76.
Hu, S., Peeta, S., Chu, C., 2009. Identification of vehicle sensor locations for link-based network traffic applications. Transportation Research Part B: Methodological 43, 873–894.
Lam, W.H.K., Lo, H.P., 1990. Accuracy of O–D estimates from traffic counting stations. Traffic Engineering and Control 7 (1), 105–114.
Larsson, T., Lundgren, J.T., Peterson, A., 2010. Allocation of Link Flow Detectors for Origin-Destination Matrix Estimation – A Comparative Study. Computer-Aided Civil and Infrastructure Engineering 25, 116–131.
Li, X.P., Ouyang, Y.F., 2011. Reliable sensor deployment for network traffic surveillance. Transportation Research Part B: Methodological 45 (1), 218–231.
Mirchandani, P.B., Gentili, M., He, Y., 2009. Location of Vehicle Identification Sensors to Monitor Travel-time Performance. IET Intelligent Transportation Systems 3 (3), 289–303.
Montoya-Zamora, R., Romero, J.A., Guzman-Cruz, R., 2011. Comparison between heuristics and methodologies to solve the NSLP. International Journal of Engineering Science and Technology 3 (7), 5929–5939.
Mori, H., Tsuzuki, S.T., 1995. A fast method for topological observability analysis using a minimum spanning tree technique. IEEE Transactions on Power Systems 6 (2), 491–500.
Ng, M., 2012. Synergistic sensor location for link flow inference without path enumeration: a node-based approach. Transportation Research Part B: Methodological 46 (3), 781–788.
Ng, M., 2013. Partial link flow observability in the presence of initial sensors: solution without path enumeration. Transportation Research Part E: Logistics and Transportation Review 51, 62–66.
Rahal, H., 1995. A co-tree flows formulation for steady state in water distribution networks. Advances in Engineering Software 22, 169–178.
Rinaldi, M., Viti, F., Corman, F., 2013. A Null-space Metric for the Analysis of Partial Network observability in Sensor Location Problems. In: Proceedings of the 92nd TRB Annual Meeting, January 11–15 2013, Washington DC.
Simonelli, F., Marzano, V., Papola, A., Vitiello, I., 2012. A network sensor location procedure accounting for O-D matrix estimate variability. Transportation Research Part B: Methodological 46, 1624–1638.
Strang, G., 1993. The fundamental theorem of linear algebra. The American Mathematical Monthly 100 (9), 848–855.
Viti, F., Tampere, C.M.J., Verbeke, W., Immers, B., 2008. Sensor locations for reliable travel time prediction and dynamic management of traffic networks. Transportation Research Record 2049, 103–110.
Viti, F., Corman, F., 2012. A Novel Approach to the Sensor Location Problem for Measuring the Observed Network Flow Variability. Proceedings of the 5th International Symposium of Traffic Network Reliability, 18–19 December, Hong Kong, China.
Wang, N., Gentili, M., Mirchandani, P., 2012. A model to locate sensors for estimating static OD volumes given prior flow information. Transportation Research Record 2283, 67–73.
Yang, H., Iida, Y., Sasaki, T., 1991. An analysis of the reliability of an origin-destination trip matrix estimated from traffic counts. Transportation Research Part B: Methodological 25, 351–363.
Yang, H., Yang, C., Gan, L., 2006. Models and algorithms for the screen line-based traffic-counting location problems. Computers & Operations Research 33 (3), 836–858.
Yang, H., Zhou, J., 1998. Optimal traffic counting location for origin-destination matrix estimation. Transportation Research Part B: Methodological 32 (2), 109–126.
Yim, K.N., Lam, W.H.K., 1998. Evaluation of count location selection methods for estimation of OD matrices. Journal of Transportation Engineering 124 (4), 376–383.
Zhou, X., List, G., 2010. An information-theoretic sensor location model for traffic origin–destination demand estimation applications. Transportation Science 44 (2), 254–273.