Numerical solution procedures for the morning commute problem


Mathematical and Computer Modelling 49 (2009) 1295–1310


Yu (Marco) Nie a,*, Michael H. Zhang b

a Department of Civil and Environmental Engineering, Northwestern University, 2145 Sheridan Road, Evanston 60208, United States
b Department of Civil and Environmental Engineering, University of California, Davis, United States

Article history: Received 17 November 2006; received in revised form 18 November 2008; accepted 25 November 2008.
Keywords: Morning commute; Variational inequality; Projection algorithm; Ascent direction algorithm

Abstract: This paper discusses solution techniques for the morning commute problem, which is formulated as a discrete variational inequality (VI). Various heuristics have been proposed to solve this problem, mostly because the analytical properties of the path travel time function are not yet well understood. Two groups of ‘‘non-heuristic’’ algorithms for general VIs, namely projection-type algorithms and ascent direction algorithms, are examined. In particular, a new ascent direction method is introduced and implemented with a heuristic line search procedure. The performance of these algorithms is compared on simple instances of the morning commute problem. The implications of the numerical results are discussed. © 2009 Elsevier Ltd. All rights reserved.

1. Introduction

The analysis of the morning commute problem traces back to the seminal work of Vickrey [1], in which traffic congestion is assumed to take the form of queuing behind bottlenecks caused by a temporal demand surge. In Vickrey's model, all commuters wish to arrive at work at the same desirable time. Yet this is impossible due to the existence of bottlenecks. By choosing their departure times, travelers have to balance the trade-off between travel costs and schedule costs (i.e., the cost of early or late arrival). They can either depart earlier or later (and thus bear higher schedule costs) to avoid congestion, or try to arrive on time but travel in the rush hour. At equilibrium, no traveler can reduce the sum of the travel and schedule costs by unilaterally changing his/her departure time. By endogenizing individuals' departure time choices, Vickrey's model determines the temporal evolution of rush-hour traffic congestion.

Many attempts have since been made to extend or refine Vickrey's conceptual model. Heterogeneity of individuals was first introduced by Henderson [2,3], then refined by Newell [4], and extended in a series of papers by Arnott et al. [5–7]. A great deal of effort has been devoted to a more realistic representation of traffic phenomena. Along this line are Henderson's speed–density relationship model [2,3], Chu's revision of Henderson's model [8], Mun's kinematic wave model [9], and Yang and Huang's variable capacity bottleneck model [10]. Extending the analysis of the morning commute to general road networks was initiated in the work of Mahmassani and Herman [11]. Using a simple two-route network, they combined commuters' departure time and route choices. This network configuration was later revisited by several authors, including [12,13]. Another simple network configuration, which comprises two tandem links and two interacting origin–destination (O-D) pairs, was first studied by Kuwahara [14]. [15] extended Kuwahara's analysis to a standard 2-in-1 merge network and identified a dynamic network equilibrium paradox.

However, analyzing the morning commute turns out to be very challenging in general settings, where complicated topology, multiple O-D pairs and realistic traffic flow dynamics come into play. It was not until the early 1990s that transportation scientists began to formulate the problem using the general theory of variational inequality. Friesz et al. [16] and Smith [17] were among the first to establish the equivalence between the equilibria of the morning commute and the solutions to a corresponding variational inequality VI(c, K), where c is a mapping from K to



Corresponding author. Tel.: +1 847 467 0502. E-mail addresses: [email protected] (Y. Nie), [email protected] (M.H. Zhang).


Ω ⊃ K (where Ω and K are polyhedral) and depends on the underlying traffic flow models. Although it provides a valid mathematical treatment of the morning commute problem, the original VI formulation of [16] is hard to solve. Above all, the mapping c was not cast in a closed form, making it difficult to assert its analytical properties, such as differentiability and monotonicity. This drawback hinders the design of effective solution procedures. Numerous formulations have since been proposed in the hope of casting the mapping more appropriately. Most existing work is either based on VI [18–21] or highly related to it [22,23]. Various network traffic flow models have been examined, such as the delay function model [16,19], the point-queue model [17,22], and the kinematic wave model [21]. However, none of the existing work has produced a closed form for the mapping c. Instead, c has to be evaluated through a complicated numerical procedure known as dynamic network loading (DNL). Consequently, the algorithmic study of the morning commute problem remains in its infancy. For most of the aforementioned formulations, heuristic solution procedures were suggested, such as the method of successive average, e.g., [24]. If the mapping c is assumed to have some nice properties (e.g., strong monotonicity and Lipschitz continuity), several derivative-free solution techniques for VI can be applied. The projection algorithm [25] and the descent direction algorithm [21] are two examples.

Efficient solution algorithms for the morning commute problem are worthy of more extensive investigation, even if the current generation of formulations does not warrant a mathematically rigorous solution scheme. Numerical algorithms are not only indispensable in real-world applications, but also helpful in enlightening analytical studies. This paper is devoted to VI-based algorithms for the morning commute problem. In particular, we examine a class of ascent direction algorithms that are based on maximizing a bivariate merit (or gap) function. The introduction of the merit function makes it possible to conduct a line search for a better step size, and thereby helps improve the overall convergence performance. Line search seems prohibitively expensive at first glance, because evaluating the merit function requires performing traffic simulation; existing descent direction algorithms [21] often apply predetermined step sizes. However, our numerical experiments indicate that properly applying line search techniques can bring about significant computational improvements. Also included in our study are two heuristic algorithms and three projection-type algorithms. Although some of these algorithms have been adopted to solve the morning commute problem, their relative performance remains largely unknown. More importantly, all of them either employ ‘‘heuristics’’ that do not guarantee convergence, or assume that the mapping c satisfies conditions (e.g., monotonicity) that may not actually hold. In the latter case, it is useful to verify the assumptions about the mapping, and to reveal the impacts of violating such assumptions on different solution procedures.

The analysis and numerical experiments presented herein focus on Vickrey's classical setting for the following reasons. First of all, the analytical solution is known in that case, and thus serves as a comparison/evaluation benchmark.
Secondly, the simple network setting greatly simplifies the notation, and shuns the complexity associated with network topology, such as spillovers.1 Thirdly, conducting intensive experiments that require thousands of iterations on real-sized problems is computationally prohibitive. Last but not least, the analysis and numerical algorithms presented in this paper can be easily extended to general cases.

The next section reviews Vickrey's classical model of the morning commute and its analytical solution. The VI formulation of the problem and its equivalent maximization program are presented in Section 3. All algorithms are discussed in Section 4. Section 5 reports numerical results and the last section summarizes our findings.

2. Vickrey's bottleneck model and its analytical solution

There is a fixed number of identical individuals (N) who commute from home to work in the morning, through a stretch of highway containing a bottleneck. As long as the incoming flow rate at the bottleneck (expressed in vehicles per unit time) exceeds its capacity s, a queue develops and queued vehicles are served in a first-in-first-out order. Consequently, the en-route travel time of any individual departing at time t contains two parts: a waiting time d(t) in the queue and a free flow travel time t0. Each commuter prefers a punctual arrival within an identical time window, [ta* − ∆, ta* + ∆]. Yet this is impossible due to the physical limitation of road supply: some must arrive early and others late. Whenever an early or late arrival occurs, a schedule cost τ(t) is imposed on the individual departing at t. Each individual is assumed to choose a departure time that minimizes the linear combination of travel time d(t) + t0 and schedule cost τ(t). A Wardrop equilibrium [27] is attained if no commuter can reduce his/her commute cost by changing the departure time.

Let q(t) be the number of queued vehicles at time t. For brevity the bottleneck is assumed to be located at the entrance of the highway, so that any vehicle arrives at the bottleneck right after its departure. q(t) is related to the incoming flow rate f(t) as follows:

    dq(t)/dt = 0,           if q(t) = 0 and f(t) ≤ s
    dq(t)/dt = f(t) − s,    otherwise.                                        (1)

An individual's queuing time d(t) equals the queue length (at the time it joins the end of the queue) divided by the capacity of the bottleneck s, i.e.,

    d(t) = q(t)/s.                                                            (2)

1 See e.g. [21] and [26] for discussions on how queue spillover could affect the equilibrium solution. However, the impact of spillover on algorithm performance remains unclear to the best of our knowledge.
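To make the point-queue dynamics concrete, the sketch below discretizes Eqs. (1) and (2) with a fixed time step. The function name, the forward-difference update, and the unit conventions are illustrative assumptions, not part of the original model description.

```python
import numpy as np

def point_queue_delay(f, s, dt):
    """Discretized Eqs. (1)-(2): given departure rates f[i] (veh/h) on intervals
    of length dt (h) and bottleneck capacity s (veh/h), return the queue length
    q and the waiting time d at the start of each interval."""
    m = len(f)
    q = np.zeros(m + 1)              # queued vehicles
    for i in range(m):
        if q[i] == 0.0 and f[i] <= s:
            dq = 0.0                 # Eq. (1), first branch: no queue builds up
        else:
            dq = f[i] - s            # Eq. (1), second branch
        q[i + 1] = max(q[i] + dq * dt, 0.0)
    d = q[:-1] / s                   # Eq. (2): waiting time = queue length / capacity
    return q[:-1], d
```

With rates in vehicles per hour and dt in hours, q is in vehicles and d in hours; any consistent unit system works equally well.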


Following [28], the commute cost c(t) of each traveler who departs at time t is defined by the following piecewise linear function:

    c(t) = α(d(t) + t0) + β(ta* − t − d(t) − t0),    t < td*               (early arrival)
    c(t) = α(d(t) + t0),                             t ∈ [td*, td*']       (punctual arrival)
    c(t) = α(d(t) + t0) + γ(t + d(t) + t0 − ta*),    t > td*'              (late arrival)      (3)

where α, β and γ are positive scalars, and [td*, td*'] denotes the departure time window that guarantees a punctual arrival. Note that empirical evidence indicates that γ > α > β [29]. The equilibrium condition can be expressed as

    f(t)[c(t) − c*] = 0,    ∀t
    c(t) − c* ≥ 0,          ∀t                                                              (4)

where f(t) denotes the departure rate at t, with f(t) ≥ 0, ∀t, and ∫0^T f(t)dt = N. c* is the minimum commute cost and T is the analysis period. Seeking f(t) that satisfies Condition (4) via numerical procedures is the central concern of this paper. In this simple case, however, the equilibrium solution can be obtained analytically. To provide a benchmark for the computational studies carried out in Section 5, the equilibrium departure rate f(t) is given as follows:

    f(t) = [α/(α − β)] s,    t ∈ [tb, td*]
    f(t) = s,                t ∈ [td*, td*']
    f(t) = [α/(α + γ)] s,    t ∈ [td*', te]                                                 (5)

where [tb, te] is the equilibrium departure period.
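For illustration, the following sketch evaluates the commute cost of Eq. (3) for a single departure time. The argument names, including the punctual-departure window boundaries td_lo and td_hi, are my own; in practice they would be derived from the arrival-time window [ta* − ∆, ta* + ∆].

```python
def commute_cost(t, d_t, t0, ta_star, td_lo, td_hi, alpha, beta, gamma):
    """Eq. (3): commute cost of a traveler departing at time t with queuing
    delay d_t; [td_lo, td_hi] is the departure window giving punctual arrival."""
    cost = alpha * (d_t + t0)                       # travel-time component
    if t < td_lo:                                   # early arrival
        cost += beta * (ta_star - t - d_t - t0)
    elif t > td_hi:                                 # late arrival
        cost += gamma * (t + d_t + t0 - ta_star)
    return cost
```

Combined with a loader such as the point-queue sketch above, this is all that is needed to evaluate the cost of every departure interval for a given departure profile.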

3. Variational inequality formulation and its transformations

In this section we first introduce VI formulations for the morning commute problem, and then discuss several important properties of transformation-based merit functions.

3.1. Continuous and discrete VI formulations

The continuous time VI formulation of the morning commute problem, denoted as CVI(c, K), is stated as follows: find a function f*(t) such that

    ∫0^T c(f*)(f(t) − f*(t))dt ≥ 0,    ∀f(t) ∈ K                              (6)

where K = {f(t) | ∫0^T f(t)dt = N, f(t) ≥ 0, ∀t}. The following result establishes the equivalence between the VI and the dynamic user equilibrium conditions [16].

Theorem 1. f*(t) solves CVI(c, K) if and only if f*(t) satisfies the equilibrium condition (4).

The infinite-dimensional VI problem is difficult to tackle and thus is often discretized. A discrete VI formulation reads [18] DVI(c, K): find a vector f* ∈ K such that

    ⟨c(f*), f − f*⟩ ≥ 0,    ∀f ∈ K                                             (7)

    K = { f = [f1, . . . , fm]^T ∈ R^m | Σ_{i=1}^m fi = N, fi ≥ 0 }             (8)

where the analysis period [0, T] is divided into m time intervals with an identical length of δ. The following existence result is well known.

Theorem 2. A solution exists for DVI(c, K) if K ⊂ R^m is closed and convex and c: K → R^m is continuous on K.

Proof. Via either Brouwer's fixed-point theorem or via degree theory. See Section 2.2 of [30]. □

Given that K is a polyhedron (hence a closed convex set), the question of solution existence relies on the continuity of c with respect to f, which was independently established in [18,22] for exit-flow and point-queue models, respectively. We shall assume in this paper that the continuity of c holds, because we evaluate c using a dynamic network loading (DNL) model consistent with the point-queue model [31].2 However, it is widely believed that multiple equilibria could exist for the morning commute problem, see, e.g., [21].

2 However, for more complicated dynamic traffic models such as those allowing queue spillover (e.g., the cell transmission model [32]), the validity of the continuity assumption on c remains an unresolved issue. The discontinuity of c in the presence of queue spillovers may contribute to the instability of the equilibrium solution, as noted in [33].


3.2. Transformations based on merit functions

It is well known that a VI can be transformed into an optimization problem based on merit (gap) functions. Let us first define the merit function associated with DVI(c, K)3:

    θ(f) := min_{g∈K} ⟨c(f), g − f⟩ = −max_{g∈K} ⟨c(f), f − g⟩.                (9)

Hereafter the first definition is exclusively used for simplicity. The following proposition shows the connection between DVI(c, K) and θ(f) (for the convenience of the reader, a proof is given in Appendix A).

Proposition 1. f* is a solution to DVI(c, K) if and only if f* solves the maximization program

    max_{f∈K} θ(f).                                                            (10)

Note that θ(f) may not be differentiable due to the likely existence of multiple solutions to the minimization program that defines it (see Section 10.2 of [30]). In order to resolve the problem, [36] defines the regularized merit function by adding a strictly convex term to the linear objective:

    θr(f) := min_{g∈K} { ⟨c(f), g − f⟩ + (r/2)⟨g − f, G(g − f)⟩ }              (11)

where r is a positive scalar and G is a symmetric positive definite matrix. It is shown in [36] that the gradient of θr(f) exists if c is continuously differentiable on K, and is expressed as

    ∇θr(f) = c(f) + (Jc(f) − rG)(gr*(f) − f),    ∀f ∈ K                        (12)

where Jc(f) is the Jacobian matrix of c and gr*(f) is the solution to the minimization program defined in (11). However, evaluating ∇θr(f) is a difficult exercise. For one thing, it requires solving a minimization program. But a greater obstacle is its dependence on Jc(f), which is not available in a closed form. Thus, efficient algorithms have to rely on ‘‘derivative-free’’ approaches. The following theorem lays a foundation for algorithms in this category.

Theorem 3. For every f ∈ K, the following three statements are valid.
1. The optimal solution to the minimization program that defines θr(f) is the projection of f − (1/r)G^{-1}c(f) onto K according to the vector norm ‖z‖G = √(z^T G z), i.e.,

    gr*(f) = Π_K^G ( f − (1/r)G^{-1}c(f) )

where r and G are as defined in (11).
2. If c(f) is Lipschitz continuous and strongly monotone4 on K, then gr*(f) − f is a strict ascent direction for θr(f) at f, whenever f is not a solution to DVI(c, K).
3. If c(f) is Lipschitz continuous and monotone on K, either gr*(f) − f is a strict ascent direction for θr(f) at f, or θr(f) = (r/2)⟨gr*(f) − f, G(gr*(f) − f)⟩.

Proof. The first statement follows from Theorem 10.2.3 in [30]. The second and third statements follow from Propositions 1 and 2 in [37], respectively. □

Theorem 3 shows that gr*(f) − f may yield sufficient improvement in the regularized merit function, and thus opens a door to efficient algorithm designs in the absence of derivative information. The descent direction algorithm of [21] falls into this category, but it takes predefined step sizes. In the next section, we shall present an ascent direction algorithm with line search based on [37]. The projection of f − (1/r)G^{-1}c(f) also plays a fundamental role in projection-type algorithms for DVI(c, K), as shown in the following theorem (the reader is referred to Appendix A for a proof).

Theorem 4. For any f* ∈ K, f* solves DVI(c, K) if and only if f* = Π_K^G (f* − (1/r)G^{-1}c(f*)), where r and G are as defined in (11).

We now return to the merit function of form (9) to explore some useful properties that lead to another feasible direction algorithm.

Proposition 2. Let g*(f) = [g1*, . . . , gi*, . . . , gm*]^T be the optimal solution to the minimization program that defines θ(f) in (9) for a given f. If K is defined by (8), then gi*(f) = 0, ∀i ≠ imin, and g*imin = N, where imin = arg min_i ci(f).

Proof. From the definition we have ⟨c(f), g − f⟩ = ⟨c(f), g⟩ − ⟨c(f), f⟩. Notice that the second term is a constant for any given f, so the minimum only depends on ⟨c(f), g⟩. Since g(f) belongs to K, gi(f) ≥ 0 and Σ_i gi(f) = N. Obviously, ⟨c(f), g⟩ is minimized if and only if gimin = N, gi = 0, ∀i ≠ imin. □

3 The earliest applications of this concept in transportation studies are due to Hearn [34] and Marcotte [35].
4 The reader is referred to Appendix A for definitions.
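Proposition 2 makes the merit function of Eq. (9) inexpensive to evaluate once c(f) is available: the minimizer simply loads all N travelers onto the cheapest interval. The following sketch (the function names and the NumPy implementation are mine) computes g*(f), θ(f), and the relative gap used later as the convergence measure in Section 4.4.

```python
import numpy as np

def g_star(c, N):
    """Proposition 2: the minimizer of <c(f), g - f> over K loads all demand N
    onto the interval with the smallest cost component."""
    g = np.zeros_like(c)
    g[np.argmin(c)] = N
    return g

def theta(f, c, N):
    """Merit function of Eq. (9): theta(f) = <c(f), g*(f) - f> <= 0, with
    theta(f) = 0 exactly at a solution of DVI(c, K) (Proposition 1)."""
    return float(np.dot(c, g_star(c, N) - f))

def relative_gap(f, c, N):
    """Duality gap theta(f) / <g*(f), c(f)> used as the termination measure."""
    return theta(f, c, N) / float(np.dot(g_star(c, N), c))
```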


Most existing heuristic algorithms for the morning commute problem assume that g*(f) − f is a feasible direction along which θ(f) can be improved. Note that the basic logic of these heuristics is to shift a certain amount of flow between minimum-cost and non-minimum-cost assignment elements.5 Nevertheless, it remains unclear from the literature under which conditions g*(f) − f is indeed an ascent direction. The following result attempts to address this issue.

Proposition 3. Let g*(f) solve the minimization program (9) for a given f. Let h = (1 − λ)f + λy, λ ∈ [0, 1], be a new solution along the search direction y. g*(f) − f is a strict ascent direction for θ(f) at f if (1) c(f) is strictly monotone, and (2) ∃λ ∈ (0, 1) such that y = g*(h).

Proof. From the definitions of h and θ(f), as well as Condition 2 above, we have

    θ(h) = ⟨c(h), y⟩ − ⟨c(h), h⟩,    θ(f) = ⟨c(f), g*(f)⟩ − ⟨c(f), f⟩.

Thus

    θ(h) − θ(f) = ⟨c(h), y⟩ − ⟨c(f), g*(f)⟩ + ⟨c(f), f⟩ − ⟨c(h), h⟩
                = ⟨c(h), y⟩ − ⟨c(f), g*(f)⟩ + ⟨c(f), f⟩ − (1 − λ)⟨c(h), f⟩ − λ⟨c(h), y⟩.

From the strict monotonicity of c, we know

    ⟨c(h) − c(f), h − f⟩ = λ⟨c(h) − c(f), y − f⟩ > 0,

and hence ⟨c(h), y⟩ − ⟨c(h), f⟩ − ⟨c(f), y⟩ + ⟨c(f), f⟩ > 0. Applying this inequality to θ(h) − θ(f), we obtain

    θ(h) − θ(f) > ⟨c(f), y⟩ − ⟨c(f), g*(f)⟩ + λ(⟨c(h), f⟩ − ⟨c(h), y⟩).

Noting that y and g*(f) are optimal solutions to (9) at h and f respectively, we have

    ⟨c(f), y⟩ ≥ ⟨c(f), g*(f)⟩,    ⟨c(h), f⟩ ≥ ⟨c(h), y⟩.

Thus θ(h) − θ(f) > 0. This suggests that y − f is a strict ascent direction at f. Since y → g*(f) as λ → 0, g*(f) − f is a strict ascent direction. □

4. Algorithms

We are now ready to discuss the details of the solution procedures. For the sake of clarity, algorithms for solving DVI(c, K) are classified into three groups: heuristic algorithms, projection algorithms and algorithms based on feasible ascent directions. Heuristic algorithms are probably the simplest, yet most often used in the literature. Projection algorithms are classical for monotone VIs, but the application of the extra-gradient method and the hyperplane projection method to the morning commute problem has not been investigated elsewhere. The use of ascent direction algorithms with line search is also new, to the best of our knowledge.

4.1. Heuristic algorithms

Most heuristic algorithms implicitly assume that g*(f) − f is an ascent direction, where g*(f) is the optimal solution to the minimization program defining the merit function θ(f). Although they all agree that flow should be shifted from non-cheapest assignment elements onto the cheapest one, existing heuristics differ from each other in the flow-shifting strategy.

MSA
The simplest strategy is the so-called method of successive average (MSA).

Algorithm MSA

Step 1: Set λ = 1/k and obtain the new solution by

    f^{k+1} = (1 − λ)f^k + λg*(f^k).

MSA converges to the optimal solution of the static traffic assignment problem, as shown in [38]. Although some experiments [24] indicate that MSA can produce solutions that satisfactorily fulfill the dynamic equilibrium condition, its convergence in solving DVI(c, K) has not been rigorously justified.

DSA
What actually happens in MSA is that a predetermined amount of flow is deducted from all non-cheapest assignment elements evenly, while the cheapest assignment element receives flow to maintain conservation. [22] suggests that, instead of making an even shift, more flow should be deducted from the assignment elements that carry larger flows and have more expensive costs. This idea leads to what is called the day-to-day swapping algorithm (DSA).

5 Each assignment element corresponds to a time interval in our problem. In general cases, it is associated with a combination of a time interval and a path.

1300

Y. Nie, M.H. Zhang / Mathematical and Computer Modelling 49 (2009) 1295–1310

Algorithm DSA

Step 1: Let λ be a predetermined step size; the new solution f^{k+1} is obtained by

    f_i^{k+1} = f_i^k − λ f_i^k (c_i^k − c_imin^k),    ∀i ≠ imin               (13)

    f_imin^{k+1} = N − Σ_{i=1, i≠imin}^m f_i^{k+1}                              (14)

where imin is defined in Proposition 2. Critical to the performance of DSA is the choice of λ, which is problem-specific. According to [22], λ should be small enough to avoid oscillation and negative flows. A slightly different strategy that explicitly prevents negative flows is employed in this study, as given below:

    f_i^{k+1} = max{ 0, f_i^k − λ N f_i^k (c_i^k − c_imin^k) / Σ_i f_i^k (c_i^k − c_imin^k) },    ∀i ≠ imin.    (15)
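A compact sketch of the two flow-shifting rules follows. The cost vector c is assumed to come from one dynamic network loading of the current f; the function names are mine.

```python
import numpy as np

def msa_update(f, c, N, k):
    """Algorithm MSA, Step 1: move a 1/k fraction of flow toward the
    all-or-nothing assignment g*(f) of Proposition 2."""
    lam = 1.0 / k
    g = np.zeros_like(f)
    g[np.argmin(c)] = N
    return (1.0 - lam) * f + lam * g

def dsa_update(f, c, N, lam):
    """Algorithm DSA with the safeguarded shift of Eq. (15): flow removed from
    an interval is proportional to its flow and its cost excess, normalized so
    that at most lam * N vehicles are shifted in total."""
    i_min = int(np.argmin(c))
    others = np.arange(len(f)) != i_min
    excess = f * (c - c[i_min])                 # f_i (c_i - c_imin), zero at i_min
    total = excess.sum()
    f_new = f.copy()
    if total > 0.0:
        f_new[others] = np.maximum(0.0, f[others] - lam * N * excess[others] / total)
    f_new[i_min] = N - f_new[others].sum()      # Eq. (14): flow conservation
    return f_new
```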

4.2. Projection-based algorithms

Theorem 4 shows that the projection of f* − (1/r)G^{-1}c(f*) can be used to test the optimality of DVI(c, K). Iteratively performing this projection underlies all the projection-type algorithms examined in this subsection. Without loss of generality, we assume G = I for brevity [30].

BPA
The basic projection algorithm (BPA) involves one Euclidean projection in each iteration:

Algorithm BPA
Step 1: Let r be a constant positive scalar, compute

    f^{k+1} = Π_K ( f^k − (1/r) c(f^k) ).

This is the algorithm adopted in [25]. To ensure the convergence of the above algorithm, the projection mapping Π_K(f − (1/r)c(f)) has to be a contraction (see Appendix A for a definition). This in turn requires that the mapping c is Lipschitz continuous (with constant L), strongly monotone (with modulus µ), and that r > L²/(2µ), see Theorem 12.1.2 of [30]. If r is allowed to vary from one iteration to the next, a co-coercive c (with modulus γ) is enough to guarantee convergence given r > 1/(2γ) [30]. However, selecting a sequence of r to achieve improved convergence is difficult, and no general rules seem to apply. Our implementation of BPA keeps r constant.

EGM
Co-coerciveness is still a strong condition. Conceivably, an algorithm relying on weaker convergence conditions has a better chance to converge. As shown in [39], a simple revision of BPA reduces the convergence requirement to Lipschitz continuity (with constant L) plus pseudomonotonicity. Known as the extra-gradient method (EGM), the algorithm updates f^k in two steps.

Algorithm EGM
Step 1: Let r be a constant positive scalar, compute

    y^k = Π_K ( f^k − (1/r) c(f^k) ).

Step 2: Compute

    f^{k+1} = Π_K ( f^k − (1/r) c(y^k) ).

Note that EGM merely executes one more projection per iteration compared with BPA.
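The sketch below illustrates the BPA and EGM iterations. Because K is the scaled simplex {g ≥ 0, Σ_i g_i = N}, the Euclidean projection is computed here with a standard sort-based routine; this is a substitute for the reduced gradient algorithm of Appendix B, and the callable `c_of` (one dynamic network loading per call) as well as the function names are assumptions of this example.

```python
import numpy as np

def project_simplex(a, N):
    """Euclidean projection of a onto K = {g >= 0, sum(g) = N}, via the
    standard sort-and-threshold rule (used here instead of Appendix B's RGA)."""
    u = np.sort(a)[::-1]
    css = np.cumsum(u) - N
    j = np.arange(1, len(a) + 1)
    rho = np.nonzero(u - css / j > 0)[0][-1]
    tau = css[rho] / (rho + 1.0)
    return np.maximum(a - tau, 0.0)

def bpa_step(f, c_of, r, N):
    """Algorithm BPA: one basic projection iteration (G = I)."""
    return project_simplex(f - c_of(f) / r, N)

def egm_step(f, c_of, r, N):
    """Algorithm EGM: predictor projection followed by a corrector projection."""
    y = project_simplex(f - c_of(f) / r, N)
    return project_simplex(f - c_of(y) / r, N)
```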


HPM
EGM still requires an estimate of the Lipschitz constant, because the convergence of EGM is ensured only if r is larger than L. This drawback was overcome in the hyperplane projection method (HPM) invented by Konnov [40]. Like EGM, HPM performs two projections per iteration. However, it adds a line search routine to find an appropriate r in the second projection, and thus its convergence does not depend on the Lipschitz constant. Consequently, the convergence of HPM only requires that c is pseudomonotone, the weakest condition among all the algorithms in consideration. Our exposition of HPM follows [30]:

Algorithm HPM
Step 1: Compute

    y^k = Π_K ( f^k − (1/r) c(f^k) ).

Step 2: Given a constant σ ∈ (0, 1), find the largest λ^k ∈ [0, 1] such that

    ⟨c(λ^k y^k + (1 − λ^k)f^k), f^k − y^k⟩ ≥ r σ ‖f^k − y^k‖².                  (16)

Step 3: Let z^k = λ^k y^k + (1 − λ^k)f^k and set

    r_z = ‖c(z^k)‖² / ⟨c(z^k), f^k − z^k⟩.                                      (17)

Step 4: Compute

    f^{k+1} = Π_K ( f^k − (1/r_z) c(z^k) ).

HPM is applicable to any DVI(c, K) where K is compact and c is continuous and pseudomonotone on K, see Theorem 12.1.16 of [30]. However, its computational overhead is higher than that of the other projection methods, since a line search has to be performed to find r_z in each iteration. In Step 2 of Algorithm HPM, an Armijo-type line search routine is adopted as follows:

An Armijo line search for HPM
Step 0: Set λ^k = 1.0.
Step 1: If inequality (16) is satisfied, stop; otherwise set λ^k = 0.5λ^k and repeat Step 1.
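For completeness, here is a sketch of one HPM iteration (Steps 1–4 plus the Armijo search of Step 2). The projection is passed in as a callable (e.g., the `project_simplex` sketch above or the RGA of Appendix B); the cap on r_z mirrors the safeguard mentioned in Section 5 and, like the other names here, is an assumption of this example rather than part of the formal algorithm.

```python
import numpy as np

def hpm_step(f, c_of, project, r, N, sigma=0.5, r_z_cap=10.0):
    """One iteration of Algorithm HPM; project(a, N) plays the role of Pi_K
    and c_of(x) returns the cost vector c(x) (one dynamic network loading)."""
    y = project(f - c_of(f) / r, N)                          # Step 1
    if np.allclose(f, y):
        return f                                             # f already solves DVI(c, K)
    lam, d = 1.0, f - y
    while True:                                              # Armijo search, Step 2
        z = lam * y + (1.0 - lam) * f
        cz = c_of(z)
        if np.dot(cz, d) >= r * sigma * np.dot(d, d):        # inequality (16)
            break
        lam *= 0.5
    r_z = min(np.dot(cz, cz) / np.dot(cz, f - z), r_z_cap)   # Eq. (17), capped
    return project(f - cz / r_z, N)                          # Step 4
```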

Finally, we note that computing the projection Π_K(a) amounts to the following quadratic program:

    min ⟨g − a, g − a⟩,    s.t. g ∈ K.                                          (18)

Given the special structure of K, this problem can be solved using the reduced gradient algorithm (see details in Appendix B).

4.3. Ascent direction algorithms

PAD
The first ascent direction algorithm implemented in this paper, due to [37], is built on Theorem 3. We name it the projection ascent direction algorithm (PAD), since the algorithm maximizes the regularized merit function (11) along the direction defined by a projection. The convergence of PAD requires that the mapping c is monotone and Lipschitz continuous. Nevertheless, an estimate of the Lipschitz constant is not necessary.

Algorithm PAD

Step 1: Compute y^k = Π_K(f^k − (1/r)c(f^k)), and evaluate θr(f^k) using (11) at y^k.
Step 2: If

    θr(f^k) ≥ − [r/(2(1 − β))] ‖y^k − f^k‖²                                     (19)

where β is a positive scalar smaller than 1, set r = αr, α ∈ (0, 1), and update f^{k+1} = f^k; otherwise, find the largest λ^k such that θr(z^k) > θr(f^k), where z^k = (1 − λ^k)f^k + λ^k y^k, and update f^{k+1} = z^k.

The Armijo rule is again used in the line search. In order to reduce the computing cost, however, the initial step size used in each iteration is the one from the last iteration.

An Armijo line search for PAD

Step 0: Set λ^k = min{1.0, 2λ^{k−1}}.
Step 1: Compute z^k = (1 − λ^k)f^k + λ^k y^k. If θr(z^k) > θr(f^k) is satisfied, stop; otherwise set λ^k = 0.5λ^k and repeat Step 1.

AAD
Proposition 3 suggests g*(f) − f as an alternative ascent direction (AAD) when one can find λ ∈ (0, 1) such that

    y = g*((1 − λ)f + λy)                                                       (20)

where f is the current solution and g*(h) is an optimal solution to the minimization program in (9) for a given h. Central to the application of AAD is the search for a step size λ satisfying (20). A line search procedure is proposed in the following for this purpose.

Algorithm AAD

Step 1: Compute y^k = g*(f^k), where g*(f^k) is defined in Proposition 2. Evaluate θ(f^k) using (9) at y^k. Set λ^k = min{1.0, λ^{k−1}}.
Step 2: Compute z^k = (1 − λ^k)f^k + λ^k y^k. If θ(z^k) > θ(f^k), set f^{k+1} = z^k; otherwise, set y^k = g*(z^k) and λ^k = 0.5λ^k, and repeat Step 2.

The above algorithm starts by setting the search direction as g*(f^k) − f^k and selecting an initial step size, which may be inherited from previous iterations. If this direction turns out to be ascent, the line search is terminated and a new solution is obtained by moving along the direction. Otherwise, AAD tries to search for a fixed point y of (20). If g*(·) is a contraction, then according to the Banach fixed-point theorem, (1) the mapping g*(·) has a unique fixed point in K, and (2) for any starting point x^0 ∈ K, the following contraction iterate

    y^{k+1} = g*((1 − λ)f + λy^k)                                               (21)

will generate a sequence {y^k} converging to the fixed point, see Theorem 2.1.21 of [30]. Note that g*(·) is unlikely to be a contraction. Even if it is, computing the ascent direction exactly may be computationally intensive and thereby not suitable for practical purposes. As a heuristic, the step size is reduced in each contraction iterate and the procedure is terminated whenever an ascent direction is found.
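A sketch of one AAD iteration is given below. The forced acceptance once the step size falls below λmin follows the safeguard described in Section 4.4; the helper `all_or_nothing`, the function signature, and the tuple return value are assumptions of this example.

```python
import numpy as np

def all_or_nothing(c, N):
    """g*(.) of Proposition 2: all demand on the cheapest interval."""
    g = np.zeros_like(c)
    g[np.argmin(c)] = N
    return g

def aad_step(f, c_of, N, lam_prev, lam_min=0.5 ** 10):
    """One iteration of Algorithm AAD: try the direction toward g*(f), then
    halve the step size and re-aim at g*(z) (the contraction iterate (21))
    until theta improves or the step size drops to lam_min."""
    c_f = c_of(f)
    theta_f = float(np.dot(c_f, all_or_nothing(c_f, N) - f))   # Eq. (9)
    y = all_or_nothing(c_f, N)                                 # Step 1
    lam = min(1.0, lam_prev)
    while True:                                                # Step 2
        z = (1.0 - lam) * f + lam * y
        c_z = c_of(z)
        theta_z = float(np.dot(c_z, all_or_nothing(c_z, N) - z))
        if theta_z > theta_f or lam <= lam_min:
            return z, lam                                      # new solution and step size
        y = all_or_nothing(c_z, N)
        lam *= 0.5
```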



4.4. Other algorithmic Issues Initial solution For all algorithms, an initial solution f 0 is chosen such that the total demand N is uniformly distributed to each time interval, i.e., fi0 = N /m,

i = [1, . . . , m].

(22)

The feasibility of this solution is obvious. Convergence criterion The value of θ (f ) is a natural convergence measure since the maximum of θ (f ) is zero (see the proof of Proposition 1 in Appendix A). Thus the distance between θ (f ) and zero can be adopted as a termination criterion. Specifically, an algorithm is terminated when either the iteration number reaches the maximum allowed value kmax , or the following criterion is satisfied:

θ¯ (f ) =

θ (f ) hg ∗ (f ), c (f )i

≤

(23)

where θ (f ) = ming ∈K hc (f ), g − f i = hc (f ), g ∗ (f ) − f i. Following [24], θ¯ (f ) is called duality gap in this paper. We note that ¯ f ) actually equals the relative equilibrium gap which measures the violation of equilibrium conditions. θ( Line search Obtaining an optimal step size may not always be possible in line search. In the algorithm PAD, for example, a line search along the direction dk = ΠK (f k − 1r c (f k )) − f k is finite only if the mapping c is Lipschitz continuous and monotone, a condition ensuring that dk is indeed an ascent direction (Theorem 3). Since such a strong requirement on c is hardly fulfilled, we need to avoid the potential ‘‘breakdown’’ caused by a wrong direction. In our implementation, a line search procedure is forced to terminate as long as the current step size is less than the minimum allowed value λmin , which is then used to obtain the new solution. Dynamic network loading The evaluation of the mapping c, often referred to as dynamic network loading (DNL) in general settings, is a crucial module in the whole algorithm development. In this paper, the mapping c is evaluated using polymorphic dynamic network loading (PDNL) model developed in [31]. PDNL is a mesoscopic (i.e., individual traffic flow packets are moved according to macroscopic rules) simulation platform that integrates a variety of traffic flow models. PDNL produces cumulative arrival/departure curves on each link, from which time-dependent travel times (hence commute costs) are evaluated. The reader is referred to [31] for more details. 5. Numerical results All seven algorithms were coded in MS-VC++ and run on a Windows-XP workstation (with 3.06 GHz CPU and 2 Gb memory) for solving Vickrey’s bottleneck model in two different scenarios. Input data for each scenario are summarized in Table 1. In all experiments, both assignment and loading intervals are set to 30 s. So T is divided into 240 assignment intervals.  in Eq. (23) is set to 0.01 for all algorithms. The maximum number of iterations kmax is set to 10,000 for heuristic algorithms, and 5000 for others. 5.1. Heuristic algorithms The step size of DSA at iteration k is determined by

λk = α/[k/β],

α = 0.01, β = 250

where [a] is the minimum integer larger than a. The values of α and β are picked based on experiments. Algorithm DSA achieved the required accuracy after 6059 iterations in Scenario 1. The numerical solutions (flow rates and commute costs)

Y. Nie, M.H. Zhang / Mathematical and Computer Modelling 49 (2009) 1295–1310

1303

Fig. 1. Numerical solutions for Scenario 1 at different duality gaps.

Fig. 2. Numerical solutions for Scenario 2 at different duality gaps.

corresponding to different duality gaps are reported in Fig. 1, in comparison with the analytical solution computed directly from Eqs. (5). Clearly, as the duality gap is getting closer to 0, the difference between numerical and analytical solutions diminishes accordingly. In Scenario 2, the desired duality gap −0.01 is achieved by algorithm MSA after 6876 iterations. The comparison between numerical and analytical solutions for Scenario 2 is given in Fig. 2. Fig. 3 compares the convergence curves (i.e., duality gap vs. iteration index curve) of the two heuristic algorithms. In both scenarios, MSA converges slightly faster than DSA. Furthermore, DSA experienced more intensive fluctuations. This is not a surprise. Although DSA adopts a seemingly ‘‘smarter’’ flow-shifting strategy, its performance depends on the choice

1304

Y. Nie, M.H. Zhang / Mathematical and Computer Modelling 49 (2009) 1295–1310

Fig. 3. Convergence curves of heuristic algorithms for both scenarios.

of step size which can only be determined from trial and error. An imperfect choice of step size would either result in slow convergence or produce oscillations like those manifested in Fig. 3. 5.2. Projection algorithms For algorithm BPM, EGM and HPM, r is set to 1.25, 2 and 0.2, respectively. According to our computation experience, the choice of r does affect the performance of the projection algorithms, but not substantially. In HPM, we set σ = 0.5, and enforce rz ≤ 10 (see Eq. (17)) to avoid that too small a step is used in the second projection. We will only report convergence curves in this section since Figs. 1 and 2 show that the duality gap measures the difference between the analytical and numerical solutions rather well. As shown in Fig. 4, algorithm BPM did not converge at all in both scenarios: the duality gap periodically oscillated from the very beginning and never reached as high as −0.1. The performance of algorithm EGM is mixed. It converged very fast in Scenario 1, but failed to converge in Scenario 2. HPM is much more stable compared to its counterparts. In Scenario 1, although HPM was slightly slower than EGM at the beginning, it caught up quickly and finally achieved the duality gap −0.01 after 3034 iterations. In the second scenario, HPM obtained a duality gap as good as −0.018 while oscillations prevented the other two from getting into the neighborhood of the solution. 5.3. Ascent direction algorithms In Step 2 of algorithm PAD, scalars α and β are respectively taken as 0.8 and 0.9. The minimum allowed step size λmin is set to 0.510 in both PAD and AAD, which means the maximum number of line search is 10 in each iteration. We noted that the value of λmin affects the convergence performance of the two algorithms. Usually, a smaller λmin corresponds to slower convergence but less oscillation. Again, we selected 0.510 based on trial and error. The convergence performance of the two ascent direction algorithms is illustrated in Fig. 5. Obviously, algorithm AAD converges faster than PAD in both scenarios. To achieve a duality gap of −0.015 in Scenario 1, AAD spent 884 iterations while PAD consumed about 3600 iterations. For Scenario 2, AAD obtained a duality gap of −0.017 after 1861 iterations while PAD never reached this height until all 5000 iterations are exhausted. 5.4. Discussions As seen, convergence curves of all algorithms are characterized by fluctuations in different forms. For projection-type algorithms BPM and EGM, oscillations may appear from the beginning. Although heuristic algorithms may also experience fluctuations in their earlier iterations, such fluctuations are less intensive and the duality gap is continuously improved as iteration goes on. For the algorithms that perform line search (HPM, PAD and AAD), the duality gap increases sharply and

Y. Nie, M.H. Zhang / Mathematical and Computer Modelling 49 (2009) 1295–1310

1305

Fig. 4. Convergence curves of projection-type algorithms for both scenarios.

Fig. 5. Convergence curves of ascent direction algorithms for both scenarios.

roughly monotonically in the earlier iterations, and then the convergence curve tends to become flat with relatively slight fluctuations. The convergence performance of the three algorithms is more stable because the use of line search avoids a sharp drop of the duality gap in a single iteration.

1306

Y. Nie, M.H. Zhang / Mathematical and Computer Modelling 49 (2009) 1295–1310

Fig. 6. Comparison of convergence performance for the four algorithms.

Obviously, algorithms BMP and EGM are not appropriate for practical purpose because of the instability. We thus exclude them from further examination. Fig. 6 compares the convergence curves of algorithms MSA, HPM, PAD and AAD for the first 500 iterations (DSA is not included because it is similar to MSA). In Scenario 1, HPM is definitely the front runner: it always converged faster than the others and obtained the duality gap of −0.05 with less than 100 iterations. As a close second, Algorithm AAD also demonstrated stable convergence and reached the height of −0.05 within 300 iterations. Although PAD outperformed AAD in the first 20 iterations, it fell behind soon. Note that the convergence curve of PAD contains several flat portions, implying that the solution was unchanged in those iterations because r was big enough to meet the condition (19). Our experience indicates that reducing the parameter α (in Step 2 of PAD) helps diminish the flat portions but may introduce instability. The convergence performance of all algorithms degraded to some degree in the second scenario. Although AAD replaced HPM as the fastest algorithm in the earlier iterations, it spent as many as about 300 iterations to get a duality gap of −0.1. HPM was slowed down, obviously because rz used in the second projection was too small (if rz is too small, the convergence curve tends to be flat). In Scenario 2 the convergence speed of MSA seems comparable to those of AAD and HPM: after 500 iterations, the three algorithms almost obtained the same duality gap. However, MSA suffered more severe fluctuations. It has been shown that the convergence of those ‘‘non-heuristic’’ algorithms requires the mapping c to satisfy some conditions. We proceed to check whether or not the following three conditions are violated in our examples: monotonicity, pseudomonotonicity and Lipschitz continuity. Note that one needs two different feasible solutions x and y to conduct the check. For convenience, we just pick up two consecutive solutions f k and f k+1 during iterations. Figs. 7 and 8 report the test result for Scenarios 1 and 2, respectively. The mapping c is not monotone in both scenarios because negative values of (c (x) − c (y))T (x − y) were found in both tests. The violation of pseudomonotonicity was observed in Scenario 2 (c (x)T (x − y)/c (y)T (x − y) < 0), but not detected in Scenario 1. Further, our tests indicated that the Lipschitz constant L is larger than 15 in Scenario 1 and 25 in Scenario 2. In a nutshell, the mapping c of Scenario 1 seems better behaved than that of Scenario 2: the observed violation of monotonicity is less serious; the Lipschitz constant L has a better lower bound; and the pseudomonotonicity holds for all tested cases. Obviously, the properties of the mapping c has a great impact on the behaviors of numerical algorithms. We consistently observed that solving Scenario 2 is more difficult than Scenario 1 for all algorithms. Moreover, ‘‘non-heuristic’’ algorithms seem to be affected more seriously. Note that HPM and AAD were far more superior to MSA in Scenario 1, while their relative advantages were greatly weakened in the other. Table 2 reports the average computational overhead of each algorithm in each iteration. We focus on the average number of projections (solving a quadratic program) and evaluations of c (dynamic network loading) performed in each iteration since these are most time-consuming operations. 
The most economic algorithms are heuristic algorithms, which only require to evaluate c once per iteration. For projection-type algorithms, the computation expenses depend on the number of projections required in each iteration. Note that HPM evaluated c much more times than EGM and BPM, because c has to


Fig. 7. The test of the properties of the mapping c in Scenario 1.

Fig. 8. The test of the properties of the mapping c in Scenario 2.


Table 2
The computational overhead per iteration.

Algorithm                    MSA    DSA    BPA    EGM    HPM    PAD    AAD
Scenario 1  Evaluation of c  1.00   1.00   1.00   2.00   5.08   2.42   2.16
            Projection       0.00   0.00   1.00   2.00   2.00   2.42   0.00
Scenario 2  Evaluation of c  1.00   1.00   1.00   2.00   6.14   2.50   2.18
            Projection       0.00   0.00   1.00   2.00   2.00   2.50   0.00

The evaluation of c is also necessary in the line search of PAD and AAD. Yet PAD and AAD performed fewer line search steps per iteration than HPM; recall that these two algorithms reuse the step size from the previous iteration to obtain a better initial guess. Note that the overhead of AAD is roughly double that of the heuristic algorithms, the lowest among the three algorithms that perform line search. Overall, the computational overhead of the non-heuristic algorithms is substantially higher than that of the heuristic algorithms. However, these additional costs may pay off if the convergence performance is significantly improved. Taking Scenario 1 as an example, MSA spent more than 3000 iterations to reach a duality gap of −0.05, while HPM and AAD took about 100 and 300 iterations, respectively. In this case, even a much higher expense per iteration could pay off. We caution, however, that such a comparative computational advantage seems problem-specific.

6. Concluding remarks

This paper studies numerical solution procedures for the morning commute problem, which is formulated as a variational inequality problem. This problem is notoriously difficult to solve, mainly because the flow-cost mapping cannot be cast in a closed form. On the one hand, the absence of a closed form increases the computational expense substantially. On the other hand, it casts doubt on the applicability of most existing VI algorithms. Our findings from the numerical experiments are summarized below:

1. Heuristic algorithms are capable of obtaining reasonably good equilibrium solutions, although they converge slowly and suffer from fluctuations. Heuristic algorithms are also less demanding of computational resources and are easy to implement.
2. Projection-type algorithms without line search did not seem to work well. They easily get trapped in periodic oscillations and fail to converge.
3. All algorithms with line search demonstrated relatively faster and more stable convergence compared with the others. They could outperform heuristic algorithms in terms of computational efficiency if the properties of c are good enough.

Although the aforementioned points offer some basic guidelines for the selection of algorithms, they are far from conclusive. First of all, the flow-cost mapping c in our formulation does not even satisfy the weakest monotonicity condition. Therefore, the convergence of all algorithms considered is not theoretically guaranteed. We have observed that both heuristic algorithms and algorithms with line search tended to converge in all tested examples. Yet their applicability to general problems remains to be verified. Secondly, our limited results suggest that the convergence of the algorithms improves as the monotonicity of c is less violated. Although quantifying such violations is in general difficult, it is worthy of further investigation whether or not the violations are affected by the traffic flow models (discrete vs. continuous), the network structure, the interaction of different O-D pairs, and the approach used to compute the commute cost.

Appendix A

Definition 1. Let ⟨x, y⟩ = Σ_{j=1}^n x_j y_j and let ‖x‖ be a norm of the vector x. A mapping c: K ⊆ R^n → R^n is said to be
1. pseudomonotone on K if for all x, y ∈ K, ⟨x − y, c(y)⟩ ≥ 0 implies ⟨x − y, c(x)⟩ ≥ 0;
2. monotone on K if for all x, y ∈ K, ⟨c(x) − c(y), x − y⟩ ≥ 0;
3. strictly monotone on K if for all x, y ∈ K with x ≠ y, ⟨c(x) − c(y), x − y⟩ > 0;
4. strongly monotone on K with modulus µ if for all x, y ∈ K, ⟨c(x) − c(y), x − y⟩ ≥ µ‖x − y‖².

According to [30], we have: strongly monotone ⇒ strictly monotone ⇒ monotone ⇒ pseudomonotone.


Definition 2. Let ⟨x, y⟩ = Σ_{j=1}^n x_j y_j and let ‖x‖ be a norm of the vector x. A mapping c: K ⊆ R^n → R^n is said to be
1. Lipschitz continuous on K with Lipschitz constant L if for all x, y ∈ K, ‖c(x) − c(y)‖ ≤ L‖x − y‖;
2. co-coercive on K with modulus γ if for all x, y ∈ K, ⟨c(x) − c(y), x − y⟩ ≥ γ‖c(x) − c(y)‖²;
3. a contraction on K with modulus η ∈ (0, 1) if for all x, y ∈ K, ‖c(x) − c(y)‖ ≤ η‖x − y‖.

It is known that every strongly monotone, Lipschitz continuous function is co-coercive, but not vice versa.

Proof of Proposition 1. If f* is a solution to DVI(c, K), then ⟨c(f*), g − f*⟩ ≥ 0 for all g ∈ K, so θ(f*) ≥ 0. On the other hand, by definition, θ(f) ≤ ⟨c(f), f − f⟩ = 0, ∀f ∈ K. Consequently, θ(f*) has to be zero, and obviously this is the maximum possible value of θ(·). Thus, f* maximizes θ on K. Conversely, if f* maximizes θ on K, according to the definition we know that θ(f*) = 0. This ‘‘zero’’ is attained when taking the infimum of ⟨c(f*), g − f*⟩ over g ∈ K. That is to say, ⟨c(f*), g − f*⟩ ≥ 0, ∀g ∈ K. □

Proof of Theorem 4. If f* solves DVI(c, K), then we have

    ⟨c(f*), f − f*⟩ ≥ 0,    ∀f ∈ K.

This is equivalent to

    ⟨G^{-1}c(f*) − rf* + rf*, G(f − f*)⟩ ≥ 0,    ∀f ∈ K.

The above inequality can be further transformed into

    ⟨(f* − (1/r)G^{-1}c(f*)) − f*, G(f − f*)⟩ ≤ 0,    ∀f ∈ K.

This inequality shows that (f* − (1/r)G^{-1}c(f*)) − f* belongs to the normal cone of K at f*. In other words, (f* − (1/r)G^{-1}c(f*)) − f* is ‘‘perpendicular’’ to K at f*. In turn, this means f* is the projection of f* − (1/r)G^{-1}c(f*) onto K with respect to the vector norm defined by G. □

Appendix B

The quadratic program (18) can be solved efficiently using the reduced gradient algorithm (RGA) described below:

Algorithm RGA for solving the quadratic program (18)

• Step 1: Initialization. Set g^0 to a feasible solution and the iteration index n = 0.
• Step 2: Find lmax = arg max_l (g_n^l), where g_n = [g_n^1, . . . , g_n^m]^T.
• Step 3: Find the search direction d_n. First compute d_n^l for all l ≠ lmax using

    d_n^l = (g_n^lmax − a^lmax) − (g_n^l − a^l),            if (g_n^l − a^l) − (g_n^lmax − a^lmax) ≤ 0
    d_n^l = g_n^l [(g_n^lmax − a^lmax) − (g_n^l − a^l)],    otherwise                                     (24)

then set

    d_n^lmax = − Σ_{l=1, l≠lmax}^m d_n^l.

• Step 4: Compute the step size. Let A = Σ_{l=1}^m d_n^l d_n^l, B = Σ_{l=1}^m 2 d_n^l (g_n^l − a^l), and

    λmax = ∞,                                     if d_n^l ≥ 0, ∀l
    λmax = min_l { −g_n^l / d_n^l : d_n^l < 0 },  otherwise.                                              (25)

If −B/(2A) < 0, set λ = 0; if −B/(2A) > λmax, set λ = λmax; otherwise, set λ = −B/(2A).
• Step 5: If λ ≤ ε, stop; otherwise set g_{n+1} = g_n + λ d_n and go to Step 2.
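A direct transcription of Algorithm RGA is sketched below; the uniform feasible starting point, the iteration cap, and the tolerance name are assumptions of this example.

```python
import numpy as np

def rga_project(a, N, eps=1e-8, max_iter=1000):
    """Appendix B: reduced gradient algorithm for min <g - a, g - a> subject to
    sum(g) = N, g >= 0, i.e., the projection Pi_K(a) required in Eq. (18)."""
    m = len(a)
    g = np.full(m, N / m)                          # Step 1: feasible starting point
    for _ in range(max_iter):
        lmax = int(np.argmax(g))                   # Step 2
        red = (g - a) - (g[lmax] - a[lmax])        # reduced gradient components
        d = np.where(red <= 0.0, -red, -g * red)   # Step 3, Eq. (24)
        d[lmax] = -(d.sum() - d[lmax])             # d_lmax = -sum of the others
        A = np.dot(d, d)                           # Step 4, Eq. (25)
        B = 2.0 * np.dot(d, g - a)
        neg = d < 0.0
        lam_max = np.inf if not neg.any() else float(np.min(-g[neg] / d[neg]))
        lam = -B / (2.0 * A) if A > 0.0 else 0.0
        lam = min(max(lam, 0.0), lam_max)
        if lam <= eps:                             # Step 5
            return g
        g = g + lam * d
    return g
```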

References

[1] W.S. Vickrey, Congestion theory and transport investment, American Economic Review 59 (1969) 251–261. [2] J.V. Henderson, Road congestion: A reconsideration of pricing theory, Journal of Urban Economics 1 (1974) 346–355. [3] J.V. Henderson, Economic Theory and Cities, Academic Press, New York, 1977 (Chapter 8).



[4] G. Newell, The morning commute for non-identical travellers, Transportation Science 21 (2) (1987) 74–88. [5] R. Arnott, A. de Palma, R. Lindsey, Schedule delay and departure time decisions with heterogeneous commuters, Transportation Research Record 1197 (1988) 56–57. [6] R. Arnott, A. de Palma, R. Lindsey, Route choice with heterogeneous drivers and group-specific congestion costs, Regional Science and Urban Economics 22 (1992) 71–102. [7] R. Arnott, A. de Palma, R. Lindsey, The welfare effects of congestion tolls with heterogeneous commuters, Journal of Transport Economics and Policy 28 (2) (1994) 139–161. [8] X. Chu, Endogenous trip scheduling: The henderson approach reformulated and compared with the vickery approach, Journal of Urban Economics 37 (1994) 324–343. [9] S. Mun, Traffic jams and the congestion toll, Transportation Research 28B (1994) 365–375. [10] H. Yang, H.-J. Huang, Analysis of the time-varying pricing of a bottleneck with elastic demand using optimal control theory, Transportation Research 31B (1997) 425–440. [11] H. Mahmassani, R. Herman, Dynamic user equilibrium departure time and route choice on idealized traffic arterials, Transportation Science 18 (1984) 362–384. [12] R. Arnott, A. de Palma, R. Lindsey, Departure time and route choice for routes in parallel, Transportation Research 24B (1990) 209–228. [13] R.M. Braid, Peak-load pricing of a transportation route with an unpriced substitute, Journal of Urban Economics 40 (179–197) (1996). [14] M. Kuwahara, Equilibrium queuing patterns at a two-tandem bottleneck during the morning peak, Transportation Science 24 (3) (1990) 217–229. [15] R. Arnott, A. de Palma, R. Lindsey, A dynamic traffic equilibrium paradox, Transportation Science 27 (2) (1993) 148–160. [16] T.L. Friesz, D. Bernstein, T.E. Smith, R.L. Tobin, B.W. Wei, A variational inequality formulation of the dynamic network equilibrium problem, Operation Research 41 (1993) 179–191. [17] M.J. Smith, A new dynamic traffic model and the existence and calculation of dynamic user equilibria on congested capacity-constrained road networks, Transportation Research 26B (1993) 49–63. [18] B.W. Wei, R.L. Tobin, T.L. Friesz, D. Bernstein, A discrete time, nested cost operator approach to the dynamic network user equilibrium problem, Transportation Science 29 (1995) 79–92. [19] Bin Ran, Randolph W. Hall, David. Boyce, A link-based variational inequality model for dynamic departure time/route choice, Transportation Research 30B (1996) 31–46. [20] Huey-Kuo Chen, Che-Fu Hsueh, A model and an algorithm for the dynamic user-optimal route choice problem, Transportation Research 32B (1998) 219–234. [21] W.Y. Szeto, H.K. Lo, A cell-based simultaneous route and departure time choice model with elastic demand, Transportation Research 38B (2004) 593–612. [22] H.-J. Huang, W.H.K. Lam, Modelling and solving dynamic user equilibrium route and departure time choice problem in network with queues, Transportation Research 36B (2002) 253–273. [23] B.W. Wie, R.L. Tobin, M. Carey, The existence, uniqueness and computation of an arc-based dynamic network user equilibrium formulation, Transportation Research 36B (2002) 897–918. [24] C.O. Tong, S.C. Wong, A predictive dynamic traffic assignment model in congested capacity-constrained road networks, Transportation Research 34B (2000) 625–644. [25] J.H. Wu, M. Florian, Y.W. Xu, J.M. 
Rubio-Ardannaz, A projection algorithm for the dynamic network equilibrium problem, in: Proceedings of the International Conference on Traffic and Transportation Studies, Beijing, China, July 1998, pp. 379–390. [26] Y. Nie, H.M. Zhang, Solving the dynamic user optimal assignment problem considering queue spillback, Networks and Spatial Economics, 2007, in press (doi:10.1007/s11067-007-9022-y). [27] J.G. Wardrop, Some theoretical aspects of road traffic research, Proceedings of the Institute of Civil Engineers, Part II 1 (1952) 325–378. [28] R. Arnott, A. de Palma, R. Lindsey, Recent developments in the bottleneck model, in: Road Pricing, Traffic Congestion and the Environment, Edward Elgar Publishing Inc., Massachusetts, USA, 1998, pp. 79–112. [29] K.A. Small, The scheduling of consumer activities: Work trips, American Economic Review 72 (1982) 467–479. [30] F. Facchinei, J.-S. Pang, Finite-Dimensional Variational Inequalities and Complementarity Problems, volume I, II, Springer, 2003. [31] Y. Nie, J.T. Ma, H.M. Zhang, A polymorphic dynamic network loading model, Computer-Aided Civil and Infrastructure Engineering 23 (2) (2008) 86–103. [32] C.F. Daganzo, The cell transmission model, part II: Network traffic, Transportation Research 29B (1995) 79–93. [33] Carlos F. Daganzo, Queue spillovers in transportation networks with a route choice, Transportation Science 32 (1998) 3–11. [34] D.W. Hearn, The gap function of a convex program, Operations Research Letters 1 (1982) 67–71. [35] P. Marcotte, A new algorithm for solving variational inequality, with applications to the traffic assignment problem, Mathematical Programming 33 (1985) 339–351. [36] M. Fukushima, Equivalent differentiable optimization problems and descent methods for asymmetric variational inequality problems, Mathematical Programming 53 (1992) 99–110. [37] D.L. Zhu, P. Marcotte, Modified descent direction methods for solving the monotone variational inequality problem, Operations Research Letters 14 (1993) 111–120. [38] Y. Sheffi, Urban Transportation Networks: Equilibrium Analysis with Mathematical Programming Methods, Prentice Hall, Englewood cliffs, NJ, 1985. [39] G.M. Korpelevich, The extragradient method for finding saddle points and other problems, Matekon 12 (1976) 747–756. [40] I.V. Konnov, Combined Relaxation Methods for Variational Inequalities, Springer-Verlag, Berlin, 2001.