
Facultés Universitaires Notre-Dame de la Paix de Namur

Faculté des Sciences


Département de Mathématique

On the inverse
shortest path problem

Didier Burton

Doctoral dissertation presented in 1993


under the guidance of Ph.L. Toint
to obtain a Ph.D. Degree in Science
To the Memory of My Parents
Preface

Although their motivation and development originally occurred at different times,
graph theory and optimization are fields of mathematics which nowadays have many connections.
Early on, the use of graphs suggested intuitive approaches to both pure and applied problems.
Optimization and more precisely mathematical programming have steadily grown with the size
and diversity of the problems considered. Available computer hardware and hence computer
science certainly contributed to many of these developments. Optimization techniques therefore
often supplied a suitable algorithmic framework for solving problems arising from graph theory.
This doctoral thesis is about such a connection between graph theory and optimization.
My purpose in this work is to analyse the inverse shortest path problem. I was introduced to
this problem during a two-year research period (supported by the Region Wallonne), whose aim
was to model the behaviour of road network users, particularly in urban centres. Traffic
modelling revealed the importance of accurate estimates of perceived travel costs in a road network.
This experience motivated the present research.
I am very grateful to my advisor, Professor Philippe Toint. His invaluable guidance,
availability and judicious advice were much appreciated. In addition, I enjoyed the opportunities he
gave me to interact with other professors and researchers abroad.
I especially want to thank Bill Pulleyblank (IBM T.J. Watson Research Center, Yorktown
Heights, USA) for his collaboration, during my visit to Yorktown Heights. I am also grateful
to Laurence Wolsey (CORE, Louvain-la-Neuve, Belgium), Michel Minoux (Université Pierre et
Marie Curie, Paris, France), Tijmen Jan Moser (Rijksuniversiteit, Utrecht, The Netherlands) and
Annick Sartenaer, Michel Bierlaire and Daniel Goeleven from the Department of Mathematics
(FUNDP, Namur) for very interesting discussions and suggestions for this work.
I wish to express my thanks to the members of my advisory board who kindly agreed to
examine this work: F. Callier, J.-J. Strodiot (both from FUNDP, Namur), L. Wolsey (CORE,
Louvain-la-Neuve) and M. Minoux (Université Pierre et Marie Curie, Paris, France). I am also
indebted to S. Vavasis (Cornell University, USA) and anonymous referees who contributed to
improving parts of my thesis.
Michel Vause (GRT, FUNDP, Namur) has supplied useful tools and hints for writing and
illustrating this text.
The Department of Mathematics of the Facultés Universitaires Notre-Dame de la Paix (Namur)
hosted me during my thesis work, and partly supported participation in scientific meetings
in London (UK) and Chicago (USA). The Transportation Research Group (FUNDP, Namur)


provided the computer hardware used for the numerical experiments, and contributed to the
expenses of several trips abroad. The Communauté Française de Belgique also gave financial
support for my mission to London (UK).
Finally and most importantly, the Belgian National Fund for Scientific Research supported
me during the preparation of this thesis.

Namur, December 1992


Didier Burton
Contents

Preface  i

1 Introduction  1
  1.1 The graph theory context  1
  1.2 Motivating examples  2
    1.2.1 Traffic modelling  2
    1.2.2 Seismic tomography  3
  1.3 The inverse shortest path problem  3
  1.4 Solving the problem  5
    1.4.1 A shortest path method  5
    1.4.2 An optimization framework  5
    1.4.3 Solving an instance of inverse shortest path problems  6

2 The shortest path problem  7
  2.1 Terminology and notations  7
  2.2 A specific shortest path problem  8
    2.2.1 The problem type  9
    2.2.2 The graph type  9
    2.2.3 The strategy type  10
  2.3 Shortest path tree algorithms  11
    2.3.1 Shortest path trees  11
    2.3.2 Bellman's equations  12
    2.3.3 Label-setting and label-correcting principles  14
    2.3.4 Search strategies  15
    2.3.5 Search strategies for label-setting and label-correcting methods  15
  2.4 Label-correcting algorithms  16
    2.4.1 L-queue algorithm  16
    2.4.2 L-deque algorithm  17
    2.4.3 L-threshold  17
  2.5 Label-setting algorithms  18
    2.5.1 Dijkstra's algorithm  18
    2.5.2 Dial's algorithm  19
    2.5.3 Binary heap algorithm  20
  2.6 An auction algorithm  22
    2.6.1 Basic concepts  22
    2.6.2 Description of Bertsekas' algorithm  23
    2.6.3 Properties of the algorithm  24
    2.6.4 Algorithm's performance  25
  2.7 An algorithm using an updating technique  26
    2.7.1 The shortest path method as a linear program  26
    2.7.2 Solving the problem from another root  27
    2.7.3 Computational performance  28
  2.8 A shortest path method for the inverse problem  28

3 Quadratic Programming  30
  3.1 Terminology and notations  30
    3.1.1 Triviality and degeneracy  30
    3.1.2 Convexity  31
  3.2 A specific quadratic problem  32
    3.2.1 The objective function  33
    3.2.2 The feasible region  33
    3.2.3 Searching for a strictly convex QP method  36
  3.3 Note on the complexity of convex QP methods  36
    3.3.1 Solving a problem  36
    3.3.2 Complexity classes  37
    3.3.3 Convex quadratic programming  38
  3.4 Resolution strategies  38
    3.4.1 Primal and dual methods  38
    3.4.2 Simplex-type and active set methods  41
    3.4.3 Choosing a particular method  42
  3.5 The Goldfarb and Idnani method  44
    3.5.1 Basic principles and notations  44
    3.5.2 The GI algorithm  44
    3.5.3 Linear independence of the constraints  49
    3.5.4 Linear dependence of the constraints  52
    3.5.5 Finite termination of the GI algorithm  54

4 Solving the inverse shortest path problem  55
  4.1 The problem  55
  4.2 Algorithm design  56
    4.2.1 The Goldfarb-Idnani method for convex quadratic programming  56
    4.2.2 Constraints in the active set  58
    4.2.3 The dual step direction  59
    4.2.4 Interpretation of the dual step direction  61
    4.2.5 Determination of the weights  61
    4.2.6 Modifying the active set  64
    4.2.7 The algorithm  65
    4.2.8 Nonoriented arcs  66
    4.2.9 Note  66
  4.3 Preliminary numerical experience  66
    4.3.1 The implementation  66
    4.3.2 The tests  67
  4.4 Complexity of the inverse shortest paths problem  69

5 Handling correlations between arc weights  70
  5.1 Motivation  70
    5.1.1 Transportation research  71
    5.1.2 Seismic tomography  72
  5.2 The formal problem  73
    5.2.1 Classes and densities  73
    5.2.2 Shortest paths constraints  74
    5.2.3 Constraints on the class densities  75
    5.2.4 The inverse problem  76
  5.3 The uncorrelated inverse shortest path problem  77
  5.4 An algorithm for recovering class densities  79
    5.4.1 Islands, dependent sets and their shores  80
    5.4.2 The dual step direction  80
    5.4.3 Determination of the class densities  83
    5.4.4 The primal step direction  86
    5.4.5 The maximum steplength to preserve dual feasibility  88
    5.4.6 The algorithm  88
  5.5 Numerical experiments  90
    5.5.1 Implementation remarks  90
    5.5.2 Correlated method - uncorrelated method  90
    5.5.3 Selecting violated constraints  92

6 Implicit shortest path constraints  95
  6.1 Motivating examples  95
  6.2 The problem  96
  6.3 The complexity of the problem  97
    6.3.1 The convexity of the problem  97
    6.3.2 The 3-SAT problem as an inverse shortest path calculation  98
  6.4 An algorithm for computing a local optimum  101
    6.4.1 Computing a starting point  102
    6.4.2 Updating the explicit constraint description  102
    6.4.3 Reoptimization  103
    6.4.4 The algorithm  103
    6.4.5 Some properties of the algorithm  105
  6.5 The reoptimization procedure  107
    6.5.1 Notations  107
    6.5.2 How to reoptimize  108
  6.6 Some numerical experiments  109
    6.6.1 Implementation details  110
    6.6.2 Tests  110

7 Conclusion and perspectives  114

A Symbol Index  116

Bibliography  119

List of Tables

2.1 Search strategies for labelling methods.  15
2.2 Illustration of Bertsekas' algorithm.  24
4.1 The inverse shortest path test examples  67
4.2 Results obtained on the test problems  68
5.1 Test problems involving class densities  91
5.2 Comparative test results for the correlated and uncorrelated algorithms  91
5.3 Test problems with equality constraints  93
5.4 Test results on equality constraints  93
6.1 Test examples and their characteristics  110
6.2 Results for the test problems  111

List of Figures

2.1 A cycle and a weak cycle.  8
2.2 A weighted graph, a shortest path tree and a shortest spanning tree.  12
2.3 A binary heap.  21
2.4 A small graph as illustration for the auction algorithm.  24
2.5 A small graph with a cycle of small cost.  26
2.6 Some shortest path tests results  29
3.1 A small example for proving the non-convexity.  35
4.1 A first example  58
4.2 Iterations per problem size and shortest paths calculation  68
5.1 The first example involving correlations between arc weights  72
5.2 The graph generated from a discretization  73
5.3 An island  78
5.4 The correlated algorithm: iterations per problem size and shortest paths calculation  92
5.5 Algorithm variants: iterations per problem size and shortest paths calculation  94
6.1 A small graph  98
6.2 The representation of xi  99
6.3 The subgraph associated with clause c  100
6.4 A small example showing path combinations  104
6.5 Solving P(A) and P(A, ): one iteration  109
1 Introduction

This chapter introduces the inverse shortest path problem. We first present the problem's context
together with its motivating applications. We then state the formal problem and specify the
underlying mathematical tools that we have exploited. These tools will be analysed in the chapters
that follow.

1.1 The graph theory context


Graph theory was initially developed as an abstract mathematical theory, and most of its
applications at first involved solving combinatorial puzzles like those of Euler (1736) [39] and
Hamilton (1856) [61]. This research laid the foundations of graph theory, together with the
work of Kirchhoff (1847) [71] and Cayley (1857) [21], who independently developed the theory
of trees. More recently, graph theory has become a favoured modelling tool for approaching
questions arising in many fields of applied mathematics. In particular, graph theory is playing
a significant role in computer science, where appraising algorithm complexities [29, 50] and
determining efficient data structures [107], for instance, are of special interest. A partial list of
applications that take significant advantage of developments in graph theory includes routing
problems [13], network modelling [45], computerized tomography [91], location problems [62] and
problems of society [98]. In particular, circulation problems around and within cities motivated
studies aimed at understanding problems relating to traffic [101]; these studies again profit from
the intuitive use of graphs. This incomplete list of references already shows the considerable
contributions of graph theory to solving optimization problems.
Loosely speaking, a graph consists of "nodes" connected by "arcs" of a certain "length". One
or more successive arcs form a "path". More formal definitions are given below (in Section 1.3).
A famous problem in graph theory is that of finding shortest paths in networks, given arc
lengths. This problem, which consists of finding paths of minimum length between some origin(s)
and some destination(s), naturally arises in the analysis of transportation, communication and
distribution problems [45]. Shortest path techniques are also applied in fields as diverse as traffic
modelling [38] and computerized tomography [86]. Consequently, the shortest paths problem
has become fundamental in graph theory and is much studied in the literature. Very efficient
algorithms have been proposed during the last three decades to solve the shortest paths problem
(see [5, 8, 34, 43, 67, 85]).
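For concreteness, the direct problem that these algorithms solve can be sketched with a standard Dijkstra-style routine. The adjacency-list encoding below is an assumption made for this illustration only; it is not the implementation studied in Chapter 2.

```python
import heapq

def dijkstra(adj, source):
    """Shortest-path distances from `source` in a weighted oriented graph.

    `adj` maps each vertex to a list of (neighbour, weight) pairs, with all
    weights nonnegative, as Dijkstra's method requires.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry, a shorter path was already found
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

adj = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 1.5)], "c": []}
print(dijkstra(adj, "a"))  # {'a': 0.0, 'b': 1.0, 'c': 2.5}
```

Here the two-arc route a-b-c (cost 2.5) beats the direct arc (cost 4.0), which is exactly the kind of comparison the inverse problem must reason about.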
However, models based on shortest paths do not always reflect observations accurately. These
inaccuracies are often caused by an inadequate knowledge of the arc weights or lengths used in the
shortest path calculations. One way to overcome this difficulty and to improve one's knowledge
of the arc weights is to consider the inverse problem. Solving an inverse shortest path problem
consists of finding weights associated with the arcs of a network that are as close as possible to
a priori estimated values, and that are compatible with the observations of some shortest paths
in the network.

1.2 Motivating examples


The best way to introduce the inverse shortest path problem is probably by considering
applications. Thus we consider examples drawn from mathematical traffic modelling and
computerized tomography.

1.2.1 Traffic modelling


In this field of applied mathematics, it is generally assumed that the users of a given network of
roads optimize some criteria to choose their trip from an origin to a destination. These criteria
appear to depend, on the one hand, on a priori known costs associated with the network links, and
on the other hand, on individually perceived costs [32], these costs being evaluated in time, distance,
money or some other more complex measure. The road network planners are obviously extremely
interested in the distribution of the traffic along the actual paths taken by the users. Shortest path
techniques have often been used to determine these paths, for many efficient algorithms have been
developed for this problem. However, these procedures fail to reflect the actual behaviour of users
over a network as a result of incomplete knowledge about perceived costs. The precise assessment
of the cost of a route (in the user's mind) is complex and often different from that used by the
planners: network users choose perceived shortest routes for their journeys [13]. Hence recovering
the perceived arc costs is an important step in the analysis of network users' behaviour. It is
therefore very useful to know some of the routes that are actually used (and thus considered as
shortest) and then to incorporate this knowledge into the model, modifying the a priori costs so
as to guarantee that the given route is indeed shortest in the modified network. Care must also
be exercised to avoid large changes in the costs as compared to their a priori values.
Although these perceived routes may be observable, their precise description might vary with
time and across individuals, and the travel cost is usually subject to some estimation. This
provides bounds on the travel cost of (shortest) paths from an origin to a destination, the actual
path between the origin and the destination remaining unknown.
This is an instance of the inverse shortest path problem. One is given a network represented
by a graph (with oriented arcs to account for one-way links) and corresponding travel costs on
the arcs. Naturally, travel costs are required to be positive or zero. The question is to modify
these costs as little as possible to ensure, on the one hand, that some given paths in the graph
are shortest paths between their origin and destination, and on the other hand, that the total
cost of shortest paths between given origins and destinations is bounded by given values.

1.2.2 Seismic tomography


Another interesting example is in seismic tomography (see for example [90, 91, 104, 116]). The
network represents a discretization of the geologic zone of interest into a large number of "cells",
and the costs of the arcs represent the transmission time of certain seismic waves from one cell
to the next. According to Fermat's principle, these waves propagate along rays that follow the
shortest path, as a function of time, across the earth's crust. Earthquakes are then observed
and the arrival time of the resulting seismic perturbations is recorded at various observation
stations on the surface. The question is to recover the geological structure by reconstructing the
transmission times between the cells from the observation of shortest-time waves and an a priori
knowledge of the geological nature of the zone under study.
Of course, the propagation time of the rays between a known source and a known receiver
cannot be measured without some error, and the ray paths themselves usually remain unknown.
This provides bounds on the duration of seismic rays.
This is again an application of the inverse shortest path problem.
The determination of the internal transmission properties of an inaccessible zone from outside
measurements is a very common preoccupation in many scientific fields. However, we believe that,
because of their practical importance, the two examples above are enough to motivate the study
of the inverse shortest path problem.

1.3 The inverse shortest path problem


We define a weighted oriented graph as the triple $(V, A, w)$, where $(V, A)$ is an oriented graph
with $n$ vertices and $m$ arcs, and where $w$ is a set of nonnegative weights $\{w_i\}_{i=1}^{m}$ associated with
the arcs. We denote the vertices of $V$ by $\{v_k\}_{k=1}^{n}$ and the arcs of $A$ by $\{a_j = (v_{s(j)}, v_{t(j)})\}_{j=1}^{m}$,
with $s(j)$ being the index of the initial or source vertex of the $j$-th arc and $t(j)$ the index of its
final vertex or target. A path in an oriented graph is any sequence of arcs where the final vertex
of one arc is the initial vertex of the next. An acyclic or simple path is a path which does not
pass through the same vertex more than once.
We assume that such a weighted oriented graph $\bar{G} = (V, A, \bar{w})$ is given, together with

- a set of $n_E$ "explicit"[1] acyclic paths
\[
  p_j = (a_{j_1}, a_{j_2}, \ldots, a_{j_{l(j)}}) \quad (j = 1, \ldots, n_E),
  \tag{1.1}
\]
  where $l(j)$ is the number of arcs in the $j$-th path (its length), and where
\[
  t(j_i) = s(j_{i+1}) \quad \text{for } i = 1, \ldots, l(j) - 1;
  \tag{1.2}
\]

[1] In the sense that the paths are defined as explicit successions of arcs, in contrast with the implicitly defined paths
in the next item.
- a set of $n_I$ origin-destination pairs $(o_j, d_j)$, $j = 1, \ldots, n_I$, for defining paths "implicitly"
  between the origins and the destinations.

If we define $\bar{w}$ as the vector in the nonnegative orthant of $\mathbb{R}^m$ whose components are the
given initial arc weights $\{\bar{w}_i\}$, the problem is then to determine $w$, a new vector of arc weights,
and hence a new weighted graph $G = (V, A, w)$, such that
\[
  \min_{w \in \mathbb{R}^m} \|w - \bar{w}\|
  \tag{1.3}
\]
is achieved, for some given norm $\|\cdot\|$, under the constraints that
\[
  w_i \geq 0 \quad (i = 1, \ldots, m),
  \tag{1.4}
\]
\[
  p_j \text{ is a shortest path in } G, \quad j = 1, \ldots, n_E,
  \tag{1.5}
\]
and that
\[
  0 \leq l_j \leq \sum_{a \in p_j^1(w)} w_a \leq u_j, \quad j = 1, \ldots, n_I,
  \tag{1.6}
\]
where $p_j^1(w)$ is a[2] shortest path in $G$ (with respect to the weights $w$) starting at node $o_j$ and
arriving at the node $d_j$[3]. The values $l_j$ and $u_j$ are lower and upper bounds on the cost of the
shortest path from $o_j$ to $d_j$, respectively. For consistency, we impose that $l_j \leq u_j$ ($j = 1, \ldots, n_I$)
and we allow $l_j$ to be zero and $u_j$ to be plus infinity. Constraints (1.6) are bound constraints on
the costs of shortest paths between origin-destination pairs.
The formulation (1.3)-(1.6) defines a continuous optimization problem. The decision variables,
given by $w$, are chosen according to the objective (1.3), i.e. differing as little as possible
from $\bar{w}$, among feasible values determined by the set of constraints (1.4)-(1.6). We now make
some observations about the constraints and the objective function.
With the exception of the nonnegativity constraints (1.4), we introduce two unusual types
of constraints: the first constraints (1.5) will be referred to as explicit shortest path constraints,
and the second constraints (1.6) are called implicit shortest path constraints. The latter constraints
define the paths $p_j^1(w)$ ($j = 1, \ldots, n_I$) implicitly, by their origin $o_j$ and destination $d_j$ and the
weighted graph $G$. The actual path taken from $o_j$ to $d_j$ clearly depends on the set of weights
associated with the arcs in $A$. These constraints are unfamiliar in the sense that the upper bound
of one implicit shortest path constraint defines one linear constraint that is to be chosen among an
exponential number of linear constraints. Moreover, the chosen one is a priori unidentified. This
will be clarified later. On the other hand, the explicit shortest path constraints involve the paths
$p_j$ ($j = 1, \ldots, n_E$) that are explicitly defined as a succession of arcs specified by (1.1). These
constraints are uncommon because one explicit shortest path constraint defines an exponential
number of "fixed" linear constraints. The lower bounds of the implicit constraints also involve
an exponential number of "fixed" linear constraints. Again, this will be analyzed in the chapters
that follow.
[2] The shortest path is not necessarily unique.
[3] The meaning of the superscript 1 in $p_j^1(w)$ is to indicate that the shortest path is considered, as opposed to
the second shortest. This notation will be important in Chapter 6.
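To make the three constraint types concrete, the following sketch checks a candidate weight vector against constraints of the form (1.4)-(1.6). The arc-list encoding and the helper names are assumptions made for this example only, not the data structures used later in the thesis.

```python
import heapq

def shortest_cost(arcs, w, s, t):
    """Cost of a shortest path from s to t, where arcs is a list of (u, v)
    pairs and w[i] is the nonnegative weight of arc i (plain Dijkstra)."""
    adj = {}
    for i, (u, v) in enumerate(arcs):
        adj.setdefault(u, []).append((v, w[i]))
    dist, heap = {s: 0.0}, [(0.0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, wi in adj.get(u, []):
            if d + wi < dist.get(v, float("inf")):
                dist[v] = d + wi
                heapq.heappush(heap, (d + wi, v))
    return dist.get(t, float("inf"))

def feasible(arcs, w, explicit_paths, implicit_pairs):
    """True iff w satisfies (1.4): w >= 0; (1.5): each explicit path, given
    as a list of arc indices, is a shortest path between its endpoints; and
    (1.6): the shortest cost of each ((o, d), (l, u)) pair lies in [l, u]."""
    if any(wi < 0 for wi in w):                              # (1.4)
        return False
    for path in explicit_paths:                              # (1.5)
        cost = sum(w[i] for i in path)
        s, t = arcs[path[0]][0], arcs[path[-1]][1]
        if cost > shortest_cost(arcs, w, s, t):
            return False
    for (o, d), (lo, up) in implicit_pairs:                  # (1.6)
        if not lo <= shortest_cost(arcs, w, o, d) <= up:
            return False
    return True

# Two routes from node 0 to node 2: arcs 0 and 1 form a path of cost 3,
# while arc 2 is a direct link of cost 4.
arcs = [(0, 1), (1, 2), (0, 2)]
print(feasible(arcs, [1.0, 2.0, 4.0], [[0, 1]], [((0, 2), (1.0, 3.5))]))  # True
```

Lowering the direct arc's weight below 3 would violate (1.5), since the explicit path would no longer be shortest; this is exactly the kind of violation the algorithms of Chapter 4 detect and repair.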
Our inverse problem is to reconstruct the arc weights subject to the constraints described
above. It is readily observed that, as is the case in many inverse problems, the constraints do not
uniquely determine the arc weights: the reconstruction problem is underdetermined. Fortunately,
it often happens in applications that some additional a priori knowledge of the expected arc
weights is available. This additional information then provides stability and uniqueness of the
inversion (see [105], for instance). This a priori information may be obtained either from "direct"
models, for which there is no problem of uniqueness, or from the solution of a previous inverse
problem with different data. In order to ensure the uniqueness of the solution of our inverse
shortest path problem, we force $w$ to be as close as possible to the a priori information contained
in $\bar{w}$.
As far as we are aware, the inverse shortest path problem has never been formally stated nor
studied in the scientific literature.

1.4 Solving the problem


Before giving a precise algorithmic approach to solving the problem stated in (1.3)-(1.6), we
need to examine two fundamental tools that are to be used in our context: the first is taken from
graph theory and the second from mathematical programming.

1.4.1 A shortest path method


Solving the inverse shortest path problem requires the use of a method giving solutions to the
direct problem. Indeed, constraints (1.5) and (1.6) involve shortest path calculations in their
description. The next chapter discusses the choice of a method for finding shortest paths in a
weighted oriented graph that is suitable for inverse shortest path applications.

1.4.2 An optimization framework


A number of interesting variants of our problem's formulation can be constructed by considering
various norms in (1.3). In particular the $\ell_1$, $\ell_2$ and $\ell_\infty$ norms seem attractive. Since, for all
$x \in \mathbb{R}^m$,
\[
  \|x\|_\infty \leq \|x\|_2 \leq \sqrt{m}\,\|x\|_\infty
  \tag{1.7}
\]
and
\[
  \frac{1}{\sqrt{m}}\,\|x\|_1 \leq \|x\|_2 \leq \|x\|_1,
  \tag{1.8}
\]
where $\|\cdot\|_1$, $\|\cdot\|_2$ and $\|\cdot\|_\infty$ are norms on $\mathbb{R}^m$, the $\ell_1$, $\ell_2$ and $\ell_\infty$ norms are equivalent[4] [56].
Throughout this thesis, we will restrict ourselves to the $\ell_2$ norm, or least-squares norm, mostly
because it is widely used, has useful statistical properties, and leads to tractable computational
methods. Note that choosing the $\ell_1$ norm would lead to linear programming, another interesting
approach.

[4] This does not mean that they give identical results.
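The inequalities (1.7) and (1.8) are easily checked numerically. The quick experiment below (illustrative only) verifies them on random vectors:

```python
import math
import random

random.seed(0)
m = 7
for _ in range(1000):
    x = [random.uniform(-10.0, 10.0) for _ in range(m)]
    n1 = sum(abs(xi) for xi in x)              # l1 norm
    n2 = math.sqrt(sum(xi * xi for xi in x))   # l2 norm
    ninf = max(abs(xi) for xi in x)            # l-infinity norm
    assert ninf <= n2 <= math.sqrt(m) * ninf + 1e-12        # (1.7)
    assert n1 / math.sqrt(m) - 1e-12 <= n2 <= n1 + 1e-12    # (1.8)
print("inequalities (1.7) and (1.8) hold on all samples")
```

The small tolerances guard against floating-point rounding; the inequalities themselves are exact.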
One could also modify the problem by introducing other functions of the $\{w_i\}$ to minimize.
These objective functions may be linear, quadratic or generally nonlinear. Investigation of these
alternatives is beyond the scope of this thesis.
As a consequence, we can write the objective function (1.3) of our problem as
\[
  \min_{w \in \mathbb{R}^m} \frac{1}{2} \sum_{i=1}^{m} (w_i - \bar{w}_i)^2,
  \tag{1.9}
\]
where the factor $\frac{1}{2}$ is chosen for convenience.
Solving (1.9) subject to (1.4)-(1.6) requires the algorithmic framework of quadratic programming (QP).
Chapter 3 is devoted to the analysis of a particular QP method.
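To see why (1.9) leads to quadratic programming, consider a toy case with a single linear inequality in place of the shortest path constraints: the minimizer of (1.9) is then the Euclidean projection of the a priori weights onto a halfspace, which has a closed form. The sketch below illustrates this simplified case only; it is not the Goldfarb-Idnani method analysed later.

```python
def project_halfspace(wbar, a, b):
    """argmin (1/2) * ||w - wbar||^2  subject to  a.w >= b.

    Closed form: if wbar is feasible it is optimal; otherwise the optimum
    is the Euclidean projection of wbar onto the hyperplane a.w = b.
    """
    dot = sum(ai * wi for ai, wi in zip(a, wbar))
    if dot >= b:                 # wbar already satisfies the constraint
        return list(wbar)
    scale = (b - dot) / sum(ai * ai for ai in a)
    return [wi + scale * ai for ai, wi in zip(a, wbar)]

# One constraint saying "arcs 1 and 2 together must cost at least 5":
print(project_halfspace([1.0, 2.0, 4.0], [1.0, 1.0, 0.0], 5.0))
# [2.0, 3.0, 4.0]
```

The deficit of 2 is spread equally over the two arcs appearing in the constraint, and the uninvolved arc is left untouched; handling many such constraints simultaneously, with the active ones unknown in advance, is precisely what the QP machinery of Chapter 3 provides.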

1.4.3 Solving an instance of inverse shortest path problems


Our specific problem (1.9), (1.4)-(1.6) will be analysed in three steps.
We first consider the inverse shortest path problem with explicit shortest path constraints,
that is, the problem defined by (1.9), (1.4) and (1.5). The shortest path constraints will be
examined and the concept of an "island" will be introduced to characterize their violation. The
framework of Goldfarb and Idnani's convex QP method will be specialized to our context. An
algorithm will be proposed and tested on various examples. Chapter 4 deals with this inverse
shortest path problem.
In many applications, one can observe correlations between arc weights. It is then interesting
to generalize the inverse shortest path problem to take these correlations into account. In
the uncorrelated problem, the variables are the arc weights. Correlations between these weights
introduce more aggregated variables called classes or cells, which partition the set of arcs. Our
new variables then consist of the "densities" of these classes, the weights of the arcs within each
class being calculated from the corresponding class density. This new formulation reduces the
number of variables but involves more restrictions, that is, more constraints. This is the subject
of Chapter 5. We will establish the results that allow the handling of shortest path constraints in
the space of class densities. A generalized algorithm will be proposed and its numerical
performance will be compared to that of the uncorrelated algorithm of Chapter 4. Different strategies
for handling constraints will be proposed and discussed in the correlated case.
Finally, in Chapter 6, we discuss the inverse shortest path problem with implicit shortest
path constraints, particularly the difficult case where upper bounds on the costs of shortest paths
are considered. We will see that the inclusion of such constraints in an inverse shortest path
application can give rise to non-convexity. We will show that this problem is NP-complete. An
algorithm will be proposed for nding a locally optimum solution to the problem. We will also
provide a stability analysis for the solution found. Finally, numerical experiments with this method
will be presented.
We note that Appendix A summarizes the notation that is used in Chapters 4{6.
2

The shortest path problem

This chapter considers methods for solving the shortest path problem. It does not cover the matter
completely and thoroughly, but presents a few algorithms with variants that are applicable in
the context of inverse shortest path applications. A particular method will be preferred for its
appealing computational complexity.

2.1 Terminology and notations


Throughout this chapter we use the following definitions and notations, which are those of Chapter 1
but presented here in more detail.
In this thesis, we are concerned with graphs arising in modelling systems. Many shortest path
oriented models, such as those dedicated to best routing in transport networks, require directed
graphs, that is, graphs in which edges have directions. We adopt for the most part the terminology
and notation of Deo [29], which are commonly found in the literature (see [25, 26, 30, 63, 64],
for instance). A directed graph or digraph consists of a finite set V = {v_k}_{k=1}^n of vertices and a
set A = {a_l}_{l=1}^m of ordered pairs of vertices called arcs, where a_l = (v_{s(l)}, v_{t(l)}) with s(l) being
the index of the source or initial vertex of the l-th arc and t(l) the index of its terminal or final
vertex. A digraph is also called an oriented graph, though some authors still make a distinction
between both terms, keeping "oriented" to qualify digraphs that have at most one arc between a
pair of vertices. We define a weighted oriented graph G as the triple (V, A, w), where (V, A) is an
oriented graph with n vertices and m arcs, and where w is a set of real numbers called weights,
{w_l}_{l=1}^m, associated with the arcs. We assume that G is simple: multiple arcs between ordered
pairs of vertices (parallel arcs) and loops (arcs of the form (v, v)) are not allowed.
In this chapter, it will be convenient to refer to arcs (and weights) by a double index addressing
their initial and final vertices. Writing a_ij supposes that there exists l ∈ {1, ..., m} such that
i = s(l) and j = t(l); the associated weight (w_l) is then denoted by w_ij. If a_ij is not defined
(a_ij ∉ A), then w_ij is set to +∞. Note that "the" arc a_ij is unambiguously defined since G is
simple. Using these notations, we point out that the graph G is not necessarily symmetric, that is,
w_ij need not equal w_ji for all i, j ∈ V (i ≠ j) such that w_ij and w_ji are finite.

the shortest path problem 8

Finally we adopt the following terminology about reachability: an (oriented) path¹ p in an
oriented graph G is a sequence of vertices (v_1, v_2, ..., v_k) of V such that (v_i, v_{i+1}) is an arc in A
for all i = 1, ..., k − 1; if the graph is weighted, the cost of a path is the sum of the weights of
its arcs. This should not be confused with the length of p, denoted by l(p), which is the number of arcs in
p (here, l(p) = k − 1). The path p is an acyclic or simple path if v_1, v_2, ..., v_k are distinct, and
an (oriented) cycle if v_1 = v_k. A weak path and a weak cycle in an oriented graph are a path
and a cycle in the corresponding undirected graph, respectively; see Figure 2.1 for an example of a
weak cycle. Note that weak paths and weak cycles include² oriented paths and oriented cycles,
respectively.

Figure 2.1: A cycle and a weak cycle.

Finding a shortest path in G between two vertices of V then consists of finding a path that
minimizes Σ_{(i,j)∈p} w_ij, where p is any path between the two vertices.
The complexity of an algorithm refers to the amount of resources required by its running computation.
The worst-case complexity of an algorithm is the least upper bound on its complexity.
We shall use Landau's symbols to characterize the computational complexity of algorithms:

• A function g(n) is O(f(n)) if there exist a scalar c and an index n_0 such that

    |g(n)| ≤ c f(n)  for all n > n_0.    (2.1)

• A function g(h) is o(f(h)) if

    lim_{h→0} |g(h)| / f(h) = 0.    (2.2)

Relation (2.1) defining O(f(n)) is weaker than that defined in (2.2). That is why the computational
times of algorithms are often compared in their order of magnitude via O(f(n)) relations.

2.2 A specific shortest path problem


The shortest path problem may be related to very different particular questions, according to
the problem type, the network characteristics and the resolution strategy. It is then essential to
determine the specificity of the shortest path problem we are concerned with. So far, the most
1
In contrast with Deo [29], a path is by default oriented.
2
In contrast to Deo [29].

in-depth classification of shortest path problems is agreed to be Deo and Pang's taxonomy
[30]. Other general surveys can be found in [37, 95].
The next three sections aim at locating our shortest path problem within Deo and Pang's
classification.

2.2.1 The problem type


We recall that, in the context of our inverse problem, shortest paths occur in constraint descriptions
(see (1.5) and (1.6) in Chapter 1). These involve shortest paths between origin-destination
pairs that are not constrained to satisfy additional conditions such as visiting intermediate vertices
before reaching the destination. Our problem then refers to the one-pair shortest path problem
or, more generally, to the "single-source" problem. Algorithms solving that problem are usually
based on the methods proposed by Bellman [5], Dijkstra [34], Ford [44], and Moore [85]. Methods
solving all-pairs problems, such as that of Floyd [43], are consequently inappropriate.
The problem of determining k-th shortest paths does not apply practically to our case. However,
having second shortest path costs available is of theoretical importance in Chapter 6.

2.2.2 The graph type


The traffic modelling application mentioned in Chapter 1 needs to represent road networks in
graph form. In [112], Van Vliet discusses road networks and their characteristics in relation to the
shortest path problem. In particular, he noticed that graphs generated from road networks are
large and have a ratio of number of arcs to number of vertices of about 3. Our second motivating
application of Chapter 1 was drawn from computerized tomography. In this case, grid graphs are
often used to discretize a medium into cells. These graphs may be relatively large depending on
the refinement of the discretization. Experiments on grid networks presented in [33] by Dial et al. again
show small arcs-to-vertices ratios. Both examples witness that most large networks in real-life
situations are "sparse"; a graph G is said to be sparse when m ≪ n², that is, when the ratio
m/n is small.
On the other hand, as stated at the beginning of this chapter, traffic applications require
directed graphs. In contrast, grid networks are usually undirected. Since an undirected graph
finds its directed match when each edge is replaced with two oppositely oriented arcs, the directed
case is more general.
Finally, we will consider weighted graphs with only nonnegative and fixed arc weights. Indeed,
this seems to be the case in most practical problems. The presence of negative arc weights allows
negative cycles, that is, cycles with a negative cost. If such a negative cycle exists, a shortest
path algorithm will drive the shortest distance to −∞ by going round and round this cycle.
The existence of a shortest path is then conditioned by the absence of negative cycles along
any path between the origin-destination pair under consideration. Because of the nature of our
applications, we decide to restrict arc weights to nonnegative values and hence avoid discussions
about possibly costly procedures for detecting negative cycles in a graph. If the reader is interested
in detecting oriented negative cycles, an interesting label-correcting approach using a dynamic

breadth-first search technique was recently proposed by Goldfarb, Hao, and Kai [54].
Summing up the above characteristics, the graphs we are interested in for determining shortest
paths are large, sparse, oriented and non-negatively weighted.

2.2.3 The strategy type


The strategy refers to the technique employed in the algorithm for calculating shortest paths.
The various existing techniques use combinatorial procedures, matrix operations, updating approaches,
and so on (see [30] for a complete panorama). Each technique favours specific data
structures for representing graphs. The choice of a particular strategy partly depends on the
problem and the graph type. Let us use this information to select some suitable techniques, or
discard ill-adapted ones.
• The single-source problem calls for combinatorial techniques rather than algebraic ones.
Indeed, the latter strategies usually use matrices as data structures and are better suited
for all-pairs problems. A combinatorial method traverses the arcs of the graph and records
the information so obtained. This approach is successful in solving one-pair problems and
has led to some of the most efficient shortest path algorithms. These methods generally
represent graphs in forward star form; this consists, for each vertex v ∈ V, of storing either
all arcs whose initial vertex is v, or all vertices reachable from v by one arc, that is, the set
of successors of vertex v:

    S(v) := {u ∈ V | a_vu ∈ A}.    (2.3)
• The fact that our graphs are oriented, large and sparse also encourages the use of combinatorial
techniques.
• Among combinatorial methods, two types of algorithms then seem attractive: the label-correcting
algorithms and the label-setting algorithms. The former methods apply to weighted
graphs with general arc weights (with or without negative weights) while the latter will
not work for graphs with negative arc weights. Label-setting methods consequently appear
to be in the class of appropriate strategies for solving our specific shortest path problem.
The next section deals with these strategies.
• Updating techniques take advantage of an initial computation of shortest paths to calculate other
ones, or the same ones when small changes occur in the graph. For instance, Florian, Nguyen
and Pallottino [42] proposed an algorithm computing shortest paths from a given origin using
the information of shortest paths from another origin. This strategy may be profitable
when shortest paths between many origin-destination pairs are to be found. Another interesting
updating method is due to Goto, Ohtsuki and Yohimura [59], which seems to be
efficient when the shortest path problem must be solved repeatedly with different numerical
values of arc weights. Such a situation of course arises in our inverse shortest path applications.
However, methods solving this latter problem generally use matrices to store arc
weights (namely to apply an LU-factorization) [47, 59]. Since these matrices are not available
in our case, we will not investigate such methods. Section 2.7 of this chapter discusses
the advantages of a particular updating technique that uses the forward star representation
of a graph.
• We finally point out a strategy that has recently been studied by Bertsekas in [7, 8]: the
auction strategy. An auction algorithm for finding shortest paths seems to be relatively
efficient in many cases. Discussion of computational results obtained by this technique
can be found in Section 2.6.
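The forward star idea of (2.3) can be sketched as follows; the function name and the 0-based vertex numbering are our illustrative choices, not the thesis's:

```python
def forward_star(n, arcs):
    # Build the forward star of a digraph with vertices 0, ..., n-1:
    # star[i] lists the pairs (j, w_ij) for every arc (i, j) in A,
    # i.e. the successor set S(i) of (2.3) together with the weights.
    star = [[] for _ in range(n)]
    for i, j, w in arcs:
        star[i].append((j, w))
    return star

# a small digraph: arcs (0,1), (0,2) and (2,1) with their weights
star = forward_star(3, [(0, 1, 5.0), (0, 2, 2.0), (2, 1, 1.0)])
```

Traversing all arcs leaving a vertex is then a direct scan of `star[i]`, which is exactly the access pattern the combinatorial methods above rely on.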
Let us now examine some properties of algorithms solving the shortest path problem in sparse
graphs using the above-mentioned techniques.

2.3 Shortest path tree algorithms


We first overview algorithms using "labelling" techniques. These algorithms typically calculate a
shortest path tree, which is a structure containing the shortest paths from one vertex, called the source
or root, to all other vertices of the graph. A proof that shortest paths form trees in the graph
where they are calculated is given in Section 2.3.2. Let us give some properties of the tree
structure and their implications for shortest paths before proving this interesting result.

2.3.1 Shortest path trees


An oriented graph G is said to be (strictly) connected if there is at least one (oriented) path
between every pair of vertices in G. Similarly, we say that G is weakly connected if there exists
at least one weak path between every pair of vertices in G. A tree is a weakly connected graph
without any weak cycle, that is, with neither an oriented cycle nor a weak cycle. If we define the in-degree
of a vertex v as the number of arcs in A having v as final vertex, a tree with only one vertex (the
root) of zero in-degree is called an arborescence or an out-tree. This is a well-known characteristic
of shortest path trees. If G is connected, then a shortest path tree rooted at vertex v reaches
every vertex in V \ {v}: one then speaks of a spanning tree. Consequently, a shortest path tree
in a connected oriented graph is technically a spanning arborescence. It is important to note
that a shortest path tree is not a shortest spanning tree (see Figure 2.2, where the arc weights
are shown next to the arcs, which are represented by arrows). A shortest spanning tree is that of
smallest weight, where the weight of a tree T is defined as the sum of the weights of all arcs in T.
In Figure 2.2, the weight of the shortest path tree from the root is 10 while that of the shortest
spanning tree is 6.
Theorem 2.1 A tree with n vertices has n − 1 arcs.
Proof. Let us reason by induction on the number of vertices. One directly sees that the
theorem holds for n = 1, 2 and 3. We assume now that the theorem holds for all trees with fewer
than n vertices.
Consider a tree T with n vertices and one of its arcs a_ij = (v_i, v_j). There is no other path
between v_i and v_j except a_ij, because any other path would create a weak cycle in T, which is


Figure 2.2: A weighted graph, a shortest path tree and a shortest spanning tree.

impossible since T is a tree. Then T \ {a_ij} consists of two trees, since there are no weak cycles
in T. Both trees have fewer than n vertices each and therefore, by the induction hypothesis, each
contains one less arc than the number of vertices in it. Thus, T \ {a_ij} consists of n vertices and
n − 2 arcs, and hence T has exactly n − 1 arcs. □
As a consequence, being an arborescence, a shortest path tree can be stored in an n-array
of vertex numbers, the i-th component containing the vertex number j such that the arc (j, i)
belongs to the shortest path tree. By convention, the component corresponding to the source vertex
is set to 0 (remember that the source vertex has a zero in-degree). One also says that the n-array
contains the predecessor of each vertex (different from the root) in the arborescence.
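Recovering the path from the root to a given vertex from this predecessor array is a simple backward walk; a small sketch (we mark the root with None rather than 0, so that 0-based vertex numbers remain usable):

```python
def path_from_pred(pred, v):
    # Walk the predecessor array backwards from v to the root,
    # then reverse to obtain the root-to-v path.
    path = [v]
    while pred[path[-1]] is not None:
        path.append(pred[path[-1]])
    return list(reversed(path))

# predecessor array of a small arborescence rooted at vertex 0:
# vertex 1 is reached from 0, and vertices 2 and 3 from 1
pred = [None, 0, 1, 1]
```

Since a tree on n vertices has n − 1 arcs (Theorem 2.1), this single array of predecessors encodes the whole arborescence.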

2.3.2 Bellman's equations


We are now interested in calculating a shortest path tree SPT in G = (V, A, w) from the source
vertex src ∈ V. We assume that all cycles have nonnegative cost and that there exists a path
from src to every other vertex of V. Let sc(v) be the shortest path cost from src to v in SPT, and
pred(v) the predecessor of v in SPT, with the convention that pred(src) = 0. An algorithm that
yields the shortest path tree SPT gives the unique solution to Bellman's equations [5]:

    sc(j) = min_{i : (i,j) ∈ A} ( sc(i) + w_ij ),   j ≠ src,    (2.4)

which define the shortest path costs sc(j) (j ≠ src) recursively with the initial condition that
sc(src) = 0.
Now let us formalize the condition that distinguishes a spanning tree from a shortest path tree.
Theorem 2.2 Assume that SPT is a spanning tree rooted at vertex src. Then SPT is a shortest
path tree with root src if and only if there exist labels s(i) associated with each vertex i (i = 1, ..., n)
such that, for every arc a_ij in A, s(i) + w_ij ≥ s(j), with j ≠ src and s(src) = 0.
Proof. Suppose that SPT is a spanning tree rooted at src. If there exists an arc a_ij ∈ A
such that s(i) + w_ij < s(j), then the path from src to j in SPT is not shortest. Conversely, assume
that s(i) + w_ij ≥ s(j) holds for every arc a_ij ∈ A, with j ≠ src and s(src) = 0. Let p be a path
from src to j, which is of the form (src = v_1, v_2, ..., v_{l(p)+1} = j). Then, by hypothesis, the cost
of p is bounded below by

    s(v_2) − s(src) + [ Σ_{k=2}^{l(p)−1} ( s(v_{k+1}) − s(v_k) ) ] + s(j) − s(v_{l(p)}) = s(j).    (2.5)

Hence SPT is a shortest path tree. □
Note that the label vector s in Theorem 2.2 does not necessarily equal the vector sc defined
by Bellman's equations. The value of s(i) will match that of sc(i) if and only if s(i) + w_ij = s(j)
for every arc a_ij in SPT (see [99]).
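The conditions of Theorem 2.2 can be verified mechanically; a small sketch (the function name and data layout are ours, and we represent labels as a simple array indexed by vertex):

```python
def satisfies_theorem_2_2(labels, arcs, src):
    # Check the optimality conditions of Theorem 2.2: s(src) = 0 and
    # s(i) + w_ij >= s(j) for every arc (i, j) with j != src.
    if labels[src] != 0:
        return False
    return all(j == src or labels[i] + w >= labels[j]
               for i, j, w in arcs)

# a small digraph: arcs (0,1), (1,2) and (0,2) with their weights
arcs = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 4.0)]
```

With the correct shortest path costs as labels the check passes; raising any label above its shortest path cost makes some arc violate the inequality.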
Equations (2.4) already suggest a basic algorithm for calculating shortest path costs³,
which is commonly viewed as a "prototype" shortest path tree algorithm [49]:
Algorithm 2.1
Step 1. Initialize a tree SPT rooted at src and, for each v ∈ V, set sc(v) to the cost of the path
from src to v in SPT;
Step 2. Let a_ij ∈ A be an arc for which sc(i) + w_ij − sc(j) < 0; then adjust the vector sc by
setting sc(j) = sc(i) + w_ij, and update the tree SPT by replacing the current arc incident
into vertex j by the new arc a_ij;
Step 3. Repeat Step 2 until the optimality conditions (2.4), which may be rewritten as

    sc(i) + w_ij ≥ sc(j)  for all (i, j) ∈ A,    (2.6)

are satisfied.
In the course of the calculation, the value of sc(v) (v ∈ V) is greater than or equal to the cost of the path from
src to v in the current tree. One usually calls sc(v) the label of vertex v.
Algorithms that label the reached vertices with their shortest path costs from the source are
called labelling algorithms. There are two conventional ways of classifying labelling algorithms:
authors like Steenbrink [102], Dial et al. [33], and Deo and Pang [30] distinguish between label-correcting
and label-setting methods; more recently, Gallo and Pallottino [48, 49] rather discern
different search strategies (the breadth-first search, the depth-first search and the best-first search)
employing specific data structures analysed by Aho et al. in [1], Tarjan in [106] and Pallottino
in [93]. In order to clarify both approaches, let us touch upon Step 2 of the above prototype
algorithm. Considering the forward star representation of sparse graphs, one realizes that it is
worth selecting vertices rather than arcs: once a vertex i has been chosen, the operations of
Step 2 are performed on all arcs a_ij with vertex j ∈ S(i). We suppose that vertex i is selected
from a set of candidate vertices Q. From this point of view, search strategies refer to the way
i is chosen from Q in relation to the underlying data structure of Q, while label-setting or label-correcting
methods rather refer to some properties of the vertices that are to be selected from
Q.
These remarks allow us to present a variant of the prototype Algorithm 2.1 that includes the
updating of the shortest path information given by the vector pred.
3
Bellman's equations do not supply information about shortest paths themselves.

Algorithm 2.2
Step 1: Initializations.
Set sc(src) ← 0 and pred(src) ← 0.
For each vertex i ∈ V \ {src}, set sc(i) ← +∞ and pred(i) ← i.
Finally, set Q ← {src}.
Step 2: Selecting and updating.
Select a vertex i in Q, and set Q ← Q \ {i}.
For each vertex j ∈ S(i) such that sc(i) + w_ij < sc(j) do:
• set sc(j) ← sc(i) + w_ij and pred(j) ← i;
• if j ∉ Q, then set Q ← Q ∪ {j} and update Q.
Step 3: Loop or exit.
If Q ≠ ∅, then go to Step 2.
Else exit: sc and pred contain the costs and the description of the shortest path tree rooted
at src, respectively.
The updating of the set Q in Step 2 technically depends on the data structure used to store the
candidate vertices.

2.3.3 Label-setting and label-correcting principles


A label-setting method labels a vertex permanently or temporarily: the label sc(v) of a vertex
v is made permanent only when sc(v) equals the shortest path cost from src to v. The vertex
i selected from Q is the vertex of minimum label among those labelled as temporary; vertex i becomes
permanently labelled when removed from Q. This means that each vertex will enter Q at most once
and that one shortest path is found at each iteration; the algorithm then terminates in at most
n − 1 iterations. As far as the graph type is concerned, remember that label-setting algorithms
only work on graphs with nonnegative arc weights.
On the other hand, a label-correcting method never labels vertices permanently until the
algorithm ends. As a consequence, such a method cannot guarantee that any current path is
shortest until termination occurs. A label-correcting algorithm therefore usually requires more
than n − 1 iterations, but each of them demands less calculation. In contrast with label-setting
methods, label-correcting strategies apply to graphs with general arc weights.
These label-setting and label-correcting principles are not mutually exclusive and may "coexist",
as is the case for the auction algorithm of Bertsekas [8]: the auction algorithm follows
label-setting principles in that the shortest distance to a vertex is found the first time the
vertex is labelled; it also follows label-correcting principles in that the label of a vertex may
continue to be updated after its shortest distance is found. This will be detailed in Section 2.6.

2.3.4 Search strategies


As mentioned in Section 2.3.2, search strategies involve selection rules depending on the data
structure of Q. Let us examine three commonly used procedures (see in particular [106] for a detailed
description).
The breadth-first search selects the oldest element in Q, that is, the element which was inserted
first; the underlying data structure is known as the "First-In-First-Out" (FIFO) list, or queue.
The depth-first search chooses the newest element from Q, i.e. the element which was inserted
last; the "Last-In-First-Out" (LIFO) list, or stack, allows such a search. Both breadth-first and
depth-first strategies are designed to use lists for which adding and removing an element are
elementary operations. The first element of a list is its head and the last element its tail. Some
authors [28, 33] employ a double-ended queue or deque for sequencing vertices. A deque is a list
composed of a queue and a stack, which allows additions and deletions at either list end. The
linked list, which consists of linking each element of Q by means of a pointer to the next one,
allows efficient implementations of queues, stacks and deques.
In the best-first search, we make use of numerical values (labels) associated with the vertices.
The element to be selected is that of minimum label in Q. Appropriate data structures implementing
this strategy are the priority queues. A priority queue is a collection of elements, each
with an associated label, on which the following operations are efficiently performed: adding a
new element, removing the minimum-label element and correcting the label of an element whose
location is known.
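The three selection disciplines can be illustrated with standard library structures (Python's `deque` and `heapq` modules here, purely as an illustration):

```python
from collections import deque
import heapq

# FIFO queue (breadth-first search): remove the element inserted first.
q = deque(['a', 'b'])
first = q.popleft()            # removes 'a'

# LIFO stack (depth-first search): remove the element inserted last.
s = ['a', 'b']
last = s.pop()                 # removes 'b'

# Priority queue (best-first search): remove the minimum-label element.
pq = []
heapq.heappush(pq, (3, 'c'))
heapq.heappush(pq, (1, 'a'))
best = heapq.heappop(pq)       # removes (1, 'a'), the minimum label
```

Only the removal rule differs; the choice of structure is what turns the generic Algorithm 2.2 into a breadth-first, depth-first or best-first method.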

2.3.5 Search strategies for label-setting and label-correcting methods


Labelling algorithms typically maintain the candidate set Q with the help of some data structure
that facilitates operations on this set. In Table 2.1, we show the conventional correspondence
between labelling methods and search strategies.

                        Labelling methods
    Label-correcting               Label-setting
    Queue   Stack   Deque          Linked list   Buckets   Heap
                          Auction

Table 2.1: Search strategies for labelling methods.

Breadth-first searches (with queues or deques) often organize the selection in Q for label-correcting
algorithms (see [48], for instance). This is due to the fact that a breadth-first search
visits the vertices in concentric zones starting from the source. In the same spirit, Gallo
and Pallottino [48] remark that depth-first searches (with stacks) are unnatural in shortest path tree
algorithms, since the first updated vertex, which is in S(src), will be selected last.
Using a best-first strategy with the shortest path costs (sc) as labels, the selection of Step 2 in
Algorithm 2.2 yields the vertex i of Q that is at the shortest distance from src. Once the forward

star S(i) is updated, i does not need to be updated any more until the end of the algorithm.
Shortest path tree algorithms using a search strategy derived from the best-first one are therefore
label-setting algorithms, and are also called shortest-first algorithms. In particular, Dijkstra
[34] originally did not use any list in his shortest-first algorithm; Yen [117] exploits linked lists;
Denardo and Fox in [28] and Dial et al. in [33] manipulate buckets, and Johnson [67] makes use
of heaps. Buckets and heaps are structures for ordering vertices with respect to their labels. They
will be described in Section 2.5 with the algorithms employing them.
Finally, as mentioned before, the auction technique used by Bertsekas [8] is of a special nature.
This technique is defined and analysed in Section 2.6.
The next sections are devoted to reviewing some shortest path tree algorithms that use the above
search strategies. Computational complexities are mentioned to allow comparisons between these
algorithms.

2.4 Label-correcting algorithms


The first well-known label-correcting algorithm was introduced by Ford [44], and then detailed by
Bellman [5] and Moore [85]. Their method recursively solves Bellman's equations (2.4) and has
a computational complexity of O(mn). Proofs and comments on this method can be found
in [57, 89].
We present three variants of this prototype algorithm that have been suggested since then.
We do not mention the depth-first variant because of its high computational complexity (O(n 2^n))
and poor practical performance. Other variants can be consulted in [49].

2.4.1 L-queue algorithm


Gallo and Pallottino [48] proposed an efficient version of the Bellman/Ford/Moore algorithm
using a queue for representing the set Q and the forward star form. They called their algorithm
"L-queue", for list search queue. We denote the queue's head and tail by Q_h and Q_t, respectively.
Algorithm 2.3
Step 1: Initializations.
Set sc(src) ← 0 and pred(src) ← 0.
For each vertex i ∈ V \ {src}, set sc(i) ← +∞ and pred(i) ← i.
Finally, set Q ← {src}, Q_h ← src and Q_t ← src.
Step 2: Selecting and updating.
Remove vertex i at Q_h.
For each vertex j ∈ S(i) such that sc(i) + w_ij < sc(j) do:
• set sc(j) ← sc(i) + w_ij and pred(j) ← i;
• if j ∉ Q, then insert j at Q_t.

Step 3: Loop or exit.


If Q ≠ ∅, then go to Step 2.
Else exit.
On sparse graphs (m/n small), this breadth-first search algorithm runs in O(mn) = O(cn²) with
small c. Experiments in [49] show that this worst-case complexity is not reached in practice. The
storage requirement is 4n + 2m (n + 2m for the weighted graph, n for the queue, and 2n for the
vectors pred and sc).
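Algorithm 2.3 can be sketched in a few lines; the in-queue flag array, the 0-based vertex numbering and the use of None to initialize pred are our implementation choices:

```python
from collections import deque

def l_queue(star, src):
    # Label-correcting "L-queue" sketch: Q is a FIFO queue and the
    # graph is given in forward star form, star[i] = [(j, w_ij), ...].
    n = len(star)
    sc = [float('inf')] * n
    pred = [None] * n
    sc[src] = 0.0
    queue = deque([src])
    in_queue = [False] * n
    in_queue[src] = True
    while queue:
        i = queue.popleft()        # remove the vertex at the queue's head
        in_queue[i] = False
        for j, w in star[i]:
            if sc[i] + w < sc[j]:
                sc[j] = sc[i] + w
                pred[j] = i
                if not in_queue[j]:
                    queue.append(j)    # insert at the queue's tail
                    in_queue[j] = True
    return sc, pred

# toy digraph in forward star form: arcs (0,1,5), (0,2,2), (2,1,1)
star = [[(1, 5.0), (2, 2.0)], [], [(1, 1.0)]]
```

Note that vertex 1 is first labelled 5 via the direct arc and later corrected to 3 via vertex 2, which is exactly the label-correcting behaviour described above.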

2.4.2 L-deque algorithm


Pape [94] exploited a suggestion of D'Esopo and set up an algorithm where Q is a double-ended
queue (deque), which allows the insertion of vertices at both ends of the list according to a
predetermined strategy: a distinction is made between vertices that have already been labelled
and the others; the list is split into a queue and a stack; vertices not yet labelled are inserted at the tail of Q (like a queue),
while vertices that have already been labelled are inserted at the head of Q (like a stack).
The resulting algorithm, called "L-deque", is very similar to "L-queue". We therefore present only the
modified Step 2 of Algorithm 2.3, using the same notations for Q's head and tail:
Algorithm 2.4
Step 2: Selecting and updating.
Remove vertex i at Q_h.
For each vertex j ∈ S(i) such that sc(i) + w_ij < sc(j) do:
• if j ∉ Q, then insert j at Q_t if sc(j) = +∞ (j has never been labelled), otherwise insert j at Q_h;
• set sc(j) ← sc(i) + w_ij and pred(j) ← i.
Of course, the presence of a stack implies a rather high worst-case complexity: O(n 2^n). According
to Dial et al. [33], the algorithm L-deque is efficient when applied to sparse and almost planar⁴
graphs. The latter restriction does not necessarily meet our graph requirements. The storage
requirement is the same as that of L-queue. Although Gallo and Pallottino [48] observe better
run-times for the L-deque algorithm on a broad variety of problems, we prefer the L-queue algorithm,
which presents fewer restrictions (see in particular [70], where constructed examples show limitations of
L-deque).

2.4.3 L-threshold
The partitioning of Q explains the efficiency of L-deque. With the same idea, Glover et al. [60]
organized the list Q as two separate queues Q' and Q'' using a threshold parameter s. The queue
Q' is dedicated to vertices whose label falls below the threshold parameter s. The algorithm
typically proceeds as follows: at each iteration, a vertex is selected and removed from Q', and
any vertex j to be added to the candidate list is inserted in Q''. When Q' is empty, the threshold
4
that is, drawable in 2 dimensions without arc intersections.

s is adjusted and Q is repartitioned according to the new threshold value. The procedure then goes
on until exhaustion of the candidate list Q.
This method becomes efficient once suitable values are chosen for s. As noticed by Bertsekas
[9], if s is taken to be equal to the current minimum label, the method behaves like Dijkstra's
algorithm, which is presented in the next section; if s exceeds all vertex labels, then Q'' is empty
and the algorithm then reduces to the generic label-correcting method. Appropriate threshold
values have been proposed by Glover et al. in [60], and by Gallo and Pallottino in [49]. When applied
to graphs with nonnegative arc weights, the worst-case computational complexity is O(mn). Although
this theoretical performance equals that of other label-correcting algorithms, the threshold
algorithm achieves better practical performance than the other label-correcting algorithms. The
storage requirement for this method is 5n + 2m.
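The repartitioning step can be illustrated as follows; this is only a fragment of the method, the names are ours, and the non-strict comparison with s is our assumption (the threshold update rule itself is the tuned part of the algorithm):

```python
from collections import deque

def repartition(candidates, sc, s):
    # One repartition step of the threshold idea: vertices whose label
    # does not exceed the threshold s go to Q' (to be examined next),
    # while the others wait in Q''.
    q1 = deque(v for v in candidates if sc[v] <= s)
    q2 = deque(v for v in candidates if sc[v] > s)
    return q1, q2

# labels of three candidate vertices
labels = {0: 1.0, 1: 5.0, 2: 3.0}
```

Choosing s equal to the minimum label would put a single vertex in Q', mimicking Dijkstra's selection, while a very large s would leave Q'' empty, recovering the generic label-correcting behaviour.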

2.5 Label-setting algorithms


The first label-setting (or shortest-first search) algorithm is due both to Dijkstra [34] and Moore
[85], but it was Dijkstra who established its formal properties. Just as most label-correcting
methods derive from the Bellman/Ford/Moore algorithm, label-setting algorithms can be
viewed as particular implementations of Dijkstra's algorithm.
Remember that the arc weights w_ij are supposed to be nonnegative for label-setting methods.

2.5.1 Dijkstra's algorithm


Dijkstra's algorithm has been initially implemented with an unordered list of temporarily labelled
vertices. For notation convenience, Q will represent the set of vertices marked as temporary. The
algorithm presented here has been adapted to make use of the forward star representation.
Algorithm 2.5
Step 1: Initializations.
    Set sc(src) ← 0 and pred(src) ← 0.
    For each vertex i ∈ V \ {src}, set sc(i) ← +∞ and pred(i) ← i.
    Finally, set Q ← V.
Step 2: Selecting and updating.
    Find a vertex i in Q verifying
        sc(i) = min_{v ∈ Q} sc(v).                                (2.7)
    If the minimum is not unique, select any i that achieves the minimum.
    For each vertex j ∈ S(i) such that sc(i) + wij < sc(j), set sc(j) ← sc(i) + wij and
    pred(j) ← i.
    Set Q ← Q \ {i}.
Step 3: Loop or exit.
    If Q ≠ ∅, then go to Step 2.
    Else exit.
The proof that this algorithm is correct can be found in [57, 89]. The complexity of the
algorithm is at most O(n²), since each arc is examined only once, and its space requirement is
4n + 2m. For complete⁵ graphs, the complexity reduces from O(n³) (for L-queue) down to O(n²)
(for Dijkstra's method): the extra factor n is the price to pay for both considering general arc
weights and detecting negative cycles. Note that, according to Johnson [66], a label-correcting
variant of Dijkstra's algorithm is able to take negative arc weights into account (provided that
no negative cycle occurs in the graph).
One can easily observe that Dijkstra's algorithm allows a relative run-time reduction when
solving a one-pair (o, d) shortest path problem, the amount of the reduction depending on how
far d is located from o. Indeed, one shortest path is found at each iteration of the algorithm;
as a consequence, the algorithm may halt once the destination vertex d has been permanently
labelled. Taking the actual graph sparsity into account should also reduce the run-time of the
algorithm.
The critical operation in Dijkstra's algorithm is that of finding the vertex with smallest label
in Q. Maintaining Q ordered then appears to be a reasonable approach. Yen suggested a variant
of Dijkstra's algorithm using an ordered linked list for sequencing vertices. The computational
complexity of that variant remains O(n²) on sparse graphs. As noticed in [14], ordered linked
lists do not speed up the original algorithm significantly, since inserting a new element in Q or
modifying a vertex label requires the complete scanning of Q.
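Algorithm 2.5 with its unordered candidate list can be sketched in Python as follows; the adjacency dictionary is an illustrative stand-in for the forward star representation:

```python
def dijkstra(n, arcs, src):
    """Algorithm 2.5 (sketch): label-setting with an unordered candidate list Q.

    n    : number of vertices, numbered 0..n-1
    arcs : dict mapping i to the list of (j, w_ij) for the arcs leaving i
           (weights assumed nonnegative)
    """
    INF = float("inf")
    sc = [INF] * n
    pred = list(range(n))                 # pred(i) = i initially, as in Step 1
    sc[src] = 0
    Q = set(range(n))
    while Q:                              # Step 3: loop while Q is nonempty
        i = min(Q, key=lambda v: sc[v])   # Step 2: minimum label vertex (2.7)
        Q.remove(i)
        if sc[i] == INF:
            break                         # remaining vertices are unreachable
        for j, w in arcs.get(i, []):
            if sc[i] + w < sc[j]:
                sc[j] = sc[i] + w
                pred[j] = i
    return sc, pred
```

The min over Q is precisely the O(n) scan that the bucket and heap structures of Sections 2.5.2 and 2.5.3 are designed to avoid.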

2.5.2 Dial's algorithm


Dial proposed in [31] a different way of maintaining Q ordered. He assumes that the arc weights
are nonnegative integers. Denoting the largest arc weight by W ≝ max_{(i,j)∈A} wij, the possible
finite label values range from 0 to (n − 1)W. The idea is then to associate a slice of this range
with the vertices whose label falls within that slice. Each such small set of vertices is usually
referred to as a bucket. For the sake of simplicity, we will consider (n − 1)W + 1 buckets, denoted
by Q(k) (k = 0, …, (n − 1)W), the k-th bucket being associated with label value k. Finding
the minimum label vertex then consists of retrieving the first nonempty bucket (in ascending
order), rather than scanning the candidate list Q. The usual operations on this data structure
are very elementary. A new element is added (deleted) by inserting (removing) it into (from) the
appropriate bucket. Correcting the label of an element is simply achieved by moving the element
into the bucket matching its new label value. Dial's algorithm then proceeds as follows.
Algorithm 2.6
Step 1: Initializations.
    Set sc(src) ← 0 and pred(src) ← 0.
    For each vertex i ∈ V \ {src}, set sc(i) ← +∞ and pred(i) ← i.
    For k = 1, …, (n − 1)W, set Q(k) ← ∅.
    Finally, set Q(0) ← {src}.
Footnote 5: A graph is complete if there exists an arc between every pair of vertices.
Step 2: Selecting and updating.
    Find the smallest k such that Q(k) ≠ ∅.
    For each vertex i in Q(k), select each vertex j ∈ S(i) such that sc(i) + wij < sc(j), and do:
        • if sc(j) ≠ +∞, then set Q(sc(j)) ← Q(sc(j)) \ {j};
        • set sc(j) ← sc(i) + wij and pred(j) ← i;
        • set Q(sc(j)) ← Q(sc(j)) ∪ {j}.
    Set Q(k) ← ∅.
Step 3: Loop or exit.
    If Q(k) = ∅ for all k = 0, …, (n − 1)W, then exit.
    Else go to Step 2.
Actually, Dial's algorithm is more refined than the version presented above. Indeed, it is
sufficient to maintain only W + 1 buckets, instead of (n − 1)W + 1: if we are currently searching
bucket k, then all buckets beyond k + W are known to be empty. This can easily be checked,
since the label sc(j) of vertex j is of the form sc(i) + wij, where i is a vertex that has already
been removed from the candidate list; we also have that sc(i) ≤ k and wij ≤ W; hence
sc(j) ≤ k + W. Dial implemented the bucket structure with two-way linked lists.
The computational complexity of Dial's algorithm is O(m + nW), and the space requirement
reaches 5n + 2m + W + 1. Using buckets with nonuniform widths and splitting large ones down
at the right moment can speed up the algorithm. Denardo and Fox [28] reduced the complexity
bound by proposing such strategies. They also generalized Dial's algorithm to noninteger arc
weights. See also [112], where buckets have bounded cardinality.
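The W + 1 circular buckets can be sketched as follows; Python sets are used here as an illustrative simplification of Dial's two-way linked lists:

```python
def dial(n, arcs, src, W):
    """Dial's algorithm (sketch) with W + 1 circular buckets.

    n    : number of vertices, numbered 0..n-1
    arcs : dict mapping i to the list of (j, w_ij), weights integers in 0..W
    """
    INF = float("inf")
    sc = [INF] * n
    sc[src] = 0
    nb = W + 1                         # W + 1 buckets suffice (see text)
    buckets = [set() for _ in range(nb)]
    buckets[0].add(src)
    in_buckets = 1
    k = 0
    while in_buckets:
        while not buckets[k % nb]:     # retrieve the first nonempty bucket
            k += 1
        for i in list(buckets[k % nb]):    # vertices with label k are final
            buckets[k % nb].discard(i)
            in_buckets -= 1
            for j, w in arcs.get(i, []):
                if sc[i] + w < sc[j]:
                    if sc[j] != INF:
                        # correcting a label = moving j to its new bucket
                        buckets[sc[j] % nb].discard(j)
                        in_buckets -= 1
                    sc[j] = sc[i] + w
                    buckets[sc[j] % nb].add(j)
                    in_buckets += 1
    return sc
```

Since any tentative label satisfies sc(j) ≤ k + W while bucket k is being scanned, the index sc(j) mod (W + 1) never collides with a still-pending smaller label, which is exactly the argument given above.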

2.5.3 Binary heap algorithm


Yet another data structure has been favoured to keep a list ordered: the heap. This term was
first introduced by Williams [114]. Other authors used this technique, denoting the underlying
data structure by priority queue [1, 72]. Later, Tarjan [107, Chapter 3] proposed a thorough
analysis of the heap.
A heap is a partially ordered collection H of items, the k-th item being associated with a
real-valued label H(k). The properties of a heap are suitably represented by an arborescence.
The root or top item is that of minimum label in H. Moreover, the label of every item in H
does not exceed the labels of all the items that are its descendants in the arborescence. A binary
heap, denoted by Q for convenience, is a heap of K items verifying the following:
    Q(k) ≥ Q(⌊k/2⌋),   for all k = 2, …, K,                       (2.8)
where ⌊x⌋ is the greatest integer smaller than or equal to x. A binary heap with K = 6 is
illustrated in Figure 2.3, where the labels (Q(k))_{k=1}^{6} = (1, 2, 5, 3, 2, 6) are encircled next
to their index k. Binary heaps are very often employed for their easy implementation.
[Figure 2.3 shows the heap as an arborescence: item k=1 (label 1) at the root, items k=2
(label 2) and k=3 (label 5) below it, items k=4 (label 3) and k=5 (label 2) below k=2, and
item k=6 (label 6) below k=3.]
Figure 2.3: A binary heap.

The binary heap Q then has ⌈log(K + 1)⌉ levels in its arborescence, where log is the logarithm
in base 2, and ⌈x⌉ is the smallest integer greater than or equal to x. As a consequence, the
operations of removing the minimum label item, inserting a new item, and correcting a label
have a computational complexity of O(log K). Indeed, each heap manipulation concerning one
item consists of exchanging that item either with its ascendant (one level up) or with one of its
descendants (one level down). The procedure of recovering the heap properties after some change
about a vertex or a label will be referred to as "order the heap". The heap Q can be implemented
by means of two n-vectors: one for the arborescence, and one for keeping track of the items'
positions in the heap.
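Property (2.8) is easy to test directly; the following small check (an illustrative helper, with the 1-based indices of the text mapped onto a 0-based Python list) confirms it for the labels of Figure 2.3:

```python
def is_binary_heap(Q):
    """Check property (2.8): Q(k) >= Q(floor(k/2)) for k = 2, ..., K."""
    return all(Q[k - 1] >= Q[k // 2 - 1] for k in range(2, len(Q) + 1))

# The labels of Figure 2.3 satisfy the heap property...
assert is_binary_heap([1, 2, 5, 3, 2, 6])
# ...while swapping the first two labels violates it.
assert not is_binary_heap([2, 1, 5, 3, 2, 6])
```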
We now describe a shortest path tree algorithm using a binary heap to manage the list
of candidate vertices, which is still denoted by Q. The heap will contain at most n items or
vertices. As for the previous algorithms, Qh (= Q1) and Qt denote the head and the tail of the
heap, respectively. Note that the index t ≤ n. For technical details concerning the heap updating,
see [49, 107]. Algorithms using related techniques have been developed by D.B. Johnson [67] and
E.L. Johnson [68].
Algorithm 2.7
Step 1: Initializations.
    Set sc(src) ← 0 and pred(src) ← 0.
    For each vertex i ∈ V \ {src}, set sc(i) ← +∞ and pred(i) ← i.
    Finally, set Q ← {src}.
Step 2: Selecting and updating.
    Set i ← Qh.
    Replace Qh by Qt and order the heap Q.
    For each vertex j ∈ S(i) such that sc(i) + wij < sc(j), do:
        • set sc(j) ← sc(i) + wij and pred(j) ← i;
        • if j ∉ Q, then insert j as Qt and order the heap Q;
        • else order the heap Q, since the label of j has decreased.
Step 3: Loop or exit.
    If Q ≠ ∅, then go to Step 2.
    Else exit.
Remark that at Step 2, one need not "find" the minimum label vertex, in contrast with both
Algorithm 2.5 and Algorithm 2.6. The price to pay is the need to reorder the heap.
Since each arc is examined at most once, the computational complexity of the binary heap
algorithm (or Johnson's algorithm) is O(m log n) when the arc weights are nonnegative. The
space requirement is 5n + 2m. For sparse graphs, m = O(n) and the complexity becomes
O(n log n), which is the lowest theoretical complexity bound found so far for solving the shortest
path tree problem. This method performs very well in practice, since its practical performance
can hardly be told apart from O(n) [14]. Interesting variants have been proposed, namely by
Denardo and Fox [28]: they used a heap for ordering buckets (see Section 2.5.2) and obtained a
computational complexity of O(m log W), where W is the maximum arc weight; this bound
is of course attractive when W < n.
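In Python, the standard heapq module provides the binary heap. Since heapq offers no direct "order the heap after a label change" primitive, a common substitute, used here as an illustrative variant rather than the exact implementation of [67], is to reinsert the vertex with its new label and discard stale entries on extraction:

```python
import heapq

def heap_dijkstra(n, arcs, src):
    """Binary heap variant of Algorithm 2.7 (sketch), with lazy deletion.

    n    : number of vertices, numbered 0..n-1
    arcs : dict mapping i to the list of (j, w_ij), weights nonnegative
    """
    INF = float("inf")
    sc = [INF] * n
    pred = list(range(n))
    sc[src] = 0
    Q = [(0, src)]                    # the heap of (label, vertex) pairs
    while Q:
        d, i = heapq.heappop(Q)       # head of the heap, as in Step 2
        if d > sc[i]:
            continue                  # stale entry: i was reinserted earlier
        for j, w in arcs.get(i, []):
            if sc[i] + w < sc[j]:
                sc[j] = sc[i] + w
                pred[j] = i
                # reinsertion stands in for "order the heap"
                heapq.heappush(Q, (sc[j], j))
    return sc, pred
```

The lazy-deletion trick keeps every heap operation at O(log n) while avoiding the second n-vector that tracks item positions.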

2.6 An auction algorithm


We now present an algorithm which shares features of both label-correcting and label-setting
algorithms: the auction algorithm. The auction strategy was first studied by Bertsekas [6, 7]
in the context of assignment problems. Later, Bertsekas applied this technique to the shortest
path problem [8, 9].

2.6.1 Basic concepts


Bertsekas' algorithm is designed to solve the shortest path problem from several origins to a
single destination in a weighted oriented graph. It thus handles our one-pair problem. We
therefore present the basic auction algorithm for the single origin case, as proposed by Bertsekas
in [8, 9]. The following assumptions are made on the graph G = (V, A, w):
1. All cycles have positive length.
2. Each vertex is the source of at least one arc (except for the destination vertex of the one-pair
problem).
3. The graph G is simple.
We are interested in finding a shortest path from vertex src to vertex dst. The auction procedure
maintains a simple path P = (src, v1, v2, …, vk) starting at src, and a vector π of prices associated
with the vertices, πv being the price of vertex v. At each iteration, the path P is either extended
by adding a new vertex vk+1, or contracted by deleting its current destination vk. When the
current destination becomes dst, the algorithm ends. Bertsekas gives an intuitive sense of his
algorithm: it proceeds as a person trying to reach a destination in a graph-like maze, going
forward and backward along the current path; each time a backtracking occurs, the person
evaluates and keeps track of the "price" or the "desirability" of revisiting and advancing from
the position just left.
An iteration consists of updating the pair (P, π) so that it satisfies the complementarity
slackness conditions (CS):
    πi ≤ wij + πj   for all (i, j) ∈ A,                           (2.9a)
    πi = wij + πj   for (i, j) such that aij is an arc of P.      (2.9b)
These CS conditions are equivalent to Bellman's equations (2.4) with the labels sc(i) replaced
by the negative prices −πi. When a pair (P, π) satisfies CS, the portion of P between two of
its vertices i and j is a shortest path from i to j (by (2.9a)), and πi − πj is the corresponding
shortest path cost (by (2.9b)).

2.6.2 Description of Bertsekas' algorithm


The algorithm proceeds by extending and contracting the current path P. A degenerate case
occurs when the path is reduced to the vertex src. The path P is then either extended, or left
unchanged with the price πsrc being strictly increased. The algorithm needs to begin with a
pair (P, π) that satisfies CS. This is not a restrictive assumption when all arc weights are
nonnegative: one can use the default pair
    ((src), π)   with πi = 0 for all i = 1, …, n.                 (2.10)
Let us describe the algorithm by characterizing a typical iteration.
Algorithm 2.8
Let i be the terminal vertex of P. If
    πi < min_{(i,j) ∈ A} {wij + πj},                              (2.11)
go to Step 1; else go to Step 2.
Step 1. Contract path:
    Set πi ← min_{(i,j) ∈ A} {wij + πj},
    and if i ≠ src, contract P. Go to the next iteration.
Step 2. Extend path:
    Extend P by vertex ji, where
        ji = arg min_{(i,j) ∈ A} {wij + πj}.                      (2.12)
    If ji = dst, then exit: P is the desired shortest path.
    Else, go to the next iteration.
In order to get an intuitive view of this procedure, see Table 2.2, which shows all the steps of
Algorithm 2.8 when applied to the small graph illustrated in Figure 2.4, where the vertex
numbers are encircled and the arc weights appear next to the arcs.
[Figure 2.4 shows a four-vertex graph with src = 1 at the bottom and dst = 4 at the top, the
encircled vertex numbers and the arc weights being placed next to the arcs.]
Figure 2.4: A small graph as illustration for the auction algorithm.

Iteration   P before iteration   π before iteration   Iteration type
    1       (1)                  (0,0,0,0)            Contraction at vertex 1
    2       (1)                  (1,0,0,0)            Extension at vertex 2
    3       (1,2)                (1,0,0,0)            Contraction at vertex 2
    4       (1)                  (1,2,0,0)            Contraction at vertex 1
    5       (1)                  (2,2,0,0)            Extension at vertex 3
    6       (1,3)                (2,2,0,0)            Contraction at vertex 3
    7       (1)                  (2,2,2,0)            Contraction at vertex 1
    8       (1)                  (3,2,2,0)            Extension at vertex 2
    9       (1,2)                (3,2,2,0)            Extension at vertex 4
   10       (1,2,4)              (3,2,2,0)            Exit

Table 2.2: Illustration of Bertsekas' algorithm.
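A compact sketch of Algorithm 2.8 reproduces the outcome of Table 2.2. Note that the arc list below, arcs (1,2) and (1,3) of weights 1 and 2, plus arcs (2,4) and (3,4) of weight 2, is an assumption inferred from the iterates of Table 2.2, since the table pins Figure 2.4 down only up to these four arcs:

```python
def auction_shortest_path(arcs, src, dst):
    """Algorithm 2.8 (sketch): maintain a path P and prices pi satisfying CS."""
    pi = {v: 0 for v in arcs}              # default pair (2.10)
    P = [src]
    while True:
        i = P[-1]
        best = min(w + pi[j] for j, w in arcs[i])
        if pi[i] < best:                   # test (2.11)
            pi[i] = best                   # Step 1: contract
            if i != src:
                P.pop()
        else:                              # Step 2: extend by the argmin (2.12)
            j = min(arcs[i], key=lambda jw: jw[1] + pi[jw[0]])[0]
            P.append(j)
            if j == dst:
                return P, pi

# Arc weights inferred from Table 2.2 (an assumption about Figure 2.4).
arcs = {1: [(2, 1), (3, 2)], 2: [(4, 2)], 3: [(4, 2)], 4: []}
P, pi = auction_shortest_path(arcs, 1, 4)
```

With these data, the sketch terminates with the path (1, 2, 4) and the prices π = (3, 2, 2, 0), matching the last row of Table 2.2.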

2.6.3 Properties of the algorithm


We note here some important properties of the auction algorithm presented above. The reader
is referred to [8, 9] for their proofs. These properties can easily be checked on the example
detailed in Table 2.2.
Property 1 The price πi generated by Algorithm 2.8 is an underestimate of the shortest path
cost from i to dst.
Property 2 For every pair of vertices i and j, and at all iterations, πi − πj is an underestimate
of the shortest distance from i to j.
Property 3 The portion of P between vertex src and any vertex i ∈ P is a shortest path from
src to i, with shortest path cost equal to πsrc − πi.
Moreover, note that the shortest path cost to a vertex is found the first time the vertex becomes
the terminal vertex of the path P, and is then equal to πsrc − πi; also, the vertices become
terminal for the first time in the order of their proximity to the origin.
Some properties allow us to state relationships between the prices πi and the shortest path
costs sc(i); remembering the convention that sc(src) = 0, Properties 2 and 3 imply that
    sc(j) ≥ πsrc − πj   for all j ∈ V                             (2.13)
and
    sc(i) = πsrc − πi   for all i ∈ P                             (2.14)
if the CS conditions hold. Then, we can write the following:
    sc(i) + πi − πdst ≤ sc(j) + πj − πdst   for all i ∈ P and j ∈ V.   (2.15)
The price πj being an estimate of the shortest path cost from j to dst (see Property 1), the
quantity sc(j) + πj − πdst is an estimate of the shortest path cost from src to dst using only
paths passing through j. It thus makes sense to consider vertex j as "most desirable" for
inclusion in the algorithm's path when sc(j) + πj − πdst is minimal.

2.6.4 Algorithm's performance


The crucial operation in Bertsekas' algorithm is the calculation of min_{(i,j)∈A} {wij + πj} each
time vertex i becomes the terminal vertex of the path P. Bertsekas [8, 9] proposes several
techniques for reducing that computational time.
The computational complexity of the auction algorithm is O(mn). This bound can be reduced
by considering some more characteristics of the graph: if Δ bounds the number of arcs in the
subgraph of vertices that are closer to the origin src than the destination dst, a more accurate
estimate is O(nΔ). Graphs with a small diameter improve this computational bound further:
if δ is the minimum number of arcs in a shortest path from src to dst, and W = max_{(i,j)∈A} wij,
then the computational complexity is O(δmW) or O(δΔW). According to Bertsekas, the
practical performance of auction algorithms remains to be fully investigated, particularly on
parallel machines.
The auction algorithm performs very well on random graphs and on problems with few
destinations (more than one, but much fewer than n); it even runs faster than Johnson's
algorithm. When applied to sparse graphs, however, the auction method seems less efficient:
A. Sartenaer [100] compared the performance of Bertsekas' algorithm against that of Johnson's
method when applied to urban networks. She noticed that Bertsekas' method does not take
advantage of sparsity as well as Johnson's algorithm does. Moreover, the complexity bound
estimated for Bertsekas' algorithm depends on the shortest path cost, and there are problems
for which the number of iterations of the algorithm is not polynomially bounded. See, for
instance, the small graph in Figure 2.5, which involves a cycle of relatively small cost.
A. Sartenaer observed that the run-times of Bertsekas' algorithm closely depend on the value
of W: examining the steps taken by Algorithm 2.8, one directly sees that the price of vertex 3
will first be increased by 1 and then by increments of 3 (the cost of the cycle) as many times as
necessary for the price π3 to reach or exceed W. While this situation is unlikely to arise in
randomly generated problems, it is not the case for urban networks (for instance): just think of
a roundabout followed by a long road. On the other hand, Johnson's algorithm behaves the
same whatever the value of W, since it will terminate in n − 1 iterations.

[Figure 2.5 shows a five-vertex graph with src = 1 and dst = 5: arcs (1,2), (2,3), (3,4) and (4,2)
have weight 1, and arc (3,5) has weight W, so that vertices 2, 3 and 4 form a cycle of cost 3.]
Figure 2.5: A small graph with a cycle of small cost.
The storage requirement is 3n + 2m, or at most 5n + 2m, depending on whether a data structure
is used to retrieve the minimum value min_{(i,j)∈A} {wij + πj}, with i the terminal vertex of P.
It seems difficult to make use of a heap in the auction algorithm, since at each iteration this
minimum value may be calculated on a different set of arcs. The computational effort of building
a new heap "from scratch" each time P's terminal vertex changes would degrade the algorithm's
performance.
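The W-dependence described above can be observed numerically. The sketch below re-implements Algorithm 2.8 on the graph of Figure 2.5 (the arc list is an assumption read from the figure: a cycle (2, 3, 4) of unit-weight arcs and an arc (3, 5) of weight W) and counts iterations for growing W:

```python
def auction_iterations(W):
    """Iterations of Algorithm 2.8 on the graph of Figure 2.5 (assumed arcs)."""
    arcs = {1: [(2, 1)], 2: [(3, 1)], 3: [(5, W), (4, 1)], 4: [(2, 1)]}
    pi = {v: 0 for v in range(1, 6)}
    P, count = [1], 0
    while True:
        count += 1
        i = P[-1]
        best = min(w + pi[j] for j, w in arcs[i])
        if pi[i] < best:
            pi[i] = best                 # contraction: a price strictly rises
            if i != 1:
                P.pop()
        else:
            j = min(arcs[i], key=lambda jw: jw[1] + pi[jw[0]])[0]
            P.append(j)                  # extension
            if j == 5:                   # dst reached
                return count
```

The price of vertex 3 must climb, by increments of 3 once the cycle is discovered, until the arc of weight W becomes the minimum in (2.11), so the iteration count grows with W rather than with the size of the graph.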

2.7 An algorithm using an updating technique


When many shortest path trees must be computed, one way of proceeding is to apply a shortest
path tree algorithm for each different root. If some of the root vertices are "close" to each other,
the shortest path trees at these roots will share many arcs. This observation, although well
known, was not exploited before Florian, Nguyen and Pallottino [42]. This is mainly due to the
simplicity and efficiency of the available algorithms for solving the shortest path problem from
a single origin. Some authors tackled the problem of updating the shortest path cost matrix
[47, 59, 88], but such methods are impractical when faced with large networks. Florian et al.
[42] set up a dual simplex strategy adapted to the forward star representation of a graph.

2.7.1 The shortest path method as a linear program


The problem of finding the shortest paths from a vertex src can be written in the minimum
cost flow format as a linear program:
    min_x  Σ_{(i,j) ∈ A} wij xij                                  (2.16)
subject to
    Σ_{j | (i,j) ∈ A} xij − Σ_{j | (j,i) ∈ A} xji = b(i),   for all i ∈ V,   (2.17)
    xij ≥ 0,   for all (i, j) ∈ A,                                (2.18)
where
    b(i) = n − 1   for i = src,   and   b(i) = −1   for i ≠ src.  (2.19)
The variable xij represents the number of shortest paths going through arc aij. Hence, the
objective function (2.16) can be read as: minimize the sum of the costs of all shortest paths
from src. The constraints (2.17)–(2.18) with (2.19) express the need for the shortest paths to
form an arborescence rooted at src.
The corresponding dual linear program is, letting (−πi) be the dual variable associated with
constraint i,
    max_π  Σ_{j ∈ V} (πj − πsrc)                                  (2.20)
subject to
    wij + πi − πj ≥ 0,   for all (i, j) ∈ A.                      (2.21)
Note that the quantities −πi are the prices of Bertsekas' auction algorithm presented in
Section 2.6.
A primal feasible solution for problem P (2.16)–(2.19) is a set of flows {xij} that satisfies (2.17)
and (2.18). A dual feasible solution for problem D (2.20)–(2.21) is a set of dual variables {πi}
that satisfies (2.21). An optimal solution is both primal and dual feasible.
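These duality relations can be checked numerically: taking πi = sc(i), the shortest path labels from src, yields a dual feasible solution whose objective (2.20) attains the primal optimal value. The following is a small illustrative verification on an arbitrarily chosen graph, not part of the original development:

```python
def shortest_labels(n, arcs, src):
    """Bellman-Ford style labels sc(i) from src (a simple illustrative solver)."""
    INF = float("inf")
    sc = [INF] * n
    sc[src] = 0
    for _ in range(n - 1):
        for i in range(n):
            for j, w in arcs.get(i, []):
                if sc[i] + w < sc[j]:
                    sc[j] = sc[i] + w
    return sc

arcs = {0: [(1, 4), (2, 1)], 2: [(1, 2)], 1: [(3, 1)]}
n, src = 4, 0
pi = shortest_labels(n, arcs, src)     # candidate dual solution: pi_i = sc(i)
# Dual feasibility (2.21): w_ij + pi_i - pi_j >= 0 on every arc.
assert all(w + pi[i] - pi[j] >= 0 for i, lst in arcs.items() for j, w in lst)
# The shortest path tree is 0 -> 2 -> 1 -> 3; sending one unit to each vertex
# gives the flows x(0,2) = 3, x(2,1) = 2, x(1,3) = 1 and the primal cost below.
primal_cost = 3 * 1 + 2 * 2 + 1 * 1
# Strong duality: the dual objective (2.20) attains the primal optimum.
assert sum(pi[j] - pi[src] for j in range(n)) == primal_cost
```

On tree arcs the constraint (2.21) holds with equality, which is exactly the complementarity slackness condition (2.9b) of the auction algorithm, up to the sign convention −πi for the prices.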

2.7.2 Solving the problem from another root


An optimal solution of P corresponds to a basis of (2.17)–(2.19) whose arcs determine a spanning
tree that minimizes (2.16). When the source vertex src is changed into a different vertex src',
the shortest path tree rooted at src is a dual feasible but primal infeasible solution for the new
problem. That is why Florian et al. naturally use the dual simplex method for solving this new
problem. Here is the framework of a dual simplex algorithm.
Algorithm 2.9
Step 0: start with a dual feasible, primal infeasible solution;
Step 1: if the primal solution is feasible, terminate;
Step 2: select the variable to exit the basis, usually that with the most negative solution value;
Step 3: select the variable to enter the basis such that dual feasibility is maintained;
Step 4: obtain a new basis, update primal and dual variables and go to Step 1.
The algorithm may be interpreted as follows: given a shortest path arborescence T, one builds
another one, T' say, by transforming T. Each transformation deletes an oriented arc from T
towards T', and adds an oriented arc a+ from T' towards T, which is selected so as to keep the
paths shortest from the new root. Note that the current subarborescence rooted at t(a+) is
already optimal for the new problem. The procedure ends when T' is an arborescence.
2.7.3 Computational performance


The updating algorithm presented above has a worst-case computational complexity of O(n²)
when using an implicit updating of the dual variables. Burton [14] tested an implementation of
Florian's algorithm against that of Johnson for the shortest path tree problem. Both algorithms
were implemented with heaps for selecting minimum labelled elements, and were applied to
sparse graphs. For these tests, a first shortest path arborescence (at src) was generated by
Johnson's algorithm. Then various new sources (src') were selected, and new shortest
arborescences were calculated from these sources by both Florian's and Johnson's methods. The
experiments showed that Florian's algorithm runs faster when the new source src' belongs to
the first or second centroid of src, that is, when src' is reachable from src by at most two arcs.
This condition can easily be verified, since the forward stars are available. In all other cases,
Johnson's algorithm outperforms that of Florian.
Florian's algorithm should have a better practical performance when applied to denser
graphs. Indeed, the arc a+ is selected among very few arcs in sparse graphs, and the vertices
s(a+) and t(a+) are usually close to each other. Florian's algorithm then proceeds by many
"small" steps, because the subarborescences rooted at t(a+) are bonsais rather than trees. In
dense graphs, Florian's algorithm should consequently be more appropriate.

2.8 A shortest path method for the inverse problem


One can observe better complexity bounds for label-setting methods. However, we must
distinguish between practical performance and theoretical worst-case performance. Although
label-correcting algorithms generally run at worst in O(mn), the best of them are competitive
with the best label-setting algorithms. The best practical methods are not necessarily those
with the best computational complexity bounds.
In Figure 2.6, several problems of increasing dimension have been solved by six shortest path
algorithms. These results are taken from [14], where the author tested only variants of Dijkstra's
algorithm. They consequently do not cover all the methods exposed in this chapter. Nevertheless,
they witness some trends in the use of label-setting methods. The number of vertices increases
from left to right, and computation run-times increase upward. The first series of results, in the
foreground, is obtained by Johnson's algorithm; we can observe its almost linear behaviour as
the number of vertices increases. The last series, in the background, reflects the quadratic
behaviour of Dijkstra's original algorithm. The intermediate algorithms are personal variants of
Dijkstra's method which are not recorded in the literature.
[Figure 2.6 is a bar chart: computation run-times (vertical axis, graduated from 20 to 120) of
six shortest path algorithms on problems of increasing dimension, with Johnson's algorithm in
the foreground and Dijkstra's original algorithm in the background.]
Figure 2.6: Some shortest path test results

For the practical efficiency of the other algorithms, we refer to the authors who have been cited
throughout this chapter. According to Gallo and Pallottino [49], the binary heap algorithm
proposed by Johnson gives results similar to those obtained by Dial's algorithm. We prefer that
of Johnson, since its efficiency does not depend on the maximal arc weight. The L-threshold
algorithm is also efficient, but presents the following drawback: one must adjust the threshold
parameter for each kind of application in order to obtain optimal efficiency. The auction
algorithm proposed by Bertsekas, although very powerful in many cases, is relatively sensitive
to the presence of cycles of small length and is not best suited for sparse graphs. Finally,
Florian's algorithm allows better run-times when one can profit from an already calculated
shortest path tree whose root is close to the new one.
The general algorithm that is best suited to inverse shortest path applications then seems to be
Dijkstra's method implemented with a binary heap. It has an attractive practical performance,
requires little storage, and takes advantage of sparsity in a very efficient way. If available,
Florian's algorithm may be helpful when combined with this choice.
3

Quadratic Programming

In Chapter 1, we introduced the inverse shortest path problem and settled on a particular
algorithmic framework for solving it by choosing the ℓ2 or least squares norm. The resulting
formulation is a quadratic programming problem, or QP for short. This chapter deals with the
theoretical background of such programs, and discusses the selection of a method well suited to
our inverse problem.

3.1 Terminology and notations


Quadratic programming refers to optimization problems in which the objective function f(x) is
quadratic and the constraints Ei(x) are linear. We are thus concerned with finding a solution x*
to the following minimization problem:
    min_x  f(x) = a^T x + (1/2) x^T G x                           (3.1)
subject to
    Ei(x) ≝ ni^T x − bi ≥ 0   (i = 1, …, h),                      (3.2)
where x, a and {ni}_{i=1}^{h} belong to R^m, G is an m × m symmetric¹ matrix, b is in R^h, and
the superscript T denotes the transpose. For the sake of simplicity, we do not explicitly mention
equality constraints of type ni^T x − bi = 0, to which the vector of variables x may also be
subject. In fact, such constraints can theoretically be represented by two opposite constraints
of type (3.2).

3.1.1 Triviality and degeneracy


A constraint Ei(x) ≥ 0 may either be amply satisfied (when the strict inequality Ei(x) > 0
holds), or binding (when the equality Ei(x) = 0 holds), or violated (when Ei(x) < 0). A
constraint will be called satisfied when it is not violated. Vectors x that satisfy all constraints
(3.2) are feasible vectors, and the set F ≝ {x | Ei(x) ≥ 0 for i = 1, …, h} of feasible vectors
is called the feasible region. The feasible region may be empty. As defined in [11], a constraint
Eq(x) is trivial or redundant if, and only if, there exists no vector x ∈ R^m such that Ei(x) ≥ 0
(i ∈ {1, …, h} \ {q}) and Eq(x) < 0. Note that this definition also pertains when there is no
vector x such that Ei(x) ≥ 0 (i ∈ {1, …, h} \ {q}): the feasible region is then empty even without
the q-th constraint, which cannot limit the feasible region any further.
We will say that degeneracy occurs when the solution to the QP problem is the same whether
some binding inequality constraint is imposed or not. It means that we can disregard the
(in)equality, solve the problem, and find a solution which exactly satisfies the (in)equality (see
[12] for further details).

Footnote 1: It is always possible to arrange that the matrix G is symmetric, since x^T G x = x^T G^T x implies that x^T G x = (1/2) x^T (G + G^T) x, where the matrix (1/2)(G + G^T) is symmetric.

3.1.2 Convexity
The concept of convexity concerns both the feasible region F and the objective function f(x).
The feasible region F of our quadratic program, determined by (3.2), is characterized by the
following theorem, which can easily be proved, since each of the h constraints (3.2) limits the
feasible region to a halfspace.
Theorem 3.1 The feasible region F of a quadratic program is convex, i.e.,
    x, y ∈ F  ⇒  λx + (1 − λ)y ∈ F,                               (3.3)
for all 0 ≤ λ ≤ 1.
An extreme point of the convex set F is a point that does not lie strictly within the line segment
connecting two other points of the set. More formally,
Definition 1 A point x in a convex set F is said to be an extreme point of F if there are no
two distinct points x1 and x2 in F such that
    x = λx1 + (1 − λ)x2                                           (3.4)
for some 0 < λ < 1.
Note that a singleton and the entire space R^m are both convex; the first one is bounded and
the second one is unbounded.
Definition 2 A function f : D(⊆ R^m) → R is convex if the set D is convex, and if the following
inequality holds for any pair of points x1, x2 ∈ D and any real number 0 ≤ λ ≤ 1:
    f((1 − λ)x1 + λx2) ≤ (1 − λ)f(x1) + λf(x2).                   (3.5)
If the sign ≤ in (3.5) is replaced by < and λ ≠ 0, 1, the function f is said to be strictly convex.
Geometrically, if f is strictly convex and continuously differentiable, a line segment drawn
between any two points on its graph lies entirely above the graph. Such a function increases
more rapidly (or decreases less rapidly) than a straight line:
    f(x2) > f(x1) + ∇f(x1)^T (x2 − x1)   for all x1 ≠ x2 ∈ D,     (3.6)
where ∇ designates the gradient (or first derivative) of the function that follows it.
Now let us come back to the objective function f of our QP. Clearly, f is twice continuously
differentiable, and the second derivative of f, its Hessian, is G, an m × m matrix whose
components are second partial derivatives. Remember that G is symmetric.
Definition 3 An m × m matrix G is said to be positive semi-definite if x^T G x ≥ 0 for all
x ∈ R^m.
Definition 4 An m × m matrix G is said to be positive definite if x^T G x > 0 for all
x ≠ 0 ∈ R^m.
These definitions are significant in quadratic programming for their relationship with the
convexity of f, and hence with the characterization of a solution to the QP problem. The two
following theorems state these connections.
Theorem 3.2 Let G be an m × m symmetric matrix. The quadratic function
f(x) = a^T x + (1/2) x^T G x is convex on R^m if and only if G is positive semi-definite.
A proof of this theorem can be found in [113].
If G is positive definite, then the function f(x) = a^T x + (1/2) x^T G x is strictly convex on R^m.
These properties may affect the quality of a minimizer of the QP problem (3.1)–(3.2):
Theorem 3.3 Assume that the feasible region F determined by (3.2) is not empty. If G is
positive semi-definite, a solution x* of the QP problem (3.1)–(3.2) is a global solution, that is,
    f(x*) ≤ f(x),   for all x ∈ F.                                (3.7)
Moreover, if G is positive definite, then x* is also unique, that is,
    f(x*) < f(x),   for all x ≠ x* ∈ F.                           (3.8)
See [12] for a proof of this theorem.
When the Hessian G is indefinite, local solutions which are not global can occur². The
problem of minimizing a convex quadratic function f(x) on a convex feasible domain F ⊆ R^m
is called convex quadratic programming.
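Theorem 3.3 can be illustrated numerically for a positive definite G. In the unconstrained case (F = R^m), the minimizer solves Gx* = −a, and random sampling confirms that it is a global minimum; the data below are arbitrarily chosen for the illustration:

```python
import random

# An illustrative QP objective f(x) = a^T x + (1/2) x^T G x with G = 2I,
# which is positive definite, so f is strictly convex (Theorem 3.2).
G = [[2.0, 0.0], [0.0, 2.0]]
a = [-2.0, -4.0]

def f(x):
    quad = 0.5 * sum(x[i] * G[i][j] * x[j] for i in range(2) for j in range(2))
    return a[0] * x[0] + a[1] * x[1] + quad

# Without constraints, the minimizer solves G x* = -a, giving x* = (1, 2).
x_star = [1.0, 2.0]
random.seed(0)
for _ in range(1000):
    x = [x_star[k] + random.uniform(-5.0, 5.0) for k in range(2)]
    assert f(x) >= f(x_star)    # x* is a global minimum (Theorem 3.3)
```

For this data f can be rewritten as (x1 − 1)² + (x2 − 2)² − 5, which makes the global minimum at x* = (1, 2), with value −5, apparent by inspection.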

3.2 A speci c quadratic problem


Let us characterize our inverse problem (1.9), (1.4)–(1.6) with respect to the above terminology.
Footnote 2: It is also the case when the feasible region of the QP problem is not convex.
3.2.1 The objective function


The objective (1.9) of our inverse shortest path problem can be re-written in the QP standard
form as
    f(w) = (1/2) w^T w − w̄^T w.                                   (3.9)
We directly see that the Hessian matrix of f(w) is the identity matrix (on R^m), which is
positive definite. This implies the strict convexity of our objective (1.9), and hence the certainty
that the solution found is global and unique if the feasible region F of the problem is convex;
otherwise, if F is non-convex, only local optimality can be practically observed³.

3.2.2 The feasible region


Remember that the objective of our inverse problem is subject to the non-negativity of w (1.4),
to "explicit" shortest path constraints (1.5), and finally to "implicit" shortest path constraints
(1.6). The first constraints are classical bound constraints, which are of course linear, and hence
determine a convex feasible region in R^m. We therefore do not examine these constraints in
more detail. Let us rather interpret and clarify the status of the shortest path constraints.

Explicit constraints
An explicit shortest path constraint has been stated in Chapter 1 as follows:

p_j is a shortest path in G, j = 1, …, n_E, (3.10)

where G is an oriented weighted graph (V, A, w), consisting of a set V of n vertices, a set A of
m arcs and an m-vector w of weights associated with the arcs. The paths p_j (j = 1, …, n_E) are
defined as an explicit succession of consecutive arcs in A. Note that we are using the terminology
and definitions presented in Chapter 2.
The formulation (3.10) asks the cost of p_j not to exceed that of any path with the same
origin and destination as p_j. This may be expressed as a (possibly large) set of vectorial⁴ linear
inequality constraints of the type

Σ_{k | a_k ∈ p_j} w_k ≤ Σ_{k | a_k ∈ p'_j} w_k, (3.11)

where p'_j is any path with the same origin and destination as p_j. As a consequence, the set of
feasible weights determined by (3.10) is convex, as it is the intersection of a collection of half-spaces.
The problem of minimizing (1.9) subject to (3.11) for j = 1, …, n_E, and to the non-negativity
constraints (1.4), is then a classical QP problem. This QP is however quite special
because its constraint set is (potentially) very large, very structured, and possibly involves a
non-negligible amount of redundancy. Indeed, the number of linear constraints of the form (3.11)
³ We refer here to algorithm complexities in terms of polynomial or non-polynomial run-times. These notions
will be introduced later.
⁴ That is, 0 belongs to the subset of points verifying the inequality as an equality.

is dependent on the number of possible paths between two vertices in the graph, which typically
grows exponentially with the density m/n of the graph. As all paths between an origin and a
destination are taken into account, many of the constraints (3.11) become trivial once a few of
them are suitably considered; indeed, at most m constraints are non-trivial, since they are vectorial
and w ≥ 0. There exist procedures that eliminate such trivial or redundant constraints, making it
possible to start the problem's resolution with fewer constraints (see Boot's procedure in [11, 12], for
instance). However, these checks for triviality require the enumeration of all constraints. In our
case, enumerating an exponential number of constraints is of course out of the question⁵, and we will
have to use a "separation procedure" to determine which of these constraints are violated for
a given value of the arc weights. This separation is naturally based on the computation of the
shortest paths within the graph.
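As an illustration of this separation step, the sketch below (a hypothetical Python fragment, not part of the thesis) runs one shortest path computation per prescribed path and reports whether the corresponding constraint (3.11) is violated under the current weights; the graph encoding, the arc indices and the tolerance are all assumptions made for the example:

```python
import heapq

def dijkstra(succ, w, o):
    """Shortest distances from o; succ[u] = list of (v, arc_index)."""
    dist = {o: 0.0}
    heap = [(0.0, o)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, k in succ.get(u, []):
            nd = d + w[k]
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def violated(succ, w, path_arcs, o, d):
    """Separation: the explicit constraint 'path_arcs is shortest' is
    violated as soon as the cost of the prescribed path exceeds the
    shortest o-d distance under the current weights w."""
    cost = sum(w[k] for k in path_arcs)
    return cost > dijkstra(succ, w, o)[d] + 1e-12

# Toy graph: arcs 0: o->v, 1: v->d, 2: o->d (hypothetical data).
succ = {"o": [("v", 0), ("d", 2)], "v": [("d", 1)]}
w = [2.0, 2.0, 3.0]
# Prescribed path (a_0, a_1): cost 4 > shortest cost 3 -> violated.
print(violated(succ, w, [0, 1], "o", "d"))   # True
w2 = [1.0, 1.0, 3.0]
print(violated(succ, w2, [0, 1], "o", "d"))  # False
```

The shortest path returned by Dijkstra's algorithm also identifies a most violated inequality of type (3.11), which is exactly what a cutting-plane style resolution needs.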

Implicit constraints
An implicit shortest path constraint restricts some attribute of a shortest path between an origin
and a destination without suggesting the path that has to be taken from the origin to the destination.
We therefore say that the shortest path is implicitly determined by its origin, its destination,
the oriented graph and the current value of the arc weights w. Typically, the restricted attribute
is the cost of the shortest path, since our variables are the arc weights. We consider bound
constraints on this shortest cost. We assume that n_I origin-destination pairs (o_j, d_j), j = 1, …, n_I,
are concerned with this constraint type, and we formulate the corresponding implicit constraints
as follows:

0 ≤ l_j ≤ Σ_{a ∈ p*_j(w)} w_a ≤ u_j, j = 1, …, n_I, (3.12)

where p*_j(w) is a shortest path (with respect to the weights w) from o_j to d_j. The values of l_j
and u_j are lower and upper bounds on the cost of the shortest path from o_j to d_j, respectively.
For consistency, we impose that l_j ≤ u_j (j = 1, …, n_I), and we allow l_j to be chosen as zero and
u_j to be infinite.
One constraint (3.12) actually consists of two inequality constraints: one for the lower bound
part and one for the upper bound part. Due to the meaning of the shortest path principle, lower
and upper bounds on shortest path costs have very different interpretations.
Let us consider the j-th origin-destination pair. Since p*_j(w) is a path of minimum cost
between o_j and d_j, imposing that Σ_{a ∈ p*_j(w)} w_a ≥ l_j means that all paths from o_j to d_j must have
their cost above or equal to l_j. This can be expressed by the following:

l_j ≤ Σ_{a ∈ p'_j} w_a, (3.13)

the path p'_j being any path from o_j to d_j. These constraints are linear and affine⁶, and their
number is exponential with the density m/n of the graph, as for the depiction of an explicit
⁵ This procedure would not be polynomially bounded.
⁶ That is, 0 does not belong to the manifold verifying (3.13) as an equality, when l_j ≠ 0.

shortest path constraint. The feasible region delimited by lower bounds on shortest path costs
is then convex. Again, much redundancy can be expected in the set of constraints (3.13). As
a consequence, these constraints can be part of our inverse problem without affecting the global
nature of a solution to this problem.
The underlying interpretation of an upper bound on the shortest path cost is fundamentally
different: the j-th upper bound constraint defined by

Σ_{a ∈ p*_j(w)} w_a ≤ u_j (3.14)

does not compel all paths from o_j to d_j to have a cost under u_j; it just imposes that there exists
one path from o_j to d_j whose cost does not overstep the upper bound u_j. The (a priori) unknown
path that must satisfy that condition has to be picked among the exponential number of paths
starting from o_j and arriving at d_j. A shortest path procedure will determine an appropriate path
which will complete the constraint definition in order to check whether the constraint is violated
or not. However, the path that is to be selected for evaluating the constraint violation may vary
with the arc weights w. The path p*_j(w) then remains explicitly unidentified. Consequently the
constraint (3.14) cannot be expressed as one or more linear constraints, and hence cannot fit into
the classical QP framework defined by (3.1)–(3.2).
Moreover, the feasible region determined by constraints of the type (3.14) is non-convex.
Let us show this with a small example: consider the graph composed of 3 vertices and 3
arcs (m = 3) shown in Figure 3.1, and consider the constraint

[Figure 3.1 shows a graph with vertices o and d and an intermediate vertex, with arc a1 from o to the intermediate vertex, arc a2 from the intermediate vertex to d, and arc a3 directly from o to d.]

Figure 3.1: A small example for proving the non-convexity.

Σ_{a ∈ p*(w)} w_a ≤ 5, (3.15)

where p*(w) is the shortest path (with respect to the weights w) from vertex o to vertex d. It
is easy to see that w¹ = (2, 2, 10)^T and w² = (10, 10, 4)^T are feasible weight vectors, while
½(w¹ + w²) = (6, 6, 7)^T is infeasible. The feasible region is therefore non-convex.
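This check is easy to reproduce numerically. In the sketch below (plain Python, written for this illustration), the shortest path cost from o to d is min(w1 + w2, w3), since the only two paths in Figure 3.1 are (a1, a2) and (a3):

```python
def shortest_cost(w):
    """Shortest o-d cost in the graph of Figure 3.1: the two-arc path
    (a1, a2) versus the direct arc a3."""
    w1, w2, w3 = w
    return min(w1 + w2, w3)

def feasible(w, u=5.0):
    """Upper bound constraint (3.15) on the shortest path cost."""
    return shortest_cost(w) <= u

wA = (2.0, 2.0, 10.0)
wB = (10.0, 10.0, 4.0)
mid = tuple((a + b) / 2 for a, b in zip(wA, wB))   # (6, 6, 7)
print(feasible(wA), feasible(wB), feasible(mid))  # True True False
```

The midpoint of two feasible points is infeasible, which is precisely the failure of convexity: the shortest path cost is a minimum of linear functions, hence concave, so an upper bound on it does not define a convex set.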

The separability and sparsity


The objective function f(w), as initially formulated in (1.9), displays the property of being
separable. A separable function can be expressed as a linear combination of several single-variable
functions. Here, f(w) can be written as a sum of m functions, each of which involves a single
variable:

f(w) = Σ_{i=1}^{m} f_i(w_i), (3.16)

where f_i(w_i) = ½(w_i − w̄_i)², (i = 1, …, m). This property is sometimes efficiently used to speed
up the solving procedure.
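The separable form (3.16) can be verified directly: the objective value is a sum of independent single-variable terms. A small Python check with illustrative (hypothetical) weights:

```python
def f(w, wbar):
    """Objective (1.9): half the squared distance to the given weights."""
    return 0.5 * sum((wi - wbi) ** 2 for wi, wbi in zip(w, wbar))

def f_i(wi, wbi):
    """Single-variable term f_i of the separable form (3.16)."""
    return 0.5 * (wi - wbi) ** 2

w = [1.0, 4.0, 2.0]
wbar = [2.0, 2.0, 2.0]
total = sum(f_i(wi, wbi) for wi, wbi in zip(w, wbar))
print(total == f(w, wbar))  # True: the objective splits term by term
```

Because each term depends on a single variable, each f_i can be evaluated, or even minimized, independently of the others, which is the property exploited to speed up the resolution.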
One common characteristic shared by all the constraints (1.4)–(1.6) is that they are
sparse. The constraints involve the weights of only very few arcs, since they translate either the
non-negativity of an arc weight, or properties of a (shortest) path, which is not eulerian⁷ in large
graphs.

3.2.3 Searching for a strictly convex QP method


The non-convexity yielded by the presence of upper bound constraints (3.14) does not allow us to
ensure that a solution to the inverse problem is global. The very special nature of the inverse
shortest path problem subject to upper bounds on shortest path costs will be analysed in
Chapter 6. In that chapter, local optimality will be defined more precisely and we will set up a
specific method that produces a local solution to this problem.
Deferring the resolution of our inverse problem with upper bound constraints to a later
chapter, we now focus our attention on a method that is able to solve the "basic" inverse shortest
path problem defined by (1.9), (1.4) and (1.5), that is, a strictly convex QP problem with a large
number of sparse constraints involving much redundancy. Note that lower bounds on shortest
path costs will be reintroduced in Chapter 5.

3.3 Note on the complexity of convex QP methods


In order to gain insight into how efficiently solutions to quadratic problems can be computed, one
naturally invokes complexity theory. Vavasis [113] recently proposed a complexity analysis of
convex and non-convex QP in a very comprehensive way. Let us first define some terms usually
used in complexity theory.

3.3.1 Solving a problem


A problem is a function F : C → B, where C is the set of instances encoded as strings of
characters. Quadratic programming is a "problem" in this sense. The set C would be of the form
(m, h, G, a, N, b) encoded as strings of characters. In this case, m and h are integers, G is an m × m
matrix, a is an m-vector, N is an m × h matrix whose i-th column is n_i, and b is an h-vector.
Each such sextuple is an "instance" of quadratic programming. The value of F(m, h, G, a, N, b) is,
for instance, the minimum value of a^T x + ½ x^T Gx over all choices of x satisfying N^T x ≥ b. In this
example, the set B would be the real numbers. A decision problem is a problem where the output set
B consists of two elements {yes, no}. For instance, consider the problem of determining whether
a vector x is a global solution of a QP problem or not.
⁷ That is, using once, and only once, each arc of the graph.

Computing the function F requires the standard model of computation given by the Turing
machine. We do not want to go into details, which can be found in [50, 113]. We just specify that
the action of a Turing machine is deterministic, that is, it cannot make choices. A Turing machine
is said to compute the function F : C → B if, given an instance x of C, it eventually yields F(x) and
halts.

3.3.2 Complexity classes


Complexity refers to the amount of resources required by a computation. A complexity class
is the class of problems that satisfy a certain resource bound. Two well-known complexity classes
are P and NP.
Definition 5 The class P of problems is defined to be those decision problems F : C →
{yes, no} such that a Turing machine M can compute F, and the number of steps required
by M is bounded by p(ℓ), where p is a polynomial and ℓ is the length of the input.
Note that some nontrivial reasoning is needed to allow Turing machines to handle real numbers,
or more restrictively, rational numbers, without losing polynomial bounds when applied to
decision problems.
Another difficulty comes from the fact that optimization problems are not generally stated as
decision problems. The usual way to work around this is the following. Suppose that
the optimization problem is that of minimizing f(x) subject to x ∈ D; the associated decision
problem can be: given f, D, and a rational⁸ number γ, does there exist an x in D such that
f(x) ≤ γ?
Showing that an optimization problem lies in P is of great importance and is considered a
very positive result, since it turns out that when a problem has a polynomial time algorithm,
it generally has an implementation that is efficient in practice. However, there are no techniques
known at present for proving that a problem is not in P. The best technique available is the
theory of NP-completeness, which will be introduced in Chapter 6. For the moment, we just
define the complexity class NP. The class NP contains the decision problems such that all
instances yielding a yes can be verified in polynomial time.
More formally, we need an alphabet, denoted by Σ, which refers to the finite set of characters
that are necessary to encode all useful instances. Then, Σ* denotes the set of all possible finite
instances that can be expressed with the alphabet Σ.
Definition 6 A decision problem F : C → {yes, no} is said to lie in NP if there exists a
polynomial p(ℓ), a finite alphabet Σ, and a Turing machine M (the certificate checker) running in
polynomial time and computing a function Φ : C × Σ* → {yes, no} such that
1. For every x such that F(x) = yes, there exists an instance σ ∈ Σ* (the certificate for x)
such that Φ(x, σ) = yes.
⁸ See the remark just above.

2. For every x such that F(x) = no and for every instance σ ∈ Σ*, Φ(x, σ) = no.
A precision is usually added indicating that, in Item 1, the length of σ is bounded by the value
of p(length of x). This definition means that every yes-instance has a certificate of polynomial
length, and that there exists a Turing machine that can check the certificates in polynomial time.
In the optimization framework, a problem is in NP if one can verify in polynomial time whether
a given point is a solution of that problem or not. Many combinatorial problems are in NP, since
verifying is usually simple for that kind of problem.
As a direct consequence, remark that P ⊆ NP.
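The certificate mechanism can be made concrete on the optimization-to-decision reduction above: a certificate for a yes-instance is a feasible point achieving the threshold, and the checker merely evaluates the objective and the constraints at that point. A minimal sketch (the instance data are hypothetical):

```python
def check_certificate(f, constraints, gamma, x):
    """Polynomial-time certificate checker: does x lie in D (all
    constraints satisfied) and achieve f(x) <= gamma?"""
    return all(c(x) >= 0 for c in constraints) and f(x) <= gamma

# Decision problem: is there x with f(x) <= 2.5 subject to x >= 2?
f = lambda x: 0.5 * x * x
constraints = [lambda x: x - 2.0]      # E(x) = x - 2 >= 0
print(check_certificate(f, constraints, 2.5, 2.0))  # True: x = 2 certifies yes
print(check_certificate(f, constraints, 2.5, 3.0))  # False: f(3) = 4.5 > 2.5
```

A single function evaluation and one pass over the constraints suffice, which is why such a checker runs in polynomial time in the size of the instance and of the certificate.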

3.3.3 Convex quadratic programming


Vavasis showed in [113, Chapter 3] that convex QP problems are in P. The non-convex case will
be of interest in Chapter 6.
Theorem 3.4 The QP problem defined by (3.1)–(3.2) with a positive definite Hessian G is
solvable in polynomial time by a deterministic algorithm.
This theorem and the above comments support the opinion that efficient algorithms for solving
convex QP problems do exist.

3.4 Resolution strategies


During the past three decades, several algorithms have been suggested to solve the QP problem
(3.1)–(3.2). Classifying these methods is not simple, because methods sharing some characteristics
often differ from another point of view. In quadratic (and nonlinear) constrained optimization,
one usually distinguishes between primal and dual methods. The techniques employed for handling
the constraints rather refer to simplex-type methods and active set methods. One can find
both techniques in primal and dual approaches.

3.4.1 Primal and dual methods


The presentation of this section may give the impression of keeping primal and dual methods
apart from each other. The two approaches are not so separate in practice, and there exist methods
mixing both; they are called primal-dual methods.

Primal methods
A primal method works on the original problem directly by searching through the feasible region
for the optimal solution. Each point in the process is (primal) feasible and the value of the
objective function constantly decreases. Such methods benefit from the following advantages:
• Most primal methods do not rely on special problem structure, such as convexity, and hence apply to a wide-ranging class of problems.
• Since each generated point is feasible, if the process halts before reaching the solution, the final point is feasible and may represent an acceptable approximation to the solution of the original problem.
These methods however present major drawbacks. They require an initial procedure to find
a feasible starting point. Difficulties also come from the need to maintain this feasibility
throughout the process. As noticed by Luenberger in [81], some methods can fail to converge for
problems with inequality constraints unless elaborate precautions are taken, but they generally
have good convergence rates, particularly with linear constraints.
A primal method solving our QP problem (3.1)–(3.2) yields a point x* that satisfies the
following conditions, called the Kuhn-Tucker conditions [73]: there exist real numbers u_i ≥ 0, (i =
1, …, h), such that

∇f(x*) − Σ_{i=1}^{h} u_i ∇E_i(x*) = 0 (3.17a)

and

u_i E_i(x*) = 0 for all i = 1, …, h. (3.17b)

The vector u is called the vector of Kuhn and Tucker multipliers or Lagrange multipliers. The
conditions (3.17a)–(3.17b) geometrically mean that the gradient of f at x* lies in the normal
cone defined at x*, that is, it can be expressed as a linear combination of the inward normals to
the binding (or active) constraints at x*. Note that in case of non-degeneracy⁹, the Kuhn and
Tucker multipliers are unique.
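For our inverse objective, where G = I, the Kuhn-Tucker conditions are easy to verify by hand. As an illustration (with hypothetical data), take m = 2, f(w) = ½‖w − w̄‖² with w̄ = (−1, 2), and the single constraint E_1(w) = w_1 ≥ 0: at w* = (0, 2), the multiplier u_1 = 1 satisfies (3.17a)–(3.17b), since ∇f(w*) = (1, 0) = u_1 ∇E_1(w*):

```python
def kkt_holds(wstar, wbar, u, tol=1e-9):
    """Check the Kuhn-Tucker conditions (3.17a)-(3.17b) for
    min 0.5*||w - wbar||^2  s.t.  E_1(w) = w_1 >= 0.

    grad f(w) = w - wbar;  grad E_1 = (1, 0).
    """
    g = [wstar[0] - wbar[0], wstar[1] - wbar[1]]
    stationarity = abs(g[0] - u) < tol and abs(g[1]) < tol  # (3.17a)
    complementarity = abs(u * wstar[0]) < tol               # (3.17b)
    return u >= 0 and stationarity and complementarity

print(kkt_holds((0.0, 2.0), (-1.0, 2.0), 1.0))  # True
print(kkt_holds((0.0, 2.0), (-1.0, 2.0), 0.0))  # False: gradient not in the normal cone
```

Geometrically, the gradient (1, 0) points along the inward normal of the binding constraint w_1 ≥ 0, exactly the normal cone condition described above.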
For reference, famous primal methods include the gradient projection methods and the reduced
gradient method. Both basic methods can be viewed as the method of steepest descent
applied on the manifold defined by the binding or active constraints. In linear programming,
the simplex method is well known for travelling through extremal points of the convex feasible
region.

Dual methods
A dual method does not tackle the original constrained problem directly, but instead considers
an alternate problem, the dual problem, whose unknowns are the Lagrange multipliers of the
first problem. The Lagrange multipliers, for convenience denoted by u ∈ R^h, in a sense measure the
sensitivity of the constraints as they appear in the following function, from which the dual objective
is built, usually called the Lagrange function or Lagrangian:

L(x, u) := f(x) − Σ_{i=1}^{h} u_i E_i(x), (3.18)

where L(x, u) is the Lagrange function of the QP problem defined by (3.1)–(3.2). For a problem
with m variables and h constraints, dual methods thus work in the h-dimensional space of the
⁹ When the ∇E_i(x*), for i such that E_i(x*) = 0, are linearly independent.

Lagrange multipliers u, and solve the dual problem, which is formulated as follows:

max_{u ∈ R^h} d(u) (3.19)

subject to

u ≥ 0, (3.20)

where

d(u) = min_{x ∈ R^m} L(x, u). (3.21)

For our convex QP problem, it can be shown that d(u) is concave (any local maximum is then
global), since d(u) = −½ u^T (N^T G^{-1} N) u + u^T (b + N^T G^{-1} a) − ½ a^T G^{-1} a, where N is the m × h
matrix whose i-th column is n_i. The non-negativity of the Lagrange multipliers (3.20) is due
to the inequality constraints (3.2) of the original or primal problem; equality constraints would
leave their Lagrange multipliers unconstrained in sign. Once these multipliers are known, one
must determine the solution point in the space of primal variables x, such that Gx = Nu − a,
in order to supply the desired solution of the QP problem. A method solving the above dual
problem actually searches for a saddle-point (x*, u*) of the Lagrange function L(x, u), that is, a
point such that

L(x*, u*) ≤ L(x, u*) for all x ∈ R^m (3.22a)

and

L(x*, u*) ≥ L(x*, u) for all u ≥ 0. (3.22b)
If such a point is found, then x* is a global optimum of the primal problem.
Dual methods offer the following attractive features:
• In the context of practical applications, Lagrange multipliers have meaningful intuitive interpretations as prices associated with the constraints.
• Dual methods do not require an initial primal feasible point.
• Global convergence of dual methods is often guaranteed.
The efficiency of dual methods however relies heavily on the convexity of the problem. Dual
procedures also have the disadvantage of supplying a primal feasible solution only when they
terminate.
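The duality machinery can be followed on a one-dimensional instance (chosen purely for illustration): minimize f(x) = ½x² subject to x ≥ 2, so that G = 1, a = 0, n = 1 and b = 2. The Lagrangian is L(x, u) = ½x² − u(x − 2); the inner minimization gives x = u, hence d(u) = −½u² + 2u, maximized at u* = 2 with d(u*) = 2 = f(x*) at the primal optimum x* = 2:

```python
def d(u):
    """Dual function d(u) = min_x (0.5*x**2 - u*(x - 2)); the inner
    minimum is attained at x = u, giving d(u) = -0.5*u**2 + 2*u."""
    return -0.5 * u * u + 2.0 * u

# Maximize d over a grid of u >= 0 and recover the primal solution.
us = [k / 100.0 for k in range(0, 501)]
u_best = max(us, key=d)
x_best = u_best               # primal recovery: G*x = n*u - a  =>  x = u
print(u_best, x_best, d(u_best))  # 2.0 2.0 2.0, matching f(x*) = 0.5 * 2**2
```

Note how the dual iterates (u < 2) correspond to primal points x = u that are infeasible (x < 2): primal feasibility is only attained at termination, exactly the disadvantage mentioned above.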

Primal-dual relation
For primal and dual feasible solutions x and u, we have that d(u) ≤ f(x). Under differentiability
conditions, optimal points of the primal and dual problems yield the same primal and dual
objective function values:

f(x*) = d(u*). (3.23)

In our case, the objective function f(x) and the constraints E_i(x) are convex and differentiable.
Then, x* is a global minimum if and only if the Kuhn and Tucker conditions are satisfied. This
result is stated in the theorem that follows.

Theorem 3.5 Provided that there exists x ∈ R^m such that E_i(x) > 0 for i = 1, …, h (the Slater
condition¹⁰), the Kuhn and Tucker conditions are necessary and sufficient conditions for a
global optimum x* of the QP problem defined by (3.1) and (3.2), where f(x) is convex: there
exists u* ≥ 0 verifying

∇_x L(x*, u*) = 0 (3.24a)

and

u*_i E_i(x*) = 0, for all i = 1, …, h. (3.24b)

The notation ∇_x indicates the partial derivative with respect to the variable x.
A proof of this theorem can be found in [82].

3.4.2 Simplex-type and active set methods


The methods used to solve QP problems can for the most part be categorized as either modified
simplex-type methods or active set methods.

Simplex-type methods
Quadratic programming methods that wear the "simplex label" have one or more of the following
properties: they use simplex tableaux; they perform Gaussian-elimination-type pivots on basis
matrices that are derived from the Kuhn-Tucker optimality conditions (3.24a)–(3.24b); finally,
they may reduce to the simplex method for linear programming in the degenerate case G = 0.
They roughly consist of a generalization of the simplex method to quadratic programming.
These methods are inappropriate for our inverse problem, since the pivot operations are performed
on matrices of row size (m + h). The number of our constraints h = n_E being typically
exponential, the simplex-type methods would require matrices too large to operate correctly.
See [27] for further details concerning such methods.

Active set methods


These methods are based upon projections onto active sets of constraints and employ operators
of size no larger than m × m. They are consequently generally more efficient and require less
storage than methods of the simplex type. In fact, as mentioned by Luenberger in [81, Chapter 14],
a quadratic program with inequality constraints is almost always solved by an active set
method. In particular, active set techniques handle much better the possible redundancy among
the constraints.
In an active set method, some inequality constraints, indexed by the active set denoted by
A, are regarded as equalities while the remaining constraints are temporarily disregarded. The
method updates this active set in order to identify the correct active constraints at the solution
to the original problem (3.1)–(3.2). On iteration k, a point¹¹ x(k) is calculated which satisfies the
¹⁰ This condition, along with the convexity of both the objective function and the constraints, implies that the
Lagrange function has a saddle-point.
¹¹ This point will be primal feasible if the active set strategy is used by a primal method.

active constraints as equalities, that is, n_i^T x(k) = b_i for i ∈ A. Moreover, apart from degenerate
cases, n_i^T x(k) > b_i for i ∉ A. Each iteration thus attempts to locate the solution x(k) of an
equality constrained problem in which only the active constraints occur. To the solution x(k)
corresponds a vector of Lagrange multipliers u(k) for the active constraints in A. The vectors
x(k) and u(k) then verify the Kuhn and Tucker conditions, possibly without primal or dual
feasibility, depending upon whether the active set technique is combined with a dual or primal
method, respectively. Some constraints thus need to be added to and/or removed from the active
set, and a new subproblem then comes under consideration. This is repeated until the Kuhn-Tucker
conditions for the original problem are validated. Technical details about active sets can be
found in [41, Chapter 10].

3.4.3 Choosing a particular method


Let us proceed to a short survey of quadratic programming methods, in order to choose the one
that is the most appropriate for solving our inverse problem.
The first known QP methods appeared in the late fifties and early sixties: Frank and Wolfe
[46], Beale [4], Wolfe [115], Dantzig [27], and Van de Panne and Whinston [109] set up the first primal
QP algorithms. Influenced by the efficiency of the simplex method in linear programming,
these methods operate on simplex tableaux. They modify the quadratic problem using Kuhn
and Tucker's developments so that the simplex method can be used. Except for Wolfe's method,
these procedures were dedicated to strictly convex QP problems. Another characteristic of these
methods is that they start from a primal feasible point, which does not necessarily minimize the
objective function.
In Beale's method, inequality constraints are converted to equations, and an initial feasible
solution is found for the constraint equations without considering the nature of the objective
function. Further steps are attempts to improve the value of the objective function.
At the same time, Theil and Van de Panne [108] proposed a first dual method for solving
QP problems. They were shortly followed by Lemke [75], who also proposed a dual algorithm. Note
that Lemke [74] had previously developed the dual-simplex method. Both QP methods assume the strict
convexity of the objective function and follow identical steps in their progress. They were thus far
preferable to other methods when the solution lay on relatively few of the constraining hyperplanes.
Theil, Van de Panne and Lemke used an active set strategy without naming it. They had the idea
of starting from the unconstrained minimum, which is −G^{-1}a, where the gradient of f vanishes.
In contrast with the above primal methods, this initial point is usually primal infeasible, but it
minimizes the objective function. The dual procedure however needs the inverse of the Hessian
G, but, as noticed in [76], the computations involved in finding a primal feasible solution are
approximately equivalent to finding G^{-1}. Each step of the dual method then consists of solving
subproblems where some inequality constraints are satisfied in equational form (as explained in
the previous section). Theil and Van de Panne devised ingenious rules in order to limit the search
to a small number of all possible solution combinations.
At the beginning of this section, we mentioned the difficulty of classifying QP methods.

Goldfarb showed in [53] that Beale's method, which was developed as an extension of linear
programming, can be viewed as an active set method. This is corroborated in [75], where Lemke
finds his method very close to that of Beale. Fletcher [40] writes similar comments about Dantzig's
method, and Van de Panne and Whinston [110] showed that Beale's method and Dantzig's method
generate the same sequence of points if they both start from the same initial point. This brings
forward the fact that QP methods apparently originate from the same basic idea, and that they
differ by the relative point of view with which they have been developed. The relative merits of
QP methods then show up through extensive computational experience. Indeed, remember that
storage considerations invited us to prefer active set methods over simplex-type
ones. For experimental results about the above algorithms, see those obtained by Fletcher [40] and
Goldfarb [53].
Later, Goncalves [58] proposed a primal-dual algorithm based on simplex techniques, and
Stoer [103] developed a method for constrained least-squares. Again, both approaches carry out
Gaussian elimination pivots on potentially large matrices. Gill and Murray [52] set up a primal
method using an active set and QR factorizations of the matrix formed by the normals to the
active constraints. Their method applies to indefinite QP problems and has the advantage of being
numerically stable, but it needs an initial primal feasible point. The search for an initial feasible
point can be avoided by an alternative approach proposed by Conn [22], minimizing a penalty
function. This modification allows the iterates to be infeasible. According to Gill and Murray
[52], results produced by Conn and Sinclair [23] do not allow firm conclusions.
A more recent method for convex QP was suggested by Goldfarb and Idnani [55]. It can be
viewed as a dual active set method. They use the idea of Theil and Van de Panne of starting from
the unconstrained minimizer of the quadratic function, and factorize the matrix of the normals
to the active constraints by techniques similar to those employed by Gill and Murray. The factorization
techniques bring numerical stability to the procedure. The Goldfarb and Idnani (GI)
method thus seems to gather the advantages of prior methods that are appropriate for solving
large convex QP problems involving redundancy. Indeed, following Fletcher's appreciation [41],
the method is most effective when there are only a few active constraints at the solution,
and it is also able to take advantage of a good estimate of the active set A at the solution. This
latter advantage makes the GI method suitable for sequential quadratic programming methods
for nonlinearly constrained optimization calculations. Powell [96] analyzed the Goldfarb and Idnani
method in the special case where the Hessian G is ill-conditioned due to a tiny eigenvalue.
Powell's conclusions cast some doubt on the numerical stability of Goldfarb and Idnani's implementation.
Powell proposed in [97] a stable implementation of the GI method that circumvents
these difficulties. Note that our Hessian G = I, the identity matrix, does not fall within the scope of
Powell's improvements.
The GI method suitably meets the requirements for solving our convex QP problem. The
next section is devoted to its analysis.

3.5 The Goldfarb and Idnani method


The Goldfarb and Idnani method (GI) solves the problem of minimizing f(x) = a^T x + ½ x^T Gx
over x ∈ R^m, subject to E_i(x) ≡ n_i^T x − b_i ≥ 0 for i = 1, …, h, where G is positive definite.
The matrix G is m × m, a and the n_i are in R^m, and b is in R^h. This problem will be referred to
as the convex quadratic program CQP. Since the inverse shortest path algorithm presented in
Chapter 4 will rely heavily on the GI method, we present here a detailed analysis whose aim is
both to establish the primal and dual step formulations, and to prove the finite termination of
the GI algorithm. This detailed presentation closely follows that used by Goldfarb and Idnani in
their paper [55].

3.5.1 Basic principles and notations


The GI method uses an active set, which is a subset of the h linear inequality constraints that
are satisfied as equalities by the current estimate x of the solution to CQP. The set A will index
the constraints of the active set. Since Goldfarb and Idnani use a dual approach, they must
first provide an initial dual feasible point, that is, a primal optimal point for some subproblem of the
original problem. By relaxing all constraints (A = ∅ and the Lagrange multipliers u = 0), the
unconstrained minimum of f(x), −G^{-1}a, is such a point. The dual method then iterates until
primal feasibility (i.e. dual optimality) is achieved, while maintaining the primal optimality of
intermediate subproblems (i.e. dual feasibility). Let us define a subproblem P(K) as the QP
problem of minimizing f(x) subject only to the subset of constraints E_i(x) ≥ 0, i ∈ K, where K
is a subset of {1, …, h}. The unconstrained minimum is then the solution to P(∅). The solution
to P(K) lies on some "linearly independent active set of constraints" indexed by A ⊆ K; one says
that this solution x is part of an S-pair (x, A). By linear independence of a set of constraints, we
mean that the inward normals n_i corresponding to these constraints are linearly independent.
If the dual problem to CQP is set up, then the GI method is equivalent to a primal active
set method applied to this dual problem. In [55], Goldfarb and Idnani discuss their dual
method in terms of primal subproblems, finding that approach more instructive. We now propose
to examine their approach. Some proofs left undetailed in [55] will be refined here.

3.5.2 The GI algorithm


Here is the basic approach followed by Goldfarb and Idnani.
Algorithm 3.1
Step a. Assume that some S-pair (x, A) is given, typically (−G^{-1}a, ∅).
Step b. Pick some q such that constraint q is infeasible in CQP.
Step c. If P(A ∪ {q}) is infeasible, then exit: CQP is infeasible.
Step d. Else, determine a new S-pair (x̄, Ā ∪ {q}) such that Ā ⊆ A and f(x̄) > f(x). Set
(x, A) ← (x̄, Ā ∪ {q}).
Step e. If all constraints are satisfied, then exit: x is the optimal solution to CQP.
Else, go to Step b.
Note that, in Step b, the index q belongs to {1, …, h} \ A.
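The outer loop of Algorithm 3.1 can be traced on a toy instance. The sketch below is not the GI method itself (it omits the dual steps, the multiplier updates and the infeasibility test of Step c); it only mimics Steps a, b, d and e for an instance where each equality-constrained subproblem is trivial, namely G = I with coordinate bound constraints, so that fixing the active coordinates at their bounds solves the subproblem exactly. All data and names are hypothetical:

```python
def gi_sketch(wbar, bounds):
    """Toy trace of Algorithm 3.1 for min 0.5*||x - wbar||^2 subject to
    coordinate bounds x[i] >= bounds[i]. The constraint normals are unit
    vectors, so each equality-constrained subproblem is solved by fixing
    the active coordinates at their bounds."""
    x = list(wbar)          # Step a: unconstrained minimum, A empty
    active = set()
    while True:
        violated = [i for i in range(len(x))
                    if i not in active and x[i] < bounds[i]]
        if not violated:    # Step e: primal feasible -> optimal
            return x, active
        q = violated[0]     # Step b: pick a violated constraint
        active.add(q)       # Step d: new S-pair with q active; f increases
        x = [bounds[i] if i in active else wbar[i] for i in range(len(x))]

x, A = gi_sketch([-1.0, 0.0], [0.0, 1.0])
print(x, sorted(A))  # [0.0, 1.0] [0, 1] after two dual iterations
```

The iterates x = (−1, 0), then (0, 0), then (0, 1) are primal infeasible until the last one, while the objective value increases monotonically, mirroring the dual behaviour described in the previous sections.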
We need some more notation to describe the algorithm more formally. Let |A| be the number of constraints in A, and N be the m × |A| matrix whose columns are the normals n_i of the constraints in the active set A. The algorithm will use two additional matrices when the columns of N are linearly independent:

    N* = (N^T G^{-1} N)^{-1} N^T G^{-1},    (3.25)

which is the Moore-Penrose generalized inverse of N in the space of variables under the transformation y = G^{1/2} x, and, if I is the m × m identity matrix,

    H = G^{-1} (I − N N*),    (3.26)

which is the reduced inverse Hessian of the quadratic objective function in the subspace of points satisfying the active constraints. Indeed, since N N* is the operator of the projection along the subspace of points satisfying the active constraints, (I − N N*) is a (generally non-orthogonal) projection onto the manifold verifying the active constraints. Note that H is symmetric. The operators N* and H satisfy the following properties.
Property 4  Hw = 0 ⇔ w = Nλ, with λ ∈ R^{|A|}.

Proof. Since G^{-1} is invertible, Hw = 0 ⇔ (I − N N*)w = 0 ⇔ w = Nλ for some λ ∈ R^{|A|}, since (I − N N*) is a projection along the subspace spanned by N. □

Property 5  HGH = H.

Proof. This is easy to see: HGH = G^{-1}(I − N N*)^2 = G^{-1}(I − N N*) = H. □

As a consequence, H is positive semi-definite, since for all x ∈ R^m one can write

    x^T H x = x^T H G H x = (Hx)^T G (Hx) ≥ 0,    (3.27)

by symmetry of H and positive definiteness of G.

Property 6  N* G H = 0.

Proof. The left-hand side equals N* − N* N N*, where N* N equals the identity matrix. □
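Properties 4-6 are easy to check numerically on random data. The sketch below (the sizes m and k and the random seed are arbitrary choices, not taken from the text) builds N* and H as in (3.25)-(3.26) and tests the identities:

```python
import numpy as np

rng = np.random.default_rng(0)
m, k = 6, 3                      # illustrative sizes: variables, active constraints

B = rng.standard_normal((m, m))
G = B @ B.T + m * np.eye(m)      # a symmetric positive definite Hessian
N = rng.standard_normal((m, k))  # normals of the active constraints (full rank)

Ginv = np.linalg.inv(G)
Nstar = np.linalg.solve(N.T @ Ginv @ N, N.T @ Ginv)   # operator (3.25)
H = Ginv @ (np.eye(m) - N @ Nstar)                    # operator (3.26)

assert np.allclose(H, H.T)             # H is symmetric
assert np.allclose(H @ N, 0)           # Hw = 0 for w in the range of N (Property 4)
assert np.allclose(H @ G @ H, H)       # Property 5
assert np.allclose(Nstar @ G @ H, 0)   # Property 6
print("all identities hold")
```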
Let us denote the feasible region of P(A) by F(A), where

    F(A) = { x ∈ R^m | n_i^T x = b_i, i ∈ A },    (3.28)

and the gradient of the objective f at x by g(x) ≡ ∇f(x) = Gx + a. Then we can state the following theorem.
Theorem 3.6  Suppose that x̂ belongs to F(A). Then the minimum

    min_{x ∈ F(A)} f(x)    (3.29)

is attained at x = x̂ − H g(x̂).

Proof. By Taylor's formula, f(x) can be developed as

    f(x) = f(x̂) + g^T(x̂)(x − x̂) + (1/2)(x − x̂)^T G (x − x̂).    (3.30)

The minimizer x of f over F(A) is characterized by the condition (I − N N*)∇f(x) = 0, i.e.

    (I − N N*)(g(x̂) + G(x − x̂)) = 0.    (3.31)

Multiplying (3.31) by G^{-1} and using (3.26) gives H g(x̂) + H G (x − x̂) = 0; since x, x̂ ∈ F(A) implies N^T(x − x̂) = 0, and since N* G = (N^T G^{-1} N)^{-1} N^T, one checks that H G (x − x̂) = x − x̂. The final expression of x follows. □
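Theorem 3.6 can be verified numerically against a direct solve of the equality-constrained KKT system; the data below are randomly generated for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
m, k = 5, 2
B = rng.standard_normal((m, m))
G = B @ B.T + m * np.eye(m)          # SPD Hessian
a = rng.standard_normal(m)
N = rng.standard_normal((m, k))      # active-constraint normals
b = rng.standard_normal(k)

Ginv = np.linalg.inv(G)
Nstar = np.linalg.solve(N.T @ Ginv @ N, N.T @ Ginv)
H = Ginv @ (np.eye(m) - N @ Nstar)

x_hat = np.linalg.lstsq(N.T, b, rcond=None)[0]   # some point of F(A)
x = x_hat - H @ (G @ x_hat + a)                  # the step of Theorem 3.6

# Reference: solve the equality-constrained KKT system
# [G  N; N^T  0] (x, -u) = (-a, b) directly.
KKT = np.block([[G, N], [N.T, np.zeros((k, k))]])
sol = np.linalg.solve(KKT, np.concatenate([-a, b]))
assert np.allclose(x, sol[:m])       # same minimizer
assert np.allclose(N.T @ x, b)       # and it lies in F(A)
print("x_hat - H g(x_hat) solves the equality-constrained subproblem")
```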
The expression of x in Theorem 3.6 is similar to the solution of Newton's equation, adapted to the case of a projection onto a subspace of R^m. Since x is the optimal solution of P(A), it satisfies the Kuhn-Tucker condition (3.17a):

    g(x) = N u(x),    (3.32)

where u(x) ≥ 0 is the vector of Lagrange multipliers associated with the active constraints at x. From (3.25) we then have

    u(x) ≡ N* g(x) ≥ 0,    (3.33)

and

    H g(x) = 0,    (3.34)

since, by (3.32), N u(x) = N N* g(x) = g(x), which is equivalent to (I − N N*) g(x) = 0. Conditions (3.33) and (3.34) reflect the dual feasibility and the primal optimality of x with respect to P(A), respectively. These conditions are sufficient as well as necessary for x to be the optimal solution of P(A). Let us now detail Algorithm 3.1 in the light of these observations. The following algorithm makes use of another set of multipliers,

    r = N* n_q,    (3.35)

which Goldfarb and Idnani call infeasibility multipliers.
Algorithm 3.2

step 0: Find the unconstrained minimum.
Set x ← −G^{-1}a, f ← (1/2) a^T x, H ← G^{-1}, A ← ∅ and u ← 0.

step 1: Choose a violated constraint, if any.
Compute the constraint values {E_i(x)}_{i=1}^h. If all constraints are satisfied, the current x is the desired solution. Otherwise, a violated constraint is chosen, that is, an index q is selected in {1, ..., h} such that E_q(x) < 0. Also set

    u⁺ ← (u, 0)^T if |A| > 0,    u⁺ ← 0 if |A| = 0.    (3.36)

step 2: Compute the primal and dual step directions.
These directions are computed by the relations

    s = H n_q    (3.37)

and, if |A| > 0,

    r = N* n_q.    (3.38)

step 3: Determine the maximum steplength to preserve dual feasibility.
Define

    S = { j ∈ {1, ..., |A|} | r_j > 0 }.    (3.39)

The maximal steplength that preserves dual feasibility is then given by

    t_f = u⁺_ℓ / r_ℓ = min_{j ∈ S} u⁺_j / r_j  if S ≠ ∅,    t_f = +∞ otherwise.    (3.40)

step 4: Determine the steplength to satisfy the q-th constraint.
This steplength is only defined when s ≠ 0, and is then given by

    t_c = −E_q(x) / (s^T n_q).    (3.41)

step 5: Take the step and update the active set.
If t_f = +∞ and s = 0, then the original CQP is infeasible and the algorithm stops with a suitable message.
Otherwise, if s = 0, update the Lagrange multipliers by

    u⁺ ← u⁺ + t_f (−r, 1)^T    (3.42)

and drop the ℓ-th constraint, that is, A ← A \ {ℓ}, where ℓ has been determined in (3.40). Then go back to step 2 after updating H and N*.
If s ≠ 0, t_c is well defined, and one sets

    t = min[t_f, t_c],    (3.43)

    x ← x + t s,    (3.44)

    f ← f + t ((1/2) t + u⁺_{|A|+1}) s^T n_q    (3.45)

and

    u⁺ ← u⁺ + t (−r, 1)^T if |A| > 0,    u⁺ ← u⁺ + t if |A| = 0.    (3.46)

If t = t_c, then set u ← u⁺, add constraint q, that is, A ← A ∪ {q}, and go back to step 1 after updating H and N*. If, on the other hand, t = t_f, drop the ℓ-th constraint, that is, A ← A \ {ℓ}, and go back to step 2 after updating H and N*.

Note that u, the vector of Lagrange multipliers, has dimension equal to the number of active constraints.
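For concreteness, Algorithm 3.2 can be transcribed into a small dense prototype. The sketch below is illustrative only (names and structure are mine, and it makes no claim of matching the thesis implementation): it recomputes the operators N* and H from scratch whenever the active set changes, whereas an efficient implementation would update factorizations instead.

```python
import numpy as np

def goldfarb_idnani(G, a, C, b, tol=1e-10):
    """Dense sketch of Algorithm 3.2: min 0.5 x'Gx + a'x  s.t.  C x >= b."""
    m = G.shape[0]
    Ginv = np.linalg.inv(G)
    x = -Ginv @ a                          # step 0: unconstrained minimum
    active, u = [], np.zeros(0)

    def operators():
        """Recompute N* (3.25) and H (3.26) for the current active set."""
        if not active:
            return None, Ginv
        N = C[active].T                    # columns are the active normals n_i
        Nstar = np.linalg.solve(N.T @ Ginv @ N, N.T @ Ginv)
        return Nstar, Ginv @ (np.eye(m) - N @ Nstar)

    while True:
        viol = C @ x - b                   # step 1: constraint values E_i(x)
        q = int(np.argmin(viol))
        if viol[q] >= -tol:
            return x, active, u            # all satisfied: x is optimal
        nq = C[q]
        u_plus = np.append(u, 0.0)
        while True:
            Nstar, H = operators()
            s = H @ nq                                   # (3.37)
            r = Nstar @ nq if active else np.zeros(0)    # (3.38)
            S = [j for j in range(len(active)) if r[j] > tol]
            if S:                                        # (3.40)
                l = min(S, key=lambda j: u_plus[j] / r[j])
                tf = u_plus[l] / r[l]
            else:
                l, tf = None, np.inf
            if np.linalg.norm(s) <= tol:
                if np.isinf(tf):
                    raise ValueError("CQP is infeasible")
                u_plus[:-1] -= tf * r                    # dual step (3.42)
                u_plus[-1] += tf
                active.pop(l)
                u_plus = np.delete(u_plus, l)
                continue
            tc = -(nq @ x - b[q]) / (s @ nq)             # (3.41)
            t = min(tf, tc)
            x = x + t * s                                # (3.44)
            u_plus[:-1] -= t * r                         # (3.46)
            u_plus[-1] += t
            if t == tc:                                  # full step: add q
                active.append(q)
                u = u_plus
                break
            active.pop(l)                                # partial step: drop l
            u_plus = np.delete(u_plus, l)

# Toy instance: project the unconstrained minimum (1, 1) onto x1 + x2 >= 3.
x, act, u = goldfarb_idnani(np.eye(2), np.array([-1.0, -1.0]),
                            np.array([[1.0, 1.0]]), np.array([3.0]))
print(x)            # → [1.5 1.5]
```

On this toy instance the unconstrained minimum violates the single constraint and one full step projects it onto the constraint boundary, where the multiplier u = (1.5) is the length of the dual step taken.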
We observe that the GI algorithm involves three types of iterations.
1. The first occurs when the newly violated constraint is linearly independent from those already in the active set, and all the active constraints remain active at the new solution of the QP subject to the augmented set of constraints. This occurs when t = t_c.
2. The second occurs when the newly violated constraint is linearly dependent on those already in the active set. This occurs when s = 0, or, equivalently, when Nr = n_q. In order to preserve independence of the active set (that is, linear independence of the columns of N), an old constraint (the ℓ-th) is dropped from the set before incorporating the new one. As a result, N always has full column rank.
3. The third occurs when the solution of the QP subject to the augmented set of constraints is such that one of these constraints is not binding. This occurs when t = t_f, in which case the ℓ-th constraint ceases to be binding. As one wishes to keep only binding constraints in the active set, this constraint is dropped.
An efficient implementation of this algorithm does not need to compute and store the operators N* and H explicitly. One can store and update the matrices J = Q^T L^{-1} and R obtained from the Cholesky and QR factorizations G = LL^T and L^{-1}N = Q [R; 0]. We will give more details about these factorizations when, in Chapter 4, we specialize the GI method to our inverse shortest path problem.

Before examining the algorithm's properties, we need to introduce some more notation. The set A⁺ denotes the set A ∪ {q}, where q ∈ {1, ..., h} \ A is the index of the constraint that is to be added to the active set. Similarly, A⁻ refers to a subset of A containing one fewer element than A. Accordingly, N⁺ and N⁻ are the matrices of inward normals corresponding to A⁺ and A⁻, respectively. The normal n⁺ denotes the vector n_q appended to N to give N⁺, and n⁻ is the column removed from N to give N⁻. In agreement with this, (N⁺)* and H⁺ denote the operators defined in (3.25) and (3.26), respectively, with N⁺ instead of N. Finally, the m-vector e_i is the i-th column of the identity matrix I.

The next two sections discuss the cases where the columns of N⁺ are linearly independent or not.
3.5.3 Linear independence of the constraints
According to the mechanism of Algorithm 3.2, when a new S-pair, (x̄, A ∪ {q}) say, has to be determined during Steps 2-5, given the S-pair (x, A), we already know that

    E_q(x) < 0,    (3.47)

and that

    E_i(x) = 0 for all i ∈ A.    (3.48)

Since x is the optimal solution of P(A), we have that H g(x) = 0 and u(x) ≡ N* g(x) ≥ 0, according to (3.34) and (3.33). By Property 4, H g(x) = 0 is equivalent to the fact that g(x) can be written as a linear combination of columns of N. Consequently, g(x) is a linear combination of columns of N⁺ too, that is, g(x) = N⁺ λ⁺. Now, if n_q is linearly independent from the n_i (i ∈ A), one can use the operators H⁺ and (N⁺)*, and the last result is equivalent to

    H⁺ g(x) = 0,    (3.49)

again by Property 4. Moreover, λ⁺ is the vector of Lagrange multipliers associated with A⁺ at x, with λ⁺_{|A|+1} = 0. We can then assert that

    u⁺(x) ≡ (N⁺)* g(x) ≥ 0.    (3.50)

Let us gather these properties into a definition.

Definition 7  A triple (x, A, q), where q ∈ {1, ..., h} \ A, is said to be a V-triple (violated triple) if both n_q is linearly independent from the columns of N and the equations (3.47)-(3.50) hold with A⁺ = A ∪ {q}.

The following lemma shows how to find a point x̄ minimizing f on F(A⁺) from the point x of a V-triple (x, A, q).
Lemma 3.7  Consider a V-triple (x, A, q) and points of the form

    x̄ = x + t s,    (3.51)

where the primal step direction s is defined by

    s = H n⁺.    (3.52)

Then we have the following:

    H⁺ g(x̄) = 0,    (3.53)

    E_i(x̄) = 0 for all i ∈ A,    (3.54)

    u⁺(x̄) ≡ (N⁺)* g(x̄) = u⁺(x) + t (−r, 1)^T,    (3.55)

where

    r = N* n_q.    (3.56)

Moreover,

    E_q(x̄) = E_q(x) + t s^T n⁺.    (3.57)
Proof. We first need to establish some results before proving the above equations:

    g(x̄) = G(x + ts) + a = g(x) + t G s;    (3.58)

from (3.52) and (3.56), we can write Gs as

    Gs = (I − N N*) n⁺ = n⁺ − N r = N⁺ (−r, 1)^T;    (3.59)

    H⁺ N⁺ = 0,    (3.60)

because (I − N⁺ (N⁺)*) is a projection along the subspace spanned by the columns of N⁺;

    (N⁺)* N⁺ = I,    (3.61)

since n_q is linearly independent from the n_i indexed by A; finally,

    n_i^T H n⁺ = (H n_i)^T n⁺ = 0 for all i ∈ A,    (3.62)

because (I − N N*) n_i = 0 for i ∈ A.
In order to show (3.53), we use successively (3.51), (3.58)-(3.60) and (3.49):

    H⁺ g(x̄) = H⁺ g(x) + t H⁺ G s = t H⁺ N⁺ (−r, 1)^T = 0,    (3.63)

where the first term vanishes by (3.49) and the last by (3.60).
Now, let us prove that x̄ belongs to F(A):

    E_i(x̄) = n_i^T (x + ts) − b_i = E_i(x) + t n_i^T s = t n_i^T H n⁺ = 0 for all i ∈ A,    (3.64)

where E_i(x) = 0 by (3.48) and n_i^T H n⁺ = 0 by (3.62).
The modification of the Lagrange multipliers is as follows:

    u⁺(x̄) ≡ (N⁺)* g(x̄) = (N⁺)* g(x) + t (N⁺)* G s          by (3.58)
           = (N⁺)* g(x) + t (N⁺)* N⁺ (−r, 1)^T              by (3.59)
           = u⁺(x) + t (−r, 1)^T                            by (3.50) and (3.61).    (3.65)

Finally, the evaluation (3.57) of E_q(x̄) is established by a development similar to that in (3.64). □
By Lemma 3.7 we can determine the point x_c = x + t_c s that minimizes f over F(A⁺): it is the point such that E_q(x_c) = 0, which implies that t_c = −E_q(x)/(s^T n_q) (if s^T n_q ≠ 0). Moreover, (x_c, A⁺) will be an S-pair if u⁺(x_c) ≥ 0. If not, then (3.55) provides a smaller steplength t_f < t_c at which some component u⁺_i(x_f) reaches zero (and would become negative beyond t_f), where x_f = x + t_f s. The constraint, say ℓ ∈ A, corresponding to that i-th component is dropped from the active set, and (x_f, A⁻, q) satisfies the conditions of a V-triple, where A⁻ = A \ {ℓ}. This is formally stated in the following theorem.
Theorem 3.8  Let (x, A, q) be a V-triple and x̄ be defined as in Lemma 3.7 with

    t = min{t_c, t_f},    (3.66)

where

    t_c = −E_q(x) / (s^T n⁺)    (3.67)

and

    t_f = u⁺_ℓ(x)/r_ℓ = min_{j ∈ S} u⁺_j(x)/r_j  if S ≠ ∅,    t_f = +∞ otherwise,    (3.68)

where S = { j ∈ {1, ..., |A|} | r_j > 0 }. The multipliers u⁺(x) and r are given by (3.55) and (3.56), respectively. Then we have that

    E_q(x̄) ≥ E_q(x)    (3.69)

and we observe the following increase of the objective function f:

    f(x̄) − f(x) = t s^T n⁺ ((1/2) t + u⁺_{|A|+1}(x)) ≥ 0.    (3.70)

If t = t_c, then (x̄, A ∪ {q}) is an S-pair, and if t = t_f, then (x̄, A \ {ℓ}, q) is a V-triple.

In the definition (3.68) of t_f, we abused the notation of the index ℓ, which is actually the index of the constraint as defined in CQP (1 ≤ ℓ ≤ h), and not its index j(ℓ) in the vector u⁺(x), where 1 ≤ j(ℓ) ≤ |A|. For the sake of simplicity, let ℓ refer to the dropped constraint in either case. Let us now prove the above theorem.
Proof. Let us first note that s = G^{-1}(I − N N*) n⁺ ≠ 0, since the linear independence of n⁺ from the columns of N implies that n⁺ does not belong to the null space of (I − N N*), and G^{-1} is positive definite. Then, since (x, A, q) is a V-triple, we can write

    s^T n⁺ = (n⁺)^T H n⁺ = (n⁺)^T H G H n⁺ = s^T G s > 0,    (3.71)

where the second equality uses Property 5, the third uses (3.52), and positivity holds because G is positive definite and s ≠ 0. As a consequence, t ≥ 0 and, by (3.57), E_q(x̄) = E_q(x) + t s^T n⁺ ≥ E_q(x). Remark that when t = t_c, E_q(x̄) > E_q(x), since t_c > 0 by (3.47) and (3.67).
On the other hand, using Taylor's formula on f with x̄ − x = ts, one has that

    f(x̄) − f(x) = t s^T g(x) + (1/2) t² s^T G s.    (3.72)

By Property 4, H⁺ g(x) = 0 implies that g(x) = N⁺ u⁺(x). It then follows that H g(x) = H N⁺ u⁺(x) = H n⁺ u⁺_{|A|+1}(x), since H projects along the manifold spanned by the columns of N (the non-zero contribution remains that of the (|A|+1)-th column of N⁺, which is n⁺). Consequently,

    s^T g(x) = (n⁺)^T H g(x) = (n⁺)^T H n⁺ u⁺_{|A|+1}(x) ≥ 0    (3.73)
by (3.71) and (3.55). Substituting (3.71) and (3.73) into (3.72) gives (3.70). Moreover, as long as t > 0, f(x̄) > f(x).
Lemma 3.7 and the definition (3.66)-(3.68) of t ensure that x̄ is primal optimal for P(A⁺) (H⁺ g(x̄) = 0), primal feasible for P(A) (E_i(x̄) = 0 for i ∈ A), and that u⁺(x̄) is dual feasible (≥ 0). If t = t_c, then E_q(x̄) = 0, x̄ is primal feasible for P(A⁺) and (x̄, A ∪ {q}) is an S-pair. We have then performed a full step in the primal space. If t = t_f < t_c, then E_q(x̄) < 0 and u⁺_ℓ(x̄) = 0. Since H⁺ g(x̄) = 0, the latter equation implies that

    g(x̄) = N⁺ u⁺(x̄) = Σ_{i ∈ A ∪ {q} \ {ℓ}} u⁺_{j(i)}(x̄) n_i,    (3.74)

where i is the j(i)-th index in A⁺. Consequently, (x̄, A \ {ℓ}, q) is a V-triple, since the set of normals { n_i | i ∈ A ∪ {q} \ {ℓ} } is of course linearly independent. We have then performed a partial step in the primal space. □

The above theorem allows us to obtain an S-pair (x̄, Ā ∪ {q}) from a V-triple (x, A, q), with Ā ⊆ A, such that f(x̄) > f(x). This is achieved after |A| − |Ā| partial steps (this number is at most |A|) followed by one full step.

3.5.4 Linear dependence of the constraints
We now handle the case where the normal n⁺ is linearly dependent on the columns of the matrix N. This does not allow (x, A, q) to be a V-triple. Two situations may then occur:
- either the subproblem P(A ∪ {q}) is infeasible,
- or a constraint can be removed from the active set A so that (x, A⁻, q) is a V-triple.
The first situation implies that the original problem CQP is also infeasible. The second situation involves a constraint drop which is similar to the partial step described in Theorem 3.8. If n⁺ is a linear combination of the columns of N, then the primal step direction s defined by (3.52) becomes

    s = H n⁺ = 0,    (3.75)

since H projects along the subspace spanned by the columns of N. As a consequence, the steplength t_c computed to satisfy the q-th constraint in the primal space is infinite. The following theorem gives the procedure to follow in such a case.
Theorem 3.9  Assume that (x, A) is an S-pair and that q is the index of a constraint in the set {1, ..., h} \ A such that

    n⁺ ≡ n_q = N r    (3.76)

and

    E_q(x) < 0.    (3.77)

If r ≤ 0, then P(A ∪ {q}) is infeasible; otherwise, constraint ℓ can be dropped from the active set, where ℓ verifies

    u_ℓ(x)/r_ℓ = min { u_i(x)/r_i | r_i > 0, 1 ≤ i ≤ |A| },    (3.78)
to give A⁻ = A \ {ℓ} and the V-triple (x, A⁻, q).

Again, in (3.78), we abuse the notation ℓ as we did in Theorem 3.8.

Proof. Suppose that there exists a feasible point x̄ = x + s̄ of the problem P(A ∪ {q}). On the one hand, x̄ must verify E_q(x̄) ≥ 0. Consequently, since (n⁺)^T s̄ = (n⁺)^T x̄ − (n⁺)^T x > 0 (the first term is at least b_q, while the second is less than b_q by (3.77)), one can write

    (n⁺)^T s̄ = r^T N^T s̄ > 0,    (3.79)

using (3.76). On the other hand, x̄ must also satisfy E_i(x̄) ≥ 0 for all i ∈ A.
- Since (x, A) is an S-pair, E_i(x̄) = E_i(x) + n_i^T s̄ = n_i^T s̄ for i ∈ A, so that feasibility with respect to A requires

    N^T s̄ ≥ 0.    (3.80)

- For i = q, (n⁺)^T s̄ must exceed −E_q(x), which is strictly positive.
One directly sees that, if r ≤ 0, requirements (3.79) and (3.80) cannot be simultaneously satisfied; hence, in this case, problem P(A ∪ {q}) is infeasible.
If some component of r is positive, it follows from (3.78) that r_ℓ > 0, and from (3.76) that

    n_ℓ = (1/r_{j(ℓ)}) [ n⁺ − Σ_{i ∈ A⁻} r_{j(i)} n_i ].    (3.81)

Since (x, A) is an S-pair, we have that

    g(x) = N u(x) = Σ_{i ∈ A} u_{j(i)}(x) n_i
         = Σ_{i ∈ A⁻} u_{j(i)}(x) n_i + u_{j(ℓ)}(x) n_ℓ    (3.82)
         = Σ_{i ∈ A⁻} [ u_{j(i)}(x) − (u_{j(ℓ)}(x)/r_{j(ℓ)}) r_{j(i)} ] n_i + (u_{j(ℓ)}(x)/r_{j(ℓ)}) n⁺,  by (3.81).

Note that in (3.81)-(3.82) we distinguish between the ℓ-th constraint and its index j(ℓ) in the active set.
Now, if we define Â = A⁻ ∪ {q}, then N̂ has full rank, that is,

    Σ_{i ∈ Â} α_i n_i = 0    (3.83)

implies α_i = 0 for i ∈ Â. Indeed, suppose that (3.83) holds. Then, by (3.76), we can write

    Σ_{i ∈ A⁻} α_i n_i + α_q Σ_{i ∈ A} r_{j(i)} n_i = 0,    (3.84)

that is,

    Σ_{i ∈ A⁻} (α_i + α_q r_{j(i)}) n_i + α_q r_{j(ℓ)} n_ℓ = 0.    (3.85)
Since { n_i | i ∈ A } is linearly independent, we deduce from (3.85) that α_i + α_q r_{j(i)} = 0 for all i ∈ A⁻, as well as α_q r_{j(ℓ)} = 0. We know that r_{j(ℓ)} > 0. Thus α_q = 0 and hence α_i = 0 for i ∈ A⁻. The matrix N̂ then has full rank.
It then follows from (3.82) and Property 4 that Ĥ g(x) = 0 and, since N̂* n_i = e_i for i ∈ A⁻,

    û(x) ≡ N̂* g(x) has components u_{j(i)}(x) − (u_{j(ℓ)}(x)/r_{j(ℓ)}) r_{j(i)} ≥ 0 for i ∈ A⁻,
    and u_{j(ℓ)}(x)/r_{j(ℓ)} ≥ 0 for the component associated with q,    (3.86)

where the nonnegativity follows from the choice (3.78) of ℓ. This establishes that (x, A⁻, q) is a V-triple. We have then performed a dual step. □

Note that the change that occurs to the active set A and to the dual variables in the partial step described in Theorem 3.8 is the same as that performed in the dual step described just above. The only difference is that x is not changed in the dual step (there is no step in the primal space), while this primal modification generally occurs in a partial step. A dual step emphasizes that, when degeneracy occurs, it is possible to take non-trivial steps in the space of dual variables without changing x and f(x).

3.5.5 Finite termination of the GI algorithm
The termination of the GI algorithm is stated in the following theorem.

Theorem 3.10  The dual Algorithm 3.2 solves CQP, or indicates that it has no feasible solution, in a finite number of steps.

Proof. Each time Step 1 of Algorithm 3.2 is executed, the current point x solves the subproblem P(A) and (x, A) is an S-pair. If x satisfies all the constraints of CQP, then it is an optimal solution to CQP. Otherwise, a new S-pair (x̄, Ā) is obtained, after one full step preceded by at most |A| ≤ min{h, m} partial and/or dual steps, or infeasibility is detected according to Theorem 3.8 and Theorem 3.9. If the problem is not infeasible, the algorithm then returns to Step 1 and we have that f(x̄) > f(x). This shows that an S-pair can never re-occur. Since the number of possible S-pairs is finite, Algorithm 3.2 terminates in a finite number of iterations. □
The numerical results presented by Goldfarb and Idnani in [55] and by Powell in [96, 97] encourage the use of the GI method over the methods presented in the previous sections. The GI implementation always chooses the most violated constraint to add to the active set. In [55], one can find an example showing different "solution paths" resulting from diverse constraint selection strategies. The example shows that the most-violated heuristic performs well in practice. Note that primal methods cannot choose which constraint to add to the active set (unless small infeasibility is tolerated). The GI dual method proved superior to primal algorithms because it tends not to add many constraints to the active set that are not in the final active set: the number of "drops" is relatively small. Compared to other dual methods, the GI implementation is far more efficient and also more numerically stable.
4

Solving the inverse shortest path problem
In this chapter, the basic inverse shortest path problem is considered, where the constraints are given as a set of shortest paths and nonnegativity constraints on the weights. We introduce the concept of "island" in order to characterize the violation of shortest path constraints. The violation of an explicit shortest path constraint creates one or more islands: an island is made of two "shores"; the first shore indicates a portion of the computed shortest path and the second shore is the succession of arcs that the path should (but does not) follow; both shores have common end vertices. In order to follow the framework of Goldfarb and Idnani's method, we establish specialized formulations of the primal and dual step directions, of the update of the arc weights, and of the maximum steplength preserving dual feasibility. These new formulations are "island-oriented". We also provide a way to check whether the primal step direction s is zero without computing s explicitly. A computational algorithm is then proposed. Our method is then tested on practical large-scale problems with large numbers of constraints. These tests confirm the efficiency of our method, since few constraints, or islands, are added to the active set that are not active at the solution.
The content of this chapter has been published in [15].

4.1 The problem
The inverse shortest path problem was motivated in Chapter 1 with examples drawn from both traffic modelling and seismic tomography. We therefore directly recall the formal basic problem together with its special nature.

We find it convenient to recall the notation that will be used throughout this chapter. A weighted oriented graph is a triple (V, A, w), where (V, A) is an oriented graph with n vertices and m arcs, and where w is a set of nonnegative weights {w_i}_{i=1}^m associated with the arcs. We denote the vertices of V by {v_k}_{k=1}^n and the arcs of A by {a_j = (v_{s(j)}, v_{t(j)})}_{j=1}^m, with s(j) being the index of the vertex at the origin of the j-th arc and t(j) the index of the vertex at its end.

We assume that such a weighted oriented graph (V, A, w̄) is given, together with a set of acyclic paths

    p_j = (a_{j1}, a_{j2}, ..., a_{j,l(j)})  (j = 1, ..., n_E),    (4.1)
solving the inverse shortest path problem 56
where l(j) is the number of arcs in the j-th path (its length), and where

    t(j_i) = s(j_{i+1}) for i = 1, ..., l(j) − 1.    (4.2)

If we define w̄ as the vector in the nonnegative orthant of R^m whose components are the given initial arc weights {w̄_i}, the problem is then to determine w, a new vector of arc weights, and hence a new weighted graph G = (V, A, w), such that

    min_{w ∈ R^m} ‖w − w̄‖    (4.3)

is achieved under the constraints that

    w_i ≥ 0  (i = 1, ..., m)    (4.4)

and that the paths {p_j}_{j=1}^{n_E} are shortest paths in G.

Remember that we decided to restrict ourselves to the ℓ2 norm in order to profit from the quadratic programming framework. As a consequence, our inverse shortest path problem becomes

    min_w (1/2) Σ_{i=1}^m (w_i − w̄_i)²    (4.5)

subject to (4.4) and the n_E shortest path constraints. In Chapter 3, we established that these last constraints may be expressed as a (possibly large) set of linear constraints of the type

    Σ_{k | a_k ∈ p'_j} w_k ≥ Σ_{k | a_k ∈ p_j} w_k  (j = 1, ..., n_E),    (4.6)

where p'_j is any path with the same origin and destination as p_j. As a consequence, the set of feasible weights, F say, is convex, as it is the intersection of a collection of half-spaces. The problem of minimizing (4.5) subject to (4.4) and (4.6) is then a classical quadratic programming (QP) problem. This QP is however quite special, because its constraint set is (potentially) very large¹, very structured, and possibly involves a nonnegligible amount of redundancy. Also, the problem of minimizing (4.5) on the set F of feasible weights may be considered as the computation of the projection of the unconstrained minimum onto the convex set F. Again, the special structure of F distinguishes this problem from a more general projection.
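To make the projection viewpoint concrete: with a single constraint of type (4.6), the minimizer of (4.5) is the Euclidean projection of w̄ onto a half-space, which is available in closed form. The two-arc network below is a made-up illustration (the prescribed path uses arc 1, a parallel alternative path uses arc 2):

```python
import numpy as np

w_bar = np.array([3.0, 1.0])   # initial weights: prescribed arc, alternative arc
n = np.array([-1.0, 1.0])      # normal of the constraint (4.6): w2 - w1 >= 0

excess = n @ w_bar             # negative excess means the constraint is violated
w = w_bar - (excess / (n @ n)) * n if excess < 0 else w_bar.copy()
print(w)                       # → [2. 2.]  (both paths now cost the same)
```

The violation is split evenly between the two arcs, which is exactly the behaviour the quadratic objective (4.5) induces.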

4.2 Algorithm design

4.2.1 The Goldfarb-Idnani method for convex quadratic programming

The algorithm we present below is a specialization of the dual QP method of Goldfarb and Idnani [55]. Let us recall the idea of this method, presented in Chapter 3: it computes a sequence of optimal solutions to quadratic programming problems involving only some of the constraints that are present in the original problem, that is, a sequence of dual feasible points.

¹ In general, the number of constraints can be exponential.
An active set of constraints is maintained by the procedure, that is, a set of constraints which are binding at the current stage of the calculation. A new violated constraint is incorporated into this set at every iteration of the procedure (some other constraint may be dropped from it), and the objective function value monotonically increases until it reaches the desired optimum. This approach was chosen for two main reasons.
- Since the Goldfarb-Idnani (GI) algorithm is a dual method, it is extremely easy to incorporate new constraints once a first solution has been computed. In our context, this means that, if a new set of prescribed shortest paths is given, modest computational effort will be required to update the solution of the problem.
- The GI method has an excellent reputation for efficiency, especially in the case where the number of constraints is large and near-degeneracy very likely. In particular, the method avoids slow progress along very close extremal points of the constraint set F.
The GI method and its efficient implementation are also discussed in the literature, by Goldfarb and Idnani in their original paper, but also by Powell in [96] and [97], for example.
Because our method heavily relies on the GI algorithm, we now state this method in its full generality. In this form, it is designed for solving the QP problem given by

    min_x f(x) = a^T x + (1/2) x^T G x,
    subject to E_i(x) = n_i^T x − b_i ≥ 0  (i = 1, ..., h),    (4.7)

where x, a and {n_i}_{i=1}^h belong to R^m, G is an m × m symmetric positive definite matrix, b is in R^h and the superscript T denotes the transpose. As indicated above, the GI algorithm maintains a set of currently active constraints, A say, and relies on the matrix N whose columns are the normals n_i of the constraints in the active set A. The matrix N is thus of dimension m × |A|, where |A| is the number of constraints in A. The algorithm also uses two additional matrices, namely

    N* = (N^T G^{-1} N)^{-1} N^T G^{-1},    (4.8)

which is the Moore-Penrose generalized inverse of N in the space of variables under the transformation y = G^{1/2} x, and

    H = G^{-1} (I − N N*),    (4.9)

which is the reduced inverse Hessian of the quadratic objective function in the subspace of points satisfying the active constraints.

We do not restate here the GI algorithm, which is given in detail in Chapter 3 (Algorithm 3.2). We also refer the reader to Chapter 3 and [55] for further details on the general GI algorithm, and in particular for the proof that it indeed solves the QP (4.7), provided a solution exists.
Our purpose, in the next paragraphs, is to specialize the GI algorithm to the inverse shortest path problem given by (4.5), (4.4) and (4.6). We will therefore examine the successive stages of the algorithm presented above, where the structure of the problem allows some refinement.
4.2.2 Constraints in the active set

We first wish to analyze how to detect the violation of constraints (4.6), as required in Step 1.

Shortest path constraints
For each of the given paths p_j, we first define P_j as the set of vertices in V that are attained by this path, that is,

    P_j = { s(a_{j1}), t(a_{j1}), t(a_{j2}), ..., t(a_{j,l(j)}) }.    (4.10)

The vertex s(a_{j1}) is called the origin or source of the j-th path, and is denoted s_j. For every such path p_j with source s_j and for a given vector w of arc weights, it is then possible to compute all the shortest paths in (V, A, w) from the source s_j to all the other vertices of P_j. We will then detect a violated constraint if, for some vertex v ∈ P_j \ {s_j}, the predecessor of v on the shortest path from s_j to v is different from the predecessor of v in the path p_j.
In this situation, it is easy to verify that there must be a vertex x ∈ P_j closest to v (possibly s_j) such that x is also on the shortest path from s_j to v. Furthermore, there exist two distinguished paths from x to v: the first one, denoted I⁺, is the shortest path, and the second one, denoted I⁻, is given as a subpath of p_j. The set of both these paths is called a violating island and is denoted by I. The path I⁺ is called its positive shore, while I⁻ is called its negative shore. Furthermore, the excess of the island, denoted by E, is defined as the cost of the positive shore minus the cost of the negative shore. The constraint associated with the island I is therefore violated when its excess is negative.

[Figure: a graph with a top row v1 → v2 → v3 → v4 (arcs a1, a2, a3), a bottom row v5 → v6 → v7 → v8 (arcs a11, a12, a13), downward arcs a4 = (v1, v5), a5 = (v2, v6), a6 = (v3, v7), a7 = (v4, v8), and diagonal arcs a8 = (v5, v2), a9 = (v6, v3), a10 = (v7, v4).]

Figure 4.1: A first example

In the small example given in Figure 4.1, we assume that the weight vector w is given by the relation w_j = j (that is, the arc a_j has a weight of j), while the constraint paths are given by

    p_1 = (a_1, a_5, a_12, a_13) and p_2 = (a_11, a_12, a_10).    (4.11)

At this point, it is not difficult to verify that the shortest path from v_1 to v_8 is the path

    (a_1, a_2, a_3, a_7).    (4.12)

Hence a constraint related to the path p_1 is violated at the vertex v_8, because the predecessor of v_8 on its shortest path from v_1, that is v_4, is different from its predecessor on the constraint path, which is v_7. The vertex v above is then v_8, while inspection shows that the relevant vertex x is v_2. The corresponding violating island is then

    I = ((a_2, a_3, a_7), (a_5, a_12, a_13)),    (4.13)

where I⁺ = (a_2, a_3, a_7) is its positive shore, I⁻ = (a_5, a_12, a_13) its negative shore, and whose associated excess E is (2 + 3 + 7) − (5 + 12 + 13) = −18. This violating island is not the only one for this example. A second one, related to the path p_2, is given for instance by

    I' = ((a_8, a_2, a_3), (a_11, a_12, a_10)),    (4.14)

whose excess E' is equal to −20.
A violated constraint of type (4.6) therefore corresponds to a violating island in (V, A, w). When it is incorporated into the active set, the constraint is enforced as an equality and the costs of its negative and positive shores are exactly balanced (see Section 4.2.5). The corresponding island is then called active.
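The detection procedure just described (a shortest-path tree from each source, predecessor comparison, and a backward walk to the vertex x) can be sketched as follows. The arc list encodes the graph as reconstructed from Figure 4.1, the function names are illustrative, and the routine reports one island per vertex with a mismatched predecessor:

```python
import heapq

# Arcs a_j = (tail, head) of Figure 4.1; the weight of a_j is j.
arcs = {1: (1, 2), 2: (2, 3), 3: (3, 4), 4: (1, 5), 5: (2, 6), 6: (3, 7),
        7: (4, 8), 8: (5, 2), 9: (6, 3), 10: (7, 4), 11: (5, 6),
        12: (6, 7), 13: (7, 8)}
w = {j: float(j) for j in arcs}

def dijkstra(source):
    """Predecessor arcs of the shortest-path tree rooted at `source`."""
    adj = {}
    for j, (u, v) in arcs.items():
        adj.setdefault(u, []).append((v, j))
    dist, pred = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue
        for v, j in adj.get(u, []):
            if d + w[j] < dist.get(v, float("inf")):
                dist[v], pred[v] = d + w[j], j
                heapq.heappush(heap, (d + w[j], v))
    return pred

def islands(path):
    """Violating islands (positive shore, negative shore) of a prescribed path."""
    source = arcs[path[0]][0]
    pred = dijkstra(source)
    verts = [source] + [arcs[j][1] for j in path]   # the vertex set P_j, in order
    found = []
    for i in range(1, len(verts)):
        v = verts[i]
        if pred[v] == path[i - 1]:
            continue                   # predecessors agree: no violation at v
        pos, u = [], v                 # walk the shortest-path tree back from v
        while u not in verts[:i]:
            pos.append(pred[u])
            u = arcs[pred[u]][0]
        pos.reverse()                  # u is now the vertex `x` of the text
        neg = path[verts.index(u):i]   # negative shore: subpath of the given path
        found.append((pos, neg))
    return found

def excess(island):
    pos, neg = island
    return sum(w[j] for j in pos) - sum(w[j] for j in neg)

p1, p2 = [1, 5, 12, 13], [11, 12, 10]
for I in islands(p1) + islands(p2):
    print(I, excess(I))
```

Among the printed islands one finds ([2, 3, 7], [5, 12, 13]) with excess −18 and ([8, 2, 3], [11, 12, 10]) with excess −20, which are the islands I and I' of (4.13) and (4.14).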

Nonnegativity constraints and bounds on the arc weights

The nonnegativity constraints (4.4) must also be taken into account. When one of them is violated, which is easy to detect, it may also be incorporated into the active set, along with the active islands. These bounds are then also called active. They will be regarded in the sequel as active islands with only one arc in the positive shore and no negative shore.
The active set at a given stage of the calculation will therefore contain a number of active islands (with or without a negative shore). It will be denoted by A = (V, Y), where V is the set of currently active islands with a negative shore and Y the set of active islands without a negative shore, that is, the set of active bounds.

4.2.3 The dual step direction

The next stage of the specialization of the Goldfarb-Idnani algorithm to our inverse shortest path problem is the computation of the dual step direction r in (3.38). As in [55] and [97], this calculation, which is equivalent to

    r = (N^T G^{-1} N)^{-1} N^T G^{-1} n_q,    (4.15)

can be performed by maintaining a triangular factorization of the matrix N^T G^{-1} N. However, our problem has the very important feature that the Hessian matrix G of the quadratic objective is the identity I. This obviously induces a number of useful algorithmic simplifications, the first one being that (4.15) can be rewritten as

    r = (N^T N)^{-1} N^T n_q.    (4.16)

The matrix N* is then nothing but the unweighted Moore-Penrose generalized inverse of N. Therefore, we will only maintain a triangular factorization of the form

    N^T N = R^T R,    (4.17)
where R is an upper triangular matrix of dimension |A|. Since N is of full rank, this is equivalent
to maintaining a QR factorization of N of the form

    N = ( Q_1  Q_2 ) ( R )  =: QU,        (4.18)
                     ( 0 )

as is the case in the numerical solution of unconstrained linear least squares problems. Indeed,
it is straightforward to verify that (4.16) is the solution of

    min_r ||N r - n_q||_2.        (4.19)
The second useful simplification due to the special structure of the problem arises in the com-
putation of the product N^T n_q in (4.16). The resulting vector indeed contains in position i the
inner product of the i-th active constraint normal with the normal of the q-th constraint. As
both these constraints may be interpreted as islands, the question is then to compute the inner
product of the new island, corresponding to the q-th constraint, with all the already active islands.
We then obtain the following simple result.
Lemma 4.1 The vector N^T n_q appearing in (4.16) is given componentwise by

    [N^T n_q]_i = |I_j^+ ∩ I_q^+| + |I_j^- ∩ I_q^-| - |I_j^+ ∩ I_q^-| - |I_j^- ∩ I_q^+|        (4.20)

for i = 1, ..., |A| and j equal to the index of the i-th active island.

Proof. Since

    [N^T n_q]_i = n_i^T n_q,        (4.21)

it is useful to note that, because of (4.4) and (4.6),

    [n_ℓ]_k = { +1 if a_k ∈ I_ℓ^+,
                -1 if a_k ∈ I_ℓ^-,
                 0 otherwise,        (4.22)

for k = 1, ..., m and ℓ ∈ A ∪ {q}. This equation holds for both types of islands (with or without
a negative shore). Taking the inner product of two such vectors (for ℓ = j and ℓ = q) then yields
(4.20). □
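As an illustration, the shore-intersection formula (4.20) can be evaluated directly from set representations of the shores; the following sketch (names hypothetical) treats an active bound as an island whose negative shore is empty.

```python
def island_inner_product(pos_j, neg_j, pos_q, neg_q):
    """Inner product of two island normals, computed via (4.20) from the
    cardinalities of shore intersections rather than from the normals
    themselves. pos_* / neg_* are sets of arc indices; an active bound
    is an island whose negative shore is empty."""
    return (len(pos_j & pos_q) + len(neg_j & neg_q)
            - len(pos_j & neg_q) - len(neg_j & pos_q))
```

Only the four intersections are needed, so the cost is proportional to the shore sizes rather than to the total number of arcs m.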
As a consequence, the practical computation of r may be organized as follows:

1. compute the vector y ∈ R^{|A|} whose i-th component is given by (4.20),

2. perform a forward triangular substitution to solve the equation

       R^T z = y        (4.23)

   for the vector z ∈ R^{|A|},

3. perform a backward triangular substitution to solve the equation

       R r = z        (4.24)

   for the desired vector r.
This calculation will be a very important part of the total computational effort per iteration of
the algorithm.
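Under the assumption that the factor R of (4.17) is available, steps 1 to 3 amount to two triangular solves; a minimal NumPy sketch (function name hypothetical):

```python
import numpy as np

def dual_step_direction(R, y):
    """Solve (R^T R) r = y via (4.23)-(4.24): a forward substitution with
    R^T followed by a backward substitution with R, where R is the upper
    triangular factor of N^T N and y = N^T n_q from (4.20)."""
    k = len(y)
    z = np.zeros(k)
    for i in range(k):                    # forward: R^T z = y
        z[i] = (y[i] - R[:i, i] @ z[:i]) / R[i, i]
    r = np.zeros(k)
    for i in range(k - 1, -1, -1):        # backward: R r = z
        r[i] = (z[i] - R[i, i + 1:] @ r[i + 1:]) / R[i, i]
    return r, z
```

The resulting r coincides with the least squares solution of (4.19), which provides an easy consistency check.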

4.2.4 Interpretation of the dual step direction


The Lagrange parameter u_i in a sense represents a relative "price" to pay for relaxing the i-th
(active) constraint at the current stage of the calculation.

- A null price u_i = 0 ensures, by the complementarity condition (3.24b), that constraint i
  does not play any significant role at the current point.

- Higher values of u_i indicate that contour lines of f correspond to lower values of the objective
  function in the direction of the outward normal to the i-th constraint (that is, towards the
  infeasible region).

Theorem 3.7 exhibits the vector r as the opposite of the step in the dual space. Constraints that
should leave the active set, when needed, must consequently have a strictly positive corresponding
component in r. Indeed, such constraints have lower relaxation costs in the direction
along which the current solution progresses. A good relaxation choice should also be made
according to the relative price decrease. Since u ≥ 0, the best relative relaxation choice must
follow equation (3.68).

4.2.5 Determination of the weights

We now examine the way in which changes in the weights may be computed. In the original
GI method, both primal and dual step directions are computed once a new constraint has been
selected for inclusion in the active set (as described in Step 2). In our framework, the computation
of the new values of the primal variables may be completely deferred until after that of the dual
step, in a rather simple way, as will now be shown. This adaptation may be viewed as another
consequence of the fact that G = I for our problem.

Before stating this result more precisely, we introduce some more notation. In order to
complete the description of the set {1, ..., m} given an active set A = (V, Y), we recall the
definition of Y as

    Y := {i ∈ {1, ..., m} | w_i = 0}        (4.25)

and we define the sets

    X := {i ∈ {1, ..., m} \ Y | ∃ j ∈ V : a_i ∈ I_j}        (4.26)

and

    Z := {1, ..., m} \ (X ∪ Y).        (4.27)

The set X thus contains the indices of the arcs that are involved in one of the active islands of V
but are not fixed at their lower bounds. The set Z contains the indices of the arcs that are not
involved at all in the active constraints of A.

For i ∈ X, we also define

    I^+(i) := {j ∈ V | a_i ∈ I_j^+}   and   I^-(i) := {j ∈ V | a_i ∈ I_j^-}.        (4.28)

Hence, I^+(i) (resp. I^-(i)) is the set of active islands of V whose positive (resp. negative)
shore contains the arc a_i.

We finally define the logical indicator function χ[·] by

    χ[condition] = { 1 if condition is true,
                     0 if condition is false.        (4.29)
We can now state our lemma.

Lemma 4.2 Consider a dual feasible solution for the problem of minimizing (4.5) subject to the
constraints given by an active set A = (V, Y). Assume furthermore that, among the Lagrange
multipliers {u_k}_{k=1}^{|A|}, those associated with the active islands of V are known. Then the weight
vector w corresponding to this dual solution is given by

    w_i = χ[i ∈ X ∪ Z] c_i + χ[i ∈ X] [ Σ_{k ∈ I^+(i)} u_k - Σ_{k ∈ I^-(i)} u_k ]        (4.30)

for i = 1, ..., m.

Proof. We first note that we can restrict our attention to the weights that are not at their
bounds (i ∈ X ∪ Z), because we know, by definition, that w_i = 0 for i ∈ Y. Every active island
in V thus corresponds to a constraint of the form

    Σ_{k | a_k ∈ I^+ ∧ k ∉ Y} w_k - Σ_{k | a_k ∈ I^- ∧ k ∉ Y} w_k = 0.        (4.31)

The desired expression for w_i (i ∈ X ∪ Z) immediately follows from the Lagrangian equation

    ∂L(w, u)/∂w_i = 0,        (4.32)

where the Lagrangian function for the problem is given by

    L(w, u) = (1/2) Σ_{i ∈ X ∪ Z} (w_i - c_i)^2 - Σ_{k=1}^{|A|} u_k [ Σ_{i | a_i ∈ I_k^+ ∧ i ∉ Y} w_i - Σ_{i | a_i ∈ I_k^- ∧ i ∉ Y} w_i ]
            = (1/2) Σ_{i ∈ X ∪ Z} (w_i - c_i)^2 - Σ_{i ∈ X} w_i [ Σ_{k ∈ I^+(i)} u_k - Σ_{k ∈ I^-(i)} u_k ],        (4.33)

where we restrict the last major sum to the set X because all other terms are zero. □
The lemma simply means that the i-th weight can be obtained from c_i by adding to it all
Lagrange multipliers corresponding to active islands such that a_i belongs to the positive shore
of the island, and by subtracting all the multipliers of active islands such that a_i belongs to the
negative shore.
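A direct transcription of (4.30) can make this concrete (all names hypothetical): the weight of an arc at its bound is zero, and otherwise its a priori value c_i is corrected by the multipliers of the active islands containing it.

```python
def recover_weights(c, u, I_pos, I_neg, Y):
    """Recover w from the active-set multipliers via (4.30).
    c: a priori weights; u: dict, island index -> multiplier u_k;
    I_pos[i] / I_neg[i]: the sets I+(i) / I-(i) of active islands whose
    positive / negative shore contains arc a_i; Y: arcs at their bound."""
    w = []
    for i in range(len(c)):
        if i in Y:
            w.append(0.0)                              # w_i = 0 on Y
            continue
        wi = c[i]
        wi += sum(u[k] for k in I_pos.get(i, ()))      # k in I+(i)
        wi -= sum(u[k] for k in I_neg.get(i, ()))      # k in I-(i)
        w.append(wi)
    return w
```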
Consider now the computation of the primal step direction s and of the inner product s^T n_q.
Note first that, when (3.41) is reached in the algorithm, the primal step direction s is nonzero
and n_q is linearly independent of the columns of N. The value of s^T n_q is then given by the
following result.

Lemma 4.3 Assume that the GI algorithm is applied to the inverse shortest paths problem under
consideration, and that it has reached the point where equation (3.41) should be evaluated. Assume
furthermore that A = (V, Y) is the active set at this stage of the calculation. Then the primal
step direction s is given componentwise by

    s_i = χ[a_i ∈ I_q^+] - χ[a_i ∈ I_q^-] + χ[i ∈ X] [ Σ_{k ∈ I^-(i)} r_k - Σ_{k ∈ I^+(i)} r_k ]        (4.34)

for i = 1, ..., m. As a consequence,

    s^T n_q = 1 + Σ_{k ∈ I^-(q)} r_k - Σ_{k ∈ I^+(q)} r_k        (4.35)

in the case where the q-th constraint is the lower bound on the q-th weight, and

    s^T n_q = Σ_{i | a_i ∈ I_q^+} [ 1 + Σ_{k ∈ I^-(i)} r_k - Σ_{k ∈ I^+(i)} r_k ]
            + Σ_{i | a_i ∈ I_q^-} [ 1 + Σ_{k ∈ I^+(i)} r_k - Σ_{k ∈ I^-(i)} r_k ]        (4.36)

in the case where the q-th constraint is a violating island.

Proof. We first note that s, the change in the weight w corresponding to a unit step in the
dual step direction, can be viewed as the sum of two different terms, s = n_q - N r. The first term
corresponds to the incorporation of the q-th constraint in the active set, and its contribution to
s_i is +1 if a_i belongs to the positive shore of the q-th island and -1 if a_i belongs to its negative
shore. This is because the (|A|+1)-th component of the dual step direction, corresponding to
the q-th constraint, is equal to +1. Hence we have that this first contribution is equal to

    χ[a_i ∈ I_q^+] - χ[a_i ∈ I_q^-]        (4.37)

for the i-th arc. Note that only one of the indicator functions can be nonzero in (4.37). The
second contribution corresponds to the modifications of w_i caused by the fact that a_i may also
belong to islands that are already active. In other words, the nonzero components of -r have to
be taken into account. Equation (4.30) then implies that this second contribution, from the
Lagrange multipliers associated with all constraints already in the active set, must be equal to

    χ[i ∈ X] [ Σ_{k ∈ I^-(i)} r_k - Σ_{k ∈ I^+(i)} r_k ].        (4.38)

Summing the contributions (4.37) and (4.38) gives (4.34).

Assume now that the q-th constraint is a lower bound. In this case, one has that n_q = e_q, the
q-th vector of the canonical basis of R^m. Hence the product s^T n_q is equal to s_q. Equation (4.30),
the nonnegativity of the {w_i}_{i=1}^m and the fact that w_q < 0 imply that q ∈ X, and (4.35) then
follows from (4.34). On the other hand, if the q-th constraint is a violating island, the normal n_q
is given componentwise by (4.22) with ℓ = q. Hence we obtain (4.36) from (4.34). □
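Formula (4.34) can likewise be evaluated without forming N explicitly; a sketch under the same set representations as before (names hypothetical):

```python
def primal_step_direction(m, pos_q, neg_q, I_pos, I_neg, r, X):
    """Primal step s of (4.34): the entering normal n_q contributes +1 on
    the positive shore and -1 on the negative shore of island q, and the
    already active islands contribute -(N r)_i for i in X."""
    s = []
    for i in range(m):
        si = (1.0 if i in pos_q else 0.0) - (1.0 if i in neg_q else 0.0)
        if i in X:
            si += sum(r[k] for k in I_neg.get(i, ()))   # + sum over I-(i)
            si -= sum(r[k] for k in I_pos.get(i, ()))   # - sum over I+(i)
        s.append(si)
    return s
```

Summing the s_i over the shores of island q then gives s^T n_q as in (4.35) and (4.36).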

4.2.6 Modifying the active set


The active set modifications (in Step 5 of the GI algorithm) finally require the updating or
downdating of the triangular matrix R introduced above in (4.17).
Assume first that the ℓ-th constraint is dropped from the active set A. This amounts to
dropping a column of N in (4.18), which, in turn, is equivalent to dropping a column of the
upper triangular matrix R. The resulting matrix is therefore upper Hessenberg, and a sequence
of Givens plane rotations is applied to restore the upper triangular form. This technique is quite
classical, and has already been used in the more general implementations of the GI method, both
in [55] and [97]. The reader is referred to those papers for further details in the context of the
GI algorithm, and to [56] for general information on Givens plane rotations and their practical
computation.
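A small NumPy sketch of this downdate (names hypothetical): deleting column ℓ leaves an upper Hessenberg matrix, and one Givens rotation per subdiagonal entry restores triangularity.

```python
import numpy as np

def drop_column(R, ell):
    """Delete column ell of the upper triangular factor R and restore
    upper triangular form with Givens rotations; returns the factor of
    the reduced active set (one row and one column smaller)."""
    H = np.delete(R, ell, axis=1)           # upper Hessenberg from col ell on
    for j in range(ell, H.shape[1]):
        a, b = H[j, j], H[j + 1, j]
        rho = np.hypot(a, b)
        if rho == 0.0:
            continue
        c, s = a / rho, b / rho
        G = np.array([[c, s], [-s, c]])     # rotation zeroing H[j+1, j]
        H[j:j + 2, j:] = G @ H[j:j + 2, j:]
    return H[:-1, :]                        # last row is now zero
```

Since the rotations are orthogonal, the product H^T H, and hence the factorization of the reduced N^T N, is preserved.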
If one now wishes to add the q-th constraint to the active set, then N has one more column,
namely n_q, and the resulting matrix U in (4.18) then has the form

    ( R   Q_1^T n_q )
    ( 0   Q_2^T n_q ),        (4.39)

where Q_1 and Q_2 are defined in (4.18). Again, this matrix should be restored to triangular
form, and again this can be done by premultiplying it by suitable orthogonal transformations. In
fact, the only necessary modification of (4.39) is the premultiplication of the vector Q_2^T n_q by an
orthogonal transformation T, say, such that

    T Q_2^T n_q = ||Q_2^T n_q|| e_1,        (4.40)

where e_1 is the first vector of the canonical basis of R^{m-|A|}. Note also that

    Q_1^T n_q = R^{-T} N^T n_q = z,        (4.41)

where z has already been computed in (4.24). Moreover, one has that

    ||n_q||^2 = ||( Q_1  Q_2 )^T n_q||^2 = ||z||^2 + ||Q_2^T n_q||^2 = ||z||^2 + ||T Q_2^T n_q||^2.        (4.42)

Hence the updated matrix R is given by

    R_updated = ( R  z )
                ( 0  ρ ),        (4.43)

where ρ = sqrt(||n_q||^2 - ||z||^2). The updating of the triangular factor R is therefore extremely cheap
to compute, mainly because z is available from previous calculations. It is also
interesting to note that, because of the equivalence between (4.17) and (4.18), the technique
presented here is in fact identical to the computation of the Cholesky factor of (N^+)^T N^+ using
the bordering method (see [51], for example), where N^+ = ( N  n_q ). A similar procedure is
also used in [55] and [97].

We finally note that s, the primal step direction, is zero if and only if the residual of
problem (4.19) is zero, which, in turn, is equivalent to ||n_q|| = ||z||. This last relation provides a
possible way of testing the equality s = 0 without explicitly computing s.
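The bordered update (4.43) and the s = 0 test then cost only a norm computation; a sketch (names hypothetical), with a guard against negative round-off under the square root:

```python
import numpy as np

def add_column(R, z, nq_sq):
    """Border R with the column z = R^{-T} N^T n_q from (4.41) and the
    scalar rho = sqrt(||n_q||^2 - ||z||^2), as in (4.43). rho close to
    zero signals a zero primal step s, i.e. n_q in the range of N."""
    rho = np.sqrt(max(nq_sq - float(z @ z), 0.0))
    k = R.shape[0]
    Rup = np.zeros((k + 1, k + 1))
    Rup[:k, :k] = R
    Rup[:k, k] = z
    Rup[k, k] = rho
    return Rup
```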

4.2.7 The algorithm


We are now in a position to describe our algorithm for solving our inverse shortest paths problem,
as described by (4.5), (4.4) and (4.6). For this description, we use a small (machine dependent)
tolerance ε > 0 to detect whether a real value is nonzero, and we define the integer κ = |A|.
Algorithm 4.1

step 0: Initialization.
Set w ← c, f ← 0, A ← ∅, κ ← 0 and u ← 0.

step 1: Compute the current shortest paths.
For j = 1, ..., n_E, compute the shortest paths from s_j to every vertex in P_j \ {s_j}.

step 2: Choose a violated island or exit.
Select I_q, an island whose excess E_q is negative, if any. If no such island exists, then w is
optimal and the algorithm stops.
Otherwise, if κ = 0, then set ρ ← sqrt(|I_q^+| + |I_q^-|) and go to Step 5.
Otherwise (that is, if κ > 0) set

    u ← (u, 0)^T.        (4.44)

step 3: Revise the triangular factor R.

3a: Add the previous constraint normal n_q to N.
If κ = 1 then set R = (ρ) and go to Step 4.
Otherwise (that is, if κ > 1) update the upper triangular matrix R using (4.43) and go
to Step 4.

3b: Drop n_ℓ from N.
Remove from R the column corresponding to the ℓ-th island, and use Givens rotations
to restore it to upper triangular form, as described in Section 4.2.6.

step 4: Compute the dual step direction.
Compute the vectors z and r, using Lemma 4.1, (4.23) and (4.24). Compute also ρ according
to

    ρ = sqrt(||n_q||^2 - ||z||^2).        (4.45)

step 5: Determine the maximum steplength to preserve dual feasibility.
Determine the set S according to (3.39), and t_f (and possibly ℓ) using (3.40).

step 6: Determine the steplength to satisfy the q-th constraint.
If ρ ≤ ε then go to Step 7b.
Otherwise, compute t_c according to (3.41), and s and s^T n_q as described in Lemma 4.3.

step 7: Take the step and revise the active set.

7a: Compute the steplength t as in (3.43), set w ← w + t s, revise f according to (3.45) and
u using

    u ← u + t (-r, 1)^T    if κ > 0,
    u ← u + t              if κ = 0.        (4.46)

If t = t_c, set A ← A ∪ {q}, κ ← κ + 1 and go to Step 1.
Otherwise (that is, if t = t_f) set A ← A \ {ℓ}, κ ← κ - 1 and go to Step 3b.

7b: If t_f = +∞, then the problem is infeasible, and the algorithm stops with a suitable
message.
Otherwise, update the Lagrange multipliers according to (3.42), set A ← A \ {ℓ},
κ ← κ - 1 and go to Step 3b.
Note that, in our current implementation of the algorithm's second step, we choose the current
violated island as the one whose excess is most negative. This technique appears to be quite efficient
in practice.

4.2.8 Nonoriented arcs


An important variant of the basic problem occurs when some arcs in the graph are undirected.
In this case, it is quite inefficient to replace each of these arcs by two distinct arcs of opposite
orientation, because this increases both the dimension of the problem and the number of constraints.
Indeed, one then has to impose that the two new oriented arcs have the same weight.

Fortunately, the algorithm described above can be applied to the case where arcs are nonoriented
without any modification, provided the shortest paths method used in Step 1 can handle
such arcs.

4.2.9 Note
Similar implementation techniques have been used by Calamai and Conn for solving location
problems with a related structure (see [18, 19, 20]). Their technique is however different from
ours, and a comparison of the two approaches will be examined in future work.

4.3 Preliminary numerical experience


4.3.1 The implementation
In order to verify the feasibility of the algorithm described above, a Fortran program was
written and tested on an Apollo DN3000 workstation, using the FTN compiler. The shortest
paths calculations were performed by Johnson's algorithm using a binary heap (see Chapter 2).
The crucial parts of our implementation are the determination of the violated islands and
the updating of the matrix R. On the one hand, violated islands are detected by comparing the
explicit definition of each shortest path constraint with the shortest path tree rooted at the origin
of the path defining the constraint, proceeding backward from the destination to the origin of
that path, since shortest path tree computations give the predecessor of each vertex in the tree.
On the other hand, the sparsity of R is taken into account by means of linked lists. The following
operations on R then needed to be specialized: adding and deleting a column, and performing
Givens plane rotations to restore the upper triangular form.
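The detection step just described can be sketched as follows (all names hypothetical): walking the predecessor array backward from the destination recovers the tree path, the excess of the imposed path, and the two shores of the violating island. The shore convention below (tree-only arcs positive, imposed-only arcs negative) is an assumption consistent with (4.6).

```python
def constraint_excess(pred, dist, w, path_arcs, tail, head):
    """Compare an imposed constraint path (a list of arc indices) with the
    shortest path tree rooted at its origin. pred[v] is the tree arc
    entering vertex v (None at the root), dist[v] the tree distance.
    Returns the excess (negative means the constraint is violated) and
    the two shores of the corresponding island."""
    dest = head[path_arcs[-1]]
    excess = dist[dest] - sum(w[a] for a in path_arcs)
    tree_arcs, v = [], dest
    while pred[v] is not None:            # backward walk to the origin
        tree_arcs.append(pred[v])
        v = tail[pred[v]]
    return (excess,
            set(tree_arcs) - set(path_arcs),    # positive shore
            set(path_arcs) - set(tree_arcs))    # negative shore
```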
Finally, storing a graph in a computer's memory naturally involves representing the
arcs (our variables) by their terminal vertices. Care must therefore be exercised in handling vertex
versus arc representations for the graph, the constraints and, in particular, nonoriented arcs.

4.3.2 The tests


We present here a set of seven typical examples extracted from a large collection of tests. The
first five arise from the traffic modelling problem presented in Section 1, with graphs for two
different cities. The next one is obtained on a randomly generated graph, while the last one is
built from the graph of a two-dimensional rectangular grid. The problems' characteristics are
reported in Table 4.1. We recall that n, m and n_E are the number of vertices in the graph, the
number of arcs and the number of shortest path constraints, respectively.

      n     m     n_E    Graph type   Constraint paths generation
P1    246   351   245    city 1       a tree in the graph
P2    246   351   600    city 1       all paths between a subset of the nodes
P3    246   351   6724   city 1       all paths from a node subset to another node subset
P4    822   1447  821    city 2       a tree in the graph
P5    822   1447  6806   city 2       all paths from a node subset to another node subset
P6    500   1469  100    random       randomly generated paths
P7    3600  7063  650    2D grid      all paths from one side of the grid to the other sides

Table 4.1: The inverse shortest path test examples

We summarize the results of the tests in Table 4.2, where the following symbols are used:

iter. : the number of major iterations of the algorithm, that is, the number of full steps in the
primal space (adding a constraint to the active set and requiring the calculation of the
shortest paths and the choice of a new violated constraint),

drops : the number of islands dropped at Step 7 of the algorithm, that is, the number of minor
iterations (partial and dual steps, involving only the computation of the step directions in
the primal and dual spaces),

|A| : the number of active islands at the solution.

We note that the first of these numbers is always one larger than the sum of the other two,
because one iteration is required for considering the empty active set.
The following figure illustrates results obtained by applying the inverse shortest path algorithm
to a set of problems presented in Chapter 5 (Table 5.1, page 91). The left-hand histogram
      iter.  drops  |A|
P1     35      2     32
P2     77     17     59
P3    167     34    132
P4    246     55    190
P5    468    238    229
P6    436     54    381
P7    171      8    162

Table 4.2: Results obtained on the test problems

shows the total number of iterations, partitioned into drops and major iterations. The right-hand
graphic shows the time spent in calculating shortest paths relative to the overall algorithm
run-time.

[Two bar charts over problem sizes m = 24, 84, 220, 612, 1300 and 3280: the left one partitions
the iteration count (0 to 140) into drops and major iterations; the right one gives the shortest
paths time as a percentage (0% to 100%) of the overall time.]

Figure 4.2: Iterations per problem size and shortest paths calculation
Despite the limited character of these experiments, one can nevertheless make the following
observations.

- The algorithm is relatively efficient in the sense that it does not, at least in our examples,
  add many constraints that are not active at the solution, only to drop them at a later stage.
  This feature can be seen in Figure 4.2.

- One also observes in practice that a fairly substantial part of the total computational effort
  is spent in calculating the shortest paths necessary to detect constraint violation
  (Figure 4.2).

- Choosing a set of constraint paths from a single tree induces significant savings in the
  determination of the most violated constraint, because only one shortest path tree is needed.

4.4 Complexity of the inverse shortest paths problem


During the refereeing period of [15], which presents the matter of this chapter, an alternative
formulation of the inverse shortest paths problem was communicated to the authors by S. Vavasis.
Representing the cost of the shortest paths from node v_i to node v_j by the new variables w_{i,j} for
i, j = 1, ..., n, we may then add the constraints

    [s(a_ℓ) = v_k and t(a_ℓ) = v_j]  ⟹  w_{i,j} ≤ w_{i,k} + w_ℓ        (4.47)

together with the equalities

    w_{i,i} = 0        (4.48)

for all i = 1, ..., n. The constraints on the shortest paths (4.6) may then be rewritten as

    w_{i,q} ≤ w_{j_1} + ... + w_{j_δ(j)}        (4.49)

for any path of the form (1) with s(a_{j_1}) = v_i and t(a_{j_δ(j)}) = v_q.

There are at most mn inequalities of type (4.47), n equalities of type (4.48) and n_P ≤ n^2
inequalities of type (4.49). Hence the total number of constraints in this formulation is
polynomial. As a consequence, the problem is solvable in polynomial time by an interior point
algorithm.

This interesting observation is clearly of theoretical importance, but the inclusion of n^2 additional
variables could generate inefficiencies in practical implementations.
5

Handling correlations between arc weights

In many applications, modelling networks accurately requires dependences between arc weights.
Consider, for instance, seismic waves propagating through the Earth's crust: these waves have similar
velocities as they propagate through media of similar densities. The motivation for this
research also comes from applications in traffic modelling. This chapter considers the inverse
shortest path problem where arc weights are subject to correlation constraints. A new method is
proposed for solving this class of problems. It is constructed as a generalization of the algorithm
presented in Chapter 4 for uncorrelated inverse shortest paths. In the uncorrelated case, the
variables were the arc weights and there was no correlation between them. Now, we partition the
arcs into cells or classes. The weights of the arcs located in the same cell are derived from the same
value, called the "cell density". The variables of our new problem become these cell densities. The
advantage of such a partition is that the number of variables decreases substantially. Moreover,
the refinement of each cell may increase without affecting the number of our new variables. On
the other hand, the correlations involve more restrictions, and hence more constraints. Note that
the shortest path constraints are not expressed in our new variables, but still involve arc weights.
As a consequence, the concept of island, introduced to formalize the violation of shortest path
constraints, has to be revised in this new context. This chapter establishes the results allowing
our new algorithm to handle such constraints in the space of the cell densities, including implicit
lower bound constraints on shortest path costs. Preliminary numerical experience with the
new method is presented and discussed. In particular, we propose a computational comparison
between the uncorrelated method (that of Chapter 4) and the correlated one. We also provide
results obtained by using two possible strategies for handling constraints: the first considers the
first violated constraint as a candidate to enter the active set of constraints, while the second
privileges the most violated constraint.

The matter of this chapter is to be published in [16].

5.1 Motivation
The technique proposed in Chapter 4 for solving an inverse shortest path problem is based on the
solution of a particular instance of the problem, namely that of recovering the
arc weights in a weighted oriented graph, given a (usually incomplete) set of shortest paths in
this graph. In this approach, the arc weights are assumed to be independent of each other.
This last assumption, although reasonable in some applications, is not fulfilled in all cases
of interest. Even in the areas mentioned above (transportation and tomography), interesting
questions can be asked where the independence assumption is clearly violated. It is the purpose
of the present chapter to propose an algorithmic approach that overcomes this limitation.

We first illustrate the need for such an extension with an example drawn from transportation
research. This example is presented in detail and subsequently used to motivate the specific
concepts to be introduced. An additional case of interest, in computerized tomography, is also
mentioned.

5.1.1 Transportation research


Our first example deals with the question of reconstructing the costs associated with routes in
an unsaturated transportation network. As mentioned above, a first approach, using an instance
of the inverse shortest path problem, is proposed and tested in Chapter 6. The idea was
to reconstruct the delays associated with links of the network (as perceived by the users) from the
observation of the paths actually taken (assuming that users choose the perceived shortest route
between their origin and their destination). This method is akin to the idea of using "mental
maps" [36] in the process of route planning. Technically speaking, the network under study is
represented by an oriented graph in which a set of shortest paths is known; the question is then
to infer the values of the time delays associated with each arc of the graph, differing as little
as possible from a set of a priori known weights (derived, for instance, from knowledge of the
geometrical characteristics of the road).

When applying this methodology to urban situations, it is important to explicitly consider the
delays at signalized junctions, and not to restrict the analysis to the estimation of the delays on
the links only. This can be achieved by using a graph that contains detailed arcs to represent the
various "turns" in a junction. A small example of such a graph is given in Figure 5.1.
Unfortunately, a naive application of this method results in a set of estimated weights for the
"inner arcs" of a given junction that are mutually uncorrelated. This can be considered unrealistic,
because the delays at a simple signalized junction all depend (in some fixed way) on the relevant
traffic light cycle. We may then be interested in reconstructing the light cycles themselves, as
they are perceived by the users, proceeding again from the observation of the routes actually
chosen in the network. The estimated values for the cycles can then be fed into models that
explicitly use traffic light phasing.

We now follow this approach and build the following simple model. The network is represented
by a detailed graph, such as that of Figure 5.1, in which a delay, or weight, is associated with each arc
according to the rule

    w_i = d_i             if the i-th arc does not belong to any junction,
    w_i = α_i d_ℓ(i)      if the i-th arc belongs to the ℓ(i)-th junction,        (5.1)
where the d_i are the delays associated with links or junctions, and where the α_i specify how the

[A small road network graph with vertices numbered 1 to 48, in which each signalized junction
is represented by detailed arcs for the individual turning movements.]

Figure 5.1: The first example involving correlations between arc weights

delay for a turn depends on the global delay (the red light period, for instance) of the relevant
junction. We may trivially extend this definition to

    w_i = α_i d_ℓ(i),        (5.2)

where we have defined α_i := 1 and ℓ(i) := i for all arcs not belonging to any junction.

We then face the problem of estimating the delays d_ℓ(i) subject to the constraint that a set
of a priori known paths in the graph must be the shortest ones between their origins and their
destinations. As in Chapter 4, this problem is usually underdetermined, and a particular solution
can be chosen that minimizes the difference between the computed delays and some a priori
known values. We then have an inverse shortest path problem as defined in Chapter 4 whose
variables are the delays (as opposed to the weights).

5.1.2 Seismic tomography


Our second motivating example is drawn from seismic tomography. In this research area, a
(possibly large) geologic zone is discretized into neighbouring cells of constant material density.
The arrival times of compression shock waves generated by earthquakes or artificial explosions
are then observed by seismographs placed at known locations. The problem is then to reconstruct
the cell densities from an analysis of the paths (rays) used by these waves. One possible way
to proceed is to construct, in each cell, a small graph whose arcs represent the propagation of
an incoming compression wave in different directions. For example, if we assume that a zone is

divided into 2  3 cells, we may then choose a simple cell model consisting of 6 arcs (a square with

j j j j
both diagonals), and then construct the resulting (undirected) network illustrated in Figure 5.2.
1
1 2 3 4
@ 6 , @ , @ ,

j j j j
5@ , @ , @ ,
4 ,
@ 2 ,
@ ,
@
, @ , @ , @
, 3 @ , @ , @
5 6 7 8
@ , @ , @ ,

j j j j
@ , @ , @ ,
,
@ ,
@ ,
@
, @ , @ , @
, @ , @ , @
9 10 11 12

Figure 5.2: The graph generated from a discretization

We consider the following simple model to describe the travel time w_i of a compression wave
along the i-th arc within the ℓ(i)-th cell:

    w_i = α_i d_ℓ(i),        (5.3)

where α_i is now proportional to the length of the i-th arc (confusion about the cell to which
inner horizontal and vertical arcs belong can be avoided by suitably defining ℓ(i)). In our example,
the travel times associated with the arcs of the first cell in Figure 5.2 (whose sides are assumed
to be of unit length) are given by

    w_i = d_1             for i = 1, ..., 4,
    w_i = sqrt(2) d_1     for i = 5, 6.        (5.4)

As above, we now consider the question of estimating the cell densities d_ℓ(i) from the knowledge
of the wave paths and arrival times. Because of the Fresnel law stating that waves follow shortest
paths in their propagation medium, this is again a variant of the inverse shortest path problem,
where the variables are no longer the weights associated with the arcs, but some more aggregated
quantities (the cell densities) which determine these weights via linear relations.

5.2 The formal problem


Both examples described in the previous section are particular cases of the following formal
problem speci cation.

5.2.1 Classes and densities


For the terminology related to graphs, we refer the reader to Chapter 2. Yet we recall here
the basic notation and introduce that related to cell densities. We consider a directed weighted graph
(V, A, w), where (V, A) is an oriented graph with n vertices and m arcs, and where w is a set of
nonnegative weights {w_i}_{i=1}^m associated with the arcs. Let V be the set of vertices of the graph
and A = {a_k = (s(k), t(k))}_{k=1}^m be the set of arcs, where s(k) denotes the vertex at the origin of the
arc a_k (its "source-vertex") and t(k) the vertex at its end (its "target-vertex"). Also assume that
the set of arcs A is partitioned into L disjoint classes and that a nonnegative density is associated
with each of these classes. Assume finally that the weight of every arc can be computed as an
arc-dependent proportion of the density of the class to which the arc belongs, that is,

    w_i = α_i d_ℓ(i)    for i = 1, ..., m,        (5.5)

where ℓ(i) denotes the index of the (unique) class containing the i-th arc. We say that the i-th
arc is associated with the ℓ(i)-th class.

In our first example, the arcs are the detailed links of the network, including the detailed links
within a junction. They are partitioned into classes corresponding to roads and junctions: the
densities of these classes then correspond to the delays along roads and the traffic light cycles at
the junctions. In our second example, the classes correspond to cells of the discretized geological
medium, the densities to their actual physical densities, and the arcs to the possible ways in which
a wave can travel across a cell.
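Relation (5.5) is the only link between the new variables (the densities) and the weighted graph; a toy sketch (names hypothetical), using the unit cell of the seismic example:

```python
import math

def arc_weights(alpha, cls, d):
    """Weights induced by the class densities via (5.5):
    w_i = alpha_i * d_{l(i)}, where cls[i] = l(i)."""
    return [alpha[i] * d[cls[i]] for i in range(len(alpha))]

# One seismic cell as in (5.4): four unit sides and two diagonals of
# length sqrt(2), all associated with class 0 of density d_1 = 2.5.
alpha = [1.0, 1.0, 1.0, 1.0, math.sqrt(2.0), math.sqrt(2.0)]
w = arc_weights(alpha, [0] * 6, [2.5])
```

Refining a cell adds arcs (more entries in alpha and cls) without adding variables: the density vector d keeps its length L.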
Our problem is then to determine values of the class densities that are compatible with a set
of known properties of the weighted graph.

5.2.2 Shortest paths constraints


The main feature of our problem is that we wish to specify that some paths are shortest (between
an origin and a destination). We note that this concept is only meaningful in the weighted graph,
and has no direct translation in terms of classes and densities, which are our variables.
We first allow one to impose that a known path between two vertices is shortest between these
vertices. More formally, we define a simple path p_j to be an ordered set of arcs of the form
\[
p_j = (a_{j_1}, a_{j_2}, \ldots, a_{j_{\delta(j)}}) \quad (j = 1, \ldots, n_E), \tag{5.6}
\]
where \delta(j) is the number of arcs in this path (its length), and where
\[
t(j_i) = s(j_{i+1}) \quad \text{for } i = 1, \ldots, \delta(j) - 1. \tag{5.7}
\]
As detailed in Chapter 3, the constraint that a given path p_j is shortest can then be expressed
as a (possibly very) large set of linear inequalities of the form
\[
\sum_{k \mid a_k \in p'_j} w_k \;\geq\; \sum_{k \mid a_k \in p_j} w_k, \tag{5.8}
\]
where p'_j is any path with the same origin and destination as p_j.
For future reference we also note that a path p_j can be defined equivalently by the ordered
set P_j of successive vertices that are on p_j, i.e.
\[
P_j \stackrel{\rm def}{=} (s(j_1), t(j_1), t(j_2), \ldots, t(j_{\delta(j)})). \tag{5.9}
\]
In our first example, we may assume that the network users follow the path that they perceive
to be shortest between their origin and destination. An observation of the paths actually chosen
by these users then gives constraints of the type just described. For instance, we may know that
users travelling from vertex 1 to vertex 38 use the path defined by
\[
P_1 = (1, 5, 10, 15, 24, 29, 36, 38) \tag{5.10}
\]
while those travelling from vertex 1 to 48 use that given by
\[
P_2 = (1, 5, 10, 15, 22, 43, 46, 48). \tag{5.11}
\]
We also wish to consider constraints that impose a lower bound on the cost of the shortest
path between two vertices. These constraints were introduced in Chapter 1 and have not been
considered in the basic inverse shortest path method proposed in Chapter 4. We saw in Chapter 3,
Section 3.2, that such a constraint can be expressed by a set of constraints imposing that the
weight of every path between the two vertices, n_o and n_d say, is bounded below by a constant, that is
\[
\sum_{k \mid a_k \in g} w_k \;\geq\; \gamma_{(n_o, n_d)}, \tag{5.12}
\]
where g is any path with origin n_o and destination n_d.
In the context of our example, we may know that the time required to reach vertex 42 from
vertex 13 is clearly not smaller than 50 measure units. In this case, we wish to impose that the
weight of the shortest path between these vertices is bounded below by 50.
The number of linear constraints of the form (5.8) and (5.12) depends on the number
of possible paths between two vertices in the graph, which grows exponentially with the density
of the graph m/n. Enumerating these constraints is of course out of the question, and we will have
to use a "separation procedure" to determine which of these constraints are violated for a given
value of the class densities. This separation procedure is based on the computation of the shortest
paths within the graph, given the weights on its arcs, which are themselves determined by the
class densities and (5.5).
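The separation idea can be sketched as follows: compute shortest distances for the current weights (here with a standard Dijkstra routine, a sketch that assumes nonnegative weights and hypothetical graph data) and flag a prescribed path whose weight exceeds the shortest distance between its endpoints.

```python
import heapq

def dijkstra(n, adj, src):
    """Shortest distances from src in a digraph; adj[u] = [(v, w_uv), ...]."""
    dist = [float('inf')] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        du, u = heapq.heappop(heap)
        if du > dist[u]:
            continue  # stale heap entry
        for v, wuv in adj[u]:
            if du + wuv < dist[v]:
                dist[v] = du + wuv
                heapq.heappush(heap, (dist[v], v))
    return dist

# A constraint of type (5.8) is violated when the prescribed path is
# strictly longer than the shortest one for the current weights.
adj = {0: [(1, 1.0), (2, 5.0)], 1: [(2, 1.0)], 2: []}
dist = dijkstra(3, adj, 0)
prescribed_weight = 5.0             # weight of the direct path (0, 2)
print(prescribed_weight > dist[2])  # True: the constraint is violated
```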
We could also consider imposing upper bounds (and therefore equalities) on the weight of
some shortest paths. We showed in Chapter 3, Section 3.2, that this type of constraint can no
longer be expressed as a set of linear inequalities, as in (5.8) and (5.12). The problem is therefore
of a different nature. This special case will be considered in Chapter 6.
5.2.3 Constraints on the class densities
Besides the constraints on shortest paths, we also include general linear constraints on the class
densities, which are the true variables of our problem. The first constraint of this type is clearly
that the class densities must be nonnegative (in order to ensure the nonnegativity of the arc
weights). But we may want to specify further linear constraints of the form
\[
\sum_{l=1}^{L} \eta_{il} \, d_l \;\geq\; \beta_i \quad (i \in I), \tag{5.13}
\]
and/or
\[
\sum_{l=1}^{L} \eta_{il} \, d_l \;=\; \beta_i \quad (i \in E), \tag{5.14}
\]
where the \eta_{il} are general coefficients, the \beta_i are specified constants and the sets I and E index
the inequality and equality constraints respectively.
For instance, the network users of the first example may be aware that no traffic light cycle
exceeds 5 minutes, therefore imposing an explicit upper bound on all class densities representing
such cycles. Other a priori knowledge of the network might also indicate that a given cycle is
longer than another one: this again produces a linear constraint of the type (5.13) on the relevant
class densities.
Observe that linear constraints on the arc weights can be expressed in the form of (5.13) or
(5.14) provided they involve fixed sets of arcs. The translation from arc weights to class densities
is then given by (5.5).
5.2.4 The inverse problem
Recall that our problem is to reconstruct the class densities subject to the constraints
described above. As is the case in the basic inverse shortest path problem, the
constraints do not determine the class densities uniquely: the reconstruction problem is underde-
termined. Fortunately, it often happens in applications that some additional a priori knowledge
of expected class densities is available. Using this information then provides stability and unique-
ness of the inversion (see [105]). This a priori information may be obtained either from "direct"
models, for which there is no problem of uniqueness, or from a posteriori information of a
previous inverse problem run with different data.
The application of this idea to our framework results in the question of determining the class
densities that are as close as possible to their expected values. Denoting these a priori expected
values by \{\bar d_l\}_{l=1}^L, we therefore consider the minimization problem
\[
\min_{d \in \mathbb{R}^L} \| d - \bar d \| \tag{5.15}
\]
subject to the constraints
\[
d_l \geq 0 \quad (l = 1, \ldots, L) \tag{5.16}
\]
and a selection of constraints as described above in (5.8), (5.12), (5.13) and (5.14). Of course,
the constraints of type (5.8) and (5.12) should be interpreted as constraints on shortest paths in
the weighted graph whose arc weights are determined by the value of d and (5.5). As decided in
Chapter 1, we choose to use the \ell_2-norm to measure the proximity to the a priori information, so
that the objective function (5.15) can now be rewritten as
\[
\min_{d \in \mathbb{R}^L} f = \frac{1}{2} \sum_{l=1}^{L} (d_l - \bar d_l)^2. \tag{5.17}
\]
This particular choice implies that, although arc weights are correlated, class densities are assumed to be independent.
We note that statistical correlation between weights could be handled by considering the
objective
\[
\min_{w} \; \frac{1}{2} (w - \bar w)^T C \, (w - \bar w), \tag{5.18}
\]
where C^{-1} is a covariance matrix on w and where the superscript T denotes the transpose (see
[105], for instance). There are two main reasons why we will not follow this approach.
1. The formulation (5.18) clearly allows for statistical correlation between the densities, but
does not guarantee that the equalities (5.5) hold.
2. Introducing a nondiagonal C as the Hessian of the objective function substantially com-
plicates the algorithm, as will become clear in our later developments.
Methods based on (5.18) therefore constitute an alternative to those presented in this chapter
and deserve a separate study.
5.3 The uncorrelated inverse shortest path problem
As our approach will make extensive use of the technique developed in Chapter 4, we first recall
the technique proposed therein to solve the (uncorrelated) inverse shortest path problem. In
particular, let us illustrate again the concept of island used to characterize the violation of shortest
path constraints. This is important to introduce the generalization of the concept of island.
Remember that we may rewrite all the constraints of type (5.8) as
\[
E_i(w) \stackrel{\rm def}{=} n_i^T w - b_i \;\geq\; 0 \quad (i = 1, \ldots, h), \tag{5.19}
\]
where w and \{n_i\}_{i=1}^h belong to \mathbb{R}^m and b is in \mathbb{R}^h. Note that constraints of type (5.8) have
no constant term, so that all b_i in equation (5.19) would be zero. We then define the m \times |A|
matrix N whose columns are the normals n_i of the constraints of the active set A, where |A| is
the number of constraints in A. Since the Hessian G of our objective function equals the identity,
the (Moore-Penrose) generalized inverse of N in the space of variables under the transformation
y = G^{1/2} w simply becomes
\[
N^* \stackrel{\rm def}{=} (N^T N)^{-1} N^T, \tag{5.20}
\]
and
\[
H \stackrel{\rm def}{=} (I - N N^*) \tag{5.21}
\]
is then the reduced inverse Hessian of the quadratic objective function in the subspace of weights
satisfying the active constraints.
When a constraint (5.8) is violated or active, that is when a path p_j has its weight greater than
or equal to that of another path, g say, with the same origin and destination, p_j and g determine
together at least one island whose positive shore consists of the part of g that is not common with
p_j and whose negative shore consists of the part of p_j that is not common to g (see Figure 5.3).
Of course a violated constraint may generate more than one such island (the paths p_j and g may
indeed first depart from each other, then join and depart again later), but each island necessarily
[Figure: two paths p_j and g with a common origin and destination; the arcs where they differ form islands, with positive shores I^+ on g and negative shores I^- on p_j.]

Figure 5.3: An island
must correspond to a violated constraint that is implicit in the statement (5.8) (a subpath of a
shortest path is also shortest). We make the choice to consider each such constraint explicitly
and therefore to associate one and only one island with each violated constraint. The algorithm
only considers a subset of all possible islands and assigns an index, q say, to each one of them.
For each such island, the sets I_q^+ and I_q^- are defined to be the sets containing the arcs of its
positive and negative shores respectively, while I_q \stackrel{\rm def}{=} I_q^+ \cup I_q^-. The excess of the island, denoted
E_q(w) (or, more briefly, E_q) is then given by
\[
E_q(w) \stackrel{\rm def}{=} \sum_{a_i \in I_q^+} w_i \;-\; \sum_{a_i \in I_q^-} w_i. \tag{5.22}
\]
We again illustrate these concepts within our first example. If we assume that the weight of
the path p_1 is greater than that of g = (1, 5, 12, 27, 36, 38), the constraint that p_1 is shortest is
violated, and these paths determine an island, I_1 say, whose positive shore I_1^+ contains the arcs
joining vertices 5, 12, 27 and 36, while its negative shore I_1^- contains the arcs joining vertices 5,
10, 15, 24, 29 and 36.
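This island construction can be reproduced on the vertex sequences P_1 and g above; a small sketch:

```python
# Shores of the island determined by two paths with the same endpoints:
# the positive shore holds the arcs of g absent from p_1, the negative
# shore the arcs of p_1 absent from g (see the example above).
def shores(p, g):
    arcs_p, arcs_g = set(zip(p, p[1:])), set(zip(g, g[1:]))
    return arcs_g - arcs_p, arcs_p - arcs_g   # (positive, negative)

P1 = (1, 5, 10, 15, 24, 29, 36, 38)
G  = (1, 5, 12, 27, 36, 38)
pos, neg = shores(P1, G)
print(sorted(pos))  # [(5, 12), (12, 27), (27, 36)]
print(sorted(neg))  # [(5, 10), (10, 15), (15, 24), (24, 29), (29, 36)]

# The excess (5.22) is then the weight of the positive shore minus that
# of the negative shore, for any weight function on the arcs.
```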
We note that both shores of an island start at the same vertex and end at the same vertex.
Remember that the inverse shortest path algorithm produces a set of dual feasible points and
keeps a set of active constraints A, where each constraint is verified as an equality. The algorithm
then proceeds by successively selecting the island whose excess is most negative and by adding it
to its "active set". This is achieved by increasing the weights of the arcs on the positive shore and
reducing the weights on the negative shore until both shores are of equal weight. The process is
continued until no violated constraint is left. In Chapter 4, the algorithm also explicitly handles
the fact that the weights must remain nonnegative. This creates additional constraints that can
also become active in the course of the calculation. When such a bound constraint is violated, we
consider that it has a positive shore (containing the arc whose weight is negative) and an empty
negative shore. That is why we partitioned the active set A into two subsets
\[
A = (V, Y), \tag{5.23}
\]
where V is the set of currently active islands with a nonempty negative shore and Y the set of
active islands with an empty negative shore (the set of active bounds):
\[
Y \stackrel{\rm def}{=} \{ i \in \{1, \ldots, m\} \mid w_i = 0 \}. \tag{5.24}
\]
Let us recall the definition of several sets that will be generalized in the next section, in order to
work in the space of class densities:
\[
X \stackrel{\rm def}{=} \{ i \in \{1, \ldots, m\} \setminus Y \mid \exists j \in V : a_i \in I_j \} \tag{5.25}
\]
and
\[
Z \stackrel{\rm def}{=} \{1, \ldots, m\} \setminus (X \cup Y). \tag{5.26}
\]
X contains the indices of the arcs that appear in one of the active islands of V but are not fixed
at their lower bounds, while Z contains the indices of the arcs that are not involved at all in the
active constraints of A. For i \in X, we had defined
\[
I^+(i) \stackrel{\rm def}{=} \{ j \in V \mid a_i \in I_j^+ \} \quad \text{and} \quad I^-(i) \stackrel{\rm def}{=} \{ j \in V \mid a_i \in I_j^- \}, \tag{5.27}
\]
which are the sets of active islands of V whose positive (resp. negative) shore contains the arc a_i.
5.4 An algorithm for recovering class densities
Our purpose is now to develop a specialized variant of the Goldfarb-Idnani quadratic program-
ming algorithm for recovering the class densities, as opposed to the arc weights. This variant
will be similar in spirit to that of the previous section. However, it will clearly operate in a lower
dimensional space, because the number of classes is typically much smaller than the number of
arcs. From now on, we will therefore place ourselves in \mathbb{R}^L. Then, E_i depends on the class
densities and can be written as follows:
\[
E_i(d) = n_i^T d - b_i \quad (i = 1, \ldots, h), \tag{5.28}
\]
where d and n_i both now belong to \mathbb{R}^L. N is then a matrix of dimension L \times |A| and the Hessian
G of the objective function (5.17) is the identity matrix of order L. N^* and H are still defined
as above.
If we wish to use the same approach as that recalled in the previous section for the inverse shortest
path problem, we will need to re-examine successively
1. the concept of island (it will now feature both classes and arcs);
2. the computation of the dual step direction r as a function of N^T n_q, where n_q is the normal
to the constraint corresponding to the newly incorporated island;
3. the update of the density values;
4. the computation of the primal step direction s (when it is nonzero) and that of the inner
product s^T n_q when n_q is linearly independent from the columns of N;
5. the determination of the maximum steplength to preserve dual feasibility.
This is the purpose of the next subsections.
5.4.1 Islands, dependent sets and their shores
If the concept of "islands" is natural when considering arcs and paths (as in the inverse shortest
path problem), the naming of the same concept extended to our more general setting is much
less obvious. The problem is that we again have constraints (5.8) and (5.12) that balance the
weight of two paths sharing their origins and destinations (thus defining an island as above), but
we must also consider general linear relations between the class densities (5.13) and (5.14), where
such a "geographical" interpretation seems irrelevant. However, we can preserve the interpreta-
tion of a violated constraint containing two sets of variables: one that should increase and the
other that should decrease for the constraint to be satisfied. These sets will then respectively
correspond, in the context of general linear constraints, to the positive and negative shores of
the islands in the context of shortest paths. The fact that this "balancing" interpretation of
the constraints then holds both for shortest paths and linear constraints results in substantial
notational simplifications.
More precisely, we define the concept of dependent sets as follows. The i-th dependent set D_i
is given by
\[
D_i \stackrel{\rm def}{=} \{ c_l \mid \eta_{il} \neq 0 \}, \tag{5.29}
\]
where the c_l are classes and where the \eta_{il} are the coefficients of the corresponding densities in (5.13)
and (5.14). Strengthening the analogy between islands and dependent sets, we also define the
positive and negative shores of these sets by
\[
D_i^+ \stackrel{\rm def}{=} \{ c_l \mid \eta_{il} > 0 \} \tag{5.30}
\]
and
\[
D_i^- \stackrel{\rm def}{=} \{ c_l \mid \eta_{il} < 0 \}. \tag{5.31}
\]
As for the islands, we let D_i = D_i^+ \cup D_i^-.
5.4.2 The dual step direction
As in Chapter 4, the formulation of the dual step direction r = N^* n_q can be rewritten, using
(5.20), as
\[
r = (N^T N)^{-1} N^T n_q. \tag{5.32}
\]
Let us recall that this calculation can be performed by maintaining a triangular factorization of
the matrix N^T N; since the matrix N^* is the unweighted generalized inverse of N, it will only be
necessary to maintain a triangular factorization of the form
\[
N^T N = R^T R, \tag{5.33}
\]
where R is an upper triangular matrix of dimension |A|. Since N is of full rank, this is equivalent
to maintaining a QR factorization of N of the form
\[
N = \begin{pmatrix} Q_1 & Q_2 \end{pmatrix} \begin{pmatrix} R \\ 0 \end{pmatrix} \stackrel{\rm def}{=} Q U, \tag{5.34}
\]
as is the case in the numerical solution of unconstrained linear least squares problems. Indeed,
it is straightforward to verify that (5.32) may be reformulated as
\[
\min_r \; \| N r - n_q \|_2. \tag{5.35}
\]
The second useful simplification due to the special structure of the problem arises in the com-
putation of the product N^T n_q in (5.32). The resulting vector indeed contains in position i the
inner product of the i-th active constraint normal with the normal to the q-th constraint. As
both these constraints may now be interpreted as islands or dependent sets, we may exploit this
similarity in expressing the value of N^T n_q.
In order to state this expression in a reasonably compact form, we define some additional
notations:
• \Omega(l) is the set of the arcs located in the class c_l, i.e.
\[
\Omega(l) \stackrel{\rm def}{=} \{ a_k \mid \ell(k) = l \}. \tag{5.36}
\]
• \Gamma_l(I_i^+) and \Gamma_l(I_i^-) are the "\alpha-weighted" cardinalities of the positive and negative shores of
I_i restricted to the arcs of \Omega(l), that is
\[
\Gamma_l(I_i^\pm) \stackrel{\rm def}{=} \sum_{a_k \in \Omega(l)} \alpha_k \, \delta[a_k \in I_i^\pm], \tag{5.37}
\]
where the \alpha_k are the proportions defined by the equation (5.5) and \delta[\cdot] equals one when its
argument holds and zero otherwise. Similarly, we also define \Gamma_l(D_i^+) and \Gamma_l(D_i^-) by
\[
\Gamma_l(D_i^\pm) \stackrel{\rm def}{=} |\eta_{il}| \, \delta[c_l \in D_i^\pm]. \tag{5.38}
\]
• Finally, we use the symbol J_i to represent either I_i, if the i-th constraint is a proper island,
or D_i if the i-th constraint is of the type (5.13) or (5.14). By convention, we set D_i = \emptyset
when J_i = I_i and I_i = \emptyset when J_i = D_i. \Gamma_l(J_i^+) and \Gamma_l(J_i^-) are then given, according to
(5.37) and (5.38), by
\[
\Gamma_l(J_i^\pm) \stackrel{\rm def}{=} \begin{cases} \Gamma_l(I_i^\pm) & \text{if } J_i \text{ is an island,} \\ \Gamma_l(D_i^\pm) & \text{if } J_i \text{ is a dependent set.} \end{cases} \tag{5.39}
\]
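A toy evaluation of (5.37), with hypothetical proportions and a hypothetical shore, may help fix the notation:

```python
# Gamma_l(S) per (5.37): the alpha-weighted count of the arcs of shore S
# that belong to class l.  All arc data below is hypothetical.
def gamma(l, shore, alpha, ell):
    return sum(alpha[k] for k in shore if ell[k] == l)

alpha = {0: 1.0, 1: 2.0, 2: 0.5}   # proportions alpha_k
ell   = {0: 0, 1: 0, 2: 1}         # class ell(k) of each arc
I_pos = {0, 1, 2}                  # arcs on a positive shore

print(gamma(0, I_pos, alpha, ell))  # 3.0  (arcs 0 and 1 are in class 0)
print(gamma(1, I_pos, alpha, ell))  # 0.5  (only arc 2 is in class 1)
```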
We can now express the inner product of the normal to the q-th active constraint with the
normals of all other active ones.
Lemma 5.1 The vector N^T n_q appearing in (5.32) is given componentwise by
\[
\left[ N^T n_q \right]_i = \sum_{l \in B(i)} \left( \Gamma_l(J_g^+) - \Gamma_l(J_g^-) \right) \left( \Gamma_l(J_q^+) - \Gamma_l(J_q^-) \right) \tag{5.40}
\]
for i = 1, \ldots, |A|, where g = g(i) is equal to the index of the i-th active constraint, and where
B(i) is the set of indices of the classes that appear in constraints g and q, namely
\[
B(i) = \left\{ l \;\middle|\; \exists a_k \in \Omega(l) : a_k \in (I_g \cup I_q) \;\text{ or }\; c_l \in (D_g \cup D_q) \right\}. \tag{5.41}
\]
Proof. We first prove the result in the case where the current and active constraints are
islands, that is J = I in (5.39).
We then consider the shortest paths constraints (5.8) and (5.12). Since the arc weights are
bound to the class densities by (5.5), a constraint of type (5.8) can be written as
\[
\sum_{i \mid a_i \in G} \alpha_i \, d_{\ell(i)} \;\geq\; \sum_{i \mid a_i \in P} \alpha_i \, d_{\ell(i)}, \tag{5.42}
\]
where P is one of the paths p_j and G is a path with the same origin and destination as P, whose
weight is less than or equal to that of P; a lower bound constraint on a shortest path (5.12) is then
given by
\[
\sum_{i \mid a_i \in G} \alpha_i \, d_{\ell(i)} \;\geq\; \gamma_{(n_o, n_d)}, \tag{5.43}
\]
where G is a path starting at node n_o, ending at n_d and of weight less than or equal to \gamma_{(n_o, n_d)}.
These formulations now express an island constraint in terms of class densities. By (5.36),
our constraint (5.42) becomes
\[
\sum_{l=1}^{L} \left( \sum_{a_i \in \Omega(l)} \alpha_i \, \delta[a_i \in G] \right) d_l \;\geq\; \sum_{l=1}^{L} \left( \sum_{a_i \in \Omega(l)} \alpha_i \, \delta[a_i \in P] \right) d_l. \tag{5.44}
\]
The normal to the constraint (5.44) is thus a vector of \mathbb{R}^L whose l-th component is
\[
\sum_{a_i \in \Omega(l)} \left( \delta[a_i \in G] - \delta[a_i \in P] \right) \alpha_i. \tag{5.45}
\]
When a shortest paths constraint is active or violated, we may introduce I^+ and I^- in the
formulation of (5.45), in place of G and P, since the arcs in G \setminus I^+ and in P \setminus I^- do not contribute
to the violation excess value.
Then, by (5.37), we obtain the following normal components for the k-th constraint:
\[
[n_k]_l = \Gamma_l(I_k^+) - \Gamma_l(I_k^-), \quad \text{for } l = 1, \ldots, L. \tag{5.46}
\]
In the case where n_k is related to the constraint (5.43), its l-th component reduces to
\[
[n_k]_l = \Gamma_l(I_k^+). \tag{5.47}
\]
The formulation (5.46) is therefore valid for both constraint types because the negative shore I_k^-
is empty for a lower bound constraint.
Since
\[
\left[ N^T n_q \right]_i = n_i^T n_q \tag{5.48}
\]
and g is equal to the index of the i-th active island, (5.40) holds for the island constraints when
observing that the class indices l involved are those for which there exists at least one arc
belonging both to the class and to a shore of the island I_q or I_g; that is, we can restrict l to the
set B(i).
The proof is totally similar when considering the dependent set constraints (5.13) and (5.14),
using (5.29), (5.30), (5.31), and (5.38).
For future reference, we note that, in the general case, [n_k]_l can be written as
\[
[n_k]_l = \Gamma_l(J_k^+) - \Gamma_l(J_k^-), \quad \text{for } l = 1, \ldots, L, \tag{5.49}
\]
using (5.39). □
As a consequence of Lemma 5.1, the practical computation of r in (5.32) may be organized as
follows:
1. compute the vector y \in \mathbb{R}^{|A|} whose i-th component is given by (5.40),
2. perform a forward triangular substitution to solve the equation R^T z = y for the vector
z \in \mathbb{R}^{|A|},
3. perform a backward triangular substitution to solve the equation R r = z for the desired
vector r.
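These three steps amount to solving (N^T N) r = y through the factor R; a numpy sketch on hypothetical data (the generic solves stand in for dedicated forward and backward substitutions):

```python
import numpy as np

# Dual step direction r = (N^T N)^{-1} N^T n_q via the factor N^T N = R^T R.
def dual_direction(R, y):
    z = np.linalg.solve(R.T, y)  # forward substitution:  R^T z = y
    r = np.linalg.solve(R, z)    # backward substitution: R r = z
    return z, r

R = np.array([[2.0, 1.0],
              [0.0, 1.0]])       # upper triangular factor (hypothetical)
y = np.array([4.0, 3.0])         # components given by (5.40)
z, r = dual_direction(R, y)
print(np.allclose(R.T @ R @ r, y))  # True: r solves (N^T N) r = y
```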
5.4.3 Determination of the class densities
Before stating this result more precisely, the definitions of the sets V and Y, X, Z, in
(5.24), (5.25) and (5.26) respectively, have to be generalized for the correlated problem. The
set \{1, \ldots, L\} is now partitioned as follows: the active set A still has the same formulation (V, Y),
but
\[
V \stackrel{\rm def}{=} V_D \cup V_I, \tag{5.50}
\]
where
\[
V_D \stackrel{\rm def}{=} \{ \text{currently active dependent sets} \}, \tag{5.51}
\]
\[
V_I \stackrel{\rm def}{=} \{ \text{currently active islands} \}. \tag{5.52}
\]
Observe that V_D contains the set of equality constraints. These last constraints are obviously
active, provided they have been incorporated first in the active set, before handling other constraint
violations.
We now define special sets involving active bound constraints on individual class densities:
\[
Y_0 \stackrel{\rm def}{=} \{ l \in \{1, \ldots, L\} \mid d_l = 0 \}, \tag{5.53}
\]
\[
Y \stackrel{\rm def}{=} \{ l \in \{1, \ldots, L\} \mid d_l \text{ is at a bound } \omega_l \}, \tag{5.54}
\]
with \omega_l being either the lower bound or the upper bound value at which the l-th class density is
currently fixed. The value of \omega_l equals that of \beta_i / \eta_{il}, where i \in V_D is the index of the related
bound constraint. Note that Y_0 \subseteq Y.
To characterize the classes involved in active islands or dependent sets, we define the set of
class indices appearing in active dependent sets,
\[
X_D \stackrel{\rm def}{=} \{ l \in \{1, \ldots, L\} \setminus Y \mid \exists j \in V_D : c_l \in D_j \}, \tag{5.55}
\]
the set of class indices with which arcs in active islands are associated,
\[
X_I \stackrel{\rm def}{=} \{ l \in \{1, \ldots, L\} \setminus Y \mid \exists j \in V_I : \exists a_k \in \Omega(l) \cap I_j \}, \tag{5.56}
\]
and
\[
X \stackrel{\rm def}{=} X_D \cup X_I. \tag{5.57}
\]
Note that X_D and X_I are not necessarily disjoint.
The set X thus contains the indices of the classes that are involved in one of the active islands
or dependent sets of V but are not fixed at a bound.
The remaining class indices are the elements of the set Z,
\[
Z \stackrel{\rm def}{=} \{1, \ldots, L\} \setminus (X \cup Y). \tag{5.58}
\]
The set Z contains the indices of the class densities that are not involved at all in the active
constraints of A.
When we consider a class index l in X, definitions analogous to (5.27) can be made:
• For l \in X_D, we define the sets D^+(l) and D^-(l) as
\[
D^\pm(l) \stackrel{\rm def}{=} \{ j \in V_D \mid c_l \in D_j^\pm \}, \tag{5.59}
\]
that is the set of active dependent sets whose positive (negative) shore involves the l-th
class. By convention, we then set
\[
I^\pm(l) \stackrel{\rm def}{=} \emptyset \quad \text{if } l \not\in X_I. \tag{5.60}
\]
• Similarly, if l \in X_I, we define the sets I^+(l) and I^-(l) as
\[
I^\pm(l) \stackrel{\rm def}{=} \{ j \in V_I \mid \exists a_k \in \Omega(l) \cap I_j^\pm \}, \tag{5.61}
\]
i.e. the set of active islands such that their positive (negative) shore involves an arc asso-
ciated with the l-th class. We then set
\[
D^\pm(l) \stackrel{\rm def}{=} \emptyset \quad \text{if } l \not\in X_D. \tag{5.62}
\]
• We finally define, for l \in X, F^+(l) and F^-(l) as follows:
\[
F^\pm(l) \stackrel{\rm def}{=} D^\pm(l) \cup I^\pm(l). \tag{5.63}
\]
The set F^+(l) (resp. F^-(l)) then contains the active constraints that are not bound constraints and
that involve the class c_l in their positive (resp. negative) shore.
Lemma 5.2 Consider a dual feasible solution for the problem of minimizing (5.17) subject to
the constraints in the active set A = (V, Y). Assume furthermore that, among the Lagrange
multipliers \{u_k\}_{k=1}^{|A|}, those associated with the active islands and dependent sets of V are known.
Then the class density vector d corresponding to this dual solution is given by
\[
d_l = \delta[l \in Y \setminus Y_0] \, \omega_l + \delta[l \in X \cup Z] \, \bar d_l
    + \delta[l \in X] \left[ \sum_{g \in F^+(l)} \Gamma_l(J_g^+) \, u_g - \sum_{g \in F^-(l)} \Gamma_l(J_g^-) \, u_g \right] \tag{5.64}
\]
for l = 1, \ldots, L.
Proof. Define the following sets:
• the set of active islands related to lower bound constraints on a shortest path, i.e. constraints
of type (5.12),
\[
B_I \stackrel{\rm def}{=} \{ q \in V_I \mid I_q^- = \emptyset \}; \tag{5.65}
\]
• the set of classes that are involved in the active constraints of type (5.12),
\[
X_{B_I} \stackrel{\rm def}{=} \{ l \in \{1, \ldots, L\} \setminus Y \mid \exists j \in B_I : \exists a_k \in \Omega(l) \cap I_j \}; \tag{5.66}
\]
• and the set of classes involved in the active constraints of type (5.8)²,
\[
X_{B_I^c} \stackrel{\rm def}{=} \{ l \in \{1, \ldots, L\} \setminus Y \mid \exists j \in V_I \setminus B_I : \exists a_k \in \Omega(l) \cap I_j \}. \tag{5.67}
\]
The Lagrangian function of our problem is
\[
\mathcal{L}(d, u) = \frac{1}{2} \sum_{l=1}^{L} (d_l - \bar d_l)^2 - S_D - S_{I^c} - S_I, \tag{5.68}
\]
where S_D is the term involving the active constraints on the class densities, that is
\[
S_D = \sum_{g \in V_D} u_g \left[ \left( \sum_{l \in \{1, \ldots, L\} \setminus Y_0} \eta_{gl} \, d_l \right) - \beta_g \right], \tag{5.69}
\]
S_{I^c} is the term involving the active shortest paths constraints,
\[
S_{I^c} = \sum_{g \in V_I \setminus B_I} u_g \left[ \sum_{l \in \{1, \ldots, L\} \setminus Y_0} \left( \Gamma_l(I_g^+) - \Gamma_l(I_g^-) \right) d_l \right], \tag{5.70}
\]
² The sets X_{B_I} and X_{B_I^c} are not necessarily disjoint.
and S_I is the term involving the active lower bound constraints on a shortest path,
\[
S_I = \sum_{g \in B_I} u_g \left[ \left( \sum_{l \in \{1, \ldots, L\} \setminus Y_0} \Gamma_l(I_g^+) \, d_l \right) - \gamma_g \right], \tag{5.71}
\]
where \gamma_g is the lower bound \gamma_{(n_o, n_d)} defined in (5.12) and related to the g-th constraint.
Since the class densities indexed by Y are fixed at their bound, (5.68) becomes
\[
\mathcal{L}(d, u) = \frac{1}{2} \sum_{l \in X \cup Z} (d_l - \bar d_l)^2 + \frac{1}{2} \sum_{l \in Y \setminus Y_0} (\omega_l - \bar d_l)^2 - S_D - S_{I^c} - S_I. \tag{5.72}
\]

When observing that, because of (5.38), gl = (,l (Dg+ ) , ,l (Dg, )), we can rewrite SD using (5.59)
and permuting the two sums in (5.69):
2 3
X X X X
SD = dl 4 ,l (Dg+ ) ug , ,l (Dg, ) ug 5 , g ug : (5:73)
l2XD [(Y nY0 ) g2D+ (l) g2D, (l) g2VD
Similarly, using (5.37) and (5.61), we can modify SIc and SI to obtain:
2 3
X X X
SIc = dl 4 ,l (Ig+ ) ug , ,l (Ig, ) ug 5 (5:74)
l2XBIc g2I (l)nBI
+ g2I ,(l)nBI

and 2 3
X X X
SI = dl 4 ,l (Ig+) ug 5 , g ug : (5:75)
l2XBI g2I + (l)\BI g2BI
Expressing now the condition @ L@d(d;u )
l = 0 (for l 2 X [ Z ) and combining the terms from (5.74)
and (5.75) yields, for l 2 X [ Z , that
dl = [l 2 X [ Z ] dl P
+ [l 2 XD [ (Y n Y0 )] , ( D + ) u , P , , (D,)u  (5:76)
g2D (l) l g g g2D (l) l g g
P P
+

+ [l 2 XI ] ,
g2I (l) ,l (Ig )ug , g2I , (l) ,l (Ig )ug :
+
+

Finally, since dl = l for l 2 Y n Y0 , we obtain the desired expression of dl using (5.57), (5.60),
(5.62) and (5.63) in (5.76). 2
Of course, the multipliers uq (q 2 E ) in (5.64) are not constrained to be nonnegative.
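On a hypothetical three-class instance, (5.64) reads as follows (class 0 at a bound, class 1 in one active island with known multiplier, class 2 uninvolved); all numbers are made up for illustration:

```python
# Density recovery (5.64) on toy data: all values below are hypothetical.
dbar = [5.0, 3.0, 7.0]     # a priori densities \bar d_l
omega0 = 1.0               # class 0 is fixed at the bound omega_0 (Y \ Y0)
u_g = 2.0                  # multiplier of the single active island g
g_pos, g_neg = 1.5, 0.0    # Gamma_1(J_g^+) and Gamma_1(J_g^-)

d = [omega0,                               # l in Y \ Y0: bound value
     dbar[1] + g_pos * u_g - g_neg * u_g,  # l in X: a priori value + correction
     dbar[2]]                              # l in Z: a priori value kept
print(d)  # [1.0, 6.0, 7.0]
```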
5.4.4 The primal step direction
So far, we have been able to calculate N^T n_q, the dual step direction r by triangular substitutions,
and the update of the class density values. We now specialize the computation of the primal step
direction s (when it is nonzero) and of its inner product with n_q when n_q is linearly independent
from the columns of N.
Lemma 5.3 Assume that the inverse shortest path algorithm has reached the point where the
primal step direction is to be computed, and assume that A = (V, Y) is the active set at this stage
of the calculation. Then, when nonzero, the step direction in the primal space s is given by
\[
s_l = \delta[l \in X] \left[ [n_q]_l + \sum_{k \in F^-(l)} \Gamma_l(J_k^-) \, r_k - \sum_{k \in F^+(l)} \Gamma_l(J_k^+) \, r_k \right] \tag{5.77}
\]
for l = 1, \ldots, L. Moreover, if q is the index of a violated island or a violated dependent set that
is not a bound on a class density, we have that
\[
s^T n_q = \sum_{i \in D} \left( [n_q]_i^2 + [n_q]_i \left[ \sum_{k \in F^-(i)} \Gamma_i(J_k^-) \, r_k - \sum_{k \in F^+(i)} \Gamma_i(J_k^+) \, r_k \right] \right), \tag{5.78}
\]
where
\[
D = \left\{ l \in X \;\middle|\; \exists a_k \in \Omega(l) : a_k \in I_q \;\text{ or }\; c_l \in D_q \right\}. \tag{5.79}
\]
If q is the index of a bound constraint on the l_q-th class density, we then have that
• for a lower bound,
\[
s^T n_q = \Gamma_{l_q}(J_q^+) + \sum_{k \in F^-(l_q)} \Gamma_{l_q}(J_k^-) \, r_k - \sum_{k \in F^+(l_q)} \Gamma_{l_q}(J_k^+) \, r_k; \tag{5.80}
\]
• for an upper bound,
\[
s^T n_q = \Gamma_{l_q}(J_q^-) - \sum_{k \in F^-(l_q)} \Gamma_{l_q}(J_k^-) \, r_k + \sum_{k \in F^+(l_q)} \Gamma_{l_q}(J_k^+) \, r_k. \tag{5.81}
\]
Proof. From the definition of H in (5.21) and that of r = N^* n_q, the primal step direction
s (= H n_q) may be rewritten as n_q - N r. Then, using (5.49), the l-th component of s can be
expressed as
\[
s_l = \Gamma_l(J_q^+) - \Gamma_l(J_q^-) - \sum_{k=1}^{|A|} \left( \Gamma_l(J_k^+) - \Gamma_l(J_k^-) \right) r_k. \tag{5.82}
\]
Eliminating the null terms and using (5.63), we obtain that
\[
s_l = \Gamma_l(J_q^+) - \Gamma_l(J_q^-) + \sum_{k \in F^-(l)} \Gamma_l(J_k^-) \, r_k - \sum_{k \in F^+(l)} \Gamma_l(J_k^+) \, r_k. \tag{5.83}
\]
Moreover, if l \not\in X, s_l must be zero, since a class density at a bound cannot change as long as the
bound constraint is active. We thus obtain (5.77) by using (5.46).
Assume now that the q-th constraint is a lower (resp. upper) bound on a class density (the
l_q-th one, say). Then n_q = e_q (resp. -e_q), the q-th vector of the canonical basis. Hence the
product s^T n_q is equal to s_q (resp. -s_q) and (5.80) (resp. (5.81)) follows from (5.77) and the fact
that q \in X, since d_q violates a bound.
On the other hand, if the q-th constraint is a violated island or a violated dependent set that
is not a bound constraint,
\[
s^T n_q = \sum_{i=1}^{L} s_i \left( \Gamma_i(J_q^+) - \Gamma_i(J_q^-) \right), \tag{5.84}
\]
where s_i has been established in (5.77). Note that both \Gamma_i(J_q^+) and \Gamma_i(J_q^-) may be nonzero,
at variance with what happens in the uncorrelated inverse shortest paths problem of Chapter 4.
Finally, we obtain (5.78) from (5.84) by eliminating the terms whose contribution is zero. □
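The identity s = n_q − N r used at the start of the proof can be checked numerically on toy data (hypothetical normals); the resulting s is orthogonal to all active normals, as expected for a projection:

```python
import numpy as np

# Primal direction s = H n_q = n_q - N r, with r = (N^T N)^{-1} N^T n_q.
N  = np.array([[1.0, 0.0],
               [1.0, 1.0],
               [0.0, 1.0]])               # active constraint normals (columns)
nq = np.array([1.0, 0.0, 1.0])            # normal of the entering constraint
r  = np.linalg.solve(N.T @ N, N.T @ nq)   # dual step direction (5.32)
s  = nq - N @ r                           # primal step direction
print(np.allclose(N.T @ s, 0.0))          # True: s lies in the null space of N^T
```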
5.4.5 The maximum steplength to preserve dual feasibility
Since equality constraints have been added for our correlated problem, the set S containing the
constraints that are candidates to leave the active set A is
\[
S = \{ j \in \{1, \ldots, |A|\} \setminus V_E \mid r_j > 0 \}, \tag{5.85}
\]
where
\[
V_E \stackrel{\rm def}{=} \{ \text{currently active equality constraints} \}. \tag{5.86}
\]
5.4.6 The algorithm
We are now able to describe the algorithm for solving our correlated inverse shortest path problem.
In this description, we will use a small (machine dependent) tolerance \epsilon > 0 to detect to what
extent a real value is nonzero, and we define the integer \sigma = |A|.
Note that the revision of the active set is similar to that presented in Chapter 4.
The following algorithm does not take the linear constraints on densities (5.13)-(5.14) into account,
simply because they can be handled by the Goldfarb and Idnani method.

Algorithm 5.1

Step 0: Initialization.
Set d \leftarrow \bar d, f \leftarrow 0, A \leftarrow \emptyset, \sigma \leftarrow 0 and u \leftarrow 0.
Set also w \leftarrow \bar w, where \bar w is defined componentwise by
\[
\bar w_i = \alpha_i \, \bar d_{\ell(i)} \quad \text{for } i = 1, \ldots, m. \tag{5.87}
\]

Step 1: Compute the current shortest paths.
For j = 1, \ldots, n_E, compute the shortest paths from s(j_1) to every vertex in P_j \setminus \{s(j_1)\} in
the graph (V, A, w).

Step 2: Choose a violated constraint or exit.
Select J_q, an island whose excess E_q is negative, if any. If no such island exists, then d is
optimal and the algorithm stops.
Otherwise, compute the normal n_q to the violated constraint reduced to the space of the
densities \mathbb{R}^L according to (5.46).
If \sigma = 0, then set \rho \leftarrow \|n_q\| and go to Step 4.
Otherwise (that is if \sigma > 0) set
\[
u^+ \leftarrow \begin{pmatrix} u \\ 0 \end{pmatrix}. \tag{5.88}
\]

Step 3: Compute the dual step direction.
Compute the vectors z and r by the triangular substitutions of Section 5.4.2, using Lemma 5.1.
Compute also \rho according to
\[
\rho = \sqrt{ \|n_q\|^2 - \|z\|^2 }. \tag{5.89}
\]

Step 4: Determine the maximum steplength to preserve dual feasibility.
Determine the set S by (5.85), and t_f (and possibly \ell) using (3.40), where u = u^+.

Step 5: Determine the steplength to satisfy the q-th constraint.
If \rho \leq \epsilon then go to Step 6b.
Otherwise, compute s and s^T n_q as described in Lemma 5.3, and t_c according to (3.41).

Step 6: Take the step and revise the active set.
6a: Compute the steplength t as \min[t_f, t_c], set d \leftarrow d + t s, update the arc weight values
w by (5.5) and revise f according to (3.45) (where u = u^+) and u using
\[
u \leftarrow \begin{cases} u^+ + t \begin{pmatrix} -r \\ 1 \end{pmatrix} & \text{if } \sigma > 0, \\[4pt] u^+ + t & \text{if } \sigma = 0. \end{cases} \tag{5.90}
\]
If t = t_c, set A \leftarrow A \cup \{q\}, \sigma \leftarrow \sigma + 1 and go to Step 7a.
Otherwise (that is if t = t_f) set A \leftarrow A \setminus \{\ell\}, \sigma \leftarrow \sigma - 1 and go to Step 7b.
6b: If t_f = +\infty, then the problem is infeasible, and the algorithm stops with a suitable
message.
Otherwise, update the Lagrange multipliers according to (3.42), where u = u^+. Set
A \leftarrow A \setminus \{\ell\}, \sigma \leftarrow \sigma - 1 and go to Step 7b.

Step 7: Revise the triangular factor R.
7a: Add the constraint normal n_q to N. If \sigma = 1 then set R = (\rho) and go to Step 1.
Otherwise (that is if \sigma > 1) update the upper triangular matrix R using
\[
R \leftarrow \begin{pmatrix} R & z \\ 0 & \rho \end{pmatrix} \tag{5.91}
\]
and go to Step 1.
7b: Drop n_\ell from N. Remove from R the column corresponding to the \ell-th island, and
use Givens rotations to restore it to upper triangular form. Go to Step 3.

In our algorithmic framework, the computation of the new values of the primal variables may
be deferred until after that of the dual step, in contrast with the original method of Goldfarb
and Idnani.
Note that in the second step of our current implementation of the algorithm, we do not specify
how to select a violated constraint. We examine two possibilities in the next section.
Also remark that the calculation of t_f ensures that equality constraints can never leave the
active set.

5.5 Numerical experiments


In this section, we compare, on the one hand, the performance of our method for the correlated
inverse shortest path problem against that of the method for the uncorrelated problem and,
on the other hand, the performance of two implementations of the algorithm for the correlated
problem.

5.5.1 Implementation remarks


In our implementations, we used linked lists to represent the partition of the arcs into classes.
The difficulty has been to set up efficient data structures that allow the quantities Γ_ℓ(J_k) to be
calculated easily, since the need to evaluate them occurs at each stage of the algorithm. Handling
both constraints on class densities and constraints expressed in arc weights at the same time has
also been a challenging part of these implementations.
We also developed a tool for generating various test problems. This generator can create
random, 2-D and 3-D grid graphs (with or without diagonals), and generate layered or random
densities with associated correlations for the arc weights; constraints can be generated at random,
or between two or more faces when applied to grid graphs, including explicit, implicit and equality
constraints, as well as other linear constraints that are useful for modelling real practical problems.
Our algorithms were implemented in Fortran 77 on a DECstation 3100, using double precision
arithmetic, with the Mips f77 compiler.
All shortest path calculations are performed using Johnson's algorithm with a binary heap
[67], which is presented in Chapter 2.
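For readers unfamiliar with the heap-based computation, the core of such a shortest path routine can be sketched as follows. This is an illustrative Python sketch of Dijkstra's algorithm with a binary heap, the building block of Johnson's method when the weights are nonnegative; the thesis implementation itself is in Fortran 77, and the data layout here is our own.

```python
import heapq

def dijkstra(adj, source):
    """Shortest path costs from `source` using a binary heap.

    `adj` maps each vertex to a list of (neighbour, weight) pairs
    with nonnegative weights, as assumed throughout this chapter.
    """
    dist = {v: float("inf") for v in adj}
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:          # stale heap entry, skip it
            continue
        for u, w in adj[v]:
            if d + w < dist[u]:
                dist[u] = d + w
                heapq.heappush(heap, (dist[u], u))
    return dist
```

With m arcs and n vertices this runs in O(m log n) time, which is why the shortest path trees of step 1 remain affordable even on the larger test problems below.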

5.5.2 Correlated method – uncorrelated method


We tested Algorithm 5.1 against Algorithm 4.1, the method presented in Chapter 4, on correlated
problems typically arising in traffic modelling and seismic tomography. These problems mostly
contain constraints of type (5.8), with a few general linear inequality constraints of type (5.13)–
(5.14). A large number of problems of increasing dimension were generated. Each problem was
determined by specifying a sparse graph of n vertices joined by m arcs. The path constraints
were generated randomly.
Among those generated, we selected a few representative problems, whose characteristics appear
in Table 5.1. We recall that L is the number of classes in the correlated problem, and n_E
the number of known shortest paths defining the constraints of the problem. The graph of test
problem #2 is illustrated in Figure 1; it is in fact extracted from a larger graph covering a whole
city in a realistic application.

 #     L     n     m   n_E
 1     9    16    24    12
 2    36    49    84    24
 3   100   121   220    56
 4   289   324   612   144
 5   625   676  1300   312
 6  1600  1681  3280   650

Table 5.1: Test problems involving class densities

The results of the test runs are reported in Table 5.2. The columns labelled "CORRELATED"
and "UNCORRELATED" refer to the correlated method and the uncorrelated method, respec-
tively. As already mentioned above, the number of variables is much smaller when solving a
correlated problem with the former method than with the latter. The label "var" indicates this
number of variables in each case. |A| is the number of active constraints at the solution. In
the third column (for each method), one can find the number of dropped constraints, that is,
the number of minor iterations. The reader can then deduce the number of (major) iterations
required to solve the problem as the sum of the number of minor iterations and |A| + 1. Finally,
the heading "time" refers to the total cpu-time (in seconds) needed to obtain the solution, and
"sp time" is the time (in seconds) spent in calculating shortest path trees.

            CORRELATED                           UNCORRELATED
 #    var  |A|  drops     time  sp time     var  |A|  drops     time  sp time
 1      9    3      0    0.230    0.011      24    3      0    0.851    0.011
 2     36    4      0    0.433    0.144      84    4      0    0.925    0.066
 3    100    4      0    0.894    0.425     220    5      0    1.269    0.367
 4    289   13      0    9.750    4.703     612   16      1    7.285    5.175
 5    625   41      1  130.582   49.304    1300   31      2   41.078   34.046
 6   1600   89      3 1958.714  473.363    3280  125      1  569.003  461.949

Table 5.2: Comparative test results for the correlated and uncorrelated algorithms

The following figure shows the results obtained in Table 5.2 with the correlated algorithm.
The left-hand histogram illustrates the total number of iterations, partitioned into drops and
major iterations. The right-hand graphic shows the time spent in calculating shortest paths
relative to the overall algorithm run-time. The corresponding results obtained with the
uncorrelated algorithm are represented in Figure 4.2 of Chapter 4.
We first notice that the correlated method runs faster on the smaller problems (#1–#3), while
this is not the case on the larger ones (#4–#6). But, of course, the arc weights produced by the

[Figure: two bar charts for the correlated algorithm — left, iterations (drops and major iterations) per problem size m; right, shortest path time as a fraction of overall run-time.]

Figure 5.4: The correlated algorithm: iterations per problem size and shortest paths calculation

uncorrelated method lack the necessary correlation between their values, although these values do
generally correspond in order of magnitude. The usefulness of these weights can therefore be
questioned for practical applications. The new method, on the other hand, produces the desired
correlations, as expected.
The shortest path tree calculation requires roughly the same amount of time for both meth-
ods, even when the problem's dimension increases; this time logically tends to vary in parallel
with the number of major iterations.
In order to explain the cpu-time differences observed for tests #4–#6 (the correlated
method taking much more time, despite comparable shortest path computations), we
evaluated the time spent in each procedure of both methods using the UNIX profiler. It turns
out that 40% of the time is used for the shortest path calculations in the correlated method,
while the proportion increases to 80% for the uncorrelated method. The additional computation
in the new method corresponds to calculating the values of Γ_ℓ(J_k), and can take up to 50% of
the total execution time. This calculation relates the smaller problem (in terms of class densities)
to the larger one (in terms of arc weights).

5.5.3 Selecting violated constraints


We now provide a comparison between the performance of two implementations of our correlated
method.
Both implementations start by incorporating the equality constraints, because they must be
active at the solution. The first variant then chooses the most violated constraint for inclusion
in the active set (a greedy approach). The second simply incorporates the first violated constraint
found in the list of the original problem constraints instead of the most violated one. This latter
strategy might prove cheaper in computation time, because the procedure of selecting the next
constraint to add to the active set is considerably simpler.
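The difference between the two selection rules is small enough to sketch directly. The following is an illustrative Python fragment, not the thesis's Fortran code; the island excesses E_q are assumed to be available as a list, with negative entries marking violated constraints.

```python
def most_violated(excesses):
    """Greedy rule (variant MV): index of the most negative excess, or None."""
    candidates = [(e, j) for j, e in enumerate(excesses) if e < 0]
    return min(candidates)[1] if candidates else None

def first_violated(excesses):
    """Cheap rule (variant FV): first index with a negative excess, or None."""
    for j, e in enumerate(excesses):
        if e < 0:
            return j
    return None
```

MV scans the whole list at every major iteration, while FV can stop at the first hit; the experiments below measure whether the better-informed choice of MV pays for its extra scanning.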
We tested these variants on problems where the number of equality constraints is half the
total number of constraints. Characteristics of six test problems are given in Table 5.3.

 #     L     n     m   n_E
 7     9    16    42    12
 8    36    49   156    24
 9    64    81   144    24
10   225   256   930    60
11   256   289   544    70
12   441   484   924   140

Table 5.3: Test problems with equality constraints

The results of these test runs are summarized in Table 5.4.

          MOST VIOLATED (MV)             FIRST VIOLATED (FV)
 #  |A|  drops     time  sp time    drops      time  sp time
 7    8      0    0.188    0.044        2     0.179    0.004
 8   18      0    0.457    0.093        1     0.425    0.051
 9   24      5    1.386    0.292       17     1.960    0.183
10   76     15   60.105    7.046       76   135.450    4.281
11   68      6   31.214    3.984       88    94.839    5.796
12  154     17  373.980   27.191      327  1786.385   56.530

Table 5.4: Test results on equality constraints

Again, we propose in Figure 5.5 illustrations of the results obtained by these variants of
the correlated algorithm. The contents of the left-hand and right-hand graphics are the same as
those explained above for Figure 5.4. The upper illustrations apply to the variant choosing the
first violated constraint, and the lower graphics illustrate the variant choosing the most violated
constraint as candidate to enter the active set.
We note that variant MV gives the smallest number of drops and also the smallest computing
time for the larger problems. Variant FV only seems interesting for the smaller cases. We also tried
to handle the equality constraints just as the other ones, without giving them priority to enter
the active set. The results obtained are sometimes better than the worse of MV and FV, but
never better than the better of the two.
Of course, more experience is required before drawing extensive and definitive conclusions.
But we feel that the reported tests already illustrate some major trends in the use of inverse
shortest path algorithms.

[Figure: four bar charts for the correlated algorithm — upper pair, variant FV (first violated constraint); lower pair, variant MV (most violated constraint); left charts, iterations (drops and major iterations) per problem size m; right charts, shortest path time as a fraction of overall run-time.]

Figure 5.5: Algorithm variants: iterations per problem size and shortest paths calculation
6

Implicit shortest path constraints

In this chapter, we examine the computational complexity of the inverse shortest path problem
with upper bounds on shortest path costs. The presence of such upper bounds makes the inverse
shortest path problem harder to solve. Indeed, an upper bound constraint restricts the cost of a
path that is not known explicitly, and therefore cannot be expressed as one or more linear
constraints. Our problem can then become non-convex. Solving this problem has interesting
implications, namely in seismic tomography, where ray paths between known locations are usually
not observable and hence unknown. We will prove that obtaining a globally optimal solution to
this problem is NP-complete: we show that a polynomial transformation of the well-known
3SAT problem can be viewed as an inverse shortest path problem with upper bounds on shortest
path costs. An algorithm for finding a locally optimal solution is then proposed and discussed.
The local optimality conditions allow us to define a "stability region" around a (local) solution
when the shortest paths (defined at that solution) are unique. A combinatorial strategy is set up
when the shortest paths are not unique; this is necessary to obtain a solution for which a stability
region can again be defined. We will see that the stability region of a local solution depends on
the second shortest path costs: the idea is to define a region in which the explicit definition of
some shortest paths does not change. Our algorithm (using an enumeration strategy) has been
implemented and tested on problems arising in practical applications. These tests bring out the
fact that very few problems (among those not generated at random) need recourse to our
combinatorial strategy. They also illustrate that the combinatorial aspect of our problem may
appear in practice, although shortest path uniqueness is usually expected (in double precision
arithmetic).
The content of this chapter is reported in [17].

6.1 Motivating examples


The number of possible and interesting variants of the inverse shortest path problem is large.
Yet many applications (including tomography and some traffic modelling questions) feature a
specific class of constraints in their formulation: bounds on the total weight of shortest paths
between given origins and destinations. Unfortunately, only lower bounds on path costs have

been considered so far. It is the purpose of this chapter to examine the more difficult case where
upper bounds are present as well.
We now motivate this development with two examples.
The first arises from seismic tomography. In this field, one is concerned with recovering ground
layer densities from observations of seismic waves [79]. According to Fermat's principle, these
waves propagate along rays that follow the shortest path in time across the earth's crust. One
can then measure, usually with some error, the propagation time of these rays between a known
source and a known receiver. The problem is then to reconstruct the ground densities from these
observations. One approach [87] uses a discretisation of the propagation medium into a network
whose arcs have weights inversely proportional to the local density. In this framework, one is then
faced with the problem of recovering these arc weights from the knowledge of intervals on seismic
ray travel times and from a priori geological knowledge, the ray paths themselves remaining
unknown. This is an inverse shortest path problem with bounds on the paths' weights.
The second example is drawn from traffic modelling. In this research area, graph theory is
used to create a simplified view of a road network. An elementary (and often justified) behavioural
assumption is that network users choose perceived shortest routes for their journeys [92, 13, 84].
Although these routes might be observable, their precise description might vary across time and
individuals, and their travel cost is usually subject to some estimation. This naturally provides
bounds on the total time spent on shortest paths whose definition is unavailable. Recovering the
perceived arc costs is an important step in the analysis of network users' behaviour. This is again
a problem of the type considered in this chapter.
In the next section, we formalize the problem and explain why these bound constraints on
shortest path costs cannot fit into the framework of classical convex quadratic programs, as
used in Chapters 4 and 5.

6.2 The problem


We now define the problem more formally and discuss its special nature.
Consider a directed weighted graph (V, A, w), where (V, A) is an oriented graph with n vertices
and m arcs, and where w is a set of nonnegative weights {[w]_i}_{i=1}^m associated with the arcs;
we use [·]_i to denote the i-th component of a vector. Let V be the set of vertices of the graph
and A = {a_k}_{k=1}^m be the set of arcs. A path is then defined as a set of consecutive arcs of A.
As presented in Chapter 4, the idea of the inverse shortest path method is to determine the
arc weights that are as close as possible to their expected values, subject to satisfying the shortest
path constraints. Denoting these a priori expected values by {w̄_i}_{i=1}^m and choosing the ℓ₂ norm to
measure the proximity between the weight vectors w and w̄, we therefore consider the following
least squares problem

    min_{w ∈ R^m} f(w) = (1/2) Σ_{i=1}^m ([w]_i − [w̄]_i)²    (6.1)

subject to the constraints

    [w]_i ≥ 0,  i = 1, …, m    (6.2)

and the bound constraints on the cost of shortest paths

    Σ_{a ∈ p¹_q(w)} [w]_a ≤ u_q,  q = 1, …, n_I,    (6.3)

where p¹_q(w) is the shortest path (with respect to the weights w) starting at vertex o_q and arriving
at the vertex d_q.¹ The values of u_q are upper bounds on the cost of the shortest path from o_q to
d_q. We allow u_q to be infinite. Note that the shortest path p¹_q(w) is not necessarily unique for a
given w. This will have important implications later in this chapter.
The method proposed in Chapter 4 is based on the quadratic programming algorithm due to
Goldfarb and Idnani [55]. The idea is to compute a sequence of optimal solutions to problems
involving only a subset of the constraints present in the original problem. The method therefore
maintains an active set of constraints. Starting from the unconstrained solution, each iteration
incorporates a new constraint in the active set, completing what we call a major iteration. To
achieve this goal, it may be necessary to drop a constraint from the active set. These drops
occur in minor iterations.
Incorporating upper bound constraints of the type (6.3) in the active set is complex. The
difficulty is that expression (6.3) only defines the path p¹_q(w) implicitly, while adding a constraint
to the active set (as a linear inequality on arc weights) requires an explicit definition of that
constraint of the form

    Σ_{a ∈ p} [w]_a ≤ u_q    (6.4)

where one needs the explicit definition of the path p as a succession of arcs to specify which arcs
appear in the summation. When such a constraint is activated, one naturally chooses a path
which is currently shortest given the value of the arc weights [w]_a. However, as these weights
are modified in the course of the optimization, a path that is shortest between a given origin
and destination may change, and therefore the explicit definition of the constraint in the form (6.4)
should also change accordingly. An immediate consequence of this observation is that, besides
adding and dropping constraints of the type (6.4) from the active set, one should also keep track
of the modifications in the explicit definitions of the constraints (6.3), which might in turn modify
the active set.
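The "freezing" of an implicit constraint into an explicit one can be pictured as follows. This is a minimal Python sketch; the helper name and the representation of a path as a list of arc indices are our own illustration, not the thesis's data structures.

```python
def explicit_constraint(path_arcs, m, u_q):
    """Freeze the implicit bound (6.3) into a linear inequality (6.4).

    `path_arcs` lists the indices of the arcs on a currently shortest
    path; the result is the coefficient vector a and the bound u_q of
    the linear constraint  sum_i a_i * w_i <= u_q  on the m arc weights.
    """
    a = [0] * m
    for i in path_arcs:
        a[i] = 1          # arc i appears in the summation of (6.4)
    return a, u_q
```

Whenever the shortest path between o_q and d_q changes during the optimization, a new coefficient vector must be generated in this way and the old one discarded, which is exactly the bookkeeping described above.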

6.3 The complexity of the problem


6.3.1 The convexity of the problem
The difficulty of handling constraints of type (6.3) can be partially explained by the fact that
they generate a non-convex feasible region. Our goal of finding a global minimizer of the objective
¹ The superscript 1 in p¹_q(w) indicates that the shortest path is considered, as opposed to
the second shortest.

function (6.1) with a method of low complexity then appears much more difficult, despite the
fact that the objective is strictly convex.
Let us recall the small example of Chapter 3, dedicated to illustrating the non-convex nature of
our constraints. Consider the following graph, composed of 3 vertices and 3 arcs (m = 3), shown
in Figure 6.1.

t
o t
a1
,
,
,@ a2
,
@
@
@
a3
t d

Figure 6.1: A small graph

Consider now the problem of minimizing (6.1) subject to the constraint

    Σ_{a ∈ p¹(w)} [w]_a ≤ 5,    (6.5)

where p¹(w) is the shortest path (with respect to the weights w) from vertex o to vertex
d. It is easy to see that w1 = (2, 2, 10)ᵀ and w2 = (10, 10, 4)ᵀ are feasible solutions, while
(w1 + w2)/2 = (6, 6, 7)ᵀ is infeasible. The problem is therefore non-convex.

6.3.2 The 3-SAT problem as an inverse shortest path calculation
In this section, we use the terminology of complexity theory introduced in Chapter 3. As
mentioned in Chapter 4, the original inverse shortest path problem is solvable in polynomial
time, since an equivalent formulation of the problem contains a polynomial number of convex
constraints. The original problem thus belongs to the class P of decision problems solvable in
polynomial time by a deterministic algorithm [50, 69, 111, 113].
A problem is NP-hard if every problem in the class NP of problems solvable in polynomial
time on a nondeterministic Turing machine can be transformed to it; see [50]. Cook [24] proved
that there exist NP-hard problems by showing that the "satisfiability" problem has the property
that every other problem in NP can be polynomially reduced to it. Therefore, if the satisfiability
problem can be solved with a polynomial time algorithm, then so too can every problem in NP.
In effect, the satisfiability problem is a "hardest" problem in NP. Many other combinatorial
problems, such as the travelling salesman problem, have since been proved to have this same
"universal" property. Vavasis proved in [113] that the general non-convex quadratic problem is
one of those hard problems in NP.
The class of NP-complete problems consists of all such problems which themselves belong to
NP.
We now show that the addition of the constraints (6.3) makes the inverse shortest path
problem NP-hard. We do this by reducing a known NP-hard problem to it.

A particular instance of the satisfiability problem, the 3-SAT problem, is one of the best
known NP-complete problems. We follow [50] for its brief description. Let X be a set of Boolean
variables {x1, x2, …, xl}. A truth assignment for X is a function t : X → {true, false}. Let x
be a variable in X; we say that x is realized under t if t(x) = true. The variable ¬x is
realized under t if and only if t(x) = false. We say that x and ¬x are literals defined upon the
variable x. A clause is a set of literals over X, such as {x1, x2, ¬x3}, representing the disjunction
of those literals; it is satisfied by a truth assignment if and only if at least one of its members is
realized under that assignment. A set C of clauses over X is satisfiable if there exists some truth
assignment for X that simultaneously satisfies all the clauses in C. The 3-SAT problem consists in
answering the question: is there a truth assignment satisfying C when the clauses in C contain exactly 3
literals over X? Cook proved that this problem is NP-complete [24]. We can show that another
problem is NP-hard by showing that 3-SAT can be polynomially transformed to it.
Let ISP denote the decision problem: given an inverse shortest path problem and a bound k,
does there exist a solution with objective value at most k? We show that ISP is NP-complete,
which implies that the inverse shortest path problem is NP-hard.
Theorem 6.1 ISP is NP-complete.

Proof. We proceed as follows:

1. show that problem ISP is in NP,
2. construct a transformation from 3-SAT to ISP, and
3. prove that this is a polynomial transformation.

The first requirement is easy to verify in our case, because the shortest path problem itself
can be solved in polynomial time, and all a nondeterministic algorithm solving ISP need do is
guess a set of arc weights and verify in polynomial time that they satisfy the constraints.
Let us now examine the second requirement and consider a 3-SAT problem with l variables
and p clauses. We represent each variable x_i by a small (sub)graph with two distinct paths
from a node s_i to a node d_i (see Figure 6.2). The variable x_i will be true or false depending

[Figure: the gadget for x_i — from s_i, an upper path through vertices u_i^1 and u_i^2 to d_i, and a lower path through vertices l_i^1 and l_i^2 to d_i, all six arcs having cost 1.]
Figure 6.2: The representation of xi

on whether the shortest path from s_i to d_i follows the upper path (via vertices u_i^1, u_i^2) or the lower

one (via vertices l_i^1, l_i^2) of its associated graph. Imposing that

    s_i = d_{i−1}  (for i = 2, …, l),    (6.6)

we obtain a "chain-like" resulting graph representing our Boolean variables. A path from vertex
s_1 to vertex d_l in this graph is therefore equivalent to a truth assignment of all Boolean variables.
We assign an initial cost of 1 to each of the six arcs of the "Boolean graph" x_i.
We now describe a representation of our p clauses. A clause c of the 3-SAT problem is a
disjunction of the type (x_i ∨ x_j ∨ ¬x_k), for instance. The clause c will be associated with the choice
among three possible paths going from a vertex named a_c to a vertex named b_c, where a_c and b_c
are different from the vertices of the Boolean graphs x_i (i = 1, …, l). Each of the three paths is
formed by three consecutive oriented arcs. The first arc originates at vertex a_c and has a zero cost,
and the last one terminates at vertex b_c and also has a zero cost. The middle arc is one of
the arcs (l_i^1, l_i^2) or (u_i^1, u_i^2), depending on whether the variable x_i considered in the clause is
negated or not. The subgraph associated with a clause c of the type (x_i ∨ x_{i+1} ∨ ¬x_k) is illustrated
in Figure 6.3.

[Figure: from vertex a_c, three two-arc zero-cost detours lead through the middle arcs of the gadgets for x_i, x_{i+1} and x_k, then back to vertex b_c.]

Figure 6.3: The subgraph associated with clause c

Our representation of the variables x_i and the clauses c_j generates a weighted oriented graph,
which we call G. The original cost of any path between a_c and b_c (c = 1, …, p) is 1, and the cost
of the shortest path from s_1 to d_l is 3l. The 3-SAT problem is then equivalent to the question:
is there a choice of nonnegative arc weights in G such that the cost of the shortest path between
each pair of nodes (a_c, b_c) is zero, as well as that of the shortest path from s_1 to d_l, and such that
the ℓ₂ distance of these weights to the original weights is at most 3l?
The equality constraints on the shortest paths in this formulation may be replaced by upper
bound constraints, provided that we require the arc weights w to be as close as possible to the
original weights w̄. The resulting problem is therefore

    min_w ‖w − w̄‖₂    (6.7)

subject to

    w ≥ 0    (6.8)
    cost(a_c, b_c) ≤ 0,  c = 1, …, p    (6.9)
    cost(s_1, d_l) ≤ 0,    (6.10)

where cost(n_1, n_2) is the cost of a shortest path from n_1 to n_2 in G.
We recognize, in the formulation (6.7)–(6.10), an instance of our inverse shortest path problem
with upper bound constraints on the cost of shortest paths. We have thus found a transformation
from the 3-SAT problem to ISP.
Finally, it is easy to see that this transformation is polynomial, since the instance of ISP
we constructed has 6l + 6p arcs and 5l + 1 + 2p nodes. This completes our proof that ISP is
NP-complete. □
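The polynomiality of the construction amounts to simple counting, which the following illustrative Python sketch records; the per-gadget counts follow the description above.

```python
def reduction_size(l, p):
    """Size of the ISP instance built from a 3-SAT formula
    with l variables and p clauses."""
    # Variable gadgets: s_1 plus 5 new vertices per variable
    # (u_i^1, u_i^2, l_i^1, l_i^2, d_i), chained via s_i = d_{i-1};
    # each gadget contributes 6 arcs of cost 1.
    nodes = 5 * l + 1
    arcs = 6 * l
    # Clause gadgets: two new vertices a_c, b_c per clause and
    # three detours of two new zero-cost arcs each (the middle
    # arcs are reused from the variable gadgets).
    nodes += 2 * p
    arcs += 6 * p
    return nodes, arcs
```

Both counts grow linearly in l and p, so the graph G can be written down in time polynomial in the size of the 3-SAT instance.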

6.4 An algorithm for computing a local optimum

We saw in Chapter 3 that, for convex problems, each critical point is a global optimum. In our
non-convex context, we shall be content with a local minimum, that is, a set of weights ŵ such
that, for all w in the intersection of the feasible domain and a neighbourhood of ŵ, one has
f(w) ≥ f(ŵ).
In Chapters 4 and 5, we considered a dual approach to find a global solution to the inverse
shortest paths problem, because of its robustness and the assurance of reaching a global optimum.
Yet this approach presents a drawback in our new context: it relies heavily on convexity.
On the other hand, primal methods typically generate a sequence of primal feasible iterates
ensuring a monotonic decrease of the objective function. They do not rely as much on convexity
and have the further advantage of giving an approximate solution satisfying the constraints when
the iteration process is interrupted. This approach "from the inside" is the one we have chosen
to follow. The general outline of our proposal is as follows.
1. We first compute a feasible starting point.
2. At each iterate, we revise the explicit definition of the shortest path constraints, and solve
the resulting convex problem using the algorithm proposed in Chapter 4.
3. The calculation is stopped when no further progress can be obtained in this fashion.

6.4.1 Computing a starting point

Selecting a good starting point is important in this framework. We propose to use Algorithm 4.1
to find our starting weights. This algorithm indeed computes an optimal solution of a variant
of the problem in which the explicit description of the path constraints is kept fixed at that chosen
at w̄. This variant is

    min_{w ∈ R^m} (1/2) Σ_{i=1}^m ([w]_i − [w̄]_i)²    (6.11)

subject to

    [w]_i ≥ 0,  i = 1, …, m    (6.12)

and the bound constraints on the cost of the a priori shortest paths

    Σ_{a ∈ p¹_q(w̄)} [w]_a ≤ u_q,  q = 1, …, n_I.    (6.13)

It is important to note that w1, the solution of (6.11)–(6.13), is feasible for the original
ISP, although it may not be optimal, because at w1 the explicit definition of the shortest path
constraints may differ from that at w̄. This calculated vector is therefore a suitable starting point
for a primal algorithm.

6.4.2 Updating the explicit constraint description

Given a set of weights w, we must choose an explicit description of the constraints associated
with w. Together with the quadratic objective function, this new explicit description then defines
a convex quadratic program.
We emphasized the word "choose" above because there might be more than a single shortest
path p¹_q(w) between the origin o_q and destination d_q of the q-th original constraint in the ISP.
Denoting the number of shortest paths from o_q to d_q (for a given w) by n_q(w), we define p¹_q(w, i_j)
(i_j = 1, …, n_q(w)) as the i_j-th shortest path from o_q to d_q. This definition assumes that we
have ordered the n_q(w) shortest paths, for instance lexicographically. For convenience, we
(re)define p¹_q(w) as the "first" shortest path from o_q to d_q, that is

    p¹_q(w) := p¹_q(w, 1).    (6.14)

For future reference, the possible convex feasible regions determined, for a given w, by the con-
straints (6.2)–(6.3) will be denoted by F(w, i_1, …, i_{n_I}), where i_q (q = 1, …, n_I) varies from 1 to
n_q(w). Again, for convenience, we define

    F(w) := F(w, 1, …, 1)  (n_I ones).    (6.15)

Using these notations, P(w) and P(w, i_1, …, i_{n_I}) respectively denote the problem of minimizing
f(w) subject to w ∈ F(w), or w ∈ F(w, i_1, …, i_{n_I}). Finally, F denotes the generally non-convex
feasible domain determined by (6.2) and (6.3).
Updating the constraint description at w therefore amounts to specifying F(w, i_1, …, i_{n_I}) for
some choice of the indices i_1, …, i_{n_I}.

6.4.3 Reoptimization
Once F(w, i_1, …, i_{n_I}) has been determined at the feasible point w, it is possible to solve the
associated convex quadratic program P(w, i_1, …, i_{n_I}). This process is called "reoptimization".
Because we assume that reoptimization always takes place at a point w which is the solution
of another subproblem P(w′, i′_1, …, i′_{n_I}), it is not difficult to see that the new subproblem differs
from the old one in two ways.
1. Some constraints of P(w′, i′_1, …, i′_{n_I}) are now obsolete because the associated path, although
shortest for w′, is no longer shortest for w. These constraints must be replaced by constraints
whose explicit description corresponds to paths that are shortest for w.
2. Although p¹_q(w′, i′_q) can still be shortest for w, another shortest path between o_q and
d_q may be chosen to define the new subproblem. The constraint whose explicit description
corresponds to p¹_q(w′, i′_q) must then be replaced by another constraint with explicit
description corresponding to p¹_q(w, i_q).
As a consequence, some linear inequalities of the form (6.4) are dropped from the subproblem
and some new ones are added.
Adding new linear inequalities can be handled computationally by using the Goldfarb-Idnani
dual quadratic programming method, as is already the case in Chapter 4. Removing linear
inequalities can be handled in much the same way, by computing the Goldfarb-Idnani step that
would add them and then taking the opposite. These calculations are straightforward applications
of the method presented in Chapter 4; they are detailed and illustrated in Section 6.5.

6.4.4 The algorithm

We are now in a position to specify our proposal for an algorithm that computes a local solution to ISP.
Algorithm 6.1
Step 0: Initialization.
Compute w_1 using Algorithm 4.1, the inverse shortest paths algorithm of Chapter 4, for solving P(w).
Set i = 1 and C_1 = (1, ..., 1).
Step 1: Update the feasible region.
Compute F(w_i; C_i).
Step 2: Reoptimization.
Compute w_{i+1}, the solution of P(w_i; C_i), using the inverse shortest paths Algorithm 4.1.
If w_{i+1} ≠ w_i, set i = i + 1 and go to Step 1 with C_i = (1, ..., 1).
Step 3: Choose another shortest path combination.
Is there, amongst the nc_i = Π_{q=1}^{n_I} n_q(w_i) possible shortest path combinations at w_i, one that has not been considered yet?
If not, stop: w_i is a local minimum of P(w).
Otherwise, redefine C_i to be (i_1, ..., i_{n_I}), the n_I-tuple of indices corresponding to an untried combination, and go to Step 1.
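The enumeration in Step 3 can be sketched in a few lines. This is our own illustration (the names `untried_combinations`, `n_per_constraint` and `visited` are not from the thesis): each constraint q offers n_q(w_i) tied shortest-path indices, and the procedure walks the Cartesian product of choices, skipping combinations already tried.

```python
from itertools import product

# Our illustration of Step 3: constraint q offers n_q(w_i) tied shortest paths;
# walk the Cartesian product of index choices, skipping visited combinations.
def untried_combinations(n_per_constraint, visited):
    """Yield index tuples (i_1, ..., i_nI) that have not been examined yet."""
    for combo in product(*(range(1, n + 1) for n in n_per_constraint)):
        if combo not in visited:
            yield combo

# Two constraints with two tied shortest paths each and one with a unique path:
# nc = 2 * 2 * 1 = 4 combinations, of which (1, 1, 1) was used first.
remaining = list(untried_combinations([2, 2, 1], {(1, 1, 1)}))
assert remaining == [(1, 2, 1), (2, 1, 1), (2, 2, 1)]
```

A real implementation would generate these tuples lazily (as the thesis does, recursively in C) rather than materialize the product, since nc_i can be astronomically large.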
The reader might wonder if the (possibly costly) loop between Steps 3 and 1 is necessary. We now show that this is the case by providing a simple example, in which it is not sufficient to examine the (1, ..., 1) combination of shortest paths only, or even to consider every possible shortest path separately.
Consider the small graph, composed of 9 vertices and 11 arcs, shown in Figure 6.4.

Figure 6.4: A small example showing path combinations

Let us assume that [w̄]_i = 10 for i = 1, ..., 11, and consider the problem of minimizing (6.1) with m = 11 subject to 12 constraints of type (6.3), defined by

    o_1 = a,  d_1 = b,  u_1 = 10;
    o_2 = a,  d_2 = c,  u_2 = 10;
    o_3 = d,  d_3 = b,  u_3 = 5;
    o_4 = d,  d_4 = c,  u_4 = 5;
    o_5 = e,  d_5 = f,  u_5 = 10;
    o_6 = f,  d_6 = g,  u_6 = 10;
    o_7 = h,  d_7 = i,  u_7 = 10;
    o_8 = i,  d_8 = g,  u_8 = 10;
    o_9 = e,  d_9 = a,  u_9 = 5;
    o_10 = h, d_10 = a, u_10 = 5;
    o_11 = b, d_11 = g, u_11 = 5;
    o_12 = c, d_12 = g, u_12 = 5.    (6.16)
We directly see that, at any solution, all arcs but ad will have a weight equal to 5, since n_q = 1 for all q ≠ 1, 2. Suppose now that, for these latter constraints, the shortest paths have been ordered as in (6.16) and have been considered by the algorithm in that order. As a consequence, solving the problem in the feasible region F(w̄; 1, ..., 1) will give the solution [ŵ]_i = 5 (i = 1, ..., 11), since the shortest path from a to b and that from a to c both use vertex d. The objective function value at ŵ is 137.5.
Note now that, at ŵ, the shortest paths are not unique between the o-d pairs (a, b) and (a, c), since the paths a-f-b and a-i-c are also shortest. Furthermore, this set of weights can be improved by considering P(ŵ; 2, 2, ..., 1), whose solution has every arc weight equal to 5 except that of arc ad, which is equal to 10, and where the objective function has the value 125. Moreover, examining every possible shortest path separately would not allow any progress, because successively solving P(ŵ; 2, 1, ..., 1) and P(ŵ; 1, 2, ..., 1) still gives the same solution ŵ.
It is therefore crucial to consider every combination of shortest paths that are not unique at a potential solution.
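The two objective values quoted above are easy to verify, assuming (as elsewhere in this chapter) that (6.1) is the quadratic objective f(w) = ½‖w − w̄‖², with the a priori weights w̄_i = 10 of the example; the helper below is our own sketch, not the thesis code.

```python
# Check of the two objective values quoted above, assuming (6.1) is the usual
# quadratic objective f(w) = 0.5 * ||w - w_bar||^2 with a priori weights 10.
def f(w, w_bar=10.0):
    return 0.5 * sum((wi - w_bar) ** 2 for wi in w)

w_hat = [5.0] * 11               # solution of the (1, ..., 1) combination
assert f(w_hat) == 137.5

w_better = [5.0] * 10 + [10.0]   # better combination: arc ad kept at 10
assert f(w_better) == 125.0
```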

6.4.5 Some properties of the algorithm

In this section, we examine some properties of the algorithm proposed above. In particular, we show its termination and analyze the "stability" of the local solution it produces.
Theorem 6.2 Algorithm 6.1 above terminates in a finite number of iterations.

Proof. The number of paths between two vertices is nite, since the number of arcs m is
nite. As a consequence, the number of di erent convex polygons F (wi; Ci) computed at Steps 1,
and nci calculated in Step 3 are also nite. The algorithm consists of a sequence of convex inverse
shortest path problems di ering by the actual shortest paths used in the explicit description
of the constraints. Furthermore, each of these subproblems is considered at most once and is
solvable in a nite number of operations. The complete algorithm therefore also terminates in a
nite number of steps. 2
Let us consider the point ŵ obtained at termination of the algorithm. We now show that ŵ is a local minimum of our problem (6.1)-(6.3) and analyze the neighbourhood V(ŵ) around ŵ in which every other feasible point has a higher objective function value. In other words, we show that ŵ is "stable" as a local minimum in a neighbourhood V(ŵ) of ŵ in which all the explicit shortest paths defining the constraints (6.3) remain unchanged when they are unique. The solution's "stability" therefore depends on "how far" the second shortest paths are from ŵ.
Considering the q-th shortest path constraint, we denote the cost of the "optimal" shortest path from o_q to d_q by P_q^1, that is

    P_q^1 := Σ_{a ∈ p_q^1(ŵ)} [ŵ]_a.    (6.17)

We already mentioned that p_q^1(ŵ) may not be unique, although P_q^1 is. We then define a second shortest path from o_q to d_q as a path whose cost is closest to, but strictly larger than, that of p_q^1(ŵ), i.e. P_q^1. The first such second shortest path (in our predefined path order) is denoted, if it exists, by p_q^2(ŵ) and its cost by P_q^2. If p_q^2(ŵ) does not exist, then we set P_q^2 = ∞ by convention. With these additional notations, we are now in a position to state the next property of our algorithm.
Theorem 6.3 The point ŵ computed by Algorithm 6.1 is a local optimum of P(w), the original problem. Moreover, f(w) ≥ f(ŵ) for every w in

    V(ŵ) := { w ∈ F | ‖w − ŵ‖_1 < min_q [P_q^2 − P_q^1] },    (6.18)

where ‖·‖_1 is the usual ℓ_1-norm.
Proof. Let us consider the conditions under which p_q^1(ŵ) may vary around ŵ, and define a stability neighbourhood V_q(ŵ) associated with each shortest path constraint. Four cases need to be examined.
1. P_q^2 = ∞ and n_q = 1.
In this situation, the path from o_q to d_q is unique and p_q^1(w) is obviously constant for all w ∈ R^m. We then define V_q(ŵ) = R^m ∩ F = F.
2. P_q^2 = ∞ and n_q > 1.
There is now more than one path from o_q to d_q, but they all have the same cost P_q^1. In this case, an infinitesimal change in the costs ŵ may cause the feasible polygon defined at ŵ to change. However, since ŵ is a point produced by our algorithm, choosing any of the n_q − 1 other possible polygons does not produce an objective function decrease. This indicates that f(ŵ) may not be improved upon in the neighbourhood V_q(ŵ) = F.
3. P_q^2 ≠ ∞ and n_q = 1.
In this situation, the explicit description shortest path p_q^1(ŵ) will not change until its cost reaches that of the second shortest path. More precisely, p_q^1(w) is constant in the neighbourhood

    V_q(ŵ) = { w ∈ F : ‖w − ŵ‖_1 < P_q^2 − P_q^1 }.    (6.19)

4. P_q^2 ≠ ∞ and n_q > 1.
This is a combination of the two previous cases. As above, f(w) cannot be improved upon in the neighbourhood V_q(ŵ) = { w ∈ F : ‖w − ŵ‖_1 < P_q^2 − P_q^1 }.
Moreover, the algorithm's mechanism implies that we cannot find a point better than ŵ by considering all combinations of constraint definitions as examined above for a single constraint. As a consequence, ŵ will be a "stable" solution in the neighbourhood

    V(ŵ) = ∩_{q=1}^{n_I} V_q(ŵ) = { w ∈ F : ‖w − ŵ‖_1 < min_q [P_q^2 − P_q^1] }.    (6.20)

□
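For small graphs, the radius min_q [P_q^2 − P_q^1] appearing in (6.18) can be computed by brute force. The sketch below is our own (the names `path_costs` and `stability_radius` are not from the thesis); it enumerates simple paths by depth-first search, so it is only meant for toy instances.

```python
# Brute-force computation of the stability radius min_q [P_q^2 - P_q^1] of
# (6.18); 'arcs' maps (tail, head) -> weight.  Names are ours, for illustration.
def path_costs(arcs, origin, dest):
    """Costs of all simple paths from origin to dest (exponential: toy use only)."""
    adj = {}
    for (u, v), w in arcs.items():
        adj.setdefault(u, []).append((v, w))
    costs = []

    def dfs(node, cost, seen):
        if node == dest:
            costs.append(cost)
            return
        for nxt, w in adj.get(node, []):
            if nxt not in seen:
                dfs(nxt, cost + w, seen | {nxt})

    dfs(origin, 0.0, {origin})
    return costs

def stability_radius(arcs, od_pairs):
    """min over the o-d pairs of (second distinct path cost - shortest cost)."""
    gaps = []
    for o, d in od_pairs:
        distinct = sorted(set(path_costs(arcs, o, d)))
        if len(distinct) >= 2:          # a second shortest path p_q^2 exists
            gaps.append(distinct[1] - distinct[0])
    return min(gaps) if gaps else float("inf")  # P_q^2 = infinity by convention

# Two routes from a to b: direct (cost 3) and via c (cost 5); radius = 2.
arcs = {("a", "b"): 3.0, ("a", "c"): 2.0, ("c", "b"): 3.0}
assert stability_radius(arcs, [("a", "b")]) == 2.0
```

Note that tied shortest paths contribute no gap, matching the convention P_q^2 = ∞ when no strictly longer path exists.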
We now examine the case where the original ISP problem also features lower bounds on the costs of the shortest paths between given origins and destinations, that is, constraints of the type

    0 ≤ l_q ≤ Σ_{a ∈ p_q^1(w)} [w]_a,    (6.21)

for q = 1, ..., n_I (where l_q can be chosen as zero) and l_q ≤ u_q. These constraints are much easier to handle, because the inequality (6.21) must be satisfied for every possible path from o_q to d_q. Of course, the number of these linear constraints is typically very high, but the situation is entirely similar to that handled in Chapters 4 and 5. As a consequence, the technique developed in these last chapters is directly applicable to each convex subproblem arising in the course of the solution of problem (6.1)-(6.3), (6.21).

6.5 The reoptimization procedure

In this section, we analyse the reoptimization procedure of Step 2 of our algorithm and give some results about updating the objective function f and the variables of the problem. Let us first introduce some notations.

6.5.1 Notations
Let us consider the problem P(w; i_1, ..., i_{n_I}). If we "freeze" the arc weight values w, we may rewrite the i-th constraint of type (6.3) as

    E_i(w) := n_i^T w − b_i ≥ 0    (i = 1, ..., n_I),    (6.22)

where n_i ∈ R^m and b_i = −u_i for i = 1, ..., n_I. The vector n_i represents the normal to the i-th constraint. The matrix of the normal vectors of the constraints in the active set indexed by A will be denoted by N. A⁻ will denote a subset of A containing one fewer element than A, and N⁻ will represent the matrix of normals corresponding to A⁻. The normal n_r will designate the column deleted from N to give N⁻. The index set A⁻ then designates A \ {r}.
Since the Hessian G of the objective function (6.1) equals the identity, the Moore-Penrose generalized inverse of N in the space of variables under the transformation y = G^{1/2} w simply is

    N* := (N^T N)^{-1} N^T,    (6.23)

and

    H := (I − N N*),    (6.24)

the orthogonal projection onto the null-space of N^T, is then the inverse reduced Hessian of the quadratic objective function in the subspace of weights satisfying the active constraints. Denoting the gradient of f by g(w) = w − w̄, we designate the Lagrange multipliers at the point w by u(w). Let us define P(A) as the problem of minimizing (6.1) subject to the subset of constraints (6.22) indexed by A and considered as equalities. As proved in Chapter 3, at the optimal solution ŵ of problem P(A), we can write (3.33) and (3.34) as

    u(ŵ) ≡ N* g(ŵ) ≥ 0,    (6.25)

and

    H g(ŵ) = 0,    (6.26)

respectively. This formulation comes from the fact that g(ŵ) is a linear combination of the columns of N, g(ŵ) = N u(ŵ), as is implied by the first order condition (6.25). Remember that conditions (6.25) and (6.26) are also sufficient to characterize ŵ.
Finally, H⁻ will denote the operator (4.9) with N replaced by N⁻, and we will use similar notations for u.

6.5.2 How to reoptimize

Suppose that ŵ is the solution to the problem P(A), and that there exists 1 ≤ i ≤ n_I such that

    Σ_{a ∈ p_i^1(ŵ)} [ŵ]_a < Σ_{a ∈ p_i} [ŵ]_a,    (6.27)

where p_i is the path from o_i to d_i that was considered as shortest between these two vertices just before reaching ŵ as current solution. Let r refer to the index of the explicit active constraint associated with the path p_i which is no longer shortest (this supposes that r ∈ A). The needed reoptimization then consists of finding w*, the optimal solution to P(A⁻) (since constraint r ceases being active), that is,

    w* = min_{w ∈ M} f(w),    (6.28)

where M = { w ∈ R^m | n_i^T w = b_i, i ∈ A⁻ }. Since ŵ ∈ M, w* is attained at the well-known point ŵ − H⁻ g(ŵ) (see Theorem 3.6). This solves the reoptimization in the primal space.
Now, in order to update the Lagrange multipliers u(ŵ), we need the steplength t such that

    w* = ŵ + t s,    (6.29)

where s is the primal step direction solving the reoptimization starting from ŵ. Then,

    u(w*) = u(ŵ) + t d,    (6.30)

where d is the dual step direction solving the reoptimization. We are now concerned with finding t and d.
The method to find w* from ŵ is precisely the reverse of that computing ŵ from w*, that is, computing the solution of P(A⁻ ∪ {r}) knowing that of P(A⁻). Figure 6.5 shows this feature when |A| = 2. The reverse problem is solved via the method of Goldfarb and Idnani (see Chapter 3).
Since n_r is linearly independent from {n_i}_{i ∈ A⁻}, we apply that method in the case where the constraint to be added (of normal n_r) is linearly independent from the active set A⁻. Assuming that w* is the optimal solution to P(A⁻), we obtain the optimum of P(A⁻ ∪ {r}) at ŵ = w* + t s′ by Lemma 3.7, where

    s′ = H⁻ n_r.    (6.31)

The corresponding Lagrange multipliers are u(ŵ) = (u⁻(w*), 0)^T + t (−d⁻, 1)^T, with

    d⁻ = (N⁻)* n_r    (6.32)
Figure 6.5: Solving P(A) and P(A⁻): one iteration

and

    t = u_r(ŵ) = − E_r(w*) / ((s′)^T n_r).    (6.33)

Since s = −s′ and d = (d⁻, −1)^T, we have proved the following theorem.
Theorem 6.4 If ŵ is the solution of P(A), and if s = −H⁻ n_r, then the weight vector w* = ŵ + t s, with t = u_r(ŵ), verifies the optimality conditions of P(A⁻), that is, the primal optimality of w*,

    H⁻ g(w*) = 0,    (6.34)

the primal feasibility of w*,

    E_i(w*) ≡ n_i^T w* − b_i ≥ 0    (i ∈ A⁻),    (6.35)

and the dual feasibility of w*,

    u⁻(w*) ≡ (N⁻)* g(w*) ≥ 0.    (6.36)

As a consequence, note that

    f(w*) − f(ŵ) = ½ [u_r(ŵ)]² n_r^T s ≤ 0,    (6.37)

because n_r^T s ≤ 0, since H⁻ is positive semi-definite.
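The sign claim in (6.37) can be checked numerically on the smallest possible instance: one active equality constraint in R² with G = I, so that dropping it leaves A⁻ = ∅, hence H⁻ = I, and the step of Theorem 6.4 reduces to s = −n_r with t = u_r(ŵ). This is our own sketch under those simplifying assumptions, not the thesis implementation.

```python
# Smallest reoptimization instance: f(w) = 0.5*||w - w_bar||^2 in R^2 with one
# active constraint n_r^T w = b.  Dropping it gives A^- empty, H^- = I, and the
# step of Theorem 6.4 becomes s = -n_r, t = u_r(w_hat).  Illustration only.
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

w_bar = [10.0, 10.0]
n_r, b = [1.0, 1.0], 5.0

# Solution of P(A): projection of w_bar onto the hyperplane n_r^T w = b.
alpha = (dot(n_r, w_bar) - b) / dot(n_r, n_r)
w_hat = [wb - alpha * nr for wb, nr in zip(w_bar, n_r)]

# Multiplier from the stationarity condition g(w_hat) = w_hat - w_bar = u_r * n_r.
u_r = dot(n_r, [wh - wb for wh, wb in zip(w_hat, w_bar)]) / dot(n_r, n_r)

s = [-x for x in n_r]                       # s = -H^- n_r with H^- = I
w_star = [wh + u_r * si for wh, si in zip(w_hat, s)]

def f(w):
    return 0.5 * sum((wi - wb) ** 2 for wi, wb in zip(w, w_bar))

assert w_star == w_bar                      # the unconstrained minimizer
assert abs((f(w_star) - f(w_hat)) - 0.5 * u_r ** 2 * dot(n_r, s)) < 1e-12  # (6.37)
assert f(w_star) <= f(w_hat)                # dropping a constraint cannot hurt
```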

6.6 Some numerical experiments

We now present some results obtained with a preliminary implementation of the algorithm described above.
6.6.1 Implementation details

Our method has been implemented in double precision Fortran 77 and C and has been run on a DECstation 3100 under Ultrix. This implementation has been very challenging, especially the part related to the combinatorial strategy coming into play when the shortest paths are not unique at a potential solution. Indeed, remember that in this case, we must enumerate and examine all shortest path combinations. First, we had to modify our shortest path algorithm in order to be able to compute how many shortest paths there were between two vertices, and to compute the k-th shortest path when needed, because it was obviously out of the question to store them all. This has been achieved recursively in the C language. Then we set up a combinatorial procedure to determine a "not yet visited" shortest path combination, again recursively (in C), without storing any further information. These C procedures were called from Fortran subroutines. A last difficult point has been to detect a change in the feasible region: this has been achieved by comparing temporary files containing active shortest path descriptions at different stages of the algorithm.
Our program selects among possible active shortest paths in Step 3 by examining first the paths that differ as close to the destination as possible, ties being broken by considering vertices in their numbering order. The shortest path calculations are performed by Johnson's variant of Dijkstra's algorithm using a binary heap (see Chapter 2 for its description).

6.6.2 Tests
We have selected a few problems whose graphs, shortest path constraints and a priori weights have been generated in different ways. The problems and their characteristics are summarized in Table 6.1. In this table, the heading "vertices" refers to the number of vertices in the graph, "graph type" indicates how the network is generated, "weights" indicates how the a priori weights are chosen (a layered choice means that subsets of arcs were chosen with constant costs, corresponding to grid levels in the case of grid-like graphs), and "constraints" indicates how the shortest path constraints are chosen: either by choosing origins and destinations at random or by choosing them along the faces of the grids, when applicable.

Problem   vertices    m    n_I   graph type   weights    constraints
Example        9     11     12   See Section 6.4.4
P1           100    180    125   2D grid      constant   random + faces
P2           181    504    100   2D grid      constant   faces
P3           100    180    100   2D grid      layered    random
P4           100    210    100   random       random     random
P5           100    180     70   2D grid      layered    faces
P6           181    504    100   2D grid      layered    faces
P7           500    769     26   random       layered    random
P8           300    860    400   random       random     random

Table 6.1: Test examples and their characteristics

All these problems but P1 were solved, in the sense that a local minimum was found for each of them.
The results of applying our pilot code to these problems are reported in Table 6.2. In this table, i and nc = Σ_{j=1}^{i} nc_j refer to the number of iterations of the algorithm and the total number of possible active shortest path combinations, respectively, as described in Section 6.4.4. The column "comb" indicates how many of the nc path combinations were effectively examined by the algorithm before termination. The symbol "-" means that it has not been possible to solve problem P1 in less than a week on our workstation.

Problem    i       nc    comb
Example    2        4       4
P1         -   ≈ 10^9       -
P2         2       48       1
P3         1        0       0
P4         1        0       0
P5         1        0       0
P6         2        0       0
P7         1        0       0
P8         1        0       0

Table 6.2: Results for the test problems

The following comments can be made on these results.

1. Many of our problems were solved with a single iteration (i = 1) of our algorithm, but not all of them. However, the number of iterations remains small on these examples.
2. As expected, problems with randomly generated weights were solved in a single iteration.
3. As shown by the behaviour of the algorithm on problem P1, the combinatorial aspect of the method may appear in practice. The development of better heuristics to improve the choice of the active path combination therefore seems useful. Comments about P1's results are made below.
4. The detection of paths with equal costs is nontrivial in finite precision. We have chosen to consider all paths whose relative costs differ by at most one hundred times machine precision. Further consideration should probably be given to this potentially important stability issue.
Many tests were performed on 2D grid graphs (see Table 6.1). The results of Table 6.2 indicate more than one iteration of Algorithm 6.1 only when applied to these graphs, and particularly when the constraints are generated between faces of the grid. When generated at random, the shortest paths are very likely "independent", in the sense that their explicit definitions should share very few arcs and should be "far" from each other. As a consequence, activating such a constraint would not influence the explicit definition of other shortest path constraints. In contrast, as many shortest paths are defined between faces of a grid, their explicit definitions should share many arcs and usually be "close" to each other, so that modifying arc weights on one of them is very likely to modify in turn the explicit definition of proximate shortest paths.
Table 6.2 also indicates that the combinatorial nature of the problem showed up again only with grid graphs. This is due to the fact that the nonuniqueness of the shortest paths depends on the density of the graph m/n: an increase in that value tends to reduce the nonuniqueness of the shortest paths. This trend in 2D grids has been analysed by Moser in his Ph.D. thesis [87]. We report here one interesting result he mentioned in his report.
Let us consider a 2D grid and denote its nodes by a double index (i, j) specifying their Cartesian position in the grid. The number of shortest paths from node (0, 0) to node (i, j) is designated by s(i, j). We choose the simplest 2D grid, which is squared or cross-ruled like the streets of New York, where each link or arc has the same weight. For convenience, we do not mention the grid size and will refer to the entire grid by specifying i, j ≥ 0. In such grids, s(i, j) can be recursively computed by the following relation:

    s(i+1, j+1) = s(i+1, j) + s(i, j+1),   for i, j ≥ 0,    (6.38)

with the initial condition that

    s(i, 0) = s(0, j) = 1,   for i, j ≥ 0.    (6.39)
Indeed, there are two shortest paths from (i, j) to (i+1, j+1), one going through node (i+1, j) and the other through node (i, j+1). We can make use of an auxiliary function f(x, y) in order to solve (6.38) knowing (6.39):

    f(x, y) := Σ_{i,j ≥ 0} s(i, j) x^i y^j,    (6.40)

where

    0 ≤ x, y < 1.    (6.41)

The function f(x, y) can be developed as follows:

    f(x, y) = 1 + Σ_{i≥1} s(i, 0) x^i + Σ_{j≥1} s(0, j) y^j + Σ_{i,j≥1} s(i, j) x^i y^j                 [by (6.39)]
            = 1 + x/(1−x) + y/(1−y) + Σ_{i,j≥1} s(i, j−1) x^i y^j + Σ_{i,j≥1} s(i−1, j) x^i y^j        [by (6.38)]
            = 1 + x/(1−x) + y/(1−y) + y Σ_{i≥1, j≥0} s(i, j) x^i y^j + x Σ_{i≥0, j≥1} s(i, j) x^i y^j
            = 1 + x/(1−x) + y/(1−y) + y [f(x, y) − Σ_{j≥0} s(0, j) y^j] + x [f(x, y) − Σ_{i≥0} s(i, 0) x^i]
            = 1 + (x + y) f(x, y).    (6.42)

We then have that

    f(x, y) = 1/(1 − x − y)
            = Σ_{k≥0} (x + y)^k
            = Σ_{k≥0} Σ_{i=0}^{k} C(k, i) x^i y^{k−i}
            = Σ_{i,j≥0} C(i+j, i) x^i y^j,    (6.43)

where C(b, a) = b!/[a!(b−a)!] are the binomial coefficients, which represent the number of possible ways of choosing a items amongst a set of b items without distinguishing their order and without taking the same item twice.
As a consequence, (6.40) and (6.43) give

    s(i, j) = C(i+j, i) = (i+j)!/(i! j!).    (6.44)
It is easy to check that s(i, j) grows very rapidly even when (i, j) lies not far from (0, 0) (for instance, s(10, 10) = 184756).
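The recurrence (6.38)-(6.39) and the closed form (6.44) can be cross-checked directly; the short dynamic program below is our own sketch.

```python
from math import comb

# Dynamic program for the recurrence (6.38) with boundary condition (6.39),
# cross-checked against the closed form (6.44): s(i, j) = C(i + j, i).
def shortest_path_counts(imax, jmax):
    s = [[1] * (jmax + 1) for _ in range(imax + 1)]   # s(i, 0) = s(0, j) = 1
    for i in range(1, imax + 1):
        for j in range(1, jmax + 1):
            s[i][j] = s[i - 1][j] + s[i][j - 1]       # recurrence (6.38)
    return s

grid = shortest_path_counts(10, 10)
assert all(grid[i][j] == comb(i + j, i) for i in range(11) for j in range(11))
assert grid[10][10] == 184756                          # the value quoted above
```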
Moser mentioned similar results for 2D grids with higher densities. He observed that s(i, j) decreases very rapidly when each node originates an increasing number of arcs. However, this does not mean that shortest path uniqueness is attained with high densities. Moser extrapolated his observations and conjectured that shortest paths are uniquely determined only for nodes on a few straight lines through (0, 0).(2)
This explains the potential nonuniqueness that is present in our grids, especially those built with constant arc weights. The resolution of problem P1 suffered from this potential nonuniqueness. Let us finally mention that, amongst all generated problems (not only those presented in this section), problem P1 is the only one which presented such a strong combinatorial aspect (nc ≈ 10^9).

(2) Grids generated by Moser notably include square diagonals, which allows several nodes to be reached by straight lines from (0, 0).
7  Conclusion and perspectives
This thesis deals with instances of the inverse shortest path problem. Significant questions arising in applied mathematics can be formulated as instances of this inverse problem. Chapter 1 showed the pertinence of such a formulation for problems arising in traffic modelling and seismic tomography. In a typical inverse shortest path problem, we want to recover some attributes of a network, knowing information about the shortest paths between certain origin-destination pairs from observing actual flow on the network. To make the solution unique, we assume that a set of attributes is a priori known, and we would like the solution to be as close as possible to these a priori known values. Thus, the objective is to minimize the norm of the difference between the solution and the known vector of attributes. The constraints ensure that the shortest paths are the same or verify the same properties under the solution, and may represent other relationships between the paths or variables of the problem. Solving the problem requires an algorithm solving the direct problem (the shortest path problem) and an algorithmic framework that depends on the choice of the norm for evaluating the proximity of the solution to the a priori values. Chapter 2 discussed the choice of a shortest path algorithm with respect to the properties of the graphs representing the networks under study. Johnson's algorithm was selected to solve the direct problem in our context. This method could possibly be combined with updating techniques for efficiency. Chapter 3 examined the quadratic programming context and finally preferred the Goldfarb and Idnani approach for our problem. The need to handle an exponential number of linear constraints involving much redundancy has guided this choice.
In the uncorrelated inverse shortest path problem, the variables are the weights on the arcs
and there is no correlation between them. In the correlated problem, however, the arcs are
divided into classes, and the weights of the arcs in the same class are derived from the same value
which is referred to as the class' density. Thus, the weights within each group are correlated, and
the variables are actually the densities. Correlation of course implies more restriction and hence
more constraints.
In Chapter 4, the uncorrelated inverse shortest paths problem has been posed and a computational algorithm has been proposed for one of the many problem specifications: the constraints are given as a set of shortest paths and nonnegativity constraints on the weights. The proposed algorithm has been programmed and run on a few examples, in order to prove the feasibility of the approach.
Chapter 5 provides a modified method for solving the inverse shortest path problem with correlated arc weights. To achieve this goal, we generalized the inverse shortest path method of Chapter 4 to take the desired correlation into account. We derived new expressions for the primal step and other quantities in the algorithm. We tested our new algorithm on a wide class of correlated problems and compared it with the original uncorrelated method. Finally, two possible strategies for handling constraints were considered and compared in this context.
Finally, in Chapter 6, we have presented and motivated the inverse shortest path problem with upper bounds on shortest path costs. These constraints may no longer be expressed as sets of linear constraints. The resulting feasible region may therefore be non-convex. The NP-completeness of finding a global solution of this problem has then been shown. An algorithm for local minimization has been presented, analyzed and tested on a few examples.
In this thesis, we have supplied algorithms to solve the main common instances of the inverse shortest path problem. The possible extensions are many. New perspectives should concern the use of other norms in the objective function. The resulting (stronger) non-linearity would then call for different approaches. Other types of constraint specifications are also of obvious interest. The problem of recovering attributes of the second (third, ...) shortest paths is also a challenging area (for instance, this could help in retrieving the successive shortest time waves in seismic tomography). Further research could cover heuristics for active path selection and stability analysis. We are also interested in applying the algorithms discussed in this thesis to practical cases in traffic engineering and computerized tomography.
Extensions or applications of the inverse shortest path problem have been informally proposed after several conferences given on the subject. A. Lucena (London 1991) perceived an application of our method to the update of Lagrange multipliers in an algorithm solving the time-dependent travelling salesman problem [80]. Another application, suggested by S. Boyd (August 1992), resides in finding or estimating transition probabilities in Markov chains given the maximum likelihood path. The formulation of this last problem includes an objective function that does not depend on a usual norm, but involves a "max" function. The objective function has the property of remaining convex in this particular definition. The connection between this problem and ours can be seen by identifying the vertices with states or events of a Markov chain, arcs with possible transitions, paths with sequences of states, and arc weights with transition probabilities. We hope that numerous such extensions will show up in future research.
A  Symbol Index

This appendix is intended to help the reader in readily retrieving the meaning of a symbol used in Chapters 4-6.

Symbol   Purpose                                                          Definition           Page
a_k      k-th arc of the oriented graph (V, A, w)                         Section 4.1          55
b_i      constant term in the linear constraint E_i(x)                    (4.7)                57
c_l      class number l of the set C                                      Section 5.2.1        73
d_l      density value of the class c_l                                   Section 5.2.1        73
d̄_l      a priori expected value of d_l                                   Section 5.2.4        76
e_q      q-th vector of the canonical basis in R^L
f        objective function of the minimization problem                   (4.5), (5.17)        56, 76
h        number of constraints in the inverse shortest path problem       (4.7)                57
m        number of arcs in A                                              Section 4.1          55
n        number of vertices in V                                          Section 4.1          55
nc_i     number of shortest path combinations at the i-th iteration      Algo. 6.1            103
n_i      normal to the i-th constraint E_i                                (4.7)                57
n_E      number of explicit shortest path constraints                     (4.1)                55
n_I      number of implicit shortest path constraints                     (6.3)                97
p_j      j-th explicit path constraint                                    (4.1)                55
p'_j     any path with same origin and destination as p_j                 (4.6)                56
q        index of the current violated constraint                         (4.16)               59
r        dual step direction                                              (4.16)               59
s        primal step direction                                            (4.34), (5.77)       63, 87
s(k)     index of the vertex at the origin of the arc a_k (its "source vertex")   Section 4.1  55
t_c      steplength to satisfy the q-th constraint                        (3.41)               47
t_f      the maximum steplength to preserve dual feasibility              (3.40)               47
t(k)     index of the vertex at the end of the arc a_k (its "target vertex")      Section 4.1  55
u        the vector of the Lagrange multipliers, i.e. the dual variables  (3.33)               46
w_i      weight of the arc a_i                                            Section 4.1, (5.5)   55, 74
w̄_i      a priori expected value of w_i                                   Section 4.1, (5.87)  55, 88
y_i      i-th component of the product N^T n_q                            (4.21)               60
z        intermediate vector to compute r by triangular substitution      (4.23)               60
Symbol   Purpose                                                          Definition           Page

A        set of active constraints                                        Sect. 4.2.1 & 4.2.5  57, 61
C_i      the shortest path combination at iteration i                     Algo. 6.1            103
D        dependent set on the class densities                             (5.29)               80
D^+      the positive shore of D                                          (5.30)               80
D^-      the negative shore of D                                          (5.31)               80
D^+(l)   set of dependent constraints involving the class c_l in their positive shore   (5.59), (5.62)   84
D^-(l)   set of dependent constraints involving the class c_l in their negative shore   (5.59), (5.62)   84
E_i(x)   general form of a constraint                                     (4.7)                57
F^+(l)   union of the sets D^+(l) and I^+(l)                              (5.63)               85
F^-(l)   union of the sets D^-(l) and I^-(l)                              (5.63)               85
G        Hessian matrix of the objective function                         (4.16)               59
H        the reduced inverse Hessian of f in the subspace of points satisfying the active constraints   (4.9)   57
I        island on the arc costs, union of I^+ and I^-                    Section 4.2.2        58
I^+      the positive shore of I                                          Section 4.2.2        58
I^-      the negative shore of I                                          Section 4.2.2        58
I^+(l)   set of island constraints involving an arc belonging to the class c_l in their positive shore   (4.28), (5.61)   61, 84
I^-(l)   set of island constraints involving an arc belonging to the class c_l in their negative shore   (4.28), (5.61)   61, 84
J        general constraint interpreted as two shores, whether it is an island I or a dependent set D   Section 5.4.1   80
J^+      the positive shore of J                                          Section 5.4.1        80
J^-      the negative shore of J                                          Section 5.4.1        80
L        number of classes in C                                           Section 5.2.1        73
N        matrix whose columns are the n_i, the normals of the active constraints   Section 4.2.1   57
N*       Moore-Penrose generalized inverse of N                           (4.8)                57
P(w)     problem of minimizing f over F(w)                                                     102
P_j      set of vertices attained by the path p_j                         (4.10)               58
Q        orthogonal matrix such that N = QU                               (4.18)               60
R        triangular factor matrix                                         (4.17)               59
S        set of (non-equality) constraints such that the matching r component is strictly positive   (3.39)   47
U        triangular factor matrix such that N = QU                        (4.18)               60
V        set of active constraints                                        (4.2.2)              59
V_I      set of currently active islands                                  (5.52)               83
V_D      set of currently active dependent sets                           (5.51)               83
V_E      set of currently active equality constraints                     (5.86)               88
X        set of the indices of the classes that are involved in active constraints   (4.26), (5.57)   61, 84
X_I      set of the indices of the classes that are involved in active islands       (5.56)           84
X_D      set of the indices of the classes that are involved in active dependent sets   (5.55)        84
Y        set of the indices of the classes whose density is currently at a bound     (4.25), (5.54)   61, 84
Y_0      set of the indices of the classes whose density is currently zero           (5.53)           83
Z        set of the indices of the classes that are not involved in the active constraints   (4.27), (5.58)   61, 84
Symbol   Purpose                                                          Definition           Page

ℓ        index of the constraint to drop out of the active set            (3.40)               47
ℓ(i)     index of the class with which the arc a_i is associated          (5.5)                74
A        set of the arcs of the graph, numbered from 1 to m               Section 4.1          55
B(i)     set of class indices for computing N^T n_q                       (5.41)               82
C        set of classes numbered from 1 to L                              Section 5.2.1        73
D        set of class indices for computing s^T n_q                       (5.79)               87
F(w)     convex feasible region determined at w                           (6.15)               102
E        set indexing the equality constraints                            (5.14)               76
I        set indexing the inequality constraints                          (5.13)               75
V        set of vertices in the graph, numbered from 1 to n               Section 4.1          55
l the bound value at which the class density dl is currently fixed, when the bound is active (5.54) 84
i proportion factor on the arc ai to determine the cost wi, when multiplied by the class density dℓ(i) (5.5) 74
il general coefficients defining the constraints on the class densities (5.13), (5.14) 75
j number of arcs in the path pj (4.1) 55
number of currently active constraints, i.e. |A| Section 4.2.7 65
i constant term of the constraints on the class densities (5.13), (5.14) 75
no,nd lower bound value on the weight of a shortest path (5.12) 75
,l(D) 1 if the class cl belongs to the dependent shore D, 0 otherwise (5.38) 81
,l(I) sum of the proportional factors i related to the arcs ai of (l) belonging to the island shore I (5.37) 81
(l) the set of the arcs associated with the class cl (5.36) 81
Bibliography

[1] A.V. Aho, J.E. Hopcroft and J.D. Ullman, The Design and Analysis of Computer Algorithms, Addison-Wesley, Reading, MA, 1974.
[2] S.S. Anderson, Graph theory and finite combinatorics, Markham Publishing Company, Chicago, 180 pp., 1970.
[3] M. Avriel, Nonlinear Programming: Analysis and Methods, Prentice-Hall, Inc., Englewood Cliffs, NJ, 512 pp., 1976.
[4] E.M.L. Beale, "On Quadratic Programming", Naval Research Logistics Quarterly, vol. 6, pp. 227–244, 1959.
[5] R. Bellman, "On a routing problem", Quart. Appl. Math., vol. 16, pp. 87–90, 1958.
[6] D.P. Bertsekas, "A new algorithm for the assignment problem", Mathematical Programming, vol. 21, pp. 152–171, 1981.
[7] D.P. Bertsekas, "The Auction Algorithm for Assignment and Other Network Flows Problems: A Tutorial", INTERFACES, vol. 20:4, pp. 133–149, July–August 1990.
[8] D.P. Bertsekas, "An auction algorithm for shortest paths", Lab. for Information and Decision Systems Report P-2000, MIT, Cambridge, MA, 1990, revised February 1991. SIAM Journal on Optimization, to appear.
[9] D.P. Bertsekas, Linear Network Optimization: Algorithms and Codes, The MIT Press, Cambridge, Massachusetts, 359 pp., 1991.
[10] J.C.G. Boot, "Notes on quadratic programming: the Kuhn-Tucker and Theil-van de Panne conditions, degeneracy, and equality constraints", Management Science, vol. 8, No. 1, pp. 85–98, October 1961.
[11] J.C.G. Boot, "On trivial and binding constraints in programming problems", Management Science, vol. 8, pp. 419–441, 1962.
[12] J.C.G. Boot, Quadratic Programming, North-Holland Publishing Co., Amsterdam, 213 pp., 1964.

[13] P.H.L. Bovy and E. Stern, Route Choice: Wayfinding in Transport Networks, Kluwer Academic Publishers, Dordrecht, 1990.
[14] D. Burton, Analyse et implementation de methodes des plus courts chemins dans un reseau urbain, Master's Thesis, Facultes Universitaires Notre-Dame de la Paix, Namur, 1986.
[15] D. Burton and Ph.L. Toint, "On an instance of the inverse shortest paths problem", Mathematical Programming, vol. 53, pp. 45–61, 1992.
[16] D. Burton and Ph.L. Toint, "On the use of an inverse shortest paths algorithm for recovering linearly correlated costs", Mathematical Programming (to appear), 1993.
[17] D. Burton, B. Pulleyblank and Ph.L. Toint, "The inverse shortest path problem with upper bounds on shortest path costs", Internal Report, FUNDP, (submitted to ORSA Journal on Computing), 1993.
[18] P.H. Calamai and A.R. Conn, "A stable algorithm for solving the multifacility location problem involving Euclidean distances", SIAM Journal on Scientific and Statistical Computing, vol. 4, pp. 512–525, 1980.
[19] P.H. Calamai and A.R. Conn, "A second-order method for solving the continuous multifacility location problem", in: G.A. Watson, ed., Numerical Analysis: Proceedings of the Ninth Biennial Conference, Dundee, Scotland, Lecture Notes in Mathematics 912, Springer-Verlag (Berlin, Heidelberg and New York), pp. 1–25, 1982.
[20] P.H. Calamai and A.R. Conn, "A projected Newton method for lp norm location problems", Mathematical Programming, vol. 38, pp. 75–109, 1987.
[21] A. Cayley, "On the theory of the analytical forms called trees", Philos. Mag., vol. 13, pp. 172–176, 1857. Mathematical Papers, Cambridge, vol. 3, pp. 242–246, 1891.
[22] A.R. Conn, "Constrained optimization using non-differentiable penalty functions", SIAM Journal on Numerical Analysis, vol. 10, pp. 760–784, 1973.
[23] A.R. Conn and J.W. Sinclair, "Quadratic programming via a non-differentiable penalty function", Department of Combinatorics and Optimization, University of Waterloo, Rep. CORR 75-15, 1975.
[24] S. Cook, "The complexity of Theorem Proving Procedures", Proc. 3rd Ann. ACM Symp. on Theory of Computing, Association for Computing Machinery, New York, pp. 151–158, 1971.
[25] W.-K. Chen, Applied graph theory, North-Holland Publishing Company, 484 pp., 1971.
[26] N. Christofides, Graph Theory. An algorithmic approach, Academic Press, London, 400 pp., 1975.
[27] G.B. Dantzig, Quadratic Programming, A Variant of the Wolfe-Markowitz Algorithm, Research Report 2, Operations Research Center, University of California, Berkeley, 1961.
[28] E.V. Denardo and B.L. Fox, "Shortest-route methods: 1. reaching, pruning, and buckets", Operations Res., vol. 27, pp. 161–186, 1979.
[29] N. Deo, Graph Theory with Applications to Engineering and Computer Science, Prentice-Hall, Englewood Cliffs, NJ, 478 pp., 1974.
[30] N. Deo and C. Pang, "Shortest-Path Algorithms: Taxonomy and Annotation", Networks, vol. 14, pp. 275–323, 1984.
[31] R.B. Dial, "Algorithm 360: Shortest path forest with topological ordering", Commun. ACM, vol. 12, pp. 632–633, 1969.
[32] R.B. Dial, "A probabilistic multipath traffic assignment model which obviates path enumeration", Transportation Research, vol. 5, pp. 83–111, 1971.
[33] R.B. Dial, F. Glover, D. Karney and D. Klingman, "A computational analysis of alternative algorithms and labeling techniques for finding shortest path trees", Networks, vol. 9, pp. 215–248, 1979.
[34] E.W. Dijkstra, "A note on two problems in connexion with graphs", Numerische Mathematik, vol. 1, pp. 269–271, 1959.
[35] K.A. Dines and R.J. Lytle, "Computerized geophysical tomography", Proc. IEEE, vol. 67, pp. 1065–1073, 1979.
[36] R.M. Downs and D. Stea, Maps in minds, Harper and Row, New York, 1977.
[37] S.E. Dreyfus, "An appraisal of some shortest path algorithms", Operations Res., vol. 17, pp. 393–412, 1969.
[38] S.M. Easa, "Shortest route with movement prohibition", Transportation Research B, vol. 19, nr. 3, pp. 197–208, 1985.
[39] L. Euler, "Solutio problematis ad geometriam situs pertinentis", Comment. Academiae Sci. Imp. Petropolitanae, vol. 8, pp. 128–140, 1736. Opera Omnia Series, vol. I-7, pp. 1–10, 1766. English translation in "The Königsberg bridges", Sci. Amer., vol. 189, pp. 66–70, July 1953.
[40] R. Fletcher, "A general quadratic programming algorithm", Journal of the Institute of Mathematics and its Applications, vol. 7, pp. 76–91, 1971.
[41] R. Fletcher, Practical Methods of Optimization, 2nd Edition, Wiley-Interscience, 436 pp., 1987.
[42] M. Florian, S. Nguyen and S. Pallottino, "A Dual Simplex Algorithm for Finding all Shortest Paths", Networks, vol. 11, pp. 367–378, 1981.
[43] R.W. Floyd, "Algorithm 97: shortest path", Comm. ACM, vol. 5, p. 345, 1962.
[44] L.R. Ford, "Network flow theory", Report P-923, The Rand Corporation, Santa Monica, CA, 1956.
[45] L.R. Ford and D.R. Fulkerson, Flows in networks, Princeton University Press, Princeton, NJ, 1962.
[46] M. Frank and P. Wolfe, "An algorithm for quadratic programming", Naval Research Logistics Quarterly, vol. 3, pp. 95–110, 1956.
[47] S. Fujishige, "A note on the problem of updating shortest paths", Networks, vol. 11, pp. 317–319, 1981.
[48] G. Gallo and S. Pallottino, "Shortest path methods: A unifying approach", Mathematical Programming Study, vol. 26, pp. 38–64, 1986.
[49] G. Gallo and S. Pallottino, "Shortest path algorithms", Annals of Operations Research, vol. 13, pp. 3–79, 1988.
[50] M.R. Garey and D.S. Johnson, Computers and intractability. A guide to the theory of NP-Completeness, W.H. Freeman and Company, San Francisco, 1979.
[51] J.A. George and J.W. Liu, Computer solution of large positive definite systems, Prentice-Hall, Englewood Cliffs, 1981.
[52] P.E. Gill and W. Murray, "Numerically stable methods for quadratic programming", Mathematical Programming, vol. 14, pp. 349–372, 1978.
[53] D. Goldfarb, "Extension of Newton's method and simplex methods for solving quadratic programs", in: F.A. Lootsma, ed., Numerical methods for non-linear optimization, Academic Press, London, pp. 239–254, 1972.
[54] D. Goldfarb, J. Hao and S.-R. Kai, "Shortest path algorithms using dynamic breadth-first search", Networks, vol. 21, pp. 29–50, 1991.
[55] D. Goldfarb and A. Idnani, "A Numerically Stable Dual Method for Solving Strictly Convex Quadratic Programs", Mathematical Programming, vol. 27, pp. 1–33, 1983.
[56] G.H. Golub and C.F. van Loan, Matrix computations, North Oxford Academic, Oxford, 476 pp., 1983.
[57] M. Gondran and M. Minoux, Graphes et algorithmes, 2nd edition, Editions Eyrolles, Paris, 546 pp., 1990. English translation by S. Vajda, Graphs and Algorithms, Wiley-Interscience, NY, 1984.
[58] A.S. Goncalves, "A primal-dual method for quadratic programming with bounded variables", in: F.A. Lootsma, ed., Numerical methods for non-linear optimization, Academic Press, London, pp. 255–263, 1972.
[59] S. Goto, T. Ohtsuki and T. Yoshimura, "Sparse matrix techniques for the shortest path problem", IEEE Trans. Circuits and Systems, CAS-23, pp. 752–758, 1976.
[60] F. Glover, R. Glover and D. Klingman, "Computational study of an improved shortest path algorithm", Networks, vol. 14, pp. 25–, 1984.
[61] W.R. Hamilton, "Account of the icosian calculus", Proc. Roy. Irish Acad., vol. 6, pp. 415–416, 1853–7.
[62] D.Y. Handler and P.B. Mirchandani, Location on Networks: Theory and Algorithms, The MIT Press, Cambridge, Massachusetts, 233 pp., 1979.
[63] F. Harary, Graph theory, Addison-Wesley Publishing Company, 274 pp., 1972.
[64] F. Harary, R.Z. Norman and D. Cartwright, Structural Models: An Introduction to the Theory of Directed Graphs, John Wiley & Sons, Inc., New York, 1965.
[65] G.T. Herman, Image reconstruction from projections: the fundamentals of computerized tomography, Academic Press, New York, 1980.
[66] D.B. Johnson, "A note on Dijkstra's shortest path algorithm", J. Assoc. Comput. Mach., vol. 20, pp. 385–388, 1973.
[67] D.B. Johnson, "Efficient algorithms for shortest paths in sparse networks", J. Assoc. Comput. Mach., vol. 24, pp. 1–13, 1977.
[68] E.L. Johnson, "On shortest paths and sorting", Proceedings of the 25th ACM Annual Conference, pp. 510–517, 1972.
[69] R.M. Karp, "On the computational complexity of combinatorial problems", Networks, vol. 5, pp. 45–68, 1975.
[70] A. Kershenbaum, "A note on finding shortest path trees", Networks, vol. 11, pp. 399–400, 1981.
[71] G. Kirchhoff, "Über die Auflösung der Gleichungen, auf welche man bei der Untersuchung der linearen Verteilung galvanischer Ströme geführt wird", Ann. Phys. Chem., vol. 72, pp. 497–508, 1847.
[72] D.E. Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching, Addison-Wesley, Reading, MA, 1973.
[73] H.W. Kuhn and A.W. Tucker (Eds.), Linear inequalities and related systems, Princeton University Press, Princeton, NJ, 1956.
[74] C.E. Lemke, "The dual method for solving the linear programming problem", Naval Research Logistics Quarterly, vol. 1, No. 1, 1954.
[75] C.E. Lemke, "A method of solution for quadratic programs", Management Science, vol. 8, pp. 442–453, 1962.
[76] N.P. Loomba and E. Turban, Applied programming for management, Holt, Rinehart & Winston, Inc., 475 pp., 1974.
[77] F.A. Lootsma, ed., Numerical methods for non-linear optimization, Academic Press, London, 440 pp., 1972.
[78] A.K. Louis, "Computerized tomography. I: Physical background and mathematical modelling", Extended version of a conference given in February 1984 at the Facultes Universitaires Notre-Dame de la Paix, Namur (Belgium), 1984.
[79] A.K. Louis and F. Natterer, "Mathematical problems of computerized tomography", Proc. IEEE, vol. 71, no 3, pp. 379–389, 1983.
[80] A. Lucena, "Time-dependent traveling salesman problem – the deliveryman case", Networks, vol. 20, pp. 753–763, 1990.
[81] D.G. Luenberger, Linear and nonlinear programming, 2nd edition, Addison-Wesley, Reading, MA, 491 pp., 1984.
[82] M. Minoux, Programmation mathematique : theorie et algorithmes – tome 1, Dunod, Paris, 294 pp., 1983. English translation by S. Vajda, Mathematical Programming: Theory and Algorithms, Wiley-Interscience, NY.
[83] M. Minoux and G. Bartnik, Graphes, algorithmes, logiciels, Dunod, Paris, 428 pp., 1986.
[84] P. Mirchandani and H. Soroush, "Generalized Traffic Equilibrium with Probabilistic Travel Times and Perceptions", Transportation Science, vol. 21, no 3, pp. 133–152, 1987.
[85] E.F. Moore, "The shortest path through a maze", in Proceedings of the International Symposium on the Theory of Switching, Part II, 1957, Harvard University, Cambridge, MA, pp. 285–292, 1959.
[86] T.J. Moser, "Shortest path calculation of seismic rays", Geophysics, vol. 56, pp. 59–67, 1991.
[87] T.J. Moser, "The shortest path method for seismic ray tracing in complicated media", Ph.D. Thesis, Rijksuniversiteit Utrecht, 1992.
[88] J.D. Murchland, "A fixed matrix method for all shortest distances in a directed graph and for inverse problems", Ph.D. Thesis, Karlsruhe University, 1970.
[89] G.L. Nemhauser and L.A. Wolsey, Integer and Combinatorial Optimization, A Wiley-Interscience Publication, John Wiley & Sons, 763 pp., 1988.
[90] G. Neumann-Denzau and J. Behrens, "Inversion of seismic data using tomographical reconstruction techniques for investigations of laterally inhomogeneous media", Geophys. J. R. astr. Soc., vol. 79, pp. 305–315, 1984.
[91] G. Nolet, ed., Seismic Tomography, D. Reidel Publishing Company, Dordrecht, 387 pp., 1987.
[92] V.E. Outram and E. Thompson, "Driver's perceived cost in route choice", Proceedings - PTRC Annual Meeting, London, pp. 226–257, 1978.
[93] S. Pallottino, "Shortest path methods: complexity, interrelations and new propositions", Networks, vol. 14, pp. 257–267, 1984.
[94] U. Pape, "Implementation and efficiency of Moore algorithms for the shortest route problem", Mathematical Programming, vol. 7, pp. 212–222, 1974.
[95] A.R. Pierce, "Bibliography on algorithms for shortest path, shortest spanning tree and related circuit routing problems", Networks, vol. 5, pp. 129–149, 1975.
[96] M.J.D. Powell, "On the quadratic programming algorithm of Goldfarb and Idnani", Mathematical Programming Study, vol. 25, pp. 45–61, 1985.
[97] M.J.D. Powell, "ZQPCVX, A Fortran subroutine for convex quadratic programming", Report DAMTP/NA17, Department of Applied Mathematics and Theoretical Physics, University of Cambridge, Cambridge, UK, 1983.
[98] F.S. Roberts, Graph Theory and Its Applications to Problems of Society, SIAM, CBMS-NSF Regional Conference Series in Applied Mathematics, Philadelphia, Pennsylvania, 122 pp., 1978.
[99] B. Roy, Algebre moderne et theorie des graphes, Tome II, Dunod, Paris, 1970.
[100] A. Sartenaer, "On the application of the auction algorithm of Bertsekas for the search of shortest routes in an urban network", Technical Report 92/27, Facultes Universitaires Notre-Dame de la Paix, Departement de Mathematique, Namur, 1991.
[101] Y. Sheffi, Urban Transportation Networks, Prentice-Hall, Englewood Cliffs, 1985.
[102] P.A. Steenbrink, Optimization of Transport Networks, Wiley, Bristol, 1974.
[103] J. Stoer, "On the numerical solution of constrained least-squares problems", SIAM Journal on Numerical Analysis, vol. 8, No. 2, pp. 382–411, 1971.
[104] A. Tarantola, Inverse problem theory. Methods for data fitting and model parameter estimation, Elsevier, 1987.
[105] A. Tarantola and B. Valette, "Generalized nonlinear inverse problems solved using the least squares criterion", Reviews of Geophys. and Space Phys., vol. 20, pp. 219–232, 1982.
[106] R.E. Tarjan, "Complexity of combinatorial algorithms", SIAM Review, vol. 20, nr. 3, pp. 457–491, 1978.
[107] R.E. Tarjan, Data Structures and Network Algorithms, SIAM, CBMS-NSF Regional Conference Series in Applied Mathematics, Philadelphia, 131 pp., 1983.
[108] H. Theil and C. Van de Panne, "Quadratic programming as an extension of classical quadratic maximization", Management Science, vol. 7, No. 1, pp. 1–20, October 1960.
[109] C. Van de Panne and A. Whinston, "The simplex and the dual method for quadratic programming", Operations Research Quarterly, vol. 15, pp. 355–389, 1964.
[110] C. Van de Panne and A. Whinston, "A comparison of two methods for quadratic programming", Operations Research, vol. 14, pp. 422–441, 1966.
[111] J. Van Leeuwen, Ed., Algorithms and Complexity, Volume A of Handbook of Theoretical Computer Science, Elsevier, Amsterdam, and The MIT Press, Cambridge, Massachusetts, 996 pp., 1990.
[112] D. Van Vliet, "Improved shortest path algorithms for transport networks", Transportation Research, vol. 12, pp. 7–20, 1978.
[113] S.A. Vavasis, Nonlinear Optimization: Complexity Issues, Oxford University Press, Inc., NY, 165 pp., 1991.
[114] J.W.J. Williams, "Algorithm 232: Heapsort", Comm. ACM, vol. 7, pp. 347–348, 1964.
[115] P. Wolfe, "The Simplex Method for Quadratic Programming", Econometrica, vol. 27, pp. 382–398, 1959.
[116] J.H. Woodhouse and A.M. Dziewonski, "Mapping the upper mantle: three-dimensional modeling of Earth structure by inversion of seismic waveforms", Journal of Geophysical Research, vol. 89 B7, pp. 5953–5986, 1984.
[117] J.Y. Yen, A shortest path algorithm, Ph.D. Thesis, University of California, Berkeley, 1970.