Robert Sedgewick
Princeton University
Subtext: the scientific method is necessary in algorithm design and implementation
Outline
• Introduction
• Motivating example
• Grid graphs
• Search methods
• Small world graphs
• Conclusion
model: classical probability and statistics
hypothesis: frequency values should be uniform
weak experiment:
• generate random numbers
• use χ² test to check frequency
Example: 012345678901234567 ... random?
Q. Does a given sequence exhibit some property that random number sequences exhibit?
Birthday paradox
Average count of random numbers generated until a duplicate happens is about √(πV/2)
Example: V = 365
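The birthday-paradox estimate can be checked with a stronger experiment than a frequency test. A minimal sketch, assuming uniform random integers in [0, V); the class name, method names, and seed are hypothetical, not from the talk:

```java
import java.util.Random;

class BirthdayExperiment {
    // Draw random integers in [0, V) until one repeats; return how many were drawn.
    static int drawsUntilDuplicate(int V, Random rnd) {
        boolean[] seen = new boolean[V];
        int count = 0;
        while (true) {
            int x = rnd.nextInt(V);
            count++;
            if (seen[x]) return count;   // this draw duplicates an earlier one
            seen[x] = true;
        }
    }

    // Average the count over many independent trials.
    static double averageDraws(int V, int trials, long seed) {
        Random rnd = new Random(seed);
        long total = 0;
        for (int t = 0; t < trials; t++) total += drawsUntilDuplicate(V, rnd);
        return (double) total / trials;
    }

    public static void main(String[] args) {
        int V = 365;
        double observed = averageDraws(V, 100_000, 42L);
        double predicted = Math.sqrt(Math.PI * V / 2);
        System.out.printf("V = %d: observed %.2f, sqrt(pi*V/2) = %.2f%n", V, observed, predicted);
    }
}
```

A sequence such as 0123456789... repeated has uniform frequencies but produces no duplicate until all values are used, so this experiment would reject it.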
Applications
• graph-based optimization models
• networks
• percolation
• computer vision
• social networks
• (many more)
Basic research
• fundamental abstract operation with numerous applications
• worth doing even if no immediate application
• resist temptation to prematurely study impact
Motivating example: maxflow
Ford-Fulkerson maxflow scheme
• find any s-t path in a (residual) graph
• augment flow along path (may create or delete edges)
• iterate until no path exists
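The three steps above can be sketched as follows. This is a minimal illustration, assuming integer capacities held in an adjacency matrix and using breadth-first search to pick a shortest augmenting path; the class and method names are hypothetical, and the talk's own implementation is not shown in the slides.

```java
import java.util.ArrayDeque;
import java.util.Arrays;

class MaxflowSketch {
    // cap[v][w] is the residual capacity of edge v->w; it is modified in place.
    // Returns { flow value, number of augmenting paths used }.
    static int[] maxflow(int[][] cap, int s, int t) {
        int V = cap.length, flow = 0, paths = 0;
        int[] parent = new int[V];
        while (bfs(cap, s, t, parent)) {               // find any s-t path in the residual graph
            int bottleneck = Integer.MAX_VALUE;
            for (int v = t; v != s; v = parent[v])
                bottleneck = Math.min(bottleneck, cap[parent[v]][v]);
            for (int v = t; v != s; v = parent[v]) {   // augment flow along the path
                cap[parent[v]][v] -= bottleneck;       // may delete a residual edge
                cap[v][parent[v]] += bottleneck;       // may create a residual edge
            }
            flow += bottleneck;
            paths++;
        }                                              // iterate until no path exists
        return new int[] { flow, paths };
    }

    // Breadth-first search over edges with positive residual capacity.
    static boolean bfs(int[][] cap, int s, int t, int[] parent) {
        Arrays.fill(parent, -1);
        parent[s] = s;
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(s);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            for (int w = 0; w < cap.length; w++)
                if (cap[v][w] > 0 && parent[w] == -1) {
                    parent[w] = v;
                    if (w == t) return true;
                    queue.add(w);
                }
        }
        return false;
    }

    public static void main(String[] args) {
        int[][] cap = new int[4][4];                   // small made-up network
        cap[0][1] = 3; cap[0][2] = 2; cap[1][2] = 1; cap[1][3] = 2; cap[2][3] = 3;
        int[] result = maxflow(cap, 0, 3);
        System.out.println("flow = " + result[0] + ", augmenting paths = " + result[1]);
    }
}
```

Counting the augmenting paths, as the second return value does, is what allows a comparison between observed behavior and a bound.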
Graph parameters
• number of vertices V
• number of edges E
• maximum capacity C
augmenting path   worst-case upper bound   for example
shortest          VE/2                     177,000
                  VC                       17,700
max capacity      2E lg C                  26,575
How many steps to find each path? 2000 (worst-case upper bound)
Motivating example: max flow
Compare performance of Ford-Fulkerson implementations
• shortest augmenting path
• maximum-capacity augmenting path
augmenting path   worst-case upper bound   for example   actual
shortest          VE/2                     177,000       37
                  VC                       17,700
max capacity      2E lg C                  26,575        7
How many steps to find each path? < 20, on average
total is a factor of a million high for thousand-node graphs!
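The example column can be reproduced by arithmetic. The sketch below assumes V = 177, E = 2000, and C = 100; these values are inferred from the numbers in the table (177,000 = V·2000/2, 17,700 = 177·C, and E is the 2000-step per-path bound), not stated in the text. Class and method names are hypothetical.

```java
class FlowBounds {
    static double lg(double x) { return Math.log(x) / Math.log(2); }

    // Upper bounds on the number of augmenting paths.
    static long shortestPathsVE(long V, long E) { return V * E / 2; }
    static long shortestPathsVC(long V, long C) { return V * C; }
    static long maxCapacityPaths(long E, long C) { return (long) (2 * E * lg(C)); }

    // Ratio of predicted total work (paths x steps per path) to observed total work.
    static double overestimate(long boundPaths, long boundSteps, long actualPaths, long actualSteps) {
        return (double) (boundPaths * boundSteps) / (actualPaths * actualSteps);
    }

    public static void main(String[] args) {
        long V = 177, E = 2000, C = 100;   // inferred values, see lead-in
        System.out.println("VE/2     = " + shortestPathsVE(V, E));
        System.out.println("VC       = " + shortestPathsVC(V, C));
        System.out.println("2E lg C  = " + maxCapacityPaths(E, C));
        // Shortest augmenting path: bound of 177,000 paths at 2000 steps each,
        // against 37 observed paths at about 20 steps each.
        System.out.println("overestimate = " + overestimate(shortestPathsVE(V, E), E, 37, 20));
    }
}
```

With these inputs the total-work ratio comes out in the hundreds of thousands, which is the order of magnitude behind the slide's "factor of a million".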
Motivating example: max flow
Compare performance of Ford-Fulkerson implementations
• shortest augmenting path
• maximum-capacity augmenting path
Graph parameters
• number of vertices V
• number of edges E
• maximum capacity C
augmenting path   worst-case upper bound
shortest          VE²/2
                  VEC
max capacity      2E² lg C
WARNING: The Algorithm General has determined that using such results to predict performance or to compare algorithms may be hazardous.
Motivating example: lessons
Goals of algorithm analysis
• predict performance (running time) worst-case bounds
Common wisdom
• random graph models are unrealistic
• average-case analysis of algorithms is too difficult
• worst-case performance bounds are the standard
which ones??
Bounds are useful in many applications:
BUT
• all three process all E edges in the worst case
• diverse kinds of graphs are encountered in practice
Percolation
Random walk
Graph covering
??
M-by-M grid graph
M² vertices
about 2M² edges
M     vertices   edges
7     49         84
15    225        420
31    961        1860
63    3969       7812
127   16129      32004
255   65025      129540
511   261121     521220
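The table entries follow from counting: an M-by-M grid has M rows with M-1 horizontal edges each and M columns with M-1 vertical edges each, so exactly 2M(M-1) edges, which is about 2M². A short sketch (hypothetical class name) that regenerates the table:

```java
class GridSize {
    static long vertices(int M) { return (long) M * M; }

    // M rows of M-1 horizontal edges plus M columns of M-1 vertical edges.
    static long edges(int M) { return 2L * M * (M - 1); }

    public static void main(String[] args) {
        System.out.println("M\tvertices\tedges");
        for (int M = 7; M <= 511; M = 2 * M + 1)   // 7, 15, 31, ..., 511
            System.out.println(M + "\t" + vertices(M) + "\t" + edges(M));
    }
}
```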
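int s = V-1-M/2, t = M/2;                    // endpoints of the search
G.findPath(s, t);
for (int k = t; k != s; k = G.st(k))         // follow st links from t back to s
    System.out.println(k + "-" + G.st(k));   // print each edge on the path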
DFS: standard implementation
graph ADT constructor code
for (int k = 0; k < E; k++)
{   int v = a[k].v, w = a[k].w;
    adj[v] = new Node(w, adj[v]);
    adj[w] = new Node(v, adj[w]);   // two nodes per edge
}
graph representation: vertex-indexed array of linked lists
searchR(s, t);
}
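Only the call searchR(s, t) survives from the DFS slide. As an illustration of what a recursive search with st links might look like, here is a minimal sketch on an M-by-M grid; the class layout, the visited array, and the use of ArrayList adjacency lists are assumptions, not the talk's code.

```java
import java.util.ArrayList;
import java.util.List;

class GridDfsSketch {
    final int M, V;
    final List<List<Integer>> adj = new ArrayList<>();
    final boolean[] visited;
    final int[] st;   // st[w] = vertex from which w was first reached

    GridDfsSketch(int M) {
        this.M = M;
        this.V = M * M;
        visited = new boolean[V];
        st = new int[V];
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int i = 0; i < V; i++) {
            if ((i + 1) % M != 0) addEdge(i, i + 1);   // right neighbor
            if (i < V - M)        addEdge(i, i + M);   // neighbor in the next row
        }
    }

    void addEdge(int v, int w) { adj.get(v).add(w); adj.get(w).add(v); }

    // Recursive DFS from v; returns true once t has been reached.
    boolean searchR(int v, int t) {
        visited[v] = true;
        if (v == t) return true;
        for (int w : adj.get(v))
            if (!visited[w]) {
                st[w] = v;
                if (searchR(w, t)) return true;
            }
        return false;
    }

    public static void main(String[] args) {
        int M = 7, V = M * M;
        int s = V - 1 - M / 2, t = M / 2;
        GridDfsSketch G = new GridDfsSketch(M);
        if (G.searchR(s, t))
            for (int k = t; k != s; k = G.st[k])       // walk st links back to s
                System.out.println(k + "-" + G.st[k]);
    }
}
```

The path found, and the number of vertices visited on the way, depend on the order in which neighbors appear in each adjacency list.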
Basic flaw in standard DFS scheme
cost strongly depends on arbitrary decision in client code!
...
for (int i = 0; i < V; i++)
{
    // order of these statements determines order in lists
    if ((i+1) % M != 0) a[e++] = new Edge(i, i+1);
    if (i % M != 0)     a[e++] = new Edge(i, i-1);
    if (i < V-M)        a[e++] = new Edge(i, i+M);
    if (i >= M)         a[e++] = new Edge(i, i-M);
}
...
Figure: two DFS runs, with costs ~E/2 and ~E^(1/2)
bad news for ANY graph model
Addressing the basic flaw
Advise the client to randomize the edges?
Solution: Use a randomized iterator
(represent graph with arrays, not lists)
standard iterator
int N = adj[x].length;
for(int i = 0; i < N; i++)
{ process vertex adj[x][i]; }
randomized iterator
int N = adj[x].length;
for(int i = 0; i < N; i++)
{ exchange adj[x][i] with a random entry of adj[x][i..N-1];
  process vertex adj[x][i]; }
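One way to realize a randomized iterator over an array is to shuffle in place while iterating: at step i, swap a random remaining entry into position i and process it. The sketch below is an assumption about how such an iterator could be written, not the talk's code; the class and method names are hypothetical, and "processing" is stood in for by recording the visit order.

```java
import java.util.Random;

class RandomizedIteratorSketch {
    // Visit every entry of a[] exactly once, in uniformly random order.
    // Returns the order of visits; a[] is permuted in place.
    static int[] visitInRandomOrder(int[] a, Random rnd) {
        int N = a.length;
        int[] order = new int[N];
        for (int i = 0; i < N; i++) {
            int r = i + rnd.nextInt(N - i);            // random index in [i, N)
            int tmp = a[i]; a[i] = a[r]; a[r] = tmp;   // exchange a[i] and a[r]
            order[i] = a[i];                           // "process vertex a[i]"
        }
        return order;
    }

    public static void main(String[] args) {
        int[] neighbors = { 3, 8, 9, 15 };             // made-up adjacency array
        int[] order = visitInRandomOrder(neighbors, new Random());
        for (int w : order) System.out.print(w + " ");
        System.out.println();
    }
}
```

Because the order is chosen inside the graph code, search cost no longer depends on the order in which the client happened to add edges.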
Experimental results
for basic algorithms
DFS is substantially faster than BFS and UF
M     V       E        BFS   DFS   UF
7     49      168      .75   .32   1.05
15    225     840      .75   .45   1.02
31    961     3720     .75   .36   1.14
63    3969    15624    .75   .32   1.05
127   16129   64008    .75   .40   .99
255   65025   259080   .75   .42   1.08
Analytic proof?
Multiple searches?
• interleaving strategy?
• merge strategy?
• how many?
• which algorithm?
Hybrid algorithms
• which combination?
• probabilistic restart?
• merge strategy?
• randomized choice?
Multiple searchers
• use N searchers
• one from the source
• one from the destination
• N-2 from random vertices
• Additional factor of 2 for N>2
Result: not much help anyway
Figure: BFS and DFS plots, labeled 1.40, .70, .40, .12
Examples:
• Add random edges to grid graph
• Add random edges to any sparse graph with local clustering
• Many scientific models
Q. How do we find an st-path in a small-world graph?
Small-world graphs
model the six degrees of separation phenomenon
airlines
roads
neurobiology
evolution
social influence
protein interaction
percolation
internet
electric power grids
political trends
...
Figure: A tiny portion of the movie-performer relationship graph
• extensive simulations
• huge graphs
Example 2: Protein interaction
• small-world model
• natural process
• experimental validation
Finding a path in a small-world graph
is a heavily studied problem
How does 2-way DFS do in this model?
(no change at all in graph code, just a different graph model)
Experiment:
• add M ~ E^(1/2) random edges to an M-by-M grid graph
• use 2-way DFS to find path
Surprising result: Finds short paths in ~ E^(1/2) steps!
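The graph model in this experiment is easy to build. The sketch below constructs an M-by-M grid, adds M random edges, and measures the s-t distance with plain BFS. BFS is a stand-in chosen here for simplicity; it is not the 2-way DFS used in the talk, and all names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Random;

class SmallWorldSketch {
    static List<List<Integer>> grid(int M) {
        int V = M * M;
        List<List<Integer>> adj = new ArrayList<>();
        for (int v = 0; v < V; v++) adj.add(new ArrayList<>());
        for (int i = 0; i < V; i++) {
            if ((i + 1) % M != 0) link(adj, i, i + 1);
            if (i < V - M)        link(adj, i, i + M);
        }
        return adj;
    }

    static void link(List<List<Integer>> adj, int v, int w) {
        adj.get(v).add(w);
        adj.get(w).add(v);
    }

    // Add up to 'count' random shortcut edges (self-loops are skipped).
    static void addRandomEdges(List<List<Integer>> adj, int count, Random rnd) {
        int V = adj.size();
        for (int k = 0; k < count; k++) {
            int v = rnd.nextInt(V), w = rnd.nextInt(V);
            if (v != w) link(adj, v, w);
        }
    }

    // Number of edges on a shortest s-t path, or -1 if t is unreachable.
    static int bfsDistance(List<List<Integer>> adj, int s, int t) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        dist[s] = 0;
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(s);
        while (!queue.isEmpty()) {
            int v = queue.remove();
            if (v == t) return dist[v];
            for (int w : adj.get(v))
                if (dist[w] == -1) { dist[w] = dist[v] + 1; queue.add(w); }
        }
        return -1;
    }

    public static void main(String[] args) {
        int M = 31, V = M * M, s = V - 1 - M / 2, t = M / 2;
        List<List<Integer>> adj = grid(M);
        System.out.println("grid distance:        " + bfsDistance(adj, s, t));
        addRandomEdges(adj, M, new Random());
        System.out.println("small-world distance: " + bfsDistance(adj, s, t));
    }
}
```

As the slide notes, only the graph construction changes; any search code written against the adjacency structure runs unmodified on the new model.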
Finding a path in a small-world graph
is much easier than finding a path in a grid graph
Evidence in favor
1. Experiments on many graphs
2. Proof sketch for grid graphs with V shortcuts
• step 1: 2 E^(1/2) steps ~ 2 V^(1/2) random vertices
• step 2: like birthday paradox
two sets of 2 V^(1/2) randomly chosen vertices are highly unlikely to be disjoint
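Step 2 can be made quantitative under an idealized assumption that goes beyond the slide: the first set consists of 2 V^(1/2) distinct vertices, and the second set is 2 V^(1/2) independent uniform draws from all V vertices. Each draw misses the first set with probability 1 - 2 V^(1/2)/V, so, using 1 - x ≤ e^(-x):

```latex
\Pr[\text{disjoint}]
  = \left(1 - \frac{2\sqrt{V}}{V}\right)^{2\sqrt{V}}
  = \left(1 - \frac{2}{\sqrt{V}}\right)^{2\sqrt{V}}
  \le e^{-4} \approx 0.018
```

Under that assumption the two searches meet with probability above 98%, independent of V. The vertices reached by a real search are only approximately random, so this is a sketch rather than a proof.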
Path length?
Multiple searchers revisited?
Questions
• What are the BFS, UF, and DFS constants in grid graphs?
• Is there a sublinear algorithm for grid graphs?
• Which methods adapt to directed graphs?
• Can we precisely analyze and quantify costs for small-world graphs?
• What is the cost distribution for DFS for any interesting graph family?
• How effective are these methods for other graph families?
• Do these methods lead to faster maxflow algorithms?
• How effective are these methods in practice?
• ...
Conclusion: subtext revisited
The scientific method is necessary
Scientific method
• create a model describing natural world
• use model to develop hypotheses
• run experiments to validate hypotheses
• refine model and repeat