
Amortized Analysis

Princeton University • COS 423 • Theory of Algorithms • Spring 2001 • Kevin Wayne

Beyond Worst Case Analysis

Worst-case analysis.
■ Analyze running time as function of worst input of a given size.

Average case analysis.
■ Analyze average running time over some distribution of inputs.
■ Ex: quicksort.

Amortized analysis.
■ Worst-case bound on sequence of operations.
■ Ex: splay trees, union-find.

Competitive analysis.
■ Make quantitative statements about online algorithms.
■ Ex: paging, load balancing.

Amortized Analysis

Amortized analysis.
■ Worst-case bound on sequence of operations.
– no probability involved
■ Ex: union-find.
– sequence of m union and find operations starting with n singleton sets takes O((m+n) α(n)) time.
– single union or find operation might be expensive, but only α(n) on average

Dynamic Table

Dynamic tables.
■ Store items in a table (e.g., for open-address hash table, heap).
■ Items are inserted and deleted.
– too many items inserted ⇒ copy all items to larger table
– too many items deleted ⇒ copy all items to smaller table

Amortized analysis.
■ Any sequence of n insert / delete operations takes O(n) time.
■ Space used is proportional to space required.
■ Note: actual cost of a single insert / delete can be proportional to n if it triggers a table expansion or contraction.

Bottleneck operation.
■ We count insertions (or re-insertions) and deletions.
■ Overhead of memory management is dominated by (or proportional to) cost of transferring items.

Dynamic Table: Insert

Dynamic Table Insert
Initialize table size m = 1.

INSERT(x)
   IF (number of elements in table = m)
      Generate new table of size 2m.
      Re-insert m old elements into new table.
      m ← 2m
   Insert x into table.

Aggregate method.
■ Sequence of n insert ops takes O(n) time.
■ Let c_i = cost of ith insert.

\[
c_i = \begin{cases} i & \text{if } i - 1 \text{ is an exact power of 2} \\ 1 & \text{otherwise} \end{cases}
\]

\[
\sum_{i=1}^{n} c_i \;\le\; n + \sum_{j=0}^{\lfloor \log_2 n \rfloor} 2^j \;\le\; n + (2n - 1) \;<\; 3n
\]

Dynamic Table: Insert

Accounting method.
■ Charge each insert operation $3 (amortized cost).
– use $1 to perform immediate insert
– store $2 with new item
■ When table doubles:
– $1 re-inserts item
– $1 re-inserts another old item
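
As an illustration of the aggregate bound, here is a minimal Python sketch of the doubling table. The names `DynamicTable`, `cost` and `total_insert_cost` are assumptions made for this example and do not come from the slides; cost is counted as one unit per insertion or re-insertion, in line with the bottleneck operation named earlier.

```python
class DynamicTable:
    """Doubling table that counts item writes (inserts plus re-inserts)."""

    def __init__(self):
        self.capacity = 1          # table size m
        self.slots = [None]        # backing storage of size m
        self.count = 0             # number of elements stored
        self.cost = 0              # total insertions + re-insertions so far

    def insert(self, x):
        if self.count == self.capacity:
            # Table is full: build a table of size 2m and re-insert old items.
            new_slots = [None] * (2 * self.capacity)
            for i in range(self.count):
                new_slots[i] = self.slots[i]
                self.cost += 1     # one unit per re-inserted item
            self.slots = new_slots
            self.capacity *= 2
        self.slots[self.count] = x
        self.count += 1
        self.cost += 1             # one unit for the immediate insert


def total_insert_cost(n):
    """Total cost of n consecutive inserts into an initially empty table."""
    table = DynamicTable()
    for i in range(n):
        table.insert(i)
    return table.cost


print(total_insert_cost(8))        # 8 inserts plus 1 + 2 + 4 re-inserts
```

Under this cost model the total for n inserts stays below 3n; for instance, 8 inserts cost 15 units.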

Dynamic Table: Insert and Delete

Insert and delete.
■ Table overflows ⇒ double table size.
■ Table ≤ ½ full ⇒ halve table size.

! Bad idea: can cause thrashing.

[Figure: four snapshots of a table holding items 1-4; in two of them a 5th item is present.]

Dynamic Table: Insert and Delete

Insert and delete.
■ Table overflows ⇒ double table size.
■ Table ≤ ¼ full ⇒ halve table size.

Dynamic Table Delete
Initialize table size m = 1.

DELETE(x)
   IF (number of elements in table ≤ m / 4)
      Generate new table of size m / 2.
      m ← m / 2
      Reinsert old elements into new table.
   Delete x from table.

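
To see why the ½ rule thrashes while the ¼ rule does not, the following sketch counts how many items are copied by resizes for a given operation sequence. The function name `moves_for_sequence`, the `"i"` / `"d"` encoding of operations and the `shrink_at` parameter are assumptions for this example; the sequence is assumed never to delete from an empty table, and the table is never shrunk below size 1.

```python
def moves_for_sequence(ops, shrink_at):
    """Count items copied by resizes for a sequence of 'i' / 'd' operations.

    shrink_at is the fullness fraction (0.5 or 0.25) at or below which a
    delete first halves the table, mirroring the DELETE pseudocode.
    """
    capacity, count, moved = 1, 0, 0
    for op in ops:
        if op == "i":
            if count == capacity:          # overflow: double the table
                moved += count
                capacity *= 2
            count += 1
        else:
            if capacity > 1 and count <= capacity * shrink_at:
                moved += count             # contract: halve the table
                capacity //= 2
            count -= 1
    return moved


# Fill the table to 8 items, then hover around the resize boundary.
ops = "i" * 8 + "iddi" * 100
print(moves_for_sequence(ops, 0.5))    # resizes twice in every "iddi" block
print(moves_for_sequence(ops, 0.25))   # resizes only during the initial growth
```

With the ½ threshold every block of four operations copies 16 items, so the copying cost grows with the length of the sequence; with the ¼ threshold the count stays at 15 no matter how many blocks follow.
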
Dynamic Table: Insert and Delete

Accounting analysis.
■ Charge each insert operation $3 (amortized cost).
– use $1 to perform immediate insert
– store $2 with new item
■ When table doubles:
– $1 re-inserts item
– $1 re-inserts another old item
■ Charge each delete operation $2 (amortized cost).
– use $1 to perform delete
– store $1 in emptied slot
■ When table halves:
– $1 in emptied slot pays to re-insert a remaining item into new half-size table

Dynamic Table: Delete

[Figure: successive deletes shrink the table contents from items 1-8 down to items 1-4, at which point the table is contracted.]

Dynamic Table: Insert and Delete

Theorem. Sequence of n inserts and deletes takes O(n) time.
■ Amortized cost of insert = $3.
■ Amortized cost of delete = $2.

Binary Search Tree

Binary tree in "sorted" order.
■ Maintain ordering property for ALL sub-trees.

[Figure: root (middle value), left subtree (smaller values), right subtree (larger values).]

Binary Search Tree

Binary tree in "sorted" order.
■ Maintain ordering property for ALL sub-trees.

[Figure: example BST, level by level: 51; 14, 72; 06, 33, 53, 97; 13, 25, 43, 64, 84, 99.]

Binary Search Tree

Insert, delete, find (symbol table).
■ Amount of work proportional to height of tree.
■ O(N) in "unbalanced" search tree.
■ O(log N) in "balanced" search tree.

Types of BSTs.
■ AVL trees, 2-3-4 trees, red-black trees.
■ Treaps, skip lists, splay trees.

BST vs. hash tables.
■ Guaranteed vs. expected performance.
■ Growing and shrinking.
■ Augmented data structures: order statistic trees, interval trees.

Splay Trees

Splay trees (Sleator-Tarjan, 1983a). Self-adjusting BST.
■ Most frequently accessed items are close to root.
■ Tree automatically reorganizes itself after each operation.
– no balance information is explicitly maintained
■ Tree remains "nicely" balanced, but height can potentially be n - 1.
■ Sequence of m ops involving n inserts takes O(m log n) time.

Theorem (Sleator-Tarjan, 1983a). Splay trees are as efficient (in amortized sense) as static optimal BST.

Theorem (Sleator-Tarjan, 1983b). Shortest augmenting path algorithm for max flow can be implemented in O(mn log n) time.
■ Sequence of mn augmentations takes O(mn log n) time!
■ Splay trees used to implement dynamic trees (link-cut trees).

Splay

Find(x, S): Determine whether element x is in splay tree S.
Insert(x, S): Insert x into S if it is not already there.
Delete(x, S): Delete x from S if it is there.
Join(S, S’): Join S and S’ into a single splay tree, assuming that x < y for all x ∈ S, and y ∈ S’.

All operations are implemented in terms of basic operation:

Splay(x, S): Reorganize splay tree S so that element x is at the root if x ∈ S; otherwise the new root is either max { k ∈ S : k < x } or min { k ∈ S : k > x }.

Implementing Find(x, S).
■ Call Splay(x, S).
■ If x is root, then return x; otherwise return NO.

Splay

Implementing Join(S, S’).
■ Call Splay(+∞, S) so that largest element of S is at root and all other elements are in left subtree.
■ Make S’ the right subtree of the root of S.

Implementing Delete(x, S).
■ Call Splay(x, S) to bring x to the root if it is there.
■ Remove x: let S’ and S’’ be the resulting subtrees.
■ Call Join(S’, S’’).

Implementing Insert(x, S).
■ Call Splay(x, S) and break tree at root to form S’ and S’’.
■ Call Join(Join(S’, {x}), S’’).

Implementing Splay(x, S)

Splay(x, S): do following operations until x is root.
■ ZIG: If x has a parent but no grandparent, then rotate(x).
■ ZIG-ZIG: If x has a parent y and a grandparent, and if both x and y are either both left children or both right children, then rotate(y), then rotate(x).
■ ZIG-ZAG: If x has a parent y and a grandparent, and if one of x, y is a left child and the other is a right child, then rotate(x) twice.

[Figure (ZIG): root y with child x and subtrees A, B, C. ZIG(x) makes x the root with y as its child; ZAG(y) is the inverse rotation.]

[Figure (ZIG-ZIG): z, y, x on a straight path with subtrees A, B, C, D. Afterwards x is on top, y is its child and z its grandchild.]

[Figure (ZIG-ZAG): z with child y and grandchild x on opposite sides, subtrees A, B, C, D. Afterwards x is on top with children y and z.]

Splay Example

Apply Splay(1, S) to tree S:

[Figure: S is a path 10, 9, 8, ..., 1 with 10 at the root and each node the left child of the previous one. The first step is a ZIG-ZIG.]

Splay Example

Apply Splay(1, S) to tree S:

[Figure sequence: starting from the path 10, 9, ..., 1, Splay(1, S) performs four ZIG-ZIG steps followed by one ZIG. Final tree: root 1 with right child 10; 10 has left child 8; 8 has children 6 and 9; 6 has children 4 and 7; 4 has children 2 and 5; 2 has right child 3.]

Splay Example

Apply Splay(2, S) to tree S:

[Figure: result of Splay(2) on the tree above. Root 2 with children 1 and 8; 8 has children 4 and 10; 4 has children 3 and 6; 6 has children 5 and 7; 10 has left child 9.]

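
The three splay steps and the example can be reproduced with a short bottom-up sketch using parent pointers. This is an illustration under stated assumptions rather than the slides' implementation: `Node`, `rotate`, `splay`, `left_path` and `preorder` are names chosen here, and `splay` takes the node itself instead of a key, so it only covers the case x ∈ S.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def rotate(x):
    """Rotate x above its parent y, keeping the BST order."""
    y, g = x.parent, x.parent.parent
    if y.left is x:                      # right rotation at y
        y.left, x.right = x.right, y
        if y.left is not None:
            y.left.parent = y
    else:                                # left rotation at y
        y.right, x.left = x.left, y
        if y.right is not None:
            y.right.parent = y
    y.parent, x.parent = x, g
    if g is not None:                    # reattach x under the old grandparent
        if g.left is y:
            g.left = x
        else:
            g.right = x


def splay(x):
    """Move node x to the root using ZIG, ZIG-ZIG and ZIG-ZAG steps."""
    while x.parent is not None:
        y, g = x.parent, x.parent.parent
        if g is None:
            rotate(x)                    # ZIG
        elif (g.left is y) == (y.left is x):
            rotate(y)                    # ZIG-ZIG: rotate the parent first,
            rotate(x)                    # then x
        else:
            rotate(x)                    # ZIG-ZAG: rotate x twice
            rotate(x)
    return x


def left_path(keys):
    """Build a path in which each key is the left child of the previous one."""
    nodes = {k: Node(k) for k in keys}
    for parent, child in zip(keys, keys[1:]):
        nodes[parent].left = nodes[child]
        nodes[child].parent = nodes[parent]
    return nodes


def preorder(node):
    """Keys in preorder (node, left subtree, right subtree)."""
    if node is None:
        return []
    return [node.key] + preorder(node.left) + preorder(node.right)


nodes = left_path(list(range(10, 0, -1)))    # the path 10, 9, ..., 1
print(preorder(splay(nodes[1])))             # tree after Splay(1)
print(preorder(splay(nodes[2])))             # tree after Splay(2)
```

The two printed preorder listings are `[1, 10, 8, 6, 4, 2, 3, 5, 7, 9]` and `[2, 1, 8, 4, 3, 6, 5, 7, 10, 9]`, which correspond to the trees reached in the example.
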
Splay Tree Analysis

Definitions.
■ Let S(x) denote subtree of S rooted at x.
■ |S| = number of nodes in tree S.
■ µ(S) = rank = ⌊log₂ |S|⌋.
■ µ(x) = µ(S(x)).

[Figure: example tree with |S| = 10 and subtree S(8) highlighted. Root 2 with children 1 and 8; 8 has children 4 and 10; 4 has children 3 and 6; 6 has children 5 and 7; 10 has left child 9. µ(2) = 3, µ(8) = 3, µ(4) = 2, µ(6) = 1, µ(5) = 0.]

Splay Tree Analysis

Splay invariant: node x always has at least µ(x) credits on deposit.

Splay lemma: each splay(x, S) operation requires ≤ 3(µ(S) - µ(x)) + 1 credits to perform the splay operation and maintain the invariant.

Theorem: A sequence of m operations involving n inserts takes O(m log n) time.

Proof:
■ µ(x) ≤ ⌊log n⌋ ⇒ at most 3⌊log n⌋ + 1 credits are needed for each splay operation.
■ Find, insert, delete, join all take constant number of splays plus low-level operations (pointer manipulations, comparisons).
■ Inserting x requires ≤ ⌊log n⌋ credits to be deposited to maintain invariant for new node x.
■ Joining two trees requires ≤ ⌊log n⌋ credits to be deposited to maintain invariant for new root.

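
As a check on the rank definition µ(S) = ⌊log₂ |S|⌋, the sketch below computes ranks for the 10-node tree reached at the end of the splay example, which is assumed here to be the tree in the definitions figure. Trees are written as `(key, left, right)` tuples; this encoding and the helper names `size`, `rank`, `subtree` and `leaf` are choices made for this example.

```python
def size(tree):
    """Number of nodes in a tree given as (key, left, right) or None."""
    if tree is None:
        return 0
    _, left, right = tree
    return 1 + size(left) + size(right)


def rank(tree):
    """mu(S) = floor(log2 |S|) for a non-empty tree S."""
    return size(tree).bit_length() - 1


def subtree(tree, key):
    """Return the subtree S(key) rooted at the node holding key, or None."""
    if tree is None:
        return None
    k, left, right = tree
    if k == key:
        return tree
    return subtree(left, key) or subtree(right, key)


def leaf(k):
    return (k, None, None)


# Root 2; children 1 and 8; 8 has children 4 and 10; 4 has children 3 and 6;
# 6 has children 5 and 7; 10 has left child 9.
S = (2, leaf(1), (8, (4, leaf(3), (6, leaf(5), leaf(7))), (10, leaf(9), None)))

for key in (2, 8, 4, 6, 5):
    print(key, rank(subtree(S, key)))
```

The subtree sizes are 10, 8, 5, 3 and 1, so the computed ranks are 3, 3, 2, 1 and 0, matching µ(2), µ(8), µ(4), µ(6) and µ(5).
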

Splay Tree Analysis

Splay invariant: node x always has at least µ(x) credits on deposit.

Splay lemma: each splay(x, S) operation requires ≤ 3(µ(S) - µ(x)) + 1 credits to perform the splay operation and maintain the invariant.

Proof of splay lemma: Let µ(x) and µ’(x) be rank before and after a single ZIG, ZIG-ZIG, or ZIG-ZAG operation on tree S.
■ We show invariant is maintained (after paying for low-level operations) using at most:
– 3(µ(S) - µ(x)) + 1 credits for each ZIG operation.
– 3(µ’(x) - µ(x)) credits for each ZIG-ZIG operation.
– 3(µ’(x) - µ(x)) credits for each ZIG-ZAG operation.

Thus, if a sequence of these is done to move x up the tree, we get a telescoping sum ⇒ total credits ≤ 3(µ(S) - µ(x)) + 1.

Splay Tree Analysis

Proof of splay lemma (ZIG): It takes ≤ 3(µ(S) - µ(x)) + 1 credits to perform a ZIG operation and maintain the splay invariant.

[Figure: ZIG. Before (tree S): root y with child x; subtrees A, B under x and C under y. After (tree S’): root x with child y; subtree A under x and B, C under y.]

■ In order to maintain invariant, we must pay:

\[
\begin{aligned}
\mu'(x) + \mu'(y) - \mu(x) - \mu(y) &= \mu'(y) - \mu(x) && \text{since } \mu(y) = \mu'(x) \\
&\le \mu'(x) - \mu(x) \\
&\le 3(\mu'(x) - \mu(x)) \\
&= 3(\mu(S) - \mu(x)) && \text{since } \mu'(x) = \mu(S)
\end{aligned}
\]

■ Use extra credit to pay for low-level operations.

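
Written out, the telescoping sum in the proof of the splay lemma looks as follows. The indexed ranks are notation introduced here for illustration: \(\mu_j(x)\) is assumed to denote the rank of x after the j-th step of a splay that takes k steps, so \(\mu_0(x) = \mu(x)\) and, since x ends at the root, \(\mu_k(x) = \mu(S)\). Only the last step can be a ZIG, which accounts for the single extra credit.

```latex
\[
\left[ \sum_{j=1}^{k} 3\bigl(\mu_j(x) - \mu_{j-1}(x)\bigr) \right] + 1
  \;=\; 3\bigl(\mu_k(x) - \mu_0(x)\bigr) + 1
  \;=\; 3\bigl(\mu(S) - \mu(x)\bigr) + 1
\]
```

Every intermediate rank appears once with a plus sign and once with a minus sign, so only the first and last ranks survive.
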
Splay Tree Analysis

Proof of splay lemma (ZIG-ZIG): It takes ≤ 3(µ’(x) - µ(x)) credits to perform a ZIG-ZIG operation and maintain the splay invariant.

[Figure: ZIG-ZIG. Before (tree S): z on top, y its child, x its grandchild; subtrees A, B under x, C under y, D under z. After (tree S’): x on top, y its child, z its grandchild; subtree A under x, B under y, and C, D under z.]

\[
\begin{aligned}
\mu'(x) + \mu'(y) + \mu'(z) - \mu(x) - \mu(y) - \mu(z) &= \mu'(y) + \mu'(z) - \mu(x) - \mu(y) \\
&= (\mu'(y) - \mu(x)) + (\mu'(z) - \mu(y)) \\
&\le (\mu'(x) - \mu(x)) + (\mu'(x) - \mu(x)) \\
&= 2(\mu'(x) - \mu(x))
\end{aligned}
\]

■ If µ’(x) > µ(x), then can afford to pay for constant number of low-level operations and maintain invariant using ≤ 3(µ’(x) - µ(x)) credits.

Splay Tree Analysis

Proof of splay lemma (ZIG-ZIG): It takes ≤ 3(µ’(x) - µ(x)) credits to perform a ZIG-ZIG operation and maintain the splay invariant.
■ Nasty case: µ(x) = µ’(x).
■ We show in this case µ’(x) + µ’(y) + µ’(z) < µ(x) + µ(y) + µ(z).
– don’t need any credit to pay for invariant
– 1 credit left to pay for low-level operations
so, for contradiction, suppose µ’(x) + µ’(y) + µ’(z) ≥ µ(x) + µ(y) + µ(z).
■ Since µ(x) = µ’(x) = µ(z), by monotonicity µ(x) = µ(y) = µ(z).
■ After some algebra, it follows that µ(x) = µ’(z) = µ(z).
■ Let a = 1 + |A| + |B|, b = 1 + |C| + |D|, then ⌊log a⌋ = ⌊log b⌋ = ⌊log (a+b+1)⌋.
■ WLOG assume b ≥ a.

\[
\lfloor \log(a + b + 1) \rfloor \;\ge\; \lfloor \log(2a) \rfloor \;=\; 1 + \lfloor \log a \rfloor \;>\; \lfloor \log a \rfloor
\]

Splay Tree Analysis

Proof of splay lemma (ZIG-ZAG): It takes ≤ 3(µ’(x) - µ(x)) credits to perform a ZIG-ZAG operation and maintain the splay invariant.
■ Argument similar to ZIG-ZIG.

[Figure: ZIG-ZAG. Before: z on top, y its child, x its grandchild on the opposite side; subtrees A, B, C, D. After: x on top with children y and z.]

Augmented Search Trees

Interval Trees

[Figure: the intervals (4, 8), (5, 11), (7, 10), (15, 18), (17, 19), (21, 23).]

Support following operations.

Interval-Insert(i, S): Insert interval i = (l_i, r_i) into tree S.
Interval-Delete(i, S): Delete interval i = (l_i, r_i) from tree S.
Interval-Find(i, S): Return an interval x that overlaps i, or report that no such interval exists.

Interval Trees

Key ideas:
■ Tree nodes contain interval.
■ BST keyed on left endpoint.

[Figure: BST of intervals keyed on left endpoint. Root (17, 19) with children (5, 11) and (21, 23); (5, 11) has children (4, 8) and (15, 18); (15, 18) has left child (7, 10).]

Interval Trees

Key ideas:
■ Tree nodes contain interval.
■ BST keyed on left endpoint.
■ Additional info: store max endpoint in subtree rooted at node.

[Figure: the same tree with the max endpoint in each subtree: (17, 19) 23; (5, 11) 18; (21, 23) 23; (4, 8) 8; (15, 18) 18; (7, 10) 10.]

Finding an Overlapping Interval

Interval-Find(i, S): return an interval x that overlaps i = (l_i, r_i), or report that no such interval exists.

Interval-Find (i, S)
   x ← root(S)
   WHILE (x != NULL)
      IF (x overlaps i)
         RETURN x
      IF (left[x] = NULL OR max[left[x]] < l_i)
         x ← right[x]
      ELSE
         x ← left[x]
   RETURN NO

Splay last node on path traversed.

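
The Interval-Find search can be tried on the example tree with the sketch below. It is a simplified illustration: the names `IntervalNode`, `overlaps` and `interval_find` are chosen here, intervals are treated as closed (the slides do not say whether endpoints count), the tree is built by hand rather than through Interval-Insert, and the final splay of the last node visited is omitted.

```python
class IntervalNode:
    def __init__(self, lo, hi, left=None, right=None):
        self.lo, self.hi = lo, hi            # interval (l, r); BST key is lo
        self.left, self.right = left, right
        # max right endpoint over the subtree rooted at this node
        self.max = max([hi] + [c.max for c in (left, right) if c is not None])


def overlaps(node, lo, hi):
    """Closed-interval overlap test (an assumption of this sketch)."""
    return node.lo <= hi and lo <= node.hi


def interval_find(root, lo, hi):
    """Return (l, r) of some stored interval overlapping (lo, hi), or None."""
    x = root
    while x is not None:
        if overlaps(x, lo, hi):
            return (x.lo, x.hi)
        if x.left is None or x.left.max < lo:
            x = x.right                      # nothing on the left can overlap
        else:
            x = x.left
    return None


N = IntervalNode
tree = N(17, 19,
         N(5, 11, N(4, 8), N(15, 18, N(7, 10))),
         N(21, 23))

print(tree.max)                      # max endpoint stored at the root
print(interval_find(tree, 16, 16))   # overlaps (15, 18)
print(interval_find(tree, 12, 14))   # no stored interval overlaps
```

The constructor derives the same max values as the figure (23 at the root, 18 at (5, 11)), and the query (12, 14) walks root, (5, 11), (15, 18) before reporting that nothing overlaps.
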
Finding an Overlapping Interval

Case 1 (right). If search goes right, then there exists an overlap in right subtree or no overlap in either.

Proof. Suppose no overlap in right.
■ left[x] = NULL ⇒ no overlap in left.
■ max[left[x]] < l_i ⇒ no overlap in left.

[Figure: interval i = (l_i, r_i) lies entirely to the right of max, the largest endpoint in left[x].]

Finding an Overlapping Interval

Case 2 (left). If search goes left, then there exists an overlap in left subtree or no overlap in either.

Proof. Suppose no overlap in left.
■ l_i ≤ max[left[x]] = r_j for some interval j in left subtree.
■ Since i and j don’t overlap, we have l_i ≤ r_i ≤ l_j ≤ r_j.
■ Tree sorted by l ⇒ for any interval k in right subtree: r_i ≤ l_j ≤ l_k ⇒ no overlap in right subtree.

[Figure: intervals i = (l_i, r_i), j = (l_j, r_j), k = (l_k, r_k) in left-to-right order.]

Interval Trees: Running Time

Need to maintain augmented data structure during tree-modifying ops.
■ Rotate: can fix max values in O(1) time by looking at children:

\[
\max[x] = \max \bigl\{\, r_x,\ \max[\mathrm{left}[x]],\ \max[\mathrm{right}[x]] \,\bigr\}
\]

[Figure: ZIG / ZAG rotation. Left tree: root (11, 35) with max 35 and left child (6, 20) with max 20. Right tree: root (6, 20) with max 35 and right child (11, 35) with max 35. Subtrees A, B, C have max 14, 19, 30.]

VLSI Database Problem

VLSI database problem.
■ Input: integrated circuit represented as a list of rectangles.
■ Goal: decide whether any two rectangles overlap.

Algorithm idea.
■ Move a vertical "sweep line" from left to right.
■ Store set of rectangles that intersect the sweep line in an interval search tree (using y interval of rectangle).

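
The sweep-line idea can be sketched as follows. To keep the example short, a plain dictionary with a linear scan stands in for the interval search tree, so each event costs O(N) here rather than the logarithmic amortized cost the tree would give. The function name `any_overlap`, the `(x1, x2, y1, y2)` rectangle encoding, and the choice to treat rectangles that merely touch as overlapping are all assumptions of this sketch.

```python
def any_overlap(rects):
    """rects: list of (x1, x2, y1, y2) with x1 <= x2 and y1 <= y2."""
    events = []
    for idx, (x1, x2, y1, y2) in enumerate(rects):
        events.append((x1, 0, idx))      # 0 = left edge, sorted first on ties
        events.append((x2, 1, idx))      # 1 = right edge
    events.sort()                        # sweep from left to right

    active = {}                          # idx -> (y1, y2); stand-in for the tree
    for _, kind, idx in events:
        if kind == 0:
            y1, y2 = rects[idx][2], rects[idx][3]
            # Interval-Find: does any active y interval overlap (y1, y2)?
            if any(a1 <= y2 and y1 <= a2 for a1, a2 in active.values()):
                return True
            active[idx] = (y1, y2)       # Interval-Insert
        else:
            del active[idx]              # Interval-Delete
    return False


print(any_overlap([(0, 2, 0, 2), (1, 3, 1, 3)]))   # overlap in both x and y
print(any_overlap([(0, 2, 0, 2), (1, 3, 5, 6)]))   # x ranges overlap, y ranges do not
```

Only rectangles currently cut by the sweep line are compared, and only on their y intervals, which is the reason the problem reduces to interval search.
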
VLSI Database Problem

VLSI (r1, r2 ,..., rN)
   Sort rectangles by x coordinate (keep two copies of each rectangle, one for left endpoint and one for right).

   FOR i = 1 to 2N
      IF (ri is "left" copy of rectangle)
         IF (Interval-Find(ri, S))
            RETURN YES
         ELSE
            Interval-Insert(ri, S)
      ELSE (ri is "right" copy of rectangle)
         Interval-Delete(ri, S)
   RETURN NO

Order Statistic Trees

Add following two operations to BST.

Select(i, S): Return ith smallest key in tree S.
Rank(x, S): Return rank of x in linear order of tree S.

Key idea: store size of subtrees in nodes.

[Figure: BST showing key and subtree size at each node: m 8; c 5, p 2; b 1, f 3, q 1; d 1, h 1.]

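Order Statistic Trees

Need to ensure augmented data structure can be maintained during tree-modifying ops.
■ Rotate: can fix sizes in O(1) time by looking at children.

[Figure: ZIG / ZAG rotation. Left tree: root y (size 29) with left child w (size 11) and right subtree Z (size 17); w has subtrees V (size 6) and X (size 4). Right tree: root w (size 29) with left subtree V (size 6) and right child y (size 22); y has subtrees X (size 4) and Z (size 17).]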
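
Select, Rank and the O(1) size fix after a rotation can be sketched as below. The names `OSNode`, `size`, `select`, `rank` and `rotate_right` are chosen for this example, ranks are taken to be 1-based, and the tree is built by hand with the keys from the order statistic figure (assuming f has children d and h, and q is the right child of p).

```python
class OSNode:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)   # nodes in this subtree


def size(node):
    return node.size if node is not None else 0


def select(node, i):
    """Return the i-th smallest key (1-based) in the subtree, or None."""
    while node is not None:
        r = size(node.left) + 1          # rank of node.key within its subtree
        if i == r:
            return node.key
        if i < r:
            node = node.left
        else:
            i -= r                       # skip the left subtree and this node
            node = node.right
    return None


def rank(node, x):
    """Return the 1-based rank of key x in the subtree, or None if absent."""
    before = 0                           # keys known to be smaller than x
    while node is not None:
        if x == node.key:
            return before + size(node.left) + 1
        if x < node.key:
            node = node.left
        else:
            before += size(node.left) + 1
            node = node.right
    return None


def rotate_right(y):
    """ZIG at y: lift its left child w, then fix the two sizes in O(1)."""
    w = y.left
    y.left, w.right = w.right, y
    y.size = 1 + size(y.left) + size(y.right)   # y first: it is now below w
    w.size = 1 + size(w.left) + size(w.right)
    return w


T = OSNode
tree = T("m", T("c", T("b"), T("f", T("d"), T("h"))), T("p", None, T("q")))

print(select(tree, 4), rank(tree, "h"))   # 4th smallest key, rank of h
tree = rotate_right(tree)                 # c becomes the root
print(tree.key, tree.size, tree.right.size)
```

In sorted order the keys are b, c, d, f, h, m, p, q, so the fourth smallest is f and h has rank 5. After the rotation only the two rotated nodes need new sizes: c takes over the full size 8 and m drops to 6.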
