
Trees

[Figure: example tree rooted at "Make Money Fast!" with nodes "Stock Fraud", "Ponzi Scheme", and "Bank Robbery"]

2004 Goodrich, Tamassia

CS 600.226: Data Structures, Professor: Greg Hager

What is a Tree
- In computer science, a tree is an abstract model of a hierarchical structure
- A tree consists of nodes with a parent-child relation
- Applications:
  - Organization charts
  - File systems
  - Programming environments

[Figure: organization chart rooted at ComputersRUs with nodes Sales, Manufacturing, R&D, US, International, Europe, Asia, Canada, Laptops, and Desktops]

Formal Definition
- A tree T is a set of nodes such that T is
  - Empty, or
  - Has a distinguished node r, referred to as the root, and
  - Each node $v \neq r$ has a unique parent node w in T

Tree Terminology
- Root: node without parent (A)
- Internal node: node with at least one child (A, B, C, F)
- External node (a.k.a. leaf): node without children (E, I, J, K, G, H, D)
- Ancestors of a node: parent, grandparent, great-grandparent, etc.
- Depth of a node: number of ancestors
- Height of a tree: maximum depth of any node (3)
- Descendants of a node: children, grandchildren, great-grandchildren, etc.
- Siblings: nodes with the same parent (e.g., C & D)
- Subtree: tree consisting of a node and its descendants

[Figure: example tree with nodes A through K and one subtree highlighted]

More Terms
- A tree is ordered if the children of a node have a linear ordering
- An edge of a tree T is a pair of nodes (u,v) such that u is the parent of v or vice versa
- A path in T is a sequence of nodes such that any two adjacent nodes share an edge

Linked Structure for Trees


- A node is represented by an object storing
  - Element
  - Parent node
  - List of children nodes

[Figure: example tree and its linked-node representation]
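The three fields listed above can be sketched as a small Java class. This is only an illustration: the class name `TreeNode` and the helper `addChild` are assumptions made for the example, not part of the course code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical node class: stores an element, a parent link, and a list of children.
final class TreeNode<E> {
    E element;
    TreeNode<E> parent;                                   // null for the root
    final List<TreeNode<E>> children = new ArrayList<>();

    TreeNode(E element) { this.element = element; }

    // Creates a child holding the given element and links it in both directions.
    TreeNode<E> addChild(E childElement) {
        TreeNode<E> child = new TreeNode<>(childElement);
        child.parent = this;
        children.add(child);
        return child;
    }

    boolean isRoot()     { return parent == null; }
    boolean isExternal() { return children.isEmpty(); }
}
```

Keeping the parent link makes upward walks (for example, computing depth) straightforward, at the cost of one extra reference per node.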

Tree ADT
- We use positions to abstract nodes
- Generic methods:
  - integer size()
  - boolean isEmpty()
  - Iterator iterator()
  - Iterator positions()
- Accessor methods:
  - position root()
  - position parent(p)
  - positionIterator children(p)
- Query methods:
  - boolean isInternal(p)
  - boolean isExternal(p)
  - boolean isRoot(p)
- Update method:
  - object replace(p, o)
- Additional update methods may be defined by data structures implementing the Tree ADT

Preorder Traversal
- A traversal visits the nodes of a tree in a systematic manner
- In a preorder traversal, a node is visited before its descendants
- Application: print a structured document

Algorithm preOrder(v)
    visit(v)
    for each child w of v
        preOrder(w)

[Figure: the document tree "Make Money Fast!" with sections 1. Motivations (1.1 Greed, 1.2 Avidity), 2. Methods (2.1 Stock Fraud, 2.2 Ponzi Scheme, 2.3 Bank Robbery), and References, numbered in preorder]
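A minimal Java sketch of the preorder algorithm follows. The `Node` class and the list-collecting `preOrder` overload are assumptions made for this example; here "visiting" a node simply records its label.

```java
import java.util.ArrayList;
import java.util.List;

final class PreorderDemo {
    // Hypothetical minimal node: a label plus an ordered list of children.
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();
        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }
    }

    // Visits v before its descendants, appending labels in visit order.
    static void preOrder(Node v, List<String> out) {
        out.add(v.label);                 // visit(v)
        for (Node w : v.children) {
            preOrder(w, out);
        }
    }

    // Convenience wrapper returning the full visit order.
    static List<String> preOrder(Node root) {
        List<String> out = new ArrayList<>();
        preOrder(root, out);
        return out;
    }
}
```

Applied to a document tree, the visit order matches the order in which the headings would be printed.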

Parenthetical Representation
Parenthetical(T, v)
    s = toString(v)
    if (T.isInternal(v))
        s += "("
        for (Node<E> w : T.children(v))
            s += Parenthetical(T, w) + " "
        s += ")"
    return s

Postorder Traversal
- In a postorder traversal, a node is visited after its descendants
- Application: compute space used by files in a directory and its subdirectories

Algorithm postOrder(v)
    for each child w of v
        postOrder(w)
    visit(v)

[Figure: directory tree cs16/ containing homeworks/ (h1c.doc 3K, h1nc.doc 2K), programs/ (DDR.java 10K, Stocks.java 25K, Robot.java 20K), and todo.txt 1K, numbered in postorder]
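The disk-usage application can be sketched in Java as below. The `Entry` class is hypothetical, and the sketch assumes directories themselves occupy 0K, so only file sizes are summed.

```java
import java.util.ArrayList;
import java.util.List;

final class DiskUsage {
    // Hypothetical file-system entry: a name, its own size in KB, and child entries.
    static final class Entry {
        final String name;
        final int sizeKb;
        final List<Entry> children = new ArrayList<>();
        Entry(String name, int sizeKb, Entry... kids) {
            this.name = name;
            this.sizeKb = sizeKb;
            for (Entry k : kids) children.add(k);
        }
    }

    // Postorder: total the children first, then "visit" v by adding its own size.
    static int totalKb(Entry v) {
        int total = 0;
        for (Entry w : v.children) {
            total += totalKb(w);
        }
        return total + v.sizeKb;
    }
}
```

The parent's total cannot be known until every child has reported its own, which is why postorder is the natural fit here.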

Other Possibilities
- Breadth-first
  - visit all nodes at level d before visiting level d+1
- In-order/mixed
  - visit some children, then parent, then remaining children
- ...

Some Natural Recursions


- depth: the number of ancestors of a node

public static <E> int depth(Tree<E> T, Node<E> v) {
    if (T.isRoot(v)) return 0;                // the root has no ancestors
    else return 1 + depth(T, T.parent(v));    // one more than the parent's depth
}

Some Natural Recursions


- height: the length of the longest path from a node down to an external node
  - note: height = depth of some external node (the deepest one)

public static <E> int height(Tree<E> T, Node<E> v) {
    if (T.isExternal(v)) return 0;            // a leaf has height 0
    int h = 0;
    for (Node<E> w : T.children(v))
        h = Math.max(h, height(T, w));        // tallest child subtree
    return 1 + h;
}

What is the complexity of this?
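The two recursions can be tried out with the self-contained sketch below. It uses a hypothetical `Node` class with a parent link and a child list rather than the Tree ADT from the slides. Under these assumptions, `depth` walks one parent chain, and `height` visits each node of the subtree once, so it runs in time linear in the size of that subtree.

```java
import java.util.ArrayList;
import java.util.List;

final class DepthHeight {
    // Hypothetical node with a parent link and a list of children.
    static final class Node {
        Node parent;
        final List<Node> children = new ArrayList<>();
        Node addChild() {
            Node c = new Node();
            c.parent = this;
            children.add(c);
            return c;
        }
    }

    // Number of ancestors of v.
    static int depth(Node v) {
        return v.parent == null ? 0 : 1 + depth(v.parent);
    }

    // Length of the longest downward path from v to a leaf.
    static int height(Node v) {
        int h = 0;
        for (Node w : v.children) {
            h = Math.max(h, 1 + height(w));
        }
        return h;
    }
}
```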

Binary Trees
- A binary tree is a tree with the following properties:
  - Each internal node has at most two children (exactly two for proper binary trees)
  - The children of a node are an ordered pair
- We call the children of an internal node left child and right child
- Alternative recursive definition: a binary tree is either
  - a tree consisting of a single node, or
  - a tree whose root has an ordered pair of children, each of which is a binary tree
- Applications:
  - arithmetic expressions
  - decision processes
  - searching

[Figure: example binary tree]

Arithmetic Expression Tree


- Binary tree associated with an arithmetic expression
  - internal nodes: operators
  - external nodes: operands
- Example: arithmetic expression tree for the expression (2 × (a - 1) + (3 × b))

[Figure: expression tree with + at the root, operators × and -, and operands 2, a, 1, 3, b]

Constructing an Expression Tree
- Given a postfix string, construct the expression tree
- Use a stack to hold intermediate results
- Think of it like evaluation, except the operation is to construct the tree
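The stack-based construction can be sketched as follows. The class and method names are made up for this example, and the sketch assumes the postfix string is well formed, space separated, and uses only the binary operators +, -, *, and /.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class ExprTreeBuilder {
    // Hypothetical expression-tree node: operators are internal, operands are leaves.
    static final class Node {
        final String token;
        final Node left, right;
        Node(String token, Node left, Node right) {
            this.token = token;
            this.left = left;
            this.right = right;
        }
        boolean isExternal() { return left == null && right == null; }
    }

    static boolean isOperator(String t) {
        return t.equals("+") || t.equals("-") || t.equals("*") || t.equals("/");
    }

    // Builds the tree from a space-separated postfix string.
    static Node fromPostfix(String postfix) {
        Deque<Node> stack = new ArrayDeque<>();
        for (String t : postfix.trim().split("\\s+")) {
            if (isOperator(t)) {
                Node right = stack.pop();          // the right operand was pushed last
                Node left = stack.pop();
                stack.push(new Node(t, left, right));
            } else {
                stack.push(new Node(t, null, null));
            }
        }
        return stack.pop();                        // the single remaining subtree is the root
    }

    // Fully parenthesized inorder rendering, useful for checking the result.
    static String toInfix(Node v) {
        if (v.isExternal()) return v.token;
        return "(" + toInfix(v.left) + " " + v.token + " " + toInfix(v.right) + ")";
    }
}
```

Where an evaluator would push a computed value, this builder pushes a subtree; the control flow is otherwise the same.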

Decision Tree
- Binary tree associated with a decision process
  - internal nodes: questions with yes/no answer
  - external nodes: decisions
- Example: dining decision

[Figure: decision tree with root question "Want a fast meal?". Yes leads to "How about coffee?" (Yes: Starbucks, No: Spike's); No leads to "On expense account?" (Yes: Al Forno, No: Café Paragon)]

Properties of Proper Binary Trees


- Notation
  - n: number of nodes
  - e: number of external nodes
  - i: number of internal nodes
  - h: height
- Properties:
  - $e = i + 1$
  - $n = 2e - 1$
  - $h \le i$
  - $h \le (n - 1)/2$
  - $e \le 2^h$
  - $h \ge \log_2 e$
  - $h \ge \log_2 (n + 1) - 1$

BinaryTree ADT
- The BinaryTree ADT extends the Tree ADT, i.e., it inherits all the methods of the Tree ADT
- Additional methods:
  - position left(p)
  - position right(p)
  - boolean hasLeft(p)
  - boolean hasRight(p)
- Update methods may be defined by data structures implementing the BinaryTree ADT

Linked Structure for Binary Trees


- A node is represented by an object storing
  - Element
  - Parent node
  - Left child node
  - Right child node
- Node objects implement the Position ADT

[Figure: example binary tree and its linked-node representation]

Array-Based Representation of Binary Trees
- Nodes are stored in an array
- Let rank(node) be defined as follows:
  - rank(root) = 1
  - if node is the left child of parent(node), rank(node) = 2*rank(parent(node))
  - if node is the right child of parent(node), rank(node) = 2*rank(parent(node)) + 1

[Figure: example binary tree with each node labeled by its rank]
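The rank rules translate directly into index arithmetic. The helper names below are assumptions made for this sketch; the parent formula follows from the two child rules because integer division by 2 undoes both `2*r` and `2*r + 1`.

```java
final class ArrayBinaryTree {
    // Rank of the left child of the node stored at the given rank.
    static int rankOfLeft(int rank) { return 2 * rank; }

    // Rank of the right child of the node stored at the given rank.
    static int rankOfRight(int rank) { return 2 * rank + 1; }

    // Rank of the parent; only meaningful for rank > 1 (the root has rank 1).
    static int rankOfParent(int rank) { return rank / 2; }
}
```

With this layout the node of rank r lives at array index r, so index 0 is unused and no child or parent pointers need to be stored.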

Inorder Traversal
- In an inorder traversal a node is visited after its left subtree and before its right subtree
- Application: draw a binary tree
  - x(v) = inorder rank of v
  - y(v) = depth of v

Algorithm inOrder(v)
    if hasLeft(v)
        inOrder(left(v))
    visit(v)
    if hasRight(v)
        inOrder(right(v))

[Figure: binary tree drawn with nodes numbered by inorder rank]

Print Arithmetic Expressions


- Specialization of an inorder traversal
  - print operand or operator when visiting node
  - print "(" before traversing left subtree
  - print ")" after traversing right subtree

Algorithm printExpression(v)
    if hasLeft(v)
        print("(")
        printExpression(left(v))
    print(v.element())
    if hasRight(v)
        printExpression(right(v))
        print(")")

Output for the example tree: ((2 × (a - 1)) + (3 × b))

Evaluate Arithmetic Expressions


- Specialization of a postorder traversal
  - recursive method returning the value of a subtree
  - when visiting an internal node, combine the values of the subtrees

Algorithm evalExpr(v)
    if isExternal(v)
        return v.element()
    else
        x ← evalExpr(leftChild(v))
        y ← evalExpr(rightChild(v))
        ◊ ← operator stored at v
        return x ◊ y

[Figure: example arithmetic expression tree with numeric operands]
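A runnable Java version of evalExpr might look like the sketch below. The `Node` class is hypothetical, and the sketch assumes operands are numeric strings and that only +, -, *, and / appear as operators.

```java
final class ExprEval {
    // Hypothetical expression-tree node: leaves hold numbers, internal nodes hold operators.
    static final class Node {
        final String token;
        final Node left, right;
        Node(String token) { this(token, null, null); }
        Node(String token, Node left, Node right) {
            this.token = token;
            this.left = left;
            this.right = right;
        }
        boolean isExternal() { return left == null && right == null; }
    }

    // Postorder evaluation: both subtrees are evaluated before the operator is applied.
    static double evalExpr(Node v) {
        if (v.isExternal()) return Double.parseDouble(v.token);
        double x = evalExpr(v.left);
        double y = evalExpr(v.right);
        switch (v.token) {
            case "+": return x + y;
            case "-": return x - y;
            case "*": return x * y;
            case "/": return x / y;
            default: throw new IllegalArgumentException("unknown operator: " + v.token);
        }
    }
}
```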

Binary Search Trees


Binary Search
- Binary search can perform operation find(k) on a dictionary implemented by means of an array-based sequence, sorted by key
  - similar to the high-low game
  - at each step, the number of candidate items is halved
  - terminates after O(log n) steps
- Example: find(7)

[Figure: successive steps of binary search for key 7 in a sorted array; the low index l, middle index m, and high index h narrow until l = m = h]
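A minimal iterative sketch of find(k) on a sorted int array is given below. The method name and the convention of returning -1 for a missing key are choices made for this example.

```java
final class BinarySearchDemo {
    // Returns the index of key in the sorted array a, or -1 if key is absent.
    static int find(int[] a, int key) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;     // midpoint without int overflow
            if (key < a[mid]) {
                hi = mid - 1;                 // discard the upper half
            } else if (key > a[mid]) {
                lo = mid + 1;                 // discard the lower half
            } else {
                return mid;
            }
        }
        return -1;
    }
}
```

Each iteration halves the candidate range, which is where the O(log n) bound comes from.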

Binary Search Trees
- A binary search tree is a binary tree storing keys (or key-value entries) at its internal nodes and satisfying the following property:
  - Let u, v, and w be three nodes such that u is in the left subtree of v and w is in the right subtree of v. We have $key(u) \le key(v) \le key(w)$
- External nodes do not store items (a proper tree)
- An inorder traversal of a binary search tree visits the keys in increasing order

[Figure: example binary search tree with keys 6, 2, 9, 1, 4]
BST Operations
public class BinarySearchTree<AnyType extends Comparable<? super AnyType>>

// void insert( x )          --> Insert x
// void remove( x )          --> Remove x
// boolean contains( x )     --> Return true if x is present
// Comparable findMin( )     --> Return smallest item
// Comparable findMax( )     --> Return largest item
// boolean isEmpty( )        --> Return true if empty; else false
// void makeEmpty( )         --> Remove all items
// void printTree( )         --> Print tree in sorted order

Contains
- To search for a key k, we trace a downward path starting at the root
- The next node visited depends on the outcome of the comparison of k with the key of the current node
- If we reach a leaf, the key is not found and we return false
- Example: contains(4):
  - Call TreeSearch(4, root)

Algorithm TreeSearch(k, v)
    if T.isExternal(v)
        return false
    if k < key(v)
        return TreeSearch(k, T.left(v))
    else if k = key(v)
        return true
    else { k > key(v) }
        return TreeSearch(k, T.right(v))

[Figure: search path for key 4 in an example binary search tree]

Insertion
- To perform operation insert(k, o), we search for key k (using TreeSearch)
- Assume k is not already in the tree, and let w be the leaf reached by the search
- We insert k at node w and expand w into an internal node
- Example: insert 5

[Figure: search path for key 5 ending at leaf w, and the tree after w is expanded into an internal node storing 5]

// Inserts x into the subtree rooted at t and returns the new subtree root.
protected BinaryNode<AnyType> insert( AnyType x, BinaryNode<AnyType> t )
{
    if( t == null )
        t = new BinaryNode<AnyType>( x );                    // empty spot found: create the node here
    else if( x.compareTo( t.element ) < 0 )
        t.left = insert( x, t.left );                        // x belongs in the left subtree
    else if( x.compareTo( t.element ) > 0 )
        t.right = insert( x, t.right );                      // x belongs in the right subtree
    else
        throw new DuplicateItemException( x.toString( ) );   // x is already present
    return t;
}

Deletion
- To perform operation remove(k), we search for key k
- Assume key k is in the tree, and let v be the node storing k
- If node v has a leaf child w, we remove v and w from the tree with operation removeExternal(w), which removes w and its parent
- Example: remove 4

[Figure: the example tree before and after removing 4]

Deletion (cont.)
- We consider the case where the key k to be removed is stored at a node v whose children are both internal
  - we find the internal node w that follows v in an inorder traversal
  - we copy key(w) into node v
  - we remove node w and its left child z (which must be a leaf) by means of operation removeExternal(z)
- Example: remove 3

[Figure: the example tree before and after removing 3, with leaf z marked]

// Removes x from the subtree rooted at t and returns the new subtree root.
protected BinaryNode<AnyType> remove( AnyType x, BinaryNode<AnyType> t )
{
    if( t == null )
        throw new ItemNotFoundException( x.toString( ) );
    if( x.compareTo( t.element ) < 0 )
        t.left = remove( x, t.left );
    else if( x.compareTo( t.element ) > 0 )
        t.right = remove( x, t.right );
    else if( t.left != null && t.right != null ) // Two children
    {
        t.element = findMin( t.right ).element;  // copy the inorder successor into t
        t.right = removeMin( t.right );          // then delete the successor node
    }
    else
        t = ( t.left != null ) ? t.left : t.right;   // zero or one child: splice t out
    return t;
}
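The insert and remove routines can be exercised end to end with the self-contained sketch below. It is a simplified stand-in rather than the course's BinarySearchTree class: it stores int keys, uses null links in place of external nodes, and silently ignores duplicate inserts and missing removals instead of throwing. The class name `IntBst` is an assumption for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class IntBst {
    private static final class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    void insert(int x) { root = insert(x, root); }
    void remove(int x) { root = remove(x, root); }

    // Follows a single downward path from the root, as in TreeSearch.
    boolean contains(int x) {
        Node t = root;
        while (t != null) {
            if (x < t.key) t = t.left;
            else if (x > t.key) t = t.right;
            else return true;
        }
        return false;
    }

    // Inorder traversal: yields the keys in increasing order.
    List<Integer> inorder() {
        List<Integer> out = new ArrayList<>();
        inorder(root, out);
        return out;
    }

    private static Node insert(int x, Node t) {
        if (t == null) return new Node(x);
        if (x < t.key) t.left = insert(x, t.left);
        else if (x > t.key) t.right = insert(x, t.right);
        // equal key: tree left unchanged (assumption of this sketch)
        return t;
    }

    private static Node remove(int x, Node t) {
        if (t == null) return null;                 // key absent: nothing to do
        if (x < t.key) t.left = remove(x, t.left);
        else if (x > t.key) t.right = remove(x, t.right);
        else if (t.left != null && t.right != null) {
            Node min = t.right;                     // inorder successor of t
            while (min.left != null) min = min.left;
            t.key = min.key;
            t.right = remove(min.key, t.right);
        } else {
            t = (t.left != null) ? t.left : t.right;
        }
        return t;
    }

    private static void inorder(Node t, List<Integer> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.key);
        inorder(t.right, out);
    }
}
```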

Performance
- Consider a dictionary with n items implemented by means of a binary search tree of height h
  - the space used is O(n)
  - methods find, insert and remove take O(h) time
- The height h is O(n) in the worst case and O(log n) in the best case

AVL Trees
- Binary search trees which maintain O(log n) height
- Maintain height balance property
  - Heights of children differ by at most 1
  - Local property to maintain, but guarantees global property of overall height

Analyzing AVL height


- $n(h)$: minimum number of nodes in an AVL tree of height $h$
- Base conditions
  - $n(1) = 1$; $n(2) = 2$
- Recurrence relation
  - $n(h) = 1 + n(h-1) + n(h-2) > 2\,n(h-2)$
  - $n(h) > 2\,n(h-2) > 4\,n(h-4) > 8\,n(h-6)$, etc.
  - $n(h) > 2^i\,n(h-2i)$
- Set $i$ to achieve the base condition
  - $h - 2i = 2 \Rightarrow i = (h-2)/2 = h/2 - 1$
  - $n(h) > 2^{(h-2)/2}\,n(2) = 2^{h/2-1} \cdot 2 = 2^{h/2}$
- Bounding $h$
  - $\log_2 n(h) > h/2$
  - $h < 2 \log_2 n(h)$, so $h$ is $O(\log n)$
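The recurrence can be checked numerically with a few lines of Java. The method name `minNodes` is an assumption; the code simply evaluates n(h) from the base conditions so the values can be compared against the bound $2^{h/2}$.

```java
final class AvlMinNodes {
    // Minimum number of nodes in an AVL tree of height h (h >= 1),
    // using n(1) = 1, n(2) = 2, n(h) = 1 + n(h-1) + n(h-2).
    static long minNodes(int h) {
        if (h == 1) return 1;
        if (h == 2) return 2;
        return 1 + minNodes(h - 1) + minNodes(h - 2);
    }
}
```

For example, minNodes(3) is 4 and minNodes(4) is 7, both above $2^{3/2} \approx 2.83$ and $2^{2} = 4$ respectively.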

Inserting with balanced height


- Insert node into binary search tree as usual
  - Insert occurs at leaves
  - Increases height of some nodes along path to root
- Walk up towards root
  - If unbalanced height is found, restructure unbalanced region with rotation operation

Restructuring (as Single Rotations)
- Single rotations:

[Figure: two symmetric single rotations on nodes x, y, z with subtrees T0, T1, T2, T3. In one case a=z, b=y, c=x; in the other a=x, b=y, c=z. After the rotation, b=y is the topmost of the three nodes]

Restructuring (as Double Rotations)
- Double rotations:

[Figure: two symmetric double rotations on nodes x, y, z with subtrees T0, T1, T2, T3. In one case a=z, b=x, c=y; in the other a=y, b=x, c=z. After the rotation, b=x is the topmost of the three nodes]

Insertion in an AVL Tree


- Insertion is as in a binary search tree
- Always done by expanding an external node
- Example: insert 54

[Figure: AVL tree with keys 44, 17, 78, 32, 50, 88, 48, 62 before insertion, and the same tree after inserting 54, with nodes labeled c=z, a=y, and b=x]

Balanced Tree
[Figure: balanced tree with keys 15, 30, 35, 40, 45, 50, 60, 65, 70, 75, 80]

Insert (case 1)
[Figure: the tree after inserting 43; one node is marked "Unbalanced node"]

Rotate Left . . .

[Figure: intermediate steps of the left rotation]

Rotate Left - Balanced!

[Figure: the rebalanced tree after the left rotation]

Insert (case 2)
[Figure: the tree after inserting 53; one node is marked "Unbalanced node"]

Rotate Right. . .

[Figure: intermediate steps of the right rotation]

Rotate Right - Balanced!

[Figure: the rebalanced tree after the right rotation]

Insert (case 3)
[Figure: the tree after inserting 33; one node is marked "Unbalanced node"]

Double Rotation Right-Left - Right . . .

[Figure: intermediate steps of the first (right) rotation]

Double Rotation Right-Left - Right . . .done!

[Figure: the tree after the first rotation]

Double Rotation Right-Left - Left . . .

[Figure: intermediate steps of the second (left) rotation]

Double Rotation Right-Left - Balanced!

[Figure: the rebalanced tree after the double rotation]

Restructure Procedure
- Consider the first unbalanced node encountered (walking upward) and its two descendants along that path
- Sort them in increasing order and label as a, b, and c
- Place b as the parent of a and c where the unbalanced node was
- Hook up the (up to) 4 subtrees as the appropriate children of a and c

Trinode Restructuring
- Let (a,b,c) be an inorder listing of x, y, z
- Perform the rotations needed to make b the topmost node of the three
  - case 1: single rotation (a left rotation about a)
  - case 2: double rotation (a right rotation about c, then a left rotation about a)
  - (the other two cases are symmetrical)

[Figure: case 1 with a=z, b=y, c=x and case 2 with a=z, b=x, c=y, each shown before and after restructuring with subtrees T0, T1, T2, T3]

Insertion Example, continued


[Figure: the example tree after inserting 54, unbalanced, with nodes x, y, z and subtrees T0 through T3 marked; and the same tree after restructuring, balanced]

Analyzing Insert
- Upward traversal with height recomputation takes O(h) = O(log n)
- Restructure takes O(1)
- The restructure always reduces the height of the unbalanced node
  - So only one restructure is necessary
- Total time: O(log n) + O(1) = O(log n)
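One common way to implement AVL insertion is the recursive sketch below, which recomputes heights on the way back up and applies a single or double rotation at the first unbalanced node. It is an illustration under stated assumptions, not the course implementation: keys are ints, duplicates are ignored, a single node has height 0, and a null link has height -1. The class name `IntAvl` is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class IntAvl {
    private static final class Node {
        int key;
        int height;                 // 0 for a node with no children
        Node left, right;
        Node(int key) { this.key = key; }
    }

    private Node root;

    void insert(int x) { root = insert(x, root); }
    int height() { return h(root); }

    List<Integer> inorder() {
        List<Integer> out = new ArrayList<>();
        inorder(root, out);
        return out;
    }

    private static int h(Node t) { return t == null ? -1 : t.height; }

    private static void update(Node t) {
        t.height = 1 + Math.max(h(t.left), h(t.right));
    }

    // Right rotation about y: y's left child x becomes the subtree root.
    private static Node rotateRight(Node y) {
        Node x = y.left;
        y.left = x.right;
        x.right = y;
        update(y);
        update(x);
        return x;
    }

    // Left rotation about x: x's right child y becomes the subtree root.
    private static Node rotateLeft(Node x) {
        Node y = x.right;
        x.right = y.left;
        y.left = x;
        update(x);
        update(y);
        return y;
    }

    private static Node insert(int x, Node t) {
        if (t == null) return new Node(x);
        if (x < t.key) t.left = insert(x, t.left);
        else if (x > t.key) t.right = insert(x, t.right);
        else return t;                                              // duplicate: ignored

        update(t);
        int balance = h(t.left) - h(t.right);
        if (balance > 1) {                                          // left side too tall
            if (x > t.left.key) t.left = rotateLeft(t.left);        // double rotation case
            return rotateRight(t);
        }
        if (balance < -1) {                                         // right side too tall
            if (x < t.right.key) t.right = rotateRight(t.right);    // double rotation case
            return rotateLeft(t);
        }
        return t;
    }

    private static void inorder(Node t, List<Integer> out) {
        if (t == null) return;
        inorder(t.left, out);
        out.add(t.key);
        inorder(t.right, out);
    }
}
```

Inserting keys in sorted order, which would degrade a plain binary search tree to height n - 1, keeps the height logarithmic here.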

Removal in an AVL Tree


- Removal begins as in a binary search tree, which means the node removed will become an empty external node. Its parent, w, may cause an imbalance.
- Example: delete 32

[Figure: AVL tree with keys 44, 17, 62, 32, 50, 78, 48, 54, 88 before deletion of 32, and the same tree after deletion]

Rebalancing after a Removal


- Let z be the first unbalanced node encountered while travelling up the tree from w. Also, let y be the child of z with the larger height, and let x be the child of y with the larger height (or on the same side as y if the heights are equal).
- We perform restructure(x) to restore balance at z.
- As this restructuring may upset the balance of another node higher in the tree, we must continue checking for balance until the root of T is reached.

[Figure: the example tree after deleting 32, with nodes w, a=z, b=y, and c=x marked, and the tree after restructuring]

Remove Algorithm
- Perform removal as with binary search tree
  - May decrease height of some nodes on path to the root
- Walk upwards to the root
  - If unbalanced height is found, restructure unbalanced region with rotation operation
- Remove is also O(log n)
  - But multiple restructure operations may be necessary along the way

Running Times for AVL Trees
- a single restructure is O(1)
  - using a linked-structure binary tree
- find is O(log n)
  - height of tree is O(log n), no restructures needed
- insert is O(log n)
  - initial find is O(log n)
  - restructuring up the tree, maintaining heights is O(log n)
- remove is O(log n)
  - initial find is O(log n)
  - restructuring up the tree, maintaining heights is O(log n)

Splay Trees

Splay Trees are Binary Search Trees

- BST Rules:
  - items stored only at internal nodes
  - keys stored at nodes in the left subtree of v are less than or equal to the key stored at v
  - keys stored at nodes in the right subtree of v are greater than or equal to the key stored at v
- An inorder traversal will return the keys in order
- Note that two keys of equal value may be well separated

[Figure: example splay tree rooted at (20,Z) with entries (10,A), (35,R), (14,J), (7,T), (21,O), (37,P), (1,Q), (8,N), (36,L), (40,X), (1,C), (5,H), (7,P), (10,U), (2,R), (5,G), (5,I), (6,Y). Shaded regions mark the keys ≤ 20 in the left subtree of the root and the keys ≥ 20 in the right subtree]

Splay Tree Properties


- Every access causes the deepest accessed node to be splayed to the top
- Roughly halves (on average) the depth of most nodes on the access path
- We can show M accesses take O(M log n) time
  - Amortized analysis shows O(log n) cost per access
- Any one access could be O(n) in the worst case
- Splaying looks a lot like AVL trees but with less algorithmic and storage overhead

Searching in a Splay Tree: Starts the Same as in a BST

- Search proceeds down the tree to the found item or an external node.
- Example: search for the item with key 11.

[Figure: search path for key 11 in the example splay tree rooted at (20,Z)]

Example Searching in a BST, continued

- Search for key 8 ends at an internal node.

[Figure: search path for key 8 in the example splay tree rooted at (20,Z)]

Splay Trees do Rotations after Every Operation (Even Search)
- New operation: splay
  - splaying moves a node to the root using rotations
- Right rotation
  - makes the left child x of a node y into y's parent; y becomes the right child of x
- Left rotation
  - makes the right child y of a node x into x's parent; x becomes the left child of y

[Figure: a right rotation about y and a left rotation about x, with subtrees T1, T2, T3; the structure of the tree above the rotated node is not modified]

Splaying:
- "x is a left-left grandchild" means x is a left child of its parent, which is itself a left child of its parent
  - p is x's parent; g is p's parent
- Start with node x and repeat:
  - Is x the root? If yes, stop.
  - Is x a child of the root?
    - If x is the left child of the root: zig (right-rotate about the root)
    - Otherwise: zig (left-rotate about the root)
  - Is x a left-left grandchild? zig-zig: right-rotate about g, then right-rotate about p
  - Is x a right-right grandchild? zig-zig: left-rotate about g, then left-rotate about p
  - Is x a right-left grandchild? zig-zag: left-rotate about p, then right-rotate about g
  - Is x a left-right grandchild? zig-zag: right-rotate about p, then left-rotate about g
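The decision procedure above can be sketched as a bottom-up splay on nodes with parent pointers. The names `SplayDemo`, `rotateUp`, and `splay` are assumptions for this example. `rotateUp(x)` performs the rotation about x's parent (a right rotation if x is a left child, a left rotation otherwise), so a zig-zig is "rotate about g, then about p" and a zig-zag is "rotate about p, then about g".

```java
final class SplayDemo {
    // Hypothetical node with parent pointer, as needed for bottom-up splaying.
    static final class Node {
        int key;
        Node left, right, parent;
        Node(int key) { this.key = key; }
    }

    // Rotates x above its parent p; the tree above p is otherwise unchanged.
    static void rotateUp(Node x) {
        Node p = x.parent;
        Node g = p.parent;
        if (p.left == x) {                  // right rotation about p
            p.left = x.right;
            if (x.right != null) x.right.parent = p;
            x.right = p;
        } else {                            // left rotation about p
            p.right = x.left;
            if (x.left != null) x.left.parent = p;
            x.left = p;
        }
        p.parent = x;
        x.parent = g;
        if (g != null) {
            if (g.left == p) g.left = x; else g.right = x;
        }
    }

    // Moves x to the root using zig, zig-zig, and zig-zag steps; returns x.
    static Node splay(Node x) {
        while (x.parent != null) {
            Node p = x.parent;
            Node g = p.parent;
            if (g == null) {
                rotateUp(x);                // zig: x is a child of the root
            } else if ((g.left == p) == (p.left == x)) {
                rotateUp(p);                // zig-zig: rotate about g first
                rotateUp(x);                // then about p
            } else {
                rotateUp(x);                // zig-zag: rotate about p
                rotateUp(x);                // then about g
            }
        }
        return x;
    }
}
```

Rotating about g before p in the zig-zig case is what distinguishes splaying from simply rotating x upward one level at a time.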

Visualizing the Splaying Cases

[Figure: the zig-zag, zig-zig, and zig cases, each shown before and after on nodes x, y, z with subtrees T1 through T4]

Splaying Example
- Let x = (8,N)
  - x is the right child of its parent, which is the left child of the grandparent
  - left-rotate around p, then right-rotate around g
- x is not yet the root, so we splay again

[Figure: three snapshots of the example tree with x, p, and g marked: 1. before rotating, 2. after the first rotation, 3. after the second rotation]

Splaying Example, Continued
- Now x is the left child of the root
  - right-rotate around the root
- x is the root, so stop

[Figure: two snapshots of the example tree: 1. before applying the rotation, 2. after the rotation, with (8,N) at the root]

Example Result of Splaying
- The tree might not be more balanced
- E.g., splay (40,X)
  - before, the depth of the shallowest leaf is 3 and the deepest is 7
  - after, the depth of the shallowest leaf is 1 and the deepest is 8

[Figure: three snapshots of the example tree: before, after first splay, after second splay]

Splay Tree Manipulation


- Search: splay the found node
- Insert: insert as in a normal binary search tree, then splay
- Delete:
  - Search (puts the node at the root)
  - Delete the root, yielding subtrees TL and TR
  - Access the largest node in TL, which gives a tree whose root has no right child
  - Put TR in as that root's right child

Splay Tree Definition


- A splay tree is a binary search tree where a node is splayed after it is accessed (for a search or update)
  - deepest internal node accessed is splayed
  - splaying costs O(h), where h is the height of the tree, which is still O(n) in the worst case
    - O(h) rotations, each of which is O(1)

Splay Trees & Ordered Dictionaries
- Which nodes are splayed after each operation?

method         | splay node
findElement    | if key found, use that node; if key not found, use parent of ending external node
insertElement  | use the new node containing the item inserted
removeElement  | use the parent of the internal node that was actually removed from the tree (the parent of the node that the removed item was swapped with)

Amortized Analysis of Splay Trees
- Running time of each operation is proportional to the time for splaying.
- Define rank(v) as the logarithm (base 2) of the number of nodes in the subtree rooted at v.
- Costs: zig = $1, zig-zig = $2, zig-zag = $2.
- Thus, the cost for splaying a node at depth d is $d.
- Imagine that we store rank(v) cyber-dollars at each node v of the splay tree (just for the sake of analysis).

Cost per zig


[Figure: the zig case on nodes x and y with subtrees T1 through T4, before and after the rotation]

- Doing a zig at x costs at most rank'(x) - rank(x), where rank' denotes the rank after the step:
  - cost = rank'(x) + rank'(y) - rank(y) - rank(x) < rank'(x) - rank(x).

Cost per zig-zig and zig-zag


[Figure: the zig-zig and zig-zag cases on nodes x, y, z with subtrees T1 through T4, before and after the rotations]

- Doing a zig-zig or zig-zag at x costs at most 3(rank'(x) - rank(x)) - 2, where rank' denotes the rank after the step.

Cost of Splaying
- Cost of splaying a node x at depth d of a tree rooted at r:
  - at most 3(rank(r) - rank(x)) - d + 2
  - Proof: splaying x takes d/2 splaying substeps:

\[
\begin{aligned}
\text{cost} &\le \sum_{i=1}^{d/2} \text{cost}_i \\
&\le \sum_{i=1}^{d/2} \bigl(3(\text{rank}_i(x) - \text{rank}_{i-1}(x)) - 2\bigr) + 2 \\
&= 3(\text{rank}(r) - \text{rank}_0(x)) - 2(d/2) + 2 \\
&\le 3(\text{rank}(r) - \text{rank}(x)) - d + 2.
\end{aligned}
\]

Performance of Splay Trees
- Recall: the rank of a node is the logarithm of its size.
- Thus, the amortized cost of any splay operation is O(log n).
- In fact, the analysis goes through for any reasonable definition of rank(x).
- Splay trees can actually adapt to perform searches on frequently-requested items much faster than O(log n) in some cases.

Multi-way Search Trees


- Each node may store multiple key-element pairs
- A node with d children (a d-node) stores d-1 key-element pairs
- Children have keys that fall either before the smallest parent key, after the largest parent key, or between two parent keys

Example Multi-way Search Tree

[Figure: multi-way search tree with root key 50 and nodes holding the keys (20 30), (60 70 80), (10 15), (25), (40 42 45), (55), (64 66), (75), (85 90), (22), (27)]

- There is an external node between each pair of keys, plus one before the first key and one after the last key
- (n-1) + 1 + 1 = n+1 external nodes

Multi-Way Inorder Traversal


- We can extend the notion of inorder traversal from binary trees to multi-way search trees
- Namely, we visit item $(k_i, o_i)$ of node $v$ between the recursive traversals of the subtrees of $v$ rooted at children $v_i$ and $v_{i+1}$
- An inorder traversal of a multi-way search tree visits the keys in increasing order

[Figure: inorder traversal of the multi-way search tree with keys 2, 6, 8, 11, 15, 24, 27, 30, 32, with the visit order numbered]

Multi-Way Searching
! Similar to search in a binary search tree
! At each internal node with children $v_1, v_2, \dots, v_d$ and keys $k_1, k_2, \dots, k_{d-1}$
  n $k = k_i$ ($i = 1, \dots, d-1$): the search terminates successfully
  n $k < k_1$: we continue the search in child $v_1$
  n $k_{i-1} < k < k_i$ ($i = 2, \dots, d-1$): we continue the search in child $v_i$
  n $k > k_{d-1}$: we continue the search in child $v_d$
! Reaching an external node terminates the search unsuccessfully
! Example: search for 30

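[Figure: search for 30 in the tree with root (11 24) and children (2 6 8), (15), (27 32). Since 30 > 24 the search continues in (27 32); since 27 < 30 < 32 it continues in the middle child (30), where it succeeds.]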
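
The search cases listed on this slide can be sketched in Python as below. The `Node` class, the zero-based indexing, and `None` as the external node are assumptions for illustration only; the linear scan over the keys corresponds to the O(d) per-node cost.

```python
from typing import List, Optional, Tuple


class Node:
    """Hypothetical multi-way node: sorted keys plus len(keys) + 1 children."""

    def __init__(self, keys: List[int],
                 children: Optional[List[Optional["Node"]]] = None):
        self.keys = keys
        # None stands for an external node.
        self.children = children if children is not None else [None] * (len(keys) + 1)


def search(node: Optional[Node], k: int) -> Optional[Tuple[Node, int]]:
    """Return (node, index) where k is stored, or None if the search fails."""
    while node is not None:
        i = 0
        # Skip every key smaller than k.
        while i < len(node.keys) and k > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == k:
            return node, i           # k equals a key: success
        node = node.children[i]      # descend into the child that brackets k
    return None                      # reached an external node: failure


# The tree from the slide: root (11 24), children (2 6 8), (15), (27 32), and (30) below (27 32).
tree = Node([11, 24], [
    Node([2, 6, 8]),
    Node([15]),
    Node([27, 32], [None, Node([30]), None]),
])
hit = search(tree, 30)
print(hit[0].keys if hit else None)  # [30]
print(search(tree, 12))              # None
```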

Multi-way Search Analysis


! Number of nodes traversed is up to h
! Work at each node is a function of d
  n O(log d) if the structure storing the keys provides efficient search, otherwise O(d)
! Total worst case time
  n $O(h \log d_{max})$ or $O(h\, d_{max})$
  n If $d_{max}$ is bounded by a small constant, just O(h)
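
As a sketch of the O(log d) case, a node whose keys are kept in a sorted array can be searched with binary search. The helper name `find_in_node` and the zero-based child index are assumptions for this example; `bisect_left` is from Python's standard library.

```python
from bisect import bisect_left
from typing import List, Tuple


def find_in_node(keys: List[int], k: int) -> Tuple[bool, int]:
    """Binary-search one node's sorted keys.

    Returns (found, i). If found, keys[i] == k; otherwise i is the
    zero-based index of the child in which the search should continue.
    """
    i = bisect_left(keys, k)
    return (i < len(keys) and keys[i] == k), i


print(find_in_node([60, 70, 80], 70))  # (True, 1)
print(find_in_node([60, 70, 80], 75))  # (False, 2)
```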


B-trees
! A B-tree of order M is a multi-way search tree (MWST) such that
  n The root has between 2 and M children
  n All nonleaf nodes other than the root have ceil(M/2) to M children
  n All leaves have the same depth
! B+ trees (in the book) have the additional properties that:
  n All data occurs at leaves
  n Leaves may have a different size than internal nodes.


How Many Nodes Can a B Tree Store?
! A (2,4) tree
! An (11,21) tree
! A (17,32) tree
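
One way to approach the question is to bound the number of items by the number of external nodes. The sketch below assumes each pair is read as (minimum children, maximum children), that the root has at least 2 children, and that all external nodes sit at depth `height`; it uses the fact that a multi-way search tree with n items has n + 1 external nodes. The function name and the chosen height of 3 are illustrative.

```python
from typing import Tuple


def item_capacity(a: int, b: int, height: int) -> Tuple[int, int]:
    """Return (min_items, max_items) for an (a,b) tree of the given height.

    Assumes height >= 1, a root with at least 2 children, and every other
    internal node with between a and b children.
    """
    min_external = 2 * a ** (height - 1)   # fewest possible external nodes
    max_external = b ** height             # most possible external nodes
    # n items correspond to n + 1 external nodes.
    return min_external - 1, max_external - 1


for a, b in [(2, 4), (11, 21), (17, 32)]:
    low, high = item_capacity(a, b, height=3)
    print(f"({a},{b}) tree of height 3: between {low} and {high} items")
```

For the (2,4) case the lower bound is 7, which is the value of 2^h - 1 at h = 3.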


(2,4) Trees
! A (2,4) tree (also called 2-4 tree or 2-3-4 tree) is a multi-way search tree with the following properties
  n Node-Size Property: every internal node has at most four children
  n Depth Property: all the external nodes have the same depth
! Depending on the number of children, an internal node of a (2,4) tree is called a 2-node, 3-node or 4-node

[Figure: a (2,4) tree with root (10 15 24) and children (2 8), (12), (18), (27 32).]

Height of a (2,4) Tree


! Theorem: A (2,4) tree storing n items has height O(log n)
Proof:
  n Let h be the height of a (2,4) tree with n items
  n Since there are at least $2^i$ items at depth $i = 0, \dots, h-1$ and no items at depth h, we have
    $n \ge 1 + 2 + 4 + \dots + 2^{h-1} = 2^h - 1$
  n Thus, $h \le \log (n + 1)$
! Searching in a (2,4) tree with n items takes O(log n) time


[Figure: minimum number of items per depth in a (2,4) tree: 1 at depth 0, 2 at depth 1, ..., $2^{h-1}$ at depth h-1, and 0 at depth h.]

Insertion
! We insert a new item (k, o) at the parent v of the leaf reached by searching for k
  n We preserve the depth property but
  n We may cause an overflow (i.e., node v may become a 5-node)
! Example: inserting key 30 causes an overflow


[Figure: inserting 30. Before: root (10 15 24) with children (2 8), (12), (18), (27 32 35). After: the last child becomes the 5-node (27 30 32 35).]

Overflow and Split


! We handle an overflow at a 5-node v with a split operation:
  n let $v_1 \dots v_5$ be the children of v and $k_1 \dots k_4$ be the keys of v
  n node v is replaced by nodes v' and v"
    w v' is a 3-node with keys $k_1, k_2$ and children $v_1, v_2, v_3$
    w v" is a 2-node with key $k_4$ and children $v_4, v_5$
  n key $k_3$ is inserted into the parent u of v (a new root may be created)
! The overflow may propagate to the parent node u
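
The key movement in a split can be illustrated on a plain list of keys. This is only a sketch of the rule stated above; the function name `split_keys` is hypothetical and children are left out for brevity.

```python
from typing import List, Tuple


def split_keys(keys: List[int]) -> Tuple[List[int], int, List[int]]:
    """Split the four keys k1..k4 of an overflowing 5-node.

    Returns (keys of v', key moved up to the parent u, keys of v'').
    """
    if len(keys) != 4:
        raise ValueError("a 5-node holds exactly four keys")
    k1, k2, k3, k4 = keys
    return [k1, k2], k3, [k4]


left, up, right = split_keys([27, 30, 32, 35])
print(left, up, right)  # [27, 30] 32 [35]
```

With the slide's example node (27 30 32 35), key 32 moves into the parent, which then holds (15 24 32).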

[Figure: split. Before: u = (15 24) with children (12), (18), and the 5-node v = (27 30 32 35) with children v1 to v5. After: u = (15 24 32) with children (12), (18), v' = (27 30) with children v1 to v3, and v" = (35) with children v4 and v5.]

Simple Insertion (no overflow)


[Figure: Insert 15. Before: root (10) with child (12 14). After: root (10) with child (12 14 15); no overflow occurs. Only the affected nodes are listed.]

Insertion with Overflow


[Figure: Insert 11. Before: root (10) with child (12 14 15). After the insertion the child is the 5-node (11 12 14 15). Split: 14 moves up, giving root (10 14) with children (11 12) and (15). Only the affected nodes are listed.]

Insert with Cascading Split


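[Figure: Insert 11. Before: root (6 8 10) with rightmost child (12 14 15). After the insertion the child is the 5-node (11 12 14 15). First split: 14 moves up, giving root (6 8 10 14) with children (11 12) and (15). Second split: the overflowing root splits into (6 8) and (14) under a new root (10). Only the affected nodes are listed.]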

Analysis of Insertion
Algorithm insert(k, o)
1. We search for key k to locate the insertion node v
2. We add the new entry (k, o) at node v
3. while overflow(v)
       if isRoot(v)
           create a new empty root above v
       v ← split(v)

! Let T be a (2,4) tree with n items
  n Tree T has O(log n) height
  n Step 1 takes O(log n) time because we visit O(log n) nodes
  n Step 2 takes O(1) time
  n Step 3 takes O(log n) time because each split takes O(1) time and we perform O(log n) splits
! Thus, an insertion in a (2,4) tree takes O(log n) time
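
The three steps of the insertion algorithm can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the textbook's code: the class names `Node24` and `Tree24` are hypothetical, only keys are stored (no elements), duplicate keys are ignored, and an empty `children` list marks a node on the lowest internal level.

```python
from bisect import bisect_left
from typing import Iterator, List, Optional


class Node24:
    """Hypothetical (2,4)-tree node; empty children = lowest internal level."""

    def __init__(self, keys=None, children=None, parent=None):
        self.keys: List[int] = keys or []
        self.children: List["Node24"] = children or []
        self.parent: Optional["Node24"] = parent
        for child in self.children:
            child.parent = self


def split(v: Node24) -> Node24:
    """Replace v by v' and v'', push k3 into the parent u, and return u."""
    u = v.parent
    left = Node24(v.keys[:2], v.children[:3], u)    # v': keys k1 k2
    right = Node24(v.keys[3:], v.children[3:], u)   # v'': key k4
    pos = u.children.index(v)
    u.children[pos:pos + 1] = [left, right]
    u.keys.insert(pos, v.keys[2])                   # k3 moves up
    return u


class Tree24:
    def __init__(self) -> None:
        self.root = Node24()

    def insert(self, k: int) -> None:
        # Step 1: search for k to locate the insertion node v.
        v = self.root
        while True:
            i = bisect_left(v.keys, k)
            if i < len(v.keys) and v.keys[i] == k:
                return                  # assumption: duplicates are ignored
            if not v.children:
                break
            v = v.children[i]
        # Step 2: add the new key at node v.
        v.keys.insert(i, k)
        # Step 3: split while v overflows (holds four keys).
        while len(v.keys) > 3:
            if v.parent is None:
                self.root = Node24(children=[v])    # new empty root above v
            v = split(v)


def inorder(node: Node24) -> Iterator[int]:
    """Yield the keys of the subtree rooted at node in increasing order."""
    for i, key in enumerate(node.keys):
        if node.children:
            yield from inorder(node.children[i])
        yield key
    if node.children:
        yield from inorder(node.children[-1])


t = Tree24()
for key in [10, 12, 14, 15, 5, 11]:
    t.insert(key)
print(t.root.keys, [c.keys for c in t.root.children])
```

Each pass through the loop in step 3 moves one level up, so the number of splits is bounded by the height of the tree.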


Deletion
! We reduce deletion of an entry to the case where the item is at a node with leaf children
! Otherwise, we replace the entry with its inorder successor (or, equivalently, with its inorder predecessor) and delete the latter entry
! Example: to delete key 24, we replace it with 27 (inorder successor)
[Figure: deleting 24. Before: root (10 15 24) with children (2 8), (12), (18), (27 32 35). After: root (10 15 27) with children (2 8), (12), (18), (32 35).]

Underflow and Fusion


! Deleting an entry from a node v may cause an underflow, where node v becomes a 1-node with one child and no keys
! To handle an underflow at node v with parent u, we consider two cases
! Case 1: the adjacent siblings of v are 2-nodes
  n Fusion operation: we merge v with an adjacent sibling w and move an entry from u to the merged node v'
  n After a fusion, the underflow may propagate to the parent u

[Figure: fusion. Before: u = (9 14) with children (2 5 7), w = (10), and the empty node v. After: u = (9) with children (2 5 7) and v' = (10 14).]

Underflow and Transfer


! To handle an underflow at node v with parent u, we consider two cases
! Case 2: an adjacent sibling w of v is a 3-node or a 4-node
  n Transfer operation:
    1. we move a child of w to v
    2. we move an item from u to v
    3. we move an item from w to u
  n After a transfer, no underflow occurs
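
Both underflow cases can be sketched in one Python function. This is an illustration under stated assumptions: the `Node` class and the name `fix_underflow` are hypothetical, only keys are stored, and when both siblings are 2-nodes the sketch fuses with the right sibling if there is one (the slides only require "an adjacent sibling").

```python
from typing import List, Optional


class Node:
    """Hypothetical (2,4)-tree node; children is empty at the lowest internal level."""

    def __init__(self, keys: List[int], children: Optional[List["Node"]] = None):
        self.keys = keys
        self.children = children or []


def fix_underflow(u: Node, i: int) -> str:
    """Repair the underflowing child v = u.children[i] (v holds no keys).

    Returns "transfer" or "fusion". After a fusion u has lost a key and may
    itself underflow; a caller would then repeat the repair one level up.
    """
    v = u.children[i]
    # Case 2: an adjacent sibling w is a 3-node or 4-node -> transfer.
    if i > 0 and len(u.children[i - 1].keys) >= 2:
        w = u.children[i - 1]                        # left sibling
        if w.children:
            v.children.insert(0, w.children.pop())   # 1. a child of w moves to v
        v.keys.insert(0, u.keys[i - 1])              # 2. an item moves from u to v
        u.keys[i - 1] = w.keys.pop()                 # 3. an item moves from w to u
        return "transfer"
    if i + 1 < len(u.children) and len(u.children[i + 1].keys) >= 2:
        w = u.children[i + 1]                        # right sibling
        if w.children:
            v.children.append(w.children.pop(0))
        v.keys.append(u.keys[i])
        u.keys[i] = w.keys.pop(0)
        return "transfer"
    # Case 1: the adjacent siblings are 2-nodes -> fusion.
    if i + 1 < len(u.children):
        w = u.children[i + 1]                        # merge v into right sibling
        w.keys.insert(0, u.keys.pop(i))              # separating item moves down
        w.children[0:0] = v.children
    else:
        w = u.children[i - 1]                        # merge v into left sibling
        w.keys.append(u.keys.pop(i - 1))
        w.children.extend(v.children)
    del u.children[i]
    return "fusion"


# Transfer: u = (4 9), w = (6 8), v empty; the leftmost child (2) is assumed.
u = Node([4, 9], [Node([2]), Node([6, 8]), Node([])])
print(fix_underflow(u, 2), u.keys, [c.keys for c in u.children])

# Fusion: u = (9 14) with children (2 5 7), (10), and the empty node v.
u2 = Node([9, 14], [Node([2, 5, 7]), Node([10]), Node([])])
print(fix_underflow(u2, 2), u2.keys, [c.keys for c in u2.children])
```

The sketch checks for a possible transfer first because a transfer never propagates, whereas a fusion removes a key from u.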

[Figure: transfer. Before: u = (4 9), w = (6 8), v empty. After: u = (4 8), w = (6), and v holds 9.]

Simple Removal
[Figure: Remove 14. Before: root (6 8 10) with rightmost child (12 14 15). After: the child is (12 15); no underflow occurs. Only the affected nodes are listed.]

Removal with Swap


[Figure: Remove 10. Before: root (6 8 10) with rightmost child (12 14 15). Swap: 10 is replaced by its inorder successor 12, giving root (6 8 12) with rightmost child (14 15). Only the affected nodes are listed.]

Removal with Transfer


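[Figure: Remove 9. Before: root (6 8 10) whose children include (9) and (12 14 15). Removing 9 empties its node. Transfer (~rotate): 10 moves down into the empty node and 12 moves up, giving root (6 8 12) with children (10) and (14 15) in place of the two affected nodes.]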

Removal with Fusion


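[Figure: Remove 7. Before: root (6 8 10) whose children include (5), (7), (9) and (12 14 15). Removing 7 empties its node. Fusion: the empty node merges with sibling (9) and 8 moves down, giving root (6 10) with children (5), (8 9), (12 14 15).]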

Analysis of Deletion
! Let T be a (2,4) tree with n items
  n Tree T has O(log n) height
! In a deletion operation
  n We visit O(log n) nodes to locate the node from which to delete the entry
  n We handle an underflow with a series of O(log n) fusions, followed by at most one transfer
  n Each fusion and transfer takes O(1) time
! Thus, deleting an item from a (2,4) tree takes O(log n) time


External Memory Searching


! Memory Hierarchy
  n Registers
  n Cache
  n RAM
  n External Memory

Types of External Memory


! Hard disk
! Floppy disk
! Compact disc
! Tape
! Distributed/networked memory


Primary Motivation
! External memory access is much slower than internal memory access
  n orders of magnitude slower
  n need to minimize I/O complexity
  n can afford slightly more work on data in memory in exchange for lower I/O complexity


Application Areas
! Searching
! Sorting
! Data Processing
! Data Mining
! Data Exploration


Disk Blocks
! Data is read one block at a time
  n pack as much into a block as possible
  n minimize number of block reads necessary


I/O Efficient Dictionaries


! Balanced tree structures
  n Typically $O(\log_2 n)$ transfers for query or update
  n Want to reduce height by constant factor as much as possible
  n Can be reduced to $O(\log_B n) = O(\log_2 n / \log_2 B)$
    w B is number of nodes per block
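
To get a feel for the size of the saving, the sketch below counts levels for a binary fan-out and for a fan-out of B. The values of `n` and `B` are assumptions chosen for illustration, and `levels` is a hypothetical helper that uses integer arithmetic to avoid floating-point rounding.

```python
import math


def levels(n: int, fanout: int) -> int:
    """Smallest h with fanout**h >= n, for n >= 1 and fanout >= 2."""
    h, reach = 0, 1
    while reach < n:
        reach *= fanout
        h += 1
    return h


n = 10**9   # assumed number of items
B = 1024    # assumed number of nodes per block

print(levels(n, 2))   # 30 levels with fan-out 2
print(levels(n, B))   # 3 levels with fan-out B
print(math.log2(n) / math.log2(B))  # log_B n computed as log_2 n / log_2 B
```

With these assumed values the number of block transfers per search drops by a factor of about log_2 B = 10.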


(a,b) Trees
! Generalization of (2,4) trees
! Size property: internal node has at least a children and at most b children
  n 2 <= a <= (b+1)/2
! Depth property: all external nodes have same depth
! Height of (a,b) tree is $\Omega(\log n / \log b)$ and $O(\log n / \log a)$

B-Trees

! Choose a and b to be $\Theta(B)$
! Height is now $O(\log_B n)$
! I/O complexity for search is $O(\log_B n)$


Sets and Maps


! Set: a Collection that does not allow duplicates
  n http://download.oracle.com/javase/tutorial/collections/interfaces/set.html
  n Can be implemented in many ways, including TreeSet
! Map: a collection of key-value pairs that does not allow duplicate keys
  n http://download.oracle.com/javase/tutorial/collections/interfaces/map.html
  n Can be implemented in many ways, including TreeMap


Summary
! Binary search trees
! AVL trees and balancing
! Splay trees
! Multi-way trees
! 2-4 trees
! B-trees
