
Chapter 5

Trees

5.1 Introduction
The data type tree is familiar from everyday life. Examples include family trees and the
tree of directories and files on a disk.
A tree consists of a set of nodes, which can contain data, linked by edges. A tree
satisfies two properties.

1. A tree is connected. It is possible to get from any node to any other by following the
edges.

2. A tree has no loops. There is only one way to get from one node to another.

In computer science a tree always has a root node from which to start. We normally
draw a tree upside-down with the root node at the top.

H:
|-- web
|   `-- file.html
|-- workspace
|   `-- myproject
|       |-- file1.java
|       |-- file1.class
|       `-- .project
`-- test.zip

Figure 5.1: A tree of files


There are some useful names for features of a tree.

Child A node directly below a given node.


H: has children web, workspace and test.zip.

Parent The node directly above a given node.


The parent of myproject is workspace.

Siblings Nodes are siblings if they have the same parent.


file1.java, file1.class and .project are siblings.

Ancestor A node on the path from a given node up to the root, the root included.


web and H: are ancestors of file.html.

Descendant A node that has a given node as ancestor.


The descendants of workspace are myproject, file1.java, file1.class and
.project.

Leaf node A node with no children.


file.html, file1.java, file1.class, .project and test.zip are leaf nodes.

Interior node A node that is not a leaf node.

Level The number of nodes on the path from the root to that node, counting both ends.


The root, H:, has level 1. myproject has level 3.

Height The height of a tree is its number of levels.


Our tree has height 4.

Subtree The subtree below a node is the tree with that node as root and with all its
descendants.
workspace has a subtree of height 3.

As our trees have a root node they cannot be empty.

5.2 Tree interfaces


5.2.1 Binary Trees
A tree is a binary tree if every node has at most two children. In a binary tree we often
divide the two children into left and right.
A tree where every node has at most one child is a linked list!
Here is an interface for a binary tree:

[Diagram: an example binary tree in which every node has at most two children.]

Figure 5.2: A binary tree

package com328.tree;

/** BinaryTree
 * @author C.T. Stretch
 */
public interface BinaryTree<E>
{
    /** Get the size of the tree.
     * @return the number of nodes.
     */
    public int size();

    /** Get the height of the tree.
     * @return the height.
     */
    public int height();

    /** Get the data from the root.
     * @return the root data.
     */
    public E getData();

    /** Set the root data.
     * @param data the data.
     */
    public void setData(E data);

    /** Test if the tree is a leaf.
     * @return true for a leaf.
     */
    public boolean isLeaf();

    /** Tests if the tree has a left child.
     * @return true if the tree has a left child.
     */
    public boolean hasLeft();

    /** Tests if the tree has a right child.
     * @return true if the tree has a right child.
     */
    public boolean hasRight();

    /** Returns the left child.
     * @return The left child tree.
     * @throws NoSuchElementException if there is no left child.
     */
    public BinaryTree<E> getLeft();

    /** Returns the right child.
     * @return The right child tree.
     * @throws NoSuchElementException if there is no right child.
     */
    public BinaryTree<E> getRight();

    /** Set the left child.
     * @param left The new left child tree.
     */
    public void setLeft(BinaryTree<E> left);

    /** Set the right child.
     * @param right The new right child tree.
     */
    public void setRight(BinaryTree<E> right);
}
In addition to this interface we will need some constructors, at least SomeTree(E data)
to give a tree with just a root node containing the data.
A simple use of a binary tree is a binary decision tree. Here a user is asked a sequence
of yes-no questions in order to make a choice. The questions are held in the interior nodes
of a binary tree. The left child is taken for yes and the right child for no. The leaves hold
the final choices.
The AnimalFinder is a simple decision tree program. The program asks a number of
yes-no questions to select an animal.
If it fails to find the animal required the user can add a new animal and a new question.

Does it have two legs?
    yes: Can it fly?
        yes: Bird
        no:  Human
    no:  Does it live in a burrow?
        yes: Rabbit
        no:  Dog

Figure 5.3: A decision tree

Here is some code from AnimalFinder. The data is held in a tree of strings. This
method is called if the user responds yes or no to a question. here represents the place we
have arrived in the tree.

void guess(boolean yes)
{
    if (here.isLeaf())
    {
        if (yes)
        {
            instruction.setText("Click <Restart> to start again");
            question.setText("It was a " + here.getData());
        }
        else
        {
            state = STATE_GET_ANIMAL;
            question.setEditable(true);
            instruction.setText(
                "I give up. Type your animal below and click <Yes>");
            question.setText("");
        }
    }
    else
    {
        // Left child for yes, right child for no.
        here = (yes) ? here.getLeft() : here.getRight();
        setQuestion();
    }
}
This method adds a new animal and question to the tree.

void getQuestion(boolean yes)
{
    if (yes)
    {
        String newQuestion = question.getText();
        if (!newQuestion.endsWith("?"))
            newQuestion += "?";
        // The old leaf becomes a question node with the two animals as its children.
        // newAnimal is not declared in this extract; it is presumably a field
        // holding the animal name the user typed in the previous step.
        String oldAnimal = here.getData();
        here.setData(newQuestion);
        here.setLeft(new MyBinaryTree<String>(newAnimal));
        here.setRight(new MyBinaryTree<String>(oldAnimal));
        modified = true;
    }
    start();
}

5.2.2 General trees


General trees are more complicated to deal with than binary trees. As we do not know
how many children a node may have we cannot have a method for each child. Instead we
need some sort of list structure to hold the children.
ADTs for general trees are quite varied as different methods are required for different
problems. Here is a possible interface for a general tree.

package com328.tree;

import java.util.Iterator;

/** An interface to represent a general tree.
 */
public interface Tree<E>
{
    /** Get the size of the tree.
     * @return the number of nodes
     */
    public int size();

    /** Get the height of the tree.
     * @return the height
     */
    public int height();

    /** Get the number of children of the root.
     * @return the number of children.
     */
    public int numberOfChildren();

    /** Get the data from the root.
     * @return the root data.
     */
    public E getData();

    /** Set the data in the root.
     * @param data The data to set
     */
    public void setData(E data);

    /** Get an iterator over the child subtrees.
     * @return an iterator that gives the subtrees.
     */
    public Iterator<Tree<E>> childIterator();

    /** Add a tree as a subtree of the root.
     * @param tree The tree to add.
     */
    public void addChild(Tree<E> tree);
}
Again, at least a one-node constructor is needed.
The following example uses the interface to create a tree representing a file and directory
structure. It then demonstrates running over the tree to count all the leaves that have
names ending with .java.
Notice that both methods are recursive. Most operations on trees need recursive methods.

package com328.tree;

import java.io.File;
import java.util.Iterator;

import uucPack.InOut;

/** MakeFileTree
 * @author C.T. Stretch
 */
public class MakeFileTree
{
    static Tree<String> theTree;

    public static void main(String[] args)
    {
        InOut.print("Enter file or directory name: ");
        String name = InOut.readString();
        InOut.println("Creating tree from " + name);
        theTree = makeTree(new File(name));
        InOut.println("Tree created");
        InOut.println("Size = " + theTree.size());
        InOut.println("Height = " + theTree.height());
        InOut.println("Number of java files = " + countType(theTree, ".java"));
    }

    /** This recursive method creates the tree.
     * @param f A File object representing a file or directory.
     * @return The tree representing the file structure.
     */
    private static Tree<String> makeTree(File f)
    {
        Tree<String> t = new ListTree<String>(f.getName());
        if (f.isDirectory())
        {
            File[] files = f.listFiles();
            // listFiles() returns null if the directory cannot be read.
            if (files != null)
                for (int i = 0; i < files.length; i++)
                    t.addChild(makeTree(files[i]));
        }
        return t;
    }

    /** Count the number of leaves in the tree with names ending
     * with an extension.
     * @param t the tree to count.
     * @param ext the extension to look for.
     * @return the count.
     */
    private static int countType(Tree<String> t, String ext)
    {
        int count = 0;
        if (t.numberOfChildren() == 0)
        {
            // A leaf: count it if its name has the extension.
            String s = t.getData();
            if (s.endsWith(ext))
                count++;
        }
        else
        {
            // An interior node: add up the counts of the child subtrees.
            Iterator<Tree<String>> it = t.childIterator();
            while (it.hasNext())
            {
                count += countType(it.next(), ext);
            }
        }
        return count;
    }
}

5.3 Tree Implementations


5.3.1 Binary Tree Implementation
We can implement a tree in a similar way to a linked list, using Java object references as
the edges of the tree.
For a binary tree we just keep references to the two children.

package com328.tree;

import java.util.NoSuchElementException;

/** MyBinaryTree
 * Simple implementation of a binary tree
 * @author C.T. Stretch
 */
public class MyBinaryTree<E> implements BinaryTree<E>
{
    private BinaryTree<E> left, right;
    private E data;

    MyBinaryTree(E data)
    {
        this.data = data;
    }

    public int size()
    {
        int n = 1;
        if (left != null) n += left.size();
        if (right != null) n += right.size();
        return n;
    }

    public int height()
    {
        int m = 0, n = 0;
        if (left != null) m = left.height();
        if (right != null) n = right.height();
        return 1 + ((m > n) ? m : n);
    }

    public void setData(E data)
    {
        this.data = data;
    }

    public E getData()
    {
        return data;
    }

    public boolean hasLeft()
    {
        return left != null;
    }

    public boolean hasRight()
    {
        return right != null;
    }

    public BinaryTree<E> getLeft()
    {
        if (left == null) throw new NoSuchElementException("No left child");
        return left;
    }

    public BinaryTree<E> getRight()
    {
        if (right == null) throw new NoSuchElementException("No right child");
        return right;
    }

    public void setLeft(BinaryTree<E> left)
    {
        this.left = left;
    }

    public void setRight(BinaryTree<E> right)
    {
        this.right = right;
    }

    public boolean isLeaf()
    {
        return (left == null) && (right == null);
    }
}
Notice we do not use a separate Node class; the data field of the tree itself represents
the root node. We can only do this as we do not allow empty trees.
Notice we do not keep a separate size variable; instead we calculate the size when it is
needed. Keeping a size variable would be tricky, as the size of a tree changes whenever one
of its subtrees is altered.
The height and size are both calculated by recursive methods. The height of a tree is
the maximum of the height of its child subtrees plus one.
The size of a tree is the sum of the sizes of its child subtrees plus one.

5.3.2 General Tree Implementation


If the maximum number of children is known we can use that many child variables, or an
array of children.
For a general tree we use a list as we do not know how many children there are.
We use an iterator to loop over all the children of a node.
Notice how the iterators are used in the height and size methods.

package com328.tree;

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/**
 * @author chris
 */
public class ListTree<E> implements Tree<E>
{
    private E data;
    private List<Tree<E>> children;

    ListTree(E data)
    {
        this.data = data;
        children = new ArrayList<Tree<E>>();
    }

    public int size()
    {
        // Sum of the sizes of the child subtrees, plus one for the root.
        int n = 0;
        Iterator<Tree<E>> it = children.iterator();
        while (it.hasNext())
            n += it.next().size();
        return n + 1;
    }

    public int height()
    {
        // Maximum height of the child subtrees, plus one for the root.
        int n = 0, m;
        Iterator<Tree<E>> it = children.iterator();
        while (it.hasNext())
        {
            m = it.next().height();
            if (m > n)
                n = m;
        }
        return n + 1;
    }

    public int numberOfChildren()
    {
        return children.size();
    }

    public E getData()
    {
        return data;
    }

    public void setData(E data)
    {
        this.data = data;
    }

    public Iterator<Tree<E>> childIterator()
    {
        return children.iterator();
    }

    public void addChild(Tree<E> tree)
    {
        children.add(tree);
    }
}

5.3.3 Traversing a tree


We often want to do something for each node of a tree, for example printing the contents
of the node. There are several common orders in which we can visit the nodes. In all of
them we visit the children in left to right order.
            A
          /   \
         B     C
        / \     \
       D   E     F
      /   /     / \
     G   H     I   J

Figure 5.4: Tree orders

Preorder traversal Visit the node before its children. ABDGEHCFIJ.

Postorder traversal Visit the children then the node. GDHEBIJFCA.

Level order traversal Visit each level of the tree from left to right. ABCDEFGHIJ.
This is a breadth-first traversal; the others are depth-first.

Inorder traversal (Only for binary trees.) Visit the left child, then the node, then the
right child. GDBHEACIFJ.

The depth-first traversals are easy to do recursively:


void preorderTraversal(BinaryTree<String> b)
{
    doSomething(b.getData());
    if (b.hasLeft()) preorderTraversal(b.getLeft());
    if (b.hasRight()) preorderTraversal(b.getRight());
}
To do inorder or postorder traversals we just need to change the order of the three lines.
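The reordering can be sketched as follows. This is only an illustration: the small Node class is a hypothetical stand-in for the chapter's BinaryTree interface, and the tree built by figureTree() is the shape of Figure 5.4 as inferred from the traversal orders listed above.

```java
class TraversalOrders {
    /** Hypothetical stand-in for BinaryTree<String>; null marks a missing child. */
    static class Node {
        final String data;
        final Node left, right;

        Node(String data, Node left, Node right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    /** Inorder: left subtree, then the node, then the right subtree. */
    static void inorder(Node n, StringBuilder out) {
        if (n.left != null) inorder(n.left, out);
        out.append(n.data);
        if (n.right != null) inorder(n.right, out);
    }

    /** Postorder: both subtrees first, then the node. */
    static void postorder(Node n, StringBuilder out) {
        if (n.left != null) postorder(n.left, out);
        if (n.right != null) postorder(n.right, out);
        out.append(n.data);
    }

    /** Builds the tree of Figure 5.4 (shape inferred from the listed orders). */
    static Node figureTree() {
        Node d = new Node("D", new Node("G", null, null), null);
        Node e = new Node("E", new Node("H", null, null), null);
        Node f = new Node("F", new Node("I", null, null), new Node("J", null, null));
        return new Node("A", new Node("B", d, e), new Node("C", null, f));
    }

    public static void main(String[] args) {
        StringBuilder in = new StringBuilder();
        inorder(figureTree(), in);
        System.out.println(in);    // prints GDBHEACIFJ

        StringBuilder post = new StringBuilder();
        postorder(figureTree(), post);
        System.out.println(post);  // prints GDHEBIJFCA
    }
}
```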
It is sometimes better to use a non-recursive traversal. We can do the depth-first
traversals using a stack. Preorder is the easiest of these. We start with the root node on
the stack. At each step we pop a node off the stack, process it, and push its children onto
the stack in reverse order, so that the left child is popped first. We finish when the
stack is empty.
A good way to implement a traversal is as an iterator.
class PreOrderIterator<E> implements Iterator<E>
{
    // Stack and ArrayStack are the course's own stack classes.
    private Stack<BinaryTree<E>> s = new ArrayStack<BinaryTree<E>>();

    // The tree to iterate over is passed in and becomes the first stack entry.
    PreOrderIterator(BinaryTree<E> root)
    {
        s.push(root);
    }

    public boolean hasNext()
    {
        return !s.isEmpty();
    }

    public E next()
    {
        BinaryTree<E> t = s.pop();
        // Push the right child first so that the left child is popped first.
        if (t.hasRight()) s.push(t.getRight());
        if (t.hasLeft()) s.push(t.getLeft());
        return t.getData();
    }
}
If we iterate over the tree above the stack looks like this (top of the stack on the right):

A
C B
C E D
C E G
C E
C H
C
F
J I
J
.
The level order traversal can be done using a queue. We start with the root in the
queue. At each step we remove a node from the queue, process it and add its children in
order to the queue.
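A minimal sketch of the queue-based traversal, using java.util.ArrayDeque as the queue. The Node class is a hypothetical stand-in for the chapter's BinaryTree interface, and figureTree() builds the shape of Figure 5.4 as inferred from the traversal orders given earlier.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

class LevelOrder {
    /** Hypothetical stand-in for BinaryTree<String>; null marks a missing child. */
    static class Node {
        final String data;
        final Node left, right;

        Node(String data, Node left, Node right) {
            this.data = data;
            this.left = left;
            this.right = right;
        }
    }

    /** Visits the nodes level by level, left to right. */
    static List<String> levelOrder(Node root) {
        List<String> visited = new ArrayList<>();
        Queue<Node> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            Node n = queue.remove();           // take the node at the front
            visited.add(n.data);               // process it
            if (n.left != null) queue.add(n.left);    // children join the back,
            if (n.right != null) queue.add(n.right);  // in left to right order
        }
        return visited;
    }

    /** Builds the tree of Figure 5.4 (shape inferred from the listed orders). */
    static Node figureTree() {
        Node d = new Node("D", new Node("G", null, null), null);
        Node e = new Node("E", new Node("H", null, null), null);
        Node f = new Node("F", new Node("I", null, null), new Node("J", null, null));
        return new Node("A", new Node("B", d, e), new Node("C", null, f));
    }

    public static void main(String[] args) {
        System.out.println(String.join("", levelOrder(figureTree()))); // prints ABCDEFGHIJ
    }
}
```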
If we iterate over the tree above the queue looks like this (front of the queue on the right):
A
C B
E D C
F E D
G F E
H G F
J I H G
J I H
J I
J
.

5.4 Search trees


5.4.1 Introduction
Recall that we looked earlier at methods for searching for a value in an array. We saw
two algorithms. A linear search has O(n) time-efficiency, and can be done for any data. A
binary search can only be done for sorted data, so the data needs to have an ordering
that we can sort by; in return it has time-efficiency O(log n).
We can use either of these methods for lists as well as arrays.
Binary searching is very effective for fixed data. We sort the data once and then can
search it as often as we like.
If we have data that we want to be able to change we need to keep it sorted. We do
not need to sort the whole of the data again as long as we insert items at the correct place.
However, inserting or removing an item in the middle of an array means shifting all the
items after it, so addition and removal of items are O(n) operations.
A binary search tree can give a way of storing data so that searching, adding and
removing are all O(log n).

5.4.2 Binary search trees


A binary search tree is a tree with a value in each node. We will take the values at the
nodes to be all different. The values have an ordering, and the data is arranged so that at
every node the values of the left child subtree are less than the value at the node and the
values of the right child subtree are greater than the value at the node.
Figure 5.5 is a binary search tree.
Figure 5.6 is not: for example, 12 lies in the left subtree of 10.

              10
            /    \
           6      12
          / \       \
         4   7       14
        / \         /  \
       1   5      13    16

Figure 5.5: A binary search tree

          10
        /    \
       4      14
      / \       \
     2   12      13

Figure 5.6: Not a search tree

Binary search trees are allowed to be empty.


An inorder traversal of a binary search tree visits the nodes in numerical order.
Searching for a value in a BST is easy. Start with the root node. If the value sought is
here we are finished. If the value is less than the root value look in the left child, if it is
greater we look in the right child. Repeat this until we find the value or the child we look
for is missing.
In our example: Search for 5. Try 10, too big so try 6, too big so try 4, too small so
try 5. Found it!
Search for 15: Try 10, too small so try 12, too small so try 14, too small so try 16, too
big but no left child, so it's not there!
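The search just described might be sketched like this. The Node class with int values is an assumption made for illustration (the chapter's own trees are generic), and figureTree() rebuilds the tree of Figure 5.5 by hand.

```java
class BstSearch {
    /** Hypothetical BST node holding an int; null marks a missing child. */
    static class Node {
        final int value;
        final Node left, right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static Node leaf(int value) {
        return new Node(value, null, null);
    }

    /** Walks down from the root, going left or right at each node. */
    static boolean contains(Node root, int target) {
        Node here = root;
        while (here != null) {
            if (target == here.value) return true;          // found it
            here = (target < here.value) ? here.left : here.right;
        }
        return false;  // the child we wanted was missing
    }

    /** The tree of Figure 5.5. */
    static Node figureTree() {
        Node four = new Node(4, leaf(1), leaf(5));
        Node six = new Node(6, four, leaf(7));
        Node fourteen = new Node(14, leaf(13), leaf(16));
        Node twelve = new Node(12, null, fourteen);
        return new Node(10, six, twelve);
    }

    public static void main(String[] args) {
        System.out.println(contains(figureTree(), 5));   // prints true
        System.out.println(contains(figureTree(), 15));  // prints false
    }
}
```

Each step moves one level down the tree, so the number of comparisons is at most the height of the tree.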

Adding
Adding a value to a BST is easy. Do a search to try to find the value, if it is not there we
have arrived at a node with no child where our value should be. We just add a new leaf at
that point with the value.
For example, to add 11 to the tree in Figure 5.5 we search until we reach 12, which has
no left child, so we add a left child containing 11. To add 3 our search reaches 1, where we
add a right child containing 3.
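A minimal sketch of adding, assuming a hypothetical Node class with int values. The recursive add returns the root of the (possibly new) subtree, so an empty tree is simply null. Inserting the values of Figure 5.5 level by level reproduces that tree.

```java
class BstAdd {
    /** Hypothetical BST node holding an int; null marks a missing child. */
    static class Node {
        int value;
        Node left, right;

        Node(int value) {
            this.value = value;
        }
    }

    /** Adds value below n and returns the root of the resulting subtree. */
    static Node add(Node n, int value) {
        if (n == null) return new Node(value);       // the search fell off the tree: new leaf
        if (value < n.value) n.left = add(n.left, value);
        else if (value > n.value) n.right = add(n.right, value);
        // equal: the value is already present, so nothing changes
        return n;
    }

    /** Builds the tree of Figure 5.5 by adding its values level by level. */
    static Node figureTree() {
        Node root = null;
        for (int v : new int[] {10, 6, 12, 4, 7, 14, 1, 5, 13, 16}) {
            root = add(root, v);
        }
        return root;
    }

    public static void main(String[] args) {
        Node root = figureTree();
        add(root, 11);
        add(root, 3);
        System.out.println(root.right.left.value);            // prints 11 (left child of 12)
        System.out.println(root.left.left.left.right.value);  // prints 3 (right child of 1)
    }
}
```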

Removing
Removing a leaf node is easy. To remove a non-leaf node we cannot leave a hole where
that node was. To remove a node with only one child we can replace it by that child. In
particular we can remove the smallest or largest node from any subtree as they cannot
have two children. If we have a node with two children we can remove the smallest value
from the right subtree (or the largest from the left subtree) and put its value in the node
to be removed. To find the smallest (or largest) keep going left (or right).
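The three cases can be sketched as below, again assuming a hypothetical Node class with int values. For a node with two children this sketch takes the smallest value from the right subtree; using the largest from the left subtree would work equally well.

```java
import java.util.ArrayList;
import java.util.List;

class BstRemove {
    /** Hypothetical BST node holding an int; null marks a missing child. */
    static class Node {
        int value;
        Node left, right;

        Node(int value) {
            this.value = value;
        }
    }

    static Node add(Node n, int value) {
        if (n == null) return new Node(value);
        if (value < n.value) n.left = add(n.left, value);
        else if (value > n.value) n.right = add(n.right, value);
        return n;
    }

    /** Removes value from the subtree rooted at n and returns the new subtree root. */
    static Node remove(Node n, int value) {
        if (n == null) return null;                       // value was not in the tree
        if (value < n.value) n.left = remove(n.left, value);
        else if (value > n.value) n.right = remove(n.right, value);
        else if (n.left == null) return n.right;          // leaf, or only a right child
        else if (n.right == null) return n.left;          // only a left child
        else {
            Node min = n.right;                           // two children: keep going left
            while (min.left != null) min = min.left;      // to find the smallest on the right
            n.value = min.value;                          // put its value in this node
            n.right = remove(n.right, min.value);         // and remove it from the right subtree
        }
        return n;
    }

    /** Inorder listing, used here to check that the values stay sorted. */
    static List<Integer> inorder(Node n) {
        List<Integer> out = new ArrayList<>();
        collect(n, out);
        return out;
    }

    private static void collect(Node n, List<Integer> out) {
        if (n == null) return;
        collect(n.left, out);
        out.add(n.value);
        collect(n.right, out);
    }

    /** Builds the tree of Figure 5.5. */
    static Node figureTree() {
        Node root = null;
        for (int v : new int[] {10, 6, 12, 4, 7, 14, 1, 5, 13, 16}) root = add(root, v);
        return root;
    }

    public static void main(String[] args) {
        Node root = figureTree();
        root = remove(root, 6);                // 6 has two children; 7 takes its place
        System.out.println(inorder(root));     // prints [1, 4, 5, 7, 10, 12, 13, 14, 16]
    }
}
```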

Tree balancing
There are many ways that we can arrange data in a BST. The numbers 1,2,3 can be held
in five different ways.

    3        3       2       1       1
   /        /       / \       \       \
  2        1       1   3       3       2
 /          \                 /         \
1            2               2           3

Figure 5.7: Five trees holding 1,2,3

Notice that once we have chosen the shape of the tree we have no choice where to put
the numbers.
With larger data sets we have many more choices of tree.
For searching purposes it is clear that we want to make the height of the tree as small
as possible, as the height is the maximum number of comparisons needed. The worst case
is where the data is in a single chain, such as the outside four cases in Figure 5.7.
A binary tree with levels 1, 2, ..., k all filled has $n = 2^k - 1$ nodes. So the smallest
height for a binary tree with n nodes is about $\log_2(n)$.
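The node count can be checked by summing the levels: level $i$ of a full binary tree holds $2^{i-1}$ nodes. A short derivation, written out as a sketch of the reasoning:

```latex
\begin{align*}
n &= \sum_{i=1}^{k} 2^{\,i-1} = 1 + 2 + 4 + \dots + 2^{k-1} = 2^k - 1, \\
k &= \log_2(n + 1) \approx \log_2(n) \quad \text{for large } n.
\end{align*}
```

For example, a full tree of height 3 has $1 + 2 + 4 = 7 = 2^3 - 1$ nodes.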
To achieve a fast search we can use a complete binary tree. A binary tree is complete
if all levels except the last are full, and the last level is filled from left to right.

        5                 4
      /   \             /   \
     3     7           2     6
    / \   /           / \   / \
   2   4 6           1   3 5   7

Figure 5.8: Two complete trees

The two trees in Figure 5.8 are complete. Notice that if we add a value of 1 to the first
tree we have to get the second tree. This means that we need to move the value in every
node. In other words if we want to keep our tree complete adding a node has order O(n).
This is no better than keeping a sorted list and using binary search: we have a fast
search but a slow add.
A complete tree is fast to search but slow to maintain. By allowing a bit more flexibility
in the tree we can make it fast to maintain while keeping it fast to search.
We say a BST is balanced if at every node of the tree the heights of its two child
subtrees differ by no more than one.
[Diagram: two example trees, one unbalanced and one balanced.]

Figure 5.9: Unbalanced and balanced trees
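The balance condition can be checked directly from the definition. The sketch below assumes a hypothetical Node class without data, and counts the height of a missing subtree as 0, which matches the chapter's convention that a single node has height 1. It recomputes heights at every node, so it is meant as an illustration rather than an efficient check.

```java
class BalanceCheck {
    /** Hypothetical node; only the shape matters here, so it holds no data. */
    static class Node {
        final Node left, right;

        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    /** Height in levels; a missing subtree counts as height 0. */
    static int height(Node n) {
        return (n == null) ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    /** True if, at every node, the two child subtrees differ in height by at most one. */
    static boolean isBalanced(Node n) {
        if (n == null) return true;
        return Math.abs(height(n.left) - height(n.right)) <= 1
            && isBalanced(n.left)
            && isBalanced(n.right);
    }

    public static void main(String[] args) {
        Node chain = new Node(new Node(new Node(null, null), null), null); // three nodes in a line
        Node full = new Node(new Node(null, null), new Node(null, null));  // root with two leaves
        System.out.println(isBalanced(chain)); // prints false
        System.out.println(isBalanced(full));  // prints true
    }
}
```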

A balanced tree is close enough to a complete tree that the search operation is still
O(log n), but flexible enough that addition is also O(log n).
If you add a node to a balanced tree it may become unbalanced. We shall see that it
can be restored to balance by doing operations called rotations on the tree.

        r                              l
       / \      Right rotation        / \
      l   C    --------------->      A   r
     / \       <---------------         / \
    A   B       Left rotation          B   C

Figure 5.10: Left and right rotations

In the diagrams A, B and C are subtrees. Notice that either tree is a BST if the values
in A are less than l, the values in B are between l and r and the values in C are greater
than r. So if we rotate at any node in a BST we still have a BST.
If we have added a node to a balanced tree we can keep it balanced by either a single
rotation or a pair of rotations as in Figure 5.11.
The left right double rotation is the mirror image.
A pair of rotations is needed when the long branch at an unbalanced node has a dog-leg.
For example, in Figure 5.11 we want to rotate left at x, but the left subtree of z is taller
than the right, so we must first rotate right at z to cure this. In a double rotation the
first rotation is always in a child of the unbalanced node.
Using these operations we can rebalance any tree after adding or removing a node.
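The rotations themselves are only a few reference changes. The following sketch assumes a hypothetical Node class with int values; each method returns the new root of the rotated subtree, which the caller must store in place of the old one. It shows the rotations only, not the bookkeeping an AVL tree needs to decide when to rotate.

```java
class Rotations {
    /** Hypothetical BST node holding an int; null marks a missing child. */
    static class Node {
        int value;
        Node left, right;

        Node(int value) {
            this.value = value;
        }
    }

    /** Right rotation at r: its left child l becomes the root of the subtree. */
    static Node rotateRight(Node r) {
        Node l = r.left;
        r.left = l.right;   // the middle subtree B moves from l to r
        l.right = r;
        return l;
    }

    /** Left rotation at l: its right child r becomes the root of the subtree. */
    static Node rotateLeft(Node l) {
        Node r = l.right;
        l.right = r.left;   // the middle subtree B moves from r to l
        r.left = l;
        return r;
    }

    /** Right left double rotation at x: rotate right at x's right child, then left at x. */
    static Node rotateRightLeft(Node x) {
        x.right = rotateRight(x.right);
        return rotateLeft(x);
    }

    public static void main(String[] args) {
        // A dog-leg: 1 has right child 3, which has left child 2.
        Node x = new Node(1);
        x.right = new Node(3);
        x.right.left = new Node(2);

        Node root = rotateRightLeft(x);
        System.out.println(root.left.value + " " + root.value + " " + root.right.value); // prints 1 2 3
    }
}
```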
A tree with addition and removal methods that rebalance it is called an AVL tree after
Adelson-Velskii and Landis. Searching, adding and removing are all O(log n).
We can use an AVL tree to give another fast sorting algorithm called a tree sort. Put
the data into the tree one at a time, then list them using an inorder traversal. This is
O(n log n).
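A tree sort can be tried out with the library's TreeSet standing in for the AVL tree. This is a substitution made for the sketch: TreeSet is a balanced search tree too, but of a different kind, and because it is a set it assumes the values are all different, as this chapter does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

class TreeSortSketch {
    /** Sorts distinct values by adding them to a balanced search tree and reading it back. */
    static List<Integer> treeSort(List<Integer> data) {
        TreeSet<Integer> tree = new TreeSet<>();
        for (int v : data) {
            tree.add(v);                 // n additions, each O(log n)
        }
        return new ArrayList<>(tree);    // the set's iterator runs in ascending order
    }

    public static void main(String[] args) {
        System.out.println(treeSort(Arrays.asList(10, 6, 12, 4, 7))); // prints [4, 6, 7, 10, 12]
    }
}
```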

5.4.3 Sorted sets and maps


How do we use a search tree? There are two ADT interfaces it is suited for.

A sorted set A set of items (each item can appear at most once), where the items have
an ordering and are kept in order. The iterator runs through the items in the order
they are kept in.

A sorted map A map or dictionary stores pairs of objects, the key and the value. The
keys must be all different. From a key you can retrieve its value. For a sorted map
the keys have an ordering and are kept in order.

For a sorted set we store the items in the nodes of our search tree.
For a sorted map we store both the key and the value in the nodes of our search tree.
    x                        x                          y
   / \                      / \                       /   \
  A   z      Rotate        A   y       Rotate        x     z
     / \     right at z       / \      left at x    / \   / \
    y   D    --------->      B   z     --------->  A   B C   D
   / \                          / \
  B   C                        C   D

Figure 5.11: A right left double rotation

The Java library provides an interface SortedSet<E> and a class TreeSet<E> that
implements it. To use a SortedSet you can either set a Comparator in the constructor
and use elements that can be compared by this comparator or you can use elements that
implement the Comparable interface. Most of the methods of SortedSet are the same as
Set, which are the same as Collection. The extra features are that you cannot add the same
element twice, and that the iterator returns the elements in order. There are a few extra
methods such as E first() and E last() to return the first and last elements. TreeSet
uses a tree to provide fast O(log n) methods to add and remove elements.
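The two ways of supplying an ordering might look like this. The class and method names are made up for the sketch; the library calls (TreeSet, Comparator.comparingInt, thenComparing, first, last) are standard java.util.

```java
import java.util.Comparator;
import java.util.SortedSet;
import java.util.TreeSet;

class SortedSetDemo {
    /** Uses the elements' own Comparable ordering (alphabetical for String). */
    static SortedSet<String> naturalOrder(String... words) {
        SortedSet<String> set = new TreeSet<>();
        for (String w : words) set.add(w);   // add returns false for a duplicate
        return set;
    }

    /** Uses a Comparator given to the constructor: shorter words first, ties alphabetical. */
    static SortedSet<String> byLength(String... words) {
        Comparator<String> length = Comparator.comparingInt(String::length);
        SortedSet<String> set = new TreeSet<>(length.thenComparing(Comparator.<String>naturalOrder()));
        for (String w : words) set.add(w);
        return set;
    }

    public static void main(String[] args) {
        SortedSet<String> a = naturalOrder("pear", "apple", "fig", "apple");
        System.out.println(a);                              // prints [apple, fig, pear]
        System.out.println(a.first() + " " + a.last());     // prints apple pear
        System.out.println(byLength("pear", "apple", "fig", "apple")); // prints [fig, pear, apple]
    }
}
```

The tie-break in byLength matters: a TreeSet treats two elements as the same if the comparator returns 0, so comparing by length alone would keep only one word of each length.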
The library provides an interface SortedMap<K,V> and a class TreeMap<K,V> to
implement it.
K is the type of the keys and V is the type of the values.
Some of the methods of SortedMap are:

// Methods of Map
void clear();
boolean containsKey(Object key);
boolean containsValue(Object value);
V get(Object key);
boolean isEmpty();
Set<K> keySet();           // Returns the set of keys.
V put(K key, V value);     // Set the value for this key.
                           // Returns the previous value or null.
V remove(Object key);
int size();
Collection<V> values();    // Returns the collection of values.
// Methods of SortedMap
K firstKey();
K lastKey();

As an example using a sorted set we will look at a program for listing all the different
words in a text file. We use a sorted set of strings created by

private SortedSet<String> set = new TreeSet<String>();
We then find all the words in our file as strings and add them to the set. Remember
if the word is already in it will not be added again. We can then display the set. It will
appear in alphabetical order.

/** Reads the words into a set.
 * @param in the stream to read from
 * @throws IOException if an error occurs.
 */
void readSet(BufferedReader in) throws IOException
{
    int c;
    char ch;
    String word = "";
    set.clear();
    while ((c = in.read()) != -1)
    {
        ch = (char) c;
        if (Character.isLetter(ch))
            word += Character.toLowerCase(ch);   // still inside a word
        else if (word.length() > 0)
        {
            set.add(word);                       // a non-letter ends the word
            word = "";
        }
    }
    // Add the last word if the input ends in the middle of a word.
    if (word.length() > 0)
        set.add(word);
    list.setListData(set.toArray());
    message.setText(set.size() +
        " words from " + set.first() + " to " + set.last());
}
If we want to know how often each word occurs we can use a map where the key is the
word and the value is an Integer holding the frequency. We construct it by

private SortedMap<String, Integer> map = new TreeMap<String, Integer>();
If the word is not yet in the map it is added with frequency 1. If the word is in the
map the frequency is increased by 1.

void readMap(BufferedReader in) throws IOException
{
    int c;
    char ch;
    String word = "";
    map.clear();
    while ((c = in.read()) != -1)
    {
        ch = (char) c;
        if (Character.isLetter(ch))
            word += Character.toLowerCase(ch);
        else if (word.length() > 0)
        {
            // A non-letter ends the word: count it.
            if (map.containsKey(word))
                map.put(word, 1 + map.get(word));
            else
                map.put(word, 1);
            word = "";
        }
    }
    // Count the last word if the input ends in the middle of a word.
    if (word.length() > 0)
        map.put(word, map.containsKey(word) ? 1 + map.get(word) : 1);
    String[] s = map.keySet().toArray(new String[0]);
    Frequency[] fr = new Frequency[map.size()];
    for (int i = 0; i < fr.length; i++)
        fr[i] = new Frequency(s[i], map.get(s[i]));
    if (byFrequency.isSelected())
        Arrays.sort(fr);
    list.setListData(fr);
    message.setText(map.size() +
        " words from " + fr[0] + " to " + fr[fr.length - 1]);
}
The output from scanning Alice in Wonderland looks like this:

a : 636
abide : 1
able : 1
about : 93
...
yourself : 10
youth : 6
zealand : 1
zigzag : 1

5.4.4 Other search trees


2-3 trees
A 2-3 tree is a search tree where every interior node has 2 or 3 children.
A 2-node has one data item and two children. Data in the left subtree must be smaller
than the data item, and data in the right subtree bigger.
A 3-node has two data items s and l with s < l. Data in the left subtree must be
smaller than s, data in the middle subtree between s and l and data in the right subtree
bigger than l.
Leaf nodes can contain one or two data items.
We require that all the leaf nodes of a 2-3 tree are at the same level. This makes a 2-3
tree complete.

                    [15 30]
           /           |          \
       [6 10]        [20]         [40]
      /   |   \      /   \       /    \
  [1 3]  [8]  [12] [17] [23]   [35]  [45]

Figure 5.12: A 2-3 tree

Searching a 2-3 tree is as before except we have three choices at some nodes. To find
8 in the example: Look at the root, 8 < 15 so go left, 8 is between 6 and 10 so take the
middle child. Found!
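A sketch of this search, under the assumption that each node stores its one or two data items in a sorted array and its subtrees in a second array (a leaf has none). The Node class and its layout are invented for the illustration; figureTree() builds the tree of Figure 5.12.

```java
class TwoThreeSearch {
    /** Hypothetical 2-3 tree node: one or two keys, and either no children or keys.length + 1. */
    static class Node {
        final int[] keys;
        final Node[] children;

        Node(int[] keys, Node... children) {
            this.keys = keys;
            this.children = children;
        }
    }

    static Node leaf(int... keys) {
        return new Node(keys);
    }

    /** Searches down from the root, choosing between two or three children at each node. */
    static boolean contains(Node n, int target) {
        while (n != null) {
            int i = 0;
            while (i < n.keys.length && target > n.keys[i]) i++;   // skip the smaller keys
            if (i < n.keys.length && target == n.keys[i]) return true;
            if (n.children.length == 0) return false;              // reached a leaf
            n = n.children[i];   // child 0 is left of the first key, child 1 is next, and so on
        }
        return false;
    }

    /** The tree of Figure 5.12. */
    static Node figureTree() {
        Node left = new Node(new int[] {6, 10}, leaf(1, 3), leaf(8), leaf(12));
        Node middle = new Node(new int[] {20}, leaf(17), leaf(23));
        Node right = new Node(new int[] {40}, leaf(35), leaf(45));
        return new Node(new int[] {15, 30}, left, middle, right);
    }

    public static void main(String[] args) {
        System.out.println(contains(figureTree(), 8));  // prints true
        System.out.println(contains(figureTree(), 9));  // prints false
    }
}
```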
To add to a 2-3 tree find the leaf node where it belongs and add the value. If this node
now has three values split it and move the middle value up. This may require splitting
further nodes up the tree. A new level starts when the root gets split.
In our example first add 9. This just goes in the same leaf as 8. Now add 2. This goes
in the 1,3 leaf which splits pushing the 2 up into the 6, 10 node. This now splits pushing
the 6 up into the root. Finally the root splits pushing the 15 up into a new root.
Notice that to add to a 2-3 tree we need to search down the tree and then add up the
tree.
We will not look at removing a value, which can be done but is complicated.

2-4 trees
2-4 trees can have nodes with 2, 3 or 4 children. 2-nodes and 3-nodes are as above; 4-nodes
have three data values and the subtree values are distributed between them as you would
expect.

                     [15]
              /               \
           [6]                [30]
         /     \            /      \
      [2]      [10]      [20]      [40]
     /   \    /    \    /    \    /    \
   [1]  [3] [8 9] [12] [17] [23] [35] [45]

Figure 5.13: After adding 9 and 2

In a 2-4 tree we can avoid having to work up and down the tree during addition. We
split the 4-nodes on the way down, then we can always add a value without having to go
back up the tree.

Red-Black trees
A red-black tree is equivalent to a 2-4 tree. Instead of using three different types of node
it only uses binary nodes. 3-nodes and 4-nodes are represented by two or three binary nodes as
in Figure 5.14. We indicate the top node of one of these multiple nodes by marking it in
some way; we think of this as colouring it red.

     [5 10 20]                   10
    /   |   |   \              /    \
                              5      20
                             / \    /  \

Figure 5.14: 2-4 and red-black nodes

The Java library uses red-black trees for its TreeMap and TreeSet classes.

B-trees
We can use nodes with more than 4 children. These are called B-trees (B for
Block). For most purposes these make things slower, as we need to search along the block
of data to find which child to follow. They are useful if our data is so large that we need
to store it on disk. Disks read their data in blocks, so if we make our nodes fill a block
we are using the slow disk operations most efficiently. Large databases use versions of the
B-tree to store their data.
