
CS 2604 Spring 2004

Binary Search Trees

The general binary tree shown in the previous chapter is not terribly useful in practice.
The chief use of binary trees is for providing rapid access to data (indexing, if you will)
and the general binary tree does not have good performance.
Suppose that we wish to store data elements that contain a number of fields, and that
one of those fields is distinguished as the key upon which searches will be performed.

A binary search tree or BST is a binary tree that is either empty or in which the
data element of each node has a key, and:

1. All keys in the left subtree (if there is one) are less than the key in
the root node.
2. All keys in the right subtree (if there is one) are greater than or equal
to the key in the root node.
3. The left and right subtrees of the root are binary search trees.
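The three conditions can be checked mechanically. The following is a minimal sketch, not part of the original notes: it assumes a hypothetical `Node` struct with `char` keys (rather than the `BinNodeT` template used later) and modern C++ (`nullptr`). It walks the tree while tracking the range of keys each subtree is allowed to hold.

```cpp
#include <cassert>

// Hypothetical node type, for illustration only.
struct Node {
    char key;
    Node* left;
    Node* right;
};

// Returns true if the subtree rooted at n satisfies the definition above.
// Every key must lie in the half-open range [*lo, *hi); a null bound
// means "unbounded" on that side.
bool isBST(const Node* n, const char* lo = nullptr, const char* hi = nullptr) {
    if (n == nullptr) return true;                        // an empty tree is a BST
    if (lo != nullptr && n->key < *lo) return false;      // must be >= lower bound
    if (hi != nullptr && !(n->key < *hi)) return false;   // must be <  upper bound
    return isBST(n->left, lo, &n->key)     // left keys:  strictly less than this key
        && isBST(n->right, &n->key, hi);   // right keys: greater than or equal to it
}
```

Note that a duplicate key is accepted in a right subtree but rejected in a left subtree, which mirrors rules 1 and 2.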

Computer Science Dept Va Tech January 2004 Data Structures & File Management 2000-2004 McQuain WD

BST Insertion

Here, the key values are characters (and only key values are shown).
Inserting the following key values in the given order yields the given BST:

D G H E B D F C

        D
     /     \
    B       G
     \     / \
      C   E   H
         / \
        D   F

In a BST, insertion is always at the leaf level. Traverse the BST, comparing the new
value to existing ones, until you find the right spot, then add a new leaf node holding
that value.

What is the resulting tree if the (same) key values are inserted in the order:

B C D D E F G H or E B C D D F G H


William D McQuain, January 2004 1



Searching in a BST

Because of the key ordering imposed by a BST, searching resembles the binary search
algorithm on a sorted array, which is O(log N) for an array of N elements.
A BST offers the advantage of purely dynamic storage, no wasted array cells and no
shifting of the array tail on insertion and deletion.

        D
     /     \
    B       G
   / \     / \
  A   C   E   H

Trace searching for the key value E.
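Tracing a search by hand follows the same steps as the sketch below. It is not taken from the notes: the `Node` struct, the `search` function and the `path` parameter are hypothetical names chosen for illustration. The function descends from the root and records each key it compares against.

```cpp
#include <cassert>
#include <string>

struct Node {            // hypothetical node type, for illustration only
    char key;
    Node* left;
    Node* right;
};

// Iterative BST search. Appends every key that is compared against to
// `path`, and returns the matching node, or nullptr if the key is absent.
const Node* search(const Node* root, char target, std::string& path) {
    const Node* cur = root;
    while (cur != nullptr) {
        path += cur->key;                        // record the visit
        if (target == cur->key) return cur;      // found
        cur = (target < cur->key) ? cur->left    // smaller keys are on the left
                                  : cur->right;  // larger or equal keys on the right
    }
    return nullptr;
}
```

For a seven-node tree with root D, children B and G, and leaves A, C, E, H, searching for E visits D, then G, then E: three comparisons for seven nodes.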


BST Deletion

Deletion is perhaps the most complex operation on a BST, because the algorithm must
result in a BST. The question is: what value should replace the one deleted? As with
the general tree, we have cases:

- Removing a leaf node is trivial, just set the relevant child pointer in the parent
node to NULL.
- Removing an internal node which has only one subtree is also trivial, just set
the relevant child pointer in the parent node to target the root of the subtree.
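Both of these cases reduce to redirecting a single pointer in the parent. A minimal sketch, assuming a hypothetical `Node` struct and a hypothetical `removeSimple` function to which the caller passes the parent's child pointer itself (or the root pointer) by reference:

```cpp
#include <cassert>

struct Node {            // hypothetical node type, for illustration only
    char key;
    Node* left  = nullptr;
    Node* right = nullptr;
    explicit Node(char k) : key(k) {}
};

// Removes the node that `link` points to, provided it has at most one
// subtree. `link` is the parent's child pointer (or the root pointer),
// so assigning to it re-targets the parent. Returns false if there is
// no node, or if the node has two subtrees (the harder case).
bool removeSimple(Node*& link) {
    if (link == nullptr) return false;
    if (link->left != nullptr && link->right != nullptr) return false;

    Node* doomed = link;
    // Leaf: both children are null, so link becomes nullptr.
    // One subtree: link takes over the root of that subtree.
    link = (doomed->left != nullptr) ? doomed->left : doomed->right;
    delete doomed;
    return true;
}
```

Passing the pointer by reference avoids having to remember whether the doomed node was a left or a right child.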

        D
     /     \
    B       G
     \     / \
      C   E   H
         / \
        D   F

[The original figure also labels one child pointer as NULL.]


BST Deletion

- Removing an internal node which has two subtrees is more complex

Simply removing the node would disconnect the tree. But what value should replace the one
in the targeted node?

        D
     /     \
    B       G
     \     / \
      C   F   H
         / \
        E   F

To preserve the BST property, we must take the smallest value from the right subtree,
which would be the closest successor of the value being deleted.

Fortunately, the smallest value will always lie in the left-most node of the subtree.
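Finding that node is a short loop. A minimal sketch, assuming a hypothetical `Node` struct and a hypothetical `leftmost` function (neither appears in the notes):

```cpp
#include <cassert>

struct Node {            // hypothetical node type, for illustration only
    char key;
    Node* left;
    Node* right;
};

// Returns the node holding the smallest key in a non-empty subtree:
// keep following left pointers until there are none left to follow.
const Node* leftmost(const Node* subtree) {
    assert(subtree != nullptr);
    while (subtree->left != nullptr) subtree = subtree->left;
    return subtree;
}
```

To find the replacement for a node with two subtrees, call it on that node's right child.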


BST Deletion

So, we first find the left-most node of the right subtree, and then swap data values
between it and the targeted node.

        E
     /     \
    B       G
     \     / \
      C   F   H
         / \
        D   F

Note that at this point we don't necessarily have a BST.

Now we must delete the swapped value from the right subtree.

That looks straightforward here since the node in question is a leaf. However:
- the node will NOT be a leaf in all cases
- the occurrence of duplicate values is a complicating factor (so we might want to
have a DeleteRightMinimum() function to clean up at this point).


Deleting the Minimum Value

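Suppose we want to delete the value E from the BST:

[Figure: BST with root E; second level B, G; third level C, F, H; the rest of the figure was not recoverable]

After swapping the F with the E, we must delete the node from which the F was taken.
We must be careful to not confuse this with the other node containing an F.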

Also, consider deleting the value G. In this case, the right subtree is just a leaf node,
whose parent is the node originally targeted for deletion.

Moral: be careful to consider ALL cases when designing.


BST Template Interface


A BST template can be derived from the general binary tree class, overriding all of the
search-related functions inherited from the general binary tree base:

template <typename T>
class BSTreeT : public BinaryTreeT<T> {
private:
   // The node pointer is passed by reference so the helper can attach a new leaf.
   bool InsertHelper(const T& D, BinNodeT<T>*& sRoot);
   bool DeleteHelper(const T& D, BinNodeT<T>* sRoot);
   bool DeleteRightMinimum(BinNodeT<T>* sRoot);
   T* const FindHelper(const T& D, BinNodeT<T>* sRoot);

public:
   BSTreeT();                       // create empty BST
   BSTreeT(const T& D);             // root holds D

   bool Insert(const T& D);         // insert element
   bool Delete(const T& D);         // delete element
   T* const Find(const T& D);       // return access to D

   virtual ~BSTreeT();
};
// . . . continues with member function implementations . . .

Arguably, the use of inheritance is unjustified given the degree of change needed in
the derived type.


BST Constructor and Destructor

The default BST constructor just initializes an empty tree by calling the base default
constructor:
template <typename T>
BSTreeT<T>::BSTreeT() : BinaryTreeT<T>() { }

The second BST constructor is entirely similar, merely calling the corresponding base
constructor.

The BST destructor is simply an empty shell. The base class destructor will be invoked
automatically, as needed, and the derived class does not allocate any dynamic content
itself, so the base destructor is adequate.

template <typename T>
BSTreeT<T>::~BSTreeT() { }


BST Search Implementation

The BST Find() function takes advantage of the BST data organization:
template <typename T>
T* const BSTreeT<T>::Find(const T& toFind) {

   // this-> is needed to name a member inherited from the dependent base class
   if (this->Root == NULL) return NULL;

   return (FindHelper(toFind, this->Root));
}

template <typename T>
T* const BSTreeT<T>::FindHelper(const T& toFind, BinNodeT<T>* sRoot) {

   if (sRoot == NULL) return NULL;           // fell off the tree: not found

   // Uses operator== for the type T, which is customized by the client
   // for the particular application at hand.
   if (sRoot->Element == toFind) {
      this->Current = sRoot;
      return &(this->Current->Element);
   }

   // Search direction is determined by the relationship of the target data
   // to the data in the current node.
   if (toFind < sRoot->Element)
      return FindHelper(toFind, sRoot->Left);
   else
      return FindHelper(toFind, sRoot->Right);
}


BST Insert Implementation

The public Insert() function is just a stub to call the recursive helper:

template <typename T>
bool BSTreeT<T>::Insert(const T& D) {

   // this-> is needed to name a member inherited from the dependent base class
   return InsertHelper(D, this->Root);
}

Warning: the BST definition in these notes allows for duplicate data values to occur; the
logic of insertion may need to be changed for your specific application.


The helper function must find the appropriate place in the tree to place the new node.
The design logic is straightforward:
- locate the parent "node" of the new leaf, and
- hang a new leaf off of it, on the correct side


BST Insert Helper

The InsertHelper() function:

template <typename T>
bool BSTreeT<T>::InsertHelper(const T& D, BinNodeT<T>*& sRoot) {

   if (sRoot == NULL) {                      // found the location
      // nothrow (from <new>) makes a failed allocation return NULL
      BinNodeT<T>* Temp = new(nothrow) BinNodeT<T>(D);
      if (Temp == NULL) return false;
      // this-> is needed to name a member inherited from the dependent base class
      sRoot = this->Current = Temp;
      return true;
   }

   // recursive descent logic goes next . . .
}

When the parent of the new value is found, one more recursive call takes place, passing
in a NULL pointer to the helper function.
Note that the insert helper function must be able to modify the node pointer parameter,
and that the search logic is precisely the same as for the find function.
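One way the elided descent could look is sketched below. This is an assumption, not the author's code: it uses a hypothetical `BinNode` struct as a stand-in for `BinNodeT<T>`, a free function `insertHelper` rather than a member, and omits the `Current` bookkeeping. It sends equal keys to the right, in line with the BST definition at the start of these notes.

```cpp
#include <cassert>
#include <new>

template <typename T>
struct BinNode {                 // hypothetical stand-in for BinNodeT<T>
    T Element;
    BinNode* Left;
    BinNode* Right;
    explicit BinNode(const T& e) : Element(e), Left(nullptr), Right(nullptr) {}
};

// Recursive insertion. sRoot is passed by reference so that, when the
// descent reaches a null pointer, assigning to sRoot updates the parent's
// Left or Right member (or the tree's root pointer) directly.
template <typename T>
bool insertHelper(const T& D, BinNode<T>*& sRoot) {
    if (sRoot == nullptr) {                          // found the location
        BinNode<T>* temp = new (std::nothrow) BinNode<T>(D);
        if (temp == nullptr) return false;           // allocation failed
        sRoot = temp;
        return true;
    }
    if (D < sRoot->Element)
        return insertHelper(D, sRoot->Left);         // smaller keys go left
    return insertHelper(D, sRoot->Right);            // equal or larger go right
}
```

Starting from an empty root pointer and inserting D G H E B D F C in that order produces a root D with left child B and right child G, and places the second D as the left child of E.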


BST Delete Implementation

The public Delete() function is very similar to the insertion function:


template <typename T>
bool BSTreeT<T>::Delete(const T& D) {

   // this-> is needed to name a member inherited from the dependent base class
   if ( this->Root == NULL ) return false;

   return DeleteHelper(D, this->Root);
}

The DeleteHelper() function design is also relatively straightforward:


- locate the parent of the node containing the target value
- determine the deletion case (as described earlier) and handle it:
   - the target node is a leaf or has only one subtree
   - the target node has two subtrees

The details of implementing the delete helper function are left to the reader.
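For readers who want to compare against one possible design, here is a hedged sketch. It is not the author's implementation and differs from the outline in two ways that should be read as assumptions: instead of locating the parent explicitly, it passes the parent's child pointer by reference, and instead of swapping values it copies the right-subtree minimum over the target. The names `BinNode`, `deleteMinimum` and `deleteHelper` are hypothetical, and only `operator<` is required of `T`.

```cpp
#include <cassert>

template <typename T>
struct BinNode {                 // hypothetical stand-in for BinNodeT<T>
    T Element;
    BinNode* Left;
    BinNode* Right;
    explicit BinNode(const T& e) : Element(e), Left(nullptr), Right(nullptr) {}
};

// Unlinks and frees the left-most node of a non-empty subtree after copying
// its value into `out`. The left-most node has no left child, so its right
// subtree (possibly empty) simply takes its place.
template <typename T>
void deleteMinimum(BinNode<T>*& sRoot, T& out) {
    assert(sRoot != nullptr);
    if (sRoot->Left != nullptr) {
        deleteMinimum(sRoot->Left, out);
        return;
    }
    BinNode<T>* doomed = sRoot;
    out = doomed->Element;
    sRoot = doomed->Right;
    delete doomed;
}

// Deletes one node holding D; returns false if D is not present.
template <typename T>
bool deleteHelper(const T& D, BinNode<T>*& sRoot) {
    if (sRoot == nullptr) return false;
    if (D < sRoot->Element) return deleteHelper(D, sRoot->Left);
    if (sRoot->Element < D) return deleteHelper(D, sRoot->Right);

    // Neither is less than the other, so sRoot holds a match.
    if (sRoot->Left != nullptr && sRoot->Right != nullptr) {
        // Two subtrees: overwrite with the minimum of the right subtree,
        // which is removed from that subtree in the same pass.
        deleteMinimum(sRoot->Right, sRoot->Element);
    } else {
        // Leaf or one subtree: re-target the parent's pointer.
        BinNode<T>* doomed = sRoot;
        sRoot = (doomed->Left != nullptr) ? doomed->Left : doomed->Right;
        delete doomed;
    }
    return true;
}
```

Because the minimum is unlinked through a pointer reference, the case where the right subtree is a single leaf whose parent is the targeted node needs no special handling, and a duplicate of the minimum elsewhere in the right subtree is left untouched.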


Some Refinements

The given BST template inherits a number of useful features:


- a function to provide the size of the tree
- a function to provide the height of the tree
- a function to display the tree in a useful manner

It is also useful to have some instrumentation during testing. For example:


- log the values encountered and the directions taken during a search
This is also easy to add, but it poses a problem since we generally do not want to see
such output when the BST is used.
I resolve this by adding some data members and mutators to the template that enable
the client to optionally associate an output stream with the object, and to turn logging
of its operation on and off as needed.


Adding an Iterator

A BST iterator can be added in a very similar way to that shown earlier for the linked
list iterator.

As before:
- add a declaration of a class iterator within the BST declaration
- an iterator will store a pointer to a tree node; dereferencing it will return the address
of the data element within that node.
- implement the expected increment/decrement operators for the iterator
- add the expected iterator-supplying functions to the BST
- there's no case for an iterator-based insertion function in a BST, but there may be a
case for an iterator-based deletion function
- add a search function that returns an iterator
There are two considerations:
- what traversal pattern should the iterator provide?
- is it necessary to modify the BST implementation aside from adding support
functions?

Iterator-based Find

The public Find() function is just a stub to call the recursive helper:

template <typename T>
typename BSTreeT<T>::iterator BSTreeT<T>::Find(const T& toFind) {

   // this-> is needed to name a member inherited from the dependent base class
   if (this->Root == NULL) return iterator(NULL);

   return (FindHelper(toFind, this->Root));
}

The helper function wraps the matching node pointer in an iterator:

template <typename T>
typename BSTreeT<T>::iterator BSTreeT<T>::FindHelper(const T& toFind,
                                                     BinNodeT<T>* sRoot) {

   if (sRoot == NULL) return iterator(NULL);

   if (sRoot->Element == toFind) {
      return iterator(sRoot);
   }

   // recursive descent logic goes next . . .
}


Iterator Increment Logic

The only issue that is handled differently from the linked list iterator is the pattern by
which an iterator steps forward or backwards within the BST.
Consider stepping forward as in an inorder traversal:

        C
     /     \
    A       I
     \     / \
      B   E   J
         / \
        D   G
           / \
          F   H

The pattern is reasonably straightforward, but how can we move up from a node to its
parent within the BST?
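The notes leave this question open. One common answer, shown here purely as an assumption, is to store a parent pointer in every node; alternatives include keeping a stack of ancestors inside the iterator. The sketch below uses a hypothetical `Node` struct with a `parent` member and a hypothetical `successor` function that computes the inorder step forward.

```cpp
#include <cassert>

struct Node {                    // hypothetical node with a parent pointer
    char key;
    Node* left   = nullptr;
    Node* right  = nullptr;
    Node* parent = nullptr;      // nullptr for the root
    explicit Node(char k) : key(k) {}
};

// Returns the next node in an inorder traversal, or nullptr after the last.
const Node* successor(const Node* n) {
    assert(n != nullptr);
    if (n->right != nullptr) {
        // The next node is the left-most node of the right subtree.
        n = n->right;
        while (n->left != nullptr) n = n->left;
        return n;
    }
    // Otherwise climb while we are coming up from a right child; the first
    // ancestor reached from its left subtree is the next node.
    const Node* p = n->parent;
    while (p != nullptr && n == p->right) {
        n = p;
        p = p->parent;
    }
    return p;
}
```

The cost of this design is one extra pointer per node, which insertion and deletion must also keep up to date.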


Client-side Use

Here is a somewhat trivial use of the iterator-supplied traversal:

BSTreeT<int>::iterator It;
for (It = T.begin(); It != T.end(); It++) {
if ( *It <= Floor ) {
*It = Floor;
}
}

Note also that the client can implement an inorder search of the BST using the same
approach. In fact, the BST itself could do that instead of recursion.

One point here is that the code above shows how the client can traverse the structure
without knowing anything about the actual physical layout; compare this to a traversal of
an STL vector, or even of a simple array.
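For comparison, the same loop shape applied to a `std::vector<int>` might look like the sketch below. The function name `applyFloor` is hypothetical; only the container type and its iterator differ from the BST version.

```cpp
#include <cassert>
#include <vector>

// Raises every element that is less than or equal to floorValue up to
// floorValue, visiting the elements through the container's iterator.
void applyFloor(std::vector<int>& v, int floorValue) {
    for (std::vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
        if (*it <= floorValue) {
            *it = floorValue;
        }
    }
}
```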


Instrumentation


The extended BST template:

template <typename T> class BSTreeT {
protected:
   . . .
   ostream* Log;
   bool     loggingOn;
   . . .
public:
   . . .
   bool setLogStream(ostream* const L);
   bool logOn();
   bool logOff();
};

The client may entirely ignore the ability to log an execution trace, or enable it and turn
logging on and off at will.

template <typename T>
bool BSTreeT<T>::logOn() {
   if ( Log != NULL )
      loggingOn = true;
   return loggingOn;
}

Of course, it's important to avoid dereferencing a null pointer.

template <typename T> T* BSTreeT<T>::InsertHelper(. . .) {
   . . .
   if ( Elem < (sRoot->Element) ) {
      if ( loggingOn )
         *Log << "Insert: going left from " << (sRoot->Element) << endl;
      return InsertHelper(Elem, sRoot->Left);
   }
   . . .
}


Balance in a BST

However, a BST with N nodes does not always provide O(log N) search times.

        D
     /     \
    B       G
   / \     / \
  A   C   E   H

A well-balanced BST. This will have log(N) search times in the worst case.

    B
   / \
  A   G
     / \
    C   H
     \
      D
       \
        E

A poorly-balanced BST. This will not have log(N) search times in the worst case.

What if we inserted the values in the order:

A B C D E G H

Search Cost in a BST

From an earlier theorem on binary trees, we know that a binary tree that contains L nodes
must contain at least 1 + log L levels.
If the tree is full, we can improve the result to imply that a full binary tree that contains N
nodes must contain at least log N levels.
So, for any BST, there is always an element whose search cost is at least log N.

Unfortunately, it can be much worse. If the BST is a "stalk" then the search cost for the
last element would be N.
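The effect of insertion order on height is easy to measure. The sketch below is an illustration only, with hypothetical names (`Node`, `insert`, `height`, `heightAfterInserting`); it builds a BST of `char` keys from a string and reports the number of levels.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

struct Node {                    // hypothetical node type, for illustration only
    char key;
    Node* left  = nullptr;
    Node* right = nullptr;
    explicit Node(char k) : key(k) {}
};

// Standard leaf insertion; equal keys go to the right.
void insert(Node*& root, char key) {
    if (root == nullptr) { root = new Node(key); return; }
    insert(key < root->key ? root->left : root->right, key);
}

// Number of levels in the tree; 0 for an empty tree.
int height(const Node* root) {
    if (root == nullptr) return 0;
    return 1 + std::max(height(root->left), height(root->right));
}

void destroy(Node* root) {
    if (root == nullptr) return;
    destroy(root->left);
    destroy(root->right);
    delete root;
}

// Inserts the characters of `keys` in order and returns the resulting height.
int heightAfterInserting(const std::string& keys) {
    Node* root = nullptr;
    for (char k : keys) insert(root, k);
    int h = height(root);
    destroy(root);
    return h;
}
```

Inserting the seven keys in sorted order, `heightAfterInserting("ABCDEGH")`, gives 7 levels (a stalk), while `heightAfterInserting("DBGACEH")` gives 3 levels, the minimum possible for seven nodes.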

It all comes down to one simple issue: how close is the tree to having minimum height?

Unfortunately, if we perform lots of random insertions and deletions in a BST, there is no
reason to expect that the result will have nearly-minimum height.


Cost of Insertion/Deletion in a BST

Clearly, once the parent is found, the remaining cost of inserting a new node in a BST is
constant, simply allocating a new node object and setting a pointer in the parent node.
So, insertion cost is essentially the same as search cost.

For deletion, the argument is slightly more complex. Suppose the targeted node (and its
parent) has been found.
If the targeted node has at most one subtree, then the remaining cost is resetting a pointer
in the parent and deallocating the node; that's constant.
But, if the targeted node has two subtrees, then an additional search must be carried out to
find the minimum value in the right subtree, and then an element copy must be performed, and
then that node must be removed from the right subtree (which is again a constant cost).
In either case, we have no more than the cost of a worst-case search to the leaf level, plus
some constant manipulations.
So, deletion cost is also essentially the same as search cost.
