
Ramya M V

Data structures and algorithms

UNIT III

Trees: Binary tree, Terminology, Representation, Traversals, Applications – Binary search tree – AVL tree. Heaps, disjoint sets

TREES

Basic terminologies
TREE: is defined as a finite set of one or more nodes such that

• there is one specially designated node called ROOT and


• the remaining nodes are partitioned into a collection of sub-trees of the root each of which is
also a tree.

Example:

LEVEL 0:  A
LEVEL 1:  B  C
LEVEL 2:  D  E  F  G
LEVEL 3:  H  I  J

NODE: A node of a tree stores the actual data and links to other nodes.

The nodes of a tree have a parent-child relationship. The root does not have a parent, but each of
the other nodes has a parent node associated with it. A node may or may not have children; a node
with no children is called a leaf node or terminal node.

A line from a parent to a child node is called a branch. If a tree has n nodes, one of which is the root,
there are n-1 branches.

The number of sub-trees of a node is called its degree. The degree of A is 2, F is 1 and J is zero. A
leaf node has degree zero; the other nodes are referred to as non-terminals.

The degree of a tree is the maximum degree of the nodes in the tree.

Nodes with the same parent are called siblings. Here D & E are siblings. H & I are also siblings.

The level of a node is defined by initially letting the root be at level zero. If a node is at level l, then its
children are at level l+1.

The height or depth of a tree is defined to be the maximum level of any node in the tree.

A set of trees is called a forest; if we remove the root of a tree we get a forest. In the figure below, if we
remove A, we get a forest with three trees.

FIG: Tree and FIG: Forest (Sub-trees)

PROPERTIES OF A TREE:

1. Any node can be the root of the tree, and each node in a tree has the property that there is
exactly one path connecting that node with every other node in the tree. A tree in which the
root is identified is called a rooted tree; a tree in which the root is not identified is called a
free tree.
2. Each node, except the root, has a unique parent.

BINARY TREES:

A binary tree is a tree that is either empty or consists of a root node and two disjoint binary
trees called the left sub-tree and right sub-tree.

FIG: Binary Tree

In a binary tree, no node can have more than two children. So every binary tree is a tree, but not every
tree is a binary tree.

A complete binary tree is a binary tree in which all internal nodes have degree 2 and all leaves
are at the same level.

FIG: Complete binary tree

FIG: Strictly binary tree

If every non-leaf node in a binary tree has non-empty left and right sub-trees, the tree is termed
a strictly binary tree.

If A is the root of a binary tree and B is the root of its left or right sub-tree, then A is said to be
the parent of B and B is said to be the left or right child of A.

Node n1 is an ancestor of node n2 if n1 is either the parent of n2 or the parent of some ancestor of n2.
In that case n2 is a descendant of n1.

A node n2 is a left descendant of node n1 if n2 is either the left child of n1 or a descendant of the
left child of n1. A right descendant may be similarly defined.

In a binary tree, level i can hold at most 2^i nodes; in a complete binary tree every level i holds
exactly 2^i nodes. A complete binary tree with k levels (levels 0 to k-1) therefore contains
2^0 + 2^1 + ... + 2^(k-1) = 2^k - 1 nodes.

• To illustrate the above definition, consider the sample tree as given in fig.
• In the sample tree T, there is a set of 12 nodes. Here A is a special node, being the root of the
tree. The remaining nodes are partitioned into 3 sets T1, T2 and T3; they are sub-trees of the root
node A. By definition, each sub-tree is again a tree. Observe that a tree is defined recursively. The
same tree can be expressed in a string notation as shown below.

BINARY TREE REPRESENTATION:

LINEAR REPRESENTATION OF A BINARY TREE

The linear representation method of implementing a binary tree uses a one-dimensional array of
size 2^(d+1) - 1, where d is the depth of the tree.

Once the size of the array has been determined the following method is used to represent the tree.

1. Store the root in the 1st location of the array.
2. If a node is in location n of the array, store its left child at location 2n and its right child at
location (2n+1).

In C, arrays start at position 0; therefore, instead of numbering the tree's nodes from 1 to n, we number
them from 0 to n-1. The two children of a node at position P are in positions 2P+1 and 2P+2.
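
The index arithmetic described above can be sketched in C as follows. This is only an illustration under the 0-based numbering just introduced; the function names left_child, right_child, parent and array_size are hypothetical helpers, not part of the original notes.

```c
#include <assert.h>

/* Positions are 0-based, as for C arrays. */
int left_child(int p)  { return 2 * p + 1; }
int right_child(int p) { return 2 * p + 2; }

/* Parent of position p. The root (position 0) has no parent,
   which this sketch reports as -1. */
int parent(int p)
{
    assert(p >= 0);
    return (p == 0) ? -1 : (p - 1) / 2;
}

/* Number of array slots needed for a tree of depth d: 2^(d+1) - 1. */
int array_size(int d)
{
    assert(d >= 0 && d < 30);   /* keep the shift inside the range of int */
    return (1 << (d + 1)) - 1;
}
```

For example, a node stored at position 3 has its children at positions 7 and 8, and both of those positions report 3 as their parent.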

The following figure illustrates an array that represents an almost complete binary tree.

Tree (level by level): A; B C; D E F G; H I

Position:  0  1  2  3  4  5  6  7  8
Node:      A  B  C  D  E  F  G  H  I

We can extend this array representation of almost complete binary trees to an array representation of
binary trees generally.

The following fig(A) illustrates a binary tree and fig(B) illustrates the almost complete binary tree of
fig(A). Finally, fig(C) illustrates the array implementation of the almost complete binary tree.

Fig(A) Binary tree


Fig(B) Almost complete binary tree

Fig(C) Array representation: positions 0 to 12, holding the nodes A, B, C, F, G, H and I

ADVANTAGES:

1. Given a child node, its parent node can be determined immediately. If a child node is at
location N in the array (with locations numbered from 1), then its parent is at location N/2,
using integer division. With C's numbering from 0, the parent of position P is at (P-1)/2.
2. It can be implemented easily in languages in which only static memory allocation is directly
available.
DISADVANTAGES:

1. Insertion or deletion of a node causes considerable data movement up and down the array,
using an excessive amount of processing time.

2. Wastage of memory due to partially filled trees.

LINKED LIST REPRESENTATION:

Binary trees are most commonly represented by linked structures. Each node can be considered as having 3
elementary fields: a data field, a left pointer pointing to the left sub-tree, and a right pointer pointing to the
right sub-tree.
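
A node with these three fields might be declared in C roughly as below. The names NODEPTR, info, left and right follow the traversal routines later in these notes; maketree, setleft and setright are hypothetical helpers assumed here for illustration only.

```c
#include <assert.h>
#include <stdlib.h>

/* One tree node: a data field plus left and right pointers. */
struct node {
    int info;
    struct node *left;
    struct node *right;
};
typedef struct node *NODEPTR;

/* Allocate a node holding x with both sub-trees empty.
   Returns NULL if no memory is available. */
NODEPTR maketree(int x)
{
    NODEPTR p = malloc(sizeof *p);
    if (p != NULL) {
        p->info = x;
        p->left = NULL;
        p->right = NULL;
    }
    return p;
}

/* Attach a new left child holding x to a node that has none yet. */
void setleft(NODEPTR p, int x)
{
    assert(p != NULL && p->left == NULL);
    p->left = maketree(x);
}

/* Attach a new right child holding x to a node that has none yet. */
void setright(NODEPTR p, int x)
{
    assert(p != NULL && p->right == NULL);
    p->right = maketree(x);
}
```

A NULL pointer plays the role of the empty sub-tree, so a leaf is simply a node whose left and right fields are both NULL.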

The following figure is an example of linked storage representation of a binary tree.


FIG: Binary tree and FIG: Linked representation of a binary tree (0 marks an empty pointer field)

Although for most purposes the linked representation of a binary tree is efficient, it does have
certain disadvantages. Namely,

1. Memory space is wasted in null pointers.

2. Given a node, it is difficult to determine its parent.

3. Its implementation algorithm is more difficult in languages that do not offer dynamic storage
techniques.

BINARY TREE TRAVERSALS (***)

Explain all binary tree traversal procedure with neat diagrams. (11 Marks Nov 2013,Nov
2014,Nov 2015)

Another common operation is to traverse a binary tree; that is, to pass through the tree, enumerating
each of its nodes once.

• The contents of each node can be printed or processed. In either case, each node is visited as it
is enumerated.
• The order in which the nodes of a linear list are visited in a traversal is clearly from first to last.
• However, there is no such "natural" linear order for the nodes of a tree. Thus, different
orderings are used for traversal in different cases.


• There are three traversal methods. In each of these methods, nothing needs to be done to traverse an
empty binary tree.
• The methods are all defined recursively, so that traversing a binary tree involves visiting the
root and traversing its left and right subtrees.
• The only difference among the methods is the order in which these three operations are
performed.
1. To traverse a nonempty binary tree in preorder (also known as depth-first order), we perform
the following three operations:
   1. Visit the root.
   2. Traverse the left subtree in preorder.
   3. Traverse the right subtree in preorder.

void pretrav(NODEPTR tree)
{
    if (tree != NULL)
    {
        printf("%d\n", tree->info);   /* visit the root */
        pretrav(tree->left);          /* traverse the left subtree */
        pretrav(tree->right);         /* traverse the right subtree */
    }
}
2. To traverse a nonempty binary tree in inorder (or symmetric order):
   1. Traverse the left subtree in inorder.
   2. Visit the root.
   3. Traverse the right subtree in inorder.

void intrav(NODEPTR tree)
{
    if (tree != NULL)
    {
        intrav(tree->left);           /* traverse the left subtree */
        printf("%d\n", tree->info);   /* visit the root */
        intrav(tree->right);          /* traverse the right subtree */
    }
}
3. To traverse a nonempty binary tree in postorder:
   1. Traverse the left subtree in postorder.
   2. Traverse the right subtree in postorder.
   3. Visit the root.

void posttrav(NODEPTR tree)
{
    if (tree != NULL)
    {
        posttrav(tree->left);         /* traverse the left subtree */
        posttrav(tree->right);        /* traverse the right subtree */
        printf("%d\n", tree->info);   /* visit the root */
    }
}
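
The routines above assume that NODEPTR and the fields info, left and right are already declared. As a hedged, self-contained sketch, the variants below declare a matching node type and collect the visited values into an array instead of printing them, which makes the three visiting orders easy to compare. The function names preorder, inorder and postorder and the out/n parameters are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int info;
    struct node *left;
    struct node *right;
};
typedef struct node *NODEPTR;

/* Each routine appends the visited values to out[] starting at index n
   and returns the new count. The caller must supply a large enough array. */

/* Preorder: root, left subtree, right subtree. */
int preorder(NODEPTR tree, int out[], int n)
{
    if (tree != NULL) {
        out[n++] = tree->info;
        n = preorder(tree->left, out, n);
        n = preorder(tree->right, out, n);
    }
    return n;
}

/* Inorder: left subtree, root, right subtree. */
int inorder(NODEPTR tree, int out[], int n)
{
    if (tree != NULL) {
        n = inorder(tree->left, out, n);
        out[n++] = tree->info;
        n = inorder(tree->right, out, n);
    }
    return n;
}

/* Postorder: left subtree, right subtree, root. */
int postorder(NODEPTR tree, int out[], int n)
{
    if (tree != NULL) {
        n = postorder(tree->left, out, n);
        n = postorder(tree->right, out, n);
        out[n++] = tree->info;
    }
    return n;
}
```

For a tree with root 1, left child 2 (whose children are 4 and 5) and right child 3, preorder visits 1 2 4 5 3, inorder visits 4 2 5 1 3, and postorder visits 4 5 2 3 1.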
The figure illustrates two binary trees and their traversals in preorder, inorder and postorder.

Inorder traversal


Preorder traversal

Postorder Traversal

APPLICATIONS OF BINARY TREE

There are 4 applications of a Binary Tree

• Finding out the Duplicate


• Traversal
• Sorting
• Evaluation of Expression Trees


2. Traversal
Explain the traversals of binary tree with examples. (11 Marks April 2014)
Explain all binary tree traversal procedures with neat diagrams. (11 Marks Nov 2013)
What are the three tree traversal methods in a tree? (11 Marks April 2015)

• The three traversal methods (preorder, inorder and postorder) and their C routines pretrav, intrav
and posttrav are described under BINARY TREE TRAVERSALS above.

4. Evaluation of Expression Trees

• As another application of binary trees, consider the following method of representing an
expression containing operands and binary operators by a strictly binary tree.
• The root of the tree contains an operator that is to be applied to the results of evaluating the
expressions represented by the left and right subtrees.
• A node representing an operator is a nonleaf, whereas a node representing an operand is a leaf.
• A preorder traversal yields the prefix form of the expression.
• Similarly, traversing a binary expression tree in postorder places an operator after its two
operands, so that a postorder traversal produces the postfix form of the expression.
• Since the root (operator) is visited after the nodes of the left subtree and before the nodes of the
right subtree (the two operands), we might expect an inorder traversal to yield the infix form of
the expression. Indeed, if the binary tree representing A+B*C is traversed in inorder, the infix
expression A+B*C is obtained. However, a binary expression tree does not contain parentheses,
since the order of the operations is implied by the structure of the tree.
• Thus an expression whose infix form requires parentheses to override the conventional
precedence rules explicitly cannot be retrieved by a simple inorder traversal. For example, the
inorder traversal of the tree for (A+B)*C also yields A+B*C.
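
Evaluating such a tree follows the same pattern as a postorder traversal: both subtrees are evaluated first, then the operator at the root is applied. The sketch below is an assumption-laden illustration, not code from the notes: the struct expr layout (op equal to 0 marking an operand leaf) and the function name evaluate are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* A leaf has op == 0 and stores an operand in value.
   A nonleaf stores one of '+', '-', '*', '/' in op and has two subtrees. */
struct expr {
    char op;
    double value;
    struct expr *left;
    struct expr *right;
};

/* Evaluate the expression tree rooted at t. */
double evaluate(const struct expr *t)
{
    assert(t != NULL);
    if (t->op == 0)
        return t->value;               /* operand: nothing to apply */

    double l = evaluate(t->left);      /* evaluate the left operand  */
    double r = evaluate(t->right);     /* evaluate the right operand */

    switch (t->op) {                   /* then apply the root operator */
    case '+': return l + r;
    case '-': return l - r;
    case '*': return l * r;
    case '/': return l / r;
    }
    assert(!"unknown operator");
    return 0.0;
}
```

With A = 2, B = 3 and C = 4, the tree for A+B*C evaluates to 14 and the tree for (A+B)*C evaluates to 20, even though both trees give the same inorder listing. The grouping is carried by the shape of the tree, not by parentheses.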

TYPES OF BINARY TREES

• There are several types of binary trees, each with its own properties.
• A few important, frequently used trees are listed below.
1. Expression tree
2. Binary Search Tree
3. Heap Tree
4. Threaded Binary Tree
5. Huffman Tree
6. Height balanced Tree (AVL Tree)
7. Weight balanced Tree
8. Decision Tree

BINARY SEARCH TREES (BST) (AND ITS OPERATIONS)

• An important application of binary trees is their use in searching.
• The property that makes a binary tree into a binary search tree is that for every node, X, in the
tree, the values of all the keys in the left subtree are smaller than the key value in X, and the
values of all the keys in the right subtree are larger than the key value in X.
• Notice that this implies that all the elements in the tree can be ordered in some consistent
manner. In the figure, the tree on the left is a binary search tree, but the tree on the right is not.
• The tree on the right has a node with key 7 in the left subtree of a node with key 6 (which
happens to be the root).

Two binary trees (only the left tree is a search tree)

Find_min and find_max

These routines return the position of the smallest and largest elements in the tree, respectively.
Although returning the exact values of these elements might seem more reasonable, this would be
inconsistent with the find operation. It is important that similar-looking operations do similar things. To
perform a find_min, start at the root and go left as long as there is a left child. The stopping point is the
smallest element. The find_max routine is the same, except that branching is to the right child.
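
A minimal iterative sketch of the two routines follows. The struct tree_node layout and the field name element are assumptions made for this illustration; both functions return a pointer to the node (its position) rather than the stored value, as the paragraph above recommends, and return NULL for an empty tree.

```c
#include <assert.h>
#include <stddef.h>

struct tree_node {
    int element;
    struct tree_node *left;
    struct tree_node *right;
};

/* Go left as long as there is a left child; the stopping point
   holds the smallest element. */
struct tree_node *find_min(struct tree_node *t)
{
    if (t != NULL)
        while (t->left != NULL)
            t = t->left;
    return t;
}

/* Same idea, branching to the right child instead. */
struct tree_node *find_max(struct tree_node *t)
{
    if (t != NULL)
        while (t->right != NULL)
            t = t->right;
    return t;
}
```

Both routines take time proportional to the depth of the tree, since each step moves one level down.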

Insert

• The insertion routine is conceptually simple.


• To insert x into tree T, proceed down the tree as you would with a find.
• If x is found, do nothing (or "update" something). Otherwise, insert x at the last spot on the path
traversed. Figure shows what happens.
• To insert 5, we traverse the tree as though a find were occurring. At the node with key 4, go
right, but there is no subtree, so 5 is not in the tree, and this is the correct spot.
• Duplicates can be handled by keeping an extra field in the node record indicating the frequency
of occurrence.
• This adds some extra space to the entire tree, but is better than putting duplicates in the tree
(which tends to make the tree very deep).
• Of course this strategy does not work if the key is only part of a larger record.
• If that is the case, then it is possible to keep all of the records that have the same key in an
auxiliary data structure, such as a list or another search tree.

Binary search trees before and after inserting 5

• The following figure shows the code for the insertion routine.
• Since T points to the root of the tree, and the root changes on the first insertion, insert is written
as a function that returns a pointer to the root of the new tree. Lines 8 and 10 recursively insert
and attach x into the appropriate subtree.
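
A recursive insertion routine of the kind described above might look like the following sketch. It is not necessarily the code of the figure, and its line numbers do not correspond to the ones mentioned in the text; the struct tree_node layout and the choice to abort when memory runs out are assumptions of this illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct tree_node {
    int element;
    struct tree_node *left;
    struct tree_node *right;
};

/* Insert x into the tree rooted at t and return the (possibly new) root.
   A key that is already present is left alone. */
struct tree_node *insert(int x, struct tree_node *t)
{
    if (t == NULL) {
        /* Reached the last spot on the search path: create a one-node tree. */
        t = malloc(sizeof *t);
        if (t == NULL)
            abort();                       /* out of memory */
        t->element = x;
        t->left = NULL;
        t->right = NULL;
    } else if (x < t->element) {
        t->left = insert(x, t->left);      /* insert and re-attach on the left  */
    } else if (x > t->element) {
        t->right = insert(x, t->right);    /* insert and re-attach on the right */
    }
    /* x == t->element: duplicate, do nothing */
    return t;
}
```

Because the function returns the root, the usual call has the form `root = insert(x, root);`, which also covers the first insertion into an empty tree.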

Delete

As is common with many data structures, the hardest operation is deletion.

Possibilities

• If the node is a leaf, it can be deleted immediately.
• If the node has one child, the node can be deleted after its parent adjusts a pointer to bypass the node.
• If the node has two children, the general strategy is to replace the key of this node with the
smallest key of the right subtree (which is easily found) and recursively delete that node (which
is now empty).
• Because the smallest node in the right subtree cannot have a left child, the second delete is an
easy one. The figure shows an initial tree and the result of a deletion.
• The node to be deleted is the left child of the root; the key value is 2. It is replaced with the
smallest key in its right subtree (3), and then that node is deleted as before.

Deletion of a node (4) with one child, before and after

Deletion of a node (2) with two children, before and after

• The code in Figure performs deletion. It is inefficient, because it makes two passes down the tree to find and
delete the smallest node in the right subtree when this is appropriate.
• If the number of deletions is expected to be small, then a popular strategy to use is lazy deletion:
When an element is to be deleted, it is left in the tree and merely marked as being deleted.
• This is especially popular if duplicate keys are present, because then the field that keeps count of the frequency
of appearance can be decremented.
• If the number of real nodes in the tree is the same as the number of "deleted" nodes, then the depth of the tree is
only expected to go up by a small constant, so there is a very small time penalty associated with lazy deletion.
Also, if a deleted key is reinserted, the overhead of allocating a new cell is avoided.
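
The three cases of (non-lazy) deletion can be sketched as below. This is an illustrative sketch rather than the code of the figure: struct tree_node, new_node and delete_key are hypothetical names, and nodes are assumed to have been allocated with malloc so that they can be released with free.

```c
#include <assert.h>
#include <stdlib.h>

struct tree_node {
    int element;
    struct tree_node *left;
    struct tree_node *right;
};

/* Helper for building small trees by hand. */
struct tree_node *new_node(int x, struct tree_node *l, struct tree_node *r)
{
    struct tree_node *t = malloc(sizeof *t);
    if (t == NULL)
        abort();                               /* out of memory */
    t->element = x;
    t->left = l;
    t->right = r;
    return t;
}

/* Delete x from the tree rooted at t and return the new root. */
struct tree_node *delete_key(int x, struct tree_node *t)
{
    if (t == NULL)
        return NULL;                           /* x is not in the tree */

    if (x < t->element) {
        t->left = delete_key(x, t->left);
    } else if (x > t->element) {
        t->right = delete_key(x, t->right);
    } else if (t->left != NULL && t->right != NULL) {
        /* Two children: copy the smallest key of the right subtree here,
           then delete that key from the right subtree (second pass). */
        struct tree_node *m = t->right;
        while (m->left != NULL)
            m = m->left;
        int smallest = m->element;
        t->element = smallest;
        t->right = delete_key(smallest, t->right);
    } else {
        /* Leaf or one child: the parent bypasses this node. */
        struct tree_node *child = (t->left != NULL) ? t->left : t->right;
        free(t);
        t = child;
    }
    return t;
}
```

The two-children branch walks down the right subtree twice, once to find the smallest key and once to delete it, which is the inefficiency the text points out.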
HEAP

Definition: A heap is a specialized tree-based data structure that satisfies the heap property:

• if B is a child node of A, then key(A) ≥ key(B). This implies that an element with the
greatest key is always in the root node, and so such a heap is sometimes called a max-heap.
In a min-heap the comparison is reversed: key(A) ≤ key(B), so an element with the smallest
key is always in the root node.

Applications: A heap has many applications, including the most efficient implementation of
priority queues, which are useful in many applications. In particular, heaps are crucial in
several efficient graph algorithms.

Variants:

• 2-3 heap
• Binary heap
• Many others, such as the binomial heap and the Fibonacci heap

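Binary heap storage rules -- A binary heap is a heap implemented with a binary tree in which the
following two rules are followed:

• The element contained by each node is greater than or equal to the elements of that
node's children.
• The tree is a complete binary tree in the sense used for heaps: every level is full except
possibly the last, which is filled from left to right (the almost complete binary tree of the
array representation described earlier).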

Example: which one is a heap?

Heap Implementation
Adding an Element to a Heap

Example: We want to insert a node with value 42 to the heap on the left.
The above process is called reheapification upward.

Pseudocode for Adding an Element:

1. Place the new element in the heap in the first available location. This keeps the structure as a complete
binary tree, but it might no longer be a heap since the new element might have a greater value than its
parent.
2. while (the new element has a greater value than its parent) swap the new element with its parent.
3. Notice that Step 2 will stop when the new element reaches the root or when the new element's parent has a
value greater than or equal to the new element's value.

Removing the Root of a Heap

This is the procedure for deleting the root from the heap, which effectively extracts the maximum
element in a max-heap or the minimum element in a min-heap.

The above process is called reheapification downward.

Pseudocode for Removing the Root:

1. Copy the element at the root of the heap to the variable used to return a value.
2. Copy the last element in the deepest level to the root and then take this last node out of the tree. This
element is called the "out-of-place" element.
3. while (the out-of-place element has a value that is lower than one of its children) swap the out-of-place
element with its greatest-value child.
4. Return the answer that was saved in Step 1.
5. Notice that Step 3 will stop when the out-of-place element reaches a leaf or it has a value that is greater than or
equal to all its children.

Now, think about how to build a heap. Check out the example of inserting 27, 35, 23, 22, 4, 45, 21, 5, 42
and 19 to an empty heap.
AVL TREES

What if the input to a binary search tree comes in sorted (ascending or descending) order? It
will then look like this −

In that case the BST degenerates into a chain, and its worst-case performance is the same as that of
linear search, that is Ο(n). With real-world data, we cannot predict the data pattern and
frequencies. So, a need arises to balance out the existing BST.

Named after their inventors Adelson-Velsky and Landis, AVL trees are height-balancing binary
search trees. An AVL tree checks the heights of the left and the right sub-trees of every node and
ensures that the difference is not more than 1. This difference is called the Balance Factor.

Here we see that the first tree is balanced and the next two trees are not balanced −

In the second tree, the left subtree of C has height 2 and the right subtree has height 0, so the
difference is 2. In the third tree, the right subtree of A has height 2 and the left is missing, so it
is 0, and the difference is 2 again. An AVL tree permits the difference (balance factor) to be at
most 1 in magnitude, that is -1, 0 or 1.

BalanceFactor = height(left-subtree) − height(right-subtree)

If the difference in the height of left and right sub-trees is more than 1, the tree is balanced
using some rotation techniques.
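
The balance factor formula can be sketched directly in C. The convention follows the text above: a missing subtree has height 0 and a single node has height 1. The struct avl_node layout and the function names are assumptions of this sketch, and the heights are recomputed on every call for clarity; practical AVL implementations usually store the height in each node instead.

```c
#include <assert.h>
#include <stddef.h>

struct avl_node {
    int key;
    struct avl_node *left;
    struct avl_node *right;
};

/* Height of a subtree: 0 for an empty subtree, 1 for a single node. */
int height(const struct avl_node *t)
{
    if (t == NULL)
        return 0;
    int hl = height(t->left);
    int hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

/* BalanceFactor = height(left-subtree) - height(right-subtree). */
int balance_factor(const struct avl_node *t)
{
    if (t == NULL)
        return 0;
    return height(t->left) - height(t->right);
}

/* 1 if every node has a balance factor of -1, 0 or 1; otherwise 0. */
int is_avl_balanced(const struct avl_node *t)
{
    if (t == NULL)
        return 1;
    int bf = balance_factor(t);
    return bf >= -1 && bf <= 1
        && is_avl_balanced(t->left)
        && is_avl_balanced(t->right);
}
```

With this sign convention a left-heavy node such as C in the second tree has balance factor +2, and a right-heavy node such as A in the third tree has balance factor -2.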

AVL Rotations
To balance itself, an AVL tree may perform the following four kinds of rotations −

• Left rotation
• Right rotation
• Left-Right rotation
• Right-Left rotation
The first two rotations are single rotations and the next two rotations are double rotations. To
have an unbalanced tree, we at least need a tree of height 2. With this simple tree, let's
understand them one by one.

Left Rotation
If a tree becomes unbalanced when a node is inserted into the right subtree of the right
subtree, then we perform a single left rotation −

In our example, node A has become unbalanced as a node is inserted in the right subtree of
A's right subtree. We perform the left rotation by making A the left subtree of B.

Right Rotation
An AVL tree may become unbalanced if a node is inserted in the left subtree of the left subtree.
The tree then needs a right rotation.

As depicted, the unbalanced node becomes the right child of its left child by performing a
right rotation.
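
Both single rotations amount to re-linking three pointers. The sketch below assumes a plain struct avl_node without a stored height; rotate_left and rotate_right are hypothetical names, and each returns the new root of the rotated subtree so that the caller can re-attach it.

```c
#include <assert.h>
#include <stddef.h>

struct avl_node {
    int key;
    struct avl_node *left;
    struct avl_node *right;
};

/* Left rotation: the right child b of a becomes the new subtree root
   and a becomes the left subtree of b. */
struct avl_node *rotate_left(struct avl_node *a)
{
    assert(a != NULL && a->right != NULL);
    struct avl_node *b = a->right;
    a->right = b->left;     /* b's old left subtree moves under a */
    b->left = a;
    return b;
}

/* Right rotation: the left child b of c becomes the new subtree root
   and c becomes the right child of b. */
struct avl_node *rotate_right(struct avl_node *c)
{
    assert(c != NULL && c->left != NULL);
    struct avl_node *b = c->left;
    c->left = b->right;     /* b's old right subtree moves under c */
    b->right = c;
    return b;
}
```

The subtree that changes parent (b's inner subtree) keeps its position in the inorder sequence, so the result is still a binary search tree.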
Left-Right Rotation
Double rotations are slightly more complex versions of the rotations already explained. To
understand them better, we should take note of each action performed during the rotation. Let's
first check how to perform a Left-Right rotation. A left-right rotation is a combination of a left
rotation followed by a right rotation.

Action at each state:

1. A node has been inserted into the right subtree of the left subtree. This makes C an
unbalanced node. These scenarios cause the AVL tree to perform a left-right rotation.
2. We first perform the left rotation on the left subtree of C. This makes A the left subtree
of B.
3. Node C is still unbalanced; however, now it is because of the left subtree of the left subtree.
4. We shall now right-rotate the tree, making B the new root node of this subtree. C now
becomes the right subtree of its own left subtree.
5. The tree is now balanced.

Right-Left Rotation

The second type of double rotation is Right-Left Rotation. It is a combination of right rotation
followed by left rotation.

Action at each state:

1. A node has been inserted into the left subtree of the right subtree. This makes A an
unbalanced node with balance factor −2 (a height difference of 2).
2. First, we perform the right rotation along the C node, making C the right subtree of its own
left subtree B. Now, B becomes the right subtree of A.
3. Node A is still unbalanced because of the right subtree of its right subtree and requires a
left rotation.
4. A left rotation is performed by making B the new root node of the subtree. A becomes
the left subtree of its right subtree B.
5. The tree is now balanced.
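
The two double rotations can be written as compositions of the single rotations, mirroring the steps listed above. This is a sketch under the same assumptions as before: struct avl_node has no stored height, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct avl_node {
    int key;
    struct avl_node *left;
    struct avl_node *right;
};

/* Single rotations, each returning the new subtree root. */
struct avl_node *rotate_left(struct avl_node *a)
{
    assert(a != NULL && a->right != NULL);
    struct avl_node *b = a->right;
    a->right = b->left;
    b->left = a;
    return b;
}

struct avl_node *rotate_right(struct avl_node *c)
{
    assert(c != NULL && c->left != NULL);
    struct avl_node *b = c->left;
    c->left = b->right;
    b->right = c;
    return b;
}

/* Left-Right: left-rotate the left subtree of c, then right-rotate c. */
struct avl_node *rotate_left_right(struct avl_node *c)
{
    assert(c != NULL && c->left != NULL);
    c->left = rotate_left(c->left);
    return rotate_right(c);
}

/* Right-Left: right-rotate the right subtree of a, then left-rotate a. */
struct avl_node *rotate_right_left(struct avl_node *a)
{
    assert(a != NULL && a->right != NULL);
    a->right = rotate_right(a->right);
    return rotate_left(a);
}
```

In both cases the middle key B ends up as the root of the subtree, with A as its left child and C as its right child.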

NOTE: Refer to the class notes as well for examples.
