
11-10-2013

ESO 207A / 211
Data Structures and Algorithms
3-0-0-9
Rajeev Kumar
rajv AT iitk.ac.in
Course URL: web.cse.iitk.ac.in/~eso207/
L17 : M W Th : 1000 1050 hrs
Lecture 33: Ht-Balanced Search Trees
Multi-way B-Tree
Balanced Trees: AVL, B-Tree & Red Black Tree
Multi-way Search Tree: B-Tree
Ref.: Aho (Section 11.4)
Weiss (Section 4.7)
Cormen (Chapter 18)
Next: Red Black Tree
Ref.: Cormen (Chapter 13)
An m-ary Search Tree storing n elements is said to be
height balanced if its height is $O(\log_m n)$.
Approaches to ht-balancing
Strict Balance
Tree must always be balanced perfectly.
m-way, 2-3-4 (m-way with m=4), B/B+-tree
Nearly balanced
Allows a little out-of-balance
AVL tree {A rotated BST for near balancing}
Red Black Tree {2-3-4 tree BST, color, rotation}
Adjust on access
Not balanced, though Amortized time is O(log n)
Ex.: Splay tree
No balancing: BST
Height Balanced Tree
B-Tree
(B, B*, B+ Trees)
A B-tree of order m is an m-way search tree (Generalized BST)
A node can have at most m children.
Each node may contain
a large number of keys,
$k \le m-1$, which are ordered,
$k_1 < k_2 < \dots < k_i < \dots < k_{m-1}$
with $T_0, T_1, \dots, T_{m-1}$ sub-trees.
All leaves are on the same level (Perfectly Balanced Tree).
[Figure: example B-tree with m = 5]
Balanced Multi- (m-) way Search Tree (B-Tree)
The root has
at most m children, but
may have as few as 2 if it is not a leaf, or
no children if the tree consists of the root alone.
All internal nodes except the root have
at most m non-empty children, and
at least $\lceil m/2 \rceil$ non-empty children {For m=5, $3 \le nC \le 5$}
All leaves are at the same level.
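The structural rules above can be checked mechanically. What follows is a minimal sketch in C, assuming integer keys, order M = 5, and a hypothetical node layout `BNode` (these names are illustrative, not from the slides). It checks the key count of every node, key order within a node, and that all leaves are at the same level; it does not check key ordering across subtrees.

```c
#include <assert.h>
#include <stddef.h>

#define M 5                            /* order of the tree (assumed) */
#define MIN_CHILDREN ((M + 1) / 2)     /* ceil(M/2) */

typedef struct bnode {
    int nkeys;                         /* number of keys in use */
    int keys[M - 1];
    struct bnode *child[M];            /* child[0] == NULL marks a leaf */
} BNode;

/* Returns the height of the subtree rooted at n if the structural rules
   hold, or -1 if any rule is violated. A non-root node needs at least
   ceil(M/2) - 1 keys; a non-empty root needs at least 1. */
int btree_check(const BNode *n, int is_root)
{
    assert(n != NULL);
    int min_keys = is_root ? 1 : MIN_CHILDREN - 1;

    if (n->nkeys < min_keys || n->nkeys > M - 1)
        return -1;
    for (int i = 1; i < n->nkeys; i++)         /* keys strictly increasing */
        if (n->keys[i - 1] >= n->keys[i])
            return -1;
    if (n->child[0] == NULL)                   /* leaf: height 0 */
        return 0;

    int h = -2;                                /* -2 means "not set yet" */
    for (int i = 0; i <= n->nkeys; i++) {
        if (n->child[i] == NULL)
            return -1;                         /* internal node needs nkeys+1 children */
        int hi = btree_check(n->child[i], 0);
        if (hi < 0)
            return -1;
        if (h == -2)
            h = hi;
        else if (h != hi)
            return -1;                         /* leaves at different levels */
    }
    return h + 1;
}
```

Calling `btree_check(root, 1)` on a valid tree returns its height counted in edges.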
Designed for VLDB with moderate tree height, $O(\log_m n)$;
For m=128, 1 million records could be accommodated in a tree of ht 3.
Designed to have moderate height, to minimize disk / file
(secondary mem.) accesses, as in External Sorting
Used for Internal Sorting too.
Balanced Multi- (m-) way Search Tree (B-Tree) . . .
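The capacity claim for m = 128 can be checked with a short calculation. The sketch below assumes that "ht 3" means three levels of nodes and that every node is full; the function name `btree_max_keys` is illustrative only.

```c
#include <assert.h>

/* Maximum number of keys in a B-tree of order m with the given number of
   node levels, when every node is full. The sum works out to m^levels - 1. */
unsigned long long btree_max_keys(unsigned m, unsigned levels)
{
    assert(m >= 2);
    unsigned long long nodes_on_level = 1, total = 0;
    for (unsigned l = 0; l < levels; l++) {
        total += nodes_on_level * (m - 1);   /* each node holds m-1 keys */
        nodes_on_level *= m;                 /* each node has m children */
    }
    return total;
}
```

Under these assumptions `btree_max_keys(128, 3)` gives 2,097,151 keys, so one million records fit comfortably in three levels.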
B-trees were designed to flatten the tree structure and to allow for
larger blocks of data that could then be tuned so that the size of a
node is the same size as a block on secondary storage.
Min. disk accesses (sequential access)
In a B-tree, $\lceil m/2 \rceil \le nC \le m$.
Both keys and records are stored in its interior nodes.
In a B* tree, $(2m-1)/3 \le nC \le m$; each non-root node is at least 2/3 full.
In a B+ tree, each data record appears in a leaf only.
All records are stored at the leaf level of the tree;
only keys are stored in interior nodes.
Types of B-Trees : Variations
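A node in a B-tree is
#define M 5                     /* order m; a compile-time constant in C */
typedef int Key;                /* key type (assumed int here) */
typedef struct mnode {
    int in_use;                 /* no. of keys in use */
    Key keys[M-1];
    struct mnode *child[M];
} MNode;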
A pair of arrays
A B-tree is a large array of Nodes.
m-way Search Tree (B-Tree) Representation
A Sample B-tree of Order 5
Search is analogous to search in a BST.
Within a node, Binary search is typically (but not necessarily)
used to find the child tree of interest.
Tree Height = $O(\log_m n) = O(\log n / \log m)$
Key Search within an internal node:
Linear $O(m)$
Binary $O(\log_2 m)$
Searching an element
By Linear key search on a node $O(m \cdot \log n / \log m)$
By Binary key search on a node $O(\log n)$
Insertion & Deletion : same as for search
B-Tree : Analysis Sketch
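The search described above can be sketched as follows. This is an illustrative C version, assuming integer keys, order M = 5, and a hypothetical `BNode` layout in which every child pointer of a leaf is NULL; it uses binary search inside each node, so the total cost is O(log n) key comparisons.

```c
#include <assert.h>
#include <stddef.h>

#define M 5                       /* order of the tree (assumed) */

typedef struct bnode {
    int nkeys;                    /* number of keys in use */
    int keys[M - 1];
    struct bnode *child[M];       /* all NULL in a leaf */
} BNode;

/* Binary search inside one node: smallest index i with keys[i] >= key,
   or nkeys if every key is smaller. */
static int lower_bound(const BNode *n, int key)
{
    int lo = 0, hi = n->nkeys;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (n->keys[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/* Returns the node holding key and stores its index in *pos,
   or returns NULL if the key is absent. */
const BNode *btree_search(const BNode *n, int key, int *pos)
{
    assert(pos != NULL);
    while (n != NULL) {
        int i = lower_bound(n, key);
        if (i < n->nkeys && n->keys[i] == key) {
            *pos = i;
            return n;
        }
        n = n->child[i];          /* descend; NULL below a leaf ends the loop */
    }
    return NULL;
}
```

Replacing `lower_bound` with a left-to-right scan gives the linear variant, at O(m) per node.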
For efficient access to secondary storage,
size of m => node size ~ 1 disk block (one disk access per node)
Disk Access time per block
Seek time (position the read/write head on the track)
Latency time (proper sector moves beneath the head)
Transmission time
m is decided by
Disk block size (s)
Record size (r)
Pointer size (p), and others.
$m\,p + (m-1)\,r \le s$
or,
$m = \left\lfloor \dfrac{s + r}{p + r} \right\rfloor$
A detailed analysis for m involves Disk access time.
B-Tree : Analysis Sketch for Size of m
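As a worked example of choosing m: a node with m pointers of size p and m-1 records of size r must fit in a block of size s, that is m·p + (m-1)·r ≤ s, which gives m = ⌊(s + r)/(p + r)⌋. The sketch below evaluates this; the function name and the sample sizes (4096-byte block, 64-byte record, 8-byte pointer) are assumptions for illustration, not values from the slides.

```c
#include <assert.h>

/* Largest order m such that m pointers (p bytes each) and m-1 records
   (r bytes each) fit in one block of s bytes:
   m*p + (m-1)*r <= s   <=>   m <= (s + r) / (p + r). */
unsigned btree_order(unsigned s, unsigned r, unsigned p)
{
    assert(p + r > 0);
    return (s + r) / (p + r);     /* integer division takes the floor */
}
```

With the assumed sizes, `btree_order(4096, 64, 8)` yields 57: 57 pointers and 56 records use 4040 bytes, while m = 58 would need 4112 bytes.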
INSERT and DELETE need the following local
operations:
Node splitting
Node fusing
Key sharing (or key borrowing)
The first one is used by INSERT.
The last 2 are used by DELETE.
Splitting and fusing may propagate up to the root, so an INSERT or
DELETE restructures O(height) nodes.
B-Tree Restructuring : Insert and Delete
When inserting an item, first do a search for it in the B-tree. If the item is
not already in the B-tree, this unsuccessful search will end at a leaf.
When inserting into a B-tree, a value is inserted directly into a leaf. This
leads to three common situations that can occur:
Case I: A key is placed into a leaf that still has room: just insert the new
item here. Note that this may require that some existing keys be moved
one to the right to make room for the new item.
Case II: The leaf in which a key is to be placed is full: the node must be
"split" with about half of the keys going into a new node to the right of
this one. The median (middle) key is moved up into the parent node. (Of
course, if that node has no room, then it may have to be split as well.)
Note that when adding to an internal node, not only might we have to
move some keys one position to the right, but the associated pointers have
to be moved right as well.
Case III: The root of the B-tree is full: split the root. The median key
moves up into a new root node, thus causing the tree to increase in height
by one.
B-Tree : Insert
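The three insertion cases can be sketched in C as below. This is one possible formulation, not necessarily the one intended for the course: it inserts first and splits on overflow, so each node carries one spare key slot and one spare child slot. The names (`BNode`, `put`, `split`, `insert_rec`, `btree_insert`), the `char` key type, and M = 5 are assumptions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

#define M 5                               /* order of the tree (assumed) */

typedef struct bnode {
    int nkeys;
    char keys[M];                         /* one spare slot for a temporary overflow */
    struct bnode *child[M + 1];           /* child[0] == NULL marks a leaf */
} BNode;

static BNode *new_node(void)
{
    BNode *n = calloc(1, sizeof *n);      /* zeroed: no keys, all children NULL */
    assert(n != NULL);
    return n;
}

/* Place key at index i with subtree 'right' to its right, shifting the
   later keys and their right-hand pointers one position to the right. */
static void put(BNode *n, int i, char key, BNode *right)
{
    for (int j = n->nkeys; j > i; j--) {
        n->keys[j] = n->keys[j - 1];
        n->child[j + 1] = n->child[j];
    }
    n->keys[i] = key;
    n->child[i + 1] = right;
    n->nkeys++;
}

/* Split an overfull node (M keys): the median goes up through *up, the
   upper half moves to a new right sibling, which is returned. */
static BNode *split(BNode *n, char *up)
{
    int mid = M / 2;
    BNode *r = new_node();
    *up = n->keys[mid];
    r->nkeys = n->nkeys - mid - 1;
    for (int j = 0; j < r->nkeys; j++) {
        r->keys[j] = n->keys[mid + 1 + j];
        r->child[j] = n->child[mid + 1 + j];
    }
    r->child[r->nkeys] = n->child[n->nkeys];
    n->nkeys = mid;
    return r;
}

/* Returns the new right sibling if n was split (median in *up), else NULL. */
static BNode *insert_rec(BNode *n, char key, char *up)
{
    int i = 0;
    while (i < n->nkeys && n->keys[i] < key)
        i++;
    if (i < n->nkeys && n->keys[i] == key)
        return NULL;                              /* key already present */
    if (n->child[0] == NULL) {
        put(n, i, key, NULL);                     /* Case I: place key in the leaf */
    } else {
        char k = 0;
        BNode *r = insert_rec(n->child[i], key, &k);
        if (r == NULL)
            return NULL;                          /* child absorbed the key */
        put(n, i, k, r);                          /* child split: take its median */
    }
    return n->nkeys == M ? split(n, up) : NULL;   /* Case II: split on overflow */
}

/* Inserts key and returns the (possibly new) root. Pass NULL for an empty tree. */
BNode *btree_insert(BNode *root, char key)
{
    char up = 0;
    if (root == NULL)
        root = new_node();
    BNode *r = insert_rec(root, key, &up);
    if (r != NULL) {                              /* Case III: root split, tree grows */
        BNode *top = new_node();
        top->keys[0] = up;
        top->nkeys = 1;
        top->child[0] = root;
        top->child[1] = r;
        root = top;
    }
    return root;
}
```

Starting from `root = NULL` and calling `root = btree_insert(root, c)` for each letter of C N G A H E K Q M F W L T Z D P R X Y S ends with the single key M in the root and two children holding D G and Q T.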
Insert the following letters into an empty B-tree
of order 5:
C N G A H E K Q M F W L T Z D P R X Y S
Order 5 means that
A node can have a max. of 5 children and 4 keys.
All nodes other than the root must have a min. of 2
keys, and (if not a leaf) a min. of 3 children.
B-Tree : Insert into empty B-Tree of Order 5
The first 4 letters C N G A get inserted into the
same node, resulting in the following tree:
Insert H
no room in above node, split it into 2 nodes,
move median G up into a new root node
B-Tree : Insert
Insert E, K, and Q
Insert M
split the node, M is median, move up
B-Tree : Insert
Insert F, W, L and T
Insert Z
Split, move median T up
Insert D
B-Tree : Insert
Insert D
Split, move median D up, then insert P, R, X, Y
Insert S
Split, move median Q up, Split, move median M up
B-Tree : Insert
Search for the value to delete. There are two main cases to be
considered:
Case I: Deletion from a leaf: it can simply be deleted from the
node, perhaps leaving the node with too few elements; so
some additional changes to the tree will be required
Case II: Deletion from a non-leaf: If the value is in an internal
node,
choose a new separator (either the largest element in the left
subtree or the smallest element in the right subtree),
remove it from the leaf node it is in, and
replace the element to be deleted with the new separator
(for the leaf node with an element deleted)
B-Tree : Delete
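Case II can be sketched as follows, choosing the smallest element of the right subtree as the new separator. This is an illustrative fragment with an assumed `BNode` layout and hypothetical function names; it performs only the replacement and leaves any rebalancing of the affected leaf to the caller.

```c
#include <assert.h>
#include <stddef.h>

#define M 5                        /* order of the tree (assumed) */

typedef struct bnode {
    int nkeys;
    char keys[M - 1];
    struct bnode *child[M];        /* child[0] == NULL marks a leaf */
} BNode;

/* Leaf that holds the smallest key in the subtree to the right of keys[i]. */
BNode *successor_leaf(BNode *n, int i)
{
    assert(n->child[0] != NULL && i >= 0 && i < n->nkeys);
    BNode *c = n->child[i + 1];
    while (c->child[0] != NULL)    /* keep following the leftmost child */
        c = c->child[0];
    return c;
}

/* Overwrite keys[i] of internal node n with its in-order successor and
   remove that successor from its leaf. Returns the leaf, which may now
   hold fewer than the minimum number of keys. */
BNode *replace_with_successor(BNode *n, int i)
{
    BNode *leaf = successor_leaf(n, i);
    n->keys[i] = leaf->keys[0];
    for (int j = 1; j < leaf->nkeys; j++)
        leaf->keys[j - 1] = leaf->keys[j];
    leaf->nkeys--;
    return leaf;
}
```

Using the largest element of the left subtree instead is symmetric: follow the rightmost child pointers and take the last key of that leaf.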
Additional changes -- Rebalancing after deletion
If the right sibling has more than the minimum number of
elements
Borrow one, adjust the separator
If the left sibling has more than the minimum number of elements
Borrow one, adjust the separator
If both immediate siblings have only the minimum number of
elements
Create a new node with all the elements from the deficient node,
all the elements from one of its siblings, and the separator in the
parent between the two combined sibling nodes.
Remove the separator from the parent, and replace the two
children it separated with the combined node.
If that brings the number of elements in the parent under the
minimum, repeat these steps with that deficient node, unless it is
the root, since the root may be deficient.
B-Tree : Additional Changes after Delete
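The two rebalancing steps, borrowing from the right sibling and combining two siblings with their separator, can be sketched in C as below. The node layout and function names are assumptions for illustration; borrowing from the left sibling is symmetric and omitted, and the caller is expected to free the emptied node after a merge and to repeat the procedure on the parent if it becomes deficient.

```c
#include <assert.h>
#include <stddef.h>

#define M 5                              /* order of the tree (assumed) */
#define MIN_KEYS ((M + 1) / 2 - 1)       /* ceil(M/2) - 1, i.e. 2 for M = 5 */

typedef struct bnode {
    int nkeys;
    char keys[M - 1];
    struct bnode *child[M];              /* child[0] == NULL marks a leaf */
} BNode;

/* Child i of p is deficient and child i+1 has a key to spare: the separator
   p->keys[i] moves down to the end of child i, and the right sibling's
   first key (with its first subtree) moves up and across. */
void borrow_from_right(BNode *p, int i)
{
    BNode *c = p->child[i], *r = p->child[i + 1];
    assert(r->nkeys > MIN_KEYS);
    c->keys[c->nkeys] = p->keys[i];
    c->child[c->nkeys + 1] = r->child[0];
    c->nkeys++;
    p->keys[i] = r->keys[0];
    for (int j = 1; j < r->nkeys; j++)
        r->keys[j - 1] = r->keys[j];
    for (int j = 1; j <= r->nkeys; j++)
        r->child[j - 1] = r->child[j];
    r->child[r->nkeys] = NULL;           /* clear the vacated last slot */
    r->nkeys--;
}

/* Combine child i, separator p->keys[i], and child i+1 into child i.
   The parent loses one key and one child pointer. */
void merge_with_right(BNode *p, int i)
{
    BNode *c = p->child[i], *r = p->child[i + 1];
    assert(c->nkeys + 1 + r->nkeys <= M - 1);
    c->keys[c->nkeys++] = p->keys[i];    /* separator comes down */
    for (int j = 0; j < r->nkeys; j++) {
        c->child[c->nkeys] = r->child[j];
        c->keys[c->nkeys++] = r->keys[j];
    }
    c->child[c->nkeys] = r->child[r->nkeys];
    for (int j = i + 1; j < p->nkeys; j++) {   /* close the gap in the parent */
        p->keys[j - 1] = p->keys[j];
        p->child[j] = p->child[j + 1];
    }
    p->child[p->nkeys] = NULL;
    p->nkeys--;
}
```

For a parent holding Q W over leaves N P, S, and X Y Z, `borrow_from_right(p, 1)` moves W down next to S and X up into the parent.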
Delete T
(Internal node: select the smallest element from the
right subtree to replace T)
B-Tree : Delete
Delete R: leaf node, need rebalance:
Options: Merge or Split
Borrow a key from right sibling, adjust separator: move W
down, combine with S, move X up to the parent
Delete
Delete E (leaf node, need rebalance after deletion)
Left and right siblings have only the minimum number of keys,
Create a new node: combine with left sibling, the separator
from the parent, and the deficient node
Delete
Continue rebalancing
The sibling has only minimum keys
Create a new node: combine the deficient node with the
separator from the parent, and the right sibling
Delete
B Tree
Home Assignment:
Write Algorithms for Insert
& Delete in B tree.
