
Chapter 8

Disjoint Set ADT


Preliminary Definitions

A set is a collection of objects.

Set A is a subset of set B if all elements of A are in B.


Subsets are sets

The union of two sets A and B is the set C that consists of all
elements that are in A or in B (or in both).

Two sets are mutually disjoint if they do not have a
common element.

A partition of a set is a collection of subsets such that
    the union of all these subsets is the set itself, and
    any two subsets are mutually disjoint.

S = {1,2,3,4}, A = {1,2}, B = {3,4}, C = {2,3,4}, D = {4}


Is A, B a partition of S? Yes

Is A, C a partition of S? No (A and C share the element 2)

Is A, D a partition of S? No (the union of A and D does not contain 3)
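
The two conditions in the definition can be checked mechanically. The following is a minimal Python sketch, not part of the original slides; the function name is_partition is a hypothetical helper chosen for illustration.

```python
def is_partition(s, subsets):
    """Return True if `subsets` is a partition of the set `s`."""
    seen = set()
    for subset in subsets:
        # Condition 2: any two subsets must be mutually disjoint.
        if seen & subset:
            return False
        seen |= subset
    # Condition 1: the union of all subsets must be the set itself.
    return seen == s


S = {1, 2, 3, 4}
A, B, C, D = {1, 2}, {3, 4}, {2, 3, 4}, {4}

print(is_partition(S, [A, B]))  # True
print(is_partition(S, [A, C]))  # False: A and C share element 2
print(is_partition(S, [A, D]))  # False: element 3 is not covered
```
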
Union and Find Operations

Operations on partitions.

Union
Need to form union of two different sets of a partition

Find
Need to find out which set an element belongs to
Every set in the partition has a number.
The numbers can be anything as long as different sets have
distinct numbers.
Find(a) returns the number of the set containing a.

Can two different sets contain the same element?

No, the sets in a partition are disjoint


Linked List Representation of Disjoint Sets

A simple way to implement a disjoint-set data structure is to
represent each set by a linked list. The first object in each
linked list serves as its set's representative.
Each object in the linked list contains a set member, a
pointer to the object containing the next set member, and a
pointer back to the representative. Each list maintains
pointers head, to the representative, and tail, to the last
object in the list.
Within each linked list, the objects may appear in any order
(subject to our assumption that the first object in each list is
the representative).
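
As a rough illustration of this layout, here is a minimal Python sketch. It assumes, for simplicity, that the list's tail pointer is stored on the representative node itself rather than in a separate list header; the names Node, make_set, find_set and members are illustrative and do not come from the slides.

```python
class Node:
    """One object in a set's linked list."""

    def __init__(self, member):
        self.member = member  # the set member stored in this object
        self.next = None      # next object in the list, None at the end
        self.rep = self       # pointer back to the representative (list head)
        self.tail = self      # last object; only meaningful on the representative


def make_set(member):
    """Create a one-element set; its only object is the representative."""
    return Node(member)


def find_set(node):
    """Return the representative of the set containing `node` in O(1)."""
    return node.rep


def members(node):
    """Walk the list from the representative and collect all members."""
    out = []
    cur = node.rep
    while cur is not None:
        out.append(cur.member)
        cur = cur.next
    return out


a = make_set("a")
print(find_set(a) is a)  # True: a single object represents itself
print(members(a))        # ['a']
```
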
A simple implementation of union

The simplest implementation of the UNION operation
using the linked-list set representation takes
significantly more time than MAKE-SET or FIND-SET.
We perform UNION(x, y) by appending x's list onto the
end of y's list. We use the tail pointer for y's list to
quickly find where to append x's list.
The representative of the new set is the element that
was originally the representative of the set containing
y.
The cost comes from the representative pointers: every
object originally on x's list must be updated to point to
the new representative, which takes time linear in the
length of x's list.
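
The append step can be sketched in Python as follows. This is an illustrative sketch under the assumption that the tail pointer lives on the representative node; the function returns the number of representative pointers it had to rewrite, which is the linear-time part of the operation.

```python
class Node:
    def __init__(self, member):
        self.member = member
        self.next = None   # next object in the list
        self.rep = self    # pointer back to the representative
        self.tail = self   # last object; only meaningful on the representative


def union(x, y):
    """Append x's list onto the end of y's list.

    Returns the number of representative pointers that were updated.
    """
    rx, ry = x.rep, y.rep
    if rx is ry:
        return 0               # x and y are already in the same set
    ry.tail.next = rx          # O(1) append using y's tail pointer
    ry.tail = rx.tail
    updated = 0
    cur = rx
    while cur is not None:     # every object from x's list gets the new representative
        cur.rep = ry
        updated += 1
        cur = cur.next
    return updated


x, y, z = Node(1), Node(2), Node(3)
print(union(x, y))  # 1: only x is re-pointed to y
print(union(y, z))  # 2: both y and x are re-pointed to z
```
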
Union
In fact, it is not difficult to come up with a
sequence of m operations on n objects that
requires Θ(n²) time. Suppose that we have
objects x1, x2, ..., xn. We execute a
sequence of n MAKE-SET operations
followed by n - 1 UNION operations, so that
m = 2n - 1.
If each UNION appends the longer list onto the
shorter one, for example UNION(x1, x2), UNION(x2, x3),
..., UNION(xn-1, xn), the i-th UNION updates i
representative pointers and the total running time is Θ(n²).
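
The quadratic bound can be checked by counting pointer updates, assuming the i-th UNION appends a list of i objects onto a single-element list:

```latex
\sum_{i=1}^{n-1} i \;=\; \frac{n(n-1)}{2} \;=\; \Theta(n^2)
```

The n MAKE-SET operations add only Θ(n) to this, so with m = 2n - 1 operations in total the average cost is Θ(n) per operation.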
Weighted Union heuristic
Suppose each list also includes the length of the list
and that we always append the smaller list onto the
longer, with ties broken arbitrarily.
Theorem 2.1: Using the linked-list representation of
disjoint sets and the weighted-union heuristic, a
sequence of m MAKE-SET, UNION, and FIND-SET
operations, n of which are MAKE-SET operations,
takes O(m + n lg n) time.
Theorem 2.1 Proof:
Consider a fixed object x. We know that each time x's representative pointer was
updated, x must have started in the smaller set. The first time x's representative
pointer was updated, therefore, the resulting set must have had at least 2 members.
Similarly, the next time x's representative pointer was updated, the resulting set must
have had at least 4 members. Continuing on, we observe that for any k ≤ n, after x's
representative pointer has been updated ⌈lg k ⌉ times, the resulting set must have at
least k members. Since the largest set has at most n members, each object's
representative pointer has been updated at most ⌈lg n ⌉ times over all the UNION
operations.
The total time used in updating the n objects is thus O(n lg n).
The time for the entire sequence of m operations follows easily. Each MAKE-SET and
FIND-SET operation takes O(1) time, and there are O(m) of them. The total time for the
entire sequence is thus O(m + n lg n).
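
A minimal Python sketch of the weighted-union heuristic is given below. It assumes that the list length and tail pointer are kept on the representative node; the names Node and weighted_union are illustrative only. The driver runs a chain of unions, the pattern that is quadratic for the simple union, and counts representative-pointer updates so the total can be compared with n lg n.

```python
import math


class Node:
    def __init__(self, member):
        self.member = member
        self.next = None
        self.rep = self    # pointer back to the representative
        self.tail = self   # only meaningful on the representative
        self.size = 1      # list length; only meaningful on the representative


def weighted_union(x, y):
    """Append the smaller list onto the longer one.

    Returns the number of representative pointers that were updated.
    """
    small, large = x.rep, y.rep
    if small is large:
        return 0
    if small.size > large.size:    # ties are broken arbitrarily (no swap)
        small, large = large, small
    large.tail.next = small
    large.tail = small.tail
    large.size += small.size
    updated = 0
    cur = small
    while cur is not None:         # only the smaller list is re-pointed
        cur.rep = large
        updated += 1
        cur = cur.next
    return updated


n = 16
nodes = [Node(i) for i in range(n)]
total = sum(weighted_union(nodes[i], nodes[i + 1]) for i in range(n - 1))
print(total, "updates; bound n lg n =", n * math.log2(n))
```

On this particular sequence every union re-points a single object, so the total number of updates stays linear in n rather than quadratic.
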
Disjoint Set Data Structure

Every element has a number.

Elements of a set are stored in a tree (not necessarily binary)

The set is represented by the root of the tree.


The number assigned to a set is the number of the root
element.
B = {3, 4}

B is assigned number 3

Are the numbers distinct for different sets?

No two sets have the same root as they are disjoint,
thus they have distinct numbers.
Find(a) returns the number of the root node of the
tree containing a.

B = {3, 4}
[Figure: tree for B with root 3]

Find(4) returns? 3
Find(3) returns? 3
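
Find on the tree representation can be sketched in Python as follows. The sketch assumes a parent table in which a root is recorded as its own parent; that convention and the name parent are illustrative choices, not taken from the slides.

```python
# parent[a] is the parent of element a; a root is its own parent (assumed convention).
# A = {1, 2} is a tree rooted at 1, B = {3, 4} is a tree rooted at 3.
parent = {1: 1, 2: 1, 3: 3, 4: 3}


def find(a):
    """Follow parent links from a up to the root and return the root's number."""
    while parent[a] != a:
        a = parent[a]
    return a


print(find(4))  # 3
print(find(3))  # 3
```
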

The Union operation makes one tree a sub-tree of another:
the root of one tree becomes a child of the root of the other.
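B = {3, 4}, A = {1, 2}
[Figure: two trees, one with root 3 and child 4, the other with root 1 and child 2]

Want to do A union B

We have:
[Figure: the merged tree with root 1; only partially recoverable from the source]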
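
The tree union can be sketched in Python as follows. The sketch assumes the same parent-table convention (a root is its own parent) and arbitrarily makes the root of the second tree a child of the root of the first; the slides do not say which root should become the child.

```python
# A = {1, 2} rooted at 1, B = {3, 4} rooted at 3; a root is its own parent (assumed).
parent = {1: 1, 2: 1, 3: 3, 4: 3}


def find(a):
    """Return the root of the tree containing a."""
    while parent[a] != a:
        a = parent[a]
    return a


def union(a, b):
    """Merge the sets containing a and b by linking one root under the other."""
    root_a, root_b = find(a), find(b)
    if root_a != root_b:
        parent[root_b] = root_a  # root of b's tree becomes a child of a's root


union(1, 3)     # A union B
print(find(4))  # 1: element 4 now belongs to the set numbered 1
print(find(2))  # 1
```
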
