
Data Structures

Mostafa M. Aref

Ain Shams University


Faculty of Computer & Information Sciences

Problem-Solving
● Problem - Reasoning - Solution - Test
● Analytic Approach
● Algorithmic Approach
– Input, Process, Output
– Algorithm is a sequence of executable instructions
● no ambiguity (instruction or sequence)
● finite (steps or execution)
– Sequence: I/O, variable assignment
– Selection: If . . . Else
– Repetition: looping
– Condition Looping: while or until

Software Engineering
● Software Engineering
– Requirement Specification
– Analysis (Input, Output, Formula, Units)
– Design (algorithm, verification)
– Implementation (Language)
– Testing
● Waterfall Model
Specification
Analysis
Design
Implementation
Testing
Evolution of Object-Oriented Programming
● Variables
– A variable is a named storage location whose value can change, depending on
conditions or on information passed to the program.
● Data types
– A set of values from which a variable, constant, function, or other expression may
take its value. A type is a classification of data that tells the compiler or
interpreter how the programmer intends to use it.
● User defined data types
● Abstract Data types
– A type whose internal form is hidden behind a set of access functions. Objects of
the type are created and inspected only by calls to the access functions. This allows
the implementation of the type to be changed without requiring any changes
outside the module in which it is defined.
– Abstraction
– Encapsulation
● Object-Oriented Programming
Object-Oriented Programming
● Advantages of Object-Oriented Programming
– Simplicity
– Modularity
– Modifiability
– Extensibility
– Flexibility
– Maintainability
– Reusability

Object-Oriented Features
● Abstraction: The process of capturing the essential features and ignoring the detail
● Encapsulation: information hiding mechanism
– Object: An entity which has both variables and methods
● Variables comprise the state of the object
● Methods are mechanisms for accessing or changing the state of the
object
● Public variables or methods
● Private variables and methods
– Class: a template that can be used to create many objects with a different name and identity yet
sharing the method code and shared variables
● Instantiation: The process of creating objects using the description of a class
Myclass m1 = new Myclass ();
– Message Passing: an object sends a message to another requesting that a service be
performed.
● Inheritance
● Polymorphism: the same message is sent to a collection of objects & each object will
be able to respond in its own way

Abstraction
● Object-Oriented Strategies

● Properties of a Good Abstraction


– Well named
– Coherent
– Accurate
– Minimal
– Complete
Separation
● What: goals, policy, product (interface, specification, requirement), ends
● How: plans, mechanism, process (implementation), means
● Software Interface: the external, visible aspects of a software artifact by which the behavior
of the artifact is elicited.
● Software Implementation: The programmed mechanism that realizes the behavior implied
by an interface.
● Separation: In software systems, the independent specification of a software interface and one
or more software implementations of that interface.

Data Structures
● Data structures is concerned with the representation
and manipulation of data.
● All programs manipulate data.
● So, all programs represent data in some way.
● Data manipulation requires an algorithm.
● Algorithm design methods are needed to develop
programs that do the data manipulation.
● The study of data structures and algorithms is
fundamental to Computer Science.

Data Structures
● Data object: Set or collection of instances
– integer = {0, +1, -1, +2, -2, +3, -3, …}
– daysOfWeek = {S,M,T,W,Th,F,Sa}
● Instances may or may not be related
– myDataObject = {apple, chair, 2, 5.2, red, green, Jack}
● Relationships exist among instances and among the elements that
comprise an instance.
● Among instances of integer
– 369 < 370 or 280 + 4 = 284
● Among elements that comprise the instance 369
– 3 is more significant than 6
– 3 is immediately to the left of 6
– 9 is immediately to the right of 6
● The relationships are usually specified by specifying operations on one
or more instances: add, subtract, predecessor, multiply
Linear (or Ordered) Lists
● Instances are of the form (e_0, e_1, e_2, …, e_{n-1}), where e_i denotes a list
element, n >= 0 is finite, and the list size is n
● L = (e_0, e_1, e_2, e_3, …, e_{n-1}) relationships
– e_0 is the zeroth (or front) element
– e_{n-1} is the last element
– e_i immediately precedes e_{i+1}
● Linear List Examples
– Students in CS502 = (Jack, Jill, Abe, Henry, Mary, …, Judy)
– Quizzes in CS502 = (Quiz1, Quiz2, Quiz3)
– Days of Week = (S, M, T, W, Th, F, Sa)
– Months = (Jan, Feb, Mar, Apr, …, Nov, Dec)
Linear List Operations
– size( ): determine list size; for L = (a,b,c,d,e), size = 5
– get(theIndex): get element with given index
● L = (a,b,c,d,e) get(0) = a, get(2) = c, get(4) = e
● get(-1) = error, get(9) = error
– indexOf(theElement): determine the index of an element
● L = (a,b,d,b,a) indexOf(d) = 2, indexOf(a) = 0, indexOf(z) = -1
– remove(theIndex): remove and return element with given index
● L = (a,b,c,d,e,f,g)
● remove(2) returns c and L becomes (a,b,d,e,f,g)
● index of d, e, f, and g decreases by 1
● remove(-1) => error, remove(20) => error
– add(theIndex, theElement): add an element so that the new element
has a specified index
● L = (a,b,c,d,e,f,g)
● add(0,h) => L = (h,a,b,c,d,e,f,g)
● index of a, b, c, d, e, f, and g increases by 1
● add(2,h) => L = (a,b,h,c,d,e,f,g)
● index of c, d, e, f, and g increases by 1
● add(10,h) => error, add(-6,h) => error
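The operations listed above can be tried out on a standard container before any custom class is written. The following is a minimal sketch, assuming std::vector as the underlying list and exceptions for the error cases; the free functions indexOf, removeAt, and addAt are illustrative names, not part of the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Index of the first occurrence of x, or -1 if x is not in the list.
template <class T>
int indexOf(const std::vector<T>& list, const T& x) {
    for (std::size_t i = 0; i < list.size(); ++i)
        if (list[i] == x) return static_cast<int>(i);
    return -1;
}

// Remove and return the element at index; later elements move down by one.
template <class T>
T removeAt(std::vector<T>& list, int index) {
    if (index < 0 || index >= static_cast<int>(list.size()))
        throw std::out_of_range("remove: bad index");
    T removed = list[index];
    list.erase(list.begin() + index);
    return removed;
}

// Insert x so that it ends up at index; index == size appends.
template <class T>
void addAt(std::vector<T>& list, int index, const T& x) {
    if (index < 0 || index > static_cast<int>(list.size()))
        throw std::out_of_range("add: bad index");
    list.insert(list.begin() + index, x);
}
```

With L = (a,b,c,d,e,f,g), removeAt(L, 2) returns c and addAt(L, 0, 'h') puts h in front, matching the examples in the bullets.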
Data Structure Specification
● Language independent: Abstract Data Type
● Linear List Abstract Data Type
AbstractDataType LinearList
{ instances
ordered finite collections of zero or more elements
operations
isEmpty(): return true iff the list is empty, false otherwise
size(): return the list size (number of elements in the list)
get(index): return the element at position index of the list
indexOf(x): return the index of the first occurrence of x
in the list, return -1 if x is not in the list
remove(index): remove and return the element at position index;
elements with higher index have their index reduced by 1
add(index, x): insert x as the element at position index; elements whose
index was >= index have their index increased by 1
output(): output the list elements from left to right
}

Linear List as C++ Class
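class LinearList
{ public: // abstract class: every operation is pure virtual
virtual bool isEmpty() = 0;
virtual int size() = 0;
virtual Object get(int index) = 0;
virtual int indexOf(Object elem) = 0;
virtual Object remove(int index) = 0;
virtual void add(int index, Object obj) = 0;
virtual string toString() = 0; };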

● Extending A C++ Class


class ArrayLinearList : public LinearList
{protected:
Object element []; // array of elements
int size; // number of elements in array
// code for the array implementation comes here
};
Linear List Array Representation
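● Use a one-dimensional array element[]
– L = (a, b, c, d, e); store element i of the list in element[i].
position:  0 1 2 3 4 5 6
element[]: a b c d e
● Right To Left Mapping
element[]: e d c b a
● Mapping That Skips Every Other Position
element[]: a b c d e (with an unused position between consecutive elements)
● Wrap Around Mapping
element[]: d e a b c (the list starts partway through the array and wraps around to its front)
● Add/Remove An Element
– L = (a, b, c, d, e), size = 5
– add(1,g) gives (a, g, b, c, d, e), size = 6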
Array Representation
● Data Type Of Array element[]
– Data type of list elements is unknown.
– Define element[] to be of data type Object.
– Cannot put elements of primitive data types (int, float, double, char, etc.) into our linear
lists.
● Length of Array element[]
– Don’t know how many elements will be in list.
– Must pick an initial length and dynamically increase as needed
● Create An Empty List
ArrayLinearList a(100), b, c; // objects created directly
// or, through pointers:
ArrayLinearList *a = new ArrayLinearList(100),
*b = new ArrayLinearList();
// or, through base-class pointers:
LinearList *a = new ArrayLinearList(100),
*b = new ArrayLinearList();
● Using A Linear List
a.size();
a.add(0,2);
b.remove(0);
if (a.isEmpty())
a.add(0, 5);
Class ArrayLinearList
● The Class ArrayLinearList
ArrayLinearList(int initialCapacity)
{ if (initialCapacity < 1)
out << "initialCapacity must be >= 1";
else { element = new Object [initialCapacity]; size = 0; } }
ArrayLinearList() /** create a list with initial capacity 10 */
{ this(10); } // use default capacity of 10
bool isEmpty() /** return true iff list is empty */
{return size == 0;}
int size() /** return current number of elements in list */
{return size;}
Object get(int index)
{if (index < 0 || index >= size) // invalid index
{out << "index = " << index << " size = " << size; return NULL;}
return element[index]; }
int indexOf(Object theElement)
{ for (int i = 0; i < size; i++) // search element[] for theElement
if (element[i] == theElement)
return i;
return -1; } // theElement not found

The Class ArrayLinearList
Object remove(int index)
{ Object removedElement = NULL;
if (index < 0 || index >= size) // invalid index, nothing is removed
out << "index = " << index << " size = " << size;
else
{ // valid index, shift elements with higher index
removedElement = element[index];
for (int i = index + 1; i < size; i++)
element[i-1] = element[i];
element[--size] = NULL; }
return removedElement; }
void add(int index, Object theElement)
{ if (index < 0 || index > size) // invalid list position, nothing is added
{out << "index = " << index << " size = " << size; return;}
if (size == element.length) // valid index, make sure we have space
// no space, double capacity
element = ChangeArrayLength.changeLength1D(element, 2 * size);
for (int i = size - 1; i >= index; i--) // shift elements right one position
element[i + 1] = element[i];
element[index] = theElement;
size++; }
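The slide code mixes Java and C++ notation, so it does not compile as written. The sketch below restates the same array-based list in standard C++ under a few assumptions that are not in the slides: a template parameter T replaces Object, errors are reported with exceptions instead of printed messages, and the names ArrayList, checkIndex, and grow are illustrative.

```cpp
#include <cassert>
#include <stdexcept>

template <class T>
class ArrayList {
public:
    explicit ArrayList(int initialCapacity = 10)
        : element_(nullptr), capacity_(initialCapacity), size_(0) {
        if (initialCapacity < 1)
            throw std::invalid_argument("initialCapacity must be >= 1");
        element_ = new T[capacity_];
    }
    ~ArrayList() { delete[] element_; }
    ArrayList(const ArrayList&) = delete;             // copying left out of the sketch
    ArrayList& operator=(const ArrayList&) = delete;

    bool isEmpty() const { return size_ == 0; }
    int size() const { return size_; }

    const T& get(int index) const {
        checkIndex(index, size_ - 1);
        return element_[index];
    }
    int indexOf(const T& x) const {
        for (int i = 0; i < size_; ++i)
            if (element_[i] == x) return i;
        return -1;                                    // x is not in the list
    }
    T remove(int index) {
        checkIndex(index, size_ - 1);
        T removed = element_[index];
        for (int i = index + 1; i < size_; ++i)       // shift left one position
            element_[i - 1] = element_[i];
        --size_;
        return removed;
    }
    void add(int index, T x) {
        checkIndex(index, size_);                     // index == size appends
        if (size_ == capacity_) grow();
        for (int i = size_ - 1; i >= index; --i)      // shift right one position
            element_[i + 1] = element_[i];
        element_[index] = x;
        ++size_;
    }

private:
    void checkIndex(int index, int maxIndex) const {
        if (index < 0 || index > maxIndex)
            throw std::out_of_range("index out of range");
    }
    void grow() {                                     // no space, double the capacity
        T* bigger = new T[2 * capacity_];
        for (int i = 0; i < size_; ++i) bigger[i] = element_[i];
        delete[] element_;
        element_ = bigger;
        capacity_ *= 2;
    }
    T* element_;     // array of elements
    int capacity_;   // length of the array
    int size_;       // number of elements in the list
};
```

Doubling the capacity keeps the total copying cost proportional to the number of elements added, which is why the slides double rather than grow by one.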

Linked Representation
– list elements are stored, in memory, in an arbitrary order
– explicit information (called a link) is used to go from one element to the next
– Layout of L = (a,b,c,d,e) using an array representation:
a b c d e
– A linked representation uses an arbitrary layout:
c a e d b
– pointer (or link) in e is null
– use a variable firstNode to get to the first element a
● Normal Way To Draw A Linked List
firstNode -> a -> b -> c -> d -> e -> null

Chain
– A chain is a linked list in which each node represents one element.
– There is a link or pointer from one element to the next.
– The last node has a null pointer.
● Node Representation
class ChainNode
{ public:
Object element; ChainNode *next;
ChainNode( ) { next = NULL; } // constructors come here
ChainNode(Object element) {this->element = element; next = NULL;}
ChainNode(Object element, ChainNode *next)
{this->element = element; this->next = next;} };
– get(0) desiredNode = firstNode; // gets you to first node
– get(1) desiredNode = firstNode->next; // gets the second node
– get(2) desiredNode = firstNode->next->next; // gets the third node
– get(5) desiredNode = firstNode->next->next->next->next->next; // null for the five-element list (a,b,c,d,e)
● Remove An Element
firstNode -> a -> b -> c -> d -> e -> null
– remove(0)
ChainNode *temp = firstNode;
firstNode = firstNode->next;
delete temp;
– remove(2)
– first get to node just before node to be removed,
beforeNode = firstNode->next;
– now change pointer in beforeNode
ChainNode *temp = beforeNode->next;
beforeNode->next = beforeNode->next->next;
delete temp;
Add an Element
– add(0,'f')
● get a node, set its data and link fields
ChainNode *newNode = new ChainNode('f', firstNode);
● update firstNode
firstNode = newNode;
● or, in one step:
firstNode = new ChainNode('f', firstNode);
● result: firstNode -> f -> a -> b -> c -> d -> e -> null
– Add element at the middle
– add(3,'f')
● first find node whose index is 2
beforeNode = firstNode->next->next;
● next create a node and set its data and link fields
ChainNode *newNode = new ChainNode('f', beforeNode->next);
● finally link beforeNode to newNode
beforeNode->next = newNode;
● or, in two steps:
beforeNode = firstNode->next->next;
beforeNode->next = new ChainNode('f', beforeNode->next);
● result: firstNode -> a -> b -> c -> f -> d -> e -> null
The Class Chain
/** linked implementation of LinearList */
class Chain : public LinearList
{public:
ChainNode *firstNode = NULL; // data members
int size = 0;
Chain(int initialCapacity) /** create a list that is empty */
{ } // initial values of firstNode and size are NULL and 0, respectively
Chain( ) {this(0);}
bool isEmpty( ) /** @return true iff list is empty */
{return size == 0;}
int size( ) /** @return current number of elements in list */
{return size;}
Object get(int index)
{if (index < 0 || index >= size) // invalid index
{out << "index = " << index << " size = " << size; return NULL;}
ChainNode *currentNode = firstNode; // move to desired node
for (int i = 0; i < index; i++)
currentNode = currentNode->next;
return currentNode->element; }
int indexOf(Object theElement) // search the chain for theElement
{ ChainNode *currentNode = firstNode;
int index = 0; // index of currentNode
while (currentNode != NULL && currentNode->element != theElement)
{ currentNode = currentNode->next; // move to next node
index++; }
if (currentNode == NULL) // make sure we found matching element
return -1;
else return index; }
The Class Chain(2)
Object remove(int index)
{if (index < 0 || index >= size) // invalid index
{out << "index = " << index << " size = " << size; return NULL;}
ChainNode *temp; // node to be removed
if (index == 0) // remove first node
{ temp = firstNode;
firstNode = firstNode->next; }
else
{ ChainNode *beforeNode = firstNode; // get before node
for (int i = 0; i < index - 1; i++)
beforeNode = beforeNode->next;
temp = beforeNode->next;
beforeNode->next = temp->next; } // unlink desired node
Object removedElement = temp->element;
delete temp; // free the unlinked node
size--;
return removedElement; }
void add(int index, Object theElement)
{ if (index < 0 || index > size) // invalid list position
{out << "index = " << index << " size = " << size; return;}
if (index == 0) // insert at front
firstNode = new ChainNode(theElement, firstNode);
else
{ ChainNode *beforeNode = firstNode; // find before node
for (int i = 0; i < index - 1; i++)
beforeNode = beforeNode->next;
// insert after beforeNode
beforeNode->next = new ChainNode(theElement, beforeNode->next); }
size++; }
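For comparison with the slide code, here is a compact chain in standard C++. It is a sketch under assumptions that are not in the slides: a template parameter T stands in for Object, invalid indexes raise exceptions, the node type is nested in the class, and the helper nodeAt is an illustrative name.

```cpp
#include <cassert>
#include <stdexcept>

template <class T>
class Chain {
    struct Node {
        T element;
        Node* next;
    };

public:
    Chain() : firstNode_(nullptr), size_(0) {}
    ~Chain() {
        while (firstNode_ != nullptr) {               // free every node
            Node* next = firstNode_->next;
            delete firstNode_;
            firstNode_ = next;
        }
    }
    Chain(const Chain&) = delete;                     // copying left out of the sketch
    Chain& operator=(const Chain&) = delete;

    bool isEmpty() const { return size_ == 0; }
    int size() const { return size_; }

    const T& get(int index) const {
        if (index < 0 || index >= size_) throw std::out_of_range("get: bad index");
        return nodeAt(index)->element;
    }
    int indexOf(const T& x) const {
        int index = 0;
        for (Node* p = firstNode_; p != nullptr; p = p->next, ++index)
            if (p->element == x) return index;
        return -1;
    }
    T remove(int index) {
        if (index < 0 || index >= size_) throw std::out_of_range("remove: bad index");
        Node* doomed;
        if (index == 0) {                             // remove first node
            doomed = firstNode_;
            firstNode_ = firstNode_->next;
        } else {                                      // unlink node after beforeNode
            Node* before = nodeAt(index - 1);
            doomed = before->next;
            before->next = doomed->next;
        }
        T removed = doomed->element;
        delete doomed;
        --size_;
        return removed;
    }
    void add(int index, const T& x) {
        if (index < 0 || index > size_) throw std::out_of_range("add: bad index");
        if (index == 0) {
            firstNode_ = new Node{x, firstNode_};     // insert at front
        } else {
            Node* before = nodeAt(index - 1);
            before->next = new Node{x, before->next}; // insert after beforeNode
        }
        ++size_;
    }

private:
    Node* nodeAt(int index) const {                   // caller guarantees a valid index
        Node* p = firstNode_;
        for (int i = 0; i < index; ++i) p = p->next;
        return p;
    }
    Node* firstNode_;
    int size_;
};
```

Unlike the array version, add and remove move no elements, but reaching position index costs a walk of index links.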

Other types of Linked List
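● Chain With Header Node
headerNode -> (header) -> a -> b -> c -> d -> e -> null
● Circular List
firstNode -> a -> b -> c -> d -> e -> (back to a)
● Doubly Linked Circular List With Header Node
headerNode -> (header) <-> a <-> b <-> c <-> d <-> e <-> (back to header)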

Stacks
– Linear list.
– One end is called top.
– Other end is called bottom.
– Additions to and removals from the top end only.
● Add a cup to the stack.
● Remove a cup from the new stack.
– Example stack, from top to bottom: E, D, C, B, A
● A stack is a LIFO list.
● The class Stack
class Stack
{ public:
bool empty();
Object peek();
void push(Object theObject);
Object pop(); };

Stacks Applications
● Parentheses Matching
– (((a+b)*c+d-e)/(f+g)-(h+j)*(k-l))/(m-n)
– Output pairs (u,v) such that the left parenthesis at position
u is matched with the right parenthesis at v.
● (2,6) (1,13) (15,19) (21,25) (27,31) (0,32) (34,38)
– (a+b))*((c+d)
● (0,4)
● right parenthesis at 5 has no matching left parenthesis
● (8,12)
● left parenthesis at 7 has no matching right parenthesis
– scan expression from left to right
– when a left parenthesis is encountered, add its position to the stack
– when a right parenthesis is encountered, remove matching position from stack
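– Example: (((a+b)*c+d-e)/(f+g)-(h+j)*(k-l))/(m-n)
– (2,6), (1,13), (15,19), (21,25), (27,31), (0,32), (34,38)
– Stack contents, bottom to top, just before the first, third, fourth, and fifth matches: (0, 1, 2), (0, 15), (0, 21), (0, 27)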
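The scan described above translates almost directly into code. This is a minimal sketch, assuming std::stack holds the positions of unmatched left parentheses; the MatchResult struct and the function name matchParentheses are illustrative and do not come from the slides.

```cpp
#include <cassert>
#include <stack>
#include <string>
#include <utility>
#include <vector>

struct MatchResult {
    std::vector<std::pair<int, int>> pairs;  // (left, right) positions, in closing order
    std::vector<int> unmatchedRight;         // right parentheses with no partner
    std::vector<int> unmatchedLeft;          // left parentheses with no partner
};

MatchResult matchParentheses(const std::string& expr) {
    MatchResult result;
    std::stack<int> open;                    // positions of left parentheses seen so far
    for (int i = 0; i < static_cast<int>(expr.size()); ++i) {
        if (expr[i] == '(') {
            open.push(i);
        } else if (expr[i] == ')') {
            if (open.empty()) {
                result.unmatchedRight.push_back(i);
            } else {
                result.pairs.push_back({open.top(), i});
                open.pop();
            }
        }
    }
    while (!open.empty()) {                  // whatever is left never got closed
        result.unmatchedLeft.push_back(open.top());
        open.pop();
    }
    return result;
}
```

For (a+b))*((c+d) the function reports the pairs (0,4) and (8,12), an unmatched right parenthesis at 5, and an unmatched left parenthesis at 7, as in the slide.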

Stacks Applications(2)
● Towers Of Hanoi
● 64 gold disks to be moved from tower A to tower C
● each tower operates as a stack
● cannot place big disk on top of a smaller one
● 3-disk Towers Of Hanoi
– 7 disk moves
● Recursive Solution
– n > 0 gold disks to be moved from A to C using B
– move top n-1 disks from A to B using C
– move top disk from A to C
– move top n-1 disks from B to C using A
– moves(n) = 0 when n = 0
– moves(n) = 2*moves(n-1) + 1 = 2^n - 1 when n > 0
● moves(64) = 1.8 * 10^19 (approximately)
● Performing 10^9 moves/second, a computer would take about 570 years
to complete.
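The recursive solution can be checked on small inputs. The sketch below assumes each tower is a std::stack<int> with larger numbers for larger disks; the function names hanoi and movesNeeded and the move counter parameter are illustrative choices.

```cpp
#include <cassert>
#include <cstdint>
#include <stack>

// Moves the top n disks from tower `from` to tower `to`, using `via` as the spare.
void hanoi(int n, std::stack<int>& from, std::stack<int>& to, std::stack<int>& via,
           std::uint64_t& moveCount) {
    if (n == 0) return;
    hanoi(n - 1, from, via, to, moveCount);          // clear the way
    assert(to.empty() || to.top() > from.top());     // never a big disk on a smaller one
    to.push(from.top());                             // move the largest of the n disks
    from.pop();
    ++moveCount;
    hanoi(n - 1, via, to, from, moveCount);          // put the rest back on top
}

// Closed form moves(n) = 2^n - 1; this 64-bit version is valid for 0 <= n <= 63.
std::uint64_t movesNeeded(int n) { return (std::uint64_t{1} << n) - 1; }
```

Running it for three disks performs seven moves, which agrees with both the recurrence and the closed form.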
● Chess Story
– 1 grain of rice on the first square, 2 for next, 4 for next, 8 for next, and so on.
– Surface area needed exceeds surface area of earth.
● Method Invocation And Return
– public void a( ) { …; b(); …}
– public void b( ) { …; c(); …}
– public void c( ) { …; d(); …}
– public void d( ) { …; e(); …}
– public void e( ) { …; c(); …}
– Stack of return addresses, from top to bottom: return address in e(), d(), c(), b(), a()

Derive From A Linear List Class
● Derive From ArrayLinearList


– stack top may be either the left end or the right end of the linear list; here the top is the right end
– empty() => isEmpty()
– peek() => get(size() - 1)
– push(theObject) => add(size(), theObject)
– pop() => remove(size()-1)

● Derive From Chain


– stack top may be either the left end or the right end of the linear list; here the top is the left end
– empty() => isEmpty()
– peek() => get(0)
– push(theObject) => add(0, theObject)
– pop() => remove(0)
Deriving from ArrayLinearList
class DerivedArrayStack : public ArrayLinearList
{ // constructors come here
public:
bool empty( ) {return isEmpty();}
Object peek( )
{ if (empty()) {out << "Empty Stack Exception"; return NULL;}
return get(size() - 1); }
void push(Object theElement)
{add(size(), theElement);}
Object pop()
{ if (empty()) {out << "Empty Stack Exception"; return NULL;}
return remove(size() - 1); } };
● Merits of deriving from ArrayLinearList
– Code for derived class is quite simple and easy to develop.
– Code is expected to require little debugging.
– Code for other stack implementations, such as a linked implementation, is easily
obtained.
– Just derive from Chain instead of ArrayLinearList.
– For efficiency reasons we must also make changes to use the left end of the list as
the stack top rather than the right end.
Evaluation of deriving from ArrayLinearList
● Demerits
– All public methods of ArrayLinearList may be performed on a stack.
● get(0) … get bottom element
● remove(5)
● add(3, x)
● So we do not have a true stack implementation.
● Must override undesired methods.
– Unnecessary work is done by the code.
● peek() verifies that the stack is not empty before get is invoked. The index check done
by get is, therefore, not needed.
● add(size(), theElement) does an index check and a for loop that is not entered. Neither is
needed.
● pop() verifies that the stack is not empty before remove is invoked. remove does an
index check and a for loop that is not entered. Neither is needed.
● So the derived code runs slower than necessary.
● Evaluation
– Code developed from scratch will run faster but will take more time (cost) to
develop.
– Tradeoff between software development cost and performance.
– Tradeoff between time to market and performance.
– Could develop easy code first and later refine it to improve performance.
Code From Scratch
– Use an int variable top.
– Stack elements are in stack[0:top].
– Top element is in stack[top].
– Bottom element is in stack[0].
– Stack is empty iff top = -1.
– Number of elements in stack is top+1.
class ArrayStack
{int top; // current top of stack
Object *stack; // element array
// constructors come here
public:
Object pop()
{ if (empty()) {out << "Empty Stack Exception"; return NULL;}
Object topElement = stack[top];
stack[top--] = NULL; // clear the slot and move top down
return topElement;} };
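A complete version of the from-scratch stack fits in a few lines. The sketch below is written under assumptions that are not in the slides: a template parameter T replaces Object, std::vector supplies the growable array, and an empty stack raises std::underflow_error rather than printing a message.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

template <class T>
class ArrayStack {
public:
    ArrayStack() : top_(-1) {}

    bool empty() const { return top_ == -1; }        // stack is empty iff top = -1
    int size() const { return top_ + 1; }

    const T& peek() const {
        if (empty()) throw std::underflow_error("empty stack");
        return stack_[top_];
    }
    void push(const T& x) {
        ++top_;
        if (top_ == static_cast<int>(stack_.size()))
            stack_.push_back(x);                     // array is full, let it grow
        else
            stack_[top_] = x;                        // reuse a slot freed by pop
    }
    T pop() {
        if (empty()) throw std::underflow_error("empty stack");
        return stack_[top_--];                       // return top element, then lower top
    }

private:
    std::vector<T> stack_;  // elements are in stack_[0..top_]
    int top_;               // index of the current top element
};
```

No index check or shifting loop runs on push or pop, which is the saving over the derived class that the slides point out.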

Queues
– Linear list.
– One end is called front.
– Other end is called rear.
– Additions are done at the rear only.
– Removals are made from the front only.
– Example: the line at a bus stop, which has a front and a rear.
● Queue class
class Queue
{ public:
bool isEmpty();
Object getFrontElement();
Object getRearElement();
void put(Object theObject);
Object remove(); };
● Derive From ArrayLinearList
– when front is left end of list and rear is right end
– Queue.isEmpty() => ArrayLinearList.isEmpty()
– getFrontElement() => get(0)
– getRearElement() => get(size() - 1)
– put(theObject) => add(size(), theObject)
– remove() => remove(0)
Custom Array Queue
● Custom Linked Code
– Develop a linked class for Queue from scratch to get better performance than obtainable by
deriving from ExtendedChain.
● Custom Array Queue
– Use a 1D array queue[].
● Circular view of array, with positions [0] through [5] arranged in a ring.
● Use integer variables front and rear.
– front is one position counterclockwise from first element
– rear gives position of last element
● Example: elements A, B, C in consecutive positions of the ring, with front just before A and rear at C.
● Add An Element
– Move rear one clockwise.
– Then put into queue[rear].
● Remove An Element
– Move front one clockwise.
– Then extract from queue[front].
● Moving Clockwise
– rear++;
if (rear == queue.length) rear = 0;
– or, equivalently: rear = (rear + 1) % queue.length;
● Empty That Queue
– When a series of removals causes the queue to become empty, front = rear.
– When a queue is constructed, it is empty.
– So initialize front = rear = 0.
Problems with Queues
● A Full Tank Please
– When a series of adds causes the queue to become full, front = rear.
– So we cannot distinguish between a full queue and an empty queue!
● Remedies.
– Don’t let the queue get full.
● When the addition of an element will cause the queue to be full, increase array
size.
● This is what the text does.
– Define a boolean variable lastOperationIsPut.
● Following each put set this variable to true.
● Following each remove set to false.
● Queue is empty iff (front == rear) && !lastOperationIsPut
● Queue is full iff (front == rear) && lastOperationIsPut
– Performance is slightly better when first strategy is used.
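The first remedy (never let the array fill completely) can be sketched as follows. The assumptions here go beyond the slides: a template parameter T replaces Object, std::vector holds the ring, the array doubles when one more element would make front equal rear, and the helper names next and grow are illustrative.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

template <class T>
class ArrayQueue {
public:
    explicit ArrayQueue(int initialCapacity = 10)
        : queue_(initialCapacity < 2 ? 2 : initialCapacity), front_(0), rear_(0) {}

    bool isEmpty() const { return front_ == rear_; }

    const T& getFrontElement() const {
        if (isEmpty()) throw std::underflow_error("empty queue");
        return queue_[next(front_)];
    }
    const T& getRearElement() const {
        if (isEmpty()) throw std::underflow_error("empty queue");
        return queue_[rear_];
    }
    void put(T x) {
        if (next(rear_) == front_) grow();   // one more element would look like "empty"
        rear_ = next(rear_);                 // move rear one clockwise
        queue_[rear_] = x;
    }
    T remove() {
        if (isEmpty()) throw std::underflow_error("empty queue");
        front_ = next(front_);               // move front one clockwise
        return queue_[front_];
    }

private:
    int next(int i) const { return (i + 1) % static_cast<int>(queue_.size()); }

    void grow() {
        // Copy the elements in queue order into a larger array, starting at position 1.
        std::vector<T> bigger(2 * queue_.size());
        int count = 0;
        for (int i = next(front_); ; i = next(i)) {
            bigger[++count] = queue_[i];
            if (i == rear_) break;
        }
        queue_.swap(bigger);
        front_ = 0;
        rear_ = count;
    }

    std::vector<T> queue_;
    int front_;  // one position counterclockwise from the first element
    int rear_;   // position of the last element
};
```

Because one array position always stays unused, front == rear can only mean that the queue is empty.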
Trees
● Computer Scientist’s View (figure labels: root, branches, nodes, leaves)
● Linear Lists And Trees
– Linear lists are useful for serially ordered data.
● (e_0, e_1, e_2, …, e_{n-1})
● Days of week.
● Months in a year.
● Students in this class.
– Trees are useful for hierarchically ordered data.
● Employees of a corporation.
– President, vice presidents, managers, and so on.
● Java’s classes.
– Object is at the top of the hierarchy.
– Subclasses of Object are next, and so on.

Hierarchical Data And Trees
– The element at the top of the hierarchy is the root.
– Elements next in the hierarchy are the children of the root.
– Elements next in the hierarchy are the grandchildren of the root,
and so on.
– Elements at the lowest level of the hierarchy are the leaves.
– Java’s Classes
Object (root)
    Number (child of root)
        Integer (grandchild of root)
        Double (grandchild of root)
    Throwable (child of root)
        Exception (grandchild of root)
            RuntimeException (great grandchild of root)
    OutputStream (child of root)
        FileOutputStream (grandchild of root)


Tree Definition
A tree t is a finite nonempty set of elements.
– One of these elements is called the root.
– The remaining elements, if any, are partitioned into trees, which are called the subtrees of t.
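● Subtrees: in the Java class tree, the root is Object and its three subtrees are rooted at
Number (with Integer and Double), Throwable (with Exception and RuntimeException), and
OutputStream (with FileOutputStream).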

● Leaves
● Parent, Grandparent, Siblings, Ancestors, Descendants
● Levels – Caution
– Some texts start level numbers at 0 rather than at 1.
– In those texts the root is at level 0 and its children are at level 1.
– The grandchildren of the root are then at level 2, and so on.
– We shall number levels with the root at level 1.
● height = depth = number of levels
● Node Degree = Number Of Children
● Tree Degree = Max Node Degree; the degree of the Java class tree is 3
● Binary Tree
Binary Tree
– Finite (possibly empty) collection of elements.
– A nonempty binary tree has a root element.
– The remaining elements (if any) are partitioned into two binary trees.
– These are called the left and right subtrees of the binary tree.
● Differences Between A Tree & A Binary Tree
– No node in a binary tree may have a degree more than 2, whereas there is no limit
on the degree of a node in a tree.
– A binary tree may be empty; a tree cannot be empty.
– The subtrees of a binary tree are ordered; those of a tree are not ordered.
● Example: two trees, each with root a and a single child b; b is the left child in one and the right child in the other.
– Are different when viewed as binary trees.
– Are the same when viewed as trees.

Arithmetic Expressions
● Arithmetic Expressions
– (a + b) * (c + d) + e - f/g*h + 3.25
– Expressions comprise three kinds of entities.
● Operators (+, -, /, *).
● Operands (a, b, c, d, e, f, g, h, 3.25, (a + b), (c + d), etc.).
● Delimiters ((, )).
– Operator Degree
● Number of operands that the operator requires.
● Binary operator requires two operands.
– a+b c/d e-f
● Unary operator requires one operand.
– +g -h
– Infix Form
● Normal way to write an expression.
● Binary operators come in between their left and right operands.
– a*b a+b*c a*b/c
– (a + b) * (c + d) + e - f/g*h + 3.25

Arithmetic Expressions (2)
– Operator Priorities
● How do you figure out the operands of an operator?
– a+b*c a*b+c/d
● This is done by assigning operator priorities.
– priority(*) = priority(/) > priority(+) = priority(-)
● When an operand lies between two operators, the operand associates with the
operator that has higher priority.
● Tie Breaker
– When an operand lies between two operators that have the same
priority, the operand associates with the operator on the left.
– a + b - c    a * b / c / d
● Delimiters
– Subexpression within delimiters is treated as a single operand,
independent from the remainder of the expression.
● (a + b) * (c - d) / (e - f)

Arithmetic Expressions (3)
● Infix Expression Is Hard To Parse
– Need operator priorities, tie breaker, and delimiters.
– This makes computer evaluation more difficult than is necessary.
– Postfix and prefix expression forms do not rely on operator priorities, a tie
breaker, or delimiters.
– So it is easier for a computer to evaluate expressions that are in these forms.
● Postfix Form
– The postfix form of a variable or constant is the same as its infix form. a, b,
3.25
– The relative order of operands is the same in infix and postfix forms.
– Operators come immediately after the postfix form of their operands.
– Infix = a + b    Postfix = a b +
● Postfix Examples
– Infix = a + b * c Postfix = a b c * +
– Infix = a * b + c Postfix = a b * c +
– Infix = (a + b) * (c - d) / (e + f)
– Postfix = a b + c d - * e f + /
● Unary Operators
– Replace with new symbols.
● + a => a @        + a + b => a @ b +
● - a => a ?        - a - b => a ? b -

Postfix Evaluation
– Scan postfix expression from left to right pushing operands on to a stack.
– When an operator is encountered, pop as many operands as this operator needs; evaluate
the operator; push the result on to the stack.
– This works because, in postfix, operators come immediately after their operands.
– Example: (a + b) * (c - d) / (e + f), which is a b + c d - * e f + / in postfix
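The evaluation rule above needs only a stack of operand values. The following sketch makes several simplifying assumptions that are not in the slides: tokens are separated by spaces, operands are numeric literals rather than variables, and only the four binary operators are supported. The function name evaluatePostfix is illustrative.

```cpp
#include <cassert>
#include <sstream>
#include <stack>
#include <stdexcept>
#include <string>

// Evaluates a space-separated postfix expression with binary + - * /.
double evaluatePostfix(const std::string& postfix) {
    std::stack<double> operands;
    std::istringstream in(postfix);
    std::string token;
    while (in >> token) {
        if (token == "+" || token == "-" || token == "*" || token == "/") {
            if (operands.size() < 2) throw std::invalid_argument("missing operand");
            double right = operands.top(); operands.pop();   // popped first: right operand
            double left = operands.top(); operands.pop();
            if (token == "+") operands.push(left + right);
            else if (token == "-") operands.push(left - right);
            else if (token == "*") operands.push(left * right);
            else operands.push(left / right);
        } else {
            operands.push(std::stod(token));                 // operand: push its value
        }
    }
    if (operands.size() != 1) throw std::invalid_argument("malformed expression");
    return operands.top();
}
```

With a = 1, b = 2, c = 7, d = 3, e = 2, f = 4, the slide's example becomes "1 2 + 7 3 - * 2 4 + /", which evaluates to (3 * 4) / 6 = 2.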
● Prefix Form
– The prefix form of a variable or constant is the same as its infix form. a, b,
3.25
– The relative order of operands is the same in infix and prefix forms.
– Operators come immediately before the prefix form of their operands.
– Infix = a + b    Postfix = a b +    Prefix = + a b
● Binary Tree Form
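– a + b: root +, with left child a and right child b
– - a: root -, with the single operand a as its child
– (a + b) * (c - d) / (e + f): root /; its left subtree has root *, whose children are + (with
children a and b) and - (with children c and d); its right subtree has root + with children e and f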
Binary Tree Properties & Representation
● Merits Of Binary Tree Form
– Left and right operands are easy to visualize.
– Code optimization algorithms work with the binary tree form of an expression.
– Simple recursive evaluation of expression.
● Minimum Number Of Nodes
– Minimum number of nodes in a binary tree whose height is h: at least one node at
each of the first h levels.
– Minimum number of nodes is h
● Maximum Number Of Nodes
– All possible nodes at first h levels are present.
– Maximum # of nodes = 1 + 2 + 4 + 8 + … + 2^{h-1} = 2^h - 1
● Number Of Nodes & Height
– Let n be the # of nodes in a binary tree whose height is h.
– h <= n <= 2^h - 1
– log2(n+1) <= h <= n

Full Binary Tree
1

2 3

4 5 6 7
8 9 10 11 12 13 14 15
– A full binary tree of a given height h has 2^h - 1 nodes.
● Numbering Nodes In A Full Binary Tree
– Number the nodes 1 through 2^h - 1.
– Number by levels from top to bottom.
– Within a level number from left to right.
– Node Number Properties
● Parent of node i is node floor(i / 2), unless i = 1.
● Node 1 is the root and has no parent.
● Left child of node i is node 2i, unless 2i > n, n is the # of nodes.
● If 2i > n, node i has no left child.
● Right child of node i is node 2i+1, unless 2i+1 > n, where n is the # of nodes.
● If 2i+1 > n, node i has no right child.
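The node number properties reduce to integer arithmetic. A minimal sketch follows, assuming node numbers start at 1 and that 0 is returned when the requested node does not exist; both the function names and the use of 0 as "no such node" are illustrative conventions.

```cpp
#include <cassert>

// Nodes are numbered 1..n by levels, left to right; 0 means "no such node".
int parentOf(int i) { return i == 1 ? 0 : i / 2; }   // integer division gives floor(i / 2)
int leftChildOf(int i, int n) { return 2 * i > n ? 0 : 2 * i; }
int rightChildOf(int i, int n) { return 2 * i + 1 > n ? 0 : 2 * i + 1; }
```

In the 15-node full binary tree, node 7 has children 14 and 15, and node 8 has none.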
● Binary Tree Representation
– Array Representation
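● Number the nodes using the numbering scheme for a full binary tree. The node that is
numbered i is stored in tree[i].
Array Representation
● Example: a ten-node binary tree whose nodes a, b, c, d, e, f, g, h, i, j are numbered 1 through 10
tree[]: - a b c d e f g h i j (positions 0 through 10; position 0 is unused)
● Right-Skewed Binary Tree
– Nodes a, b, c, d are numbered 1, 3, 7, 15; each node is the right child of the previous one.
tree[]: - a - b - - - c - - - - - - - d (positions 0 through 15)
– An n node binary tree needs an array whose length is between n+1 and 2^n.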
● Linked Representation
– Each binary tree node is represented as an object whose data type is BinaryTreeNode.
– The space required by an n node binary tree is n * (space required by one node).
● The Class BinaryTreeNode
class BinaryTreeNode
{ public:
Object element;
BinaryTreeNode *leftChild; // left subtree
BinaryTreeNode *rightChild; // right subtree
// constructors and any other methods come here
};
Binary Tree Operations
– Determine the height.
– Determine the number of nodes.
– Make a clone.
– Determine if two binary trees are clones.
– Display the binary tree.
– Evaluate the arithmetic expression represented by a binary tree.
– Obtain the infix form of an expression.
– Obtain the prefix form of an expression.
– Obtain the postfix form of an expression.
● Binary Tree Traversal
– Many binary tree operations are done by performing a traversal of
the binary tree.
– In a traversal, each element of the binary tree is visited exactly once.
– During the visit of an element, all action (make a clone, display,
evaluate the operator, etc.) with respect to this element is taken.
– Preorder, Inorder, Postorder or Level order
Binary Tree Traversal
● Example binary tree: root a with children b and c; b has children d and e; d has children g and h;
e has right child i; c has left child f; f has right child j.
● Preorder Traversal
void preOrder(BinaryTreeNode *t)
{ if (t != NULL)
{ visit(t);
preOrder(t->leftChild);
preOrder(t->rightChild); } }
– Preorder Example (visit = print): a b d g h e i c f j
– Preorder Of Expression Tree for (a + b) * (c - d) / (e + f): / * + a b - c d + e f
– Gives prefix form of expression!
● Inorder Traversal
void inOrder(BinaryTreeNode *t)
{ if (t != NULL)
{ inOrder(t->leftChild);
visit(t);
inOrder(t->rightChild); }}
– Inorder Example (visit = print): g d h b e i a f j c
– Inorder By Projection (Squishing)
– Inorder Of Expression Tree: a + b * c - d / e + f
– Gives infix form of expression (sans parentheses)!
Postorder Traversal
void postOrder(BinaryTreeNode *t)
{ if (t != NULL)
{ postOrder(t->leftChild);
postOrder(t->rightChild);
visit(t); }}
– Postorder Example (visit = print): g h d i e b j f c a
– Postorder Of Expression Tree: a b + c d - * e f + /
– Gives postfix form of expression!
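The three recursive traversals differ only in where the visit happens. The sketch below assumes a small Node struct with char elements and uses "append to a string" as the visit; it also shows height and node count as traversal applications. The struct and function signatures are illustrative, not the slides' BinaryTreeNode class.

```cpp
#include <cassert>
#include <string>

struct Node {
    char element;
    Node* left;
    Node* right;
};

void preOrder(const Node* t, std::string& out) {
    if (t == nullptr) return;
    out += t->element;               // visit before both subtrees
    preOrder(t->left, out);
    preOrder(t->right, out);
}

void inOrder(const Node* t, std::string& out) {
    if (t == nullptr) return;
    inOrder(t->left, out);
    out += t->element;               // visit between the subtrees
    inOrder(t->right, out);
}

void postOrder(const Node* t, std::string& out) {
    if (t == nullptr) return;
    postOrder(t->left, out);
    postOrder(t->right, out);
    out += t->element;               // visit after both subtrees
}

// Height counts levels: an empty tree has height 0, a single node has height 1.
int height(const Node* t) {
    if (t == nullptr) return 0;
    int hl = height(t->left);
    int hr = height(t->right);
    return 1 + (hl > hr ? hl : hr);
}

int countNodes(const Node* t) {
    return t == nullptr ? 0 : 1 + countNodes(t->left) + countNodes(t->right);
}
```

Built on the ten-node example tree, these produce the preorder, inorder, and postorder sequences quoted in the slides.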
● Traversal Applications
– Make a clone. Determine height. Determine # of nodes.
● Level Order
Let t be the tree root.

while (t != null)
{ visit t and put its children on a FIFO queue;
remove a node from the FIFO queue and call it t;
} // remove returns null when queue is empty

– Level-Order Example (visit = print) abcdefghij
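The level-order loop above maps onto std::queue. This is a sketch assuming a Node struct with char elements and a visit that appends to a string; an explicit empty-queue test stands in for "remove returns null when queue is empty".

```cpp
#include <cassert>
#include <queue>
#include <string>

struct Node {
    char element;
    Node* left;
    Node* right;
};

std::string levelOrder(const Node* root) {
    std::string visited;
    std::queue<const Node*> pending;     // FIFO queue of nodes waiting to be visited
    const Node* t = root;
    while (t != nullptr) {
        visited += t->element;           // visit t
        if (t->left != nullptr) pending.push(t->left);
        if (t->right != nullptr) pending.push(t->right);
        if (pending.empty()) {
            t = nullptr;                 // nothing left to visit
        } else {
            t = pending.front();
            pending.pop();
        }
    }
    return visited;
}
```

Children are queued left before right, so each level comes out from left to right.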


Binary Tree Construction
– Suppose that the elements in a binary tree are distinct.
– Can you construct the binary tree from which a given traversal
sequence came?
● When a traversal sequence has more than one element, the binary tree is not
uniquely defined.
● Therefore, the tree from which the sequence was obtained cannot be
reconstructed uniquely.
– Can you construct the binary tree, given two traversal sequences?
● Depends on which two sequences are given.
– Preorder And Postorder
● Example: one tree has root a with left child b; the other has root a with right child b.
● preorder = ab and postorder = ba for both trees
● Preorder and postorder do not uniquely define a binary tree.
● Nor do preorder and level order (same example).
● Nor do postorder and level order (same example).

Binary Tree Construction
– Inorder And Preorder
● inorder = g d h b e i a f j c
● preorder = a b d g h e i c f j
● Scan the preorder left to right using the inorder to separate left and right subtrees.
● a is the root of the tree; gdhbei are in the left subtree; fjc are in the right subtree.
● b is the next root; gdh are in the left subtree; ei are in the right subtree.
● d is the next root; g is in the left subtree; h is in the right subtree.
[Figure: partial trees after each step: a(gdhbei, fjc); a(b(gdh, ei), fjc); a(b(d(g, h), ei), fjc)]
– Inorder And Postorder
● Scan postorder from right to left using inorder to separate left and right subtrees.
● inorder = g d h b e i a f j c
● postorder = g h d i e b j f c a
● Tree root is a; gdhbei are in left subtree; fjc are in right subtree.
– Inorder And Level Order
● Scan level order from left to right using inorder to separate left and right subtrees.
● inorder = g d h b e i a f j c
● level order = a b c d e f g h i j
● Tree root is a; gdhbei are in left subtree; fjc are in right subtree.
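The inorder-plus-preorder construction can be sketched in C as below. This is a minimal sketch that assumes distinct single-character elements; the Node type and the names build, from_inorder_preorder, and post_order are assumptions made for the example.

```c
#include <stdlib.h>

/* Hypothetical node type for this sketch; field names mirror the slides. */
typedef struct node {
    char element;
    struct node *leftChild;
    struct node *rightChild;
} Node;

/* Builds the subtree whose inorder sequence is in[lo..hi].
   *next indexes the next unused preorder element, which is the subtree root. */
static Node *build(const char *pre, int *next, const char *in, int lo, int hi)
{
    if (lo > hi) return NULL;                    /* empty subtree */
    Node *t = malloc(sizeof *t);
    if (t == NULL) return NULL;                  /* allocation failed */
    t->element = pre[(*next)++];
    int r = lo;
    while (r < hi && in[r] != t->element) r++;   /* locate the root in inorder */
    t->leftChild  = build(pre, next, in, lo, r - 1);
    t->rightChild = build(pre, next, in, r + 1, hi);
    return t;
}

/* pre and in hold the n elements in preorder and inorder. */
Node *from_inorder_preorder(const char *pre, const char *in, int n)
{
    int next = 0;
    return build(pre, &next, in, 0, n - 1);
}

/* Appends the postorder sequence of t to out; returns the new length. */
int post_order(const Node *t, char *out, int len)
{
    if (t == NULL) return len;
    len = post_order(t->leftChild, out, len);
    len = post_order(t->rightChild, out, len);
    out[len++] = t->element;
    return len;
}
```

Building from preorder abdgheicfj and inorder gdhbeiafjc and then traversing in postorder gives ghdiebjfca, the postorder sequence listed for the same tree.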

Graphs
– G = (V,E)
– V is the vertex set.
– Vertices are also called nodes and points.
– E is the edge set.
– Each edge connects two different vertices.
– Edges are also called arcs and lines.
– Directed edge has an orientation (u,v).
– Undirected edge has no orientation (u,v).
– Undirected graph => no oriented edge.
– Directed graph => every edge has an orientation.
[Figure: an edge between vertices u and v, drawn undirected and directed; two example graphs on vertices 1 to 11; panels labeled n = 1, n = 2, n = 4]
● Applications
– Communication Network:
● Vertex = city, edge = communication link.
– Driving Distance/Time Map
● Vertex = city, edge weight = driving distance/time.
– Street Map: Some streets are one way.


Complete Undirected Graph
● Has all possible edges.
– Number Of Edges--Undirected Graph
● Each edge is of the form (u,v), u != v.
● Number of such ordered pairs in an n vertex graph is n(n-1).
● Since edge (u,v) is the same as edge (v,u), the number of edges in a complete undirected
graph is n(n-1)/2.
● Number of edges in an undirected graph is <= n(n-1)/2.
– Number Of Edges--Directed Graph
● Each edge is of the form (u,v), u != v.
● Number of such ordered pairs in an n vertex graph is n(n-1).
● Since edge (u,v) is not the same as edge (v,u), the number of edges in a complete
directed graph is n(n-1).
● Number of edges in a directed graph is <= n(n-1).
● Vertex Degree
– Number of edges incident to vertex.
– Undirected graph: sum of degrees = 2e (e is number of edges), since each edge is incident to two vertices.
– Directed graph: in-degree is number of incoming edges
– Directed graph: out-degree is number of outbound edges
– each edge contributes 1 to the in-degree of some vertex and 1 to the out-degree of some other vertex
– sum of in-degrees = sum of out-degrees = e, where e is the number of edges in the digraph
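The edge bounds and degree sums can be checked on a small edge list. The sketch below assumes vertices numbered 0..n-1; the Edge type and all function names are assumptions made for the example.

```c
/* An edge (u, v) between vertices numbered 0..n-1. */
typedef struct { int u, v; } Edge;

/* Undirected graph: each edge adds 1 to the degree of both endpoints. */
void degrees(const Edge *edges, int e, int n, int *deg)
{
    for (int i = 0; i < n; i++) deg[i] = 0;
    for (int k = 0; k < e; k++) {
        deg[edges[k].u]++;
        deg[edges[k].v]++;
    }
}

/* Directed graph: edge (u, v) leaves u and enters v. */
void in_out_degrees(const Edge *edges, int e, int n, int *in, int *out)
{
    for (int i = 0; i < n; i++) in[i] = out[i] = 0;
    for (int k = 0; k < e; k++) {
        out[edges[k].u]++;
        in[edges[k].v]++;
    }
}

/* Sum of the n entries of a. */
int sum(const int *a, int n)
{
    int s = 0;
    for (int i = 0; i < n; i++) s += a[i];
    return s;
}

/* Maximum number of edges in a graph on n vertices. */
int max_edges(int n, int directed)
{
    return directed ? n * (n - 1) : n * (n - 1) / 2;
}
```

For the four edges (0,1), (0,2), (1,2), (2,3) read as undirected, the degrees are 2, 2, 3, 1 and sum to 2e = 8; read as directed, the in-degrees and the out-degrees each sum to e = 4.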
Searching Algorithm
● Linear Search: Unsorted data
– Using Array
int lin_search1(L_TYPE *list, long int value)
{ int loc;
  /* stop at the first match or at the end of the array */
  for (loc = 0; loc < list->size && list->info[loc].id != value; ++loc);
  if (loc < list->size) return (loc);   /* found before the end */
  else return (-1); }

– Using Linked List
N_PTR lin_search2(L_TYPE *list, long int value)
{ N_PTR loc;
  /* stop at the first match or at the end of the list */
  for (loc = list; loc != NULL && loc->info.id != value; loc = loc->next);
  return (loc); }   /* loc is NULL when value is not in the list */

– Using Recursion
N_PTR lin_search3(L_TYPE *list, long int value)
{ if (list == NULL) return (NULL);
  else if (list->info.id == value) return (list);
  else return (lin_search3(list->next, value)); }

Binary Search
– Requires the list to be sorted in ascending order of id.
– Iterative version
int bin_search1(L_TYPE *list, long int value, int low, int high)
{ int mid;
  while (low <= high) {
    mid = low + (high - low) / 2;   /* avoids overflow of low + high */
    if (list->info[mid].id == value) return (mid);
    else if (list->info[mid].id < value)
      low = mid + 1;                /* continue in the upper half */
    else high = mid - 1; }          /* continue in the lower half */
  return (-1); }
– Recursive version
int bin_search2(L_TYPE *list, long int value, int low, int high)
{ int mid;
  if (low > high) return (-1);      /* empty range: value is absent */
  mid = low + (high - low) / 2;
  if (list->info[mid].id == value) return (mid);
  else if (list->info[mid].id < value)
    return (bin_search2(list, value, mid + 1, high));
  else
    return (bin_search2(list, value, low, mid - 1)); }
Sorting Algorithms
● Bubble Sort
void bubble_sort(int list[], int size)
{ int i, temp, sorted, pass = 1;
  do { sorted = 1;                     /* assume sorted until a swap occurs */
    for (i = 0; i < size - pass; i++)
      if (list[i] > list[i + 1]) {
        temp = list[i];
        list[i] = list[i + 1];
        list[i + 1] = temp;
        sorted = 0; }
    pass++; }                          /* the largest remaining element is now in place */
  while (!sorted); }
● Selection Sort
– Basic Idea: make a number of passes through the list or a part of the list and, on
each pass, select one element to be correctly positioned.
– First pass: 67, 33, 21, 84, 49, 50, 75 => 21, 33, 67, 84, 49, 50, 75
void selection_sort(int list[], int size)
{ int i, j, min_pos, temp;
  for (i = 0; i < size - 1; i++) {
    min_pos = i;                       /* position of the smallest element in list[i..size-1] */
    for (j = i + 1; j < size; j++)
      if (list[j] < list[min_pos])
        min_pos = j;
    if (min_pos != i) {                /* swap it into position i */
      temp = list[i];
      list[i] = list[min_pos];
      list[min_pos] = temp; } } }
Insertion Sort
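28 81 03 47 17 13 55 65 23 18 67 38 36   (initial list)
Each row below shows the sorted prefix, then the next element to insert.
03 28 81 | 47
03 28 47 81 | 17
03 17 28 47 81 | 13
03 13 17 28 47 81 | 55
03 13 17 28 47 55 81 | 65
03 13 17 28 47 55 65 81 | 23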
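The trace above shows only the intermediate states. A minimal C sketch of insertion sort, written in the style of the other sorting routines in these slides, is given below; the function name insertion_sort is an assumption.

```c
/* Sorts list[0..size-1] in ascending order.
   Before each step, list[0..i-1] is sorted; list[i] is then inserted into it. */
void insertion_sort(int list[], int size)
{
    for (int i = 1; i < size; i++) {
        int value = list[i];
        int j = i - 1;
        while (j >= 0 && list[j] > value) {   /* shift larger elements right */
            list[j + 1] = list[j];
            j--;
        }
        list[j + 1] = value;                  /* drop the element into the gap */
    }
}
```

Calling it on only the first four elements of the example list leaves 03 28 47 81 in the prefix with 17 next in line, which matches the corresponding row of the trace.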
● Shell Sort
28 81 03 47 17 13 55 65 23 18 67 38 36   (initial list)
13 38 03 23 17 28 55 36 47 18 67 81 65   (after the pass with gap 5)
03 18 13 23 17 28 47 36 55 38 65 81 67   (after the pass with gap 2)
03 13 17 18 23 28 36 38 47 55 65 67 81   (after the pass with gap 1)
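The rows of the trace are consistent with insertion-sort passes over elements 5 apart, then 2 apart, then 1 apart. A minimal C sketch under that assumption follows; the names gap_pass and shell_sort and the caller-supplied gap sequence are choices made for this example.

```c
/* One pass: insertion sort on every sublist of elements that are gap apart. */
void gap_pass(int list[], int size, int gap)
{
    for (int i = gap; i < size; i++) {
        int value = list[i];
        int j = i - gap;
        while (j >= 0 && list[j] > value) {   /* shift within the sublist */
            list[j + gap] = list[j];
            j -= gap;
        }
        list[j + gap] = value;
    }
}

/* Runs one pass per gap; the last gap must be 1 for the list to end up sorted. */
void shell_sort(int list[], int size, const int gaps[], int ngaps)
{
    for (int g = 0; g < ngaps; g++)
        gap_pass(list, size, gaps[g]);
}
```

With the gap sequence 5, 2, 1 the three passes reproduce the three rows that follow the initial list.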
Quicksort
● Quicksort uses a divide-and-conquer strategy: a recursive approach to problem-solving
in which the original problem is partitioned into simpler subproblems, each
considered independently. Subdivision continues until the subproblems
obtained are simple enough to be solved directly.
● Choose some element called a pivot.
● Perform a sequence of exchanges so that
– all elements that are less than this pivot are to its left and
– all elements that are greater than or equal to the pivot are to its right.
– This divides the (sub)list into two smaller sublists,
each of which may then be sorted independently in the same way.
1. If the list has 0 or 1 elements, return. // the list is sorted
2. Else do:
   a. Pick an element in the list to use as the pivot.
   b. Split the remaining elements into two disjoint groups:
      SmallerThanPivot = {all elements < pivot}
      LargerThanPivot = {all elements >= pivot}
   c. Return the list rearranged as:
      Quicksort(SmallerThanPivot), pivot, Quicksort(LargerThanPivot).

Quick Sort
28 81 03 47 17 13 55 65 23 18 67 38 36
36 03 47 17 13 28 23 18 38 55 67 81 65

03 13 47 17 36 28 23 18 38 55 65 67 81

03 13 18 17 23 28 36 47 38 55 65 67 81

03 13 17 18 23 28 36 38 47 55 65 67 81
void quicksort(int list[], int left, int right)
{ int pivot, p_value, i, mid, temp;
  if (left < right) {
    mid = (left + right) / 2; p_value = list[mid];   /* middle element is the pivot */
    list[mid] = list[left]; list[left] = p_value;    /* park the pivot at the left end */
    pivot = left;                                    /* last index of the "smaller" region */
    for (i = left + 1; i <= right; i++)
      if (list[i] < p_value) {                       /* grow the "smaller" region by one */
        temp = list[++pivot]; list[pivot] = list[i]; list[i] = temp; }
    temp = list[pivot];                              /* move the pivot to its final position */
    list[pivot] = list[left];
    list[left] = temp;
    quicksort(list, left, pivot - 1);
    quicksort(list, pivot + 1, right); } }
Bucket sort
● Assumes the input is generated by a random process that distributes
elements uniformly over [0, 100).
● Idea:
– Divide [0, 100) into n equal-sized buckets.
– Distribute the n input values into the buckets.
– Sort each bucket.
– Then go through buckets in order, listing elements in each one.
● Example:
– 89, 88, 21, 17, 37, 65, 44, 53, 23, 54, 87, 77

.. 17 .. 21 23 .. 37 .. 44 .. 53 54 .. 65 .. 77 .. 87 88 89
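The idea can be sketched in C as follows. This is one possible realization, not necessarily the one intended by the slides: it assumes integer values in [0, 100), uses as many buckets as there are values, lays the buckets out contiguously in a temporary array, and sorts each bucket by insertion. The function name bucket_sort is an assumption.

```c
#include <stdlib.h>

/* Sorts size values, each assumed to lie in [0, 100), using size buckets.
   Returns 0 on success, -1 if memory could not be allocated. */
int bucket_sort(int list[], int size)
{
    if (size < 2) return 0;
    int *start = calloc((size_t)size + 1, sizeof *start);
    int *tmp = malloc((size_t)size * sizeof *tmp);
    if (start == NULL || tmp == NULL) { free(start); free(tmp); return -1; }

    /* Count the values per bucket; value v belongs to bucket v * size / 100. */
    for (int i = 0; i < size; i++) start[list[i] * size / 100 + 1]++;
    /* Prefix sums turn the counts into the first slot of each bucket. */
    for (int b = 0; b < size; b++) start[b + 1] += start[b];
    /* Distribute; afterwards start[b] is one past the end of bucket b. */
    for (int i = 0; i < size; i++) tmp[start[list[i] * size / 100]++] = list[i];

    int lo = 0;
    for (int b = 0; b < size; b++) {              /* insertion sort inside each bucket */
        for (int i = lo + 1; i < start[b]; i++) {
            int v = tmp[i], j = i - 1;
            while (j >= lo && tmp[j] > v) { tmp[j + 1] = tmp[j]; j--; }
            tmp[j + 1] = v;
        }
        lo = start[b];
    }
    for (int i = 0; i < size; i++) list[i] = tmp[i];   /* buckets are already in order */
    free(start);
    free(tmp);
    return 0;
}
```

Because the bucket index never decreases as the value grows, listing the sorted buckets in order yields the sorted list.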

Hash Tables
– Worst-case time for get, put, and remove is O(size).
– Expected time is O(1).
● Ideal Hashing
– Uses a 1D array (or table) table[0:b-1].
● Each position of this array is a bucket.
● A bucket can normally hold only one dictionary pair.
– Uses a hash function f that converts each key k into an index in the
range [0, b-1].
● f(k) is the home bucket for key k.
– Every dictionary pair (key, element) is stored in its home bucket
table[f(key)].
– Pairs are: (22,a), (33,c), (3,d), (73,e), (85,f).
– Hash table is table[0:7], b = 8.
– Hash function is key/11 (integer division).
– Pairs are stored in table as below:
[0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]
(3,d)  -      (22,a) (33,c) -      -      (73,e) (85,f)
Hash Table Issues
– get, put, and remove take O(1) time.
– What Can Go Wrong?
● Where does (26,g) go?
● Keys that have the same home bucket are synonyms.
– 22 and 26 are synonyms with respect to the hash function that is in use.
● The home bucket for (26,g) is already occupied.
– A collision occurs when the home bucket for a new pair is occupied by a pair with a different key.
– An overflow occurs when there is no space in the home bucket for the new pair.
– When a bucket can hold only one pair, collisions and overflows occur together.
– Need a method to handle overflows.
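The example with hash function key/11 and b = 8 can be sketched in C as below. The Bucket type, the NUM_BUCKETS constant, and the function names are assumptions made for this sketch; put simply reports an overflow instead of handling it.

```c
#define NUM_BUCKETS 8   /* b = 8, as in the example */

/* One bucket holds at most one (key, element) pair. */
typedef struct { int used; int key; char element; } Bucket;

/* Example hash function key/11; stays in [0, 7] for keys 0..87. */
int home_bucket(int key)
{
    return key / 11;
}

/* Stores the pair in its home bucket and returns the bucket index,
   or returns -1 when the home bucket already holds a different key
   (a collision and, with one pair per bucket, also an overflow). */
int put(Bucket table[NUM_BUCKETS], int key, char element)
{
    int h = home_bucket(key);
    if (table[h].used && table[h].key != key) return -1;
    table[h].used = 1;
    table[h].key = key;
    table[h].element = element;
    return h;
}
```

After the five example pairs are stored, put(table, 26, 'g') returns -1 because 26/11 = 2 and bucket 2 already holds (22,a).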
● Hash Table Issues
– Choice of hash function.
– Overflow handling method.
– Size (number of buckets) of hash table.
● Hash Functions
– Two parts:
● Convert key into an integer in case the key is not an integer.
– Done by the method hashCode().
● Map an integer into a home bucket.
– f(k) is an integer in the range [0, b-1], where b is the number of buckets in the table.
● Map Into A Home Bucket
– Most common method is by division.
● homeBucket = Math.abs(theKey.hashCode()) % divisor;
– divisor equals number of buckets b.
– 0 <= homeBucket < divisor = b
Uniform Hash Function
● Uniform Hash Function
– Let keySpace be the set of all possible keys.
– A uniform hash function maps the keys in keySpace into buckets such that
approximately the same number of keys get mapped into each bucket.
– Equivalently, the probability that a randomly selected key has bucket i as its home
bucket is 1/b, 0 <= i < b.
– A uniform hash function minimizes the likelihood of an overflow when keys are
selected at random.
● Hashing By Division
– keySpace = all ints.
– For every b, the number of ints that get mapped (hashed) into
bucket i is approximately 2^32/b.
– Therefore, the division method results in a uniform hash function
when keySpace = all ints.
– In practice, keys tend to be correlated.
– So, the choice of the divisor b affects the distribution of home
buckets.
Selecting The Divisor
– Because of this correlation, applications tend to have a bias towards
keys that map into odd integers (or into even ones).
– When the divisor is an even number, odd integers hash into odd
home buckets and even integers into even home buckets.
● 20%14 = 6, 30%14 = 2, 8%14 = 8
● 15%14 = 1, 3%14 = 3, 23%14 = 9
– The bias in the keys results in a bias toward either the odd or even
home buckets.
– When the divisor is an odd number, odd (even) integers may hash
into any home bucket.
● 20%15 = 5, 30%15 = 0, 8%15 = 8
● 15%15 = 0, 3%15 = 3, 23%15 = 8
– The bias in the keys does not result in a bias toward either the odd or
even home buckets.
– Better chance of uniformly distributed home buckets.
– So do not use an even divisor.
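The parity effect can be checked directly. The short C sketch below assumes non-negative integer keys; the function names bucket_by_division and parity_preserved are assumptions made for the example.

```c
/* Home bucket by division for a non-negative integer key. */
int bucket_by_division(int key, int divisor)
{
    return key % divisor;
}

/* Returns 1 if every key has the same parity as its home bucket, else 0. */
int parity_preserved(const int keys[], int n, int divisor)
{
    for (int i = 0; i < n; i++)
        if (keys[i] % 2 != bucket_by_division(keys[i], divisor) % 2)
            return 0;
    return 1;
}
```

For the keys 20, 30, 8, 15, 3, 23, parity is preserved with divisor 14 but not with divisor 15, where for instance the even key 20 lands in the odd bucket 5.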
Selecting The Divisor (2)
– Similar biased distribution of home buckets is seen, in practice,
when the divisor is a multiple of prime numbers such as 3, 5, 7, …
– The effect of each prime divisor p of b decreases as p gets larger.
– Ideally, choose b so that it is a prime number.
– Alternatively, choose b so that it has no prime factor smaller than
20.
● java.util.Hashtable
– Simply uses a divisor that is an odd number.
– This simplifies implementation because we must be able to
resize the hash table as more pairs are put into the
dictionary.
● Array doubling, for example, requires you to go from a 1D array
table whose length is b (which is odd) to an array whose length is
2b+1 (which is also odd).
Overflow Handling
– An overflow occurs when the home bucket for a new pair (key,
element) is full.
– We may handle overflows by:
● Search the hash table in some systematic fashion for a bucket that is not full.
– Linear probing (linear open addressing).
– Quadratic probing.
– Random probing.
● Eliminate overflows by permitting each bucket to keep a list of all pairs for
which it is the home bucket.
– Array linear list.
– Chain.
● Linear Probing – Get And Put
– divisor = b (number of buckets) = 17.
– Home bucket = key % 17.

Linear Probing
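– Put in pairs whose keys are 6, 12, 34, 29, 28, 11, 23, 7, 0, 33, 30, 45
[0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10] [11] [12] [13] [14] [15] [16]
34   0    45   -    -    -    6    23   7    -    -    28   12   29   11   30   33
● Remove (each removal below starts from the table above)
– remove(0): bucket 1 is vacated.
[0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10] [11] [12] [13] [14] [15] [16]
34   -    45   -    -    -    6    23   7    -    -    28   12   29   11   30   33
● Search cluster for pair (if any) to fill vacated bucket.
[0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10] [11] [12] [13] [14] [15] [16]
34   45   -    -    -    -    6    23   7    -    -    28   12   29   11   30   33
– remove(34): bucket 0 is vacated.
[0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10] [11] [12] [13] [14] [15] [16]
-    0    45   -    -    -    6    23   7    -    -    28   12   29   11   30   33
● Search cluster for pair (if any) to fill vacated bucket.
[0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10] [11] [12] [13] [14] [15] [16]
0    45   -    -    -    -    6    23   7    -    -    28   12   29   11   30   33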
Linear Probing (2)
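– remove(29): bucket 13 is vacated.
[0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10] [11] [12] [13] [14] [15] [16]
34   0    45   -    -    -    6    23   7    -    -    28   12   -    11   30   33
● Search cluster for pair (if any) to fill vacated bucket.
[0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9]  [10] [11] [12] [13] [14] [15] [16]
34   0    -    -    -    -    6    23   7    -    -    28   12   11   30   45   33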

● Performance Of Linear Probing


– Worst-case get/put/remove time is Theta(n), where n is the number of pairs in the
table.
– This happens when all pairs are in the same cluster.
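The put, get, and remove steps for linear probing with b = 17 can be sketched in C as below. The sketch stores keys only and assumes they are non-negative; the EMPTY marker and the names init, get, put, and remove_key are assumptions made for the example. Removal refills the vacated bucket by shifting later members of the cluster back, which is one way to carry out the "search cluster" step.

```c
#define B 17           /* number of buckets (the divisor) */
#define EMPTY (-1)     /* marks an unused bucket; keys are assumed non-negative */

void init(int table[B])
{
    for (int i = 0; i < B; i++) table[i] = EMPTY;
}

/* Returns the bucket holding key, or -1 if key is not in the table. */
int get(const int table[B], int key)
{
    int i = key % B;                         /* home bucket */
    for (int n = 0; n < B; n++) {
        if (table[i] == EMPTY) return -1;    /* end of cluster: key is absent */
        if (table[i] == key) return i;
        i = (i + 1) % B;
    }
    return -1;
}

/* Returns the bucket where key is stored, or -1 if the table is full. */
int put(int table[B], int key)
{
    int i = key % B;
    for (int n = 0; n < B; n++) {
        if (table[i] == EMPTY || table[i] == key) { table[i] = key; return i; }
        i = (i + 1) % B;
    }
    return -1;
}

/* Removes key and refills the vacated bucket from the rest of the cluster.
   Returns 0 on success, -1 if key is not in the table. */
int remove_key(int table[B], int key)
{
    int hole = get(table, key);
    if (hole < 0) return -1;
    table[hole] = EMPTY;
    for (int i = (hole + 1) % B; table[i] != EMPTY; i = (i + 1) % B) {
        int home = table[i] % B;
        /* A key may move into the hole unless its home bucket lies
           cyclically after the hole and at or before its current bucket. */
        int between = (hole <= i) ? (hole < home && home <= i)
                                  : (hole < home || home <= i);
        if (!between) { table[hole] = table[i]; table[i] = EMPTY; hole = i; }
    }
    return 0;
}
```

Inserting the twelve example keys in the order given and then calling remove_key(table, 29) moves 11, 30, and 45 back by the same steps as the example and leaves 33, 34, and 0 in place.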
