
Data structure

The logical and mathematical model of a particular organization of data is called a data structure. The main characteristics of a data structure are:
- It contains component data items, which may be atomic or another data structure
- It defines a set of operations on one or more of the component items
- It defines rules as to how the components relate to each other and to the structure as a whole
The choice of a particular data structure depends on the following considerations:
1. It must be rich enough in structure to mirror the actual relationships of the data in the real world; for example, the hierarchical relationship of entities is best described by the tree data structure.
2. The structure should be simple enough that one can effectively process the data when necessary.

The various data structures are divided into the following categories:
Linear data structures- A data structure whose elements form a sequence: every element in the structure has a unique predecessor and a unique successor. Examples of linear data structures are arrays, linked lists, stacks and queues.
Non-linear data structures- A data structure whose elements do not form a sequence; there is no unique predecessor or unique successor. Examples of non-linear data structures are trees and graphs.

Linear Data Structures
Arrays- An array is a list of a finite number of elements of the same data type. The individual elements of an array are accessed using an index or indices. Depending on the number of indices required to access an individual element, an array can be classified as:
- A one-dimensional array (or linear array), which requires only one index to access an individual element
- A two-dimensional array, which requires two indices to access an individual element

Arrays for which we need two or more indices are known as multidimensional arrays.
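As a minimal sketch of how the two indices of a two-dimensional array locate an element in memory (C stores 2-D arrays row-major; the function name is illustrative, not from the notes):

```c
#include <assert.h>

/* A one-dimensional array needs a single index; a two-dimensional
   array needs a row index and a column index.  In C a 2-D array is
   stored row-major, so table[row][col] lives at linear offset
   row * (number of columns) + col from the start of the array. */
int element_2d(const int *table, int ncols, int row, int col) {
    return table[row * ncols + col];
}
```

For example, with a 2x3 table, `element_2d(&t[0][0], 3, 1, 2)` returns the element in row 1, column 2.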

Linked List
- A linear collection of data elements called nodes
- Each node consists of two parts: a data part and a pointer (link) part
- Nodes are connected by pointer links
- The whole list is accessed via a pointer to the first node of the list
- Subsequent nodes are accessed via the link-pointer member of the current node
- The link pointer in the last node is set to NULL to mark the list's end
Use a linked list instead of an array when:
- You have an unpredictable number of data elements (dynamic memory allocation is possible)
- Your list needs to be sorted quickly

[Diagram of two data nodes linked together: the first node holds the data value 15 and a pointer to the second node, which holds the data value 10 and a NULL pointer (points to nothing). Each node shows its data member and pointer member.]

struct node {
    int data;
    struct node *nextPtr;
};

nextPtr points to an object of type node. It is referred to as a link; it ties one node to another node.

Types of linked lists:
Singly linked list- Begins with a pointer to the first node and terminates with a null pointer; it can only be traversed in one direction.
Circular, singly linked list- The pointer in the last node points back to the first node.
Doubly linked list- Has two start pointers, one to the first element and one to the last. Each node has a forward pointer and a backward pointer, allowing traversal both forwards and backwards.
Circular, doubly linked list- The forward pointer of the last node points to the first node, and the backward pointer of the first node points to the last node.
Header linked list- The linked list contains a header node that holds information about the complete linked list.
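A minimal sketch of a singly linked list in C, reusing the node layout shown above (the helper names push_front and list_length are illustrative):

```c
#include <stdlib.h>

/* Node layout from the notes: a data part and a link part */
struct node {
    int data;
    struct node *nextPtr;
};

/* Insert a new node at the front of the list; returns the new head */
struct node *push_front(struct node *head, int value) {
    struct node *n = malloc(sizeof *n);
    n->data = value;
    n->nextPtr = head;   /* the old head becomes the successor */
    return n;
}

/* Traverse the list, counting nodes until the NULL link marks the end */
int list_length(const struct node *head) {
    int count = 0;
    for (; head != NULL; head = head->nextPtr)
        count++;
    return count;
}
```

Starting from an empty list (head = NULL), two calls to push_front build the two-node list pictured above.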

Stacks- A stack, also called a last-in-first-out (LIFO) system, is a linear list in which insertions (push operations) and deletions (pop operations) can take place only at one end, called the top of the stack. It is similar to a pile of dishes. Implemented as a constrained version of a linked list, the bottom of the stack is indicated by a link member set to NULL. The two operations on a stack are:
push- Adds a new node to the top of the stack
pop- Removes the node from the top, stores the popped value, and returns true if the pop was successful
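The push and pop operations described above can be sketched in C as a constrained linked list (function and struct names are illustrative; pop returns 1 for true on success, 0 otherwise):

```c
#include <stdlib.h>

struct snode { int data; struct snode *next; };

/* push: add a new node at the top of the stack */
void push(struct snode **top, int value) {
    struct snode *n = malloc(sizeof *n);
    n->data = value;
    n->next = *top;
    *top = n;
}

/* pop: remove the top node, storing the popped value in *out;
   returns 1 if the pop was successful, 0 if the stack is empty */
int pop(struct snode **top, int *out) {
    if (*top == NULL) return 0;     /* bottom of stack reached */
    struct snode *n = *top;
    *out = n->data;
    *top = n->next;
    free(n);
    return 1;
}
```

Pushing 1 then 2 and popping twice yields 2 then 1, showing the last-in-first-out behavior.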

Queues- A queue, also called a first-in-first-out (FIFO) system, is a linear list in which insertions can take place only at one end of the list, called the rear of the list, and deletions can take place only at the other end, called the front of the list. It is similar to a supermarket checkout line, with insert operations at the rear and remove operations at the front.
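One common way to realize the insert-at-rear/delete-at-front discipline is a circular array, sketched below (the fixed capacity QSIZE and the function names are illustrative assumptions, not from the notes):

```c
#define QSIZE 8   /* illustrative fixed capacity */

struct queue { int items[QSIZE]; int front, rear, count; };

void q_init(struct queue *q) { q->front = 0; q->rear = 0; q->count = 0; }

/* insert at the rear; returns 0 if the queue is full */
int enqueue(struct queue *q, int value) {
    if (q->count == QSIZE) return 0;
    q->items[q->rear] = value;
    q->rear = (q->rear + 1) % QSIZE;   /* wrap around the array */
    q->count++;
    return 1;
}

/* delete from the front; returns 0 if the queue is empty */
int dequeue(struct queue *q, int *out) {
    if (q->count == 0) return 0;
    *out = q->items[q->front];
    q->front = (q->front + 1) % QSIZE;
    q->count--;
    return 1;
}
```

Enqueuing 1 then 2 and dequeuing twice yields 1 then 2: first in, first out.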

Tree- A tree is a non-linear data structure that represents a hierarchical relationship between various elements. The top node of a tree is called the root node, and each subsequent node is called a child node of its parent. Each node can have zero or more child nodes. A tree whose nodes can have any number of children is called a general tree. If there is a maximum number N of successors for any node in a tree, the tree is called an N-ary tree. In particular, a binary (2-ary) tree is a tree in which each node has either 0, 1 or 2 successors.
Binary trees:
- A binary tree can be empty, without any node, whereas a general tree cannot be empty
- All nodes contain two links, none, one or both of which may be NULL
- The root node is the first node in the tree; each link in the root node refers to a child
- A node with no children is called a leaf node

[Diagram of a small binary tree with nodes A, B, C and D.]

Binary search tree- A type of binary tree in which the values in the left subtree are less than the parent and the values in the right subtree are greater than the parent. This ordering facilitates duplicate elimination and fast searches: a maximum of about log n comparisons.

[Diagram of a binary search tree with root 47 and further nodes 7, 11, 17, 25, 31, 43, 44, 65, 68, 77 and 93.]
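A minimal sketch of binary search tree insertion and search in C (struct and function names are illustrative); it uses some of the key values from the diagram above:

```c
#include <stdlib.h>

struct tnode { int data; struct tnode *left, *right; };

/* insert: smaller keys go to the left subtree, larger keys to the
   right; equal keys are ignored, which is how a BST eliminates
   duplicates */
struct tnode *bst_insert(struct tnode *root, int value) {
    if (root == NULL) {
        struct tnode *n = malloc(sizeof *n);
        n->data = value;
        n->left = n->right = NULL;
        return n;
    }
    if (value < root->data)
        root->left = bst_insert(root->left, value);
    else if (value > root->data)
        root->right = bst_insert(root->right, value);
    return root;
}

/* search: each comparison discards one whole subtree, giving at most
   about log n comparisons in a balanced tree */
int bst_contains(const struct tnode *root, int value) {
    while (root != NULL) {
        if (value == root->data) return 1;
        root = (value < root->data) ? root->left : root->right;
    }
    return 0;
}
```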

Graph- A graph G is an ordered set (V, E), where V represents the set of elements, called nodes or vertices in graph terminology, and E represents the edges between these elements. This data structure is used to represent relationships between pairs of elements that are not necessarily hierarchical in nature. Usually there is no distinguished 'first' or 'last' node. A graph may or may not have cycles.

Algorithms
A finite set of steps that specify a sequence of operations to be carried out in order to solve a specific problem is called an algorithm.
Properties of algorithms:
1. Finiteness- The algorithm must terminate in a finite number of steps
2. Absence of ambiguity- Each step must be clear and unambiguous
3. Feasibility- Each step must be simple enough that it can be easily translated into the required language
4. Input- Zero or more values are externally supplied to the algorithm
5. Output- At least one value is produced

Conventions Used for Algorithms


Identifying number- Each algorithm is assigned an identification number.
Comments- Each step may contain a comment in brackets indicating the main purpose of the step.
Assignment statement- Assignment statements use colon-equal notation, e.g. Set MAX := DATA[1].
Input/Output- Data may be input from the user by means of a read statement: Read: variable names. Similarly, messages placed in quotation marks and data in variables may be output by means of a write statement: Write: messages or variable names.

Selection logic (conditional flow)
Single alternative:
  If condition, then: ...
  [End of If structure]
Double alternative:
  If condition, then: ...
  Else: ...
  [End of If structure]
Multiple alternatives:
  If condition1, then: ...
  Else if condition2, then: ...
  Else if condition3, then: ...
  Else: ...
  [End of If structure]

Iteration logic
Repeat-For loop:
  Repeat for K = R to S by T: ...
  [End of loop]
Repeat-While loop:
  Repeat while condition: ...
  [End of loop]

Algorithm complexity
An algorithm is a sequence of steps to solve a problem. There can be more than one algorithm to solve a particular problem, and some of these solutions may be more efficient than others. The efficiency of an algorithm is determined in terms of the utilization of two resources: execution time and memory. This efficiency analysis of an algorithm is called complexity analysis, and it is a very important and widely studied subject in computer science. Performance requirements are usually more critical than memory requirements, so in general algorithms are analyzed on the basis of performance, i.e. running-time efficiency. Specifically, complexity analysis is used to determine how the resource requirements of an algorithm grow in relation to the size of the input. The input can be any type of data. The analyst has to decide which property of the input should be measured; the best choice is the property that most significantly affects the efficiency factor we are trying to analyze. Most commonly, we measure one of the following:
- the number of additions, multiplications etc. (for numerical algorithms)
- the number of comparisons (for searching, sorting)
- the number of data moves (assignment statements)

Based on the type of resource variation studied, there are two types of complexity: time complexity and space complexity.
Space complexity- The space complexity of an algorithm is the amount of memory it needs to run to completion. The space needed by a program consists of the following components:
- Instruction space: space needed to store the executable version of the program; this is fixed.
- Data space: space needed to store all constants and variable values. It has two components: the space required by constants and simple variables, which is fixed, and the space needed by fixed-size structured variables (such as arrays and structures) together with dynamically allocated space, which usually varies.

- Environment stack space: space needed to store the information required to resume suspended functions. Each time a function is invoked, the following information is saved on the environment stack: the return address, i.e. where execution resumes after the called function completes, and the values of all local variables and formal parameters of the function being invoked.
Time complexity- The time complexity of an algorithm is the amount of time it needs to run to completion. To measure time complexity, key operations are identified in a program and counted until the program completes its execution. Each of the following operations is assumed to take time 1:
1. an assignment operation
2. a single I/O operation
3. a single Boolean operation or numeric comparison
4. a single arithmetic operation
5. a function return
6. an array index operation or pointer dereference

The running time of a selection statement (if, switch) is the time for the condition evaluation plus the maximum of the running times of the individual clauses in the selection. Loop execution time is the time for the loop body multiplied by the number of iterations, plus the time for the loop check and update operations, plus the time for the loop setup; always assume that the loop executes the maximum possible number of iterations. The running time of a function call is 1 for the setup, plus the time for any parameter calculations, plus the time required to execute the function body.

Expressing Space and time complexity: Big O notation


It is very difficult in practice to analyze how the time requirement of an algorithm varies with the size of the input. A better approach is to express the space/time complexity as a function f(n), where n is the input size for a given instance of the problem being solved: efficiency(algorithm A) = a function F of some property of A's input. We have to decide which property of the input to measure; the best choice is the property that most significantly affects the efficiency factor we are trying to analyze. For example, the time taken to sort a list is invariably a function of the length of the list, so the speed of the algorithm varies with the number of items n. The most important notation used for expressing this function f(n) is Big O notation.

Big O notation is a characterization scheme that allows one to describe properties of an algorithm, such as performance and/or memory requirements, in a general fashion. Big O notation uses the dominant term of the function and omits lower-order terms and the coefficient of the dominant term. Apart from n (the size of the input), the efficiency measure depends on three cases, which decide the number of operations to be performed:
- Best case: performance under ideal conditions
- Worst case: performance under the most unfavorable conditions
- Average case: performance under the most probable conditions
Big O notation analyzes each algorithm's performance in the worst case. It is the rate of increase of f(n) that is examined as a measure of efficiency.

Rate of growth: Big O notation


Suppose F is an algorithm and suppose n is the size of the input data. Clearly, the complexity f(n) of F increases as n increases. The rate of increase of f(n) is examined by comparing f(n) with standard functions such as log2 n, n, n log2 n, n^2, n^3 and 2^n. One way to compare f(n) with these standard functions is to use the functional O notation, defined as follows:
Definition: Suppose f(n) and g(n) are functions defined on the positive integers with the property that f(n) is bounded by some multiple of g(n) for almost all n. That is, suppose there exist a positive integer n0 and a positive number M such that for all n > n0 we have
|f(n)| <= M |g(n)|
Then we write f(n) = O(g(n)), which is read as "f(n) (the time taken for the number of operations) is of the order of g(n)". If g(n) = n, then f(n) is linearly proportional to n; for g(n) = n^2, f(n) is proportional to n^2. Thus if an array is sorted using an algorithm with g(n) = n^2, it will take about 100 times as long to sort an array that is 10 times the size of another array.

Based on Big O notation, algorithms can be categorized as:
- Constant time algorithms: O(1)
- Logarithmic time algorithms: O(log n)
- Linear time algorithms: O(n)
- Polynomial (e.g. quadratic) time algorithms: O(n^k)
- Exponential time algorithms: O(k^n)
It can be seen that the logarithmic function log n grows most slowly, k^n grows most rapidly, and the polynomial function n^k grows in between the two extremes. Big O notation concerns only the dominant term; low-order terms and constant coefficients are ignored. Thus if g(n) = n^2 + 2n, the complexity is written as O(n^2) rather than O(n^2 + 2n). Complexities of some well known searching and sorting algorithms:
Linear search: O(n)    Binary search: O(log n)
Bubble sort: O(n^2)    Merge sort: O(n log n)

RUNNING TIME
Let f(n) be the function that gives the time an algorithm takes for problem size n. The exact formula for f(n) is often difficult to obtain; we just need an approximation. What kind of function is f? Is it constant? linear? quadratic? Our main concern is large n; other factors may dominate for small n.

COMPLEXITY SAMPLES
Assume different algorithms solve the same problem with time functions T1(n), T2(n), T3(n), T4(n), and plot each for a range of n.
[Plot: time taken (10 to 10^8, logarithmic scale) versus input size n (1 to 10,000) for the growth rates log n, n, n log n, n^2, n^3 and 2^n, illustrating the rate of growth of f(n) with n.]

Other Asymptotic notations for Complexity Analysis


The big O notation defines an upper-bound function g(n) for f(n), which represents the time/space complexity of the algorithm on an input characteristic n. The other such notations are:
Omega notation (Ω)- Used when the function g(n) defines a lower bound for the function f(n). It is defined by |f(n)| >= M |g(n)| for all n beyond some n0.
Theta notation (Θ)- Used when the function f(n) is bounded both from above and below by the function g(n). It is defined by c1 |g(n)| <= |f(n)| <= c2 |g(n)|, where c1 and c2 are two constants.
Little-oh notation (o)- According to this notation, f(n) = o(g(n)) if and only if f(n) = O(g(n)) and f(n) ≠ Ω(g(n)).

Time-Space Tradeoff- The best algorithm to solve a given problem is one that requires less space in memory and takes less time to complete its execution. In practice, it is not always possible to achieve both objectives, and we may have to sacrifice one at the cost of the other; this is known as the time-space tradeoff among algorithms. Thus if space is our constraint, we choose an algorithm that requires less space at the cost of more execution time. On the other hand, if time is our constraint, as in real-time systems, we choose an algorithm that takes less time to complete execution at the cost of more space.

Various operations performed on Arrays


Traversal- Processing each element in the list
Searching- Finding the location of an element
Insertion- Adding a new element
Deletion- Removing an element
Sorting- Arranging the elements in some type of order
Merging- Combining two lists into a single list

Traversing Linear Arrays- Traversing a linear array means visiting each element of the array exactly once. Traversal is usually required to count the number of elements of the array or to perform some other operation on each element. The cost of traversing a linear array depends on the size of the array, so traversal is an O(n) operation for an array of size n.
Algorithm: (Traversing a Linear Array) Here LA is a linear array with lower bound LB and upper bound UB. This algorithm traverses LA, applying an operation PROCESS to each element of LA.
Step 1: [Initialize counter] Set K := LB
Step 2: Repeat steps 3 and 4 while K <= UB:
Step 3: [Visit element] Apply PROCESS to LA[K]
Step 4: [Increase counter] Set K := K + 1
[End of step 2 loop]
Step 5: Exit
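The traversal algorithm above can be sketched in C, with PROCESS passed as a function pointer (the names traverse and double_it are illustrative):

```c
/* Apply an operation to each element LA[LB..UB] exactly once: O(n) */
void traverse(int la[], int lb, int ub, void (*process)(int *)) {
    for (int k = lb; k <= ub; k++)
        process(&la[k]);   /* visit element, then increase counter */
}

/* a sample PROCESS operation: double the element in place */
void double_it(int *x) { *x *= 2; }
```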

Insertion and Deletion in an Array- Insertion refers to the operation of adding an element to an array, and deletion refers to removing an element from an array.
Insertion- Insertion can be done at various positions; the main cases are insertion at the start, insertion at the end, insertion in the middle, and insertion at any other position. Inserting an element at the end of the array is easy, provided the memory space allocated for the array is large enough to accommodate the additional element. For insertion at any other position, the elements of the array have to be shifted downwards from the position where the insertion is to be done; the new element is then inserted in the vacated position.

Algorithm: (Inserting into a Linear Array) INSERT(LA, N, K, ITEM) Here LA is a linear array with N elements and K is a positive integer such that K <= N. This algorithm inserts an element ITEM into the Kth position in LA.
Step 1: [Initialize counter] Set J := N
Step 2: Repeat steps 3 and 4 while J >= K:
Step 3: [Move Jth element downward] Set LA[J+1] := LA[J]
Step 4: [Decrease counter] Set J := J - 1
[End of step 2 loop]
Step 5: [Insert element] Set LA[K] := ITEM
Step 6: [Reset N] Set N := N + 1
Step 7: Return

Deletion refers to the operation of removing an element from an existing list of elements. Like insertion, deletion can be done easily at the end of the list. However, to delete an element from any other location, the following elements have to be moved upwards to fill the location vacated by the removed element.

Algorithm: (Deleting from a Linear Array) DELETE(LA, N, K, ITEM) Here LA is a linear array with N elements and K is a positive integer such that K <= N. This algorithm deletes the element ITEM at the Kth position in LA.
Step 1: [Save element] Set ITEM := LA[K]
Step 2: Repeat for J = K to N-1: [Move (J+1)st element upward] Set LA[J] := LA[J+1]
[End of step 2 loop]
Step 3: [Reset N] Set N := N - 1
Step 4: Return
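The INSERT and DELETE algorithms above can be sketched in C. Following the notes, positions are 1-based, so la[0] is left unused (function names are illustrative; insert_at assumes the array has spare capacity):

```c
/* Insert item at position k (1-based) in la[1..*n], shifting
   elements downward from position k */
void insert_at(int la[], int *n, int k, int item) {
    for (int j = *n; j >= k; j--)    /* move Jth element downward */
        la[j + 1] = la[j];
    la[k] = item;                    /* insert in vacated position */
    (*n)++;
}

/* Delete the element at position k, shifting later elements upward;
   returns the deleted item */
int delete_at(int la[], int *n, int k) {
    int item = la[k];
    for (int j = k; j <= *n - 1; j++)   /* move (J+1)st element upward */
        la[j] = la[j + 1];
    (*n)--;
    return item;
}
```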

Analysis of Insertion and deletion operation


The best possible case for the insertion operation is when the item is inserted at the last position; no movement of elements is required. The worst case occurs when the element has to be inserted at the beginning of the list: we have to move all the elements down the list, so the loop executes n times, each iteration moving one element down. Thus the complexity of insertion is O(n), i.e. linear time. The best case for deletion occurs when the element to be deleted is the last element of the array; no element is moved. The worst case occurs when the element is deleted from the first position: all n-1 remaining elements are moved up, the loop executing n-1 times, each time moving one element up. Thus the complexity of deletion is also O(n), i.e. linear time.

Algorithm: (Linear Search) LINEAR(DATA, N, ITEM, LOC) Here DATA is a linear array with N elements and ITEM is a given item of information. This algorithm finds the location LOC of ITEM in DATA, or sets LOC := 0 if the search is unsuccessful.
Step 1: [Initialize counter] Set LOC := 1
Step 2: [Search for ITEM] Repeat while LOC <= N:
  If DATA[LOC] = ITEM, then: Write: 'Element found at location', LOC, and Exit
  [End of If structure]
  Set LOC := LOC + 1
[End of loop]
Step 3: [Unsuccessful] If LOC = N + 1, then: Set LOC := 0
Step 4: Return
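A minimal C sketch of the linear search algorithm above, keeping the notes' 1-based positions and the convention that 0 means unsuccessful (the function name is illustrative):

```c
/* Linear search: returns the 1-based location of item in data[1..n],
   or 0 if the search is unsuccessful */
int linear_search(const int data[], int n, int item) {
    for (int loc = 1; loc <= n; loc++)
        if (data[loc] == item)
            return loc;       /* found: report the location */
    return 0;                 /* unsuccessful */
}
```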

Binary Search- Binary search is a method of searching in which the list is repeatedly divided into two halves. The main requirement of the binary search method is that the array must be in sorted order. If the elements of the list are in ascending order, then if the desired element is less than the middle element of the list, it can never be present in the second half of the array, and if the desired element is greater than the middle element, it can never be present in the first half. Thus attention is given only to the relevant half of the list.
Algorithm: BINARY(DATA, LB, UB, ITEM, LOC) Here DATA is a sorted array with lower bound LB and upper bound UB, and ITEM is a given item of information. The variables BEG, END and MID denote the beginning, end and middle locations of a segment of elements of DATA. This algorithm finds the location LOC of ITEM in DATA, or sets LOC := NULL.
Step 1: [Initialize segment variables] Set BEG := LB, END := UB and MID := INT((BEG+END)/2)

Step 2: Repeat while BEG <= END and DATA[MID] ≠ ITEM:
  If ITEM < DATA[MID], then: Set END := MID - 1
  Else: Set BEG := MID + 1
  [End of If structure]
  Set MID := INT((BEG+END)/2)
[End of step 2 loop]
Step 3: If DATA[MID] = ITEM, then: Set LOC := MID
Else: Set LOC := NULL
[End of If structure]
Step 4: Return
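The binary search algorithm above can be sketched in C; 0 stands in for the NULL location, and 1-based positions are kept (the function name is illustrative):

```c
/* Binary search on sorted data[lb..ub]; returns the location of item
   or 0 (the notes' NULL) if it is not present */
int binary_search(const int data[], int lb, int ub, int item) {
    int beg = lb, end = ub;
    while (beg <= end) {
        int mid = (beg + end) / 2;
        if (data[mid] == item)
            return mid;
        if (item < data[mid])
            end = mid - 1;     /* item cannot be in the second half */
        else
            beg = mid + 1;     /* item cannot be in the first half */
    }
    return 0;   /* unsuccessful */
}
```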

Analysis of Linear Search and Binary Search
LINEAR SEARCH: In the best possible case, the item occurs at the first position, and the search terminates successfully with just one comparison. The worst case occurs when the item is at the last position or is not in the array at all: in the former case the search terminates successfully with n comparisons, in the latter it terminates in failure with n comparisons. Thus in the worst case, linear search is an O(n) operation. In the average case, the number of comparisons required to find the location of the item is approximately half the number of elements in the array.
BINARY SEARCH: In each iteration or recursive call, the search is reduced to one half of the array. Therefore, for n elements, there will be at most log2 n iterations or recursive calls. Thus the complexity of binary search is O(log2 n). This complexity is the same irrespective of the position of the element, even if it is not present in the array.

Sorting Techniques in Array


Bubble sort method- The bubble sort method requires n-1 passes to sort an array, where n is the size of the array. In each pass, every element a[i] is compared with a[i+1], for i from 0 to n-k-1 where k is the pass number, and if a[i] > a[i+1] they are swapped. This causes the largest remaining element to move, or 'bubble', to its correct position at the end of the unsorted portion.

Algorithm: BUBBLE(DATA, N) Here DATA is an array with N elements. This algorithm sorts the elements in DATA.
Step 1: Repeat steps 2 and 3 for K = 1 to N-1:
Step 2: [Initialize pass pointer PTR] Set PTR := 1
Step 3: [Execute pass] Repeat while PTR <= N-K:
  If DATA[PTR] > DATA[PTR+1], then: Interchange DATA[PTR] and DATA[PTR+1]
  [End of If structure]
  Set PTR := PTR + 1
[End of step 3 loop]
[End of step 1 loop]
Step 4: Return
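The BUBBLE algorithm above can be sketched in C with the notes' 1-based positions (the function name is illustrative):

```c
/* Bubble sort on data[1..n]: after pass K, the K largest elements
   have bubbled to the end, so each pass scans one element fewer */
void bubble_sort(int data[], int n) {
    for (int k = 1; k <= n - 1; k++) {
        for (int ptr = 1; ptr <= n - k; ptr++) {
            if (data[ptr] > data[ptr + 1]) {   /* interchange */
                int t = data[ptr];
                data[ptr] = data[ptr + 1];
                data[ptr + 1] = t;
            }
        }
    }
}
```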

Complexity analysis of the bubble sort method: Traditionally, the time for sorting an array is measured in terms of the number of comparisons. The number f(n) of comparisons in bubble sort is easily computed: there are n-1 comparisons during the first pass, n-2 comparisons in the second pass, and so on. Thus
f(n) = (n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = n^2/2 - n/2 = O(n^2)
Thus the time required to sort an array using bubble sort is proportional to n^2, where n is the number of input items, and the complexity of bubble sort is O(n^2).

Selection sort method: Selection sort requires n-1 passes to sort an array. In the first pass, find the smallest of the elements a[0], a[1], ..., a[n-1] and swap it with the first element a[0]. In the second pass, find the smallest of a[1], a[2], ..., a[n-1] and swap it with a[1], and so on.
Algorithm: (Selection sort) SELECTION(A, N) This algorithm sorts an array A with N elements.
Step 1: Repeat steps 2 and 3 for K = 1 to N-1:
Step 2: Set MIN := A[K] and LOC := K
Step 3: Repeat for J = K+1 to N:
  If MIN > A[J], then: Set MIN := A[J] and LOC := J
  [End of If structure]
[End of step 3 loop]

Step 4: [Interchange A[K] and A[LOC]] Set TEMP := A[K], A[K] := A[LOC] and A[LOC] := TEMP
[End of step 1 loop]
Step 5: Return
Complexity of the selection sort method: The number f(n) of comparisons in the selection sort algorithm is independent of the original order of the elements. There are n-1 comparisons during pass 1 to find the smallest element, n-2 comparisons during pass 2 to find the second smallest element, and so on. Accordingly,
f(n) = (n-1) + (n-2) + ... + 2 + 1 = n(n-1)/2 = O(n^2)
f(n) is O(n^2) for both the worst case and the average case.
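The SELECTION algorithm above can be sketched in C with 1-based positions (the function name is illustrative):

```c
/* Selection sort on a[1..n]: pass K finds the smallest element in
   a[K..n] and swaps it into position K */
void selection_sort(int a[], int n) {
    for (int k = 1; k <= n - 1; k++) {
        int loc = k;                     /* location of current minimum */
        for (int j = k + 1; j <= n; j++)
            if (a[j] < a[loc])
                loc = j;
        int temp = a[k];                 /* interchange a[k] and a[loc] */
        a[k] = a[loc];
        a[loc] = temp;
    }
}
```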

Insertion sort- In this sorting procedure, we pick up a value and insert it at the appropriate place in the sorted sublist: during the kth iteration, the element a[k] is inserted in its proper place in the sorted subarray a[1], a[2], ..., a[k-1]. This is accomplished by comparing a[k] with a[k-1], a[k-2], and so on, until the first element a[j] such that a[j] <= a[k] is found. Then the elements a[k-1], a[k-2], ..., a[j+1] are each moved one position up, and a[k] is inserted in the (j+1)th position in the array.

Algorithm: (Insertion sort) INSERTIONSORT(A, N) This algorithm sorts the array A with N elements. A[0] serves as a sentinel smaller than any key.
Step 1: Set A[0] := -∞
Step 2: Repeat steps 3 to 5 for K = 2 to N:
Step 3: Set TEMP := A[K] and PTR := K-1
Step 4: Repeat while TEMP < A[PTR]:
  [Move element forward] Set A[PTR+1] := A[PTR]
  Set PTR := PTR - 1
[End of step 4 loop]
Step 5: [Insert element in proper place] Set A[PTR+1] := TEMP
[End of step 2 loop]
Step 6: Return
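The INSERTIONSORT algorithm above can be sketched in C; instead of storing a -∞ sentinel in a[0], the loop simply checks that the pointer has not run off the front (the function name is illustrative):

```c
/* Insertion sort on a[1..n]: element a[k] is inserted into its proper
   place in the already-sorted sublist a[1..k-1].  The ptr >= 1 check
   replaces the -infinity sentinel in a[0] used by the algorithm above. */
void insertion_sort(int a[], int n) {
    for (int k = 2; k <= n; k++) {
        int temp = a[k];
        int ptr = k - 1;
        while (ptr >= 1 && temp < a[ptr]) {
            a[ptr + 1] = a[ptr];   /* move element forward */
            ptr--;
        }
        a[ptr + 1] = temp;         /* insert in proper place */
    }
}
```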

Complexity of insertion sort: The number f(n) of comparisons in the insertion sort algorithm is easily computed. The worst case occurs when the array A is in reverse order, so the inner loop uses the maximum number of comparisons:
f(n) = 1 + 2 + ... + (n-1) = n(n-1)/2 = O(n^2)
In the average case, there are approximately half as many comparisons in the inner loop:
f(n) = 1/2 + 2/2 + ... + (n-1)/2 = n(n-1)/4 = O(n^2)

Quick Sort- Quick sort is a sorting technique that uses the idea of divide and conquer. The algorithm finds an element, called the pivot, that partitions the array into two parts in such a way that the elements in the left subarray are less than the pivot and the elements in the right subarray are greater than the pivot. These two subarrays are then sorted separately in the same way by dividing them further. The main task in quick sort is to find the element that partitions the array into two halves and to place it at its proper location in the array.

Algorithm: QUICKSORT(A, N) This algorithm sorts an array A with N elements. LOWER and UPPER are two stacks maintained for storing the lower and upper indexes of the sublists still to be processed.
Step 1: [Initialize] Set TOP := NULL
Step 2: [Push boundary values of A onto the stacks when A has 2 or more elements] If N > 1, then: Set TOP := TOP + 1, LOWER[1] := 1 and UPPER[1] := N
Step 3: Repeat while TOP ≠ NULL:
  [Pop sublist from stacks] Set BEG := LOWER[TOP], END := UPPER[TOP] and TOP := TOP - 1
Step 4: Call QUICK(A, N, BEG, END, LOC)
Step 5: [Push left sublist onto stacks when it has 2 or more elements] If BEG < LOC-1, then: Set TOP := TOP + 1, LOWER[TOP] := BEG and UPPER[TOP] := LOC-1
[End of If structure]

Step 6: [Push right sublist onto stacks when it has 2 or more elements] If LOC+1 < END, then: Set TOP := TOP + 1, LOWER[TOP] := LOC+1 and UPPER[TOP] := END
[End of If structure]
[End of step 3 loop]
Step 7: Return

Algorithm: QUICK(A, N, BEG, END, LOC) Here A is an array with N elements. Parameters BEG and END contain the boundary values of the sublist of A to which this procedure applies. LOC keeps track of the position of the first element A[BEG] of the sublist during the procedure. The local variables LEFT and RIGHT contain the boundary values of the portion of the list that has not been scanned.
Step 1: [Initialize] Set LEFT := BEG, RIGHT := END and LOC := BEG
Step 2: [Scan from right to left]
(a) Repeat while A[LOC] <= A[RIGHT] and LOC ≠ RIGHT: Set RIGHT := RIGHT - 1
[End of loop]
(b) If LOC = RIGHT, then: Return
(c) If A[LOC] > A[RIGHT], then:
  (i) [Interchange A[LOC] and A[RIGHT]] Set TEMP := A[LOC], A[LOC] := A[RIGHT] and A[RIGHT] := TEMP
  (ii) Set LOC := RIGHT
  (iii) Go to step 3
[End of If structure]
Step 3: [Scan from left to right]
(a) Repeat while A[LEFT] <= A[LOC] and LEFT ≠ LOC: Set LEFT := LEFT + 1
[End of loop]
(b) If LOC = LEFT, then: Return
(c) If A[LEFT] > A[LOC], then:
  (i) [Interchange A[LEFT] and A[LOC]] Set TEMP := A[LOC], A[LOC] := A[LEFT] and A[LEFT] := TEMP
  (ii) Set LOC := LEFT
  (iii) Go to step 2
[End of If structure]

Algorithm: Quicksort(DATA, FIRST, LAST) This recursive version sorts the array DATA between the indexes FIRST and LAST.
Step 1: [Initialize] Set low := FIRST, high := LAST and pivot := DATA[(low+high)/2] [middle element of the sublist]
Step 2: Repeat while low <= high:
Step 3: Repeat while DATA[low] < pivot: Set low := low + 1
[End of step 3 loop]
Step 4: Repeat while DATA[high] > pivot: Set high := high - 1
[End of step 4 loop]
Step 5: If low <= high, then:
  Set temp := DATA[low]
  Set DATA[low] := DATA[high]
  Set DATA[high] := temp
  Set low := low + 1
  Set high := high - 1
[End of If structure]
[End of step 2 loop]

Step 6: If FIRST < high, then: Call Quicksort(DATA, FIRST, high)
[End of If structure]
Step 7: If low < LAST, then: Call Quicksort(DATA, low, LAST)
[End of If structure]
Step 8: Exit
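The recursive Quicksort above, with the middle element as pivot, can be sketched in C using 0-based indices (the function names are illustrative):

```c
static void swap(int *x, int *y) { int t = *x; *x = *y; *y = t; }

/* Recursive quicksort matching the second algorithm above: take the
   middle element as pivot, partition data[first..last] around it,
   then sort the two sublists recursively */
void quicksort(int data[], int first, int last) {
    int low = first, high = last;
    int pivot = data[(low + high) / 2];
    while (low <= high) {
        while (data[low] < pivot)  low++;    /* scan from the left */
        while (data[high] > pivot) high--;   /* scan from the right */
        if (low <= high)
            swap(&data[low++], &data[high--]);
    }
    if (first < high) quicksort(data, first, high);  /* left sublist */
    if (low < last)   quicksort(data, low, last);    /* right sublist */
}
```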

Complexity of the quick sort algorithm- The running time of a sorting algorithm is determined by the number f(n) of comparisons required to sort n elements. The quick sort algorithm has a worst-case running time of order O(n^2) but an average-case running time of O(n log n). The worst case occurs when the list is already sorted: the first element requires n comparisons to recognize that it remains in the first position; the first sublist is then empty while the second sublist has n-1 elements, so the second element requires n-1 comparisons to recognize that it remains in the second position; and so on. Thus
f(n) = n + (n-1) + ... + 2 + 1 = n(n+1)/2 = n^2/2 + O(n) = O(n^2)
The n log n complexity of the average case comes from the fact that each reduction step of the algorithm produces two sublists. Accordingly:
- Reducing the initial list places 1 element and produces two sublists
- Reducing two sublists places 2 elements and produces four sublists
- Reducing four sublists places 4 elements and produces eight sublists
- Reducing eight sublists places 8 elements and produces sixteen sublists

Thus the reduction at the kth level finds the location of 2^(k-1) elements, so there are approximately log2 n levels of reduction steps. Furthermore, each level uses at most n comparisons. Thus f(n) = O(n log n).

Merging of Arrays
Suppose A is a sorted list with R elements and B is a sorted list with S elements. The operation that combines the elements of A and B into a single sorted list C with N = R + S elements is called merging.

Algorithm: MERGING(A, R, B, S, C) Here A and B are sorted arrays with R and S elements respectively. This algorithm merges A and B into an array C with N = R + S elements.
Step 1: Set NA := 1, NB := 1 and NC := 1
Step 2: Repeat while NA <= R and NB <= S:
  If A[NA] <= B[NB], then:
    Set C[NC] := A[NA]
    Set NA := NA + 1
  Else:
    Set C[NC] := B[NB]
    Set NB := NB + 1
  [End of If structure]
  Set NC := NC + 1
[End of loop]

Step 3: If NA > R, then:
  Repeat while NB <= S:
    Set C[NC] := B[NB]
    Set NB := NB + 1
    Set NC := NC + 1
  [End of loop]
Else:
  Repeat while NA <= R:
    Set C[NC] := A[NA]
    Set NC := NC + 1
    Set NA := NA + 1
  [End of loop]
[End of If structure]
Step 4: Return

Complexity of merging: The input consists of the total number n = r + s of elements in A and B. Each comparison assigns an element to the array C, which eventually has n elements. Accordingly, the number f(n) of comparisons cannot exceed n: f(n) <= n = O(n).
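The MERGING algorithm above can be sketched in C using 0-based indices (the function name is illustrative; the caller supplies c with room for r + s elements):

```c
/* Merge sorted a[0..r-1] and b[0..s-1] into c[0..r+s-1]: repeatedly
   copy the smaller front element, then copy whatever remains */
void merge(const int a[], int r, const int b[], int s, int c[]) {
    int na = 0, nb = 0, nc = 0;
    while (na < r && nb < s)
        c[nc++] = (a[na] <= b[nb]) ? a[na++] : b[nb++];
    while (na < r) c[nc++] = a[na++];   /* leftovers from a */
    while (nb < s) c[nc++] = b[nb++];   /* leftovers from b */
}
```

Each element copied into c costs at most one comparison, which is the O(n) bound stated above.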

Merge Sort- Merge sort is a sorting technique in which the array is repeatedly subdivided into smaller subarrays until each subarray has only one or two elements. The small subarrays are sorted and then recursively combined (merged) to form the complete sorted array. It is one of the most efficient sorting algorithms.

Algorithm: Mergesort(A, BEG, END)
Step 1: If BEG < END, then:
            Set MID := (BEG + END)/2
            Call Mergesort(A, BEG, MID)
            Call Mergesort(A, MID + 1, END)
            Call Merging(A, BEG, MID, MID + 1, END)
        [End of If structure]
Step 2: Return

Algorithm: Merging(A, LB, LR, RB, RR) This algorithm merges the elements of the two sub-arrays in sorted order. LB and RB are the lower bounds of the left and right sub-arrays and LR and RR are their upper bounds respectively.
Step 1: Set NA := LB, NB := RB and NC := LB
Step 2: Repeat while NA ≤ LR and NB ≤ RR:
            If A[NA] < A[NB], then: Set C[NC] := A[NA] and NA := NA + 1
            Else: Set C[NC] := A[NB] and NB := NB + 1
            [End of If structure]
            Set NC := NC + 1
        [End of Loop]

Step 3: If NA > LR, then:
            Repeat while NB ≤ RR: Set C[NC] := A[NB], NB := NB + 1 and NC := NC + 1
            [End of Loop]
        Else:
            Repeat while NA ≤ LR: Set C[NC] := A[NA], NC := NC + 1 and NA := NA + 1
            [End of Loop]
        [End of If structure]
Step 4: [Update the original array from the temporary array C] Repeat for K := LB to RR: Set A[K] := C[K] [End of Loop]
Step 5: Return

Complexity of the Merge-sort Algorithm- In the merge sort algorithm, the major work is done in the merge procedure, which is an O(n) operation. The merge procedure is called from the merge sort procedure after the array has been divided into two halves and each half has been sorted. In each of the recursive calls, one for the left half and one for the right half, the array is divided again, giving four segments; at each level the number of segments doubles, so there are log2 n levels of division. Moreover, each level merges a total of n elements and thus requires at most n comparisons. Hence, for both the worst case and the average case, f(n) = n log2 n = O(n log2 n).

The only disadvantage of merge sort is that it uses an extra temporary array, of the same size as the input array, to merge the two halves. The elements of the temporary array are copied back to the original array after each merge.
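The Mergesort/Merging pseudocode above can be sketched in C as follows. The helper names msort and merge_halves are illustrative; the single temporary buffer c is the extra O(n) array mentioned above, allocated once and reused by every merge.

```c
#include <stdlib.h>
#include <string.h>

/* Merge a[lb..lr] and a[rb..rr] (rb == lr + 1) through temp array c,
   then copy the merged run back, as in Steps 1-5 of Merging. */
static void merge_halves(int a[], int lb, int lr, int rb, int rr, int c[]) {
    int na = lb, nb = rb, nc = lb;
    while (na <= lr && nb <= rr)
        c[nc++] = (a[na] <= a[nb]) ? a[na++] : a[nb++];
    while (na <= lr) c[nc++] = a[na++];
    while (nb <= rr) c[nc++] = a[nb++];
    memcpy(&a[lb], &c[lb], (size_t)(rr - lb + 1) * sizeof(int)); /* copy back */
}

static void msort(int a[], int beg, int end, int c[]) {
    if (beg < end) {
        int mid = (beg + end) / 2;
        msort(a, beg, mid, c);          /* sort left half  */
        msort(a, mid + 1, end, c);      /* sort right half */
        merge_halves(a, beg, mid, mid + 1, end, c);
    }
}

void merge_sort(int a[], int n) {
    int *c = malloc((size_t)n * sizeof(int));  /* the extra O(n) buffer */
    if (!c || n <= 0) { free(c); return; }
    msort(a, 0, n - 1, c);
    free(c);
}
```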

Radix sort: Radix sort is a technique usually used when large lists of names are to be sorted alphabetically. Using this technique, one can classify the list of names into 26 groups. The list is first sorted on the first letter of each name, i.e. the names are arranged in 26 classes, where the first class consists of names that begin with A, the second class consists of names that begin with B, and so on. During the second pass each class is alphabetized according to the second letter of the name, and so on. To sort decimal numbers, where the radix or base is 10, we need 10 buckets, numbered 0, 1, 2, 3, 4, 5, 6, 7, 8, 9. Unlike names, decimal numbers are sorted from right to left, i.e. first on the units digit, then on the tens digit, then on the hundreds digit, and so on.

Algorithm: Radix(A, N) This algorithm sorts an array A with N elements.
Step 1: Find the largest number in the array
Step 2: Set digitcount := number of digits of the largest number
Step 3: Repeat Step 4 for pass = 1 to digitcount: Initialize the buckets
Step 4: Repeat for i = 1 to N: Set digit := digit number pass of A[i] Put A[i] in bucket number digit Increment the bucket count for bucket numbered digit [End of Step 4 Loop] Collect all the numbers from the buckets in order [End of Step 3 Loop]
Step 5: Return

Algorithm: Radix(A, N) This expanded algorithm sorts an array A with N elements.
Step 1: [Find the largest number] Set largest := A[0] Repeat for k = 1 to N-1: If A[k] > largest, then: Set largest := A[k] [End of Loop]
Step 2: [Count the digits of the largest number] Set digitcount := 0 Repeat while largest > 0: Set digitcount := digitcount + 1 and largest := largest/10 [End of Loop]
Step 3: [Execute the passes] Set divisor := 1 Repeat Steps 4 to 7 for pass = 1 to digitcount:
Step 4: [Initialize the buckets] Repeat for k = 0 to 9: Set bucketcount[k] := 0 [End of Loop]
Step 5: [Distribute] Repeat for i = 0 to N-1: Set digit := (A[i]/divisor) MOD 10 Set bucket[digit][bucketcount[digit]] := A[i] and bucketcount[digit] := bucketcount[digit] + 1 [End of Loop]
Step 6: [Collect] Set i := 0 Repeat Step 7 for k = 0 to 9:
Step 7: Repeat for J = 0 to bucketcount[k]-1: Set A[i] := bucket[k][J] and i := i + 1 [End of Step 7 Loop] [End of Step 6 Loop] Set divisor := divisor*10 [End of Step 3 Loop]
Step 8: Return
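The expanded algorithm can be sketched in C as below. This is a least-significant-digit radix sort on non-negative integers with base 10; MAX_N and the helper name largest_of are illustrative choices, not part of the text.

```c
#define MAX_N 100   /* illustrative per-bucket capacity */

static int largest_of(const int a[], int n) {
    int big = a[0];
    for (int i = 1; i < n; i++)
        if (a[i] > big) big = a[i];
    return big;
}

/* LSD radix sort, base 10: one distribute-and-collect pass per digit
   of the largest element, exactly as in Steps 3-7 above. */
void radix_sort(int a[], int n) {
    int bucket[10][MAX_N], count[10];
    int divisor = 1;
    for (int big = largest_of(a, n); big > 0; big /= 10) {
        for (int k = 0; k < 10; k++) count[k] = 0;      /* init buckets */
        for (int i = 0; i < n; i++) {                   /* distribute    */
            int d = (a[i] / divisor) % 10;              /* current digit */
            bucket[d][count[d]++] = a[i];
        }
        int i = 0;                                      /* collect in order */
        for (int k = 0; k < 10; k++)
            for (int j = 0; j < count[k]; j++)
                a[i++] = bucket[k][j];
        divisor *= 10;
    }
}
```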

Radix Sort Demo: [figure: the values 13, 30, 28, 56, 36 are distributed into radix bins 0-9, first on the units digit and then on the tens digit; collecting the bins in order after the final pass yields the sorted sequence 13, 28, 30, 36, 56.]

Hashing- The search time of each algorithm discussed so far depends on the number n of elements in the collection S of data. Is it possible to design a search of O(1), that is, one that has a constant search time, no matter where the element is located in the list? In theory, this goal is not an impossible dream. Let's look at an example. We have a list of employees of a fairly small company. Each of the 100 employees has an ID number in the range 0-99. If we store the employee records in an array, then each employee's ID number can serve as the index of the array element where that employee's record is stored.

In this case, once we know the ID number of the employee, we can directly access his record through the array index. There is a one-to-one correspondence between the element's key and the array index. However, in practice, this perfect relationship is not easy to establish or maintain. For example: the same company might use employees' five-digit ID numbers as the primary key. In this case, key values run from 00000 to 99999. If we want to use the same technique as above, we need to set up an array of size 100,000, of which only 100 elements will be used:

Obviously it is very impractical to waste that much storage just to ensure that each employee's record is in a unique and predictable location. But what if we keep the array size down to the size that we will actually be using (100 elements) and use just the last two digits of the key to identify each employee? For example, the employee with key 54876 will be stored in the array element with index 76, and the employee with key 98759 in the element with index 59. Note that the elements are no longer stored in order of key value as they were in the previous example: the record of the employee with key 54876 precedes the record of the employee with key 98759, even though 54876 is the smaller key. Moreover, we need a way to convert a five-digit key into a two-digit array index, that is, some function that performs the transformation. Using this technique, we call the array a Hash Table and the function a Hash Function.

Hash Table is a data structure in which keys are mapped to array positions by a hash function. This table can be searched for an item in O(1) time using a hash function to form an address from the key. The easiest way to conceptualize a hash table is to think of it as an array. When a program stores an element in the array, the element's key is transformed by a hash function that produces the array index for that element. Hash Function is a function which, when applied to the key, produces an integer which can be used as an address in a hash table. The intent is that elements will be relatively randomly and uniformly distributed. In the example above, the code for the hash function would look like this (assuming TABLE_SIZE was defined as 100): int hash_function(int id_num) { return (id_num % TABLE_SIZE); }

Then, for example, to find an element in the table, a program applies the hash function to the element's key, producing the array index at which the element is stored. int find(int key) { int index; index = hash_function(key); // finding the element }

Collision
Collision is a condition resulting when two or more keys produce the same hash location. For example, we have already stored several employees records, and our table looks something like this:

Suppose our next employee's key is 57879. Then our hash function will produce array index 79. But the array element with index 79 already has a value. As the array begins to fill, keys will inevitably produce the same array index when transformed by the hash function. When more than one element tries to occupy the same array position, we have a collision. Now another question arises: can't we devise a hash function that never produces a collision? The answer is that in some cases it is possible. This is known as a perfect hash function. Perfect Hash Function is a function which, when applied to all the members of the set of items to be stored in a hash table, produces a unique set of integers within some suitable range. Such a function produces no collisions. A good hash function minimizes collisions by spreading the elements uniformly throughout the array.

There is no magic formula for the creation of a hash function; it can be any mathematical transformation that produces a relatively random and unique distribution of values within the address space of the storage. Although the development of a hash function is trial and error, some popular hash functions frequently used are: Division method- Choose a number m larger than the number n of keys in K. The number chosen is usually a prime number. The hash function H is defined by: H(k) = k (mod m) or H(k) = k (mod m) + 1. The second form is used when we want hash addresses to range from 1 to m rather than 0 to m-1. Midsquare method- The key k is squared. Then the hash function is defined as H(k) = L, where L is obtained by deleting digits from both ends of k².

Folding method- The key k is partitioned into parts k1, k2, ..., kr, where each part except possibly the last has the same number of digits as the required address. Then the parts are added together, ignoring the final carry. That is, H(k) = k1 + k2 + ... + kr, where the leading carry digits, if any, are ignored.

Example: Consider a company each of whose 68 employees is assigned a unique 4-digit employee number. Suppose L consists of 100 two-digit addresses: 00, 01, 02, ..., 99. We apply the hash functions above to the employee numbers 3205, 7148 and 2345. (a) Division method- Choose a prime number m close to 99, such as m = 97. Then: H(3205) = 4, H(7148) = 67, H(2345) = 17. For addresses starting from 01: H(3205) = 4 + 1 = 5, H(7148) = 67 + 1 = 68, H(2345) = 17 + 1 = 18. (b) Midsquare method- The following calculations are performed: k: 3205, 7148, 2345; k²: 10 272 025, 51 093 904, 5 499 025; H(k): 72, 93, 99 (the middle two digits of k²).

(c) Folding method- Chopping the key k into two parts and adding yields the following hash addresses: H(3205) = 32 + 05 = 37, H(7148) = 71 + 48 = 19 (the leading carry 1 of 119 is ignored), H(2345) = 23 + 45 = 68.
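The division and folding methods from the worked example translate into one-line C functions. This is a sketch for the 4-digit keys above; the function names are illustrative, and the midsquare method is omitted because which middle digits of k² are kept is a design choice.

```c
/* Division method: H(k) = k mod m, with m a prime close to the
   table size (m = 97 in the example above). */
int hash_division(int k, int m) {
    return k % m;
}

/* Folding method for a 4-digit key: split into two 2-digit parts,
   add them, and drop the leading carry by reducing mod 100. */
int hash_folding(int k) {
    return (k / 100 + k % 100) % 100;
}
```

Running these on the example keys reproduces the hand-computed addresses: hash_division(3205, 97) gives 4 and hash_folding(7148) gives 19.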

Collision Resolution
The two ways of dealing with collisions are rehashing and chaining. The particular way that one chooses depends upon many factors. One important factor is the ratio of the number n of keys in K to the number m of hash addresses in the hash table. This ratio, λ = n/m, is called the load factor. The efficiency of a hash function with a collision resolution procedure is measured by the average number of probes needed to find the location of the record with a given key k.

Rehashing - Resolving a collision by computing a new hash location (index) in the array. Linear probing is one very simple type of rehash: we look for an empty slot, incrementing the offset by 1 each time, exploring the sequence of locations index, index + 1, index + 2, index + 3, ... until an empty one is found. In the case of linear probing the code for the rehash method will look like this: int rehash(int index) { int new_index; new_index = (index + 1) % TABLE_SIZE; return new_index; }
An alternative to linear probing is quadratic probing.

Quadratic Probing is a different way of rehashing. We are still looking for an empty location; however, instead of incrementing the offset by 1 every time, as in linear probing, we increment it by 1, 3, 5, 7, ..., exploring the sequence index, index + 1, index + 4, index + 9, index + 16, ... until an empty location is found.

Retrieving a value. When it comes to retrieving a value, the program recomputes the array index and checks the key of the element stored at that location. If the desired key matches the key at that location, the element is found; the search time is on the order of 1, O(1) (one comparison per data item). If the key doesn't match, the search function begins a sequential search of the array that continues until: the desired element is found; the search encounters an unused position in the array, indicating that the element is not present; or the search returns to the index originally produced by the hash function, indicating that the table is full and the element is not present. In the worst case the search takes n-1 comparisons, which is on the order of n, O(n): the table is full and the search function goes through the whole array, comparing the desired key with the key at each location, and the element is either found at the last position before the starting one or not found at all. Advantages of this approach: all the elements (or pointers to the elements) are placed in contiguous storage, which speeds up the sequential searches when collisions do occur. Disadvantages of this approach: as the number of collisions increases, the distance between the array index computed by the hash function and the actual location of the element increases, increasing search time; elements tend to cluster around elements that produce collisions; as the array fills, there will be gaps of unused locations; and the hash table has a fixed size, so at some point all the positions will be filled and the only alternative is to expand the table, which also means modifying the hash function to accommodate the increased address space.
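Putting the pieces together, here is a minimal sketch of insertion and retrieval with linear probing, assuming non-negative integer keys and an EMPTY sentinel; the table layout and the names insert_key/find_key are illustrative, not from the text.

```c
#define TABLE_SIZE 100
#define EMPTY (-1)

static int table[TABLE_SIZE];

void table_init(void) {
    for (int i = 0; i < TABLE_SIZE; i++) table[i] = EMPTY;
}

/* Insert key using linear probing; returns the index used, or -1
   when the table is full. */
int insert_key(int key) {
    int index = key % TABLE_SIZE;
    for (int probes = 0; probes < TABLE_SIZE; probes++) {
        if (table[index] == EMPTY) { table[index] = key; return index; }
        index = (index + 1) % TABLE_SIZE;     /* rehash: try next slot */
    }
    return -1;                                /* table full */
}

/* Find key; returns its index, or -1 if not present.  An EMPTY slot
   ends the probe sequence early, as described above. */
int find_key(int key) {
    int index = key % TABLE_SIZE;
    for (int probes = 0; probes < TABLE_SIZE; probes++) {
        if (table[index] == EMPTY) return -1; /* hole: key cannot be here */
        if (table[index] == key) return index;
        index = (index + 1) % TABLE_SIZE;
    }
    return -1;
}
```

With the example keys, 54876 lands at index 76 and 57879 at 79; a second key ending in 79 (say 12379) collides and is placed at 80 by the probe sequence.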

Collision Resolution Using Linked Lists (Chaining)- An alternative to using adjacent storage (as in an array) is to use linked lists. In this case the table is implemented as an array of linked lists: every element of the table is a pointer to a list, and the list contains all the elements whose keys hash to the same index. For example, when the hash function produces index 79, a new node is created and the array element with index 79 points to this node. When another element's key produces index 79, a new node is created and attached to the first one, and so on. The same set of data as in the previous example will be organized as follows when we use linked lists for handling collisions:

Retrieving a value- When retrieving a value, the hash function is applied to the desired key and the array index is produced. The search function then accesses the list to which this array element points and compares the desired key to the key in the first node. If it matches, the element is found; in this case the search is of the order of 1, O(1). If not, the function goes through the list comparing keys until the element is found or the end of the list is reached. In this case the search time depends on the length of the list.
Advantages of this approach: the hash table size is unlimited (or limited only by available storage space), so you don't need to expand the table and recreate the hash function; and collision handling is simple: just insert colliding records into a list. Disadvantages of this approach: as the lists of collided elements (collision chains) become long, searching them for a desired element begins to take longer and longer; and using pointers slows the algorithm, since time is required to allocate new nodes.
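A minimal sketch of chaining, assuming non-negative integer keys; the struct layout and the names chain_insert/chain_find are illustrative. Each table slot heads a singly linked list of the keys that hash to it.

```c
#include <stdlib.h>

#define TABLE_SIZE 100

struct node { int key; struct node *next; };

static struct node *chains[TABLE_SIZE];   /* all slots start out NULL */

/* Insert key at the head of its bucket's list: O(1), no probing. */
void chain_insert(int key) {
    int index = key % TABLE_SIZE;
    struct node *n = malloc(sizeof *n);
    if (!n) return;
    n->key = key;
    n->next = chains[index];   /* push onto the collision chain */
    chains[index] = n;
}

/* Return 1 if key is present, 0 otherwise; cost grows with the
   length of the collision chain, as noted above. */
int chain_find(int key) {
    for (struct node *p = chains[key % TABLE_SIZE]; p; p = p->next)
        if (p->key == key) return 1;
    return 0;
}
```

Keys 57879 and 12379 both hash to slot 79; with chaining they simply share that slot's list instead of displacing each other.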

LINKED LIST

Linked List- A linked list or one-way list is a linear collection of data elements, called nodes, where the linear order is given by means of pointers. Each node is divided into two parts: the first part contains the information of the element; the second part, called the link field or next pointer field, contains the address of the next node in the list. Representation of a linked list in memory: a linked list is maintained in memory by two linear arrays, INFO and LINK, such that INFO[K] and LINK[K] contain, respectively, the information part and the next pointer field of a node of LIST. LIST also requires a variable, such as START, which contains the location of the beginning of the list; the end of the list is marked by the NULL pointer.

START = 5

BED NUMBER   PATIENT    NEXT
1            Kirk       7
2
3            Dean       11
4            Maxwell    12
5            Adams      3
6
7            Lane       4
8            Green      1
9            Samuels    0
10
11           Fields     8
12           Nelson     9

(Following the NEXT pointers from START = 5 visits the patients in alphabetical order: Adams, Dean, Fields, Green, Kirk, Lane, Maxwell, Nelson, Samuels.)

Algorithm: (Traversing a Linked List) Let LIST be a linked list in memory. This algorithm traverses LIST, applying an operation PROCESS to each element of LIST. The variable PTR points to the node currently being processed.
Step 1: Set PTR := START
Step 2: Repeat while PTR ≠ NULL: Apply PROCESS to INFO[PTR] Set PTR := LINK[PTR] [PTR now points to the next node] [End of Step 2 Loop]
Step 3: Exit

Searching an unsorted Linked List
Algorithm: SEARCH(INFO, LINK, START, ITEM, LOC) LIST is a linked list in memory. This algorithm finds the location LOC of the node where ITEM first appears in LIST, or sets LOC = NULL.
Step 1: Set PTR := START
Step 2: Repeat while PTR ≠ NULL: If ITEM = INFO[PTR], then: Set LOC := PTR Return Else: Set PTR := LINK[PTR] [End of If structure] [End of Step 2 Loop]
Step 3: [Search is unsuccessful] Set LOC := NULL
Step 4: Return

The complexity of this algorithm is the same as that of the linear search algorithm for a linear array: the worst-case running time is proportional to the number n of elements in LIST, and the average-case running time is proportional to n/2, on the assumption that ITEM appears once in LIST with equal probability in any node.
Binary search cannot be applied to a linked list, as there is no way to locate the middle element of LIST. This is one of the main drawbacks of using a linked list as a data structure.
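The traversal and search over the INFO/LINK array representation can be sketched in C as below. Index 0 plays the role of NULL here, and the array contents in the usage are illustrative, not the hospital example.

```c
#define NIL 0   /* index 0 stands in for the NULL pointer */

/* SEARCH over the array representation: follow LINK cells from
   start, returning the index (LOC) of the first node whose INFO
   equals item, or NIL when the item is absent. */
int search_list(const int info[], const int link[], int start, int item) {
    for (int ptr = start; ptr != NIL; ptr = link[ptr])
        if (info[ptr] == item) return ptr;
    return NIL;
}
```

For example, with start = 3 and cells 3 → 5 → 2 holding 10, 20, 30, searching for 30 follows two links and returns index 2.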

Search in a sorted list
Algorithm: SRCHSL(INFO, LINK, START, ITEM, LOC) LIST is a list sorted in ascending order in memory. This algorithm finds the location LOC of the node where ITEM first appears in LIST, or sets LOC = NULL.
Step 1: Set PTR := START
Step 2: Repeat while PTR ≠ NULL: If ITEM > INFO[PTR], then: Set PTR := LINK[PTR] Else If ITEM = INFO[PTR], then: Set LOC := PTR Return Else: Set LOC := NULL Return [End of If structure] [End of Step 2 Loop]
Step 3: Set LOC := NULL
Step 4: Return


Insertion into a Linked List- Together with the linked list, a special list is maintained in memory which consists of unused memory cells. This list, which has its own pointer, is called the list of available space, the free-storage list, the free pool, or simply the AVAIL list. During an insertion operation, new nodes are taken from the AVAIL list, which is maintained just like the normal data linked list using its own pointer: similar to START, the AVAIL list has its own start pointer, named AVAIL, which stores the address of its first free node.

START = 5, AVAIL = 10

BED NUMBER   PATIENT    NEXT
1            Kirk       7
2                       6
3            Dean       11
4            Maxwell    12
5            Adams      3
6                       0
7            Lane       4
8            Green      1
9            Samuels    0
10                      2
11           Fields     8
12           Nelson     9

(The free-storage list runs AVAIL = 10 → 2 → 6 → 0: beds 10, 2 and 6 are unused.)

Insertion in a Linked List- Algorithms which insert nodes into linked lists come up in various situations. The three main cases to be discussed are: inserting a node at the beginning of the list; inserting a node after a node with a given location; and inserting a node into a sorted list. In all algorithms, the variable ITEM contains the new information to be added to the list, and all cases follow some common steps: (1) checking whether free space is available in the AVAIL list; if AVAIL = NULL, the algorithm reports OVERFLOW. (2) removing the first node from the AVAIL list; using the variable NEW to keep track of the location of the new node, this step can be implemented by the pair of assignments (in this order) NEW := AVAIL and AVAIL := LINK[AVAIL]. (3) copying the new information into the new node: INFO[NEW] := ITEM.


INSERTING AT THE BEGINNING OF THE LIST
Algorithm: INSFIRST(INFO, LINK, START, AVAIL, ITEM) This algorithm inserts ITEM as the first node in the list.
Step 1: [OVERFLOW?] If AVAIL = NULL, then: Write: OVERFLOW Return
Step 2: [Remove first node from AVAIL list] Set NEW := AVAIL and AVAIL := LINK[AVAIL]
Step 3: Set INFO[NEW] := ITEM [Copies new data into new node]
Step 4: Set LINK[NEW] := START [New node now points to original first node]
Step 5: Set START := NEW [Changes START so it points to the new node]
Step 6: Return
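INSFIRST can be sketched in C over the same array representation, with index 0 as NULL. start and avail are passed by pointer so the caller's list heads are updated; the function name is illustrative.

```c
/* Insert item as the first node.  Returns 0 on overflow (free pool
   exhausted), 1 on success.  Index 0 plays the role of NULL. */
int ins_first(int info[], int link[], int *start, int *avail, int item) {
    if (*avail == 0) return 0;    /* Step 1: overflow */
    int new_node = *avail;        /* Step 2: remove first node from AVAIL */
    *avail = link[new_node];
    info[new_node] = item;        /* Step 3: copy data into new node */
    link[new_node] = *start;      /* Step 4: new node points to old first node */
    *start = new_node;            /* Step 5: START points to new node */
    return 1;
}
```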

INSERTING AFTER A GIVEN NODE
Algorithm: INSLOC(INFO, LINK, START, AVAIL, LOC, ITEM) This algorithm inserts ITEM so that ITEM follows the node with location LOC, or inserts ITEM as the first node when LOC = NULL.
Step 1: [OVERFLOW] If AVAIL = NULL, then: Write: OVERFLOW Return
Step 2: [Remove first node from AVAIL list] Set NEW := AVAIL and AVAIL := LINK[AVAIL]
Step 3: Set INFO[NEW] := ITEM [Copies new data into new node]
Step 4: If LOC = NULL, then: Set LINK[NEW] := START and START := NEW Else: Set LINK[NEW] := LINK[LOC] and LINK[LOC] := NEW [End of If structure]
Step 5: Return
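INSLOC in the same style: insert after a given location, or at the front when loc is 0 (NULL). The name ins_loc and the conventions match the earlier sketch and are illustrative.

```c
/* Insert item after node loc (loc == 0 means insert as first node).
   Returns 0 on overflow, 1 on success.  Index 0 stands for NULL. */
int ins_loc(int info[], int link[], int *start, int *avail,
            int loc, int item) {
    if (*avail == 0) return 0;            /* overflow */
    int new_node = *avail;                /* take cell from free pool */
    *avail = link[new_node];
    info[new_node] = item;
    if (loc == 0) {                       /* insert at the beginning */
        link[new_node] = *start;
        *start = new_node;
    } else {                              /* splice in after loc */
        link[new_node] = link[loc];
        link[loc] = new_node;
    }
    return 1;
}
```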


INSERTING BEFORE A GIVEN NODE
Algorithm: INSLOC(INFO, LINK, START, AVAIL, ITEM, ITEM1) This algorithm inserts ITEM so that ITEM precedes the node with data item ITEM1.
Step 1: [OVERFLOW] If AVAIL = NULL, then: Write: OVERFLOW Return
Step 2: [Remove first node from AVAIL list] Set NEW := AVAIL and AVAIL := LINK[AVAIL]
Step 3: Set INFO[NEW] := ITEM [Copies new data into new node]
Step 4: Set PTR := START and SAVE := NULL
Step 5: Repeat while INFO[PTR] ≠ ITEM1: Set SAVE := PTR and PTR := LINK[PTR] [End of Loop]
Step 6: Set LOC := SAVE
Step 7: If LOC = NULL, then: Set LINK[NEW] := START and START := NEW Else: Set LINK[NEW] := LINK[LOC] and LINK[LOC] := NEW [End of If structure]
Step 8: Return

INSERTING BEFORE A GIVEN NODE WITH LOCATION GIVEN
Algorithm: INSLOC(INFO, LINK, START, AVAIL, ITEM, LOC) This algorithm inserts ITEM so that ITEM precedes the node with location LOC.
Step 1: [OVERFLOW] If AVAIL = NULL, then: Write: OVERFLOW Return
Step 2: [Remove first node from AVAIL list] Set NEW := AVAIL and AVAIL := LINK[AVAIL]
Step 3: Set INFO[NEW] := ITEM [Copies new data into new node]
Step 4: Set PTR := START and SAVE := NULL
Step 5: Repeat while PTR ≠ LOC: Set SAVE := PTR and PTR := LINK[PTR] [End of Loop]
Step 6: Set LOC := SAVE
Step 7: If LOC = NULL, then: Set LINK[NEW] := START and START := NEW Else: Set LINK[NEW] := LINK[LOC] and LINK[LOC] := NEW [End of If structure]
Step 8: Return

INSERTING BEFORE THE NTH NODE
Algorithm: INSLOC(INFO, LINK, START, AVAIL, ITEM, N) This algorithm inserts ITEM so that ITEM precedes the Nth node.
Step 1: [OVERFLOW] If AVAIL = NULL, then: Write: OVERFLOW Return
Step 2: [Remove first node from AVAIL list] Set NEW := AVAIL and AVAIL := LINK[AVAIL]
Step 3: Set INFO[NEW] := ITEM [Copies new data into new node]
Step 4: Set PTR := START, SAVE := NULL and K := 1
Step 5: Repeat while K ≠ N: Set SAVE := PTR and PTR := LINK[PTR] Set K := K + 1 [End of Loop]
Step 6: Set LOC := SAVE
Step 7: If LOC = NULL, then: Set LINK[NEW] := START and START := NEW Else: Set LINK[NEW] := LINK[LOC] and LINK[LOC] := NEW [End of If structure]
Step 8: Return

For insertion into a sorted list, we use two algorithms. One algorithm finds the location of the node after which the new node has to be inserted; it returns this location to the main algorithm, which finally performs the insertion.

INSERTING INTO A SORTED LIST
Algorithm: INSSRT(INFO, LINK, START, AVAIL, ITEM)
Step 1: CALL FINDA(INFO, LINK, START, ITEM, LOC)
Step 2: CALL INSLOC(INFO, LINK, START, AVAIL, LOC, ITEM)
Step 3: Exit
Algorithm: FINDA(INFO, LINK, START, ITEM, LOC) This algorithm finds the location LOC of the last node in a sorted list such that INFO[LOC] < ITEM, or sets LOC = NULL.
Step 1: [List empty?] If START = NULL, then: Set LOC := NULL Return [End of If structure]
Step 2: [Special case: insert before the first node] If ITEM < INFO[START], then: Set LOC := NULL Return [End of If structure]

Step 3: Set SAVE := START and PTR := LINK[START]
Step 4: Repeat while PTR ≠ NULL: If ITEM < INFO[PTR], then: Set LOC := SAVE Return [End of If structure] Set SAVE := PTR and PTR := LINK[PTR] [End of Step 4 Loop]
Step 5: Set LOC := SAVE
Step 6: Return


Deletion of a node from a linked list
Case 1: Deletion of the node following a given node
Algorithm: DEL(INFO, LINK, START, AVAIL, LOC, LOCP) This algorithm deletes the node N with location LOC. LOCP is the location of the node which precedes N, or LOCP = NULL when N is the first node.
Step 1: If LOC = NULL, then: Write: UNDERFLOW Exit
Step 2: If LOCP = NULL, then: Set START := LINK[START] [Deletes first node] Else: Set LINK[LOCP] := LINK[LOC] [End of If structure]
Step 3: [Return deleted node to the AVAIL list] Set LINK[LOC] := AVAIL and AVAIL := LOC
Step 4: Return
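DEL can be sketched in C over the array representation. Note that deletion needs no data movement: only two link cells change, and the freed cell is pushed onto the front of the AVAIL list. The function name del_node is illustrative.

```c
/* Delete the node at loc, where locp precedes it (locp == 0 when loc
   is the first node), and return the cell to the free pool.
   Index 0 stands for NULL. */
void del_node(int link[], int *start, int *avail, int loc, int locp) {
    if (loc == 0) return;             /* underflow: nothing to delete */
    if (locp == 0)
        *start = link[loc];           /* deleting the first node */
    else
        link[locp] = link[loc];       /* bypass the deleted node */
    link[loc] = *avail;               /* push freed cell onto AVAIL */
    *avail = loc;
}
```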

Deleting the node with a given item of information- For deleting a node with a given item of information, we first need to know the location of the item to be deleted and also the location of the preceding node. For this, one algorithm is called from the main algorithm to find the two locations. Two variables, SAVE and PTR, are used to save the locations of the preceding node and the current node at every comparison, respectively. Once the locations are found, the deletion is done in the main algorithm.

Algorithm: DELETE(INFO, LINK, START, AVAIL, ITEM)
Step 1: CALL FINDB(INFO, LINK, START, ITEM, LOC, LOCP)
Step 2: If LOC = NULL, then: Write: UNDERFLOW Exit
Step 3: If LOCP = NULL, then: Set START := LINK[START] Else: Set LINK[LOCP] := LINK[LOC] [End of If structure]
Step 4: Set LINK[LOC] := AVAIL and AVAIL := LOC
Step 5: Return

Algorithm: FINDB(INFO, LINK, START, ITEM, LOC, LOCP) This algorithm finds the location LOC of the first node N which contains ITEM and the location LOCP of the node preceding N. If ITEM does not appear in the list, the procedure sets LOC = NULL; if ITEM appears in the first node, it sets LOCP = NULL.
Step 1: If START = NULL, then: Set LOC := NULL and LOCP := NULL Return
Step 2: If INFO[START] = ITEM, then: Set LOC := START and LOCP := NULL Return
Step 3: Set SAVE := START and PTR := LINK[START]
Step 4: Repeat Steps 5 and 6 while PTR ≠ NULL:
Step 5: If INFO[PTR] = ITEM, then: Set LOC := PTR and LOCP := SAVE Return
Step 6: Set SAVE := PTR and PTR := LINK[PTR] [End of Step 4 Loop]
Step 7: [ITEM not in the list] Set LOC := NULL
Step 8: Return

Concatenating two linear linked lists


Algorithm: Concatenate(INFO, LINK, START1, START2) This algorithm concatenates two linked lists with start pointers START1 and START2.
Step 1: Set PTR := START1
Step 2: Repeat while LINK[PTR] ≠ NULL: Set PTR := LINK[PTR] [End of Step 2 Loop]
Step 3: Set LINK[PTR] := START2
Step 4: Return
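Concatenation is a single pointer assignment once the last node of the first list has been found. A minimal C sketch over the array representation, assuming the first list is non-empty (as the pseudocode implicitly does):

```c
/* Append the list headed by start2 onto the list headed by start1.
   start1 must be a non-empty list; index 0 stands for NULL. */
void concat(int link[], int start1, int start2) {
    int ptr = start1;
    while (link[ptr] != 0)    /* walk to the last node of list 1 */
        ptr = link[ptr];
    link[ptr] = start2;       /* last node now points to list 2 */
}
```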

Circular Linked List- A circular linked list is a linked list in which the last node of the list points to the first node. A non-empty circular linked list contains no NULL pointers. The memory declarations for representing circular linked lists are the same as for linear linked lists. All operations performed on linear linked lists can be easily extended to circular linked lists, with the following exceptions: while inserting a new node at the end of the list, its next pointer field is made to point to the first node; and while testing for the end of the list, we compare the next pointer field with the address of the first node. A circular linked list is usually implemented using a header linked list: a linked list which always contains a special node, called the header node, at the beginning of the list. This header node usually contains vital information about the linked list, such as the number of nodes in the list or whether the list is sorted. Circular header lists are frequently used instead of ordinary linked lists because many operations are much easier to state and implement using header lists.

This comes from the following two properties of circular header linked lists: the NULL pointer is not used, and hence all pointers contain valid addresses; and every (ordinary) node has a predecessor, so the first node does not require a special case.

Algorithm: (Traversing a circular header linked list) This algorithm traverses a circular header linked list, with the START pointer storing the address of the header node.
Step 1: Set PTR := LINK[START]
Step 2: Repeat while PTR ≠ START: Apply PROCESS to INFO[PTR] Set PTR := LINK[PTR] [End of Loop]
Step 3: Return

Searching a circular header linked list
Algorithm: SRCHHL(INFO, LINK, START, ITEM, LOC) This algorithm searches a circular header linked list.
Step 1: Set PTR := LINK[START]
Step 2: Repeat while INFO[PTR] ≠ ITEM and PTR ≠ START: Set PTR := LINK[PTR] [End of Loop]
Step 3: If INFO[PTR] = ITEM, then: Set LOC := PTR Else: Set LOC := NULL [End of If structure]
Step 4: Return
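SRCHHL can be sketched in C over the array representation. The key point is the stopping test: instead of comparing against NULL, the traversal stops when it comes back around to the header node at start.

```c
/* Search a circular header list whose header node sits at index
   start; returns the index of the matching node, or 0 when absent.
   The extra ptr != start check guards against the header's INFO
   field accidentally matching item. */
int srch_hl(const int info[], const int link[], int start, int item) {
    int ptr = link[start];                    /* first real node */
    while (info[ptr] != item && ptr != start)
        ptr = link[ptr];
    return (ptr != start && info[ptr] == item) ? ptr : 0;
}
```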

Deletion from a circular header linked list
Algorithm: DELLOCHL(INFO, LINK, START, AVAIL, ITEM) This algorithm deletes an item from a circular header linked list.
Step 1: CALL FINDBHL(INFO, LINK, START, ITEM, LOC, LOCP)
Step 2: If LOC = NULL, then: Write: ITEM not in the list Exit
Step 3: Set LINK[LOCP] := LINK[LOC] [Node deleted]
Step 4: Set LINK[LOC] := AVAIL and AVAIL := LOC [Memory returned to the AVAIL list]
Step 5: Return

Algorithm: FINDBHL(INFO, LINK, START, ITEM, LOC, LOCP) This algorithm finds the location LOC of the node to be deleted and the location LOCP of the node preceding it.
Step 1: Set SAVE := START and PTR := LINK[START]
Step 2: Repeat while INFO[PTR] ≠ ITEM and PTR ≠ START: Set SAVE := PTR and PTR := LINK[PTR] [End of Loop]
Step 3: If INFO[PTR] = ITEM, then: Set LOC := PTR and LOCP := SAVE Else: Set LOC := NULL and LOCP := SAVE [End of If structure]
Step 4: Return

Insertion in a circular header linked list Algorithm: INSRT(INFO,LINK,START,AVAIL,ITEM,LOC) This algorithm inserts item in a circular header linked list after the location LOC Step 1:If AVAIL=NULL, then Write: OVERFLOW Exit Step 2: Set NEW:=AVAIL and AVAIL:=LINK[AVAIL] Step 3: Set INFO[NEW]:=ITEM Step 4: Set LINK[NEW]:=LINK[LOC] Set LINK[LOC]:=NEW Step 5: Return

Insertion in a sorted circular header linked list Algorithm: INSSRT(INFO,LINK,START,AVAIL,ITEM) This algorithm inserts an element in a sorted circular header linked list Step 1: CALL FINDA(INFO,LINK,START,ITEM,LOC) Step 2: If AVAIL=NULL, then Write: OVERFLOW Return Step 3: Set NEW:=AVAIL and AVAIL:=LINK[AVAIL] Step 4: Set INFO[NEW]:=ITEM Step 5: Set LINK[NEW]:=LINK[LOC] Set LINK[LOC]:=NEW Step 6: Return

Algorithm: FINDA(INFO, LINK, START, ITEM, LOC) This algorithm finds the location LOC of the node after which ITEM should be inserted.
Step 1: Set PTR := START
Step 2: Set SAVE := PTR and PTR := LINK[PTR]
Step 3: Repeat while PTR ≠ START: If INFO[PTR] > ITEM, then: Set LOC := SAVE Return [End of If structure] Set SAVE := PTR and PTR := LINK[PTR] [End of Loop]
Step 4: Set LOC := SAVE
Step 5: Return

One of the most important applications of linked lists is the representation of a polynomial in memory. Although a polynomial can be represented using a linear linked list, the common and preferred way is a circular linked list with a header node. Polynomial representation: header linked lists are frequently used for maintaining polynomials in memory. The header node plays an important part in this representation, since it is needed to represent the zero polynomial. Specifically, the information part of a node is divided into two fields representing, respectively, the coefficient and the exponent of the corresponding polynomial term, and the nodes are linked in order of decreasing exponent. The list pointer variable POLY points to the header node, whose exponent field is assigned a negative number, in this case -1. The array representation of the list requires three linear arrays: COEFF, EXP and LINK. For example, P(x) = 2x⁸ - 5x⁷ - 3x² + 4 can be represented as:

[Figure: circular header linked list representation of 2x⁸ - 5x⁷ - 3x² + 4. POLY points to the header node (COEFF 0, EXP -1), which is followed by the nodes (COEFF 2, EXP 8), (COEFF -5, EXP 7), (COEFF -3, EXP 2) and (COEFF 4, EXP 0); the last node points back to the header.]

Addition of polynomials using linear linked list representation for a polynomial

Algorithm: ADDPOLY(COEFF, POWER, LINK, POLY1, POLY2, SUMPOLY, AVAIL) This algorithm adds two polynomials implemented using linear linked lists and stores the sum in another linear linked list. POLY1 and POLY2 point to the starting nodes of the two polynomials.
Step 1: Set SUMPOLY := AVAIL and AVAIL := LINK[AVAIL]
Step 2: Repeat while POLY1 ≠ NULL and POLY2 ≠ NULL:
  If POWER[POLY1] > POWER[POLY2], then: Set COEFF[SUMPOLY] := COEFF[POLY1], POWER[SUMPOLY] := POWER[POLY1] and POLY1 := LINK[POLY1] Set LINK[SUMPOLY] := AVAIL and AVAIL := LINK[AVAIL] Set SUMPOLY := LINK[SUMPOLY]
  Else If POWER[POLY2] > POWER[POLY1], then: Set COEFF[SUMPOLY] := COEFF[POLY2], POWER[SUMPOLY] := POWER[POLY2] and POLY2 := LINK[POLY2] Set LINK[SUMPOLY] := AVAIL and AVAIL := LINK[AVAIL] Set SUMPOLY := LINK[SUMPOLY]

          Else:
            Set COEFF[SUMPOLY]:=COEFF[POLY1]+COEFF[POLY2]
            Set POWER[SUMPOLY]:=POWER[POLY1]
            Set POLY1:=LINK[POLY1] and POLY2:=LINK[POLY2]
            Set LINK[SUMPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set SUMPOLY:=LINK[SUMPOLY]
          [End of If structure]
        [End of Loop]
Step 3: If POLY1=NULL, then:
          Repeat while POLY2≠NULL:
            Set COEFF[SUMPOLY]:=COEFF[POLY2] and POWER[SUMPOLY]:=POWER[POLY2]
            Set POLY2:=LINK[POLY2]
            Set LINK[SUMPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set SUMPOLY:=LINK[SUMPOLY]
          [End of Loop]
        [End of If Structure]

Step 4: If POLY2=NULL, then:
          Repeat while POLY1≠NULL:
            Set COEFF[SUMPOLY]:=COEFF[POLY1] and POWER[SUMPOLY]:=POWER[POLY1]
            Set POLY1:=LINK[POLY1]
            Set LINK[SUMPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set SUMPOLY:=LINK[SUMPOLY]
          [End of Loop]
        [End of If Structure]
Step 5: Set LINK[SUMPOLY]:=NULL
Step 6: Return
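The merge logic of ADDPOLY is easier to check with the polynomials held as plain (coefficient, power) lists sorted by decreasing power. This Python sketch (illustrative, not from the text) mirrors the three branches of Step 2 and the trailing copies of Steps 3-4; it additionally drops terms whose coefficients cancel.

```python
def add_poly(p1, p2):
    """Add two term lists of (coeff, power) pairs,
    each sorted by decreasing power."""
    i = j = 0
    out = []
    while i < len(p1) and j < len(p2):
        c1, e1 = p1[i]
        c2, e2 = p2[j]
        if e1 > e2:                    # take the higher-power term from p1
            out.append((c1, e1)); i += 1
        elif e2 > e1:                  # take the higher-power term from p2
            out.append((c2, e2)); j += 1
        else:                          # equal powers: add the coefficients
            if c1 + c2 != 0:           # drop cancelled terms
                out.append((c1 + c2, e1))
            i += 1; j += 1
    out.extend(p1[i:])                 # copy whatever is left over
    out.extend(p2[j:])
    return out
```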

Addition of polynomials using circular header linked list representation for polynomials

Algorithm: ADDPOLY(COEFF, POWER, LINK, POLY1, POLY2, SUMPOLY, AVAIL)
This algorithm finds the sum of two polynomials implemented using header circular linked lists. POLY1 and POLY2 contain the addresses of the header nodes of the two polynomials, and SUMPOLY is the circular header linked list storing the terms of the sum.

Step 1: Set HEADER:=AVAIL and AVAIL:=LINK[AVAIL]
Step 2: Set SUMPOLY:=HEADER
        Set LINK[SUMPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
        Set SUMPOLY:=LINK[SUMPOLY]
Step 3: Set START1:=POLY1 and START2:=POLY2
        Set POLY1:=LINK[START1] and POLY2:=LINK[START2]
Step 4: Repeat while POLY1≠START1 and POLY2≠START2:
          If POWER[POLY1]>POWER[POLY2], then:
            Set COEFF[SUMPOLY]:=COEFF[POLY1] and POWER[SUMPOLY]:=POWER[POLY1]
            Set POLY1:=LINK[POLY1]
            Set LINK[SUMPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set SUMPOLY:=LINK[SUMPOLY]
          Else If POWER[POLY2]>POWER[POLY1], then:
            Set COEFF[SUMPOLY]:=COEFF[POLY2] and POWER[SUMPOLY]:=POWER[POLY2]
            Set POLY2:=LINK[POLY2]
            Set LINK[SUMPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set SUMPOLY:=LINK[SUMPOLY]

          Else:
            Set COEFF[SUMPOLY]:=COEFF[POLY1]+COEFF[POLY2]
            Set POWER[SUMPOLY]:=POWER[POLY1]
            Set POLY1:=LINK[POLY1] and POLY2:=LINK[POLY2]
            Set LINK[SUMPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set SUMPOLY:=LINK[SUMPOLY]
          [End of If structure]
        [End of Loop]
Step 5: If POLY1=START1, then:
          Repeat while POLY2≠START2:
            Set COEFF[SUMPOLY]:=COEFF[POLY2] and POWER[SUMPOLY]:=POWER[POLY2]
            Set POLY2:=LINK[POLY2]
            Set LINK[SUMPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set SUMPOLY:=LINK[SUMPOLY]
          [End of Loop]
        [End of If Structure]

Step 6: If POLY2=START2, then:
          Repeat while POLY1≠START1:
            Set COEFF[SUMPOLY]:=COEFF[POLY1] and POWER[SUMPOLY]:=POWER[POLY1]
            Set POLY1:=LINK[POLY1]
            Set LINK[SUMPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set SUMPOLY:=LINK[SUMPOLY]
          [End of Loop]
        [End of If Structure]
Step 7: Set LINK[SUMPOLY]:=HEADER and SUMPOLY:=LINK[SUMPOLY]
Step 8: Return

Multiplication of Polynomials using linear linked list representation for polynomials

Algorithm: MULPOLY(COEFF, POWER, LINK, POLY1, POLY2, PRODPOLY, AVAIL)
This algorithm multiplies two polynomials implemented using linear linked lists. POLY1 and POLY2 contain the addresses of the starting nodes of the two polynomials. The product terms are stored in another linked list whose starting node is PRODPOLY. (Like powers are not combined; a separate pass would be needed to merge them.)
Step 1: Set PRODPOLY:=AVAIL and AVAIL:=LINK[AVAIL]
        Set START:=PRODPOLY and START2:=POLY2
Step 2: Repeat while POLY1≠NULL:
Step 3:   Set POLY2:=START2 [Rescan the second polynomial for each term of the first]
          Repeat while POLY2≠NULL:
            Set COEFF[PRODPOLY]:=COEFF[POLY1]*COEFF[POLY2]
            Set POWER[PRODPOLY]:=POWER[POLY1]+POWER[POLY2]
            Set POLY2:=LINK[POLY2]
            Set LINK[PRODPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set PRODPOLY:=LINK[PRODPOLY]
          [End of Step 3 Loop]
          Set POLY1:=LINK[POLY1]
        [End of Step 2 Loop]
Step 4: Set LINK[PRODPOLY]:=NULL
Step 5: Return

Algorithm: MULPOLY(COEFF, POWER, LINK, POLY1, POLY2, PRODPOLY, AVAIL)
This algorithm finds the product of two polynomials implemented using header circular linked lists. POLY1 and POLY2 contain the addresses of the header nodes of the two polynomials. The product terms are stored in another header circular linked list. (Like powers are not combined.)
Step 1: Set HEADER:=AVAIL and AVAIL:=LINK[AVAIL]
Step 2: Set PRODPOLY:=HEADER
        Set LINK[PRODPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
        Set PRODPOLY:=LINK[PRODPOLY]
Step 3: Set START1:=POLY1 and START2:=POLY2
        Set POLY1:=LINK[START1] and POLY2:=LINK[START2]
Step 4: Repeat while POLY1≠START1:
Step 5:   Repeat while POLY2≠START2:
            Set COEFF[PRODPOLY]:=COEFF[POLY1]*COEFF[POLY2]
            Set POWER[PRODPOLY]:=POWER[POLY1]+POWER[POLY2]
            Set POLY2:=LINK[POLY2]
            Set LINK[PRODPOLY]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set PRODPOLY:=LINK[PRODPOLY]
          [End of Step 5 Loop]
          Set POLY1:=LINK[POLY1] and POLY2:=LINK[START2] [Rescan the second polynomial]
        [End of Step 4 Loop]
Step 6: Set LINK[PRODPOLY]:=HEADER and PRODPOLY:=LINK[PRODPOLY]
Step 7: Return
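Unlike the pseudocode above, the following Python sketch (illustrative, not from the text) also combines like powers by accumulating coefficients in a dictionary keyed by exponent, then emits the result in decreasing-power order.

```python
def mul_poly(p1, p2):
    """Multiply two term lists of (coeff, power) pairs;
    combine like powers and drop zero coefficients."""
    acc = {}
    for c1, e1 in p1:
        for c2, e2 in p2:              # every term of p1 times every term of p2
            acc[e1 + e2] = acc.get(e1 + e2, 0) + c1 * c2
    # sort by exponent, highest first, keeping only nonzero terms
    return [(c, e) for e, c in sorted(acc.items(), reverse=True) if c != 0]
```

For example, (x + 1)(x - 1) collapses the two x^1 cross terms to zero, leaving x^2 - 1.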

Doubly Linked List: Two-way List


A two-way list is a linear linked list which can be traversed in two directions: in the usual forward direction from the beginning of the list to the end, and in the backward direction from the end of the list to the beginning. Thus, given the location LOC of a node N in the list, one has immediate access to both the next node and the preceding node. Each node of a two-way list is divided into three parts: an information field INFO which contains the data of N, a pointer field FORW which contains the location of the next node in the list, and a pointer field BACK which contains the location of the preceding node. The list also requires two list pointer variables: FIRST, which points to the first node in the list, and LAST, which points to the last node. Thus a null pointer will appear in the FORW field of the last node in the list and also in the BACK field of the first node.

Two way lists are maintained in memory by means of linear arrays in same way as one way list except that two pointer arrays , FORW and BACK , are required instead of one list pointer variable. The list AVAIL will still be maintained as a one-way list.
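A small Python model of a two-way list, using object references in place of the FORW/BACK arrays (class and function names are illustrative, not from the text):

```python
class DNode:
    def __init__(self, info):
        self.info = info
        self.forw = None   # location of the next node
        self.back = None   # location of the preceding node

def build_two_way(values):
    """Build a two-way list from an iterable; return (FIRST, LAST)."""
    first = last = None
    for v in values:
        node = DNode(v)
        if first is None:
            first = last = node
        else:
            last.forw = node       # link forward from the old last node
            node.back = last       # and backward from the new one
            last = node
    return first, last

def forward(first):
    """Traverse FIRST -> LAST via the FORW fields."""
    out, ptr = [], first
    while ptr is not None:
        out.append(ptr.info)
        ptr = ptr.forw
    return out

def backward(last):
    """Traverse LAST -> FIRST via the BACK fields."""
    out, ptr = [], last
    while ptr is not None:
        out.append(ptr.info)
        ptr = ptr.back
    return out
```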
[Figure: a two-way list with list pointers FIRST and LAST; each node has BACK and FORW pointer fields.]

Operations on a Two-way list


Traversal
Algorithm: TRAVERSE
This algorithm traverses a two-way list. FORW and BACK are the two address parts of each node, containing the address of the next node and the previous node respectively. INFO is the information part of each node. START contains the address of the first node.
Step 1: Set PTR:=START
Step 2: Repeat while PTR≠NULL:
          Apply PROCESS to INFO[PTR]
          Set PTR:=FORW[PTR]
        [End of Loop]
Step 3: Exit

Algorithm: SEARCH(INFO, FORW, BACK, ITEM, START, LOC)
This algorithm searches for the location LOC of ITEM in a two-way list and sets LOC:=NULL if ITEM is not found in the list.
Step 1: Set PTR:=START
Step 2: Repeat while PTR≠NULL and INFO[PTR]≠ITEM:
          Set PTR:=FORW[PTR]
        [End of Loop]
Step 3: If PTR≠NULL, then: Set LOC:=PTR
        Else: Set LOC:=NULL
Step 4: Return

Algorithm: DELETE(INFO, FORW, BACK, START, AVAIL, LOC)
This algorithm deletes the node at location LOC from a two-way list.
Step 1: If LOC=START, then:
          Set START:=FORW[START]
          If START≠NULL, then: Set BACK[START]:=NULL
          Set FORW[LOC]:=AVAIL and AVAIL:=LOC
          Return
Step 2: [Delete node] Set FORW[BACK[LOC]]:=FORW[LOC]
        If FORW[LOC]≠NULL, then: Set BACK[FORW[LOC]]:=BACK[LOC]
Step 3: [Return node to AVAIL] Set FORW[LOC]:=AVAIL and AVAIL:=LOC
Step 4: Return

Algorithm: INSRT(INFO, FORW, BACK, START, AVAIL, LOCA, LOCB, ITEM)
This algorithm inserts ITEM into a doubly linked list between the adjacent nodes A and B with locations LOCA and LOCB.
Step 1: [OVERFLOW?] If AVAIL=NULL, then: Write: OVERFLOW and Return
Step 2: Set NEW:=AVAIL and AVAIL:=FORW[AVAIL]
        Set INFO[NEW]:=ITEM
Step 3: Set FORW[LOCA]:=NEW and BACK[NEW]:=LOCA
        Set FORW[NEW]:=LOCB and BACK[LOCB]:=NEW
Step 4: Return
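Step 3 of the insertion amounts to four pointer assignments, sketched here with a small node class (names are illustrative, not from the text):

```python
class DNode:
    def __init__(self, info):
        self.info, self.forw, self.back = info, None, None

def insert_between(loca, locb, item):
    """Link a new node carrying item after node loca and before node locb,
    mirroring Step 3 of INSRT."""
    new = DNode(item)
    loca.forw, new.back = new, loca    # A -> NEW, NEW -> A
    new.forw, locb.back = locb, new    # NEW -> B, B -> NEW
    return new

# Two adjacent nodes A and B:
a, b = DNode('A'), DNode('B')
a.forw, b.back = b, a
insert_between(a, b, 'X')
```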


TEST QUESTIONS

Algorithm to copy the contents of one linked list to another
Algorithm: COPY(INFO, LINK, START, AVAIL, START1)
This algorithm copies the contents of the list pointed to by START into a new list pointed to by START1.
Step 1: If AVAIL=NULL, then: Write: OVERFLOW and Return
Step 2: Set PTR:=START
Step 3: Set NEW:=AVAIL, START1:=NEW and AVAIL:=LINK[AVAIL]
Step 4: Repeat while PTR≠NULL:
          Set INFO[NEW]:=INFO[PTR]
          Set PTR:=LINK[PTR]
          If PTR≠NULL, then:
            Set LINK[NEW]:=AVAIL and AVAIL:=LINK[AVAIL]
            Set NEW:=LINK[NEW]
        [End of Loop]
Step 5: Set LINK[NEW]:=NULL
Step 6: Return

Algorithm: COPY(INFO, LINK, START, START1, AVAIL)
This algorithm copies the contents of one linked list into another existing list. START and START1 are the start pointers of the two lists.
Step 1: Set PTR:=START and PTR1:=START1
Step 2: Repeat while PTR≠NULL and PTR1≠NULL:
          Set INFO[PTR1]:=INFO[PTR]
          Set SAVE:=PTR1, PTR1:=LINK[PTR1] and PTR:=LINK[PTR]
        [End of Loop]
Step 3: If PTR=NULL and PTR1=NULL, then: Return
Step 4: [The list being copied still has nodes left; extend the target list]
        If PTR1=NULL, then:
          Repeat while PTR≠NULL:
            If AVAIL=NULL, then: Write: OVERFLOW and Return
            Set NEW:=AVAIL and AVAIL:=LINK[AVAIL]
            Set INFO[NEW]:=INFO[PTR]
            Set LINK[SAVE]:=NEW and SAVE:=NEW
            Set PTR:=LINK[PTR]
          [End of Loop]
          Set LINK[SAVE]:=NULL
Step 5: [The list being copied is finished; truncate the extra nodes]
        If PTR1≠NULL, then: Set LINK[SAVE]:=NULL
Step 6: Return

Algorithm to insert a node after the kth node in a circular linked list
Algorithm: INSRT(INFO, LINK, START, AVAIL, K, ITEM)
This algorithm inserts ITEM in a circular linked list after the kth node.
Step 1: If AVAIL=NULL, then: Write: OVERFLOW and Return
Step 2: Set NEW:=AVAIL and AVAIL:=LINK[AVAIL]
        Set INFO[NEW]:=ITEM
Step 3: Set PTR:=START and N:=1
Step 4: Repeat while N≠K:
          Set PTR:=LINK[PTR] and N:=N+1
        [End of Loop]
Step 5: Set LINK[NEW]:=LINK[PTR] and LINK[PTR]:=NEW
Step 6: Return

Algorithm to delete the kth node from a doubly linked list
Algorithm: DEL(INFO, FORW, BACK, FIRST, LAST, AVAIL, K)
This algorithm deletes the kth node from a two-way list.
Step 1: Set N:=1 and PTR:=FIRST
Step 2: Repeat while N≠K and PTR≠NULL:
          Set PTR:=FORW[PTR] and N:=N+1
        [End of Loop]
Step 3: If PTR=NULL, then: Write: No kth node and Return
Step 4: [Delete node PTR]
        If BACK[PTR]≠NULL, then: Set FORW[BACK[PTR]]:=FORW[PTR]
        Else: Set FIRST:=FORW[PTR]
        If FORW[PTR]≠NULL, then: Set BACK[FORW[PTR]]:=BACK[PTR]
        Else: Set LAST:=BACK[PTR]
Step 5: [Return node to AVAIL] Set FORW[PTR]:=AVAIL and AVAIL:=PTR
Step 6: Return

GARBAGE COLLECTION

In computer science, garbage collection (GC) is a form of automatic memory management. The garbage collector, or just collector, attempts to reclaim garbage, i.e. memory used by objects that will never be accessed or mutated again by the application. Garbage collection is often portrayed as the opposite of manual memory management, which requires the programmer to specify which objects to deallocate and return to the memory system. The basic principle of how a garbage collector works is: determine which data objects in a program will not be accessed in the future, and reclaim the resources used by those objects.

Reachability of an object
Informally, a reachable object can be defined as an object for which there exists some variable in the program environment that leads to it, either directly or through references from other reachable objects. More precisely, objects can be reachable in only two ways: A distinguished set of objects are assumed to be reachable; these are known as the roots. Typically, these include all the objects referenced from anywhere in the call stack (that is, all local variables and parameters in the functions currently being invoked), and any global variables. Anything referenced from a reachable object is itself reachable; more formally, reachability is a transitive closure.

The memory is traced for garbage collection using tracing collectors, or simply collectors. Tracing collectors are so called because they trace through the working set of memory. These garbage collectors perform collection in cycles. A cycle is started when the collector decides (or is notified) that it needs to reclaim storage, which in particular happens when the system is low on memory. The original method is a naive mark-and-sweep, in which the entire memory set is touched several times. In this method, each object in memory has a flag (typically a single bit) reserved for garbage collection use only. This flag is always cleared (counter-intuitively), except during the collection cycle. The first stage of collection sweeps the entire 'root set', marking each accessible object as being 'in-use'. All objects transitively accessible from the root set are marked as well. Finally, each object in memory is again examined; those with the in-use flag still cleared are not reachable by any program or data, and their memory is freed. (For objects which are marked in-use, the in-use flag is cleared again, preparing for the next cycle.)
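The mark-and-sweep cycle can be sketched on a toy heap. This is an illustrative model (the object graph is just a dict from object id to referenced ids), not production GC code:

```python
def mark_and_sweep(heap, roots):
    """Naive mark-and-sweep over a toy heap.
    heap: dict mapping object id -> list of ids it references.
    roots: iterable of root ids (call stack, globals).
    Returns the set of ids whose memory would be freed."""
    marked = set()
    stack = list(roots)
    while stack:                       # mark phase: trace from the roots
        obj = stack.pop()
        if obj in marked:
            continue
        marked.add(obj)                # set the in-use flag
        stack.extend(heap[obj])        # everything it references is reachable
    return set(heap) - marked          # sweep phase: unmarked objects are garbage
```

Note that a self-referencing but unreachable object (a cycle) is still collected, which is the advantage of tracing over simple reference counting.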

Moving vs. non-moving garbage Collection Once the unreachable set has been determined, the garbage collector may simply release the unreachable objects and leave everything else as it is, or it may copy some or all of the reachable objects into a new area of memory, updating all references to those objects as needed. These are called "non-moving" and "moving" garbage collectors, respectively. At first, a moving garbage collection strategy may seem inefficient and costly compared to the non-moving approach, since much more work would appear to be required on each cycle. In fact, however, the moving garbage collection strategy leads to several performance advantages, both during the garbage collection cycle itself and during actual program execution

Memory Allocation: Garbage Collection. The maintenance of linked lists in memory assumes the possibility of inserting new nodes into the lists and hence requires some mechanism which provides unused memory space for new nodes. Analogously, some mechanism is required whereby the memory space of deleted nodes becomes available for future use. Together with the linked lists, a special list is maintained in memory which consists of unused memory cells. This list, which has its own pointer, is called the list of available space, the free-storage list or the free pool. During insertions and deletions in a linked list, these unused memory cells are linked together to form a linked list using AVAIL as its list pointer variable.

Garbage Collection: The operating system of a computer may periodically collect all deleted space onto the free-storage list. Any technique which does this collection is called garbage collection. Garbage collection is mainly used when a node is deleted from a list or an entire list is deleted from a program.

Garbage collection usually takes place in two steps: First the computer runs through the whole list tagging those cells which are currently in use, The computer then runs through the memory collecting all untagged spaces onto the free storage list. Garbage collection may take place when there is only some minimum amount of space or no space at all left in free storage list or when CPU is idle and has time to do the collection.

STACKS

Stack- A stack is a linear data structure in which items may be added or removed only at one end. Accordingly, stacks are also called last-in-first-out (LIFO) lists. The end at which an element is added or removed is called the top of the stack. Two basic operations are associated with stacks: Push- the term used to denote insertion of an element onto a stack. Pop- the term used to describe deletion of an element from a stack. The order in which elements are pushed onto a stack is the reverse of the order in which they are popped off.

Representation of stacks: Stacks may be represented in memory in various ways, usually by means of a one-way list or a linear array. In the array representation, the stack is maintained by an array named STACK, a variable TOP which contains the location/index of the top element of the stack, and a variable MAXSTK giving the maximum number of elements that can be held by the stack. The condition TOP=0 or TOP=NULL indicates that the stack is empty. The operations of adding an item to the stack and removing an item from it are implemented by subalgorithms called PUSH and POP respectively. Before executing PUSH, one must first test whether there is room in the stack for the new item; if not, we have the condition known as overflow. Analogously, before executing POP, one must first test whether there is an element in the stack to be deleted; if not, we have the condition known as underflow.

ARRAY IMPLEMENTATION OF STACK

Algorithm: PUSH (STACK,TOP,MAXSTK,ITEM) This algorithm pushes an item onto the stack array. TOP stores the index of top element of the stack and MAXSTK stores the maximum size of the stack.

Step 1: [Stack already filled] If TOP=MAXSTK, then: Write: OVERFLOW Return Step 2: Set TOP:=TOP+1 Step 3: Set STACK[TOP]:=ITEM Step 4: Return

Algorithm: POP(STACK,TOP,ITEM) This procedure deletes the top element of STACK array and assign it to variable ITEM

Step 1: If TOP=0,then: Write: UNDERFLOW Return Step 2: Set ITEM:=STACK[TOP] Step 3: Set TOP:=TOP-1 Step 4: Return
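The PUSH and POP algorithms above translate almost line-for-line into a Python sketch (illustrative, not from the text); with 0-based indexing, TOP = -1 plays the role of the empty-stack marker TOP = 0:

```python
class ArrayStack:
    """Fixed-capacity stack mirroring the PUSH/POP algorithms."""
    def __init__(self, maxstk):
        self.stack = [None] * maxstk
        self.maxstk = maxstk
        self.top = -1                  # -1 means the stack is empty

    def push(self, item):
        if self.top == self.maxstk - 1:        # stack already filled
            raise OverflowError("stack overflow")
        self.top += 1
        self.stack[self.top] = item

    def pop(self):
        if self.top == -1:                     # nothing to delete
            raise IndexError("stack underflow")
        item = self.stack[self.top]
        self.top -= 1
        return item
```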

A stack represented using a linked list is also known as a linked stack. The array-based representation of a stack suffers from the following limitations: the size of the stack must be known in advance, and representing the stack as an array prohibits its growth beyond a finite number of elements. In a linked list implementation of a stack, each memory cell contains the data part of an element of the stack and a pointer storing the address of the element below it; the memory cell containing the bottom-most element has a NULL pointer.

Push operation on linked list representation of stack Algorithm: PUSH(INFO, LINK, TOP, ITEM, AVAIL) This algorithm pushes an element to the top of the stack Step 1: If AVAIL=NULL, then Write: OVERFLOW Return Step 2: Set NEW:=AVAIL and AVAIL:=LINK[AVAIL] Step 3: Set INFO[NEW]:=ITEM Step 4: If TOP=NULL, then Set LINK[NEW]:=NULL Set TOP:=NEW Return Else: Set LINK[NEW]:=TOP Set TOP:=NEW Step 5: Return

POP operation on linked list representation of stack
Algorithm: POP(INFO, LINK, TOP, AVAIL)
This algorithm deletes an element from the top of the stack.
Step 1: If TOP=NULL, then: Write: UNDERFLOW and Return
Step 2: Set PTR:=TOP
        Set TOP:=LINK[TOP]
        Write: INFO[PTR]
Step 3: Set LINK[PTR]:=AVAIL and AVAIL:=PTR
Step 4: Return
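Both linked-stack operations in one Python sketch (illustrative names; dynamic allocation stands in for the AVAIL list): pushing links the new node to the old top, popping unlinks it.

```python
class SNode:
    def __init__(self, info, link=None):
        self.info, self.link = info, link

class LinkedStack:
    """Stack as a singly linked list; top points at the most recent node."""
    def __init__(self):
        self.top = None

    def push(self, item):
        # The new node points at the old top, then becomes the top.
        self.top = SNode(item, self.top)

    def pop(self):
        if self.top is None:
            raise IndexError("stack underflow")
        item = self.top.info
        self.top = self.top.link       # unlink the popped node
        return item
```

Because each push allocates a fresh node, the stack can grow until memory runs out, avoiding the fixed-size limitation of the array version.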

Application of stack: evaluation of arithmetic expressions. For most common arithmetic operations, the operator symbol is placed between its operands; this is called the infix notation of an expression. To use a stack to evaluate an arithmetic expression, we first convert the expression into its prefix or postfix notation. Polish notation refers to the notation in which the operator symbol is placed before its two operands; this is also called prefix notation. The fundamental property of Polish notation is that the order in which operations are to be performed is completely determined by the positions of the operators and operands in the expression; accordingly, one never needs parentheses when writing an expression in Polish notation. Reverse Polish notation refers to the notation in which the operator is placed after its two operands; this is frequently called postfix notation. Examples of the three notations are:

INFIX NOTATION: A+B
PREFIX (POLISH) NOTATION: +AB
POSTFIX (REVERSE POLISH) NOTATION: AB+
Convert the following infix expressions to prefix and postfix forms:
A+(B*C): Prefix +A*BC, Postfix ABC*+
(A+B)/(C+D): Prefix /+AB+CD, Postfix AB+CD+/

The computer usually evaluates an arithmetic expression written in infix notation in two steps: it converts the expression to postfix notation, and then evaluates the postfix notation.

Evaluation of postfix expression: Suppose P is an arithmetic expression written in postfix notation. The following algorithm uses a STACK to hold operands while evaluating P.

Algorithm: This algorithm finds the VALUE of an arithmetic expression P written in postfix notation.
Step 1: Add a right parenthesis ")" at the end of P
Step 2: Scan P from left to right and repeat Steps 3 and 4 for each element of P until ")" is encountered
Step 3: If an operand is encountered, put it on STACK
Step 4: If an operator ⊗ is encountered, then:
          (a) Remove the two top elements of STACK, where A is the top element and B is the next-to-top element
          (b) Evaluate B ⊗ A
          (c) Place the result of (b) back on STACK
        [End of If structure]
        [End of Step 2 Loop]
Step 5: Set VALUE equal to the top element of STACK
Step 6: Exit
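A sketch of the evaluation algorithm in Python (illustrative, not from the text), using a list as the stack; the expression is assumed to be pre-split into tokens, so the end-of-input check replaces the sentinel ")":

```python
def eval_postfix(tokens):
    """Evaluate a postfix expression given as a list of tokens."""
    stack = []
    ops = {'+': lambda b, a: b + a, '-': lambda b, a: b - a,
           '*': lambda b, a: b * a, '/': lambda b, a: b / a}
    for tok in tokens:
        if tok in ops:
            a = stack.pop()            # A: the top element
            b = stack.pop()            # B: the next-to-top element
            stack.append(ops[tok](b, a))   # evaluate B op A, push the result
        else:
            stack.append(float(tok))   # operand: put it on the stack
    return stack.pop()                 # VALUE is the top element
```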

Transforming Infix Expression into Postfix Expression: The algorithm uses a stack to temporarily hold operators and left parentheses. The postfix expression P is constructed from left to right using the operands from Q and the operators removed from STACK. The algorithm begins by pushing a left parenthesis onto STACK and adding a right parenthesis at the end of Q.
Algorithm: POSTFIX(Q, P)
Suppose Q is an arithmetic expression written in infix notation. This algorithm finds the equivalent postfix expression P.
Step 1: Push "(" onto STACK and add ")" to the end of Q
Step 2: Scan Q from left to right and repeat Steps 3 to 6 for each element of Q until STACK is empty
Step 3: If an operand is encountered, add it to P
Step 4: If a left parenthesis is encountered, push it onto STACK

Step 5: If an operator ⊗ is encountered, then:
          (a) Repeatedly pop from STACK and add to P each operator (on the top of STACK) which has the same or higher precedence than ⊗
          (b) Add ⊗ to STACK
Step 6: If a right parenthesis is encountered, then:
          (a) Repeatedly pop from STACK and add to P each operator until a left parenthesis is encountered
          (b) Remove the left parenthesis
        [End of Step 2 Loop]
Step 7: Exit
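A minimal Python rendering of the POSTFIX algorithm (illustrative, not from the text); it assumes single-token operands and the four binary operators, with * and / binding tighter than + and -:

```python
def infix_to_postfix(tokens):
    """Convert an infix token list to a postfix token list."""
    prec = {'+': 1, '-': 1, '*': 2, '/': 2}
    stack, out = ['('], []             # Step 1: push "(" onto STACK...
    tokens = list(tokens) + [')']      # ...and add ")" to the end of Q
    for tok in tokens:
        if tok == '(':
            stack.append(tok)          # Step 4: push a left parenthesis
        elif tok == ')':
            while stack[-1] != '(':    # Step 6(a): pop until "("
                out.append(stack.pop())
            stack.pop()                # Step 6(b): discard the "("
        elif tok in prec:
            # Step 5(a): pop operators of same or higher precedence
            while stack[-1] in prec and prec[stack[-1]] >= prec[tok]:
                out.append(stack.pop())
            stack.append(tok)          # Step 5(b): push the operator
        else:
            out.append(tok)            # Step 3: operand goes straight to P
    return out
```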

Example: Convert Q = A+(B*C)/D to its corresponding postfix form.
Solution: Put ")" at the end of Q and push "(" onto STACK. Scanning from left to right:
Operand A: add it to P
Operator +: move it to STACK, as no operator is there
"(": push it onto STACK
Operand B: add it to P
Operator *: move it to STACK, as no operator is above "("
Operand C: add it to P
")": pop from STACK and add to P until "(" is encountered; pop "(" as well
Operator /: the precedence of "/" is higher than that of "+" on STACK, so no pop is possible; push "/" onto STACK
Operand D: add it to P
Right parenthesis ")": pop all remaining operators and add them to P until "(" is encountered; also remove "(" from STACK

Resulting postfix expression: P = A B C * D / +

Transforming Infix Expression into Prefix Expression
Algorithm: [Polish Notation] PREFIX(Q, P)
Suppose Q is an arithmetic expression written in infix notation. This algorithm finds the equivalent prefix expression P.
Step 1: Reverse the input string
Step 2: Examine the next element in the input
Step 3: If it is an operand, add it to the output string
Step 4: If it is a closing parenthesis, push it on STACK
Step 5: If it is an operator, then:
          (i) If STACK is empty, push the operator on STACK
          (ii) If the top of STACK is a closing parenthesis, push the operator on STACK
          (iii) If it has the same or higher priority than the top of STACK, push the operator on STACK
          Else pop the operator from STACK and add it to the output string, and repeat Step 5

Step 6: If it is an opening parenthesis, pop operators from STACK and add them to the output string until a closing parenthesis is encountered; pop and discard the closing parenthesis
Step 7: If there is more input, go to Step 2
Step 8: If there is no more input, unstack the remaining operators and add them to the output string
Step 9: Reverse the output string

Consider the following arithmetic expression P written in postfix notation: P: 12, 7, 3, -, /, 2, 1, 5, +, *, +
(a) Translate P, by inspection and hand, into its equivalent infix expression
(b) Evaluate the infix expression
Solution: (a) Scanning from left to right, translate each operator from postfix to infix notation:
P = 12, [7-3], /, 2, 1, 5, +, *, +
  = [12/[7-3]], 2, [1+5], *, +
  = 12/(7-3) + 2*(1+5)
(b) 12/(7-3) + 2*(1+5) = [3] + [2*6] = 3 + 12 = 15

Practical applications of stack


Stacks are used for implementing function calls in a program Used for implementing recursion. Used for conversion of infix expression to its postfix or prefix form Used for evaluation of postfix expression. Used in sorting of arrays (e.g., managing the subarray bounds in the quicksort technique)

QUEUE

Queue- A queue is a linear list of elements in which insertions can take place at one end, called the rear of the queue, and deletions can take place only at the other end, called the front of the queue. Queues are also called FIFO (First In, First Out) lists, since the first element in a queue is the first element out. An important example of a queue in computer science occurs in time-sharing systems, in which programs with the same priority form a queue while waiting to be executed. Queues may be represented in the computer in various ways, usually by means of one-way lists or linear arrays.

Representing a Queue Using an Array


A queue is maintained by a linear array QUEUE and two pointer variables: FRONT, containing the location of the front element of the queue, and REAR, containing the location of the rear element of the queue. The condition FRONT=NULL indicates that the queue is empty. Whenever an element is deleted from the queue, the value of FRONT is increased by 1. Similarly, whenever an element is added to the queue, the value of REAR is increased by 1.

Queue as a circular queue


It can be seen that after N insertions in a queue represented by an array of N elements, the rear element of the queue occupies the last part of the array. This occurs even though the queue itself may not contain many elements. If we now want to insert an element ITEM into the queue, we would have to move or rearrange the elements of the entire queue to the beginning of the array, which may be very expensive. Another method is to represent the queue as a circular queue, i.e., QUEUE[1] comes after QUEUE[N] in the array. With this assumption, if REAR=N, we insert ITEM by assigning it to QUEUE[1]: instead of increasing REAR to N+1, we reset REAR:=1 and then assign QUEUE[REAR]:=ITEM. Similarly, if FRONT=N and an element of QUEUE is deleted, we reset FRONT:=1 instead of increasing FRONT to N+1.

Algorithm for Inserting in a QUEUE Algorithm: QINSERT(QUEUE, N, FRONT, REAR,ITEM) This algorithm inserts an element in a linear queue Step 1:[Queue already filled] If REAR=N, then: Write: OVERFLOW Exit Step 2: If FRONT=NULL, then: [Queue initially empty] Set FRONT:=1 and REAR:=1 Else: Set REAR:=REAR+1 [End of If structure] Step 3: Set QUEUE[REAR]:=ITEM Step 4: Return

Algorithm: QDELETE(QUEUE,N,FRONT,REAR,ITEM) This algorithm deletes an element from a queue Step 1: If FRONT=NULL, then: Write: UNDERFLOW Exit Step 2: Set ITEM:=QUEUE[FRONT] Step 3: If FRONT=REAR, then: [Empty Queue] Set FRONT:=NULL and REAR:=NULL Else: Set FRONT:=FRONT+1 [End of If structure] Step 4: Return

Algorithm: QINSERT(QUEUE, N, FRONT, REAR, ITEM)
This algorithm inserts an element in a circular queue.
Step 1: [Queue already filled] If (FRONT=1 and REAR=N) or FRONT=REAR+1, then: Write: OVERFLOW and Exit
Step 2: If FRONT=NULL, then: [Queue initially empty] Set FRONT:=1 and REAR:=1
        Else If REAR=N, then: Set REAR:=1
        Else: Set REAR:=REAR+1
        [End of If structure]
Step 3: Set QUEUE[REAR]:=ITEM
Step 4: Return

Algorithm: QDELETE(QUEUE,N,FRONT,REAR,ITEM) This algorithm deletes an element from a circular queue Step 1: If FRONT=NULL, then: Write: UNDERFLOW Exit Step 2: Set ITEM:=QUEUE[FRONT] Step 3: If FRONT=REAR, then: [Empty Queue] Set FRONT:=NULL and REAR:=NULL Else If FRONT=N, then: Set FRONT:=1 Else: Set FRONT:=FRONT+1 [End of If structure] Step 4: Return
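The circular QINSERT/QDELETE pair can be mirrored in a Python sketch (illustrative, not from the text); indices are 0-based and None stands in for NULL:

```python
class CircularQueue:
    """Circular queue in an array of n cells."""
    def __init__(self, n):
        self.q = [None] * n
        self.n = n
        self.front = self.rear = None  # None plays the role of NULL

    def insert(self, item):
        # Queue already filled?
        if (self.front == 0 and self.rear == self.n - 1) or \
           (self.front is not None and self.front == self.rear + 1):
            raise OverflowError("queue overflow")
        if self.front is None:         # queue initially empty
            self.front = self.rear = 0
        elif self.rear == self.n - 1:  # wrap REAR around to the start
            self.rear = 0
        else:
            self.rear += 1
        self.q[self.rear] = item

    def delete(self):
        if self.front is None:
            raise IndexError("queue underflow")
        item = self.q[self.front]
        if self.front == self.rear:    # that was the last element
            self.front = self.rear = None
        elif self.front == self.n - 1: # wrap FRONT around to the start
            self.front = 0
        else:
            self.front += 1
        return item
```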

Consider the following queue of characters where QUEUE is a circular array which is allocated six memory cells FRONT=2, REAR=4 QUEUE: _ A C D _ _ Describe the queue as following operations take place: (a) F is added to queue (b) Two letters are deleted (c) K , L and M are added (d) Two letters are deleted (e) R is added to queue (f) Two letters are deleted (g) S is added to queue (h) Two letters are deleted (i) One letter is deleted (j) One letter is deleted

Solution:
(a) FRONT=2, REAR=5; QUEUE: _ A C D F _
(b) FRONT=4, REAR=5; QUEUE: _ _ _ D F _
(c) FRONT=4, REAR=2; QUEUE: L M _ D F K
(d) FRONT=6, REAR=2; QUEUE: L M _ _ _ K
(e) FRONT=6, REAR=3; QUEUE: L M R _ _ K
(f) FRONT=2, REAR=3; QUEUE: _ M R _ _ _
(g) FRONT=2, REAR=4; QUEUE: _ M R S _ _
(h) FRONT=4, REAR=4; QUEUE: _ _ _ S _ _
(i) FRONT=REAR=0 [as FRONT=REAR, the deleted letter was the last one; the queue is now empty]
(j) Since FRONT=NULL, no deletion can take place: underflow occurs

DEQUE (Double-ended Queue)- A deque is a queue in which elements can be added or removed at either end but not in the middle. A deque is usually maintained by a circular array DEQUE with pointers LEFT and RIGHT, which point to the two ends of the deque. The elements extend from the LEFT end to the RIGHT end of the deque. The term circular comes from the fact that DEQUE[1] comes after DEQUE[N]. The condition LEFT=NULL is used to indicate that a deque is empty. There are two variations of a deque: Input-restricted deque- a deque which allows insertions at only one end of the list but allows deletions at both ends. Output-restricted deque- a deque which allows deletions at only one end of the list but allows insertions at both ends.
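As a concrete illustration (not from the text), Python's built-in collections.deque supports additions and deletions at both ends; restricting which methods are called gives input- or output-restricted behaviour:

```python
from collections import deque

dq = deque('ACD')      # front (left) is 'A', rear (right) is 'D'
dq.append('F')         # insert at the right end
dq.appendleft('K')     # insert at the left end
right = dq.pop()       # delete from the right end
left = dq.popleft()    # delete from the left end
# An input-restricted deque would confine insertions to append() only;
# an output-restricted deque would confine deletions to pop() only.
```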

Consider the following deque of characters where DEQUE is a circular array which is allocated six memory cells. LEFT=2, RIGHT=4; DEQUE: _ A C D _ _
Describe the deque while the following operations take place:
(a) F is added to the right of the deque: LEFT=2, RIGHT=5; _ A C D F _
(b) Two letters on the right are deleted: LEFT=2, RIGHT=3; _ A C _ _ _
(c) K, L and M are added to the left of the deque: LEFT=5, RIGHT=3; K A C _ M L
(d) One letter on the left is deleted: LEFT=6, RIGHT=3; K A C _ _ L
(e) R is added to the left of the deque: LEFT=5, RIGHT=3; K A C _ R L
(f) S is added to the right of the deque: LEFT=5, RIGHT=4; K A C S R L
(g) T is added to the right of the deque: since LEFT=RIGHT+1, the array is full and T cannot be added to the deque

Linked representation of the Queue A linked queue is a queue implemented as a linked list with two pointer variables FRONT and REAR pointing to the nodes in the front and rear of the queue. The INFO field of list hold the elements of the queue and LINK field holds pointer to neighboring element of queue. In case of insertion in linked queue, a node borrowed from AVAIL list and carrying the item to be inserted is added as the last node of linked list representing the queue. Rear pointer is updated to point to last node just added to the list In case of deletion, first node of list pointed to by FRONT is deleted and FRONT pointer is updated to point to next node in the list. Unlike the array representation, linked queue functions as a linear queue and there is no need to view it as circular for efficient management of space.

Algorithm: LINKQINSRT(INFO, LINK, FRONT, REAR, AVAIL, ITEM)
This algorithm inserts an item in a linked list implementation of the queue.
Step 1: If AVAIL=NULL, then: Write: OVERFLOW and Exit
Step 2: Set NEW:=AVAIL and AVAIL:=LINK[AVAIL]
Step 3: Set INFO[NEW]:=ITEM and LINK[NEW]:=NULL
Step 4: If FRONT=NULL, then: Set FRONT:=NEW and REAR:=NEW
        Else: Set LINK[REAR]:=NEW and REAR:=NEW
Step 5: Return

Algorithm: LINKQDEL(INFO,LINK,FRONT,AVAIL,ITEM) This algorithm deletes an element from the front of the queue Step 1: If FRONT=NULL,then: Write:UNDERFLOW Exit Step 2: Set TEMP:=FRONT Step 3: Set ITEM:=INFO[FRONT] Step 4: Set FRONT:=LINK[FRONT] Step 5: Set LINK[TEMP]:=AVAIL and AVAIL:=TEMP Step 6: Return
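A compact Python sketch of the linked queue (illustrative names, with dynamic allocation standing in for the AVAIL list); insertion appends after REAR, deletion removes the node at FRONT:

```python
class QNode:
    def __init__(self, info):
        self.info, self.link = info, None

class LinkedQueue:
    """Queue as a singly linked list with FRONT and REAR pointers."""
    def __init__(self):
        self.front = self.rear = None

    def insert(self, item):
        new = QNode(item)
        if self.front is None:         # empty queue: both pointers hit NEW
            self.front = self.rear = new
        else:
            self.rear.link = new       # append after the current rear
            self.rear = new

    def delete(self):
        if self.front is None:
            raise IndexError("queue underflow")
        item = self.front.info
        self.front = self.front.link   # advance FRONT to the next node
        if self.front is None:         # the list became empty
            self.rear = None
        return item
```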

Priority Queue- A priority queue is a collection of elements such that each element has been assigned a priority, and the order in which elements are deleted and processed is determined by the following rules: An element of higher priority is processed before any element of lower priority. Two elements of the same priority are processed according to the order in which they were added to the queue. An example of a priority queue is a time-sharing system: programs of higher priority are processed first, and programs with the same priority form a standard queue.

One-way list representation of a priority queue
One way to maintain a priority queue in memory is by means of a one-way list. Each node in the list contains three items of information: an information field INFO, a priority number PRN and a link field LINK. A node X precedes a node Y in the list if X has higher priority than Y, or when both have the same priority but X was added to the list before Y.

Algorithm: LKQINS(INFO, LINK, FRONT, PRN, AVAIL, ITEM, P)
This algorithm inserts an item in the one-way list implementation of a priority queue
Step 1: If AVAIL=NULL, then: Write: OVERFLOW and Exit
Step 2: Set NEW:=AVAIL and AVAIL:=LINK[AVAIL]
Step 3: [Enter the data and priority of the new node] Set INFO[NEW]:=ITEM and PRN[NEW]:=P
Step 4: Set PTR:=FRONT
Step 5: If PRN[PTR]>PRN[NEW], then:
            Set LINK[NEW]:=FRONT and FRONT:=NEW
            Return
        [End of If structure]
Step 6: Repeat while PTR≠NULL and PRN[PTR]<=PRN[NEW]:
            Set SAVE:=PTR and PTR:=LINK[PTR]
        [End of Loop]
Step 7: If PTR≠NULL, then:
            Set LINK[SAVE]:=NEW and LINK[NEW]:=PTR
        Else:
            Set LINK[SAVE]:=NEW and LINK[NEW]:=NULL
        [End of If structure]
Step 8: Return
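A hedged Python sketch of the same one-way-list scheme: the list is kept sorted by priority number (assuming, as is conventional, that a lower PRN means higher priority), and the scan skips past equal priorities so that FIFO order is preserved among them. All names are illustrative:

```python
class PNode:
    def __init__(self, info, prn):
        self.info = info   # data part
        self.prn = prn     # priority number (lower value = higher priority)
        self.link = None

def pq_insert(front, info, prn):
    """Insert into a one-way list kept sorted by priority; returns the new FRONT.
    Equal priorities keep FIFO order, as the text requires."""
    new = PNode(info, prn)
    if front is None or prn < front.prn:    # new node becomes the first node
        new.link = front
        return new
    save, ptr = front, front.link
    while ptr is not None and ptr.prn <= prn:   # skip nodes of equal priority too
        save, ptr = ptr, ptr.link
    save.link = new                         # splice in between SAVE and PTR
    new.link = ptr
    return front

def pq_delete(front):
    """The highest-priority element is always at FRONT; returns (item, new FRONT)."""
    if front is None:
        raise IndexError("UNDERFLOW")
    return front.info, front.link
```

Deletion is O(1) because the list ordering does all the work at insertion time, which is the main appeal of this representation.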

Another way to maintain a priority queue in memory is to use a separate queue for each level of priority. Each such queue will appear in its own circular array and must have its own pair of pointers, FRONT and REAR. If each queue is allocated the same amount of space, a two-dimensional array QUEUE can be used instead of the linear arrays. If K represents row K of the array, FRONT[K] and REAR[K] are the front and rear indices of the Kth row. For example, a priority queue with 5 priority levels and 6 columns might contain: row 1: AAA; row 2: BBB, CCC, XXX; row 3: empty; row 4: FFF, DDD, EEE; row 5: GGG.


Algorithm: QINSERT(QUEUE, N, FRONT, REAR, ITEM, K)
This algorithm inserts an element in a priority queue in the row with priority K. N is the size of the Kth row.
Step 1: [Queue already filled?] If FRONT[K]=1 and REAR[K]=N, or FRONT[K]=REAR[K]+1, then: Write: OVERFLOW and Exit
Step 2: If FRONT[K]=NULL, then: [Queue initially empty]
            Set FRONT[K]:=1 and REAR[K]:=1
        Else If REAR[K]=N, then: Set REAR[K]:=1
        Else: Set REAR[K]:=REAR[K]+1
        [End of If structure]
Step 3: Set QUEUE[K][REAR[K]]:=ITEM
Step 4: Return

Algorithm: QDELETE(QUEUE, N, FRONT, REAR, ITEM, MAXP)
This algorithm deletes an element from a priority queue. MAXP is the maximum priority in the array.
Step 1: Set K:=1 [Priority number]
Step 2: Repeat while K<=MAXP and FRONT[K]=NULL:
            Set K:=K+1
        [End of Loop]
Step 3: If K>MAXP, then: Write: UNDERFLOW and Exit
        [End of If structure]
Step 4: Set ITEM:=QUEUE[K][FRONT[K]]
Step 5: If FRONT[K]=REAR[K], then: [Row now empty]
            Set FRONT[K]:=NULL and REAR[K]:=NULL
        Else If FRONT[K]=N, then: Set FRONT[K]:=1
        Else: Set FRONT[K]:=FRONT[K]+1
        [End of If structure]
Step 6: Return

TREE

A tree is a non-linear data structure mainly used to represent data containing a hierarchical relationship between elements. In hierarchical data we have ancestor-descendant, superior-subordinate, whole-part, or similar relationships among data elements. A (general) tree T is defined as a finite nonempty set of elements such that there is a special node at the highest level of the hierarchy called the root, and the remaining elements, if any, are partitioned into disjoint sets T1, T2, ..., Tn, where each of these sets is a tree, called a subtree of T. In other words, one may define a tree as a collection of nodes in which each node is connected to another node through a branch, the nodes are connected in such a way that there are no loops, and there is a distinguished node called the root of the tree.

Tree Terminology
Parent node- If N is a node in T with left successor S1 and right successor S2, then N is called the father or parent of S1 and S2. Similarly, S1 is called the left child of N and S2 the right child of N. A child node is also called a descendant of the node N.
Siblings- Child nodes with the same parent are called siblings.
Level of element- Each node in a tree is assigned a level number. By definition, the root of the tree is at level 0; its children, if any, are at level 1; their children, if any, are at level 2; and so on. Thus a node is assigned a level number one more than the level number of its parent.
Depth/Height of tree- The height or depth of a tree is the maximum number of nodes in a branch. It is one more than the maximum level number of the tree.
Degree of an element- The degree of a node in a tree is the number of children it has. The degree of a leaf node is zero.
Degree of tree- The degree of a tree is the maximum degree of its nodes.
Edge- A line drawn from a node N of T to a successor is called an edge.
Path- A sequence of edges is called a path.
Leaf- A terminal node of a tree is called a leaf node.
Branch- A path ending in a leaf is called a branch of the tree.

The most common form of tree maintained in a computer is the binary tree.
Binary Tree- A binary tree T is defined as a finite set of elements, called nodes, such that either: T is empty (called the null tree or empty tree), or T contains a distinguished node R, called the root of T, and the remaining nodes of T form an ordered pair of disjoint binary trees T1 and T2. The two trees T1 and T2 are called respectively the left and right subtrees of R. If T1 is nonempty, then its root is called the left successor of R; similarly, if T2 is nonempty, then its root is called the right successor of R. (Figure: a binary tree with root A, left successor B and right successor C; the nodes D, F, G, L, K are the terminal or leaf nodes.)

Binary trees are used to represent algebraic expressions involving only binary operations, such as E = (a-b)/((c*d)+e). Each operator in E appears as an internal node of T whose left and right subtrees correspond to its operands, and each variable or constant appears as a leaf node:

           /
         /   \
        -     +
       / \   / \
      a   b *   e
           / \
          c   d

Before constructing a tree for an algebraic expression, we have to see the precedence of the operators involved in the expression.

Difference between binary tree and a general tree A binary tree can be empty whereas a tree cannot be empty Each element in binary tree has at most two sub trees whereas each element in a tree can have any number of sub trees The sub trees of each element in a binary tree are ordered. That is we can distinguish between left and right sub trees. The sub trees in a tree are unordered.

Properties of Binary Trees
Each node of a binary tree T can have at most two children. Thus at level r of T there can be at most 2^r nodes. The maximum number of nodes in a binary tree with n levels is 2^n - 1. The minimum depth of a tree T with n nodes is given by Dn = floor(log2 n) + 1.
Complete Binary tree- A binary tree T is said to be complete if all its levels, except possibly the last, have the maximum number of possible nodes, and if all the nodes at the last level appear as far left as possible. Thus there is a unique complete tree T with exactly n nodes.
Extended Binary Trees: 2-Trees- A binary tree is said to be a 2-tree or an extended binary tree if each node N has either 0 or 2 children. In such a case, nodes with 2 children are called internal nodes, and nodes with 0 children are called external nodes. The external and internal nodes are distinguished diagrammatically by using circles for internal nodes and squares for external nodes.
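The counting formulas above can be checked with a few one-line helpers (a sketch; `min_depth` assumes the root is counted as one level, matching the text's definition of depth):

```python
import math

def max_nodes_at_level(r):
    # Level r (root at level 0) holds at most 2^r nodes
    return 2 ** r

def max_nodes_with_n_levels(n):
    # A binary tree with n levels has at most 2^1 + 2^2 + ... summed = 2^n - 1 nodes
    return 2 ** n - 1

def min_depth(n):
    # Minimum depth of a binary tree holding n nodes: floor(log2 n) + 1
    return math.floor(math.log2(n)) + 1
```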

Representing Binary Trees in memory
Binary trees can be represented in memory either by a linked representation or by a single linear array, called the sequential representation of the tree.
Sequential representation of Binary Trees- This representation uses only a single linear array TREE as follows: The root R of T is stored in TREE[1]. If a node N occupies TREE[K], then its left child is stored in TREE[2*K] and its right child is stored in TREE[2*K+1]. For example, the tree with root 45 and nodes 22, 77, 11, 30, 90, 15, 25, 88 is stored as:

Index: 1   2   3   4   5   6   7   8   9   10  11  12  13  14
TREE:  45  22  77  11  30  -   90  -   15  25  -   -   -   88

(A dash marks an unused location; NULL entries would additionally be stored for the missing successors of the terminal nodes.)
It can be seen that the sequential representation of a binary tree requires a numbering of the nodes, starting with the nodes on level 1, then on level 2, and so on, numbering the nodes on each level from left to right. It is ideal for a complete binary tree, in which case no space is wasted. For other binary trees, however, most of the space remains unutilized. As can be seen in the figure, we require 14 locations in the array even though the tree has only 9 nodes; if null entries for the successors of the terminal nodes are included, we would actually require 29 locations instead of 14. Thus the sequential representation is usually inefficient unless the binary tree is complete or nearly complete.
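The 9-node example can be held in a 1-indexed Python list, with the child and parent locations computed exactly as described (index 0 is left unused so the root sits at TREE[1]; `None` marks an empty slot):

```python
# Sequential (array) representation of the 9-node example tree from the text.
TREE = [None, 45, 22, 77, 11, 30, None, 90, None, 15, 25, None, None, None, 88]

def left(k):
    return 2 * k        # left child of the node stored at index k

def right(k):
    return 2 * k + 1    # right child of the node stored at index k

def parent(k):
    return k // 2       # parent of the node stored at index k (integer division)
```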

Linked representation of Binary Tree In linked representation, Tree is maintained in memory by means of three parallel arrays, INFO, LEFT and RIGHT and a pointer variable ROOT. Each node N of T will correspond to a location K such that INFO[K] contains data at node N. LEFT[K] contains the location of left child of node N and RIGHT[K] contains the location of right child of node N. ROOT will contain location of root R of Tree. If any subtree is empty, corresponding pointer will contain null value. If the tree T itself is empty, then ROOT will contain null value
(Figure: linked representation of a binary tree with nodes A, B, C, D, F, G, H, I, J, with ROOT pointing to the node A.)

Traversing Binary Trees There are three standard ways of traversing a binary tree T with root R. These are preorder, inorder and postorder traversals Preorder PROCESS the root R Traverse the left sub tree of R in preorder Traverse the right sub tree of R in preorder Inorder Traverse the left sub tree of R in inorder Process the root R Traverse the right sub tree of R in inorder Postorder Traverse the left sub tree of R in postorder Traverse the right sub tree of R in postorder Process the root R

The difference between the algorithms is the time at which the root R is processed. In the preorder algorithm, R is processed before the subtrees are traversed; in the inorder algorithm, R is processed between the traversals of the subtrees; and in the postorder algorithm, R is processed after the subtrees are traversed. For the example tree A(B(D, E), C(-, F)):
Preorder Traversal: A B D E C F
Inorder Traversal: D B E A C F
Postorder Traversal: D E B F C A
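The three traversals can be sketched recursively in Python; the only difference between them is where the root's value is emitted (node class and names are illustrative):

```python
class Node:
    def __init__(self, info, left=None, right=None):
        self.info, self.left, self.right = info, left, right

def preorder(n):
    # root, then left subtree, then right subtree
    return [] if n is None else [n.info] + preorder(n.left) + preorder(n.right)

def inorder(n):
    # left subtree, then root, then right subtree
    return [] if n is None else inorder(n.left) + [n.info] + inorder(n.right)

def postorder(n):
    # left subtree, then right subtree, then root
    return [] if n is None else postorder(n.left) + postorder(n.right) + [n.info]

# The six-node example tree from the text: A(B(D, E), C(-, F))
root = Node("A", Node("B", Node("D"), Node("E")), Node("C", None, Node("F")))
```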

All the traversal algorithms assume a binary tree T maintained in memory by linked representation TREE(INFO,LEFT,RIGHT,ROOT) All algorithms use a variable PTR(pointer) which will contain the location of the node N currently being scanned. LEFT[N] denotes the left child of node N and RIGHT[N] denotes the right child of N. All algorithms use an array STACK which will hold the addresses of nodes for further processing.

Algorithm: PREORD(INFO, LEFT, RIGHT, ROOT)


This algorithm traverses the tree in preorder
Step 1: Set TOP:=1, STACK[1]:=NULL and PTR:=ROOT
Step 2: Repeat Steps 3 to 5 while PTR≠NULL
Step 3: Apply PROCESS to INFO[PTR]
Step 4: [Right child?] If RIGHT[PTR]≠NULL, then:
            Set TOP:=TOP+1 and STACK[TOP]:=RIGHT[PTR]
        [End of If structure]
Step 5: [Left child?] If LEFT[PTR]≠NULL, then:
            Set PTR:=LEFT[PTR]
        Else:
            Set PTR:=STACK[TOP] and TOP:=TOP-1
        [End of If structure]
[End of Step 2 Loop]
Step 6: Return
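A Python rendering of PREORD's stack-based loop, using parallel lists indexed from 1 with 0 standing in for NULL (a sketch, not the text's exact notation):

```python
def preorder_iterative(info, left, right, root):
    """Stack-based preorder matching PREORD: info/left/right are parallel
    lists indexed from 1; index 0 plays the role of NULL."""
    out, stack, ptr = [], [0], root           # Step 1: stack holds a NULL sentinel
    while ptr != 0:                           # Step 2
        out.append(info[ptr])                 # Step 3: process the current node
        if right[ptr] != 0:
            stack.append(right[ptr])          # Step 4: save the right child
        if left[ptr] != 0:
            ptr = left[ptr]                   # Step 5: descend to the left child
        else:
            ptr = stack.pop()                 # otherwise resume from the stack
    return out
```

Popping the sentinel 0 is what ends the loop, mirroring the NULL stored in STACK[1].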

Algorithm: INORD(INFO, LEFT, RIGHT, ROOT)
Step 1: Set TOP:=1, STACK[1]:=NULL and PTR:=ROOT
Step 2: Repeat while PTR≠NULL:
            (a) Set TOP:=TOP+1 and STACK[TOP]:=PTR
            (b) Set PTR:=LEFT[PTR]
        [End of Loop]
Step 3: Set PTR:=STACK[TOP] and TOP:=TOP-1
Step 4: Repeat Steps 5 to 7 while PTR≠NULL
Step 5: Apply PROCESS to INFO[PTR]
Step 6: If RIGHT[PTR]≠NULL, then:
            (a) Set PTR:=RIGHT[PTR]
            (b) Go to Step 2
        [End of If structure]
Step 7: Set PTR:=STACK[TOP] and TOP:=TOP-1
[End of Step 4 Loop]
Step 8: Return

Algorithm : POSTORD( INFO, LEFT, RIGHT, ROOT)


Step 1: Set TOP:=1, STACK[1]:=NULL and PTR:=ROOT
Step 2: Repeat Steps 3 to 5 while PTR≠NULL
Step 3: Set TOP:=TOP+1 and STACK[TOP]:=PTR
Step 4: If RIGHT[PTR]≠NULL, then:
            Set TOP:=TOP+1 and STACK[TOP]:=-RIGHT[PTR]
        [End of If structure]
Step 5: Set PTR:=LEFT[PTR]
[End of Step 2 Loop]
Step 6: Set PTR:=STACK[TOP] and TOP:=TOP-1
Step 7: Repeat while PTR>0:
            (a) Apply PROCESS to INFO[PTR]
            (b) Set PTR:=STACK[TOP] and TOP:=TOP-1
        [End of Loop]
Step 8: If PTR<0, then:
            (a) Set PTR:=-PTR
            (b) Go to Step 2
        [End of If structure]
Step 9: Exit

Problem: Create a tree from the given traversals: preorder: F A E K C D H G B; inorder: E A C K F H D B G
Solution: The tree is drawn from the root as follows:
(a) The root of the tree is obtained by choosing the first node of preorder. Thus F is the root of the proposed tree.
(b) Use the inorder traversal to find the nodes to the left and right of the chosen root. All nodes to the left of the root (here F) in inorder form the left subtree of the root (here E A C K), and all nodes to the right of the root in inorder form the right subtree (here H D B G).
(c) Follow the above procedure again to find the subsequent roots and their left and right subtrees.

F is the root.
Nodes in the left subtree (left of F in inorder): E A C K
Nodes in the right subtree (right of F in inorder): H D B G
Root of the left subtree: from preorder (A E K C), the root is A.
Root of the right subtree: from preorder (D H G B), the root is D.
Creating the left subtree first: from inorder, the element in the left subtree of A is E, and the elements in the right subtree of A are C and K. From preorder (K C), K is the root of this subtree, and since C precedes K in inorder, C is the left child of K.
Creating the right subtree of F: the root node is D. From inorder, the node on the left of D is H, and the nodes on the right of D are B and G. From preorder (G B), G is the root of this subtree, and since B precedes G in inorder, B is the left child of G.
Thus the tree is:

            F
          /   \
         A     D
        / \   / \
       E   K H   G
          /     /
         C     B
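Steps (a)-(c) of the construction can be sketched as a short recursive function returning the tree as nested (root, left, right) tuples (an illustrative encoding, not from the text):

```python
def build(preorder, inorder):
    """Reconstruct a binary tree from its preorder and inorder traversals,
    following steps (a)-(c) above. Assumes node labels are distinct."""
    if not preorder:
        return None
    root = preorder[0]                     # (a) first node of preorder is the root
    i = inorder.index(root)                # (b) split inorder around the root
    left = build(preorder[1:i + 1], inorder[:i])
    right = build(preorder[i + 1:], inorder[i + 1:])
    return (root, left, right)             # (c) recurse on both halves

tree = build(list("FAEKCDHGB"), list("EACKFHDBG"))
```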

Threads: Inorder Threading
Considering the linked list representation of a binary tree, it can be seen that half of the entries in the pointer fields LEFT and RIGHT will contain null entries. This space may be used more efficiently by replacing the null entries with special pointers, called threads, which point to nodes higher in the tree. Such trees are called threaded trees. The threads in a threaded tree are usually indicated by dotted lines. In computer memory, threads may be represented by negative integers while ordinary pointers are denoted by positive integers. There are many ways to thread a binary tree T, but each threading corresponds to a particular traversal of T. Trees can be threaded using one-way threading or two-way threading. Unless otherwise stated, threading will correspond to the inorder traversal of T. Accordingly, in one-way threading, a thread will appear in the right null field of a node and will point to the next node in the inorder traversal of T; in two-way threading, a thread will also appear in the LEFT field of a node and will point to the preceding node in the inorder traversal of T.

(Figure: one-way inorder threading of a tree with nodes A, B, C, D, E, F, G, H, J, K, L. Inorder traversal: D B F E A G C L J H K)

(Figure: two-way inorder threading of the same tree.)

Binary Search Tree- If T is a binary tree, then T is called a binary search tree or binary sorted tree if each node N of T has the following property: the value at N is greater than every value in the left subtree of N, and the value at N is less than or equal to every value in the right subtree of N. The inorder traversal of a BST gives the values in sorted order. For example, the numbers 3, 5, 9, 1, 2, 6, 8, 10, inserted in that order, create the BST:

        3
       / \
      1   5
       \    \
        2    9
            / \
           6   10
            \
             8

The binary search tree is one of the most important data structures in computer science. This structure enables one to search for and find an element with an average running time of O(log2 n). It also enables one to easily insert and delete elements. This structure contrasts with the following structures:
Sorted linear array- here one can find an element with a running time of O(log2 n), but it is expensive to insert and delete.
Linked list- here one can easily insert and delete, but searching is expensive, with a running time of O(n).

Searching and Inserting in a BST Algorithm: This algorithm searches for ITEM in a tree and inserts it if not present in tree Step 1: Compare ITEM with root node N of Tree (i) If ITEM < N, proceed to left child of N (ii) If ITEM >= N, proceed to right child of N Step 2: Repeat step 1 until one of the following occurs: (i) If ITEM = N, then: Write: Search successful (ii) Empty sub tree found indicating search unsuccessful. Insert item in place of empty sub tree

Algorithm: INSBT(INFO, LEFT, RIGHT, AVAIL, ITEM, LOC)


This algorithm finds the location LOC of ITEM in T or adds ITEM as a new node in T at location LOC
Step 1: Call FIND(INFO, LEFT, RIGHT, ROOT, ITEM, LOC, PAR)
Step 2: If LOC≠NULL, then Return
Step 3: [Copy ITEM into a new node from the AVAIL list]
        (a) If AVAIL=NULL, then: Write: OVERFLOW and Return
        (b) Set NEW:=AVAIL, AVAIL:=LINK[AVAIL] and INFO[NEW]:=ITEM
        (c) Set LEFT[NEW]:=NULL and RIGHT[NEW]:=NULL
Step 4: [Add ITEM to the tree]
        If PAR=NULL, then: Set ROOT:=NEW
        Else If ITEM<INFO[PAR], then: Set LEFT[PAR]:=NEW
        Else: Set RIGHT[PAR]:=NEW
        [End of If structure]
Step 5: Return

Algorithm: FIND(INFO,LEFT,RIGHT,ROOT,ITEM,LOC,PAR)
This algorithm finds the location LOC of ITEM in T and also the location PAR of the parent of ITEM. There are three special cases:
(a) LOC=NULL and PAR=NULL indicates that the tree is empty
(b) LOC≠NULL and PAR=NULL indicates that ITEM is the root of T
(c) LOC=NULL and PAR≠NULL indicates that ITEM is not in T and can be added to T as a child of the node with location PAR
Step 1: If ROOT=NULL, then: Set LOC:=NULL and PAR:=NULL, and Return
Step 2: If ITEM=INFO[ROOT], then: Set LOC:=ROOT and PAR:=NULL, Write: ITEM is the root of the tree, and Return
Step 3: If ITEM<INFO[ROOT], then: Set PTR:=LEFT[ROOT] and SAVE:=ROOT
        Else: Set PTR:=RIGHT[ROOT] and SAVE:=ROOT
        [End of If structure]
Step 4: Repeat while PTR≠NULL:
            If ITEM=INFO[PTR], then: Set LOC:=PTR and PAR:=SAVE, Write: the location of the node in the tree is LOC, and Return
            If ITEM<INFO[PTR], then: Set SAVE:=PTR and PTR:=LEFT[PTR]
            Else: Set SAVE:=PTR and PTR:=RIGHT[PTR]
            [End of If structure]
        [End of Step 4 Loop]
Step 5: [Search unsuccessful] Set LOC:=NULL and PAR:=SAVE
Step 6: Return
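A compact Python sketch of the combined behaviour of FIND and INSBT, using dict-based nodes instead of parallel arrays (names are illustrative; an item already present is treated as a successful search and not inserted again):

```python
def bst_insert(tree, item):
    """Search for item in the BST; insert it at the empty subtree where the
    search fails. Nodes are dicts {'info', 'left', 'right'}; returns the root."""
    if tree is None:                         # empty tree: new node becomes root
        return {"info": item, "left": None, "right": None}
    ptr, par = tree, None
    while ptr is not None:
        if item == ptr["info"]:
            return tree                      # search successful: nothing to add
        par = ptr                            # remember the parent (PAR/SAVE)
        ptr = ptr["left"] if item < ptr["info"] else ptr["right"]
    new = {"info": item, "left": None, "right": None}
    if item < par["info"]:                   # attach on the side the search left off
        par["left"] = new
    else:
        par["right"] = new
    return tree

def bst_inorder(t):
    # Inorder traversal yields the stored values in sorted order
    return [] if t is None else bst_inorder(t["left"]) + [t["info"]] + bst_inorder(t["right"])
```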

Deletion in a Binary Search Tree- Deletion in a BST uses a procedure FIND to find the location of node N which contains ITEM and also the location of parent node P(N). The way N is deleted from the tree depends primarily on the number of children of node N. There are three cases: Case 1: N has no children. Then N is deleted from T by simply replacing the location P(N) by null pointer Case 2: N has exactly one child. Then N is deleted from T by simply replacing the location of N by location of the only child of N Case 3: N has two children. Let S(N) denote the inorder successor of N. Then N is deleted from T by first deleting S(N) from T(by using Case 1 or Case 2) and then replacing node N in T by node S(N)

Case 1: When the node to be deleted does not have two children
Algorithm: DELA(INFO, LEFT, RIGHT, ROOT, LOC, PAR)
This procedure deletes the node N at location LOC, where N does not have two children. PAR gives the location of the parent node of N, or PAR=NULL indicates that N is the root node. The pointer CHILD gives the location of the only child of N.
Step 1: If LEFT[LOC]=NULL and RIGHT[LOC]=NULL, then: Set CHILD:=NULL
        Else If LEFT[LOC]≠NULL, then: Set CHILD:=LEFT[LOC]
        Else: Set CHILD:=RIGHT[LOC]
        [End of If structure]
Step 2: If PAR≠NULL, then:
            If LOC=LEFT[PAR], then: Set LEFT[PAR]:=CHILD
            Else: Set RIGHT[PAR]:=CHILD
        Else: Set ROOT:=CHILD
        [End of If structure]
Step 3: Return

Case 2: When the node to be deleted has two children
Algorithm: DELB(INFO, LEFT, RIGHT, ROOT, LOC, PAR, SUC, PARSUC)
This procedure deletes the node N at location LOC, where N has two children. PAR gives the location of the parent node of N, or PAR=NULL indicates that N is the root node. The pointer SUC gives the location of the inorder successor of N, and PARSUC gives the location of the parent of the inorder successor.
Step 1: (a) Set PTR:=RIGHT[LOC] and SAVE:=LOC
        (b) Repeat while LEFT[PTR]≠NULL:
                Set SAVE:=PTR and PTR:=LEFT[PTR]
            [End of Loop]
        (c) Set SUC:=PTR and PARSUC:=SAVE
Step 2: Call DELA(INFO, LEFT, RIGHT, ROOT, SUC, PARSUC)
Step 3: (a) If PAR≠NULL, then:
                If LOC=LEFT[PAR], then: Set LEFT[PAR]:=SUC
                Else: Set RIGHT[PAR]:=SUC
                [End of If structure]
            Else: Set ROOT:=SUC
            [End of If structure]
        (b) Set LEFT[SUC]:=LEFT[LOC] and RIGHT[SUC]:=RIGHT[LOC]
Step 4: Return
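The three deletion cases can be sketched recursively; for a node with two children the inorder successor's value replaces the deleted value, mirroring DELA/DELB without explicit parent pointers (a hedged alternative formulation, not the text's pointer-based procedure; node and helper names are illustrative):

```python
def node(v, l=None, r=None):
    return {"info": v, "left": l, "right": r}

def inorder_list(t):
    return [] if t is None else inorder_list(t["left"]) + [t["info"]] + inorder_list(t["right"])

def bst_delete(t, item):
    """Delete item from a BST of dict nodes; returns the (possibly new) root."""
    if t is None:
        return None
    if item < t["info"]:
        t["left"] = bst_delete(t["left"], item)
    elif item > t["info"]:
        t["right"] = bst_delete(t["right"], item)
    else:
        if t["left"] is None:                # Cases 1 and 2: at most one child,
            return t["right"]                # so splice the child (or None) in
        if t["right"] is None:
            return t["left"]
        suc = t["right"]                     # Case 3: the inorder successor is the
        while suc["left"] is not None:       # leftmost node of the right subtree
            suc = suc["left"]
        t["info"] = suc["info"]              # copy the successor's value up, then
        t["right"] = bst_delete(t["right"], suc["info"])   # delete the successor
    return t
```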

Heap
Suppose H is a complete binary tree with n elements. Then H is called a heap, or a maxheap, if each node N of H has the property that the value at N is greater than or equal to the value at each of the children of N. For example, the following sequence, read level by level from left to right, is a maxheap:

97; 88, 95; 66, 55, 95, 48; 66, 35, 48, 55, 62, 77, 25, 38; 18, 40, 30, 26, 24
Analogously, a minheap is a heap such that the value at N is less than or equal to the value at each of its children. A heap is implemented more efficiently through an array than through a linked list. In an array-based heap, the location of the parent of the node at location PTR is given by floor(PTR/2).
Inserting an element in a Heap
Suppose H is a heap with N elements, and suppose an ITEM of information is given. We insert ITEM into the heap H as follows: First adjoin ITEM at the end of H so that H is still a complete tree, but not necessarily a heap. Then let ITEM rise to its appropriate place in H so that H is finally a heap.

Algorithm: INSHEAP(TREE, N, ITEM)
A heap H with N elements is stored in the array TREE, and an ITEM of information is given. This procedure inserts ITEM as a new element of H. PTR gives the location of ITEM as it rises in the tree, and PAR denotes the parent of ITEM.
Step 1: Set N:=N+1 and PTR:=N
Step 2: Repeat Steps 3 to 6 while PTR>1
Step 3: Set PAR:=floor(PTR/2)
Step 4: If ITEM<=TREE[PAR], then: Set TREE[PTR]:=ITEM and Return
Step 5: Set TREE[PTR]:=TREE[PAR]
Step 6: Set PTR:=PAR
[End of Step 2 Loop]
Step 7: Set TREE[1]:=ITEM
Step 8: Return

Deleting the root node in a heap Suppose H is a heap with N elements and suppose we want to delete the root R of H. This is accomplished as follows: Assign the root R to some variable ITEM Replace the deleted node R by last node L of H so that H is still a complete tree but not necessarily a heap Let L sink to its appropriate place in H so that H is finally a heap

Algorithm: DELHEAP(TREE, N, ITEM)
A heap H with N elements is stored in the array TREE. This algorithm assigns the root TREE[1] of H to the variable ITEM and then reheaps the remaining elements. The variable LAST stores the value of the original last node of H. The pointers PTR, LEFT and RIGHT give the locations of LAST and its left and right children as LAST sinks into the tree.
Step 1: Set ITEM:=TREE[1]
Step 2: Set LAST:=TREE[N] and N:=N-1
Step 3: Set PTR:=1, LEFT:=2 and RIGHT:=3
Step 4: Repeat Steps 5 to 7 while RIGHT<=N:
Step 5: If LAST>=TREE[LEFT] and LAST>=TREE[RIGHT], then:
            Set TREE[PTR]:=LAST and Return
        [End of If structure]
Step 6: If TREE[RIGHT]<=TREE[LEFT], then:
            Set TREE[PTR]:=TREE[LEFT] and PTR:=LEFT
        Else:
            Set TREE[PTR]:=TREE[RIGHT] and PTR:=RIGHT
        [End of If structure]
Step 7: Set LEFT:=2*PTR and RIGHT:=LEFT+1
[End of Step 4 Loop]
Step 8: If LEFT=N and LAST<TREE[LEFT], then: Set TREE[PTR]:=TREE[LEFT] and PTR:=LEFT
Step 9: Set TREE[PTR]:=LAST
Step 10: Return
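INSHEAP and DELHEAP can be sketched on a 1-indexed Python list (index 0 unused): the insert lets the new item rise past smaller parents, and the delete lets the old last element sink below larger children. Names are illustrative:

```python
def heap_insert(tree, item):
    """INSHEAP: append ITEM, then let it rise while larger than its parent."""
    tree.append(item)
    ptr = len(tree) - 1
    while ptr > 1:
        par = ptr // 2                 # parent location, floor(PTR/2)
        if item <= tree[par]:
            break                      # parent is at least as large: stop rising
        tree[ptr] = tree[par]          # move the parent down
        ptr = par
    tree[ptr] = item

def heap_delete(tree):
    """DELHEAP: remove and return the root, move the last node to the top,
    then let it sink below any larger child."""
    item, last = tree[1], tree.pop()
    if len(tree) == 1:                 # heap is now empty (only the unused slot)
        return item
    ptr = 1
    while True:
        left, right = 2 * ptr, 2 * ptr + 1
        big = left
        if right < len(tree) and tree[right] > tree[left]:
            big = right                # the larger of the two children
        if big >= len(tree) or last >= tree[big]:
            break                      # no child, or LAST dominates both
        tree[ptr] = tree[big]          # move the larger child up
        ptr = big
    tree[ptr] = last
    return item
```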


Application of Heap HeapSort- One of the important applications of heap is sorting of an array using heapsort method. Suppose an array A with N elements is to be sorted. The heapsort algorithm sorts the array in two phases:

Phase A: Build a heap H out of the elements of A


Phase B: Repeatedly delete the root element of H Since the root element of heap contains the largest element of the heap, phase B deletes the elements in decreasing order. Similarly, using heapsort in minheap sorts the elements in increasing order as then the root represents the smallest element of the heap.

Algorithm: HEAPSORT(A, N)
An array A with N elements is given. This algorithm sorts the elements of the array.
Step 1: [Build a heap H]
        Repeat for J=1 to N-1:
            Call INSHEAP(A, J, A[J+1])
        [End of Loop]
Step 2: [Sort A by repeatedly deleting the root of H]
        Repeat while N>1:
            (a) Call DELHEAP(A, N, ITEM)
            (b) Set A[N+1]:=ITEM [Store the element deleted from the heap]
        [End of Loop]
Step 3: Exit
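A self-contained sketch of the two phases: Phase A builds the heap by sifting each element up, and Phase B repeatedly swaps the root into its final position and sifts the swapped element down. This keeps everything in one list rather than calling the two earlier procedures:

```python
def heapsort(a):
    """Sort a list using the two heapsort phases described above."""
    h = [None] + list(a)          # 1-indexed working copy (index 0 unused)
    n = len(a)
    for j in range(2, n + 1):     # Phase A: sift h[j] up into the heap of size j-1
        ptr, item = j, h[j]
        while ptr > 1 and item > h[ptr // 2]:
            h[ptr] = h[ptr // 2]  # pull the smaller parent down
            ptr //= 2
        h[ptr] = item
    for size in range(n, 1, -1):  # Phase B: root is the largest remaining element
        h[1], h[size] = h[size], h[1]    # move the root to its final position
        ptr, item = 1, h[1]
        while True:                       # sift the swapped element down
            child = 2 * ptr
            if child >= size:             # active heap occupies 1..size-1
                break
            if child + 1 < size and h[child + 1] > h[child]:
                child += 1                # pick the larger child
            if item >= h[child]:
                break
            h[ptr] = h[child]
            ptr = child
        h[ptr] = item
    return h[1:]
```

Because the root of a maxheap holds the largest element, Phase B fills the array from the back, producing ascending order.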

Problem: Create a heap out of the following data: jan feb mar apr may jun jul aug sep oct nov dec
Solution (level-order of the resulting heap): sep, oct, mar, apr, aug, feb, nov, jan, may, dec, jun, jul

AVL TREE

The efficiency of many important operations on trees is related to the height of the tree; for example, searching, insertion and deletion in a BST are all O(height). In general, the height of a tree with n nodes is O(log2 n), except in the case of a right-skewed or left-skewed BST, in which the height is O(n). A right-skewed or left-skewed BST is one in which all the elements of the tree lie on one side of the root node. (Figure: a right-skewed and a left-skewed tree, each a chain of the nodes A, B, C, D, E.)

For efficiency's sake, we would like to guarantee that h remains O(log2 n). One way to do this is to force our trees to be height-balanced. The method to check whether a tree is height-balanced is as follows: start at the leaves and work towards the root of the tree, checking the heights of the left and right subtrees of each node. A tree is said to be height-balanced if, at every node, the difference of the heights of the left and right subtrees is 0, 1 or -1.
Example: Check whether the shown tree is balanced or not.

Sol: Starting from the leaf nodes D and C, the heights of the left and right subtrees of C and D are each 0, so their difference is also 0.
Check the heights of the subtrees of B: the height of the left subtree of B is 1 and the height of the right subtree is 0, so the difference of the two is 1. Thus B is not perfectly balanced, but the tree is still considered height-balanced.
Check the heights of the subtrees of A: the height of the left subtree of A is 2 while the height of its right subtree is 1. The difference of the two heights still lies within 1. Thus, since every node satisfies the condition, the tree is a balanced binary tree.

Check whether the shown tree is balanced or not. (Figure: a tree with nodes A, B, C, D, E, F in which the left subtree of B is a chain of height 3.)

Ans: No, as node B is not balanced; the difference of the heights of its left and right subtrees is 3-0, i.e. more than 1.
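The bottom-up balance check described above can be sketched for trees encoded as (info, left, right) tuples; the example shapes below are illustrative stand-ins for the figures:

```python
def height(t):
    """Height counted in nodes, with an empty tree having height 0."""
    return 0 if t is None else 1 + max(height(t[1]), height(t[2]))

def is_balanced(t):
    """A tree is height-balanced if at every node the left and right
    subtree heights differ by at most 1."""
    if t is None:
        return True
    return (abs(height(t[1]) - height(t[2])) <= 1
            and is_balanced(t[1]) and is_balanced(t[2]))

# Shapes like the two examples above (labels illustrative):
balanced = ("A", ("B", ("D", None, None), None), ("C", None, None))
skewed = ("A", ("B", ("C", ("D", None, None), None), None), None)
```

Recomputing `height` at every node is O(n^2) in the worst case; a single post-order pass returning both height and balance would be O(n), but the version above follows the text's description most directly.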

Height-balanced Binary tree (AVL Tree)
The disadvantage of a skewed binary search tree is that the worst-case time complexity of a search is O(n). In order to overcome this disadvantage, it is necessary to keep the binary search tree balanced in height. Two Russian mathematicians, G.M. Adelson-Velskii and E.M. Landis, gave a technique to balance the height of a binary tree, and the resulting tree is called an AVL tree.
Definition: An empty binary tree is an AVL tree. A nonempty binary tree T is an AVL tree iff, given TL and TR to be the left and right subtrees of T and h(TL) and h(TR) to be the heights of TL and TR respectively, TL and TR are AVL trees and |h(TL)-h(TR)| <= 1. h(TL)-h(TR) is called the balance factor (BF), and for an AVL tree the balance factor of a node can be either -1, 0 or 1.
An AVL search tree is a binary search tree which is an AVL tree.

A node in a binary tree that does not have a BF of 0, 1 or -1 is said to be unbalanced. If one inserts a new node into a balanced binary tree at a leaf, then the possible changes in the status of a node are as follows:
The node was either left or right heavy and has now become balanced. A node is said to be left heavy if the height of its left subtree is one more than the height of its right subtree; similarly, a node is right heavy if the height of its right subtree is one more than the height of its left subtree.
The node was balanced and has now become left or right heavy.
The node was heavy and the new node has been inserted in the heavy subtree, thus creating an unbalanced subtree. Such a node is called a critical node.

Rotations- The first phase of inserting an element into an AVL search tree is the same as insertion into a binary search tree. However, if after insertion of the element the balance factor of any node is disturbed so as to render the tree unbalanced, we resort to techniques called rotations to restore the balance of the search tree. To perform rotations, it is necessary to identify the specific node A whose BF (balance factor) is neither 0, 1 nor -1 and which is the nearest ancestor to the inserted node on the path from the inserted node to the root. The rebalancing rotations are classified as LL, LR, RR and RL, based on the position of the inserted node with reference to A:
LL rotation: inserted node is in the left subtree of the left subtree of A
RR rotation: inserted node is in the right subtree of the right subtree of A
LR rotation: inserted node is in the right subtree of the left subtree of A
RL rotation: inserted node is in the left subtree of the right subtree of A

LL Rotation- This rotation is done when the element is inserted in the left subtree of the left subtree of A. Let B be the left child of A. To rebalance the tree, it is rotated so that B becomes the root, with BL as its left subtree and A as its right child, and with BR and AR as the left and right subtrees of A. The rotation results in a balanced tree.

RR Rotation- This rotation is applied if the new element is inserted in the right subtree of the right subtree of A. Let B be the right child of A. The rebalancing rotation pushes B up to the root, with A as its left child and BR as its right subtree, and with AL and BL as the left and right subtrees of A.

LR and RL rotations- The balancing methodologies of the LR and RL rotations are similar in nature, being mirror images of one another. Among the rotations, LL and RR are called single rotations, and LR and RL are known as double rotations, since an LR rotation is accomplished by an RR rotation followed by an LL rotation, and an RL rotation by an LL rotation followed by an RR rotation. The LR rotation is applied when the new element is inserted in the right subtree of the left subtree of A; the RL rotation is applied when the new element is inserted in the left subtree of the right subtree of A.

LR Rotation- This rotation is a combination of an RR rotation followed by an LL rotation. (Figure: with B the left child of A and C the right child of B, the new node is inserted under C; after the two rotations, C becomes the root, with B as its left child and A as its right child, and with BL, CL, CR and AR distributed as the subtrees of B and A.)

RL Rotation- This rotation occurs when the new node is inserted in the left subtree of the right subtree of A. It is a combination of an LL rotation followed by an RR rotation. (Figure: with B the right child of A and C the left child of B, the new node is inserted under C; after the two rotations, C becomes the root, with A as its left child and B as its right child, and with T1, T2, T3 and T4 distributed as their subtrees.)

(Figure: the LR case worked in full - the new node is inserted in the right subtree of the left subtree of A; an RR rotation about the left child followed by an LL rotation leaves C as the root with B and A as its children.)

Problem: Construct an AVL search tree by inserting the following elements in the order of their occurrence 64, 1, 14, 26, 13, 110, 98, 85 Sol:

Deletion in an AVL search Tree The deletion of element in AVL search tree leads to imbalance in the tree which is corrected using different rotations. The rotations are classified according to the place of the deleted node in the tree. On deletion of a node X from AVL tree, let A be the closest ancestor node on the path from X to the root node with balance factor of +2 or -2 .To restore the balance, the deletion is classified as L or R depending on whether the deletion occurred on the left or right sub tree of A. Depending on value of BF(B) where B is the root of left or right sub tree of A, the R or L rotation is further classified as R0, R1 and R-1 or L0, L1 and L-1. The L rotations are the mirror images of their corresponding R rotations.

R0 Rotation- This rotation is applied when the BF of B is 0 after deletion of the node

R1 Rotation- This rotation is applied when the BF of B is 1

R-1 Rotation- This rotation is applied when the BF of B is -1

L rotations are the mirror images of the R rotations. Thus L0 is applied when a node is deleted from the left subtree of A and the BF of the root B of the right subtree is 0. Similarly, L1 and L-1 are applied on deleting a node from the left subtree of A when the BF of the root node of the right subtree of A is 1 or -1 respectively.

GRAPH

Graph-

A graph G consists of: a set V of elements called nodes (or points or vertices), and a set E of edges such that each edge e in E is identified with a unique (unordered) pair [u, v] of nodes in V, denoted by e=[u, v]. The nodes u and v are called the endpoints of e, or adjacent nodes, or neighbors. An edge in a graph can be directed or undirected depending on whether the direction of the edge is specified. A graph in which every edge is directed is called a directed graph or digraph; a graph in which every edge is undirected is called an undirected graph; a graph which contains both directed and undirected edges is called a mixed graph. Let G=(V,E) be a graph and e ∈ E be a directed edge associated with the ordered pair of vertices (v1, v2). Then the edge e is said to be initiating from v1 and terminating at v2; v1 is the start and v2 the termination of the edge e.

An edge in a graph that joins a vertex to itself is called a sling or a loop. The degree of a node or vertex u, written deg(u), is the number of edges containing u; the degree of a loop is 2. In a directed graph, for any vertex v, the number of edges which have v as their initial vertex is called the out-degree of v, and the number of edges which have v as their terminal vertex is called the in-degree of v. The sum of the in-degree and out-degree of a vertex is called the degree of that vertex. If deg(u)=0, then u is called an isolated node, and a graph containing only isolated nodes is called a null graph. The maximum degree of a graph G, denoted by Δ(G), is the maximum degree of its vertices, and the minimum degree, denoted by δ(G), is the minimum degree of its vertices. A sequence of edges of a digraph such that the terminal vertex of each edge is the initial vertex of the next edge, if it exists, is called a path, e.g. E={(v1,v2),(v2,v3),(v3,v4)}. A path P of length n from a vertex u to a vertex v is defined as a sequence of n+1 nodes P=(v0, v1, v2, ..., vn) such that u=v0, vi-1 is adjacent to vi for i=1, 2, ..., n, and v=vn. The path is said to be closed, or a circular path, if v0=vn.

The path is said to be simple if all nodes are distinct, with the exception that v0 may equal vn; that is, P is simple if the nodes v0,v1,v2,...,vn-1 are distinct and the nodes v1,v2,...,vn are distinct. A cycle is a closed simple path of length 2 or more. A cycle of length k is called a k-cycle. A graph G is said to be connected if and only if there is a simple path between any two nodes in G; in particular, a connected graph contains no isolated vertices. A graph that is not connected can be divided into connected components (disjoint connected subgraphs); for example, a graph may consist of three connected components. A graph G is said to be complete if every node u in G is adjacent to every node v in G. A complete graph with n vertices (denoted Kn) is a graph with n vertices in which each vertex is connected to each of the others (with one edge between each pair of vertices). In other words, there is a path from every vertex to every other vertex; clearly such a graph is also a connected graph. A complete graph with n nodes will have n(n-1)/2 edges. A connected graph without any cycles is called a tree graph, or free tree, or simply a tree.

[Figure: the first five complete graphs K1, K2, K3, K4 and K5.]

A graph is said to be labeled if its edges are assigned data. G is said to be weighted if each edge e in G is assigned a non-negative numerical value w(e) called the weight or length of e. In such a case, each path P in G is assigned a weight or length which is the sum of the weights of the edges along the path P. If no weight is specified, it is assumed that each edge has weight w(e)=1. Multiple edges: distinct edges e and e' are called multiple edges if they connect the same endpoints, that is, if e=[u,v] and e'=[u,v]. Such edges are also called parallel edges, and a graph that contains multiple or parallel edges is called a multigraph. A graph containing loops is likewise not a simple graph but a multigraph.

[Figure: a directed multigraph.]

[Figure: a weighted graph.]

Representation of a graph- There are two main ways of representing a graph in memory: sequential and linked. Sequential representation- The graph can be represented as a matrix. The two most common matrices are the adjacency matrix and the incidence matrix. The adjacency matrix is a square matrix with one row and one column devoted to each vertex. The values of the matrix are 0 or 1. A value of 1 for row i and column j implies that the edge eij exists between the vertices vi and vj; a value of 0 implies that there is no edge between vi and vj. Thus, for a graph with vertices v1,v2,v3,...,vn, the adjacency matrix A=[aij] of the graph G is the n x n matrix defined as:

aij = 1 if vi is adjacent to vj (there is an edge between vi and vj)
aij = 0 if there is no edge between vi and vj

Such a matrix, whose entries are only 0 or 1, is called a bit matrix or Boolean matrix. The adjacency matrix of the graph G does depend on the ordering of the nodes in G; that is, a different ordering of the nodes may result in a different adjacency matrix. However, the matrices resulting from different orderings are closely related, in that one can be obtained from another by simply interchanging rows and columns.

Suppose G is an undirected graph. Then the adjacency matrix A of G will be a symmetric matrix, i.e., one in which aij=aji for every i and j. If G is a multigraph with m nodes, then the adjacency matrix of G is the m x m matrix A=(aij) defined by setting aij equal to the number of edges from vi to vj. Consider an adjacency matrix A representing a graph. Then the powers A2, A3, ..., Ak of the adjacency matrix A represent the matrices for path lengths 2, 3, ..., k respectively. In other words, if ak(i,j) is the ijth entry of the matrix Ak, then this entry represents the number of paths of length k from node vi to node vj. If we now define a matrix Br as Br = A + A2 + A3 + ... + Ar, then each entry of Br represents the number of paths of length r or less from node vi to node vj.

Example: Consider the graph G. Suppose the nodes are stored in memory in a linear array DATA as follows:

DATA: X, Y, Z, W

Then we assume the ordering of the nodes in G to be v1=X, v2=Y, v3=Z, v4=W. The adjacency matrix A of G is:

        0 0 0 1
A =     1 0 1 1
        1 0 0 1
        0 0 1 0

Path Matrix- Let G be a simple directed graph with m nodes v1,v2,v3,...,vm. The path matrix or reachability matrix of G is the m-square matrix P=(pij) defined as:

pij = 1 if there is a path from vi to vj
pij = 0 otherwise

Suppose there is a path from vi to vj. Then there must be a simple path from vi to vj when vi ≠ vj, or there must be a cycle from vi to vj when vi = vj. Since G has only m nodes, such a simple path must have length m-1 or less, and such a cycle must have length m or less.

Proposition: Let A be the adjacency matrix and let P=(pij) be the path matrix of a digraph G. Then pij=1 if and only if there is a nonzero number in the ijth entry of the matrix Bm = A + A2 + A3 + ... + Am.

Linked representation of the graph- The sequential representation of a graph in memory, i.e., the representation of a graph by its adjacency matrix, has a number of major drawbacks. It may be difficult to insert and delete nodes in the graph, because the size of the array may need to be changed and the nodes may need to be reordered, which can lead to many changes in the matrix. Also, if the number of edges is small, the matrix will be sparse and memory will be wasted. Accordingly, a graph is usually represented in memory by a linked representation, also called an adjacency structure. Specifically, the linked representation of a graph contains two lists (or files), a node list NODE and an edge list EDGE, as follows: Node list- Each element in the list NODE will correspond to a node in the graph G, and it will be a record of the form:

NODE | NEXT | ADJ | ...

Here NODE will be the name or key value of the node, NEXT will be a pointer to the next node in the list NODE, and ADJ will be a pointer to the first element in the adjacency list of the node, which is maintained in the list EDGE. The last field indicates that there may be other information in the record, such as the indegree of the node, the outdegree of the node, the status of the node during execution, etc. The nodes themselves will be maintained as a linked list and hence will have a pointer variable START for the beginning of the list and a pointer variable AVAILN for the list of available space. Edge List- Each element in the list EDGE will correspond to an edge of the graph and will be a record of the form:

DEST | LINK | ...

The field DEST will point to the location in the list NODE of the destination or terminal node of the edge. The field LINK will link together the edges with the same initial node, that is, the nodes in the same adjacency list. The third field indicates that there may be other information in the record corresponding to the edge, such as a field EDGE containing the labeled data of the edge when the graph is a labeled graph, a field WEIGHT containing the weight of the edge when the graph is a weighted graph, and so on.

[Figure: a graph on the nodes A, B, C, D and E, with a table listing each node and its adjacency list.]

Traversing a Graph- There are two standard ways of traversing a graph: breadth-first search and depth-first search. The breadth-first search will use a queue as an auxiliary structure to hold nodes for future processing, and analogously, the depth-first search will use a stack. During the execution of the algorithm, each node N of the graph G will be in one of three states, called the status of N, as follows: STATUS=1 (Ready state): the initial state of the node N. STATUS=2 (Waiting state): the node N is on the queue or stack, waiting to be processed. STATUS=3 (Processed state): the node N has been processed.

The general idea behind a breadth-first search beginning at a starting node A is as follows: examine the starting node A; then examine all neighbors of A; then examine all neighbors of the neighbors of A, and so on. Keep track of the neighbors of each node and guarantee that no node is processed more than once. This is accomplished by using a queue to hold nodes that are waiting to be processed and by using a field STATUS which tells the status of any node. The breadth-first search algorithm also helps in finding a minimum path from a source node to a destination node.

Algorithm: This algorithm executes a breadth-first search on a graph G beginning at a starting node A. This algorithm can process only those nodes that are reachable from A. To examine all the nodes in graph G, the algorithm must be modified so that it begins again with another node that is still in the ready state Step 1: Initialize all nodes to the ready state (STATUS=1) Step 2: Put the starting node A in Queue and change its status to the waiting state (STATUS=2) Step 3: Repeat Steps 4 and 5 until Queue is empty: Step 4: Remove the front node N of Queue. Process N and change the status of N to the processed state (STATUS=3) Step 5: Add to the rear of Queue all the neighbors of N that are in the ready state ( STATUS=1) , and change their status to the waiting state (STATUS=2) [End of Step 3 Loop] Step 6: Exit

Algorithm: This algorithm executes a depth-first search on a graph G beginning at a starting node A. This algorithm can process only those nodes that are reachable from A. To examine all the nodes in graph G, the algorithm must be modified so that it begins again with another node that is still in the ready state Step 1: Initialize all nodes to the ready state (STATUS=1) Step 2: Push the starting node A onto STACK and change its status to the waiting state (STATUS=2) Step 3: Repeat steps 4 and 5 until STACK is empty Step 4: Pop the top node N of STACK. Process N and change its status to the processed state (STATUS=3) Step 5: Push onto the STACK all the neighbors of N that are still in the ready state (STATUS=1) and change their status to the waiting state (STATUS=2) [End of Step 3 Loop] Step 6: Exit

Difference between BFS and DFS: BFS uses a queue for its implementation, whereas DFS uses a stack. BFS is mostly used for finding the shortest distance between two nodes in a graph, whereas DFS is mostly used to find the nodes that are reachable from a particular node. BFS is called breadth-first search because it first processes a node, then its immediate neighbors, and so on; the FIFO queue puts all newly generated nodes at the end of the queue, which means that shallow nodes are expanded before deeper nodes. BFS traverses a graph breadth-wise. DFS first traverses the graph to the last reachable node and then backtracks to process the remaining nodes; in other words, DFS expands the deepest unexpanded node first and traverses the depth of the graph. BFS ensures that all the nearest possibilities are explored first, whereas DFS keeps going as far as it can and then goes back to look at other options.

Dijkstra's Algorithm- This technique is used to determine the shortest path between two arbitrary vertices in a graph. Let a weight w(vi,vj) be associated with every edge (vi,vj) in a given graph. Furthermore, the weights are such that the total weight from vertex vi to vertex vk through vertex vj is w(vi,vj) + w(vj,vk). Using this technique, the weight from a vertex vs (the start of the path) to a vertex vt (the end of the path) in the graph G for a given path (vs,v1),(v1,v2),(v2,v3),...,(vi,vt) is given by w(vs,v1) + w(v1,v2) + w(v2,v3) + ... + w(vi,vt). Dijkstra's method is a very popular and efficient way to find a shortest path from a starting vertex to a terminal vertex. If there is an edge between two vertices, then the weight of this edge is its length. If several edges exist, use the edge of shortest length. If no edge actually exists, set the length to infinity. Edge (vi,vj) does not necessarily have the same length as edge (vj,vi); this allows different routes between two vertices depending on the direction of travel.

Dijkstra's technique is based on assigning labels to each vertex. The label is equal to the distance (weight) from the starting vertex to that vertex, and the starting vertex has the label 0. A label can be in one of two states: temporary or permanent. A permanent label is one that is known to lie along the shortest path, while a temporary label is one for which it is uncertain whether it lies along the shortest path. Algorithm: Step 1: Assign a temporary label l(vi)=∞ to all vertices except vs (the starting vertex) Step 2: [Mark vs as permanent by assigning the label 0 to it] Set l(vs):=0 Step 3: [Assign the value of vs to vk, where vk denotes the latest vertex to be made permanent] Set vk:=vs Step 4: For each temporary vertex vi adjacent to vk: If l(vi) > l(vk) + w(vk,vi), then: Set l(vi):=l(vk) + w(vk,vi) [w(vk,vi) is the weight of the edge from vk to vi] Step 5: Make permanent the temporary vertex vi with the smallest label, and set vk:=vi Step 6: If vt has a temporary label, repeat Steps 4 and 5; otherwise the label of vt is permanent and is equal to the length of the shortest path from vs to vt Step 7: Exit

Dijkstra's Algorithm: An Example

[Figure: a sequence of slides working Dijkstra's algorithm on a weighted digraph with six nodes; node 1 is the source and carries the permanent label 0.]

The slides alternate two steps until every node is permanent:

Initialize: give the source the label 0 and every other node a temporary label of infinity, then select the node with the minimum temporary distance label.
Update step: relax the edges leaving the node just made permanent; for example, after one update the distance label of node 3 improves and the predecessor of node 3 becomes node 2.
Choose minimum temporary label: make the smallest temporary label permanent and repeat. In the later iterations d(5), d(4) and d(6) are not changed by the update step, and eventually there is nothing left to update.

End of algorithm: all nodes are now permanent. The predecessors form a tree, and the shortest path from node 1 to node 6 can be found by tracing back the predecessors.

Recursion- Recursion is a process by which a function calls itself repeatedly until some specified condition has been satisfied. The process is used for repetitive computations in which each action is stated in terms of a previous result. Many iterative (or repetitive) problems can be written in this form. In order to solve a problem recursively, two conditions must be satisfied: the problem must be written in a recursive form, and the problem statement must include a stopping condition.

Program to calculate the factorial of a number using recursion

#include<stdio.h>
#include<conio.h>
int fact(int n);
void main()
{
    int n, f;
    printf("enter the number");
    scanf("%d", &n);
    f = fact(n);
    printf("factorial of the number is %d", f);
    getch();
}
int fact(int n)
{
    if (n <= 1)
        return 1;
    return n * fact(n - 1);
}

When a recursive program is executed, the recursive function calls are not executed immediately. Rather they are placed on a stack until the condition that terminates the recursion is encountered. The stack is a last-in first out data structure in which the successive data items are pushed down upon the preceding items. The data are later removed from the stack in reverse order. The function calls are then executed in reverse order as they are popped off the stack. Thus when evaluating the factorial recursively, the function calls will proceed in the following order

n! = n * (n-1)!
(n-1)! = (n-1) * (n-2)!
(n-2)! = (n-2) * (n-3)!
...
2! = 2 * 1!
The actual values are returned in the reverse order:
1! = 1
2! = 2 * 1! = 2 * 1 = 2
3! = 3 * 2! = 3 * 2 = 6
4! = 4 * 3! = 4 * 6 = 24
...
n! = n * (n-1)!
This reversal in the order of execution is a characteristic of all functions that are executed recursively.

The use of recursion is not necessarily the best way to approach a problem, even though the problem definition may be recursive in nature. A non recursive implementation may be more efficient in terms of memory utilization and execution speed. Thus use of recursion may involve a tradeoff between the simplicity and performance.

Program to print the fibonacci series using recursion

#include<stdio.h>
#include<conio.h>
int fibbo(int i);
void main()
{
    int i;
    clrscr();
    for (i = 1; i <= 20; i++)   /* naive recursion is exponential, so keep the count modest */
        printf("%d ", fibbo(i));
    getch();
}
int fibbo(int i)
{
    if (i == 1)
        return 0;
    else if (i == 2)
        return 1;
    else
        return fibbo(i - 2) + fibbo(i - 1);
}

Program to find x to the power y using recursion

#include<stdio.h>
#include<conio.h>
int Power(int x, int y);
void main()
{
    int i, j, res;
    printf("enter the two numbers as x to the power y");
    scanf("%d%d", &i, &j);
    res = Power(i, j);
    printf("x to the power of y is %d", res);
    getch();
}
int Power(int x, int y)
{
    if (y == 0)
        return 1;
    return x * Power(x, y - 1);
}

Towers of Hanoi- The Towers of Hanoi is a classical application of recursion. The problem is as follows: suppose three pegs, labeled A, B and C, are given, and suppose that on peg A there are placed a finite number n of disks of decreasing size. The object of the game is to move the disks from peg A to peg C using peg B as an auxiliary. The rules of the game are: only one disk may be moved at a time, and at no time may a larger disk be placed on a smaller disk.
[Figure: the three pegs A, B and C, with the disks stacked on peg A.]

The solution of the Towers of Hanoi problem for n=3 is obtained in seven moves as: Move top disk from peg A to peg C Move top disk from peg A to peg B Move top disk from peg C to peg B Move top disk from peg A to peg C Move top disk from peg B to peg A Move top disk from peg B to peg C Move top disk from peg A to peg C

[Figure: the successive configurations for n=3: initial, A->C, A->B, C->B, A->C, B->A, B->C, A->C.]

Rather than finding a separate solution for each n, we use the technique of recursion to develop a general solution: move the top n-1 disks from peg A to peg B, move the top disk from peg A to peg C (A->C), then move the top n-1 disks from peg B to peg C. Let us introduce the general notation TOWER(N, BEG, AUX, END) to denote a procedure which moves the top N disks from the initial peg BEG to the final peg END using the peg AUX as an auxiliary. For N=1, TOWER(1, BEG, AUX, END) consists of the single instruction BEG->END. For N>1, the solution may be reduced to the solution of the following three subproblems:

TOWER(N-1, BEG, END, AUX); TOWER(1, BEG, AUX, END), i.e. BEG->END; TOWER(N-1, AUX, BEG, END). Each of the three subproblems can be solved directly or is essentially the same as the original problem using fewer disks. Accordingly, this reduction process does yield a recursive solution to the Towers of Hanoi problem. In general, the recursive solution requires 2^n - 1 moves for n disks.

Algorithm: TOWER(N, BEG, AUX, END) This procedure produces a recursive solution to the Towers of Hanoi problem for N disks Step 1: If N=1, then: Write: BEG->END Return Step 2: [Move N -1 disks from peg BEG to peg AUX] Call TOWER(N-1, BEG, END, AUX) Write: BEG->END Step 3: [Move N-1 disks from peg AUX to peg END] Call TOWER(N-1, AUX, BEG, END) Step 4: Return

The recursion tree for TOWER(4, A, B, C) is shown below. Each call makes its first recursive call, then performs its own move (shown after the colon), then makes its second recursive call; reading the moves in execution order gives the 15-move solution:

TOWER(4,A,B,C): A->C
  TOWER(3,A,C,B): A->B
    TOWER(2,A,B,C): A->C
      TOWER(1,A,C,B): A->B
      TOWER(1,B,A,C): B->C
    TOWER(2,C,A,B): C->B
      TOWER(1,C,B,A): C->A
      TOWER(1,A,C,B): A->B
  TOWER(3,B,A,C): B->C
    TOWER(2,B,C,A): B->A
      TOWER(1,B,A,C): B->C
      TOWER(1,C,B,A): C->A
    TOWER(2,A,B,C): A->C
      TOWER(1,A,C,B): A->B
      TOWER(1,B,A,C): B->C

Operations on Graph
Suppose a graph G is maintained in memory by the linked representation GRAPH(NODE, NEXT, ADJ, START, AVAILN, DEST, LINK, AVAILE). The various operations possible on the graph are insertion, deletion and searching of a node in the graph.

Algorithm: INSNODE(NODE, NEXT, ADJ, START, AVAIL, N) This algorithm inserts a node N in a graph G Step 1: If AVAIL=NULL, then: Write: OVERFLOW, and Return Step 2: [Remove a node from the AVAIL list] Set NEW:=AVAIL and AVAIL:=LINK[AVAIL] Step 3: [Insert the node in the NODE list] Set NODE[NEW]:=N, ADJ[NEW]:=NULL, NEXT[NEW]:=START and START:=NEW Step 4: Return

Algorithm: INSEDGE(NODE, NEXT, ADJ, START, DEST, LINK, AVAIL, A, B) This algorithm inserts the edge (A,B) in a graph, where A and B are two nodes in the graph Step 1: Call FIND(NODE, NEXT, START, A, LOCA) Step 2: Call FIND(NODE, NEXT, START, B, LOCB) Step 3: If AVAIL=NULL, then: Write: OVERFLOW, and Return Step 4: [Remove a node from the AVAIL list] Set NEW:=AVAIL and AVAIL:=LINK[AVAIL] Step 5: [Insert LOCB in the list of successors of A] Set DEST[NEW]:=LOCB, LINK[NEW]:=ADJ[LOCA] and ADJ[LOCA]:=NEW Step 6: Return

Algorithm: FIND(INFO, LINK, START, ITEM, LOC) This algorithm finds the first node containing ITEM, or else sets LOC to NULL Step 1: Set PTR:=START Step 2: Repeat while PTR ≠ NULL: If ITEM=INFO[PTR], then: Set LOC:=PTR and Return. Else: Set PTR:=LINK[PTR] [End of Loop] Step 3: Set LOC:=NULL Step 4: Return

Algorithm: Algorithm to delete a node from a graph Step 1: Find the location LOC of the node N in G Step 2: Delete all edges ending at N; that is delete LOC from the list of successors of each node M in G Step 3: Delete all edges beginning at N. This is accomplished by finding the location BEG of the first successor and the location END of the last successor of N and then adding the successor list of N to the free AVAIL list Step 4: Delete N itself from the list NODE

Selected questions from last year's question paper

Program to implement Breadth First search in a graph
We are using the linked list implementation of the graph:

struct node
{
    int info;
    int status;
    struct node* next;
    struct edge* adj;
};
struct edge
{
    struct node* dest;
    struct edge* link;
};
struct queue
{
    struct node* n1;
    struct queue* link;
};

It is assumed that we have a graph built from these records, on which breadth first search is applied beginning at the start node.

void insert(struct node* ptr);
struct node* del();
struct queue *front = NULL, *rear = NULL;

void bfs(struct node* start)
{
    struct node *ptr, *item;
    struct edge* edge1;
    ptr = start;
    while (ptr != NULL)      /* Step 1: set every node to the ready state */
    {
        ptr->status = 1;
        ptr = ptr->next;
    }
    ptr = start;
    ptr->status = 2;         /* Step 2: put the starting node on the queue */
    insert(ptr);
    while (front != NULL)    /* Steps 3-5 */
    {
        item = del();
        item->status = 3;
        printf("%d ", item->info);
        for (edge1 = item->adj; edge1 != NULL; edge1 = edge1->link)
        {
            if (edge1->dest->status == 1)
            {
                edge1->dest->status = 2;
                insert(edge1->dest);
            }
        }
    }
}

void insert(struct node* ptr)
{
    struct queue* q = (struct queue*)malloc(sizeof(struct queue));
    q->n1 = ptr;
    q->link = NULL;
    if (front == NULL)
    {
        front = rear = q;
        return;
    }
    rear->link = q;
    rear = q;
}

struct node* del()
{
    struct queue* q;
    struct node* n2;
    if (front == NULL)
    {
        printf("underflow");
        return NULL;
    }
    q = front;
    n2 = q->n1;
    front = front->link;
    if (front == NULL)
        rear = NULL;
    free(q);
    return n2;
}

Program to count leaf and nonleaf nodes in a binary search tree


#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
struct tree
{
    int num;
    struct tree* left;
    struct tree* right;
};
struct tree* root = NULL;
struct stack
{
    struct tree* tr;
    struct stack* link;
};
struct stack* top = NULL;
struct tree* pop();
void pre();
void create();
void push(struct tree* p);
void main()
{
    char ch = 'y';
    int choice;
    while (ch == 'y')
    {
        printf("enter the type of operation 1. create 2. traversal");
        scanf("%d", &choice);
        switch (choice)
        {
            case 1: create(); break;
            case 2: pre(); break;
        }
        printf("want another operation");
        ch = getch();
    }
    getch();
}

void create()
{
    struct tree *new1, *ptr1, *save;
    if (root == NULL)
    {
        root = (struct tree*)malloc(sizeof(struct tree));
        printf("enter the data");
        scanf("%d", &root->num);
        root->left = NULL;
        root->right = NULL;
        return;
    }
    ptr1 = root;
    new1 = (struct tree*)malloc(sizeof(struct tree));
    printf("enter the number to be added");
    scanf("%d", &new1->num);
    new1->left = NULL;
    new1->right = NULL;
    while (ptr1 != NULL)
    {
        save = ptr1;
        if (new1->num < ptr1->num)
            ptr1 = ptr1->left;
        else
            ptr1 = ptr1->right;
    }
    if (new1->num < save->num)
        save->left = new1;
    else
        save->right = new1;
}

void pre()
{
    int leafcount = 0, nonleaf = 0;
    struct tree *ptr, *temp;
    ptr = root;
    while (ptr != NULL)
    {
        printf("%d ", ptr->num);
        if (ptr->left == NULL && ptr->right == NULL)
            leafcount++;
        else
            nonleaf++;
        if (ptr->right != NULL)
            push(ptr->right);
        ptr = ptr->left;
    }
    while (top != NULL)
    {
        temp = pop();
        while (temp != NULL)
        {
            printf("%d ", temp->num);
            if (temp->left == NULL && temp->right == NULL)
                leafcount++;
            else
                nonleaf++;
            if (temp->right != NULL)
                push(temp->right);
            temp = temp->left;
        }
    }

printf("leaf nodes are %d and nonleaf nodes are %d",leafcount,nonleaf);


}

void push(struct tree* p)


{
    struct stack* new1;
    new1 = (struct stack*)malloc(sizeof(struct stack));
    new1->tr = p;
    new1->link = top;
    top = new1;
}
struct tree* pop()
{
    struct stack* s1;
    struct tree* t;
    s1 = top;
    top = top->link;
    t = s1->tr;
    free(s1);    /* release the popped stack node */
    return t;
}

Program to search for an item in a binary search tree


#include<stdio.h>
#include<conio.h>
#include<stdlib.h>
struct tree
{
    int num;
    struct tree* left;
    struct tree* right;
};
struct tree* root = NULL;
void search();
void create();
void main()
{
    char ch = 'y';
    int choice;
    while (ch == 'y')
    {
        printf("enter the type of operation 1. create 2. search");
        scanf("%d", &choice);
        switch (choice)
        {
            case 1: create(); break;
            case 2: search(); break;
        }
        printf("want another operation");
        ch = getch();
    }
    getch();
}

void search()
{
    int item;
    struct tree* ptr;
    printf("enter the item to be searched");
    scanf("%d", &item);
    if (root == NULL)
    {
        printf("tree empty");
        return;
    }
    ptr = root;
    while (ptr != NULL)
    {
        if (item == ptr->num)
        {
            printf("element found");
            return;
        }
        if (item < ptr->num)
            ptr = ptr->left;
        else
            ptr = ptr->right;
    }
    printf("element not found");
}

void create()
{
    struct tree *new1, *ptr1, *save;
    if (root == NULL)
    {
        root = (struct tree*)malloc(sizeof(struct tree));
        printf("enter the data");
        scanf("%d", &root->num);
        root->left = NULL;
        root->right = NULL;
        return;
    }
    ptr1 = root;
    new1 = (struct tree*)malloc(sizeof(struct tree));
    printf("enter the number to be added");
    scanf("%d", &new1->num);
    new1->left = NULL;
    new1->right = NULL;
    while (ptr1 != NULL)
    {
        save = ptr1;
        if (new1->num < ptr1->num)
            ptr1 = ptr1->left;
        else
            ptr1 = ptr1->right;
    }
    if (new1->num < save->num)
        save->left = new1;
    else
        save->right = new1;
}

Write an algorithm to divide a linked list into three sublists based on a remainder value of the data. Algorithm: SUBLIST(INFO, LINK, ITEM, START) This algorithm divides the list into three sublists based on a remainder value ITEM of the data. PTR stores the address of the current node of the base list; PTR1, PTR2 and PTR3 store the addresses of the current nodes of the three new lists; and START1, START2 and START3 store the starting addresses of the three sublists Step 1: Set START1:=AVAIL and AVAIL:=LINK[AVAIL]; Set START2:=AVAIL and AVAIL:=LINK[AVAIL]; Set START3:=AVAIL and AVAIL:=LINK[AVAIL] Step 2: Set PTR:=START, PTR1:=START1, PTR2:=START2 and PTR3:=START3 Step 3: Repeat while INFO[PTR] ≠ ITEM: Set INFO[PTR1]:=INFO[PTR]; Set LINK[PTR1]:=AVAIL and AVAIL:=LINK[AVAIL]; Set PTR1:=LINK[PTR1]; Set PTR:=LINK[PTR] [End of Loop]

Step 4: Repeat while INFO[PTR] ≠ ITEM: Set INFO[PTR2]:=INFO[PTR]; Set LINK[PTR2]:=AVAIL and AVAIL:=LINK[AVAIL]; Set PTR2:=LINK[PTR2]; Set PTR:=LINK[PTR] [End of Loop] Step 5: Repeat while PTR ≠ NULL: Set INFO[PTR3]:=INFO[PTR]; Set LINK[PTR3]:=AVAIL and AVAIL:=LINK[AVAIL]; Set PTR3:=LINK[PTR3]; Set PTR:=LINK[PTR] [End of Loop] Step 6: Exit

Problem: Let there be a doubly linked list with three elements P, Q and R. (a) Write an algorithm to insert S between P and Q (b) Write an algorithm to delete the head element P from the list Sol: (a) Algorithm: INSERT(INFO, BACK, FORW) This algorithm inserts an element S between the elements P and Q Step 1: If AVAIL=NULL, then: Write: OVERFLOW, and Return [End of If structure] Step 2: Set NEW:=AVAIL and AVAIL:=LINK[AVAIL] Step 3: Set INFO[NEW]:=S Step 4: Set PTR:=START Step 5: Repeat while INFO[PTR] ≠ P: Set PTR:=FORW[PTR] [End of Loop] Step 6: Set TEMP:=FORW[PTR], Set FORW[PTR]:=NEW, Set BACK[NEW]:=PTR, Set FORW[NEW]:=TEMP and Set BACK[TEMP]:=NEW Step 7: Exit

(b) Algorithm: Del (INFO, BACK,FORW,START) This algorithm deletes the head node of a doubly linked list Step 1:Set PTR:=START Step 2: Set START:= FORW[PTR] Step 3: Set BACK[FORW[PTR]]:=NULL Step 4: [Returning memory to avail list] Set FORW[PTR]:=AVAIL and AVAIL:=PTR Step 5: Exit

Problem: Suppose the names of a few students of a class are as below: Ram, Sham, Mohan, Sohan, Vimal, Komal. It is assumed that the names are represented as a singly linked list. (a) Write a program or algorithm to insert the name Raman between Sham and Mohan (b) Write a routine to replace the name Vimal with Guman Sol: (a) Algorithm: INSRT(INFO, LINK, START) This algorithm inserts the item Raman between Sham and Mohan Step 1: Set PTR:=START Step 2: If AVAIL=NULL, then: Write: OVERFLOW, and Exit Step 3: Set NEW:=AVAIL and AVAIL:=LINK[AVAIL] Step 4: Set INFO[NEW]:='Raman' Step 5: Repeat while INFO[PTR] ≠ 'Sham': Set PTR:=LINK[PTR] [End of Loop] Step 6: Set LINK[NEW]:=LINK[PTR] and Set LINK[PTR]:=NEW Step 7: Exit

(b) Algorithm: REPLACE(INFO, LINK, START) This algorithm replaces the name Vimal in the linked list with the name Guman. The pointer START stores the starting address of the linked list Step 1: Set PTR:=START Step 2: Repeat while INFO[PTR] ≠ 'Vimal': Set PTR:=LINK[PTR] [End of Loop] Step 3: Set INFO[PTR]:='Guman' Step 4: Exit

Problem: Calculate the depth of a tree with 2000 nodes Solution: The formula for the minimum depth of a binary tree with N nodes is Depth = ⌊log2 N⌋ + 1. Here N is 2000. Find the powers of 2 between which the value 2000 lies: 2^10 = 1024 and 2^11 = 2048, so 2000 lies between 2^10 and 2^11, and hence ⌊log2 2000⌋ = 10. Putting this in the formula: Depth = 10 + 1 = 11

Static and dynamic data structures- A static data structure, in computational complexity theory, is a data structure created for an input data set which is not supposed to change within the scope of the problem. When a single element is to be added or deleted, the update of a static data structure incurs significant costs, often comparable with the construction of the data structure from scratch. In real applications, dynamic data structures are used, which allow for efficient updates when data elements are inserted or deleted. Static data structures such as arrays allow fast access to elements, but are expensive for inserting and removing elements, and have a fixed maximum size. Dynamic data structures such as linked lists allow fast insertion and deletion of elements, but slower access to elements, and have a flexible size.

Applications of Binary Tree/Binary Search Tree For making decision trees For expressing mathematical expressions Faster searching as search is reduced to half at every step. For sorting of an array using heap sort method For representation of Organization charts For representation of File systems For representation of Programming environments

Arithmetic Expression Tree- A binary tree associated with an arithmetic expression: the internal nodes are operators and the external nodes are operands. Example: the arithmetic expression tree for the expression (2 × (a − 1) + (3 × b))

Trailer node- A trailer node is like a header node in a linked list except that it stores the address of the last node in a linked list. The significance of this node is in a doubly linked list that can be traversed in both the direction. In doubly linked list, we can take the advantage of trailer node in searching a node in a sorted doubly linked list. The search will be more efficient.

For convenience, a doubly linked list has a header node and a trailer node. They are also called sentinel nodes, indicating both the ends of a list.
[Figure: a doubly linked list with header and trailer sentinel nodes enclosing three element nodes: Baltimore, Rome, Seattle.]
Difference from singly linked lists: each node contains two links, and there are two extra nodes, the header and the trailer, which contain no elements.
