
Assistant Professor, Computer Science Department, American International University Bangladesh. Email: shamimakhter@aiub.edu

Searching is a common problem in computer science. It involves storing and maintaining a large data set and then searching the data for particular values. Data storage and retrieval are key to many industry applications, and search algorithms are necessary for storing and retrieving data efficiently.

For instance, a program that checks the spelling of words searches for them in a dictionary, which is just an ordered list of words. Problems of this kind are called searching problems.

There are many searching algorithms. The natural searching method is linear search (also called sequential search or exhaustive search).

Linear search: very simple, but takes a long time to apply to large lists.

Binary search: much faster than linear search, but requires a sorted list.

Interpolation search: like a binary search, it repeatedly subdivides the list to locate an item.

Linear search is a special case of brute-force search. This is a very simple algorithm. It uses a loop to sequentially step through an array, starting with the first element. It compares each element with the value being searched for and stops when that value is found or the end of the array is reached.

Sub LinearSearch(x: Int, a[]: Int, n: Int, loc: Int)
    i := 1
    While (i <= n) And (x <> a[i])
        i := i + 1
    End While
    If i <= n Then loc := i Else loc := 0
End Sub

Array numlist contains

17 23 5 11 2 29 3

Searching for the value 11, linear search examines 17, 23, 5, and 11 -> Found.
Searching for the value 7, linear search examines 17, 23, 5, 11, 2, 29, and 3 -> Not Found.
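The example above can be reproduced with a minimal Python sketch. The function name `linear_search` is an assumption for illustration, and the contents of `numlist` are taken from the values the example says are examined. Like the pseudocode, it returns a 1-based position, or 0 when the value is absent.

```python
def linear_search(a, x):
    """Return the 1-based position of x in a, or 0 if x is not present."""
    for i, value in enumerate(a, start=1):
        if value == x:        # stop at the first match
            return i
    return 0                  # reached the end without finding x


numlist = [17, 23, 5, 11, 2, 29, 3]
print(linear_search(numlist, 11))  # 4
print(linear_search(numlist, 7))   # 0
```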

The advantage is its simplicity.

It is easy to understand.
It is easy to implement.
It does not require the array to be in order.

The disadvantage is its inefficiency. If there are 20,000 items in the array and what you are looking for is in the 19,999th element, you need to search through almost the entire list.

Whenever the number of entries doubles, so does the running time, roughly. If a machine does 1 million comparisons per second, 4 billion comparisons take about 4,000 seconds, which is more than an hour.

Use a Sentinel to Improve the Performance

Sub LinearSearch2(x: Int, a[]: Int, n: Int, loc: Int)
    a[n+1] = x
    i = 1
    While (x <> a[i])
        i = i + 1
    End While
    If i <= n Then loc = i Else loc = 0
End Sub
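As a hedged illustration of the sentinel idea, the sketch below appends the target to the end of a Python list so that the loop needs no bounds check, then removes it again. The name `sentinel_search` and the choice to restore the list afterwards are assumptions, not part of the original slides.

```python
def sentinel_search(a, x):
    """Linear search with a sentinel; returns a 1-based position, or 0."""
    n = len(a)
    a.append(x)            # sentinel: guarantees the loop below terminates
    i = 0
    while a[i] != x:       # only one test per step, no "i < n" check
        i += 1
    a.pop()                # remove the sentinel so the caller's list is unchanged
    return i + 1 if i < n else 0   # a hit at index n is only the sentinel


numlist = [17, 23, 5, 11, 2, 29, 3]
print(sentinel_search(numlist, 11))  # 4
print(sentinel_search(numlist, 7))   # 0
```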

Apply Linear Search to Sorted Lists

Sub LinearSearch3(x: Int, a[]: Int, n: Int, loc: Int)
    i = 1
    While (i <= n) And (x > a[i])
        i = i + 1
    End While
    If (i <= n) And (a[i] = x) Then loc = i Else loc = 0
End Sub
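On a sorted list the scan can stop as soon as it reaches a value that is not smaller than the target. A small sketch of that early exit follows; the function name `sorted_linear_search` is hypothetical, and the bounds check is included so that a target larger than every element does not run past the end.

```python
def sorted_linear_search(a, x):
    """Linear search on a sorted list; returns a 1-based position, or 0."""
    i = 0
    while i < len(a) and a[i] < x:   # skip everything smaller than x
        i += 1
    # Either we ran off the end, or a[i] >= x; only equality is a hit.
    return i + 1 if i < len(a) and a[i] == x else 0


sorted_list = [2, 3, 5, 11, 17, 23, 29]
print(sorted_linear_search(sorted_list, 11))  # 4
print(sorted_linear_search(sorted_list, 7))   # 0, stops at 11 without scanning the rest
```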

Can We Search More Efficiently?

Yes, provided the list is in some kind of order, for example alphabetical order with respect to the names. If this is the case, we use a divide and conquer strategy to find an item quickly. This strategy is what one would use in a number guessing game, for example.

I'm thinking of a number between 1 and 1000. Guess it!

Is it 500? Nope, too low. Is it 750? Nope, too high. Is it 625? etc

Apply This Strategy to Searching

The resulting algorithm is called the Binary Search algorithm. We check the middle key in our list.

If it is beyond what we are looking for (too high), we look only at the 1st half of the list. If it's not far enough in (too low), we look at the 2nd half.

Then iterate!

1. Divide the sorted array into three sections: the middle element, the elements on one side of the middle element, and the elements on the other side of the middle element.

2. If the middle element is the correct value, done. Otherwise, go to step 1, using only the half of the array that may contain the correct value.

3. Continue steps 1 and 2 until either the value is found or there are no more elements to examine.

Binary Search Example

Array numlist2 contains

2 3 5 11 17 23 29

Searching for the value 11, binary search examines 11 and stops. Found.
Searching for the value 7, binary search examines 11, 3, 5, and stops. Not Found.

Algorithm for Binary search

Sub BinarySearch(x: Int, a[]: Int, n: Int, loc: Int)
    i = 1: j = n
    While i < j
        m = (i + j) \ 2
        If x > a[m] Then i = m + 1 Else j = m
    End While
    If x = a[i] Then loc = i Else loc = 0
End Sub
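The pseudocode above translates fairly directly into Python. The sketch below is one possible rendering, with the name `binary_search` and the empty-list guard added as assumptions. Note that this variant narrows the range down to a single element before testing for equality, so the elements it probes can differ from those listed in the worked example.

```python
def binary_search(a, x):
    """Return the 1-based position of x in the sorted list a, or 0."""
    if not a:
        return 0
    i, j = 0, len(a) - 1           # 0-based bounds of the current range
    while i < j:
        m = (i + j) // 2           # middle of the current range
        if x > a[m]:
            i = m + 1              # x can only be in the upper half
        else:
            j = m                  # x, if present, is at m or below
    return i + 1 if a[i] == x else 0


numlist2 = [2, 3, 5, 11, 17, 23, 29]
print(binary_search(numlist2, 11))  # 4
print(binary_search(numlist2, 7))   # 0
```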

The worst-case number of comparisons grows by only 1 every time the list size is doubled. Only about 32 comparisons would be needed on a list of 4 billion items using binary search.

Sequential search could need 4 billion comparisons, which at 1 million comparisons per second would take more than an hour!
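The figures in this comparison can be checked with a few lines of Python. The list size and the rate of 1 million comparisons per second come from the text; counting the binary search worst case as floor(log2 n) + 1 probes is an assumption about how the comparisons are counted.

```python
import math

n = 4_000_000_000      # list size used in the text
rate = 1_000_000       # comparisons per second used in the text

linear_worst = n                               # one comparison per element
binary_worst = math.floor(math.log2(n)) + 1    # halve until one element is left

print(linear_worst / rate)   # seconds needed by the linear worst case
print(binary_worst)          # 32
```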

Benefit

Much more efficient than linear search. For an array of N elements, it performs at most about log2 N comparisons.

Disadvantage

Requires that array elements be sorted.

Binary search is a great improvement over linear search: it eliminates a large portion of the list without actually examining all of it.

If the values are fairly evenly distributed, interpolation can be used to eliminate even more values at each step.

Interpolation is the process of using known values to guess the position of an unknown value. Here, the indexes of known values in the list are used to guess what index the target value should have. Interpolation search selects the dividing point by interpolation using the following formula:

m = l + (x - a[l])*(r-l)/(a[r]-a[l])

Compare x to a[m]:

If x = a[m]: Found.
If x < a[m]: set r = m - 1.
If x > a[m]: set l = m + 1.

If the search is still not finished, continue searching with the new l and r. Stop searching when Found or x<a[l] or x>a[r].

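Example: Find the key x = 32 in the list

i:     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
a[i]:  1  4  7  9  9 12 13 17 19 21 24 32 36 44 45 54 55 63 66 70

(m is rounded to the nearest integer.)
1: l=1, r=20 -> m=1+(32-1)*(20-1)/(70-1) = 10; a[10]=21<32=x -> l=11
2: l=11, r=20 -> m=11+(32-24)*(20-11)/(70-24) = 13; a[13]=36>32=x -> r=12
3: l=11, r=12 -> m=11+(32-24)*(12-11)/(32-24) = 12; a[12]=32=x -> Found at m = 12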

Example: Find the key x = 30 in the list

i:     1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20
a[i]:  1  4  7  9  9 12 13 17 19 21 24 32 36 44 45 54 55 63 66 70

(m is rounded to the nearest integer.)
1: l=1, r=20 -> m=1+(30-1)*(20-1)/(70-1) = 9; a[9]=19<30=x -> l=10
2: l=10, r=20 -> m=10+(30-21)*(20-10)/(70-21) = 12; a[12]=32>30=x -> r=11
3: l=10, r=11 -> m=10+(30-21)*(11-10)/(24-21) = 13; m=13>11=r: Not Found

Private Sub Interpolation(a[]: Int, x: Int, n: Int, Found: Boolean)
    l = 1: r = n
    Do While (r > l)
        m = l + ((x - a[l]) / (a[r] - a[l])) * (r - l)
        Verify and decide what to do next
    Loop
End Sub

Verify and decide what to do next:

    If (a[m] = x) Or (m < l) Or (m > r) Then
        Found = iif(a[m] = x, True, False)
        Exit Do
    ElseIf (a[m] < x) Then
        l = m + 1
    ElseIf (a[m] > x) Then
        r = m - 1
    End If
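A runnable Python sketch of interpolation search follows. It is not a line-by-line translation of the pseudocode: the function name, the rounding of m to the nearest integer with integer arithmetic, and the loop guard `a[l] <= x <= a[r]` (which implements the stopping rule "x<a[l] or x>a[r]" and keeps m inside the current range) are assumptions made for this illustration. It uses 1-based positions to match the worked examples.

```python
def interpolation_search(values, x):
    """Return the 1-based position of x in the sorted list values, or 0."""
    a = [None] + list(values)          # shift to 1-based indexing, as in the text
    l, r = 1, len(values)
    while l <= r and a[l] <= x <= a[r]:
        if a[l] == a[r]:               # equal endpoints: x must equal a[l] here
            return l
        num = (x - a[l]) * (r - l)
        den = a[r] - a[l]
        m = l + (2 * num + den) // (2 * den)   # l + num/den, rounded to nearest
        if a[m] == x:
            return m
        if a[m] < x:
            l = m + 1                  # discard everything up to m
        else:
            r = m - 1                  # discard everything from m upward
    return 0


keys = [1, 4, 7, 9, 9, 12, 13, 17, 19, 21,
        24, 32, 36, 44, 45, 54, 55, 63, 66, 70]
print(interpolation_search(keys, 32))  # 12
print(interpolation_search(keys, 30))  # 0
```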

Binary search is very fast (O(log n)), but on evenly distributed data interpolation search is much faster on average (O(log log n)). For n = 2^32 (about four billion items):

Binary search takes about 32 steps of verification.
Interpolation search takes only about 5 steps of verification.

Interpolation search performance time is nearly constant for a large range of n. Interpolation search is even more useful if the data is stored on a hard disk or another relatively slow device, where each probe is expensive.
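The two step counts quoted for n = 2^32 are simply log2 n and log2 log2 n, which a short Python check makes concrete. This is only the arithmetic behind the estimate, not a measurement.

```python
import math

n = 2 ** 32
binary_steps = math.log2(n)                   # about log2 n probes
interpolation_steps = math.log2(math.log2(n)) # about log2 log2 n probes

print(binary_steps)          # 32.0
print(interpolation_steps)   # 5.0
```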

It's a binary tree! For each node in a BST:

every key in its left subtree is smaller than it, and
every key in its right subtree is greater than it.

Search Operation

Operation Insert

Worst Case

Performance

Depends on the shape of the tree.

Best Case:

Perfectly balanced tree, about log N nodes from root to leaf

Worst Case:

N nodes in a search path

Average Case:

About 1.39 log2 N comparisons for N keys inserted in random order
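To make the dependence on tree shape concrete, here is a minimal BST sketch in Python. The names `Node`, `insert`, `search`, and `build` are assumptions for illustration, and duplicates are simply ignored. Inserting the same 15 keys in two different orders yields a perfectly balanced tree in one case and a degenerate chain in the other.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into the BST and return the root; duplicates are ignored."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key == node.key:
            return root
        side = "left" if key < node.key else "right"
        child = getattr(node, side)
        if child is None:
            setattr(node, side, Node(key))   # attach the new leaf
            return root
        node = child


def search(root, key):
    """Return (found, number of nodes visited on the search path)."""
    node, visited = root, 0
    while node is not None:
        visited += 1
        if key == node.key:
            return True, visited
        node = node.left if key < node.key else node.right
    return False, visited


def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root


balanced = build([8, 4, 12, 2, 6, 10, 14, 1, 3, 5, 7, 9, 11, 13, 15])
degenerate = build(range(1, 16))     # sorted input: every node has only a right child
print(search(balanced, 15))          # (True, 4)
print(search(degenerate, 15))        # (True, 15)
```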

Balanced Tree

Tree structures support various basic dynamic set operations in time proportional to the height of the tree, e.g.: Search, Predecessor, Successor, Minimum, Maximum, Insert, and Delete.

Ideally, a tree will be balanced and the height will be log n, where n is the number of nodes in the tree. Balancing ensures that the height of the tree is as small as possible and therefore provides the best running time.

Balanced BST

BST worst case: O(N)

The tree needs to be balanced.

Approach:

Rebalance the BST explicitly. Rebalancing is recursive and takes linear time. However, the insertion cost becomes quadratic when rebalancing frequently.

Every insert and search will be logarithmic

In a 2-3-4 tree, nodes store 1, 2, or 3 keys and have 2, 3, or 4 children, respectively. All leaves have the same depth.

These trees introduce nodes with more than 1 key and more than 2 children:

2-node: 1 key, 2 links (same as a binary node)
3-node: 2 keys, 3 links
4-node: 3 keys, 4 links

Why not minimize height by maximizing children in a d-tree? Let each node have d children so that we get O(log_d N) search time! Right?

However, finding the correct child on each level takes time too. For example, with d = N^(1/2) the tree has only 2 levels, but each level requires O(log N^(1/2)) comparisons by binary search, so the total is 2 log N^(1/2) = O(log N), which is not as good as we had hoped for! 2-3-4 trees guarantee O(log N) height using only 2, 3, or 4 children per node.
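The arithmetic behind this argument can be written out as follows. This is a sketch that assumes binary search is used inside each node and that logarithms are base 2; the choice d = N^{1/2} is the example the argument appears to use.

```latex
\begin{align*}
\text{height} &= \log_d N = \frac{\log_2 N}{\log_2 d}, \\
\text{cost per level} &= O(\log_2 d), \\
\text{total search cost} &= \log_d N \cdot O(\log_2 d) = O(\log_2 N). \\
\text{With } d = N^{1/2}: \quad & \log_d N = 2, \qquad 2 \cdot \log_2 N^{1/2} = \log_2 N .
\end{align*}
```

Whatever d is chosen, the saving in height is paid back inside the nodes, so the total stays at O(log N).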

Insert the new key at the lowest internal node reached in the search.

A 2-node becomes a 3-node.
A 3-node becomes a 4-node.
What about a 4-node? We can't insert another key!

On our way down the tree, whenever we reach a 4-node, we break it up into two 2-nodes and move the middle element up into the parent node.

Now we can perform the insertion using one of the previous two cases. Since we follow this method from the root down to the leaf, it is called top-down insertion.

As we travel down the tree, if we encounter any 4-node we break it up into 2-nodes. This guarantees that we will never have the problem of inserting the middle element of a former 4-node into a parent that is itself a 4-node.

Time complexity:

A search visits O(log N) nodes.
An insertion requires O(log N) node splits.
Each node split takes constant time.
Hence, the operations Search and Insert each take O(log N) time.
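Top-down insertion can be sketched in Python as below. This is an illustrative implementation under stated assumptions, not the slides' own code: the names `Node234`, `split_child`, `insert_234`, and `inorder` are hypothetical, keys are kept in plain sorted lists, and duplicate keys are ignored.

```python
class Node234:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []            # 1 to 3 keys, kept sorted
        self.children = children or []    # empty for a leaf, else len(keys) + 1

    def is_leaf(self):
        return not self.children


def split_child(parent, i):
    """Split the 4-node parent.children[i]; its middle key moves up into parent."""
    node = parent.children[i]
    left = Node234(node.keys[:1], node.children[:2])
    right = Node234(node.keys[2:], node.children[2:])
    parent.keys.insert(i, node.keys[1])
    parent.children[i:i + 1] = [left, right]


def insert_234(root, key):
    """Top-down insertion; returns the (possibly new) root."""
    if root is None:
        return Node234([key])
    if len(root.keys) == 3:               # full root: split it, the tree grows one level
        root = Node234([], [root])
        split_child(root, 0)
    node = root
    while True:
        if key in node.keys:
            return root                   # duplicate, nothing to do
        if node.is_leaf():
            node.keys.append(key)         # the leaf has at most 2 keys at this point
            node.keys.sort()
            return root
        i = sum(k < key for k in node.keys)     # child to descend into
        if len(node.children[i].keys) == 3:
            split_child(node, i)          # break up the 4-node before entering it
            continue                      # the promoted key may change the choice of child
        node = node.children[i]


def inorder(node):
    """Return all keys in sorted order."""
    if node is None:
        return []
    if node.is_leaf():
        return list(node.keys)
    out = []
    for i, child in enumerate(node.children):
        out += inorder(child)
        if i < len(node.keys):
            out.append(node.keys[i])
    return out


root = None
for k in [50, 20, 70, 10, 30, 60, 80, 25, 27, 5, 15]:
    root = insert_234(root, k)
print(inorder(root))
```

Because every 4-node is split before it is entered, the parent of a split node always has room for the promoted key, which is the guarantee described above.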

What do we know about 2-3-4 Trees?

They are balanced.
They offer O(log N) search time.
They need different node structures.

Welcome to the world of Red-Black Trees!!!

Search as in a BST. Insert as in a 2-3-4 search tree.

Red-Black Tree

A red-black tree is a binary search tree with the following properties:

edges are colored red or black
no two consecutive red edges on any root-leaf path
same number of black edges on any root-leaf path (= black height of the tree)
edges connecting leaves are black


How 2-3-4 trees relate to red-black trees

1. Perform a standard search to find the leaf where the key should be added.
2. Replace the leaf with an internal node with the new key.
3. Color the incoming edge of the new node red.
4. Add two new leaves, and color their incoming edges black.

If the parent had an incoming red edge, we now have two consecutive red edges!

We must re-organize tree to remove that violation. What must be done depends on the sibling of the parent.

Restructuring

Case 2: Incoming edge of p is red, and its sibling is black

Double Rotation

What if the new node is between its parent and grandparent in the inorder sequence? We must perform a double rotation (which is no more difficult than a single one)

And this would be called a right-left double rotation

Bottom-Up Rebalancing

Case 3: Incoming edge of p is red and its sibling is also red

We call this a promotion. Note how the black depth remains unchanged for all of the descendants of g. This process will continue upward beyond g if necessary: rename g as n and repeat.

Summary of Insertion

If two consecutive red edges are present, we do either

a restructuring (with a single or double rotation) and stop, or
a promotion and continue.

A restructuring takes constant time and is performed at most once. It reorganizes an off-balance section of the tree. Promotions may continue up the tree and are executed O(log N) times. The time complexity of an insertion is O(log N).
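The insertion procedure summarized above can be sketched in Python as follows. Several choices here are assumptions made for the illustration: the color of a node's incoming edge is stored on the node itself, leaves are represented by `None` (and count as black), the root is always recolored black, duplicates are ignored, and the names `RBNode`, `RedBlackTree`, and `black_height` are hypothetical.

```python
RED, BLACK = "red", "black"


class RBNode:
    def __init__(self, key, parent=None):
        self.key = key
        self.color = RED            # color of the incoming edge; new nodes come in red
        self.left = self.right = None
        self.parent = parent


class RedBlackTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, side):
        """Rotate x down towards `side`; its child on the opposite side moves up."""
        other = "right" if side == "left" else "left"
        y = getattr(x, other)
        setattr(x, other, getattr(y, side))
        if getattr(y, side) is not None:
            getattr(y, side).parent = x
        y.parent = x.parent
        if x.parent is None:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        setattr(y, side, x)
        x.parent = y

    def insert(self, key):
        parent, node = None, self.root
        while node is not None:                   # standard BST descent
            if key == node.key:
                return
            parent, node = node, (node.left if key < node.key else node.right)
        n = RBNode(key, parent)
        if parent is None:
            self.root = n
        elif key < parent.key:
            parent.left = n
        else:
            parent.right = n
        # Repair two consecutive red edges, working bottom-up.
        while n.parent is not None and n.parent.color == RED:
            p, g = n.parent, n.parent.parent
            p_side = "left" if p is g.left else "right"
            s_side = "right" if p_side == "left" else "left"
            sibling = getattr(g, s_side)
            if sibling is not None and sibling.color == RED:
                p.color = sibling.color = BLACK   # promotion: recolor and move up
                g.color = RED
                n = g
            else:
                if n is getattr(p, s_side):       # n lies between p and g: double rotation
                    self._rotate(p, p_side)
                    n, p = p, n
                self._rotate(g, s_side)           # restructuring, then stop
                p.color, g.color = BLACK, RED
                break
        self.root.color = BLACK


def black_height(node):
    """Check the red-black properties below node and return its black height."""
    if node is None:
        return 1
    if node.color == RED:
        assert node.parent is not None and node.parent.color == BLACK
    left, right = black_height(node.left), black_height(node.right)
    assert left == right
    return left + (1 if node.color == BLACK else 0)


tree = RedBlackTree()
for key in range(1, 11):          # sorted input, which would degenerate a plain BST
    tree.insert(key)
print(black_height(tree.root))
```

The `black_height` helper doubles as a checker: it fails an assertion if two red edges are consecutive or if two root-leaf paths contain different numbers of black edges.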
