
Stanford University CS161: Algorithms Luca Trevisan

Handout 1 April 1, 2013

Lecture 1
In which we review prerequisites and start talking about divide-and-conquer algorithms.

Prerequisites

This class assumes CS103, CS106 and CS109 as prerequisites.

Proofs and Big-Oh notation


From CS103 we assume mathematical maturity, that is, the ability to follow nontrivial mathematical proofs, the ability to easily absorb new definitions and concepts, and the ability to develop and to clearly, concisely, and rigorously present your own proofs. For example, you should be able to prove by induction that a tree with $n$ vertices has $n - 1$ edges.

We also assume familiarity with the notation $O(\cdot)$, $\Omega(\cdot)$, $o(\cdot)$, $\omega(\cdot)$, whose definitions we now recall.

If we write $f(n) = O(g(n))$, where $n$ is a positive integer that typically denotes the size of the input to an algorithm, then we mean that $f(n)$ is at most a fixed constant times $g(n)$ for every sufficiently large $n$, that is,

$$f(n) = O(g(n)) \iff \exists c, n_0 .\ \forall n \ge n_0 .\ f(n) \le c \cdot g(n) .$$

The following is a helpful fact: if $\lim_{n \to \infty} f(n)/g(n)$ exists and is finite, then $f(n) = O(g(n))$. The converse, unfortunately, is not necessarily true. For example, suppose that $f(n) = 1$ if $n$ is odd, and $f(n) = 35n^2$ if $n$ is even. (Maybe $f(n)$ is the running time of an algorithm that pairs the $n$ input items in some way, and if the number of inputs is odd then the algorithm immediately quits with an error message.) Then $f(n) = O(n^2)$, but $\lim_{n \to \infty} f(n)/n^2$ does not exist. It is also true, although maybe less helpful, that $f(n) = O(g(n))$ if and only if $\limsup_{n \to \infty} f(n)/g(n)$ is finite.

If we write $f(n) = \Omega(g(n))$ then we mean that, for sufficiently large $n$, $f(n)$ is at least a fixed constant times $g(n)$, that is,

$$f(n) = \Omega(g(n)) \iff \exists c > 0, n_0 .\ \forall n \ge n_0 .\ f(n) \ge c \cdot g(n) .$$

Note that $f(n) = O(g(n))$ if and only if $g(n) = \Omega(f(n))$. If both $f(n) = O(g(n))$ and $f(n) = \Omega(g(n))$, then we write $f(n) = \Theta(g(n))$. For example,

$$35n^2 - n = \Theta(12n^2 + n) .$$

The meaning of $f(n) = o(g(n))$ is that, for sufficiently large $n$, $f(n)$ is smaller than $g(n)$ times an arbitrarily small constant:

$$f(n) = o(g(n)) \iff \forall \epsilon > 0 .\ \exists n_0 .\ \forall n \ge n_0 .\ f(n) \le \epsilon \cdot g(n) .$$

Note that $f(n) = o(g(n))$ is exactly the same as $\lim_{n \to \infty} f(n)/g(n) = 0$. Finally, $f(n) = \omega(g(n))$ means that $g(n) = o(f(n))$.
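As a worked check of the claim that $35n^2 - n$ and $12n^2 + n$ grow at the same rate, here is one possible choice of constants. The specific constants $3$ and $2$, and the threshold $n_0 = 1$, are our own illustrative choices; many others work. Both chains hold for every $n \ge 1$, using only $n \le n^2$:

```latex
\begin{align*}
35n^2 - n &\le 35n^2 \le 36n^2 \le 3\,(12n^2 + n), \\
35n^2 - n &\ge 34n^2 \ge 26n^2 = 2 \cdot 13n^2 \ge 2\,(12n^2 + n).
\end{align*}
```

The first line gives the big-Oh bound with $c = 3$, and the second gives the matching lower bound with $c = 2$.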

Coding
From CS106 we assume basic coding skills. It will be very useful to you (but not required) to code and to experiment with the algorithms and the data structures that we will describe in the course. The midterm will be a programming assignment.

Counting and probability


From CS109 we assume the basics of counting and of discrete probability. For example, how many functions of the type $f : A \to B$ are there, if $A$ and $B$ are finite sets? How many functions of type $f : \{0,1\}^n \to \{0,1\}$? How many functions of type $f : \{1, \ldots, n\} \to \{1, \ldots, n\}$? Let us say that an input $x$ for a function $f(\cdot)$ is a fixed point if $f(x) = x$. If we sample uniformly at random a function $f : \{1, \ldots, n\} \to \{1, \ldots, n\}$, what is, exactly, the probability that $f(\cdot)$ has no fixed point? What does this probability tend to as $n$ goes to infinity?

Answers: $|B|^{|A|}$, $2^{2^n}$, $n^n$, $\left(1 - \frac{1}{n}\right)^n$, $\frac{1}{e}$.
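For small $n$, the fixed-point answer can be sanity-checked by brute force. The sketch below is only an illustration: the function name `prob_no_fixed_point` is ours, and the enumeration visits all $n^n$ functions, so it is practical only for very small $n$.

```python
from fractions import Fraction
from itertools import product


def prob_no_fixed_point(n: int) -> Fraction:
    """Exact probability that a uniformly random f: {1..n} -> {1..n}
    has no fixed point, computed by enumerating all n**n functions."""
    domain = range(1, n + 1)
    total = 0
    without_fixed_point = 0
    # Each tuple `values` encodes one function: f(x) = values[x - 1].
    for values in product(domain, repeat=n):
        total += 1
        if all(values[x - 1] != x for x in domain):
            without_fixed_point += 1
    return Fraction(without_fixed_point, total)


if __name__ == "__main__":
    for n in range(1, 6):
        # Compare the enumerated value with the closed form (1 - 1/n)^n.
        print(n, prob_no_fixed_point(n), (1 - Fraction(1, n)) ** n)
```

Using `Fraction` keeps the comparison with $(1 - 1/n)^n$ exact rather than subject to floating-point rounding.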

Divide-and-conquer

Divide-and-conquer is a general technique to design algorithms: the idea (which works for many, but definitely not all, problems) is to divide the problem into pieces, to solve each piece recursively, and then to combine the solutions to the pieces.

Deciding what will be a piece, what problem we want to solve on each piece, and how the solutions will be combined, will depend on the problem. Sometimes, considerable ingenuity goes into coming up with the right way to instantiate this method.

2.1 Binary search

Given a sorted vector $A = A[0], A[1], \ldots, A[n-1]$ and a target value $x$, we want to find an index $i$ such that $A[i] = x$, if such an index exists. We will call the following procedure with arguments $(A, x, 0, n-1)$.

    function binary-search(A, x, L, R)
        if L > R then
            return Fail
        else
            i ← ⌊(L + R)/2⌋
            if A[i] == x then
                return i
            else if A[i] > x then
                return binary-search(A, x, L, i − 1)
            else
                return binary-search(A, x, i + 1, R)
            end if
        end if
    end function

The procedure looks for an index $i$ such that $A[i] = x$ and such that $L \le i \le R$. If $L > R$ then the interval $L \le i \le R$ is empty and the search fails. Otherwise we try $i = \lfloor (L + R)/2 \rfloor$. If we find $x$ we are done; otherwise either $A[i] > x$, in which case we do not need to look at the sub-vector $A[i], \ldots, A[R]$, and we can recurse on the sub-vector $A[L], \ldots, A[i-1]$, or $A[i] < x$, in which case we recurse on $A[i+1], \ldots, A[R]$.

If we denote by $T(n)$ the worst-case running time of binary search on inputs of length $n$, then we have

$$T(1) = O(1), \qquad T(n) = O(1) + T(n/2)$$

That is, there is a constant $c$ such that

$$T(1) \le c, \qquad T(n) \le c + T(n/2)$$

Now define a function $T'$ as

$$T'(1) = 1, \qquad T'(n) = 1 + T'(n/2)$$

Then it is easy to prove by induction that we have $T(n) \le c \cdot T'(n)$, and so if we prove a big-Oh bound on $T'$, then the same bound will apply to $T$. Unfolding the recursive definition of $T'$, we see that

$$T'(n) = 1 + T'(n/2) = 2 + T'(n/4) = \cdots = k + T'(n/2^k)$$

so if we pick $k = \log_2 n$, we have $T'(n) = \log_2 n + 1$, and so $T(n) = O(\log n)$.¹
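The binary search procedure and the unfolding argument can be tried out with a short Python sketch. This is one possible rendering, not the course's reference code: the names `binary_search` and `t_prime` are ours, `None` stands in for Fail, and `t_prime` implements the auxiliary recurrence that equals 1 at $n = 1$ and adds 1 per halving, with $n/2$ rounded down.

```python
from typing import Optional, Sequence


def binary_search(a: Sequence[int], x: int, lo: int, hi: int) -> Optional[int]:
    """Return an index i with lo <= i <= hi and a[i] == x, or None if none exists.

    Assumes a[lo..hi] is sorted in non-decreasing order.
    """
    if lo > hi:
        return None  # empty interval: the search fails
    mid = (lo + hi) // 2  # floor of the midpoint
    if a[mid] == x:
        return mid
    if a[mid] > x:
        return binary_search(a, x, lo, mid - 1)  # x can only be to the left
    return binary_search(a, x, mid + 1, hi)  # x can only be to the right


def t_prime(n: int) -> int:
    """Auxiliary recurrence: value 1 at n = 1, otherwise 1 + value at n // 2."""
    if n == 1:
        return 1
    return 1 + t_prime(n // 2)


if __name__ == "__main__":
    a = [1, 3, 5, 7, 9, 11]
    print(binary_search(a, 7, 0, len(a) - 1))
    print(binary_search(a, 4, 0, len(a) - 1))
    # For powers of two, the recurrence unfolds to log2(n) + 1.
    print([t_prime(2 ** k) for k in range(5)])
```

The initial call uses `lo = 0` and `hi = len(a) - 1`, matching the arguments $(A, x, 0, n-1)$ in the text.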

2.2 Mergesort

Given a vector $A = A[0], \ldots, A[n-1]$, we want to sort it. We will proceed in the following way: split $A$ into two pieces, $A[0], \ldots, A[n/2 - 1]$ and $A[n/2], \ldots, A[n-1]$, each of length $n/2$, sort each piece recursively, and then combine the two pieces.

The procedure mergesort takes as input a vector $A$ and its length $n$, and it returns a sorted copy of $A$. If $A$ has length 1 or is empty, then there is nothing to do. Otherwise we split $A$ into two vectors $B$ and $C$, each containing half of the elements of $A$, we sort each of them recursively, and then we invoke the procedure merge to combine them.

    function mergesort(A, n)
        if n == 1 or n == 0 then
            return A
        else
            b ← ⌊n/2⌋
            c ← n − b
            B ← A[0 : b − 1]
            C ← A[b : n − 1]
            return merge(mergesort(B, b), mergesort(C, c), b, c)
        end if
    end function

The procedure that does all the work is merge. It takes as input two sorted vectors $B$ and $C$, of length $b$ and $c$ respectively, and it merges them into a sorted vector $M$.

    function merge(B, C, b, c)
        M ← empty vector of length b + c
        i ← 0
        j ← 0
        p ← 0
        B[b] ← ∞
        C[c] ← ∞
        while i < b or j < c do
            if B[i] < C[j] then
                M[p] ← B[i]
                p ← p + 1
                i ← i + 1
            else
                M[p] ← C[j]
                p ← p + 1
                j ← j + 1
            end if
        end while
        return M
    end function

In general, we will not give fully rigorous proofs of correctness of algorithms (the sample proof given in HW1 is representative of the proofs we will do in class and of the proofs that we will require in homeworks), but here we will illustrate what such a proof looks like.

The algorithm merge maintains three pointers, $p$ in $M$, $i$ in $B$ and $j$ in $C$. We will prove by induction that the following properties are true at every execution of the while loop:

1. $M[0 : p-1]$ is sorted, and it contains precisely the elements of $B[0 : i-1]$ and $C[0 : j-1]$;
2. All the elements of $M[0 : p-1]$ are smaller than or equal to all the elements of $B[i : b-1]$ and $C[j : c-1]$.

To prove that the above properties are always satisfied, we prove that they are true at the beginning, and then we prove that if they are true before an execution of the while loop, then they remain true after another execution of the loop. Then it follows by induction on the number of times the while loop is executed that the properties are true at all steps and, in particular, at the end of the while loop.

At the beginning, $M[0 : p-1]$, $B[0 : i-1]$ and $C[0 : j-1]$ are all empty and so the conditions are trivially true. If the conditions are true and we run one step of the while loop, then we write on $M[p]$ the smaller of $B[i]$ and $C[j]$, that is, the smallest element of the union of $B[i : b-1]$ and $C[j : c-1]$, which means that the second property remains satisfied. Because of the second property, we also maintain the condition that $M$ is sorted, because the element that we insert in position $p$ is greater than or equal to all the elements in $M[0 : p-1]$. Finally, it is clear that the second part of the first condition is also maintained.

At the end of the algorithm, we have $i = b$ and $j = c$, so the first property tells us that $M$ contains all the elements of $B$ and $C$, and that $M$ is sorted, as desired.

It should also be clear that the running time of merge is $O(b + c)$, because every iteration of the while loop takes $O(1)$ time and increases either $i$ or $j$, which means that, at every step, the time elapsed so far is $O(i + j)$ and, at the end, it is $O(b + c)$.

¹ To be precise, this analysis only covers the case in which $n$ is a power of two, and $n/2$ should be $\lfloor n/2 \rfloor$ in the definition of $T(n)$. We will see how to deal with these kinds of issues later.
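The mergesort and merge procedures can be rendered in Python roughly as follows. This is a sketch under stated assumptions rather than a definitive implementation: it drops the explicit length arguments (Python lists know their length), uses `math.inf` as the sentinel, and therefore assumes every input entry is a finite number. The `assert` statements inside the loop are our addition; they check the sortedness part of the first property and the second property at run time.

```python
import math
from typing import List


def merge(b: List[float], c: List[float]) -> List[float]:
    """Merge two sorted lists of finite numbers into one sorted list."""
    nb, nc = len(b), len(c)
    # Sentinels stand in for B[b] = C[c] = infinity (copies, inputs untouched).
    b = b + [math.inf]
    c = c + [math.inf]
    m: List[float] = []
    i = j = 0
    while i < nb or j < nc:
        if b[i] < c[j]:
            m.append(b[i])
            i += 1
        else:
            m.append(c[j])
            j += 1
        # Property 1 (sortedness part): the prefix written so far is sorted.
        assert len(m) < 2 or m[-2] <= m[-1]
        # Property 2: nothing written so far exceeds any remaining element.
        assert m[-1] <= b[i] and m[-1] <= c[j]
    return m


def mergesort(a: List[float]) -> List[float]:
    """Return a sorted copy of a, splitting at floor(n / 2)."""
    n = len(a)
    if n <= 1:
        return list(a)
    half = n // 2
    return merge(mergesort(a[:half]), mergesort(a[half:]))


if __name__ == "__main__":
    print(mergesort([5, 2, 9, 1, 5, 6]))
```

Appending to `m` plays the role of the pointer $p$: at any moment `len(m)` equals $p$, which in turn equals $i + j$.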
