Lecture 1
In which we review prerequisites and start talking about divide-and-conquer algorithms.
Prerequisites
Note that f(n) = O(g(n)) if and only if g(n) = Ω(f(n)). If both f(n) = O(g(n)) and f(n) = Ω(g(n)), then we write f(n) = Θ(g(n)). For example, 35n² − n = Θ(12n² + n).

The meaning of f(n) = o(g(n)) is that, for sufficiently large n, f(n) is smaller than g(n) times an arbitrarily small constant:

f(n) = o(g(n))  ⇔  ∀ε > 0. ∃n₀. ∀n ≥ n₀. f(n) ≤ ε · g(n)

Note that f(n) = o(g(n)) is exactly the same as lim_{n→∞} f(n)/g(n) = 0. Finally, f(n) = ω(g(n)) means that g(n) = o(f(n)).
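As a quick sanity check (not a proof), we can test the Θ example numerically: the ratio f(n)/g(n) should stay between two positive constants, and here it converges to 35/12 as n grows. The function names below are illustrative, not part of the notes.

```python
# Numeric sanity check that 35n^2 - n = Theta(12n^2 + n):
# the ratio f(n)/g(n) stays between positive constants, and
# converges to 35/12 as n grows.

def f(n):
    return 35 * n**2 - n

def g(n):
    return 12 * n**2 + n

ratios = [f(n) / g(n) for n in [10, 100, 1000, 10**6]]

# The ratio is sandwiched between positive constants c1 = 1 and c2 = 3 ...
assert all(1 <= r <= 3 for r in ratios)
# ... and approaches 35/12 in the limit.
assert abs(ratios[-1] - 35 / 12) < 0.001
```

Of course, a finite table of ratios proves nothing by itself; the point is only to build intuition for what the Θ relation says about the two functions.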
Coding
From CS106 we assume basic coding skills. It will be very useful to you (but not required) to code and to experiment with the algorithms and the data structures that we will describe in the course. The midterm will be a programming assignment.
2 Divide-and-conquer

Divide-and-conquer is a general technique to design algorithms: the idea (which works for many, but definitely not all, problems) is to divide the problem into pieces, to solve each piece recursively, and then to combine the solutions to the pieces.
Deciding what will be a piece, what problem we want to solve on each piece, and how the solutions will be combined, will depend on the problem. Sometimes, considerable ingenuity goes into coming up with the right way to instantiate this method.
2.1 Binary search
Given a sorted vector A = A[0], A[1], . . . , A[n − 1] and a target value x, we want to find an index i such that A[i] = x, if such an index exists. We will call the following procedure with arguments (A, x, 0, n − 1).

function binary-search(A, x, L, R)
    if L > R then
        return Fail
    else
        i ← ⌊(L + R)/2⌋
        if A[i] == x then
            return i
        else if A[i] > x then
            return binary-search(A, x, L, i − 1)
        else
            return binary-search(A, x, i + 1, R)
        end if
    end if
end function

The procedure looks for an index i such that A[i] = x and such that L ≤ i ≤ R. If L > R then the interval L ≤ i ≤ R is empty and the search fails. Otherwise we try i = ⌊(L + R)/2⌋. If we find x we are done; otherwise either A[i] > x, in which case we do not need to look at the sub-vector A[i], . . . , A[R], and we can recurse on the sub-vector A[L], . . . , A[i − 1], or A[i] < x, in which case we recurse on A[i + 1], . . . , A[R]. If we denote by T(n) the worst-case running time of binary search on inputs of length n, then we have
T(1) = O(1)
T(n) = O(1) + T(n/2)

That is, there is a constant c such that T(1) ≤ c and T(n) ≤ c + T(n/2). Define

T′(1) = 1
T′(n) = 1 + T′(n/2)

Then it is easy to prove by induction that T(n) ≤ c · T′(n), and so if we prove a big-Oh bound on T′, then the same bound will apply to T. If we unfold the recursive definition of T′ we see that

T′(n) = 1 + T′(n/2) = 2 + T′(n/4) = · · · = k + T′(n/2^k)

so if we pick k = log₂ n, we have T′(n) = log₂ n + 1, and so T(n) = O(log n).¹
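The pseudocode translates directly into Python; the sketch below (names are our own, not part of the notes) also counts the recursive calls, as a rough check on the log₂ n + 1 bound just derived.

```python
# A direct Python transcription of the binary-search pseudocode,
# with a call counter to check the logarithmic bound derived above.
# None plays the role of "Fail".

calls = 0

def binary_search(A, x, L, R):
    global calls
    calls += 1
    if L > R:
        return None  # the interval L <= i <= R is empty: the search fails
    i = (L + R) // 2  # floor of (L + R)/2
    if A[i] == x:
        return i
    elif A[i] > x:
        return binary_search(A, x, L, i - 1)   # recurse on A[L..i-1]
    else:
        return binary_search(A, x, i + 1, R)   # recurse on A[i+1..R]

A = list(range(0, 2048, 2))  # 1024 sorted even numbers
assert binary_search(A, 2000, 0, len(A) - 1) == 1000
assert binary_search(A, 7, 0, len(A) - 1) is None

# n = 1024 = 2^10: the interval at least halves on every call, so even an
# unsuccessful search makes at most log2(n) + 2 calls (counting the first
# call and the final empty-interval call).
calls = 0
binary_search(A, 7, 0, len(A) - 1)
assert calls <= 12
```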
2.2 Mergesort
Given a vector A = A[0], . . . , A[n − 1], we want to sort it. We will proceed in the following way: split A into two pieces, A[0], . . . , A[⌊n/2⌋ − 1] and A[⌊n/2⌋], . . . , A[n − 1], each of length about n/2; sort each piece recursively; and then combine the two sorted pieces.

The procedure mergesort takes as input a vector A and its length n and it returns a vector containing the elements of A in sorted order. If A has length 1 or is empty, then there is nothing to do. Otherwise we split A into two vectors B and C, each containing half of the elements of A, we sort each of them recursively, and then we invoke the procedure merge to combine them together.

The procedure that does all the work is merge. It takes as input two sorted vectors B and C, of length b and c, respectively, and it merges them into a sorted vector M.

In general, we will not give fully rigorous proofs of correctness of algorithms (the sample proof given in HW1 is representative of the proofs we will do in class and of
¹ To be precise, this analysis only covers the case in which n is a power of two, and n/2 should be ⌈n/2⌉ in the definition of T(n). We will see how to deal with this kind of issue later.
function mergesort(A, n)
    if n == 1 or n == 0 then
        return A
    else
        b ← ⌊n/2⌋
        c ← n − b
        B ← A[0 : b − 1]
        C ← A[b : n − 1]
        return merge(mergesort(B, b), mergesort(C, c), b, c)
    end if
end function
function merge(B, C, b, c)
    M ← empty vector of length b + c
    i ← 0
    j ← 0
    p ← 0
    B[b] ← ∞
    C[c] ← ∞
    while i < b or j < c do
        if B[i] < C[j] then
            M[p] ← B[i]
            p ← p + 1
            i ← i + 1
        else
            M[p] ← C[j]
            p ← p + 1
            j ← j + 1
        end if
    end while
    return M
end function
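The two procedures transcribe almost line for line into Python. As in the pseudocode, merge appends an "infinity" sentinel to each input so that the comparison B[i] < C[j] is always well defined even after one vector has been exhausted; math.inf plays the role of ∞.

```python
import math

# Python transcription of merge and mergesort. The sentinels B[b] and C[c]
# make the comparison in the loop well defined when i == b or j == c.

def merge(B, C, b, c):
    B = B + [math.inf]  # sentinel B[b]
    C = C + [math.inf]  # sentinel C[c]
    M = [None] * (b + c)
    i = j = p = 0
    while i < b or j < c:
        if B[i] < C[j]:
            M[p] = B[i]
            i += 1
        else:
            M[p] = C[j]
            j += 1
        p += 1
    return M

def mergesort(A, n):
    if n <= 1:  # length 1 or empty: nothing to do
        return A
    b = n // 2
    c = n - b
    return merge(mergesort(A[:b], b), mergesort(A[b:], c), b, c)

assert mergesort([5, 2, 9, 1, 5, 6], 6) == [1, 2, 5, 5, 6, 9]
```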
the proofs that we will require in homeworks), but here we will illustrate what such a proof looks like.

The algorithm merge maintains three pointers: p in M, i in B and j in C. We will prove by induction that the following properties are true at every execution of the while loop:

1. M[0 : p − 1] is sorted, and it contains precisely the elements of B[0 : i − 1] and C[0 : j − 1];
2. All the elements of M[0 : p − 1] are smaller than or equal to all the elements of B[i : b − 1] and C[j : c − 1].

To prove that the above properties are always satisfied, we prove that they are true at the beginning, and then we prove that if they are true before an execution of the while loop, then they remain true after another execution of the loop. It then follows by induction on the number of times the while loop is executed that the properties are true at all steps and, in particular, at the end of the while loop.

At the beginning, M[0 : p − 1], B[0 : i − 1] and C[0 : j − 1] are all empty, and so the conditions are trivially true. If the conditions are true and we run one step of the while loop, then we write in M[p] the smaller of B[i] and C[j], that is, the smallest element of the union of B[i : b − 1] and C[j : c − 1], which means that the second property remains satisfied. Because of the second property, we also maintain the condition that M is sorted, because the element that we insert in position p is greater than or equal to all the elements in M[0 : p − 1]. Finally, it is clear that the second part of the first condition is also maintained.

At the end of the algorithm, we have i = b and j = c, so the first property tells us that M contains all the elements of B and C, and that M is sorted, as desired.

It should also be clear that the running time of merge is O(b + c), because every iteration of the while loop takes O(1) time and increases either i or j, which means that, at every step, the time elapsed so far is O(i + j) and, at the end, it is O(b + c).
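The inductive proof can itself be exercised in code: the sketch below (an instrumented merge of our own, not part of the notes) re-checks both loop invariants at the top of every iteration, so any violation would raise an assertion error.

```python
import math

# An instrumented merge that asserts the two loop invariants from the
# proof above at the start of every iteration of the while loop.

def checked_merge(B, C):
    b, c = len(B), len(C)
    Bs, Cs = B + [math.inf], C + [math.inf]  # sentinels B[b], C[c]
    M = []
    i = j = 0
    while i < b or j < c:
        # Invariant 1: M is sorted and contains exactly the elements
        # of B[0:i] and C[0:j].
        assert M == sorted(M)
        assert sorted(M) == sorted(B[:i] + C[:j])
        # Invariant 2: every element already in M is <= every element
        # still unmerged in B[i:] and C[j:].
        assert all(m <= x for m in M for x in B[i:] + C[j:])
        if Bs[i] < Cs[j]:
            M.append(Bs[i])
            i += 1
        else:
            M.append(Cs[j])
            j += 1
    return M

assert checked_merge([1, 3, 8], [2, 2, 9]) == [1, 2, 2, 3, 8, 9]
```

Running this on a few inputs does not replace the induction, but it is a useful habit: an invariant strong enough to prove correctness is usually also cheap to assert in code.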