Course Information:
http://cs.calvin.edu/curriculum/cs/212/
Grades; turning in programs on Moodle
Why study DS & Algorithms?
Some problems are difficult to solve, but
good solutions are known
Some “solutions” don’t always work
Some simple algorithms don’t scale well
Data structures and algorithms make good
tools for addressing new problems
Interviews
Fun! Beauty! Joy!
Place in the curriculum
The study of interesting areas of computer
science: upper-level electives
108, 112: basic tools (like algebra to
mathematics)
212: More advanced tools. Introduction to
the science of computing. (A little more
mathematical than other core CS courses, but not as
rigorous as a real math course.)
Example: Search and Replace
Goal: replace one string with another
Which version is better?
[Two Perl scripts were shown side by side; both begin:]
#!/usr/bin/perl
my $a = shift;
my $b = shift;
my $input = "";
swap1 runtime analysis

size   swap1 (ms)
4      250
8      560
16     810
32     1750
64     6510
128    25680
256    102050

[Plot: swap1 runtime vs. problem size (1000s), log-log axes]
swap2 runtime analysis

size   swap2 (ms)
4      70
8      50
16     70
32     80
64     100
128    120

[Plot: swap2 runtimes, runtime (ms) vs. problem size]
[Plot: runtime (ms) vs. problem size for swap1 and swap2, linear axes]
Runtime vs. Size, lg-lg plot

[Plot: lg(runtime) vs. lg(problem size). Fitted lines: y = 1.9577x + 0.957 for swap1 (slope ≈ 2: quadratic growth) and y = 0.9175x − 0.2822 for swap2 (slope ≈ 1: linear growth)]
for i ← 0 to n − 1 do
  for j ← 0 to n − 1 do
    if A[i] > A[j] then
      A[j] ← A[i]
Theoretical Analysis
Uses a high-level description of the algorithm
instead of an implementation
Characterizes running time as a function of the input
size, n.
Takes into account all possible inputs, often analyzing
the worst case
Allows us to evaluate the speed of an algorithm
independent of the hardware/software environment
Pseudocode (§1.1)
High-level description of an algorithm
More structured than English prose
Less detailed than a program
Preferred notation for describing algorithms
Hides program design issues

Example: find the max element of an array

Algorithm arrayMax(A, n)
  Input: array A of n integers
  Output: maximum element of A
  currentMax ← A[0]
  for i ← 1 to n − 1 do
    if A[i] > currentMax then
      currentMax ← A[i]
  return currentMax
Pseudocode Details
Control flow
  if … then … [else …]
  while … do …
  repeat … until …
  for … do …
  Indentation replaces braces
Method declaration
  Algorithm method (arg [, arg…])
  Input …
  Output …
Method call
  var.method (arg [, arg…])
Return value
  return expression
Expressions
  ← assignment (like = in Java)
  = equality testing (like == in Java)
  n² superscripts and other mathematical formatting allowed
Questions…
Can a program be asymptotically faster on
one type of CPU vs another?
10²·n + 10⁵ is a linear function
10⁵·n² + 10⁸·n is a quadratic function

[Plot: both functions for n from 1 to 10¹⁰, log-log axes]
Asymptotic (big-O) Notation (§1.2)
Given functions f(n) and g(n), we say that f(n) is O(g(n)) if there are positive constants c and n0 such that f(n) ≤ c·g(n) for n ≥ n0

Example: 2n + 10 is O(n)
  2n + 10 ≤ cn
  (c − 2)n ≥ 10
  n ≥ 10/(c − 2)
  Pick c = 3 and n0 = 10

[Plot: 3n, 2n + 10, and n on log-log axes]
Example
The function n² is not O(n)
  n² ≤ cn
  n ≤ c
  The above inequality cannot be satisfied, since c must be a constant

[Plot: n², 100n, 10n, and n on log-log axes]
More Big-O Examples
7n − 2 is O(n)
  need c > 0 and n0 ≥ 1 such that 7n − 2 ≤ c·n for n ≥ n0
  this is true for c = 7 and n0 = 1
3n³ + 20n² + 5 is O(n³)
  need c > 0 and n0 ≥ 1 such that 3n³ + 20n² + 5 ≤ c·n³ for n ≥ n0
  this is true for c = 4 and n0 = 21
3 log n + log log n is O(log n)
  need c > 0 and n0 ≥ 1 such that 3 log n + log log n ≤ c·log n for n ≥ n0
  this is true for c = 4 and n0 = 2
Asymptotic analysis of functions
Asymptotic analysis is equivalent to
ignoring multiplicative constants
ignoring lower-order terms
“for large enough inputs”
Big-O and growth rate
Big-O gives an upper bound on the growth rate of
a function
Think of it as <= [asymptotically speaking]
Big-O Rules
If f(n) is a polynomial of degree d, then f(n) is O(n^d), i.e.,
  1. Drop lower-order terms
  2. Drop constant factors
Use the smallest possible class of functions, if possible
  Say “2n is O(n)” instead of “2n is O(n²)”
  (The former is a stronger statement)
Use the simplest expression of the class
  Say “3n + 5 is O(n)” instead of “3n + 5 is O(3n)”
Asymptotic Algorithm Analysis
Asymptotic analysis: determine the runtime in big-O
notation
To perform the asymptotic analysis
Find the worst-case number of primitive operations
executed as a function of the input size
Express this function with big-O notation
Example:
We determine that algorithm arrayMax executes at most
7n − 1 primitive operations
We say that algorithm arrayMax “runs in O(n) time”
Since constant factors and lower-order terms are
eventually dropped anyway, we can disregard them
when counting primitive operations
Computing Prefix Averages
Runtime analysis example: two algorithms for prefix averages
The i-th prefix average of an array X is the average of the
first (i + 1) elements of X:
  A[i] = (X[0] + X[1] + … + X[i])/(i + 1)
Computing the array A of prefix averages of another
array X has applications to financial analysis

[Chart: sample values of X and the prefix averages A for i = 1 to 7]
Prefix Averages (Quadratic)
The following algorithm computes prefix averages in quadratic
time by applying the definition
Algorithm prefixAverages1(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X    #operations
  A ← new array of n integers                n
  for i ← 0 to n − 1 do                      n
    s ← X[0]                                 n
    for j ← 1 to i do                        1 + 2 + … + (n − 1)
      s ← s + X[j]                           1 + 2 + … + (n − 1)
    A[i] ← s / (i + 1)                       n
  return A                                   1
Prefix Averages (Linear)
The following algorithm computes prefix averages in linear time
by keeping a running sum
Algorithm prefixAverages2(X, n)
  Input: array X of n integers
  Output: array A of prefix averages of X    #operations
  A ← new array of n integers                n
  s ← 0                                      1
  for i ← 0 to n − 1 do                      n
    s ← s + X[i]                             n
    A[i] ← s / (i + 1)                       n
  return A                                   1
Algorithm prefixAverages2 runs in O(n) time
Arithmetic Progression
The running time of prefixAverages1 is O(1 + 2 + … + n)
The sum of the first n integers is n(n + 1)/2
  There is a simple visual proof of this fact
Thus, algorithm prefixAverages1 runs in O(n²) time

[Chart: staircase diagram illustrating the visual proof]
What’s the runtime?
int n;
cin >> n;
for (int i = 0; i < n; i++)
  for (int j = 0; j < n; j++)
    for (int k = 0; k < n; k++)
      cout << "Hello world!\n";

2n³ + n² + n + 2? O(n³) runtime
1 + 2 + 3 + … + n = n(n + 1)/2 = Θ(n²)
What’s the runtime?
template <class Item>
void insert(Item a[], int l, int r)
{ int i;
for (i=r; i>l; i--) compexch(a[i-1],a[i]);
for (i=l+2; i<=r; i++)
{ int j=i; Item v=a[i];
while (v<a[j-1])
{ a[j] = a[j-1]; j--; }
a[j] = v;
}
}
Math you need to Review
Summations (Sec. 1.3.1)
Logarithms and Exponents (Sec. 1.3.2)
  properties of logarithms:
    log_b(xy) = log_b x + log_b y
    log_b(x/y) = log_b x − log_b y
    log_b x^a = a·log_b x
    log_b a = log_x a / log_x b
  properties of exponentials:
    a^(b+c) = a^b · a^c
    a^(bc) = (a^b)^c
    a^b / a^c = a^(b−c)
    b = a^(log_a b)
    b^c = a^(c·log_a b)
Proof techniques (Sec. 1.3.3)
Basic probability (Sec. 1.3.4)
Relatives of Big-Oh
big-Oh
  f(n) is O(g(n)) if f(n) is asymptotically less than or equal to g(n)
big-Omega
  f(n) is Ω(g(n)) if f(n) is asymptotically greater than or equal to g(n)
big-Theta
  f(n) is Θ(g(n)) if f(n) is asymptotically equal to g(n)
little-oh
  f(n) is o(g(n)) if f(n) is asymptotically strictly less than g(n)
little-omega
  f(n) is ω(g(n)) if f(n) is asymptotically strictly greater than g(n)
Example Uses of the
Relatives of Big-Oh
5n² is Ω(n²)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer
  constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  let c = 5 and n0 = 1
5n² is Ω(n)
  f(n) is Ω(g(n)) if there is a constant c > 0 and an integer
  constant n0 ≥ 1 such that f(n) ≥ c·g(n) for n ≥ n0
  let c = 1 and n0 = 1
5n² is ω(n)
  f(n) is ω(g(n)) if, for any constant c > 0, there is an integer
  constant n0 ≥ 0 such that f(n) ≥ c·g(n) for n ≥ n0
  need 5n0² ≥ c·n0: given c, any n0 ≥ c/5 satisfies this
Asymptotic Analysis: Review
What does it mean to say that an algorithm
has runtime O(n log n)?
n: Problem size
Big-O: upper bound over all inputs of size n
“Ignore constant factor” (why?)
“as n grows large”
1.0001^n = O(n^943)?
  No – every exponential eventually dominates every polynomial
lg n = Θ(ln n)?
  Yes – different bases are just a constant factor difference
Extensible Array:
Usually works like a normal array
When it fills, copy the entire contents into a new
array of twice the size.
Runtime?
Analysis of Extensible Arrays
Worst-case runtime for a push() operation?
Is that the most useful way of describing the result?