Algorithm Analysis
Runtime analysis as a process: comprehending programs, modeling the number of steps, and formulating an answer.
1
Kevin Lin, with thanks to many others.
Runtime Analysis Process
Comprehending. Understanding the implementation details of a program.
Modeling. Counting the number of steps in terms of N, the size of the input.
Case Analysis. How certain conditions affect the program execution.
Asymptotic Analysis. Describing what happens for very large N, as N→∞.
Formalizing. Summarizing the final result in precise English or math notation.
2
boolean dup1(int[] A)
Consider every pair
Array contains a duplicate at front
Array contains no duplicate items
Constant time
Quadratic time
Θ(1)
Θ(N2)
Overall Asymptotic Runtime Bound
Worst case
Best case
Asymptotic Analysis
What happens for very large N, as N→∞.
Simulating billions of particles.
Social network with billions of users.
Logging billions of transactions.
Encoding billions of bytes of video data.
Linear-time algorithms scale better than quadratic-time algorithms (parabolas).
3
Algorithms (Robert Sedgewick, Kevin Wayne/Princeton)
Orders of Growth
4
Algorithm Design (Jon Kleinberg, Éva Tardos/Pearson Education)
Asymptotic Analysis vs. Case Analysis
For a very large array with billions of elements (asymptotic analysis), how is it possible for dup1 to execute only 2 less-than (<) operations?
5
Operation | dup1: Quadratic/Parabolic | dup2: Linear |
i = 0 | 1 | 1 |
less-than (<) | 2 to (N2 + 3N + 2) / 2 | 1 to N |
increment (+= 1) | 0 to (N2 + N) / 2 | 0 to N - 1 |
equality (==) | 1 to (N2 - N) / 2 | 1 to N - 1 |
array accesses | 2 to N2 - N | 2 to 2N - 2 |
Q
Asymptotic Analysis vs. Case Analysis
First two items are the duplicates. That makes sense for large inputs.
Case analysis coexists with asymptotic analysis.
In the case of 0 items in the array, there’s only one less than. That seems fast.
6
A
Runtime Analysis Process
Comprehending. Understanding the implementation details of a program.
Modeling. Counting the number of steps in terms of N, the size of the input.
Case Analysis. How certain conditions affect the program execution.
Asymptotic Analysis. Describing what happens for very large N, as N→∞.
Formalizing. Summarizing the final result in precise English or math notation.
7
boolean dup1(int[] A)
Consider every pair
Array contains a duplicate at front
Array contains no duplicate items
Constant time
Quadratic time
Θ(1)
Θ(N2)
Overall Asymptotic Runtime Bound
Worst case
Best case
Simplified Modeling Process
Rather than building the entire table of operation counts, we can instead:
Let’s redo our analysis of dup1 with this Simplified Modeling Process.
This time, we’ll show all our work.
8
Worst Case Order of Growth: Exact Count of == Operations
“The worst case order of growth of the runtime for dup1 is N2.”
9
int N = A.length; // N == 6
for (int i = 0; i < N; i += 1)
for (int j = i + 1; j < N; j += 1)
if (A[i] == A[j])
return true;
return false;
| == | == | == | == | == |
| | == | == | == | == |
| | | == | == | == |
| | | | == | == |
| | | | | == |
| | | | | |
0
1
2
3
4
5
0 1 2 3 4 5
i
j
Worst Case Order of Growth: Geometric Argument
“The worst case order of growth of the runtime for dup1 is N2.”
10
int N = A.length; // N == 6
for (int i = 0; i < N; i += 1)
for (int j = i + 1; j < N; j += 1)
if (A[i] == A[j])
return true;
return false;
| == | == | == | == | == |
| | == | == | == | == |
| | | == | == | == |
| | | | == | == |
| | | | | == |
| | | | | |
0
1
2
3
4
5
0 1 2 3 4 5
i
j
Area of right triangle of side length N - 1.
Order of growth of area is N2.
Runtime Analysis Process
Comprehending. Understanding the implementation details of a program.
Modeling. Counting the number of steps in terms of N, the size of the input.
Case Analysis. How certain conditions affect the program execution.
Asymptotic Analysis. Describing what happens for very large N, as N→∞.
Formalizing. Summarizing the final result in precise English or math notation.
11
boolean dup1(int[] A)
Consider every pair
Array contains a duplicate at front
Array contains no duplicate items
Constant time
Quadratic time
Θ(1)
Θ(N2)
Overall Asymptotic Runtime Bound
Worst case
Best case
“The worst case order of growth of the runtime for dup1 is N2.”
Order of Growth Exercise
What is the order of growth of each function?
(Informally, what is the shape of each function for very large N?)
12
Q
Function | Order of Growth |
N3 + 3N4 | N4 |
(1 / N) + N3 | N3 |
(1 / N) + 5 | 1 |
NeN + N | NeN |
40 sin(N) + 4N2 | N2 |
Function | Order of Growth |
N3 + 3N4 | |
(1 / N) + N3 | |
(1 / N) + 5 | |
NeN + N | |
40 sin(N) + 4N2 | |
Big-Theta Notation
Suppose we have a function R(N) with order of growth f(N). In Big-Theta, we write this as:
13
Function | Big-Theta |
N3 + 3N4 | Θ(N4) |
(1 / N) + N3 | Θ(N3) |
(1 / N) + 5 | Θ(1) |
NeN + N | Θ(NeN) |
40 sin(N) + 4N2 | Θ(N2) |
Big-Theta Definition
means there exist positive constants k1 and k2 such that
for all values of N greater than some N0.
14
“Very large N”
Big-Theta Challenge
Find a simple f(N) and corresponding k1 and k2.
means there exist positive constants k1 and k2 such that
for all values of N greater than some N0.
15
Q
Big-O Notation
Whereas Big-Theta can informally be thought of as something like “equals”, Big-O can be thought of as “less than or equal”.
All of the following are true.
16
Big-O Definition
means there exists a positive constant k2 such that
for all values of N greater than some N0.
17
“Very large N”
Big-Omega Definition
means there exists a positive constant k1 such that
for all values of N greater than some N0.
18
“Very large N”
Overall Asymptotic Runtime Bound for dup1
Give an overall asymptotic runtime bound for R as a combination of Θ, O, and/or Ω notation. Take into account both the best and the worst case runtimes (Rbest and Rworst). N is the length of array.
Then, give a few other valid runtime bounds for Rbest , Rworst , and R using asymptotic notation.
19
Q
Tight Asymptotic Runtime Bounds for dup1
Best case: Ω(1) and O(1), therefore Θ(1).
Worst case: Ω(N2) and O(N2), therefore Θ(N2).
Overall: Ω(1) and O(N2).
Because the Ω and O bounds do not agree, Θ bound does not exist.
20
A
Valid but Less-Descriptive Runtime Bounds for dup1
Best case: Ω(1) and O(1), therefore Θ(1).
Worst case: Ω(N2) and O(N2), therefore Θ(N2).
Overall: Ω(1) and O(N2).
Because the Ω and O bounds do not agree, Θ bound does not exist.
21
A
Mystery
Give a tight asymptotic runtime bound for mystery as a function of N, the length of the array, in the best case, worst case, and overall.
22
Q
boolean mystery(int[] a, int target) {
int N = a.length;
for (int i = 0; i < N; i += 1)
if (a[i] == target)
return true;
return false;
}
Mystery: Big-Ω, Big-O, and Big-Θ
Best case: First element in array is the target.
Worst case: Target is the last item in array.
Overall:
23
A