Algorithm Analysis I
Runtime analysis as a process: comprehending programs, modeling the number of steps, and formulating an answer.
1
Kevin Lin, with thanks to many others.
Feedback from the Reading Quiz
Counting operations. Let’s look at an example where N = 6 instead of N = 10000.
Confusing question. What part of dup1 and dup2 required us to specify a range for the number of times an operation was executed?
Parabolas vs. lines.
2
Demo
Runtime Analysis Process
Comprehending. Understanding the implementation details of a program.
Modeling. Counting the number of steps in terms of N, the size of the input.
Case Analysis. How certain conditions affect the program execution.
Asymptotic Analysis. Describing what happens for very large N, as N→∞.
Formalizing. Summarizing the final result in precise English or math notation.
3
boolean dup1(int[] A)
Consider every pair
Array contains a duplicate at front
Array contains no duplicate items
Constant time
Quadratic time
Θ(1)
Θ(N2)
Overall Asymptotic Runtime Bound
Worst case
Best case
Asymptotic Analysis
What happens for very large N, as N→∞.
Simulating billions of particles.
Social network with billions of users.
Logging billions of transactions.
Encoding billions of bytes of video data.
Linear-time algorithms scale better than quadratic-time algorithms (parabolas).
4
Orders of Growth
5
Algorithm Design (Jon Kleinberg, Éva Tardos/Pearson Education)
Asymptotic Analysis and Case Analysis
For a very large array with billions of elements (i.e. asymptotic analysis), is it possible for dup1 to execute only 2 less-than (<) operations?
6
Operation | dup1: Quadratic/Parabolic | dup2: Linear |
i = 0 | 1 | 1 |
less-than (<) | 2 to (N2 + 3N + 2) / 2 | 1 to N |
increment (+= 1) | 0 to (N2 + N) / 2 | 0 to N - 1 |
equality (==) | 1 to (N2 - N) / 2 | 1 to N - 1 |
array accesses | 2 to N2 - N | 2 to 2N - 2 |
Q
For a very large array, is it possible for dup1 to execute only 2 less-than (<) operations?
7
Duplicate Finding
Our goal is to somehow characterize the runtimes of the functions below.
8
Operation | dup1: Quadratic/Parabolic | dup2: Linear |
i = 0 | 1 | 1 |
less-than (<) | 2 to (N2 + 3N + 2) / 2 | 1 to N |
increment (+= 1) | 0 to (N2 + N) / 2 | 0 to N - 1 |
equality (==) | 1 to (N2 - N) / 2 | 1 to N - 1 |
array accesses | 2 to N2 - N | 2 to 2N - 2 |
Worst Case Order of Growth
9
Simplification 1: Consider Only the Worst Case
Provides a runtime guarantee for any input of size N.
We often only care about worst case, but there are many exceptions.
10
Operation | Worst Case: dup1 |
i = 0 | 1 |
j = i + 1 | 1 to 10000 |
less-than (<) | 2 to (N2 + 3N + 2) / 2 |
increment (+= 1) | 0 to (N2 + N) / 2 |
equality (==) | 1 to (N2 - N) / 2 |
array accesses | 2 to N2 - N |
public static boolean dup1(int[] A) {
for (int i = 0; i < A.length; i += 1) {
for (int j = i + 1; j < A.length; j += 1) {
if (A[i] == A[j]) {
return true;
}
}
}
return false;
}
Identifying Orders of Growth
Consider the algorithm step counts below.
What do you expect will be the order of growth of the runtime for the algorithm?
11
Q
Operation | Count |
less-than (<) | 100N2 + 3N |
greater-than (>) | 2N3 + 1 |
and (&&) | 5,000 |
What do you expect will be the order of growth?
12
Simplification 2: Restrict Attention to One Operation
Pick some representative operation to act as a proxy for the overall runtime.
Good choices:
Poor choices: assignment of i = 0 or j = i + 1.
We call our choice the cost model.
13
Operation | Worst Case: dup1 |
i = 0 | 1 |
j = i + 1 | 10000 |
less-than (<) | (N2 + 3N + 2) / 2 |
increment (+= 1) | (N2 + N) / 2 |
equality (==) | (N2 - N) / 2 |
array accesses | N2 - N |
Simplification 3: Eliminate Lower-Order Terms
Ignore lower-order terms.
14
Operation | Worst Case: dup1 |
increment (+= 1) | (N2 + N) / 2 |
public static boolean dup1(int[] A) {
for (int i = 0; i < A.length; i += 1) {
for (int j = i + 1; j < A.length; j += 1) {
if (A[i] == A[j]) {
return true;
}
}
}
return false;
}
Simplification 4: Eliminate Multiplicative Constants
Ignore multiplicative constants.
We already threw away the meaningful constant when we chose a single proxy operation.
15
Operation | Worst Case: dup1 |
increment (+= 1) | N2 / 2 |
public static boolean dup1(int[] A) {
for (int i = 0; i < A.length; i += 1) {
for (int j = i + 1; j < A.length; j += 1) {
if (A[i] == A[j]) {
return true;
}
}
}
return false;
}
Simplification Summary
“The worst case order of growth of the runtime for dup1 is N2.”
16
Operation | dup1 |
i = 0 | 1 |
j = i + 1 | 1 to 10000 |
less-than (<) | 2 to (N2 + 3N + 2) / 2 |
increment (+= 1) | 0 to (N2 + N) / 2 |
equality (==) | 1 to (N2 - N) / 2 |
array accesses | 2 to N2 - N |
Order of growth
Operation | Worst Case Growth |
increment (+= 1) | N2 |
Your Turn: Worst Case Order of Growth for dup2
“The worst case order of growth of the runtime for dup2 is …”
17
Operation | dup2 |
i = 0 | 1 |
less-than (<) | 1 to N |
increment (+= 1) | 0 to N - 1 |
equality (==) | 1 to N - 1 |
array accesses | 2 to 2N - 2 |
Order of growth
Operation | Worst Case Growth |
array accesses | N |
Operation | Worst Case Growth |
| |
Q
Simplified Modeling Process
Rather than building the entire table, we can instead:
Let’s redo our analysis of dup1 with this new process.
This time, we’ll show all our work.
18
Worst Case Order of Growth: Exact Count of == Operations
“The worst case order of growth of the runtime for dup1 is N2.”
19
int N = A.length; // N == 6
for (int i = 0; i < N; i += 1)
for (int j = i + 1; j < N; j += 1)
if (A[i] == A[j])
return true;
return false;
| == | == | == | == | == |
| | == | == | == | == |
| | | == | == | == |
| | | | == | == |
| | | | | == |
| | | | | |
0
1
2
3
4
5
0 1 2 3 4 5
i
j
Worst Case Order of Growth: Geometric Argument
“The worst case order of growth of the runtime for dup1 is N2.”
20
int N = A.length; // N == 6
for (int i = 0; i < N; i += 1)
for (int j = i + 1; j < N; j += 1)
if (A[i] == A[j])
return true;
return false;
| == | == | == | == | == |
| | == | == | == | == |
| | | == | == | == |
| | | | == | == |
| | | | | == |
| | | | | |
0
1
2
3
4
5
0 1 2 3 4 5
i
j
Area of right triangle of side length N - 1.
Order of growth of area is N2.
Runtime Analysis Process
Comprehending. Understanding the implementation details of a program.
Modeling. Counting the number of steps in terms of N, the size of the input.
Case Analysis. How certain conditions affect the program execution.
Asymptotic Analysis. Describing what happens for very large N, as N→∞.
Formalizing. Summarizing the final result in precise English or math notation.
“The worst case order of growth of the runtime for dup1 is N2.”
21
boolean dup1(int[] A)
Consider every pair
Array contains a duplicate at front
Array contains no duplicate items
Constant time
Quadratic time
Θ(1)
Θ(N2)
Overall Asymptotic Runtime Bound
Worst case
Best case
Runtime Analysis Process
Comprehending. Understanding the implementation details of a program.
Modeling. Counting the number of steps in terms of N, the size of the input.
Case Analysis. How certain conditions affect the program execution.
Asymptotic Analysis. Describing what happens for very large N, as N→∞.
Formalizing. Summarizing the final result in precise English or math notation.
“The worst case order of growth of the runtime for dup1 is N2.”
22
boolean dup1(int[] A)
Consider every pair
Array contains a duplicate at front
Array contains no duplicate items
Constant time
Quadratic time
Θ(1)
Θ(N2)
Overall Asymptotic Runtime Bound
Worst case
Best case
Order of Growth Exercise
Informally, what is the shape of each function for very large N?
In other words, what is the order of growth of each function?
23
Q
Function | Order of Growth |
N3 + 3N4 | N4 |
(1 / N) + N3 | N3 |
(1 / N) + 5 | 1 |
NeN + N | NeN |
40 sin(N) + 4N2 | N2 |
Function | Order of Growth |
N3 + 3N4 | |
(1 / N) + N3 | |
(1 / N) + 5 | |
NeN + N | |
40 sin(N) + 4N2 | |
Big-Theta Notation
Suppose we have a function R(N) with order of growth f(N). In Big-Theta, we write this as:
24
Function | Big-Theta |
N3 + 3N4 | Θ(N4) |
(1 / N) + N3 | Θ(N3) |
(1 / N) + 5 | Θ(1) |
NeN + N | Θ(NeN) |
40 sin(N) + 4N2 | Θ(N2) |
Big-Theta Definition
means there exist positive constants k1 and k2 such that
for all values of N greater than some N0.
25
“Very large N”
Big-Theta Challenge
Find a simple f(N) and corresponding k1 and k2.
means there exist positive constants k1 and k2 such that
for all values of N greater than some N0.
26
Q
Demo
Big-O Notation
Whereas Big-Theta can informally be thought of as something like “equals”, Big-O can be thought of as “less than or equal”.
All of the following are true.
27
Big-O Definition
means there exists a positive constant k2 such that
for all values of N greater than some N0.
28
“Very large N”
Big-Omega Definition
means there exists a positive constant k1 such that
for all values of N greater than some N0.
29
“Very large N”
Runtime Analysis Process
Comprehending. Understanding the implementation details of a program.
Modeling. Counting the number of steps in terms of N, the size of the input.
Case Analysis. How certain conditions affect the program execution.
Asymptotic Analysis. Describing what happens for very large N, as N→∞.
Formalizing. Summarizing the final result in precise English or math notation.
“The overall order of growth of the runtime for dup1 is …”
30
boolean dup1(int[] A)
Consider every pair
Array contains a duplicate at front
Array contains no duplicate items
Constant time
Quadratic time
Θ(1)
Θ(N2)
Overall Asymptotic Runtime Bound
Worst case
Best case
Asymptotic Analysis and Case Analysis
For a very large array with billions of elements (i.e. asymptotic analysis), is it possible for dup1 to execute only 2 less-than (<) operations?
31
Operation | dup1: Quadratic/Parabolic | dup2: Linear |
i = 0 | 1 | 1 |
less-than (<) | 2 to (N2 + 3N + 2) / 2 | 1 to N |
increment (+= 1) | 0 to (N2 + N) / 2 | 0 to N - 1 |
equality (==) | 1 to (N2 - N) / 2 | 1 to N - 1 |
array accesses | 2 to N2 - N | 2 to 2N - 2 |
Overall Asymptotic Runtime Bound for dup1
Give an overall asymptotic runtime bound for R as a combination of Θ, O, and/or Ω notation. Take into account both the best and the worst case runtimes (Rbest and Rworst).
32
Q
Demo