Announcements
Sorry about the lack of pre-recorded lectures.
It seems unreasonable for students to spend 20 hours and earn almost no credit, so Project 2A/2B will have a point recovery policy. Details TBA.
Announcements: yellkey.com/national
If you had two choices, which would you prefer? Assume your final score is the same either way:
Announcements
Phase 3 of the course starts today: Algorithms and Software Engineering.
Lectures in this phase:
Same optional textbook for M/W: “Algorithms” by Wayne/Sedgewick.
Optional textbook for Friday lectures: “A Philosophy of Software Design” by John Ousterhout.
Announcements
Assignments in this phase:
CS61B
Lecture 29: Sorting I
Our Major Focus for Several Lectures: Sorting
For many of our remaining lectures, we’ll discuss the sorting problem.
This is a useful task in its own right. Examples:
Sorting algorithms also provide interesting case studies for how to approach basic computational problems.
Sorting - Definitions (from Knuth’s TAOCP)
An ordering relation < for keys a, b, and c has the following properties:
- Law of Trichotomy: exactly one of a < b, a = b, b < a is true.
- Law of Transitivity: if a < b and b < c, then a < c.
An ordering relation with the properties above is also known as a “total order”.
A sort is a permutation (re-arrangement) of a sequence of elements that puts the keys into non-decreasing order relative to a given ordering relation.
Example: String Length
Example of an ordering relation: The length of strings.
Two valid sorts for [“cows”, “get”, “going”, “the”] for the ordering relation above:
- [“get”, “the”, “cows”, “going”]
- [“the”, “get”, “cows”, “going”]
Under this relation, “the” is considered = to “get”, since len(“the”) = len(“get”).
Here “=” means equal under the relation, not the Java idea of .equals.
Java Note
Ordering relations are typically given in the form of compareTo or compare methods.
Note that with respect to the order defined by the method above, “the” = “get”.
import java.util.Comparator;

public class LengthComparator implements Comparator<String> {
    /** Orders strings by length. Lengths are non-negative,
     *  so the subtraction cannot overflow. */
    @Override
    public int compare(String a, String b) {
        return a.length() - b.length();
    }
}
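A comparator like this can be handed to Arrays.sort. A small sketch (the array contents are just for illustration; Comparator.comparingInt builds an equivalent length comparator):

```java
import java.util.Arrays;
import java.util.Comparator;

public class LengthSortDemo {
    /** Returns a copy of a sorted by string length. Arrays.sort on
     *  objects is stable, so equal-length ties keep their input order. */
    public static String[] sortedByLength(String[] a) {
        String[] copy = a.clone();
        Arrays.sort(copy, Comparator.comparingInt(String::length));
        return copy;
    }

    public static void main(String[] args) {
        String[] sorted = sortedByLength(
                new String[]{"cows", "get", "going", "the"});
        System.out.println(Arrays.toString(sorted)); // [get, the, cows, going]
    }
}
```

Because the sort is stable, “get” stays before “the” (both length 3), so only one of the two valid sorts above is produced.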
Sorting: An Alternate Viewpoint
An inversion is a pair of elements that are out of order with respect to <.
Another way to state the goal of sorting: given a sequence with inversions, perform operations that reduce the number of inversions to 0.
Example: [0, 1, 1, 2, 3, 4, 8, 6, 9, 5, 7] has 6 inversions out of 55 max: 8-6, 8-5, 8-7, 6-5, 9-5, 9-7.
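The inversion count for the example sequence can be verified with a simple quadratic scan (a sketch; the class and method names are mine):

```java
public class Inversions {
    /** Counts pairs (i, j) with i < j but a[i] > a[j], i.e. pairs that
     *  are out of order with respect to <. O(N^2) scan. */
    public static int count(int[] a) {
        int inversions = 0;
        for (int i = 0; i < a.length; i += 1) {
            for (int j = i + 1; j < a.length; j += 1) {
                if (a[i] > a[j]) {
                    inversions += 1;
                }
            }
        }
        return inversions;
    }

    public static void main(String[] args) {
        int[] a = {0, 1, 1, 2, 3, 4, 8, 6, 9, 5, 7};
        System.out.println(count(a)); // 6, out of 11 * 10 / 2 = 55 possible
    }
}
```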
Performance Definitions
Characterizations of the runtime efficiency of an algorithm are sometimes called its time complexity. Example: Dijkstra’s algorithm has time complexity O(E log V).
Characterizations of the “extra” memory usage of an algorithm are sometimes called its space complexity.
Selection Sort and Heapsort
Selection Sort
We’ve seen this already.
Sort properties: Θ(N²) time, Θ(1) extra space (in place).
Seems inefficient: We look through entire remaining array every time to find the minimum.
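The version we’ve seen can be sketched as follows (a sketch for int arrays; the helper names are mine):

```java
public class SelectionSort {
    /** Repeatedly selects the minimum of the unsorted suffix and swaps
     *  it into position i. Theta(N^2) comparisons no matter the input,
     *  since the entire remaining array is scanned every time. */
    public static void sort(int[] a) {
        for (int i = 0; i < a.length; i += 1) {
            int min = i;
            for (int j = i + 1; j < a.length; j += 1) {
                if (a[j] < a[min]) {
                    min = j;
                }
            }
            int tmp = a[i]; a[i] = a[min]; a[min] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = {32, 15, 2, 17, 19, 26, 41, 17, 17};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```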
Naive Heapsort: Leveraging a Max-Oriented Heap
Idea: Instead of rescanning entire array looking for minimum, maintain a heap so that getting the minimum is fast!
For reasons that will become clear soon, we’ll use a max-oriented heap.
Naive heapsorting N items:
1. Insert all items into a max heap, and discard the input array.
2. Create a new output array.
3. Repeat N times: delete the largest item, and put it at the end of the unused part of the output array.
Naive Heapsort Demo: https://goo.gl/EZWwSJ
A min heap would work as well, but wouldn’t be able to take advantage of the fancy trick in a few slides.
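A sketch of the naive approach using java.util.PriorityQueue (min-oriented by default, so we pass Collections.reverseOrder() to make it max-oriented); the Θ(N) extra memory lives in the queue’s backing store, and for brevity this sketch reuses the input array as the output array:

```java
import java.util.Collections;
import java.util.PriorityQueue;

public class NaiveHeapsort {
    /** Inserts everything into a max heap, then repeatedly deletes the
     *  max into the back of the array: O(N log N) + O(N log N). */
    public static void sort(int[] a) {
        PriorityQueue<Integer> pq =
                new PriorityQueue<>(Collections.reverseOrder());
        for (int x : a) {
            pq.add(x);          // N insertions, each O(log N)
        }
        for (int i = a.length - 1; i >= 0; i -= 1) {
            a[i] = pq.poll();   // N deletions of the max, each O(log N)
        }
    }

    public static void main(String[] args) {
        int[] a = {32, 15, 2, 17, 19, 26, 41, 17, 17};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```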
Naive Heapsort Runtime: http://yellkey.com/couple
What is the TOTAL runtime of naive heapsort?
Heapsort Runtime Analysis
Use the magic of the heap to sort our data.
Overall runtime is O(N log N) + Θ(N) + O(N log N) = O(N log N)
Memory usage is Θ(N) to build the additional copy of all of our data.
In-place Heapsort
Alternate approach, treat input array as a heap!
In-place heap sort: Demo
Before heapification: [32, 15, 2, 17, 19, 26, 41, 17, 17]
After heapification:  [41, 19, 32, 17, 15, 26, 2, 17, 17]
For this algorithm we don’t leave spot 0 blank.
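The demo can be sketched as code. Since spot 0 is not left blank, the sketch below uses zero-indexed child arithmetic (left child of i is 2i + 1):

```java
public class InPlaceHeapsort {
    /** Bottom-up heapification, then repeatedly swap the max (root) to
     *  the end and sink the new root over the shrinking heap.
     *  Theta(1) extra memory. */
    public static void sort(int[] a) {
        int n = a.length;
        for (int i = n / 2 - 1; i >= 0; i -= 1) {
            sink(a, i, n);       // heapify: N sinks, each O(log N)
        }
        for (int end = n - 1; end > 0; end -= 1) {
            swap(a, 0, end);     // current max goes to its final spot
            sink(a, 0, end);     // restore the heap over a[0..end-1]
        }
    }

    /** Sinks a[i] within the heap a[0..n-1]. */
    private static void sink(int[] a, int i, int n) {
        while (2 * i + 1 < n) {
            int child = 2 * i + 1;                    // left child
            if (child + 1 < n && a[child + 1] > a[child]) {
                child += 1;                           // right child is bigger
            }
            if (a[i] >= a[child]) {
                return;                               // heap property holds
            }
            swap(a, i, child);
            i = child;
        }
    }

    private static void swap(int[] a, int i, int j) {
        int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
    }

    public static void main(String[] args) {
        int[] a = {32, 15, 2, 17, 19, 26, 41, 17, 17};
        sort(a);
        System.out.println(java.util.Arrays.toString(a));
    }
}
```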
In-place Heapsort Runtime: http://yellkey.com/response
Use the magic of the heap to sort our data.
Give the time complexity of in-place heapsort in big O notation.
Bottom-up heapification is N sink operations, each taking no more than O(log N) time, so overall runtime for heapification is O(N log N).
In-place Heapsort: http://yellkey.com/tell
What is the memory complexity of Heapsort?
Incorrect answer given by student during lecture: Θ(N): Creating N spots for a min heap.
The only extra memory we need is a constant number of instance variables, e.g. size.
Sorts So Far
| Sort | Best Case Runtime | Worst Case Runtime | Space | Notes |
| Selection Sort (in place) | Θ(N²) | Θ(N²) | Θ(1) | |
| Heapsort (in place) | Θ(N)* | Θ(N log N) | Θ(1) | Bad cache (61C) performance. |
*: An array of all duplicates yields linear runtime for heapsort.
Mergesort
Mergesort
We’ve seen this one before as well.
Mergesort: Demo
Time complexity, analysis from previous lecture: Θ(N log N) runtime.
Also possible to do in-place merge sort, but the algorithm is very complicated, and runtime performance suffers by a significant constant factor.
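The non-in-place version, matching the Θ(N log N) time / Θ(N) space analysis, can be sketched as (method names are mine):

```java
import java.util.Arrays;

public class MergeSort {
    /** Recursively sorts each half, then merges. Theta(N log N) time,
     *  Theta(N) extra space for the merged copies. */
    public static int[] sort(int[] a) {
        if (a.length <= 1) {
            return a;
        }
        int[] left = sort(Arrays.copyOfRange(a, 0, a.length / 2));
        int[] right = sort(Arrays.copyOfRange(a, a.length / 2, a.length));
        return merge(left, right);
    }

    /** Merges two sorted arrays into one sorted array. */
    private static int[] merge(int[] left, int[] right) {
        int[] out = new int[left.length + right.length];
        int i = 0;
        int j = 0;
        for (int k = 0; k < out.length; k += 1) {
            // Take from left when right is exhausted, or when left's
            // front item is no larger (<= keeps the sort stable).
            if (j >= right.length
                    || (i < left.length && left[i] <= right[j])) {
                out[k] = left[i]; i += 1;
            } else {
                out[k] = right[j]; j += 1;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] a = {32, 15, 2, 17, 19, 26, 41, 17, 17};
        System.out.println(Arrays.toString(sort(a)));
    }
}
```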
Sorts So Far
| Sort | Best Case Runtime | Worst Case Runtime | Space | Notes |
| Selection Sort (in place) | Θ(N²) | Θ(N²) | Θ(1) | |
| Heapsort (in place) | Θ(N)* | Θ(N log N) | Θ(1) | Bad cache (61C) performance. |
| Mergesort | Θ(N log N) | Θ(N log N) | Θ(N) | Faster than heap sort. |
*: An array of all duplicates yields linear runtime for heapsort.
Insertion Sort
Insertion Sort
General strategy: starting with an empty output sequence, add each item from the input, inserting it into the output at the right point.
Naive approach, build entirely new output: Demo (Link)
Input:  [32, 15, 2, 17, 19, 26, 41, 17, 17]
Output: [] (items are inserted one at a time into their correct position)
Insertion Sort
General strategy:
For naive approach, if output sequence contains k items, worst cost to insert a single item is k.
More efficient method: do everything in place, using swapping.
In-place Insertion Sort
Two more examples.
7 swaps:
36 swaps:
P O T A T O
P O T A T O (0 swaps)
O P T A T O (1 swap )
O P T A T O (0 swaps)
A O P T T O (3 swaps)
A O P T T O (0 swaps)
A O O P T T (3 swaps)
S O R T E X A M P L E
S O R T E X A M P L E (0 swaps)
O S R T E X A M P L E (1 swap )
O R S T E X A M P L E (1 swap )
O R S T E X A M P L E (0 swaps)
E O R S T X A M P L E (4 swaps)
E O R S T X A M P L E (0 swaps)
A E O R S T X M P L E (6 swaps)
A E M O R S T X P L E (5 swaps)
A E M O P R S T X L E (4 swaps)
A E L M O P R S T X E (7 swaps)
A E E L M O P R S T X (8 swaps)
Purple: Element that we’re moving left (with swaps).
Black: Elements that got swapped with purple.
Grey: Not considered this iteration.
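The swap traces above correspond to this sketch (char arrays match the examples; the total number of swaps performed equals the number of inversions in the input):

```java
public class InsertionSort {
    /** Swaps each item leftward until it is no smaller than its left
     *  neighbor. Each swap fixes exactly one inversion. */
    public static void sort(char[] a) {
        for (int i = 1; i < a.length; i += 1) {
            for (int j = i; j > 0 && a[j] < a[j - 1]; j -= 1) {
                char tmp = a[j]; a[j] = a[j - 1]; a[j - 1] = tmp;
            }
        }
    }

    public static void main(String[] args) {
        char[] a = "POTATO".toCharArray();
        sort(a);                          // performs the 7 swaps traced above
        System.out.println(new String(a)); // AOOPTT
    }
}
```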
Insertion Sort Runtime: http://yellkey.com/arrive
What is the runtime of insertion sort?
You may recall Ω is not “best case”.
So technically, you could also say the runtime is Ω(1).
Picking the Best Sort: http://yellkey.com/machine
Suppose we do the following:
Which sorting algorithm would be the fastest choice for X?
Observation: Insertion Sort on Almost Sorted Arrays
For arrays that are almost sorted, insertion sort does very little work.
Picking the Best Sort (Poll Everywhere)
Suppose we do the following:
Which sorting algorithm would be the fastest choice for X? Worst case run-times:
Insertion Sort Sweet Spots
On arrays with a small number of inversions, insertion sort is extremely fast.
Less obvious: For small arrays (N < 15 or so), insertion sort is fastest.
Sorts So Far
| Sort | Best Case Runtime | Worst Case Runtime | Space | Notes |
| Selection Sort (in place) | Θ(N²) | Θ(N²) | Θ(1) | |
| Heapsort (in place) | Θ(N)* | Θ(N log N) | Θ(1) | Bad cache (61C) performance. |
| Mergesort | Θ(N log N) | Θ(N log N) | Θ(N) | Fastest of these. |
| Insertion Sort (in place) | Θ(N) | Θ(N²) | Θ(1) | Best for small N or almost sorted. |
Shell’s Sort (Extra)
(Not on Exam)
Optimizing Insertion Sort: Shell’s Sort
Big idea: Fix multiple inversions at once.
Optimizing Insertion Sort: Shell’s Sort
7-sorting: 3 swaps
3-sorting: 5 swaps
1-sorting: 6 swaps
Total swaps: 14 (vs. 31)
S O R T E X A M P L E
M O R T E X A S P L E (1 swap, > 1 inversion)
M O R T E X A S P L E (0 swaps)
M O L T E X A S P R E (1 swap, > 1 inversion)
M O L E E X A S P R T (1 swap, > 1 inversion)
M O L E E X A S P R T
E O L M E X A S P R T (1 swap, >1 inversions)
E E L M O X A S P R T (1 swap, >1 inversions)
E E L M O X A S P R T (0 swaps, 0 inversions)
A E L E O X M S P R T (2 swaps, >1 inversions)
A E L E O X M S P R T (0 swaps, 0 inversions)
A E L E O P M S X R T (1 swaps, >1 inversions)
A E L E O P M S X R T (0 swaps, 0 inversions)
A E L E O P M S X R T (0 swaps, 0 inversions)
Example from Algorithms 4th Edition
A E L E O P M S X R T
A E L E O P M S X R T
A E L E O P M S X R T
A E L E O P M S X R T
A E E L O P M S X R T
A E E L O P M S X R T
A E E L O P M S X R T
A E E L M O P S X R T
A E E L M O P S X R T
A E E L M O P S X R T
A E E L M O P R S X T
A E E L M O P R S T X
h=1 is just insertion sort.
Shell’s Sort: Generalization and Performance
h=1 is just normal insertion sort.
We used 7, 3, 1. Can generalize to gaps of the form 2^k − 1, from some k down to 1.
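A sketch with 2^k − 1 gaps; each pass is insertion sort on the subsequences h apart, and the h = 1 pass guarantees a sorted result. (Integer division conveniently steps the gap down: (2^k − 1) / 2 = 2^(k−1) − 1.)

```java
public class ShellSort {
    /** Insertion sort with shrinking gaps ..., 7, 3, 1. Each h-pass
     *  can fix multiple inversions per swap; h = 1 is plain
     *  insertion sort, so the result is always sorted. */
    public static void sort(char[] a) {
        int h = 1;
        while (2 * h + 1 <= a.length) {
            h = 2 * h + 1;                // largest 2^k - 1 that fits
        }
        for (; h >= 1; h /= 2) {          // 2^k - 1  ->  2^(k-1) - 1
            for (int i = h; i < a.length; i += 1) {
                // Insertion sort among items h apart.
                for (int j = i; j >= h && a[j] < a[j - h]; j -= h) {
                    char tmp = a[j]; a[j] = a[j - h]; a[j - h] = tmp;
                }
            }
        }
    }

    public static void main(String[] args) {
        char[] a = "SORTEXAMPLE".toCharArray();  // length 11, so h starts at 7
        sort(a);
        System.out.println(new String(a)); // AEELMOPRSTX
    }
}
```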
Sorts So Far
| Sort | Best Case Runtime | Worst Case Runtime | Space | Notes |
| Selection Sort (in place) | Θ(N²) | Θ(N²) | Θ(1) | |
| Heapsort (in place) | Θ(N)* | Θ(N log N) | Θ(1) | Bad cache (61C) performance. |
| Mergesort | Θ(N log N) | Θ(N log N) | Θ(N) | Fastest of these. |
| Insertion Sort (in place) | Θ(N) | Θ(N²) | Θ(1) | Best for small N or almost sorted. |
| Shell’s Sort | Θ(N) | Ω(N log N), O(?) | Θ(1) | Rich theory! |
An earlier version of this slide assumed that Heapsort was always given an array with no duplicate elements. If we omit this assumption, heapsort’s best case is Θ(N).
Citations
Lego man: http://thetechnicgear.com/wp-content/uploads/2014/02/sorting-lego.jpg
Keynote Animations of Algorithms courtesy of Kevin Wayne