Announcements
Project 3 autograder has been updated to do randomized testing.
More people are failing the 3x test than we expected; we will investigate.
Extra deadline for part 1 is today.
CS61B
Lecture 34: Sorting III
Sorting Summary (so far)
Listed by mechanism:
Listed by memory and runtime:
| Sort | Memory | Time | Notes |
|---|---|---|---|
| Heapsort | Θ(1) | Θ(N log N) | Bad caching (61C) |
| Insertion | Θ(1) | Θ(N²) | Θ(N) if almost sorted |
| Mergesort | Θ(N) | Θ(N log N) | |
| Random Quicksort | Θ(log N) expected | Θ(N log N) expected | Fastest sort |
Quicksort Flavors
We said Quicksort is the fastest, but this is only true if we make the right decisions about:
Let’s speed test Mergesort vs. Quicksort from last time, which had:
We’ll call this Quicksort L3S (Leftmost pivot, 3-scan partition, Shuffle).
Quicksort vs. Mergesort
Quicksort didn’t do so well!
| Sort | Pivot Selection Strategy | Partition Algorithm | Worst Case Avoidance Strategy | Time to sort 1000 arrays of 10000 ints |
|---|---|---|---|---|
| Mergesort | N/A | N/A | N/A | 2.1 seconds |
| Quicksort L3S | Leftmost | 3-scan | Shuffle | 4.4 seconds |
Note: These are unoptimized versions of mergesort and quicksort, i.e. no switching to insertion sort for small arrays.
Tony Hoare’s In-place Partitioning Scheme
Tony originally proposed a scheme where two pointers walk towards each other.
Full details here: Demo
Using this partitioning scheme yields a very fast Quicksort.
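The two-pointer walk can be sketched as follows. This is a minimal illustration in the style of the lecture's int-array demos, not the demo's own code; the class and variable names are our own. The pivot is the leftmost item, the two pointers walk toward each other swapping out-of-place items, and the pivot is finally swapped into its sorted position.

```java
public class HoarePartition {
    // Partition a[lo..hi] around the leftmost item; return the pivot's
    // final index. Items left of it are <= pivot, items right are >= pivot.
    public static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int left = lo + 1;   // walks right, stopping at items >= pivot
        int right = hi;      // walks left, stopping at items <= pivot
        while (true) {
            while (left <= right && a[left] < pivot) { left++; }
            while (right >= left && a[right] > pivot) { right--; }
            if (left >= right) { break; }
            int tmp = a[left]; a[left] = a[right]; a[right] = tmp;
            left++;
            right--;
        }
        // right now indexes the last item <= pivot: swap the pivot into place.
        int tmp = a[lo]; a[lo] = a[right]; a[right] = tmp;
        return right;
    }

    public static void main(String[] args) {
        int[] a = {6, 8, 3, 1, 2, 7, 4};
        int p = partition(a, 0, a.length - 1);
        System.out.println("pivot 6 landed at index " + p);
    }
}
```

Note that the pointers stop on items equal to the pivot; this keeps the two subproblems balanced when there are many duplicate keys.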
Quicksort vs. Mergesort
Using Tony Hoare’s two pointer scheme, Quicksort is better than mergesort!
| Sort | Pivot Selection Strategy | Partition Algorithm | Worst Case Avoidance Strategy | Time to sort 1000 arrays of 10000 ints |
|---|---|---|---|---|
| Mergesort | N/A | N/A | N/A | 2.1 seconds |
| Quicksort L3S | Leftmost | 3-scan | Shuffle | 4.4 seconds |
| Quicksort LTHS | Leftmost | Tony Hoare | Shuffle | 1.7 seconds |
Note: These are unoptimized versions of mergesort and quicksort, i.e. no switching to insertion sort for small arrays.
What If We Don’t Want Randomness?
Our approach so far: Use randomness to avoid worst case behavior, but some people don’t like having randomness in their sorting routine.
Another approach: Use the median (or an approximation) as our pivot.
Four philosophies:
1. Randomness: Pick a random pivot or shuffle before sorting.
2. Smarter pivot selection: Calculate or approximate the median.
3. Introspection: Switch to a safer sort if recursion goes too deep.
4. Try to cheat: If the array is already sorted, don’t sort (this doesn’t work).
Philosophy 1 (randomness) is what we’ve been using so far.
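Philosophy 3 (introspection) can be sketched concretely. This is our own illustration, not lecture code: run ordinary leftmost-pivot Quicksort, but count recursion depth, and past roughly 2 lg N levels give up and finish the range with a guaranteed N log N library sort (standing in here for heapsort).

```java
import java.util.Arrays;

public class IntroSketch {
    public static void sort(int[] a) {
        int lgN = 32 - Integer.numberOfLeadingZeros(Math.max(1, a.length));
        sort(a, 0, a.length - 1, 2 * lgN);
    }

    private static void sort(int[] a, int lo, int hi, int depthLeft) {
        if (lo >= hi) { return; }
        if (depthLeft == 0) {          // recursion has gone too deep
            Arrays.sort(a, lo, hi + 1);
            return;
        }
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1, depthLeft - 1);
        sort(a, p + 1, hi, depthLeft - 1);
    }

    // Hoare-style two-pointer partition around the leftmost item.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int left = lo + 1;
        int right = hi;
        while (true) {
            while (left <= right && a[left] < pivot) { left++; }
            while (right >= left && a[right] > pivot) { right--; }
            if (left >= right) { break; }
            int tmp = a[left]; a[left] = a[right]; a[right] = tmp;
            left++;
            right--;
        }
        int tmp = a[lo]; a[lo] = a[right]; a[right] = tmp;
        return right;
    }
}
```

Even on a fully sorted input (the leftmost-pivot worst case), the depth cap keeps this from degrading to Θ(N²).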
Philosophy 2a: Smarter Pivot Selection (linear time pivot pick)
The best possible pivot is the median.
Obvious approach: Just calculate the actual median and use that as pivot.
Goal: Come up with an algorithm for finding the median of an array. Bonus points if your algorithm takes linear time.
Your answer:
Median Identification
Is it possible to find the median in Θ(N) time?
Historical note: The authors of the paper that introduced this algorithm (PICK, by Blum, Floyd, Pratt, Rivest, and Tarjan) include FOUR Turing Award winners (and Pratt is no slouch!)
Let’s see how Exact Median Quicksort performs.
Quicksort vs. Mergesort
Quicksort using PICK to find the exact median (Quicksort PickTH) is terrible!
| Sort | Pivot Selection Strategy | Partition Algorithm | Worst Case Avoidance Strategy | Time to sort 1000 arrays of 10000 ints | Worst Case |
|---|---|---|---|---|---|
| Mergesort | N/A | N/A | N/A | 2.1 seconds | Θ(N log N) |
| Quicksort L3S | Leftmost | 3-scan | Shuffle | 4.4 seconds | Θ(N²) |
| Quicksort LTHS | Leftmost | Tony Hoare | Shuffle | 1.7 seconds | Θ(N²) |
| Quicksort PickTH | Exact Median | Tony Hoare | Exact Median | 10.0 seconds | Θ(N log N) |
Note: These are unoptimized versions of mergesort and quicksort, i.e. no switching to insertion sort for small arrays.
Quick Select
The Selection Problem
Computing the exact median would be great for picking an item to partition around. Gives us a “safe quick sort”.
However, it turns out that partitioning can be used to find the exact median.
Quick Select
Goal, find the median:
Partition, pivot lands at 2.
Now pivot lands at 6.
Partition right subproblem, median can’t be to the left!
Pivot lands at 4. Are we done?
[Demo residue: successive snapshots of the 10-item demo array (values 550, 14, 10, 330, 5, 4, 817, 913, 6, 9) as each partition narrows the search for the median.]
Worst case performance?
What is the worst case performance for Quick Select? Give an array that causes this worst case (assuming we always pick leftmost item as pivot).
Worst asymptotic performance Θ(N²) occurs if the array is in sorted order, since each leftmost pivot eliminates only one item:
[1 2 3 4 5 6 7 8 9 10 … N]
[1 2 3 4 5 6 7 8 9 10 … N]   (recurse on the remaining N-1 items)
[1 2 3 4 5 6 7 8 9 10 … N]   (recurse on the remaining N-2 items)
…
[1 2 3 4 5 … N/2 … N]   (≈N/2 partitions to reach the median: N + (N-1) + … + N/2 = Θ(N²))
Expected Performance
On average, Quick Select will take Θ(N) time.
~N compares for the first partition
~N/2 compares for the next
~N/4 compares after that
(on average, the pivot lands about halfway, so each subproblem is about half the size)
N + N/2 + N/4 + … + 1 = Θ(N)
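The partition-and-recurse-on-one-side idea above can be sketched as follows (k = N/2 gives a median). This uses a leftmost pivot with Hoare's two-pointer partition; in practice you would shuffle first to dodge the sorted-array worst case. The class and method names are our own, not lecture code.

```java
public class QuickSelect {
    // Returns the k-th smallest item (k is 0-indexed). Mutates a.
    public static int select(int[] a, int k) {
        int lo = 0;
        int hi = a.length - 1;
        while (lo < hi) {
            int p = partition(a, lo, hi);
            if (p == k) {
                return a[p];
            } else if (p < k) {
                lo = p + 1;   // k-th smallest must be right of the pivot
            } else {
                hi = p - 1;   // k-th smallest must be left of the pivot
            }
        }
        return a[lo];
    }

    // Hoare-style two-pointer partition around the leftmost item;
    // returns the pivot's final index.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int left = lo + 1;
        int right = hi;
        while (true) {
            while (left <= right && a[left] < pivot) { left++; }
            while (right >= left && a[right] > pivot) { right--; }
            if (left >= right) { break; }
            int tmp = a[left]; a[left] = a[right]; a[right] = tmp;
            left++;
            right--;
        }
        int tmp = a[lo]; a[lo] = a[right]; a[right] = tmp;
        return right;
    }

    public static void main(String[] args) {
        int[] a = {550, 14, 10, 330, 5, 4, 817, 913};
        System.out.println(select(a, a.length / 2));
    }
}
```

Unlike Quicksort, only one side of each partition is explored, which is why the work sums to N + N/2 + N/4 + … = Θ(N) on average.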
Quicksort With Quickselect?
Quicksort with PICK to find exact median was terrible.
What if we used Quickselect to find the exact median?
| Sort | Pivot Selection Strategy | Partition Algorithm | Worst Case Avoidance Strategy | Time to sort 1000 arrays of 10000 ints | Worst Case |
|---|---|---|---|---|---|
| Mergesort | N/A | N/A | N/A | 2.1 seconds | Θ(N log N) |
| Quicksort L3S | Leftmost | 3-scan | Shuffle | 4.4 seconds | Θ(N²) |
| Quicksort LTHS | Leftmost | Tony Hoare | Shuffle | 1.7 seconds | Θ(N²) |
| Quicksort PickTH | Exact Median | Tony Hoare | Exact Median | 10.0 seconds | Θ(N log N) |
Stability, Adaptiveness, Optimization
Sorting Summary (so far)
Listed by mechanism:
Listed by memory and runtime:
| Sort | Memory | # Compares | Notes |
|---|---|---|---|
| Heapsort | Θ(1) | Θ(N log N) worst | Bad caching (61C) |
| Insertion | Θ(1) | Θ(N²) worst | Θ(N) if almost sorted |
| Mergesort | Θ(N) | Θ(N log N) worst | |
| Random Quicksort | Θ(log N) (call stack) | Θ(N log N) expected | Fastest sort |
Other Desirable Sorting Properties: Stability
A sort is said to be stable if order of equivalent items is preserved.
sort(studentRecords, BY_NAME);

| Name | Section |
|---|---|
| Bas | 3 |
| Fikriyya | 4 |
| Jana | 3 |
| Jouni | 3 |
| Lara | 1 |
| Nikolaj | 4 |
| Rosella | 3 |
| Sigurd | 2 |

sort(studentRecords, BY_SECTION);   // stable sort

| Name | Section |
|---|---|
| Lara | 1 |
| Sigurd | 2 |
| Bas | 3 |
| Jana | 3 |
| Jouni | 3 |
| Rosella | 3 |
| Fikriyya | 4 |
| Nikolaj | 4 |

Equivalent items don’t ‘cross over’ when being stably sorted.
Other Desirable Sorting Properties: Stability
A sort is said to be stable if order of equivalent items is preserved.

sort(studentRecords, BY_NAME);

| Name | Section |
|---|---|
| Bas | 3 |
| Fikriyya | 4 |
| Jana | 3 |
| Jouni | 3 |
| Lara | 1 |
| Nikolaj | 4 |
| Rosella | 3 |
| Sigurd | 2 |

sort(studentRecords, BY_SECTION);   // unstable sort

| Name | Section |
|---|---|
| Lara | 1 |
| Sigurd | 2 |
| Jouni | 3 |
| Rosella | 3 |
| Bas | 3 |
| Jana | 3 |
| Fikriyya | 4 |
| Nikolaj | 4 |

Sorting instability can be really annoying! We wanted students listed alphabetically within each section.
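The two-pass sort above can be run through Java's Arrays.sort, which is stable for object arrays (it uses a merge-sort variant). The Student record, the nameThenSectionOrder helper, and the comparators are our own illustration, not lecture code.

```java
import java.util.Arrays;
import java.util.Comparator;

public class StableDemo {
    record Student(String name, int section) { }

    // Sort by name, then stably by section; return the names in final order.
    public static String[] nameThenSectionOrder(Student[] records) {
        Arrays.sort(records, Comparator.comparing(Student::name));        // BY_NAME
        Arrays.sort(records, Comparator.comparingInt(Student::section));  // BY_SECTION
        String[] names = new String[records.length];
        for (int i = 0; i < records.length; i++) {
            names[i] = records[i].name();
        }
        return names;
    }

    public static void main(String[] args) {
        Student[] records = {
            new Student("Bas", 3), new Student("Fikriyya", 4),
            new Student("Jana", 3), new Student("Jouni", 3),
            new Student("Lara", 1), new Student("Nikolaj", 4),
            new Student("Rosella", 3), new Student("Sigurd", 2)
        };
        System.out.println(Arrays.toString(nameThenSectionOrder(records)));
    }
}
```

Because the second sort is stable, the name ordering established by the first sort survives within each section.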
Sorting Stability
Is insertion sort stable?
Is Quicksort stable?
S O R T E X A M P L E
S O R T E X A M P L E (0 swaps)
O S R T E X A M P L E (1 swap )
O R S T E X A M P L E (1 swap )
O R S T E X A M P L E (0 swaps)
E O R S T X A M P L E (4 swaps)
E O R S T X A M P L E (0 swaps)
A E O R S T X M P L E (6 swaps)
A E M O R S T X P L E (5 swaps)
A E M O P R S T X L E (4 swaps)
A E L M O P R S T X E (7 swaps)
A E E L M O P R S T X (8 swaps)
[Demo residue: Quicksort stability traces on small arrays with duplicate keys (e.g. 6 8 3 1 2 7 4), comparing Hoare partitioning with three array partitioning; equal items can cross over under Hoare partitioning, so that scheme is not stable.]
Stability
| Sort | Memory | # Compares | Notes | Stable? |
|---|---|---|---|---|
| Heapsort | Θ(1) | Θ(N log N) | Bad caching (61C) | No |
| Insertion | Θ(1) | Θ(N²) | Θ(N) if almost sorted | Yes |
| Mergesort | Θ(N) | Θ(N log N) | | Yes |
| Quicksort LTHS | Θ(log N) | Θ(N log N) expected | Fastest sort | No |
You can create a stable Quicksort. However, unstable partitioning schemes (like Hoare partitioning) tend to be faster. All reasonable partitioning schemes yield Θ(N log N) expected runtime, but with different constants.
Quicksort’s Θ(log N) memory is the cost of tracking recursive calls on the call stack, and it is also only an expected amount. In practice, the difference between Θ(log N) and constant memory is trivial.
Optimizing Sorts
Additional tricks we can play:
Arrays.sort
In Java, Arrays.sort(someArray) uses:
1. Mergesort (specifically, the TimSort variant) if someArray consists of Objects.
2. Quicksort (specifically, dual-pivot Quicksort) if someArray consists of primitives.
Why? See A level problems.
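A quick check of both entry points (the algorithm names in the comments reflect current OpenJDK behavior; the class and method names here are our own):

```java
import java.util.Arrays;

public class LibrarySortDemo {
    // Sorts and returns the same primitive array (dual-pivot Quicksort inside).
    public static int[] sortPrimitives(int[] a) {
        Arrays.sort(a);
        return a;
    }

    // Sorts and returns the same boxed array (stable TimSort inside).
    public static Integer[] sortObjects(Integer[] a) {
        Arrays.sort(a);
        return a;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sortPrimitives(new int[]{5, 3, 1, 4, 2})));
        System.out.println(Arrays.toString(sortObjects(new Integer[]{5, 3, 1, 4, 2})));
    }
}
```

Both calls are spelled the same, but the library dispatches to different algorithms: instability is unobservable for primitives, so the faster unstable sort is safe there.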
Shuffling
How do we shuffle an array?
One approach: assign each item a uniform random number as its key (e.g. 0.8741, 0.5512, 0.211, 0.366), then sort the items by those keys.
How do we get random numbers? See Fa14 lecture 33.
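Another standard linear-time answer is the Fisher-Yates (Knuth) shuffle, sketched below with our own names: walk the array from the back, swapping each position with a uniformly random position at or before it. If the random numbers are good, every permutation is equally likely, and no sort is needed.

```java
import java.util.Arrays;
import java.util.Random;

public class Shuffler {
    public static void shuffle(int[] a, Random rng) {
        for (int i = a.length - 1; i > 0; i--) {
            int j = rng.nextInt(i + 1);  // uniform in [0, i]
            int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
        }
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 3, 4, 5, 6, 7, 8};
        shuffle(a, new Random());
        System.out.println(Arrays.toString(a));
    }
}
```

This does Θ(N) work and Θ(1) extra memory, versus the Θ(N log N) of the assign-random-keys-then-sort approach.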
How do we NOT shuffle?
In response to an anti-trust investigation, Microsoft agreed to provide a random browser selection website.
The (Bad) Randomization Approach
Implementation: Make compareTo() return a random answer.
Result: the ordering was far from uniform; one browser appeared last 50% of the time.
Simple programming error… or was it?
Sounds of Sorting (Fun)
Sounds of Sorting Algorithms (of 125 items)
Starts with selection sort: https://www.youtube.com/watch?v=kPRA0W1kECg
Insertion sort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=0m9s
Quicksort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=0m38s
Mergesort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=1m05s
Heapsort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=1m28s
LSD sort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=1m54s [coming next Wednesday]
MSD sort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=2m10s [coming next Wednesday]
Shell’s sort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=3m37s [bonus from last time]
Questions to ponder (later… after class):
Citations
Sorting, Puppies, Cats, and Dogs
A solution to the sorting problem also provides a solution to puppy, cat, dog.
Physics analogy: Climbing a hill with your legs is one way to solve the problem of getting up a hill.