1 of 66

Instructor Lab 10

CS 61BL Summer 2023

2 of 66

Instructor Lab Check

3 of 66

4 of 66

Quicksort

Instructor Lab 10, CS61BL, SU23

Quicksort

  • Quicksort Backstory, Partitioning
  • Quicksort

Quicksort Performance

  • Quicksort Runtime Analysis
  • Avoiding Quicksort Worst Case

Quicksort Variants

  • Philosophies for Avoiding Worst Case Behavior
  • Quicksort Variant Experiments

Quick Select (median finding)

Sorting Stability

Sounds of Sorting

5 of 66

Sorting So Far

Core ideas:

  • Selection sort: Find the smallest item and put it at the front.
    • Heapsort variant: Use MaxPQ to find max element and put at the back.
  • Merge sort: Merge two sorted halves into one sorted whole.
  • Insertion sort: Figure out where to insert the current item.

Quicksort:

  • Much stranger core idea: Partitioning.
  • Invented by Sir Tony Hoare in 1960, at the time a novice programmer.

6 of 66

Context for Quicksort’s Invention (Source)

1960: Tony Hoare was working on a crude automated translation program for Russian and English. �

  • Binary search for each word.
    • Find “the” in log D time.
    • Find “cat” in log D time...
  • Total time: N log D

...

...

beautiful

красивая

...

...

cat

кошка

...

...

“The cat wore a beautiful hat.”

Dictionary of D english words

N words

“Кошка носил красивая шапка.”

How would you do this?

7 of 66

Context for Quicksort’s Invention (Source)

Limitation at the time:

  • Dictionary stored on long piece of tape, sentence is an array in RAM.
    • Binary search of tape is not log time (requires physical movement!).
  • Better: Sort the sentence and scan dictionary tape once. Takes N log N + D time.
    • But Tony had to figure out how to sort an array (without Google!)...�

1960: Tony Hoare was working on a crude automated translation program for Russian and English.

Algorithm: N binary searches of D length dictionary.

  • Total runtime: N log D
  • ASSUMES log time binary search!

...

...

beautiful

красивая

...

...

cat

кошка

...

...

8 of 66

Core Idea of Tony’s Sort: Partitioning http://www.yellkey.com/popular

To partition an array a[] on element x=a[i] is to rearrange a[] so that:

  • All entries to the left of x are <= x.
  • All entries to the right of x are >= x.
  • x moves to position j (may be the same as i)

Which partitions are valid?

5

550

10

4

10

9

330

i

4

5

9

10

10

330

550

5

9

10

4

10

550

330

5

4

9

10

10

550

330

5

9

10

4

10

550

330

A.

C.

B.

D.

j

j

j

j

Called the ‘pivot’.

9 of 66

The Core Idea of Tony’s Sort: Partitioning

To partition an array a[] on element x=a[i] is to rearrange a[] so that:

  • x moves to position j (may be the same as i)
  • All entries to the left of x are <= x.
  • All entries to the right of x are >= x.

Which partitions are valid?

5

550

10

4

10

9

330

i

4

5

9

10

10

330

550

5

9

10

4

10

550

330

5

4

9

10

10

550

330

5

9

10

4

10

550

330

A.

C.

B.

D.

j

j

j

j

Called the ‘pivot’.

10 of 66

Quicksort

Instructor Lab 10, CS61BL, SU23

Quicksort

  • Quicksort Backstory, Partitioning
  • Quicksort

Quicksort Performance

  • Quicksort Runtime Analysis
  • Avoiding Quicksort Worst Case

Quicksort Variants

  • Philosophies for Avoiding Worst Case Behavior
  • Quicksort Variant Experiments

Quick Select (median finding)

Sorting Stability

Sounds of Sorting

11 of 66

Partition Sort, a.k.a. Quicksort

Observations:

  • 5 is “in its place.” Exactly where it’d be if the array were sorted.
  • Can sort two halves separately, e.g. through recursive use of partitioning.

5

3

2

1

8

4

6

7

3

2

1

4

7

8

6

5

3

2

1

4

5

7

8

6

5

2

1

3

4

5

6

7

8

5

Q: How would we use this operation for sorting?

12 of 66

Partition Sort, a.k.a. Quicksort

Quicksorting N items: (Demo)

  • Partition on leftmost item.
  • Quicksort left half.
  • Quicksort right half.

32

15

2

17

19

26

41

17

17

partition(32)

15

2

17

19

26

17

17

32

41

<= 32

>= 32

in its place

13 of 66

Quicksort

Quicksort was the name chosen by Tony Hoare for partition sort.

  • For most common situations, it is empirically the fastest sort.
    • Tony was lucky that the name was correct.

How fast is Quicksort? Need to count number and difficulty of partition operations.

Theoretical analysis:

  • Partitioning costs Θ(K) time, where Θ(K) is the number of elements being partitioned (as we saw in our earlier “interview question”).
  • The interesting twist: Overall runtime will depend crucially on where pivot ends up.

14 of 66

Quicksort

Instructor Lab 10, CS61BL, SU23

Quicksort

  • Quicksort Backstory, Partitioning
  • Quicksort

Quicksort Performance

  • Quicksort Runtime Analysis
  • Avoiding Quicksort Worst Case

Quicksort Variants

  • Philosophies for Avoiding Worst Case Behavior
  • Quicksort Variant Experiments

Quick Select (median finding)

Sorting Stability

Sounds of Sorting

15 of 66

Best Case: Pivot Always Lands in the Middle

Only size 1 problems remain, so we’re done.

16 of 66

Best Case Runtime?

What is the best case runtime?

Only size 1 problems remain, so we’re done.

17 of 66

Best Case Runtime?

Only size 1 problems remain, so we’re done.

Total work at each level:

N

N/2 + N/2 = N

N/4 * 4 = N

Overall runtime:

Θ(NH) where H = Θ(log N)

so: Θ(N log N)

18 of 66

Worst Case: Pivot Always Lands at Beginning of Array http://www.yellkey.com/popular

Give an example of an array that would follow the pattern to the right.

What is the runtime Θ(∙)?

19 of 66

Worst Case: Pivot Always Lands at Beginning of Array

Give an example of an array that would follow the pattern to the right.

  • 1 2 3 4 5 6

What is the runtime Θ(∙)?

  • N2

20 of 66

Quicksort Performance

Theoretical analysis:

  • Best case: Θ(N log N)
  • Worst case: Θ(N2)

Compare this to Mergesort.

  • Best case: Θ(N log N)
  • Worst case: Θ(N log N)

Recall that Θ(N log N) vs. Θ(N2) is a really big deal. So how can Quicksort be the fastest sort empirically? Because on average it is Θ(N log N).

  • Rigorous proof requires probability theory + calculus, but intuition + empirical analysis will hopefully convince you.

21 of 66

Argument #1: 10% Case

Suppose pivot always ends up at least 10% from either edge (not to scale).

Work at each level: O(N)

  • Runtime is O(NH).
    • H is approximately log 10/9 N = O(log N)
  • Overall: O(N log N).

N

N/10

9N/10

N/100

9N/100

9N/100

81N/100

Punchline: Even if you are unlucky enough to have a pivot that never lands anywhere near the middle, but at least always 10% from the edge, runtime is still O(N log N).

22 of 66

Argument #2: Quicksort is BST Sort

5

3

2

1

8

4

6

7

3

2

1

4

7

8

6

2

1

4

6

8

5

3

5

7

5

3

7

2

4

6

8

1

Key idea: compareTo calls are same for BST insert and Quicksort.

  • Every number gets compared to 5 in both.
  • 3 gets compared to only 1, 2, 4, and 5 in both.

Reminder: Random insertion into a BST takes O(N log N) time.

23 of 66

Empirical Quicksort Runtimes

For N items:

  • Mean number of compares to complete Quicksort: ~2N ln N
  • Standard deviation:

Empirical histogram for quicksort compare counts (10,000 trials with N = 1000)

Lots of arrays take 12,000ish compares to sort with Quicksort.

A very small number take 15,000ish compares to sort with Quicksort.

Chance of taking 1,000,000ish compares is effectively zero.

24 of 66

Quicksort Performance

Theoretical analysis:

  • Best case: Θ(N log N)
  • Worst case: Θ(N2)
  • Randomly chosen array case: Θ(N log N) expected

Compare this to Mergesort.

  • Best case: Θ(N log N)
  • Worst case: Θ(N log N)

Why is it faster than mergesort?

  • Requires empirical analysis. No obvious reason why.

With extremely high probability!!

For our pivot/partitioning strategies: Sorted or close to sorted.

25 of 66

Sorting Summary (so far)

Listed by mechanism:

  • Selection sort: Find the smallest item and put it at the front.
  • Insertion sort: Figure out where to insert the current item.
  • Merge sort: Merge two sorted halves into one sorted whole.
  • Partition (quick) sort: Partition items around a pivot.

Listed by memory and runtime:

Memory

Time

Notes

Heapsort

Θ(1)

Θ(N log N)

Bad caching (61C)

Insertion

Θ(1)

Θ(N2)

Θ(N) if almost sorted

Mergesort

Θ(N)

Θ(N log N)

Quicksort

Θ(log N) (call stack)

Θ(N log N) expected

Fastest sort

26 of 66

Quicksort

Instructor Lab 10, CS61BL, SU23

Quicksort

  • Quicksort Backstory, Partitioning
  • Quicksort

Quicksort Performance

  • Quicksort Runtime Analysis
  • Avoiding Quicksort Worst Case

Quicksort Variants

  • Philosophies for Avoiding Worst Case Behavior
  • Quicksort Variant Experiments

Quick Select (median finding)

Sorting Stability

Sounds of Sorting

27 of 66

Quicksort Performance

The performance of Quicksort (both order of growth and constant factors) depend critically on:

  • How you select your pivot.
  • How you partition around that pivot.
  • Other optimizations you might add to speed things up.�

Bad choices can be very bad indeed, resulting in Θ(N2) runtimes.

28 of 66

Avoiding the Worst Case

If pivot always lands somewhere “good”, Quicksort is Θ(N log N). However, the very rare Θ(N2) cases do happen in practice, e.g.

  • Bad ordering: Array already in sorted order (or almost sorted order).
  • Bad elements: Array with all duplicates.

What can we do to avoid worst case behavior?

Recall, our version of Quicksort has the following properties:

  • Leftmost item is always chosen as the pivot.
  • Our partitioning algorithm preserves the relative order of <= and >= items.

6

8

3

1

2

7

4

3

1

2

4

6

8

7

29 of 66

Avoiding the Worst Case

If pivot always lands somewhere “good”, Quicksort is Θ(N log N). However, the very rare Θ(N2) cases do happen in practice, e.g.

  • Bad ordering: Array already in sorted order (or almost sorted order).
  • Bad elements: Array with all duplicates.

What can we do to avoid worst case behavior?

  • Shuffle before starting.
  • Go through entire array, find the median, use that as the pivot.
  • Scan through one tie and check if it’s already sorted, don’t sort.

30 of 66

More Quicksort Origins

Amusingly, Quicksort was the wrong tool for the job. Two issues:

  • Language that Tony was using didn’t support recursion (so he couldn’t easily implement Quicksort).
  • Sentences are usually shorter than 15 words.

31 of 66

Quicksort

Instructor Lab 10, CS61BL, SU23

Quicksort

  • Quicksort Backstory, Partitioning
  • Quicksort

Quicksort Performance

  • Quicksort Runtime Analysis
  • Avoiding Quicksort Worst Case

Quicksort Variants

  • Philosophies for Avoiding Worst Case Behavior
  • Quicksort Variant Experiments

Quick Select (median finding)

Sorting Stability

Sounds of Sorting

32 of 66

Avoiding the Worst Case: My Answers

What can we do to avoid running into the worst case for QuickSort?

Four philosophies:

1. Randomness: Pick a random pivot or shuffle before sorting.

2. Smarter pivot selection: Calculate or approximate the median.

3. Introspection: Switch to a safer sort if recursion goes to deep.

4. Preprocess the array: Could analyze array to see if Quicksort will be slow. No obvious way to do this, though (can’t just check if array is sorted, almost sorted arrays are almost slow).

33 of 66

Philosophy 1: Randomness

Pathological arrays:

  • Bad ordering: Array already in sorted order.
  • Bad elements: Array with all duplicates.

Dealing with bad ordering:

  • Strategy #1: Pick pivots randomly.
  • Strategy #2: Shuffle before you sort.

The second strategy requires care in partitioning code to avoid Θ(N2) behavior on arrays of duplicates.

  • Common bug, even in a well known 2010s textbook.

34 of 66

Philosophy 2a: Smarter Pivot Selection (constant time pivot pick)

Randomness is necessary for best Quicksort performance! For any pivot selection procedure that is:

  • Deterministic
  • Constant Time

Dangerous input

7

2

3

4

6

1

8

5

The resulting Quicksort has a family of dangerous inputs that an adversary could easily generate.

  • See McIlroy’s “A Killer Adversary for Quicksort

35 of 66

Philosophy 2b: Smarter Pivot Selection (linear time pivot pick)

Could calculate the actual median in linear time.

  • “Exact median Quicksort” is safe: Worst case Θ(N log N), but it is slower than Mergesort.

Raises interesting question though: How do you compute the median of an array? Will talk about how to do this later today.

36 of 66

Philosophy 3: Introspection

Can also simply watch your recursion depth.

  • If it exceeds some critical value (say 10 ln N), switch to mergesort.

Perfectly reasonable approach, though not super common in practice.

37 of 66

Sorting Summary (so far)

Listed by mechanism:

  • Selection sort: Find the smallest item and put it at the front.
  • Insertion sort: Figure out where to insert the current item.
  • Merge sort: Merge two sorted halves into one sorted whole.
  • Partition (quick) sort: Partition items around a pivot.

Listed by memory and runtime:

Memory

Time

Notes

Heapsort

Θ(1)

Θ(N log N)

Bad caching (61C)

Insertion

Θ(1)

Θ(N2)

Θ(N) if almost sorted

Mergesort

Θ(N)

Θ(N log N)

Random Quicksort

Θ(log N) expected

Θ(N log N) expected

Fastest sort

38 of 66

Quicksort

Instructor Lab 10, CS61BL, SU23

Quicksort

  • Quicksort Backstory, Partitioning
  • Quicksort

Quicksort Performance

  • Quicksort Runtime Analysis
  • Avoiding Quicksort Worst Case

Quicksort Variants

  • Philosophies for Avoiding Worst Case Behavior
  • Quicksort Variant Experiments

Quick Select (median finding)

Sorting Stability

Sounds of Sorting

39 of 66

Quicksort Flavors

We said Quicksort is the fastest, but this is only true if we make the right decisions about:

  • Pivot selection.
  • Partition algorithm.
  • How we deal with avoiding the worst case (can be covered by the above choices).

Suppose we run a speed test of Mergesort vs. Quicksort from last time, which had:

  • Pivot selection: Always use leftmost.
  • Partition algorithm: Make an array copy then do three scans for smaller, equal, and bigger items (equal scan trivially finishes in one compare).
  • Shuffle before starting (to avoid worst case).

We’ll call this Quicksort L3S

40 of 66

Quicksort vs. Mergesort

Quicksort didn’t do so well!

Pivot Selection Strategy

Partition

Algorithm

Worst Case Avoidance Strategy

Time to sort 1000 arrays of 10000 ints

Mergesort

N/A

N/A

N/A

1.3 seconds

Quicksort L3S

Leftmost

3-scan

Shuffle

3.2 seconds

Note: These are unoptimized versions of mergesort and quicksort, i.e. no switching to insertion sort for small arrays.

41 of 66

Tony Hoare’s In-place Partitioning Scheme

Tony originally proposed a scheme where two pointers walk towards each other.

  • Left pointer loves small items.
  • Right pointer loves large items.
  • Big idea: Walk towards each other, swapping anything they don’t like.
    • End result is that things on left are “small” and things on the right are “large”.

Full details here (note this is not the exact scheme Tony used): Demo

Using this partitioning scheme yields a very fast Quicksort.

  • Though faster schemes have been found since.
  • Overall runtime still depends crucially on pivot selection strategy!

42 of 66

Quicksort vs. Mergesort

Using Tony Hoare’s two pointer scheme, Quicksort is better than mergesort!

  • More recent pivot/partitioning schemes do somewhat better.
    • Best known Quicksort uses a two-pivot scheme.
    • Interesting note, this version of Quicksort was introduced to the world by a previously unknown guy, in a Java developers forum (Link).

Pivot Selection Strategy

Partition

Algorithm

Worst Case Avoidance Strategy

Time to sort 1000 arrays of 10000 ints

Mergesort

N/A

N/A

N/A

1.3 seconds

Quicksort L3S

Leftmost

3-scan

Shuffle

3.2 seconds

Quicksort LTHS

Leftmost

Tony Hoare

Shuffle

0.9 seconds

Note: These are unoptimized versions of mergesort and quicksort, i.e. no switching to insertion sort for small arrays.

43 of 66

What If We Don’t Want Randomness?

Our approach so far: Use randomness to avoid worst case behavior, but some people don’t like having randomness in their sorting routine.

Another approach: Use the median (or an approximation) as our pivot.

Four philosophies:

1. Randomness: Pick a random pivot or shuffle before sorting.

2. Smarter pivot selection: Calculate or approximate the median.

3. Introspection: Switch to a safer sort if recursion goes to deep.

4. Try to cheat: If the array is already sorted, don’t sort (this doesn’t work).

This is what we’ve been using.

44 of 66

Philosophy 2a: Smarter Pivot Selection (linear time pivot pick)

The best possible pivot is the median.

  • Splits problem into two problems of size N/2.

Obvious approach: Just calculate the actual median and use that as pivot.

  • But how?

Goal: Come up with an algorithm for finding the median of an array. Bonus points if your algorithm takes linear time.

Your answer:

45 of 66

Philosophy 2a: Smarter Pivot Selection (linear time pivot pick)

Goal: Come up with an algorithm for finding the median of an array. Bonus points if your algorithm takes linear time.

  • Sort the array, return the middle item.

46 of 66

Median Identification

Is it possible to find the median in Θ(N) time?

  • Yes! Use ‘BFPRT’ (called PICK in original paper).
  • Algorithm developed in 1972
  • In practice, rarely used.

Historical note: The authors of this paper include FOUR Turing Award winners (and Vaughan Pratt)

Let’s see how Exact Median Quicksort performs.

47 of 66

Quicksort vs. Mergesort

Quicksort using PICK to find the exact median (Quicksort PickTH) is terrible!

  • Cost to compute medians is too high.
  • Have to live with worst case Θ(N2) if we want good practical performance.

Pivot Selection Strategy

Partition

Algorithm

Worst Case Avoidance Strategy

Time to sort 1000 arrays of 10000 ints

Worst Case

Mergesort

N/A

N/A

N/A

1.3 seconds

Θ(N log N)

Quicksort L3S

Leftmost

3-scan

Shuffle

3.2 seconds

Θ(N2)

Quicksort LTHS

Leftmost

Tony Hoare

Shuffle

0.9 seconds

Θ(N2)

Quicksort PickTH

Exact Median

Tony Hoare

Exact Median

4.9 seconds

Θ(N log N)

Note: These are unoptimized versions of mergesort and quicksort, i.e. no switching to insertion sort for small arrays.

48 of 66

Quicksort

Instructor Lab 10, CS61BL, SU23

Quicksort

  • Quicksort Backstory, Partitioning
  • Quicksort

Quicksort Performance

  • Quicksort Runtime Analysis
  • Avoiding Quicksort Worst Case

Quicksort Variants

  • Philosophies for Avoiding Worst Case Behavior
  • Quicksort Variant Experiments

Quick Select (median finding)

Sorting Stability

Sounds of Sorting

49 of 66

The Selection Problem

Computing the exact median would be great for picking an item to partition around. Gives us a “safe quick sort”.

  • Unfortunately, it turns out that exact median computation is too slow.

However, it turns out that partitioning can be used to find the exact median.

  • The resulting algorithm is the best known median identification algorithm.

50 of 66

Quick Select

Goal, find the median:

Partition, pivot lands at 2.

  • Not the median. Why?
  • So what next?

Now pivot lands at 6.

  • Not the median.

Partition right subproblem, median can’t be to the left!

Pivot lands at 4. Are we done?

  • Yep, 9/2 = 4.

550

14

10

330

5

4

817

913

14

10

330

550

5

4

817

913

6

5

9

550

14

10

330

817

913

9

550

14

6

10

5

330

817

913

14

10

330

5

4

4

4

5

10

14

330

5

4

4

4

5

51 of 66

Worst case performance?

What is the worst case performance for Quick Select? Give an array that causes this worst case (assuming we always pick leftmost item as pivot).

52 of 66

Worst case performance?

What is the worst case performance for Quick Select? Give an array that causes this worst case (assuming we always pick leftmost item as pivot).

Worst asymptotic performance Θ(N2) occurs if array is in sorted order.

[1 2 3 4 5 6 7 8 9 10 … N]

[1 2 3 4 5 6 7 8 9 10 … N]

[1 2 3 4 5 6 7 8 9 10 … N]

[1 2 3 4 5 … N/2 … N]

53 of 66

Expected Performance

On average, Quick Select will take Θ(N) time.

  • Intuitive picture (not a proof!):

~N compares

~N/2 compares

~N/4 compares

On average, pivot ends up about halfway

N + N/2 + N/4 + … + 1 = Θ(N)

54 of 66

Quicksort With Quickselect?

Quicksort with PICK to find exact median was terrible.

What if we used Quickselect to find the exact median?

  • Resulting algorithm is still quite slow. Also: a little strange to do a bunch of partitions to identify the optimal item to partition around.

Pivot Selection Strategy

Partition

Algorithm

Worst Case Avoidance Strategy

Time to sort 1000 arrays of 10000 ints

Worst Case

Mergesort

N/A

N/A

N/A

1.3 seconds

Θ(N log N)

Quicksort L3S

Leftmost

3-scan

Shuffle

3.2 seconds

Θ(N2)

Quicksort LTHS

Leftmost

Tony Hoare

Shuffle

0.9 seconds

Θ(N2)

Quicksort PickTH

Exact Median

Tony Hoare

Exact Median

4.9 seconds

Θ(N log N)

55 of 66

Stability

Instructor Lab 10, CS61BL, SU23

Quicksort

  • Quicksort Backstory, Partitioning
  • Quicksort

Quicksort Performance

  • Quicksort Runtime Analysis
  • Avoiding Quicksort Worst Case

Quicksort Variants

  • Philosophies for Avoiding Worst Case Behavior
  • Quicksort Variant Experiments

Quick Select (median finding)

Sorting Stability

Sounds of Sorting

56 of 66

Sorting Summary (so far)

Listed by mechanism:

  • Selection sort: Find the smallest item and put it at the front.
  • Insertion sort: Figure out where to insert the current item.
  • Merge sort: Merge two sorted halves into one sorted whole.
  • Partition (quick) sort: Partition items around a pivot.

Listed by memory and runtime:

Memory

# Compares

Notes

Heapsort

Θ(1)

Θ(N log N) worst

Bad caching (61C)

Insertion

Θ(1)

Θ(N2) worst

Θ(N) if almost sorted

Mergesort

Θ(N)

Θ(N log N) worst

Random Quicksort

Θ(log N) (call stack)

Θ(N log N) expected

Fastest sort

57 of 66

Other Desirable Sorting Properties: Stability

A sort is said to be stable if order of equivalent items is preserved.

Bas

3

Fikriyya

4

Jana

3

Jouni

3

Lara

1

Nikolaj

4

Rosella

3

Sigurd

2

sort(studentRecords, BY_NAME);

Lara

1

Sigurd

2

Bas

3

Jana

3

Jouni

3

Rosella

3

Fikriyya

4

Nikolaj

4

sort(studentRecords, BY_SECTION);

Equivalent items don’t ‘cross over’ when being stably sorted.

58 of 66

Other Desirable Sorting Properties: Stability

A sort is said to be stable if order of equivalent items is preserved.

Bas

3

Fikriyya

4

Jana

3

Jouni

3

Lara

1

Nikolaj

4

Rosella

3

Sigurd

2

sort(studentRecords, BY_NAME);

Lara

1

Sigurd

2

Jouni

3

Rosella

3

Bas

3

Jana

3

Fikriyya

4

Nikolaj

4

sort(studentRecords, BY_SECTION);

Sorting instability can be really annoying! Wanted students listed alphabetically by section.

59 of 66

Sorting Stability http://www.yellkey.com/popular

Is insertion sort stable?

Is Quicksort stable?

  • Consider -------->

S O R T E X A M P L E

S O R T E X A M P L E (0 swaps)

O S R T E X A M P L E (1 swap )

O R S T E X A M P L E (1 swap )

O R S T E X A M P L E (0 swaps)

E O R S T X A M P L E (4 swaps)

E O R S T X A M P L E (0 swaps)

A E O R S T X M P L E (6 swaps)

A E M O R S T X P L E (5 swaps)

A E M O P R S T X L E (4 swaps)

A E L M O P R S T X E (7 swaps)

A E E L M O P R S T X (8 swaps)

6

8

3

1

2

7

4

60 of 66

Sorting Stability

Is insertion sort stable?

  • Yes.
  • Equivalent items never move past their equivalent brethren.

Is Quicksort stable?

  • Depends on your partitioning strategy.

S O R T E X A M P L E

S O R T E X A M P L E (0 swaps)

O S R T E X A M P L E (1 swap )

O R S T E X A M P L E (1 swap )

O R S T E X A M P L E (0 swaps)

E O R S T X A M P L E (4 swaps)

E O R S T X A M P L E (0 swaps)

A E O R S T X M P L E (6 swaps)

A E M O R S T X P L E (5 swaps)

A E M O P R S T X L E (4 swaps)

A E L M O P R S T X E (7 swaps)

A E E L M O P R S T X (8 swaps)

6

7

3

1

2

8

3

2

3

3

1

6

8

7

3

1

2

3

6

7

8

Hoare partitioning.

Three array partitioning.

61 of 66

Stability

Memory

# Compares

Notes

Stable?

Heapsort

Θ(1)

Θ(N log N)

Bad caching (61C)

No

Insertion

Θ(1)

Θ(N2)

Θ(N) if almost sorted

Yes

Mergesort

Θ(N)

Θ(N log N)

Yes

Quicksort LTHS

Θ(log N)

Θ(N log N) expected

Fastest sort

No

You can create a stable Quicksort (i.e. the version from the previous lecture). However, unstable partitioning schemes (like Hoare partitioning) tend to be faster. All reasonable partitioning schemes yield Θ(N log N) expected runtime, but with different constants.

This is due to the cost of tracking recursive calls by the computer, and is also an “expected” amount. The difference between log N and constant memory is trivial.

62 of 66

Arrays.sort

In Java, Arrays.sort(someArray) uses:

  • Mergesort (specifically the TimSort variant) if someArray consists of Objects.
  • Quicksort if someArray consists of primitives.

Why? See A level problems.

63 of 66

Optimizing Sorts

Additional tricks we can play:

  • Switch to insertion sort:
    • When a subproblem reaches size 15 or lower, use insertion sort.
  • Make sort adaptive: Exploit existing order in array (Insertion Sort, SmoothSort, TimSort (the sort in Python and Java)).
  • Exploit restrictions on set of keys. If number of keys is some constant, e.g. [3, 4, 1, 2, 4, 3, …, 2, 2, 2, 1, 4, 3, 2, 3], can sort faster (see 3-way quicksort -- if you’re curious, see: http://goo.gl/3sYnv3).
  • For Quicksort: Make the algorithm introspective, switching to a different sorting method if recursion goes too deep. Only a problem for deterministic flavors of Quicksort.

64 of 66

Sounds of Sorting

Instructor Lab 10, CS61BL, SU23

Quicksort

  • Quicksort Backstory, Partitioning
  • Quicksort

Quicksort Performance

  • Quicksort Runtime Analysis
  • Avoiding Quicksort Worst Case

Quicksort Variants

  • Philosophies for Avoiding Worst Case Behavior
  • Quicksort Variant Experiments

Quick Select (median finding)

Sorting Stability

Sounds of Sorting

65 of 66

Sounds of Sorting Algorithms (of 125 items)

Starts with selection sort: https://www.youtube.com/watch?v=kPRA0W1kECg

Insertion sort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=0m9s

Quicksort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=0m38s

Mergesort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=1m05s

Heapsort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=1m28s

LSD sort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=1m54s [coming tomorrow!]

MSD sort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=2m10s [coming tomorrow!]

Shell’s sort: https://www.youtube.com/watch?v=kPRA0W1kECg&t=3m37s [bonus from last time]

Questions to ponder (later… after class):

  • How many items for selection sort?
  • Why does insertion sort take longer / more compares than selection sort?
  • At what time stamp does the first partition complete for Quicksort?
  • Could the size of the input to mergesort be a power of 2?
  • What do the colors mean for heapsort?
  • How many characters are in the alphabet used for the LSD sort problem?
  • How many digits are in the keys used for the LSD sort problem?

66 of 66

Citations

Quickman from Mega Man 2