ABCDEFGHIJKLMNOPQRSTUVWXY
1
Sorting 11m strings on EC2 c4.2xlarge
2
Radix sort (ByBytes)Quicksort
3
Max procsWall time (sec)CPU time (sec)CPU use %% of ideal useCPU waste %Wall time (sec)CPU time (sec)CPU use %% of ideal useCPU waste %
4
110.310.4100.97%100.97%16.516.64100.85%100.85%
5
26.810.65156.62%78.31%2.40%8.817.15194.89%97.44%3.06%
6
44.712.13258.09%64.52%16.63%5.718.61326.49%81.62%11.84%
7
83.312.09366.36%45.80%16.25%3.518.4525.71%65.71%10.58%
8
162.715.3566.67%35.42%47.12%3.224.77774.06%48.38%48.86%
9
"CPU waste %" is the total-CPU-time delta from the 1-proc sort; higher is bad. The 16 procs % may be high from using Intel's hardware threads which aren't as helpful as real cores. Unsure.
10
The data sorted is http://dumps.wikimedia.org/enwiki/20141208/enwiki-20141208-all-titles-in-ns0.gz, shuffled with sort -R. All sorts are of [][]bytes, with the file mmap'd.
11
These timings are with a slightly old revision, but should match what you'd get today--only the API has changed, not the algorithms.
12
Parallel radix sort v. parallel quicksortParallel radix sort v. serial quicksortParallel vs. serial radix sort
13
Max procsWall deltaCPU deltaMax procsWall deltaCPU deltaMax procsWall deltaCPU waste %
14
1-37.58%-37.50%1-37.58%-37.50%10.00%0.00%
15
2-22.73%-37.90%2-58.79%-36.00%2-33.98%2.40%
16
4-17.54%-34.82%4-71.52%-27.10%4-54.37%16.63%
17
8-5.71%-34.29%8-80.00%-27.34%8-67.96%16.25%
18
16-15.63%-38.23%16-83.64%-8.05%16-73.79%47.12%
19
Qsort is better at keeping all cores busy, but not by quite enough to catch up to radix sort in wall time.The radix win outweighed the parallelization costs, though total CPU time got close at 16 procs.
20
21
22
benchcmp of radixsort and stdlib sort on EC2 t2.medium
23
With -cpu=1With -cpu=2
24
benchmarkold ns/opnew ns/opdeltabenchmarkold ns/opnew ns/opdelta
25
BenchmarkSortString1K
339428275374-18.87%
BenchmarkSortString1K-2
374884298033-20.50%
26
BenchmarkSortInt1K
14755952741-64.26%
BenchmarkSortInt1K-2
18600790019-51.60%
27
BenchmarkSortInt64K
143851823228304-77.56%
BenchmarkSortInt64K-2
143676182535033-82.36%
28
BenchmarkSort1e2
6882172587+5.47%
BenchmarkSort1e2-2
6888072560+5.34%
29
BenchmarkSort1e4
151548986455438-57.40%
BenchmarkSort1e4-2
151750156700472-55.85%
30
BenchmarkSort1e6
2395248933701996155-70.69%
BenchmarkSort1e6-2
2372841030465819031-80.37%
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100