Full rounds: 8
Partial rounds: 24

| | | Adds | Muls |
|---|---|---|---|
| "Easy matmul" cost | X -> X * diag + sum(X) | 31 | 16 |
| "Full matmul" cost | X -> MX (assuming M is repeat of 4*4) | 80 | 80 |
| Ext*ext mul cost | Mod X^4 - 3. No Karatsuba turns out to be faster | 18 | 16 |
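The two cheapest primitives in the cost rows above can be sketched as follows. This is only an illustration under stated assumptions: the source names neither the base field nor the state width, so the modulus `P` (the KoalaBear prime) and the width of 16 (inferred from 31 adds = 15 + 16) are guesses, and the function names are hypothetical. The extension product is plain schoolbook multiplication with the reduction X^4 = 3, not Karatsuba.

```python
# Assumption: the base field is not named in the source; this prime is
# used purely for illustration.
P = 2**31 - 2**24 + 1

def easy_matmul(x, diag):
    """X -> X * diag + sum(X), applied elementwise to the state."""
    s = sum(x) % P  # width - 1 adds (15 for an assumed width of 16)
    # One mul and one add per element (16 muls, 16 adds for width 16).
    return [(xi * di + s) % P for xi, di in zip(x, diag)]

def ext_mul(a, b, w=3):
    """Schoolbook product of two degree-4 extension elements mod X^4 - w.

    Elements are coefficient lists [c0, c1, c2, c3], lowest degree first.
    """
    prod = [0] * 7
    for i in range(4):
        for j in range(4):
            # 16 base-field multiplications in total.
            prod[i + j] = (prod[i + j] + a[i] * b[j]) % P
    # Reduce with X^4 = w: fold coefficients 4..6 back onto 0..2.
    return [(prod[k] + w * prod[k + 4]) % P for k in range(3)] + [prod[3]]

print(ext_mul([0, 1, 0, 0], [0, 0, 0, 1]))  # X * X^3 = X^4, reduced to w
```

One accounting that reaches the listed 18 adds for the extension product is 9 adds to merge 16 products into 7 coefficients, 3 adds for the fold, and 6 adds if each multiplication by 3 is done as two additions. The source does not spell this out, so treat it as a plausible reading rather than the author's derivation.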
| | | Base adds | Base muls | Ext adds | Base*ext muls | Ext*ext muls | Total base adds | Total base muls |
|---|---|---|---|---|---|---|---|---|
| RLC for linear sumcheck | | 14 | | 14 | 14 | | 70 | 56 |
| Weights generation | Small-size (this also gets used for full layers) | | | 0.25 | | 0.25 | 5.5 | 4 |
| Linear sumcheck (first layer) | Half-size due to Gruen's trick | 1 | | | 1 | | 1 | 4 |
| Linear sumcheck (later layers) | N/4+N/8+N/16+... = half-size | | | 1 | | 1 | 22 | 16 |
| Cubic sumcheck (first layer) | Half-size due to Gruen's trick | 2 | 3 | | 2 | | 2 | 11 |
| Cubic sumcheck (later layers) | N/4+N/8+N/16+... = half-size | | | 2 | | 5 | 98 | 80 |
| Full layer (execution) | | 80 | 80 | | | | 80 | 80 |
| Partial layer (execution) | | 31 | 16 | | | | 31 | 16 |
| Total execution | | 1384 | 1024 | | | | 1384 | 1024 |
| Full layer (proving) | | 32 | 48 | 32.25 | 32 | 80.25 | 1605.5 | 1460 |
| Partial layer (proving) | | 17 | 3 | 17.25 | 17 | 6.25 | 198.5 | 171 |
| Total proving | | 664 | 456 | 672 | 664 | 792 | 17608 | 15784 |
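The totals in the rows above can be reproduced with a short script. This is a sketch under assumptions, not the sheet's actual formulas: the conversion weights (one extension add = 4 base adds, one base*ext mul = 4 base muls, one ext*ext mul = 18 base adds + 16 base muls) and the way steps are combined into layers (16 cubic sumchecks plus one weights generation per full layer; one of every step per partial layer) are inferred from the figures, and all names are hypothetical.

```python
FULL_ROUNDS, PARTIAL_ROUNDS, WIDTH = 8, 24, 16  # WIDTH is an inferred assumption

# Per-step counts: (base adds, base muls, ext adds, base*ext muls, ext*ext muls)
STEPS = {
    "rlc":          (14, 0, 14, 14, 0),
    "weights":      (0, 0, 0.25, 0, 0.25),
    "linear_first": (1, 0, 0, 1, 0),
    "linear_later": (0, 0, 1, 0, 1),
    "cubic_first":  (2, 3, 0, 2, 0),
    "cubic_later":  (0, 0, 2, 0, 5),
}

def combine(*terms):
    """Sum (multiplier, step_name) pairs column by column."""
    return tuple(sum(m * STEPS[name][i] for m, name in terms) for i in range(5))

def to_base(counts):
    """Convert the five count columns into (total base adds, total base muls)."""
    base_adds, base_muls, ext_adds, base_ext_muls, ext_ext_muls = counts
    adds = base_adds + 4 * ext_adds + 18 * ext_ext_muls
    muls = base_muls + 4 * base_ext_muls + 16 * ext_ext_muls
    return adds, muls

def total(full, partial):
    """Scale per-layer (adds, muls) by the round counts and add them up."""
    return tuple(FULL_ROUNDS * f + PARTIAL_ROUNDS * p for f, p in zip(full, partial))

# Inferred composition of each layer type (see the lead-in).
full_layer = combine((WIDTH, "cubic_first"), (WIDTH, "cubic_later"), (1, "weights"))
partial_layer = combine(*[(1, name) for name in STEPS])

exec_total = total((80, 80), (31, 16))  # full matmul and easy matmul costs
prove_total = total(to_base(full_layer), to_base(partial_layer))

print("execution (adds, muls):", exec_total)
print("proving   (adds, muls):", prove_total)
```

Keeping the per-step counts in one dictionary makes it easy to change a single step (for example, a different extension multiplication cost) and see how the layer and grand totals move.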

Note: these counts do not take into account memory bandwidth or differences in how well different operations can be parallelized.