ABCDEFGHIJKLMNOPQRSTUVWXYZAAAB
1
Background Data
2
FLOP to train xTrimo 6.00E+23Epoch
3
Number of A100s for xtrimo in 2023768Chen et al
4
5
Cost to train xTrimo Epoch method 2Method is from Epoch
6
Cost of A100 at release ($)$15,000For each hardware model, I used whatever unit price I could find that was reported closest to the release date of the hardware, as long as it was reported by a seemingly credible source.
7
Hardware replacement time (years)2Epoch
8
Cost per year to run $7,500Compute price trends.ipynb
9
Cost per second to run$0.0002Calc
10
Peak FP performance (FLOP/s)3.1E+14See A100 datasheet: https://www.nvidia.com/content/dam/en-zz/Solutions/Data-Center/a100/pdf/nvidia-a100-datasheet-nvidia-us-2188504-web.pdf
11
Hardware price performance (FLOP/s per $)1.3E+18Calc
12
Utilization rate 35%Epoch
13
Realised training compute per dollar (FLOP per $)4.6E+17Calc
14
FLOP to train xTrimo-100b6.0E+23Epoch
15
Hardware cost ($)$1,306,722Calc
16
17
Increase in data OOMs7Epoch database4,883.72
18
Increase in compute OOMs5
19
How much data increase for a given increase in compute1.4
20
21
Cost to train xTrimo level performance over time from Moore's law
22
FLOP/s per $ doubles every x years2.46Epoch
23
OOMs per year0.122Epoch
24
Increase in hardware price performance per year (FLOP/s per $)1.324341535
25
2023$1,306,722
26
2024$986,696
27
2025$745,046
28
2026$562,579
29
2027$424,799
30
2028$320,762
31
2029$242,205
32
2030$182,887
33
34
Projections of compute to train xTrimo-type model
35
Compute increase factor per year90.31546487683.785578521
36
As a proportion 0.9542425094
37
FLOP to train BDTsYear
realised training compute per dollar
Hardware cost
38
20236.0E+234.6E+17$1,306,722Compute in year n=2023
39
20245.4E+246.1E+17$8,880,2621
40
20254.9E+258.1E+17$60,348,7512
41
20264.4E+261.1E+18$410,119,8573Seems implausible
42
20273.9E+271.4E+18$2,787,104,8454
43
20283.5E+281.9E+18$18,940,690,8545
44
20293.2E+292.5E+18$128,717,716,0536
45
20302.9E+303.3E+18$874,743,722,5937
46
20312.6E+314.3E+18$5,944,609,675,1598
47
20322.3E+325.8E+18$40,398,557,059,9549
48
20332.1E+337.6E+18$274,541,728,003,82610
49
50
Cost to train nucleotide transformer
51
FLOP to train Nucleotide transformer1.2E+22
52
Hardware cost ($)$26,134
53
54
Projections of compute to train Nucleotide transformer type modelYear
realised training compute per dollar
Hardware cost
55
20231.2E+224.6E+17$26,134
56
20241.1E+236.1E+17$177,605
57
20259.7E+238.1E+17$1,206,975
58
20268.7E+241.1E+18$8,202,397
59
20277.9E+251.4E+18$55,742,097
60
20287.1E+261.9E+18$378,813,817
61
20296.4E+272.5E+18$2,574,354,321
62
20305.7E+283.3E+18$17,494,874,452
63
20315.2E+294.3E+18$118,892,193,503
64
20324.6E+305.8E+18$807,971,141,199
65
20334.2E+317.6E+18$5,490,834,560,077
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100