Actual figures (key data)

| Model | GPT-2 | GPT-3 | GPT-4 |
|---|---|---|---|
| Release date | Feb 2019 | June 2020 | March 2023 |
| Spend on training-run compute ($m) | 0.04 | 2 | 40 |
| FLOP on training run | 4E+21 | 3E+23 | 2E+25 |
| Algorithmic efficiency (normalised to GPT-4 in 2023) | 0.02 | 0.07 | 1 |
| Effective FLOP on training run (normalised to GPT-4) | 6E+19 | 2E+22 | 2E+25 |
| Increase in effective FLOP vs. previous model | n/a | 325 | 948 |
| Increase in effective FLOP from GPT-2 to GPT-4 | | | 307,733.45 |
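The effective-FLOP row is the product of the two rows above it: raw training FLOP times algorithmic efficiency relative to 2023 GPT-4. A minimal sketch of that arithmetic in Python, using the rounded figures from the table (the table's own effective-FLOP entries and ratios appear to come from less-rounded inputs, so they differ slightly from these products):

```python
# Raw training FLOP and algorithmic efficiency (normalised to GPT-4 = 1),
# copied from the actuals table above.
models = {
    "GPT-2": {"flop": 4e21, "algo_eff": 0.02},
    "GPT-3": {"flop": 3e23, "algo_eff": 0.07},
    "GPT-4": {"flop": 2e25, "algo_eff": 1.0},
}

# Effective FLOP = raw training FLOP x algorithmic efficiency.
for m in models.values():
    m["eff_flop"] = m["flop"] * m["algo_eff"]

# Generation-on-generation jump in effective FLOP. With these rounded inputs
# the ratios come out near 262 and 952; the table's 325 and 948 imply its
# underlying figures are less heavily rounded.
jump_2_to_3 = models["GPT-3"]["eff_flop"] / models["GPT-2"]["eff_flop"]
jump_3_to_4 = models["GPT-4"]["eff_flop"] / models["GPT-3"]["eff_flop"]
```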
Projections based on trend lines from 2023 (i.e. assuming no major disruptions)

| Year | 2020 | 2021 | 2022 | 2023 | 2024 | 2025 | 2026 | 2027 | 2028 | 2029 | 2030 | 2031 | 2032 | Comments |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Dollars spent on largest training run compute ($m) | 2.56 | 6.4 | 16 | 40 | 110 | 303 | 832 | 2,288 | 6,291 | 17,300 | 47,576 | 130,834 | 359,794 | Projected at 2.75x/year from 2023 |
| FLOP on largest training run | 2E+23 | 1E+24 | 4E+24 | 2E+25 | 8E+25 | 3E+26 | 1E+27 | 5E+27 | 2E+28 | 8E+28 | 3E+29 | 1E+30 | 5E+30 | Projected at 4x/year from 2023 |
| Algorithmic efficiency (normalised to 2023 GPT-4) | 0.04 | 0.1 | 0.3 | 1 | 3 | 9 | 27 | 81 | 243 | 729 | 2,187 | 6,561 | 19,683 | Projected at 3x/year |
| Effective FLOP on largest run (normalised to 2023 GPT-4) | 8E+21 | 1E+23 | 1E+24 | 2E+25 | 2E+26 | 3E+27 | 3E+28 | 4E+29 | 5E+30 | 6E+31 | 7E+32 | 9E+33 | 1E+35 | |
| Increase vs. GPT-4 | 4.1E-04 | 5.5E-03 | 7.4E-02 | 1.0E+00 | 1.2E+01 | 1.4E+02 | 1.7E+03 | 2.1E+04 | 2.5E+05 | 3.0E+06 | 3.6E+07 | 4.3E+08 | 5.2E+09 | |
| Efficiency of training chips (FLOP per dollar) | 9E+16 | 2E+17 | 3E+17 | 5E+17 | 7E+17 | 1E+18 | 2E+18 | 2E+18 | 3E+18 | 5E+18 | 7E+18 | 1E+19 | 1E+19 | |
| Increase in FLOP per dollar (year on year) | | 1.8 | 1.8 | 1.8 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | 1.5 | |

Equivalent to a model of size: GPT-4 at 2023, then GPT-5, GPT-6 and GPT-7 at later points on the trend (reading each model's effective training FLOP against the effective-FLOP row here puts GPT-5 at roughly 2025–26, GPT-6 at roughly 2028, and GPT-7 at roughly 2030–31).
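Each projection row is a simple geometric series from its 2023 anchor, per the "Projected at Nx/year" comments. A quick sketch that reproduces the forward half of the table (growth factors and 2023 anchors are from the table; the helper name is illustrative):

```python
# Project each quantity forward from its 2023 value at a constant growth
# factor, following the "Projected at Nx/year" comments in the table.
def project(base_2023: float, growth: float, years: range) -> dict:
    return {y: base_2023 * growth ** (y - 2023) for y in years}

years = range(2023, 2033)
spend_m = project(40, 2.75, years)    # $m spent on largest training run
flop = project(2e25, 4, years)        # FLOP on largest training run
algo_eff = project(1, 3, years)       # algorithmic efficiency vs 2023 GPT-4

# Effective FLOP combines both trends (4 x 3 = 12x per year).
eff_flop = {y: flop[y] * algo_eff[y] for y in years}

print(round(spend_m[2030]))        # 47576, matching the table
print(f"{eff_flop[2030]:.0e}")     # 7e+32, matching the table
```

The backcast columns (2020–2022) use slightly different historical growth rates, so only 2023 onwards is reproduced here.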
| Model | Effective training FLOP | Training FLOP | Training cost ($bn) | Parameters (trillions) | Tokens of data (trillions) |
|---|---|---|---|---|---|
| GPT-4 | 2E+25 | 2E+25 | 0.1 | 2 | 10 |
| GPT-5 | 1E+28 | 1E+27 | 2 | 17 | 84 |
| GPT-6 | 6E+30 | 1E+29 | 20 | 140 | 700 |
| GPT-7 | 3E+33 | 7E+30 | 200 | 1,171 | 5,857 |
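The training-cost column can be sanity-checked against the chip-efficiency row of the projections table: cost is roughly training FLOP divided by FLOP per dollar in the training year. A rough check (the pairing of models to years is my reading of the trend, not stated in the source):

```python
# Rough check: training cost ($bn) ~ training FLOP / chip FLOP-per-dollar / 1e9.
def cost_bn(train_flop: float, flop_per_dollar: float) -> float:
    return train_flop / flop_per_dollar / 1e9

# GPT-4: 2E+25 FLOP at the 2023 chip efficiency of 5E+17 FLOP per dollar.
gpt4 = cost_bn(2e25, 5e17)   # ~0.04 $bn, i.e. the $40m in the actuals table
# GPT-5: 1E+27 FLOP at ~1E+18 FLOP per dollar (the ~2025 projection).
gpt5 = cost_bn(1e27, 1e18)   # ~1 $bn, the same order as the $2bn shown
```

The agreement is order-of-magnitude rather than exact, consistent with the cost column covering more than raw compute.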