Benchmarking models
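
MODEL: RoBERTa (`roberta-base`)

| BATCH SIZE / SEQUENCE LENGTH | 1/8 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 2/8 | 2/64 | 2/128 | 2/256 | 2/512 | 2/1024 | 4/8 | 4/64 | 4/128 | 4/256 | 4/512 | 4/1024 | 8/8 | 8/64 | 8/128 | 8/256 | 8/512 | 8/1024 | Average | Average w/ comparison |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PT CPU | 0.046 | 0.038 | 0.062 | 0.102 | 0.231 | N/A | 0.043 | 0.061 | 0.091 | 0.184 | 0.393 | N/A | 0.043 | 0.088 | 0.169 | 0.331 | 0.696 | N/A | 0.038 | 0.166 | 0.298 | 0.615 | 1.368 | N/A | 0.25315 | 0.25315 |
| PT CPU + TorchScript | 0.037 | 0.033 | 0.059 | 0.096 | 0.251 | N/A | 0.042 | 0.058 | 0.089 | 0.181 | 0.404 | N/A | 0.039 | 0.09 | 0.167 | 0.343 | 0.691 | N/A | 0.032 | 0.159 | 0.324 | 0.691 | 1.61 | N/A | 0.2698 | 0.2698 |
| PT GPU | 0.015 | 0.016 | 0.016 | 0.016 | 0.016 | N/A | 0.015 | 0.017 | 0.016 | 0.016 | 0.023 | N/A | 0.015 | 0.016 | 0.016 | 0.02 | 0.037 | N/A | 0.015 | 0.016 | 0.021 | 0.034 | 0.064 | N/A | 0.021 | 0.021 |
| PT GPU + TorchScript | 0.009 | 0.01 | 0.01 | 0.009 | 0.014 | N/A | 0.009 | 0.01 | 0.009 | 0.012 | 0.022 | N/A | 0.01 | 0.009 | 0.011 | 0.02 | 0.043 | N/A | 0.01 | 0.011 | 0.019 | 0.039 | 0.08 | N/A | 0.0183 | 0.0183 |
| TF CPU | 0.027 | 0.073 | 0.098 | 0.146 | 0.249 | N/A | 0.031 | 0.093 | 0.131 | 0.210 | 0.458 | N/A | 0.055 | 0.138 | 0.197 | 0.352 | 1.048 | N/A | 0.072 | 0.188 | 0.312 | 0.64 | 2.22 | N/A | 0.337 | 0.337 |
| TF GPU | 0.008 | 0.006 | 0.006 | 0.008 | 0.016 | N/A | 0.006 | 0.006 | 0.007 | 0.014 | 0.027 | N/A | 0.006 | 0.007 | 0.014 | 0.024 | 0.049 | N/A | 0.007 | 0.013 | 0.023 | 0.045 | 0.094 | N/A | 0.019 | 0.019 |
| TF GPU + XLA | 0.0043 | 0.0043 | 0.0050 | 0.0073 | 0.0132 | N/A | 0.0037 | 0.0049 | 0.0071 | 0.0118 | 0.0219 | N/A | 0.0035 | 0.0069 | 0.0111 | 0.0192 | 0.0394 | N/A | 0.0041 | 0.0113 | 0.0184 | 0.0350 | 0.074 | N/A | 0.0153 | 0.0153 |

High-level overview: average inference time (s)

| | All values | Average with OOM values removed |
| --- | --- | --- |
| PyTorch CPU | 1.339 | 0.748 |
| PyTorch CPU + TorchScript | 0.768 | 0.625 |
| PyTorch GPU | 0.046 | 0.046 |
| PyTorch GPU + TorchScript | 0.036 | 0.036 |
| TensorFlow CPU | 1.359 | 0.823 |
| TensorFlow GPU | 0.074 | 0.043 |
| TensorFlow GPU + XLA | 0.049 | 0.035 |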
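
MODEL: GPT-2 (`gpt2`)

| BATCH SIZE / SEQUENCE LENGTH | 1/8 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 2/8 | 2/64 | 2/128 | 2/256 | 2/512 | 2/1024 | 4/8 | 4/64 | 4/128 | 4/256 | 4/512 | 4/1024 | 8/8 | 8/64 | 8/128 | 8/256 | 8/512 | 8/1024 | Average | Average w/ comparison |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PT CPU | 0.039 | 0.039 | 0.067 | 0.108 | 0.247 | 0.593 | 0.039 | 0.063 | 0.099 | 0.21 | 0.418 | 1.104 | 0.034 | 0.094 | 0.186 | 0.382 | 0.785 | 2.033 | 0.038 | 0.179 | 0.326 | 0.687 | 1.578 | 4.113 | 0.560875 | 0.4064347826 |
| PT CPU + TorchScript | 0.046 | 0.058 | 0.12 | 0.131 | 0.285 | 0.798 | 0.046 | 0.076 | 0.119 | 0.277 | 0.492 | 1.321 | 0.04 | 0.11 | 0.206 | 0.502 | 1.138 | 3.578 | 0.108 | 0.251 | 0.442 | 0.962 | 2.687 | 4.888 | 0.778375 | 0.5996956522 |
| PT GPU | 0.016 | 0.016 | 0.016 | 0.016 | 0.016 | 0.027 | 0.016 | 0.016 | 0.016 | 0.016 | 0.022 | 0.047 | 0.016 | 0.016 | 0.016 | 0.02 | 0.038 | 0.085 | 0.016 | 0.016 | 0.02 | 0.035 | 0.068 | N/A | 0.02547826087 | 0.02547826087 |
| PT GPU + TorchScript | 0.009 | 0.009 | 0.01 | 0.009 | 0.013 | 0.027 | 0.009 | 0.01 | 0.009 | 0.011 | 0.022 | 0.054 | 0.01 | 0.009 | 0.011 | 0.02 | 0.043 | 0.104 | 0.009 | 0.011 | 0.019 | 0.039 | 0.085 | N/A | 0.024 | 0.024 |
| TF CPU | 0.0290 | 0.080 | 0.104 | 0.156 | 0.265 | 0.70 | 0.036 | 0.102 | 0.139 | 0.224 | 0.47 | 1.38 | 0.063 | 0.132 | 0.202 | 0.37 | 1.04 | 2.76 | 0.077 | 0.192 | 0.33 | 0.68 | 2.31 | 5.5 | 0.7213 | 0.5144 |
| TF GPU | 0.008 | 0.006 | 0.006 | 0.008 | 0.017 | 0.037 | 0.006 | 0.006 | 0.008 | 0.015 | 0.030 | 0.069 | 0.006 | 0.008 | 0.014 | 0.027 | 0.055 | 0.131 | 0.007 | 0.014 | 0.025 | 0.049 | 0.104 | 0.251 | 0.038 | 0.028 |
| TF GPU + XLA | 0.003 | 0.004 | 0.005 | 0.007 | 0.013 | 0.029 | 0.003 | 0.005 | 0.007 | 0.012 | 0.023 | 0.054 | 0.003 | 0.007 | 0.011 | 0.021 | 0.042 | 0.101 | 0.004 | 0.011 | 0.020 | 0.037 | 0.079 | 0.195 | 0.029 | 0.022 |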
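
The sheet does not say how its two average columns are computed. The sketch below is an assumption that happens to reproduce the GPT-2 figures: "Average" is taken as the arithmetic mean of a row's non-N/A cells, and "Average w/ comparison" as the mean over only those columns where the compared row also has a value. The function names and the use of `None` for N/A are illustrative, not part of the original benchmark.

```python
# GPT-2 (`gpt2`) rows copied from the table: 24 cells each, ordered by
# batch size (1, 2, 4, 8) and then sequence length (8 ... 1024).
NA = None  # stands in for an "N/A" cell

pt_cpu = [
    0.039, 0.039, 0.067, 0.108, 0.247, 0.593,
    0.039, 0.063, 0.099, 0.21, 0.418, 1.104,
    0.034, 0.094, 0.186, 0.382, 0.785, 2.033,
    0.038, 0.179, 0.326, 0.687, 1.578, 4.113,
]
pt_gpu = [
    0.016, 0.016, 0.016, 0.016, 0.016, 0.027,
    0.016, 0.016, 0.016, 0.016, 0.022, 0.047,
    0.016, 0.016, 0.016, 0.02, 0.038, 0.085,
    0.016, 0.016, 0.02, 0.035, 0.068, NA,
]


def average(row):
    """Mean of the cells that hold a measurement (assumed meaning of "Average")."""
    values = [v for v in row if v is not None]
    return sum(values) / len(values)


def average_with_comparison(row, *others):
    """Mean over the columns where this row and every compared row have a value
    (assumed meaning of "Average w/ comparison")."""
    values = [
        v
        for i, v in enumerate(row)
        if v is not None and all(other[i] is not None for other in others)
    ]
    return sum(values) / len(values)


print(round(average(pt_cpu), 6))                          # 0.560875
print(round(average_with_comparison(pt_cpu, pt_gpu), 6))  # 0.406435
print(round(average(pt_gpu), 6))                          # 0.025478
```

Under this reading, dropping the 8/1024 column (where PT GPU has no measurement) is what lowers the PT CPU figure from 0.560875 to 0.4064347826. Rows whose cells were rounded for display, such as TF CPU, match only approximately.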
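
MODEL: BERT (`bert-base-cased`)

| BATCH SIZE / SEQUENCE LENGTH | 1/8 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 2/8 | 2/64 | 2/128 | 2/256 | 2/512 | 2/1024 | 4/8 | 4/64 | 4/128 | 4/256 | 4/512 | 4/1024 | 8/8 | 8/64 | 8/128 | 8/256 | 8/512 | 8/1024 | Average | Average w/ comparison |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PT CPU | 0.044 | 0.037 | 0.06 | 0.094 | 0.194 | N/A | 0.041 | 0.058 | 0.088 | 0.176 | 0.36 | N/A | 0.041 | 0.086 | 0.172 | 0.329 | 0.68 | N/A | 0.037 | 0.167 | 0.299 | 0.635 | 1.37 | N/A | 0.2484 | 0.2484 |
| PT CPU + TorchScript | 0.052 | 0.037 | 0.064 | 0.089 | 0.202 | N/A | 0.04 | 0.058 | 0.084 | 0.23 | 0.352 | N/A | 0.04 | 0.08 | 0.167 | 0.322 | 0.637 | N/A | 0.032 | 0.161 | 0.283 | 0.621 | 1.406 | N/A | 0.24785 | 0.24785 |
| PT GPU | 0.015 | 0.016 | 0.016 | 0.015 | 0.015 | N/A | 0.015 | 0.015 | 0.015 | 0.015 | 0.021 | N/A | 0.015 | 0.015 | 0.015 | 0.02 | 0.037 | N/A | 0.016 | 0.016 | 0.019 | 0.033 | 0.063 | N/A | 0.02035 | 0.02035 |
| PT GPU + TorchScript | 0.008 | 0.009 | 0.009 | 0.009 | 0.014 | N/A | 0.009 | 0.009 | 0.011 | 0.012 | 0.022 | N/A | 0.009 | 0.009 | 0.011 | 0.02 | 0.04 | N/A | 0.009 | 0.011 | 0.02 | 0.038 | 0.081 | N/A | 0.018 | 0.018 |
| TF CPU | 0.026 | 0.073 | 0.095 | 0.148 | 0.239 | N/A | 0.031 | 0.089 | 0.124 | 0.204 | 0.411 | N/A | 0.053 | 0.120 | 0.186 | 0.348 | 0.942 | N/A | 0.069 | 0.178 | 0.306 | 0.614 | 2.132 | N/A | 0.319 | 0.319 |
| TF GPU | 0.008 | 0.007 | 0.007 | 0.008 | 0.016 | N/A | 0.007 | 0.007 | 0.007 | 0.015 | 0.027 | N/A | 0.007 | 0.007 | 0.014 | 0.025 | 0.050 | N/A | 0.007 | 0.013 | 0.023 | 0.045 | 0.095 | N/A | 0.020 | 0.020 |
| TF GPU + XLA | 0.0036 | 0.0040 | 0.0051 | 0.0074 | 0.0132 | N/A | 0.0037 | 0.0050 | 0.0074 | 0.0118 | 0.0219 | N/A | 0.0037 | 0.0070 | 0.0110 | 0.0194 | 0.0395 | N/A | 0.0038 | 0.0108 | 0.0185 | 0.0348 | 0.0739 | N/A | 0.0153 | 0.0153 |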
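
MODEL: XLNet (`xlnet-base-cased`)

| BATCH SIZE / SEQUENCE LENGTH | 1/8 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 2/8 | 2/64 | 2/128 | 2/256 | 2/512 | 2/1024 | 4/8 | 4/64 | 4/128 | 4/256 | 4/512 | 4/1024 | 8/8 | 8/64 | 8/128 | 8/256 | 8/512 | 8/1024 | Average | Average w/ comparison |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PT CPU | 0.047 | 0.068 | 0.105 | 0.242 | 0.824 | 4.016 | 0.043 | 0.101 | 0.194 | 0.432 | 1.693 | 8.883 | 0.035 | 0.163 | 0.343 | 0.792 | 3.596 | 4.36 | 0.045 | 0.292 | 0.604 | 1.673 | 2.661 | 10.404 | 1.734 | 1.357043478 |
| PT CPU + TorchScript | 0.042 | 0.064 | 0.105 | 0.223 | 0.812 | 4.143 | 0.041 | 0.092 | 0.194 | 0.425 | 1.673 | 9.512 | 0.031 | 0.153 | 0.33 | 0.793 | 3.722 | 6.072 | 0.043 | 0.284 | 0.589 | 1.656 | 3.665 | 10.285 | 1.872875 | 1.507130435 |
| PT GPU | 0.022 | 0.024 | 0.023 | 0.024 | 0.026 | 0.078 | 0.022 | 0.023 | 0.022 | 0.022 | 0.048 | 0.199 | 0.023 | 0.024 | 0.024 | 0.035 | 0.112 | 0.462 | 0.023 | 0.024 | 0.032 | 0.074 | 0.268 | N/A | 0.071 | 0.071 |
| PT GPU + TorchScript | 0.016 | 0.017 | 0.016 | 0.016 | 0.023 | 0.071 | 0.016 | 0.018 | 0.018 | 0.018 | 0.044 | 0.187 | 0.016 | 0.018 | 0.018 | 0.031 | 0.104 | 0.439 | 0.016 | 0.019 | 0.027 | 0.067 | 0.252 | N/A | 0.064 | 0.064 |
| TF CPU | 0.037 | 0.093 | 0.125 | 0.200 | 0.360 | 1.230 | 0.047 | 0.141 | 0.169 | 0.306 | 0.847 | 2.624 | 0.072 | 0.159 | 0.255 | 0.55 | 2.58 | 6.4 | 0.088 | 0.242 | 0.43 | 1.32 | 4.2 | 11.5 | 1.418 | 0.979 |
| TF GPU | 0.009 | 0.009 | 0.009 | 0.012 | 0.029 | 0.074 | 0.009 | 0.009 | 0.010 | 0.021 | 0.042 | 0.109 | 0.009 | 0.010 | 0.018 | 0.035 | 0.082 | 0.224 | 0.009 | 0.017 | 0.032 | 0.065 | 0.162 | 0.519 | 0.064 | 0.044 |
| TF GPU + XLA | 0.0049 | 0.0059 | 0.0070 | 0.0096 | 0.0169 | 0.0348 | 0.0053 | 0.0073 | 0.0094 | 0.0150 | 0.0278 | 0.0650 | 0.0054 | 0.0108 | 0.0149 | 0.0253 | 0.0518 | 0.1225 | 0.0068 | 0.0144 | 0.0241 | 0.0446 | 0.0965 | 0.2374 | 0.0360 | 0.0272 |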
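
MODEL: XLM (`xlm-mlm-en-2048`)

| BATCH SIZE / SEQUENCE LENGTH | 1/8 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 2/8 | 2/64 | 2/128 | 2/256 | 2/512 | 2/1024 | 4/8 | 4/64 | 4/128 | 4/256 | 4/512 | 4/1024 | 8/8 | 8/64 | 8/128 | 8/256 | 8/512 | 8/1024 | Average | Average w/ comparison |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PT CPU | 0.103 | 0.157 | 0.292 | 0.486 | 0.954 | N/A | 0.098 | 0.295 | 0.486 | 0.937 | 1.791 | N/A | 0.096 | 0.465 | 0.86 | 1.742 | 3.345 | N/A | 0.156 | 0.86 | 1.678 | 3.181 | 6.791 | N/A | 1.23865 | 0.9464210526 |
| PT CPU + TorchScript | 0.094 | 0.156 | 0.282 | 0.479 | 0.931 | N/A | 0.097 | 0.279 | 0.466 | 0.893 | 1.836 | N/A | 0.089 | 0.461 | 0.908 | 1.776 | 3.756 | N/A | 0.151 | 0.875 | 1.87 | 3.214 | 7.909 | N/A | 1.3261 | 0.9796315789 |
| PT GPU | 0.017 | 0.017 | 0.017 | 0.025 | 0.042 | N/A | 0.018 | 0.017 | 0.024 | 0.042 | 0.079 | N/A | 0.017 | 0.024 | 0.041 | 0.076 | 0.15 | N/A | 0.017 | 0.041 | 0.077 | 0.148 | N/A | N/A | 0.04678947368 | 0.04678947368 |
| PT GPU + TorchScript | 0.011 | 0.011 | 0.015 | 0.027 | 0.053 | N/A | 0.011 | 0.015 | 0.027 | 0.053 | 0.107 | N/A | 0.01 | 0.027 | 0.052 | 0.104 | 0.214 | N/A | 0.011 | 0.052 | 0.103 | 0.289 | N/A | N/A | 0.06273684211 | 0.06273684211 |
| TF CPU | 0.102 | 0.245 | 0.346 | 0.596 | 1.052 | N/A | 0.137 | 0.331 | 0.55 | 1.03 | 2.09 | N/A | 0.207 | 0.55 | 0.95 | 2.00 | 4.2 | N/A | 0.238 | 0.92 | 1.68 | 3.9 | 8.2 | N/A | 1.461 | 1.108 |
| TF GPU | 0.007 | 0.011 | 0.016 | 0.030 | 0.060 | N/A | 0.007 | 0.018 | 0.031 | 0.058 | 0.113 | N/A | 0.007 | 0.030 | 0.057 | 0.110 | 0.220 | N/A | 0.010 | 0.056 | 0.108 | 0.213 | 0.436 | N/A | 0.080 | 0.061 |
| TF GPU + XLA | 0.0076 | 0.0116 | 0.0169 | 0.0289 | 0.055 | N/A | 0.0087 | 0.0180 | 0.0289 | 0.053 | 0.102 | N/A | 0.0091 | 0.0291 | 0.052 | 0.098 | 0.199 | N/A | 0.0119 | 0.051 | 0.096 | 0.192 | 0.38 | N/A | 0.0726 | 0.0563 |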
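
MODEL: Transformer-XL (`transfo-xl-wt103`)

| BATCH SIZE / SEQUENCE LENGTH | 1/8 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 2/8 | 2/64 | 2/128 | 2/256 | 2/512 | 2/1024 | 4/8 | 4/64 | 4/128 | 4/256 | 4/512 | 4/1024 | 8/8 | 8/64 | 8/128 | 8/256 | 8/512 | 8/1024 | Average | Average w/ comparison |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PT CPU | 0.445 | 0.561 | 0.722 | 1.08 | 1.99 | 4.29 | 0.799 | 1.05 | 1.39 | 2.22 | 4.13 | 10.0 | 1.39 | 1.90 | 2.66 | 4.51 | 9.47 | 23.0 | 2.639 | 3.717 | 5.46 | 10.6 | 21.0 | 52.1 | 6.965125 | 2.274388889 |
| PT CPU + TorchScript | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | | |
| PT GPU | 0.04 | 0.042 | 0.049 | 0.07 | 0.117 | 0.245 | 0.055 | 0.062 | 0.084 | 0.129 | 0.243 | N/A | 0.083 | 0.102 | 0.144 | 0.227 | N/A | N/A | 0.168 | 0.228 | 0.32 | N/A | N/A | N/A | 0.1337777778 | 0.1337777778 |
| PT GPU + TorchScript | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | N/A | | |
| TF CPU | 0.43 | 0.62 | 0.75 | 1.04 | 2.47 | 4.9 | 0.71 | 1.04 | 1.34 | 2.61 | 4.8 | 9.7 | 1.24 | 1.92 | 3.18 | 5.1 | 9.3 | 19.3 | 2.54 | 4.7 | 6.8 | 10.6 | 18.8 | 41 | 6.44 | 2.56 |
| TF GPU | 0.032 | 0.038 | 0.046 | 0.063 | 0.107 | 0.227 | 0.049 | 0.060 | 0.078 | 0.125 | 0.225 | 0.488 | 0.078 | 0.104 | 0.147 | 0.237 | 0.448 | 1.037 | 0.142 | 0.196 | 0.290 | 0.520 | 1.148 | 2.892 | 0.366 | 0.125 |
| TF GPU + XLA | 0.029 | 0.032 | 0.038 | 0.049 | 0.074 | 0.137 | 0.044 | 0.051 | 0.067 | 0.083 | 0.134 | 0.253 | 0.075 | 0.088 | 0.119 | 0.154 | 0.252 | 0.497 | 0.139 | 0.167 | 0.230 | 0.301 | 0.500 | 0.975 | 0.187 | 0.095 |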
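
MODEL: GPT (`openai-gpt`)

| BATCH SIZE / SEQUENCE LENGTH | 1/8 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 2/8 | 2/64 | 2/128 | 2/256 | 2/512 | 2/1024 | 4/8 | 4/64 | 4/128 | 4/256 | 4/512 | 4/1024 | 8/8 | 8/64 | 8/128 | 8/256 | 8/512 | 8/1024 | Average | Average w/ comparison |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PT CPU | 0.038 | 0.037 | 0.06 | 0.098 | 0.212 | N/A | 0.037 | 0.058 | 0.093 | 0.188 | 0.381 | N/A | 0.031 | 0.091 | 0.177 | 0.342 | 0.744 | N/A | 0.037 | 0.174 | 0.321 | 0.667 | 1.447 | N/A | 0.26165 | 0.26165 |
| PT CPU + TorchScript | 0.038 | 0.034 | 0.057 | 0.093 | 0.218 | N/A | 0.034 | 0.054 | 0.091 | 0.193 | 0.396 | N/A | 0.029 | 0.085 | 0.179 | 0.343 | 0.774 | N/A | 0.033 | 0.171 | 0.339 | 0.691 | 0.774 | N/A | 0.2313 | 0.2313 |
| PT GPU | 0.0160 | 0.0162 | 0.0159 | 0.0159 | 0.0169 | N/A | 0.0185 | 0.0179 | 0.0180 | 0.0184 | 0.0253 | N/A | 0.0182 | 0.0183 | 0.0185 | 0.0229 | 0.0489 | N/A | 0.0182 | 0.0178 | 0.0215 | 0.0436 | 0.0936 | N/A | 0.0250 | 0.0250 |
| PT GPU + TorchScript | 0.009 | 0.009 | 0.009 | 0.009 | 0.012 | N/A | 0.009 | 0.009 | 0.009 | 0.011 | 0.022 | N/A | 0.009 | 0.009 | 0.012 | 0.02 | 0.043 | N/A | 0.009 | 0.011 | 0.0191 | 0.0341 | 0.083 | N/A | 0.01786 | 0.01786 |
| TF CPU | 0.029 | 0.082 | 0.109 | 0.168 | 0.279 | N/A | 0.036 | 0.108 | 0.157 | 0.242 | 0.528 | N/A | 0.065 | 0.144 | 0.231 | 0.375 | 1.152 | N/A | 0.085 | 0.214 | 0.345 | 0.701 | 2.405 | N/A | 0.373 | 0.373 |
| TF GPU | 0.008 | 0.005 | 0.005 | 0.008 | 0.017 | N/A | 0.005 | 0.005 | 0.008 | 0.015 | 0.029 | N/A | 0.005 | 0.008 | 0.014 | 0.026 | 0.054 | N/A | 0.005 | 0.014 | 0.025 | 0.048 | 0.103 | N/A | 0.020 | 0.020 |
| TF GPU + XLA | 0.003 | 0.003 | 0.005 | 0.007 | 0.013 | N/A | 0.003 | 0.005 | 0.007 | 0.011 | 0.023 | N/A | 0.003 | 0.007 | 0.011 | 0.020 | 0.041 | N/A | 0.004 | 0.011 | 0.019 | 0.036 | 0.078 | N/A | 0.015 | 0.015 |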
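
MODEL: DistilBERT (`distilbert-base-uncased`)

| BATCH SIZE / SEQUENCE LENGTH | 1/8 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 2/8 | 2/64 | 2/128 | 2/256 | 2/512 | 2/1024 | 4/8 | 4/64 | 4/128 | 4/256 | 4/512 | 4/1024 | 8/8 | 8/64 | 8/128 | 8/256 | 8/512 | 8/1024 | Average | Average w/ comparison |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PT CPU | 0.02 | 0.02 | 0.032 | 0.053 | 0.117 | N/A | 0.021 | 0.033 | 0.049 | 0.099 | 0.216 | N/A | 0.022 | 0.045 | 0.09 | 0.177 | 0.395 | N/A | 0.019 | 0.085 | 0.16 | 0.335 | 0.786 | N/A | 0.1387 | 0.1387 |
| PT CPU + TorchScript | 0.022 | 0.017 | 0.035 | 0.054 | 0.117 | N/A | 0.021 | 0.039 | 0.05 | 0.098 | 0.209 | N/A | 0.02 | 0.045 | 0.089 | 0.187 | 0.408 | N/A | 0.017 | 0.082 | 0.166 | 0.357 | 0.896 | N/A | 0.14645 | 0.14645 |
| PT GPU | 0.008 | 0.008 | 0.008 | 0.008 | 0.008 | N/A | 0.008 | 0.008 | 0.008 | 0.008 | 0.011 | N/A | 0.008 | 0.008 | 0.008 | 0.01 | 0.018 | N/A | 0.008 | 0.008 | 0.01 | 0.017 | 0.031 | N/A | 0.01045 | 0.01045 |
| PT GPU + TorchScript | 0.006 | 0.005 | 0.005 | 0.005 | 0.007 | N/A | 0.005 | 0.005 | 0.005 | 0.006 | 0.011 | N/A | 0.005 | 0.005 | 0.006 | 0.01 | 0.021 | N/A | 0.005 | 0.006 | 0.01 | 0.02 | 0.04 | N/A | 0.0094 | 0.0094 |
| TF CPU | 0.0145 | 0.0381 | 0.0500 | 0.0755 | 0.1279 | N/A | 0.0162 | 0.0473 | 0.0674 | 0.1111 | 0.2344 | N/A | 0.0282 | 0.063 | 0.099 | 0.183 | 0.51 | N/A | 0.0355 | 0.094 | 0.154 | 0.317 | 1.08 | N/A | 0.1676 | 0.1676 |
| TF GPU | 0.0045 | 0.0029 | 0.0030 | 0.0040 | 0.0079 | N/A | 0.0028 | 0.0028 | 0.0038 | 0.0073 | 0.0130 | N/A | 0.0029 | 0.0037 | 0.0070 | 0.0120 | 0.0241 | N/A | 0.0028 | 0.0069 | 0.0115 | 0.0219 | 0.045 | N/A | 0.0095 | 0.0095 |
| TF GPU + XLA | 0.0020 | 0.0022 | 0.0026 | 0.0037 | 0.0065 | N/A | 0.0020 | 0.0025 | 0.0035 | 0.0059 | 0.0108 | N/A | 0.0020 | 0.0034 | 0.0056 | 0.0099 | 0.0199 | N/A | 0.0022 | 0.0056 | 0.0094 | 0.0177 | 0.037 | N/A | 0.0077 | 0.0077 |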
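
MODEL: DistilGPT-2 (`distilgpt2`)

| BATCH SIZE / SEQUENCE LENGTH | 1/8 | 1/64 | 1/128 | 1/256 | 1/512 | 1/1024 | 2/8 | 2/64 | 2/128 | 2/256 | 2/512 | 2/1024 | 4/8 | 4/64 | 4/128 | 4/256 | 4/512 | 4/1024 | 8/8 | 8/64 | 8/128 | 8/256 | 8/512 | 8/1024 | Average | Average w/ comparison |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| PT CPU | 0.025 | 0.02 | 0.031 | 0.052 | 0.115 | 0.256 | 0.019 | 0.03 | 0.048 | 0.102 | 0.203 | 0.51 | 0.016 | 0.047 | 0.092 | 0.178 | 0.386 | 0.901 | 0.019 | 0.088 | 0.161 | 0.347 | 0.725 | 1.75 | 0.255125 | 0.255125 |
| PT CPU + TorchScript | 0.02 | 0.023 | 0.034 | 0.054 | 0.116 | 0.271 | 0.018 | 0.029 | 0.049 | 0.102 | 0.231 | 0.628 | 0.015 | 0.043 | 0.087 | 0.178 | 0.403 | 1.46 | 0.018 | 0.089 | 0.163 | 0.378 | 1.12 | 2.51 | 0.335 | 0.335 |
| PT GPU | 0.007 | 0.008 | 0.008 | 0.008 | 0.008 | 0.016 | 0.008 | 0.008 | 0.008 | 0.008 | 0.013 | 0.031 | 0.007 | 0.008 | 0.008 | 0.011 | 0.025 | 0.060 | 0.008 | 0.008 | 0.011 | 0.022 | 0.047 | 0.117 | 0.019 | 0.019 |
| PT GPU + TorchScript | 0.006 | 0.005 | 0.005 | 0.005 | 0.006 | 0.014 | 0.005 | 0.005 | 0.005 | 0.006 | 0.011 | 0.027 | 0.005 | 0.005 | 0.006 | 0.01 | 0.022 | 0.052 | 0.006 | 0.006 | 0.009 | 0.019 | 0.042 | 0.102 | 0.016 | 0.016 |
| TF CPU | 0.0151 | 0.041 | 0.054 | 0.082 | 0.136 | 0.329 | 0.0182 | 0.052 | 0.071 | 0.126 | 0.237 | 0.69 | 0.0319 | 0.070 | 0.108 | 0.188 | 0.54 | 1.43 | 0.039 | 0.099 | 0.170 | 0.35 | 1.20 | 2.85 | 0.3721 | 0.3721 |
| TF GPU | 0.004 | 0.003 | 0.003 | 0.004 | 0.008 | 0.019 | 0.003 | 0.003 | 0.004 | 0.008 | 0.015 | 0.035 | 0.003 | 0.004 | 0.007 | 0.014 | 0.028 | 0.066 | 0.004 | 0.007 | 0.013 | 0.025 | 0.052 | 0.126 | 0.019 | 0.019 |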