This data shows the Word Error Rate of each model per dataset. Lower is better.

                        Whisper Model WER                                                                       Talon Model WER  NVIDIA Model WER
Dataset (Test)          tiny  tiny.en     base  base.en    small small.en   medium   med.en    large large-v2   c-300M     d-1B    large   xlarge
common voice v10       29.27    26.76    21.64    19.75    14.46    14.18    11.17    11.66    10.22     9.63    13.61     9.44     7.33     5.78
librispeech clean       6.95     5.68     5.09     4.70     3.61     3.12     2.87     2.81     2.80     2.72     2.59     2.45     1.64     1.49
librispeech other      15.78    13.42    11.60    10.49     7.67     7.19     5.95     5.88     5.46     5.26     6.33     5.46     3.52     2.82
mls                    17.22    15.17    13.02    12.01     9.25     9.04     8.24     7.92     7.43     7.85     9.37     8.04     5.59     5.14
gigaspeech             14.81    13.12    12.50    11.99    11.17    10.84    10.54    10.52    10.60    11.41    16.53    15.30    12.51    11.98
tedlium                 9.95     8.24     7.68     7.19     7.40     6.65     6.68     6.23     6.97     8.47     6.42     5.31     6.76     6.28
accent2-all            18.19    17.25    13.94    13.05     9.09     9.36     7.56     7.82     7.39     6.12    12.49    10.45    11.50     8.54
podcast1 (n)            8.35     7.05     6.64     7.20     5.46     5.46     5.04     4.97     5.28     7.10     4.80     3.92     6.12     5.93
english1 (r)           11.27     9.63     9.20     8.88     6.94     6.47     6.18     6.26     6.09     6.87     8.22     7.56     7.36     7.16
podcast2 (t)           25.05    22.68    22.20    21.01    19.88    19.13    19.68    18.29    18.62    21.65    13.92    13.12    21.51    20.84
tts1                   56.11    51.10    47.70    45.48    28.55    31.91    24.11    26.51    22.44    25.34     4.85     4.81    27.78    33.82
tts2                  102.97   100.98    85.32    90.11    67.38    61.26    59.31    55.08    54.52    54.32    25.84    24.75    78.41    74.63
words1                 51.97    45.02    35.16    30.74    24.06    21.69    18.85    18.07    17.82    16.70     8.05     7.55    30.61    31.85
words2                170.34   167.19   139.57   140.30   115.52   107.12    96.37    97.65    91.97    88.11    21.55    22.26   131.94   136.06
words3                 61.37    54.41    48.28    45.85    35.39    33.33    28.76    27.51    25.53    26.23    16.14    16.16    32.37    28.92
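
For readers unfamiliar with the metric, the sketch below shows one common way to compute Word Error Rate: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and a model's output, divided by the number of reference words. It returns a percentage, on the assumption that the values in this data are percentages. The function name `word_error_rate`, the whitespace tokenization, and the lack of any text normalization (case, punctuation) are assumptions made for illustration; the source does not say how its scores were computed. Because insertions count as errors, WER can exceed 100, which is consistent with the words2 and tts2 rows.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage of the reference word count.

    Hypothetical helper: tokenizes on whitespace and applies no
    normalization, which may differ from how the scores here were produced.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution out of three reference words.
    print(round(word_error_rate("the cat sat", "the bat sat"), 2))
    # Three inserted words against a one-word reference gives a WER above 100.
    print(word_error_rate("hello", "hello there big world"))
```

Dividing by the reference length rather than the hypothesis length is what allows scores above 100 when a model emits many extra words, for example on short single-word test clips.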