| Encoding | Count | Percent | Notes |
|---|---|---|---|
| Total | 100276107 | 100 | This data is taken from a crawl of 100M pages randomly sampled from Google's index of active/popular webpages on November 15, 2013. Encodings were determined using a custom encoding detection library that uses a variety of factors, including the HTTP headers, the <meta> tag, and the textual content of the page. In the case of Visual Hebrew, a <space> preceding a <final hebrew letter> is also taken as an indication of that encoding. The fact that Visual Hebrew was not detected even once in 100M pages suggests that it is essentially unused on the modern web. |
| ASCII | 1754081 | 1.749 | actually Latin1 |
| ASCII-7-bit | 5355620 | 5.341 | subset of UTF-8 |
| Arabic | 375 | 0 | |
| BIG5 | 277010 | 0.276 | |
| BIG5-CP950 | 1206 | 0.001 | |
| BIG5_HKSCS | 1494 | 0.001 | |
| CP1250 | 204694 | 0.204 | |
| CP1251 | 2068582 | 2.063 | MS Cyrillic |
| CP1252 | 6896831 | 6.878 | MS Latin |
| CP1253 | 17185 | 0.017 | |
| CP1254 | 452298 | 0.451 | |
| CP1255 | 60871 | 0.061 | |
| CP1256 | 613119 | 0.611 | Arabic |
| CP1257 | 24137 | 0.024 | |
| CP852 | 3402 | 0.003 | |
| CP866 | 5529 | 0.006 | |
| CP874 | 170611 | 0.17 | |
| CP932 | 47374 | 0.047 | |
| CSN_369103 | 79 | 0 | |
| EUC | 78 | 0 | |
| EUC-CN | 194 | 0 | |
| EUC-JP | 596864 | 0.595 | Japanese |
| GB | 3299997 | 3.291 | Chinese |
| GB18030 | 6328 | 0.006 | |
| GBK | 3266016 | 3.257 | Chinese |
| Greek | 20768 | 0.021 | |
| HZ-GB-2312 | 14 | 0 | |
| ISO-2022-KR | 36 | 0 | |
| ISO-8859-11 | 134931 | 0.135 | |
| ISO-8859-13 | 1711 | 0.002 | |
| ISO-8859-15 | 346878 | 0.346 | |
| ISO-8859-5 | 1009 | 0.001 | |
| ISO-8859-8-I | 1068 | 0.001 | Logical Hebrew |
| ISO-8859-8 | 0 | 0 | Visual Hebrew |
| ISO_2022_CN | 7 | 0 | |
| JIS | 4125 | 0.004 | |
| KOI8R | 17473 | 0.017 | |
| KOI8U | 2416 | 0.002 | |
| KSC | 836953 | 0.835 | Korean |
| Latin2 | 563444 | 0.562 | Eastern European |
| Latin3 | 1920 | 0.002 | |
| Latin4 | 3778 | 0.004 | |
| Latin5 | 118449 | 0.118 | |
| Latin6 | 7663 | 0.008 | |
| MACINTOSH | 10367 | 0.01 | |
| SJS | 1378745 | 1.375 | Japanese |
| UTF-16BE | 129 | 0 | |
| UTF-16LE | 4917 | 0.005 | |
| UTF-32BE | 14 | 0 | |
| UTF-32LE | 3 | 0 | |
| UTF7 | 378 | 0 | |
| UTF8 | 69865897 | 69.674 | wow! |
| X-BINARYENC | 1122 | 0.001 | |
| invalid_encoding | 1827917 | 1.823 | |
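
The Visual Hebrew signal mentioned in the notes (a space preceding a final Hebrew letter) can be illustrated with a rough sketch. This is not the detection library used for the crawl, whose internals are not described here. The function name `guess_hebrew_order` is hypothetical, and so is the choice to compare the "space before final letter" count against the opposite "final letter before space" count. Final letter forms normally end a word, so in logically ordered text they are followed by a space. In visually ordered text the characters are stored reversed, so they tend to follow a space instead.

```python
import re

# Final-form Hebrew letters: final kaf, mem, nun, pe, tsadi.
FINAL_LETTERS = "\u05da\u05dd\u05df\u05e3\u05e5"

# Visual-order signal from the notes: whitespace, then a final letter.
SPACE_THEN_FINAL = re.compile(rf"\s[{FINAL_LETTERS}]")
# Opposite signal (an assumption of this sketch): a final letter, then whitespace.
FINAL_THEN_SPACE = re.compile(rf"[{FINAL_LETTERS}]\s")


def guess_hebrew_order(raw: bytes) -> str:
    """Guess whether ISO-8859-8 bytes hold visual or logical Hebrew.

    Hypothetical helper: returns "visual", "logical", or "unknown".
    """
    # Bytes undefined in ISO-8859-8 become U+FFFD instead of raising.
    text = raw.decode("iso8859_8", errors="replace")
    visual_hits = len(SPACE_THEN_FINAL.findall(text))
    logical_hits = len(FINAL_THEN_SPACE.findall(text))
    if visual_hits > logical_hits:
        return "visual"
    if logical_hits > visual_hits:
        return "logical"
    return "unknown"


if __name__ == "__main__":
    # "shalom olam tov" in logical order; reversing it simulates visual storage.
    logical = "\u05e9\u05dc\u05d5\u05dd \u05e2\u05d5\u05dc\u05dd \u05d8\u05d5\u05d1"
    print(guess_hebrew_order(logical.encode("iso8859_8")))
    print(guess_hebrew_order(logical[::-1].encode("iso8859_8")))
```

A real detector would presumably weigh this signal together with the HTTP headers and <meta> tag rather than rely on it alone. The sketch only shows why the position of final letters distinguishes the two byte orders.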