HSTS Preload List
State of the Union 2016
contact: lgarron@
last updated: 2016-02-02
canonical link (for Googlers):
crbug:
Contents
Ownership Verification for Manual Requests
Preload List (Code Generation) Script
HSTS allows a site to request that it always be contacted over HTTPS.
Chrome maintains an “HSTS Preload List” of built-in HSTS domains, which is growing rapidly. A routine “weekly” update on January 26, 2016 added over 1100 new domains.
In addition, the “HSTS” preload list also includes preloaded HPKP entries, though there are far fewer of these.
See chromium.org/hsts for more info.
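For reference, a preload-eligible policy is a single response header served over HTTPS. A minimal sketch in Go (the exact minimum max-age the submission form requires has changed over time, so treat the value here as illustrative):

    package main

    import (
        "fmt"
        "log"
        "net/http"
    )

    func main() {
        http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
            // Remember HTTPS for ~18 weeks, apply the policy to all
            // subdomains, and signal consent to be added to the preload list.
            w.Header().Set("Strict-Transport-Security",
                "max-age=10886400; includeSubDomains; preload")
            fmt.Fprintln(w, "hello over HTTPS")
        })
        // The certificate/key paths are placeholders; the header only counts
        // when it is served over a valid HTTPS connection.
        log.Fatal(http.ListenAndServeTLS(":443", "cert.pem", "key.pem", nil))
    }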
This is an assorted collection of concerns that are on my mind as of January 2016.
We have an automated submission at hstspreload.appspot.com, but we still have to run a manual script to accept/reject domains. (Googler-only link to script: go/hsts-preload-process)
This increases latency. Sometimes, site operators are frustrated that a non-obvious check in the “manual” script delayed their preload request by a few weeks.
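To make the shape of these checks concrete, here is a rough sketch in Go of one such check (this is not the actual go/hsts-preload-process script, and the helper name is made up): the plain-HTTP origin has to end up on HTTPS, and the final response has to carry an HSTS header that includes the preload directive.

    package main

    import (
        "fmt"
        "net/http"
        "strings"
    )

    // checkDomain is a hypothetical, simplified version of one preload check.
    func checkDomain(domain string) error {
        // http.Get follows redirects, so resp.Request describes the final URL.
        resp, err := http.Get("http://" + domain + "/")
        if err != nil {
            return fmt.Errorf("%s: fetch failed: %v", domain, err)
        }
        defer resp.Body.Close()

        if resp.Request.URL.Scheme != "https" {
            return fmt.Errorf("%s: does not redirect to HTTPS", domain)
        }
        hsts := strings.ToLower(resp.Header.Get("Strict-Transport-Security"))
        if !strings.Contains(hsts, "preload") {
            return fmt.Errorf("%s: HSTS header is missing the preload directive", domain)
        }
        return nil
    }

    func main() {
        if err := checkDomain("example.com"); err != nil {
            fmt.Println("rejected:", err)
            return
        }
        fmt.Println("looks eligible")
    }

The real checks cover much more (max-age thresholds, includeSubDomains, certificate validity, redirect behavior), which is where most of the surprises come from.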
TODO
For Marking HTTP As Non-Secure, we still need a better understanding of HTTPS adoption rates and challenges. Sites using HSTS are level 4 (of 5) on Ivan Ristic's TLS Maturity Model, so we can learn from them about late stages of HTTPS adoption.
TODO
Sites are asking to be removed on a regular basis now.
Fortunately, some sites ask immediately after being preloaded (so that they’re easy to remove), but others only ask months later.
The most common issue appears to be forgetting to check for HTTPS support on all subdomains hosted by third parties.
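A cheap way for operators to catch this class of mistake before submitting is to walk every known subdomain and confirm that it actually serves HTTPS with a valid certificate. A minimal sketch in Go (the subdomain list is made up; a real inventory would come from DNS zones, Certificate Transparency logs, or internal records):

    package main

    import (
        "crypto/tls"
        "fmt"
        "net"
        "time"
    )

    func main() {
        // Hypothetical inventory of subdomains to check.
        subdomains := []string{"www.example.com", "blog.example.com", "status.example.com"}

        dialer := &net.Dialer{Timeout: 10 * time.Second}
        for _, host := range subdomains {
            // A nil tls.Config uses the system roots and verifies the
            // certificate against the host name, so a missing or invalid
            // certificate surfaces as an error here.
            conn, err := tls.DialWithDialer(dialer, "tcp", host+":443", nil)
            if err != nil {
                fmt.Printf("not preload-safe: %s (%v)\n", host, err)
                continue
            }
            conn.Close()
            fmt.Printf("ok: %s\n", host)
        }
    }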
TODO
Some sites cannot preload HSTS with includeSubDomains because of a handful of subdomains that do not support valid certificates.
They can “whitelist” individual subdomains by setting/preloading HSTS, but right now neither the spec nor any implementation supports a “blacklist” carve-out. Without carve-outs, a single internal corp.example.com subdomain can prevent preloading example.com for end users, because doing so would break users on the corporate network.
Some large companies have run into this issue; as a concrete example, dropbox.com took 2.5 years to work through this (and that’s a “success story”!).
However, supporting carve-outs would break the strong, straightforward guarantee that would come with HSTS on an entire eTLD+1. If companies get used to the option of a permanent carve-out, this may harm the overall protection of users in the long term.
TODO
There are definitely a bunch of stale entries on the preload list. Unlike Mozilla, we do not perform any scans to ensure freshness. We also require a preload token for automatic hstspreload submissions, but we don’t ask for that token to stick around.
In practice, I honor removal requests if the preload token has been removed, but this is not a documented policy.
Note that any automated process to determine stale sites should be done very carefully, and ideally announced ahead of time to reach relevant site operators. Some examples:
As of February 1, 2016, I had only ever received one email with mild concern about stale entries. However, on February 2, 2016 I received an email about a concrete case in which someone recently bought a domain that was preloaded by the previous owner in 2012. It’s unclear whether this will become a growing problem.
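If we do automate a freshness scan, the core of it is just re-fetching each preloaded domain, flagging entries whose HSTS header no longer carries the preload token, and announcing the candidates well before touching the list. A minimal sketch in Go, with the input list and any retry or grace-period handling simplified away:

    package main

    import (
        "fmt"
        "net/http"
        "strings"
    )

    // stillConsents reports whether a preloaded domain still serves an HSTS
    // header containing the "preload" token over HTTPS.
    func stillConsents(domain string) bool {
        resp, err := http.Get("https://" + domain + "/")
        if err != nil {
            // Unreachable sites need separate, more careful handling before
            // anyone calls them stale.
            return false
        }
        defer resp.Body.Close()
        hsts := strings.ToLower(resp.Header.Get("Strict-Transport-Security"))
        return strings.Contains(hsts, "preload")
    }

    func main() {
        // Hypothetical sample; a real scan would walk the whole preload list.
        for _, d := range []string{"example.com", "example.org"} {
            if !stillConsents(d) {
                fmt.Println("removal candidate (announce before acting):", d)
            }
        }
    }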
TODO
There are various reasons a site may not preload through the form:
In this case, I:
In addition, sites sometimes ask to be removed, but the site no longer answers on port 443 or its certificate is now invalid. In this case, I ask the owner to restore valid HTTPS or add a TXT record to the DNS entry for the domain, but this is totally ad hoc.
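If we ever formalize that flow, the DNS half is easy to automate. A sketch in Go; the “hsts-preload-removal=” record format is purely hypothetical, not an existing policy:

    package main

    import (
        "fmt"
        "net"
        "strings"
    )

    // hasRemovalToken checks whether the domain publishes a TXT record with a
    // pre-agreed verification string. The "hsts-preload-removal=" prefix is a
    // made-up convention for this sketch.
    func hasRemovalToken(domain, token string) (bool, error) {
        records, err := net.LookupTXT(domain)
        if err != nil {
            return false, err
        }
        for _, r := range records {
            if strings.TrimSpace(r) == "hsts-preload-removal="+token {
                return true, nil
            }
        }
        return false, nil
    }

    func main() {
        ok, err := hasRemovalToken("example.com", "d41d8cd9")
        fmt.Println(ok, err)
    }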
I think the status quo is okay, but it doesn’t scale, so I want to keep an eye on it.
TODO
The preload list is compressed using a Huffman tree, but the length of the compressed data is already over 64KB and growing. This format cannot be compressed much further[2], and it is not optimized for small binary diffs.
If the list grows to be, say, 10× the current size, the list will contribute a non-trivial amount to the Chrome binary size and update bandwidth.
We help keep the size down by requiring eTLD+1 domains for automated submissions, and discouraging manual submissions from adding too many individual subdomains. But we can’t ship a partial list – the whole point is that we’ve agreed to preload the sites on it.
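For a rough sense of scale, combining the numbers in this doc: the compressed output is already over 64KB, and we average about 8 bytes per entry including metadata (see the comment thread below), which works out to on the order of 8,000 entries today. At the same per-entry cost, 10× growth would put the preload data somewhere around 640KB before any further format work.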
TODO
There is a script called transport_security_state_static_generate.go that sanity checks the preload list and generates the C++ header that contains the preload list data (it also handles other things that are stored with the preload list, including HPKP and expect-ct). In particular, it generates a compact Huffman tree encoded in the header as static data that contains all the domains in the list.
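For intuition about why the encoding is compact (this is an illustration of the general technique, not the script’s actual output format): the bytes that appear in domain names are heavily skewed toward a few characters, so a Huffman code can give the common ones very short bit sequences. A toy sketch in Go that derives per-character code lengths from a list of domains:

    package main

    import (
        "fmt"
        "sort"
    )

    type node struct {
        count       int
        char        byte // only meaningful for leaves
        left, right *node
    }

    // codeLengths builds a Huffman tree over the byte frequencies of the input
    // domains and returns the code length assigned to each byte.
    func codeLengths(domains []string) map[byte]int {
        freq := map[byte]int{}
        for _, d := range domains {
            for i := 0; i < len(d); i++ {
                freq[d[i]]++
            }
        }

        var nodes []*node
        for c, n := range freq {
            nodes = append(nodes, &node{count: n, char: c})
        }
        for len(nodes) > 1 {
            // Repeatedly merge the two least frequent nodes; re-sorting each
            // round is sloppy but fine for an alphabet of a few dozen symbols.
            sort.Slice(nodes, func(i, j int) bool { return nodes[i].count < nodes[j].count })
            merged := &node{count: nodes[0].count + nodes[1].count, left: nodes[0], right: nodes[1]}
            nodes = append([]*node{merged}, nodes[2:]...)
        }

        lengths := map[byte]int{}
        var walk func(n *node, depth int)
        walk = func(n *node, depth int) {
            if n.left == nil && n.right == nil {
                lengths[n.char] = depth
                return
            }
            walk(n.left, depth+1)
            walk(n.right, depth+1)
        }
        walk(nodes[0], 0)
        return lengths
    }

    func main() {
        for c, l := range codeLengths([]string{"example.com", "example.org", "example.net"}) {
            fmt.Printf("%q -> %d bits\n", c, l)
        }
    }

The real generator also has to encode the trie structure and per-entry flags, and because everything is bit-packed, a small change to the list tends to shift the rest of the encoded output, which is consistent with the note above that the format is not optimized for small binary diffs.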
Right now, the script lives in a separate repository and is written in Go. This makes it non-obvious for anyone but a handful of Chrome security folks to test a bona fide preload list change.
If the script is integrated into the build process[q][r][s][t], it would also be easy to:
As of 2016-03-07, HSTS preload updates to the source code result in a new 500KB blob every time, to the point that I can’t upload CRs the normal way anymore.
TODO
Sites may preload expect-ct in the near future.
TODO
People who could be involved:
TODO
I suspect that the advice about HSTS out there is messy. Some of it will be outdated, while some of it will encourage using an HSTS header without making the tradeoffs clear.
Sites are still making many mistakes.
TODO
A lot of removals are due to subdomains hosted by third parties that either do not support HTTPS or charge so much for it that sites would rather be removed from the preload list than pay.
This is a “living” section of this document; please suggest additions with any concrete information that would be useful.
Here is a list of concrete cases:
Googlers can also view an internal list here.
[1] If this is implemented for enterprise policy using a preference, the setting is also accessible from the Chrome settings extension API.
[2] I can compress it only about 5% using zip or brotli.
[3] Python is common for build scripts, but per agl@ this is “too large a script for Python and also things like SPKI extraction would be very problematic.”
[4] agl@: “Rewriting in C++ would be a pain. I'd like for someone not to waste their time but that would be plausible if BoringSSL was used for the certificate handling bits.”
[5] Note: we’d still run the script manually.
[6] Using ≈20 minutes as a slightly conservative estimate for the ≈20,000 steps.
[a]Do we have a list of the limitations in the Go libraries? (Are the libraries getting fixed?) Do we tell the user that their site might not be accessible to some clients?
[b]No, I don't.
1) I think it's usually a problem on their end – their site probably assumes it's talking to OpenSSL with certain configs.
2) The only reason we reject sites is if they will break on any platform *in Chrome*. So, no, we don't currently tell them (but the automated submission process will fail if I can't connect to them from Go).
[c]This means that
1) Users who install Chrome freshly are not protected
2) Malicious operators can more subtly block Chrome updates.
Much like the public suffix list, things which are critical to security are generally left in the Chrome binary.
[d]CRLSets are also security-critical.
In any case, I think we'd still ship with a subset of hardcoded sites in the binary if we do this. I'll defer the decision until this becomes a problem.
[e]Is it feasible to reuse any of the Safe Browsing code, tweaked to fail closed? The most-used domains could be shipped with the Chrome binary, while other domains that hit an initial hash match might need to download additional data. Would add a "phone home" and a possible point of failure for the less-used sites, though.
In terms of storage, Go's public suffix list uses a table that takes advantage of label-wise overlaps of strings. Might be worth evaluating whether that could produce a smaller stored size: https://github.com/golang/net/blob/master/publicsuffix/gen.go#L590
[f]Reusing the Safe Browsing code should be doable, but I don't know if we can get away with failing closed unless the probability is low enough to be a rounding error.
Fortunately, the majority of entries have the same preload settings, so we can reduce the problem to an efficient membership querying structure for the bulk entries.
*Un*fortunately, we need at least a few bytes for each site if we store anything at all locally. We currently average about 8 bytes per entry (including metadata), and I don't think it'll be possible to get down to, say, 2 bytes per entry.
So, we can hold on to the dream of keeping some local information for every entry shipped with the Chrome binary a little longer, but it's unclear how long.
Do you know how many bytes per entry the Go code uses?
[g]It depends on how much overlap there is in the data set. I'm also not sure how much space the index structure would take up. I think the best way would be to build a table based on the HSTS data set, build a program with the resulting code, and introspect the size of the relevant data structures.
[h]Just tried, and it failed with "text offset 47098 is too large, or nodeBitsTextOffset is too small". Not sure how far I want to debug this. :-/
[i]Fair enough. That's saying that the size of the string storage would be 47k, which is reasonably close to the 64k compressed you mentioned that the existing system takes. With overhead it doesn't sound like it would be that much of a win.
[j]I did manage to get the script to finish by commenting out some checks. It comes out to 248KB gzipped.
[k]Is that the source form or the compiled form? The generated code has lots of comments and other noise.
[l]This seems like it'd be problematic. If your site is one that's hit by a false positive, you're stuck.
[m]Definitely. I'm talking Miller-Rabin kinds of false positives – say, 2^-64 for a given domain.
We could also scan for false positives against all the domains Google knows at generation time, and include a tiny blacklist in the generated data. That might let us get away with 2^-32, if it matters.
[n]Ah, okay.
[o]_Marked as resolved_
[p]_Re-opened_
[q]I did such a change in https://codereview.chromium.org/658723002/ but there wasn't interest at the time. Now it is 18 months old and the format has been rewritten, so it can't be used as-is, but maybe parts of it can be reused.
The tricky part was extracting certificate information, since standard Python doesn't have that functionality.
[r]Oh, awesome! How slow/fast do you think that patch would be if we adapted it again?
[s]I didn't remember but I found this message I sent to agl at the time:
"Hmm, checking the performance of the python version it seems to take several hundred milliseconds. It looks like the pyasn1 library is not very efficient, needing several ms to parse a certificate. Bad python programmer! Ah, twice as fast with a trivial change in the BitStringDecoder."
But this was before the huffman step and with a smaller database. Somewhere between a few hundred ms and a few seconds I would guess but I don't know.
[t]Hmm, so it seems that parsing is slow. But we have few certs to parse.
What if we started by porting the JSON parsing and Huffman tree to Python/C++ and turned it into a build step? Pins change very little, so we would benefit from much easier JSON updates on almost all commits.
[u]I'd say that the average build step needs time on the order of seconds. It may become 0.05s wall time only by running many at the same time, and that is not a fair comparison.
[v]True, but this task is CPU-bound. To compete with the average build step, we need decently low CPU time (i.e. multiple seconds of 1 CPU is definitely too much).
[w]Almost all build steps are CPU bound if you have enough RAM. The only problem with a multi-second generation time would be if nothing else could be done at the same time, because it has to be run before everything else or after everything else, and I don't think that would be the case.
The average build step seems to be around 1.35 CPU seconds in one calculation I just did, and people have no qualms about adding new compilation units or other things that add much more than 1.35 CPU seconds to the compilation time. Doing the right thing is worth much more than a CPU second IMHO.
(there are files in Blink that I think need 30 seconds to compile)
[x]HSTS is definitely something we'll be talking about as we do evangelism as a part of the MOARTLS outreach. It's pretty far along the curve though, since the site needs to get everything else done before HSTS is something they can consider.
[y]For the average site, it's pretty far along the curve. But it's becoming pretty mainstream, so much so that HPKP is now the one at the end of the curve.
In particular, a lot of new sites opt for HSTS. I guess these don't need as much help from evangelism, but the preload list growth indicates that there are a lot of them.
[z]+rmh@google.com Have you done this sort of outreach before?
[aa]I am the CT Product Manager. I also ran the MS root program amongst other things which has some similarities at least in spirit.
[ab]https://https.cio.gov/hsts/ seems to be growing as a reference for non-gov sites. Happy to make sure it has correct and up-to-date information. Issues and PRs welcome here: https://github.com/gsa/https