TL;DR
Our paper investigates BoPETs, client-server applications that aim to hide sensitive user data from untrusted servers by encrypting this data but allowing the server to perform certain operations on it. As an example of a BoPET, we analyze a system called Mylar, designed by Popa et al. for building Web applications on top of encrypted data.
We demonstrate that Mylar is vulnerable to passive and active attacks by adversarial servers. In general, our paper shows why the problem of securing client-server applications against active attacks is very hard and remains unsolved.
What is the purpose of this guide?
We want to set the record straight, as there are now several different versions of the Mylar paper, webpage, and associated security claims.
After Popa et al. obtained a confidential draft of our paper, they changed the Mylar webpage several times (without explaining the changes), erased their own previous statements, added claims that never appeared in the original Mylar paper, and replaced links to that paper with a heavily revised tech report. They also posted a technically incorrect “response,” even though our paper was not public at the time.
These attempts to retroactively rewrite a published paper and erase evidence of previously made claims impede scientific progress and confuse the community’s understanding of the field. They also put users of systems like Mylar at risk because the designers make contradictory statements about its security guarantees in different places at different times.
What is Mylar?
Mylar is a system for building Web applications on top of encrypted data. It is described in the NSDI 2014 paper by Popa et al. Popa et al. claimed that Mylar is secure against arbitrary attacks by a malicious server fully controlled by the adversary. Here is what Popa et al. say in the Mylar paper:
"Mylar allows users to share keys and encrypted data securely in the presence of an active adversary" (Abstract)
"Both the application and the database servers can be fully controlled by an adversary: the adversary may obtain all data from the server, cause the server to send arbitrary responses to web browsers, etc. This model subsumes a wide range of real-world security problems, from bugs in server software to insider attacks. Mylar also allows some user machines to be controlled by the adversary, and to collude with the server. This may be either because the adversary is a user of the application, or because the adversary broke into a user's machine. We call this adversary active, in contrast to a passive adversary that eavesdrops on all information at the server, but does not make any changes, so that the server responds to all client requests as if it were not compromised" (Section 3.4)
"Mylar is a novel web application framework that enables developers to protect confidential data in the face of arbitrary server compromises” (Conclusion)
The Mylar webpage links to a different version of the paper. Also, it asserts that Mylar protects data confidentiality only against a passive attacker. How come?
An archived copy of the original Mylar webpage can be found here. In July of 2016, while our paper was undergoing peer review, all claims of security against active attacks were erased from this page and retroactively replaced with much weaker statements, which directly contradict the claims by Popa et al. in their NSDI 2014 paper that Mylar protects data and query confidentiality against active attacks
In September of 2016, after we sent our draft to Popa et al., the Mylar webpage was modified yet again. The security claims were retroactively changed for the second time (again, without explanation), and the link to the NSDI 2014 paper on the front page was replaced with a link to a revised tech report. The actual NSDI 2014 paper is available here.
Is Mylar secure?
Mylar does not protect confidentiality of users’ data and queries against either passive or active attacks. In our paper, we show three types of attacks against Mylar and Mylar-based Web applications.
The first attack exploits static metadata that Mylar leaves unencrypted on the server. It can be performed even by a completely passive attacker who obtains a static, one-time “snapshot” of the server.
The second attack exploits access patterns of Mylar-based applications. It can be performed by a persistent passive attacker who observes server operations for a period of time.
The third attack exploits the fundamental design flaw at the heart of Mylar. Mylar trusts the server to convert users’ query tokens for searching over encrypted data. We show how an actively malicious server can use this to perform brute-force dictionary attacks against any user query. In the case of Mylar-based kChat application, this attack recovers all user queries and 70% of the keywords in all chats stored on the server.
Are the first two attacks out of scope for Mylar?
The metadata attack is in scope and can be easily performed by any attacker considered as part of Mylar’s threat model (see Section 3.4 of the NSDI 2014 paper).
The access-patterns attack can be viewed as out of scope because the Mylar paper says the following: “Mylar does not hide data access patterns, or communication and timing patterns in an application”. Of course, just because Popa et al. decided not to hide this information, does not mean that a real-world attacker would not exploit it. In general, this is a hard open problem that affects many systems, not only Mylar. Our analysis helps understand how much is leaked by access patterns in realistic applications (a lot!).
Is your active attack the same as described in section 5.4 of the Mylar paper?
No. The attack described in the Mylar paper involves an adversary who “creates a document containing all the words in a dictionary, and gives the user access to that document” (see Section 5.4 of the Mylar paper). This is a generic attack against any searchable encryption scheme: the adversary learns all queries that match keywords in the adversary’s documents. It is not specific to Mylar and has been known since before Mylar was invented.
Our attack is much stronger and exploits the fundamental design flaws of Mylar. The dictionary is kept offline. Instead, the adversary creates a legitimate or even empty document on the server. If the user has access to that document, all of that user’s queries are compromised. In contrast to the generic attack, our attack recovers even the queries that do not match any word in any document stored on the server. This threat is never mentioned in the Mylar paper.
Even if Mylar were implemented on top of ORAM or some other storage that hides all data-access patterns from the Mylar server and thus foils the generic attack mentioned in the Mylar paper, Mylar would still remain completely vulnerable to our dictionary attack.
The Mylar webpage claims that allow_search prevents this attack. Is this true?
No.
In our paper, we explain two ways to attack user queries. The first involves a malicious server adding the user to a document (this is the attack that allow_search was intended to prevent). In the released Mylar codebase, allow_search is not functional, containing one line of code which updates an otherwise unused variable and calls an obsolete method without any of the needed parameters. Even if allow_search were actually implemented, its semantics are highly counterintuitive. When the user is told that someone shared a document with him and asked “Do you want to be able to search this document?,'' the user must understand that answering “Yes'' can compromise all of his queries over any document --- not just the document he is being asked about. The contents of these documents will be compromised, too, including private documents that are not shared with anyone. This risk is never mentioned in the Mylar paper.
The second avenue for attacking user queries is a collusion attack: if Alice voluntarily shares a document with Bob, and Bob colludes with the server, the server can perform brute-force attacks on all of Alice’s queries over all of her documents, whether shared with Bob or not. allow_search, even if implemented and deployed properly, does nothing to prevent the collusion attack.
Wasn’t Mylar formally proved secure?
Popa and Zeldovich presented a proof of security for the multi-key searchable encryption scheme underlying Mylar in their 2013 tech report. Unfortunately, their security definitions do not imply anything about the security of either their encryption scheme, or Mylar as a system. As we show in our paper, it is easy to construct a counterexample which is obviously insecure yet satisfies the definitions of Popa and Zeldovich.
Our counterexample is adapted from the 2006 paper by Curtmola et al. Popa and Zeldovich cite this paper but do not acknowledge that the counterexample applies to their own definitions, too.
The Mylar webpage asserts that Mylar is secure against active attacks if used in combination with a system called Verena. Is this true?
This assertion was added retroactively, in July of 2016. It never appeared in the original Mylar paper, the Mylar webpage, or the Verena paper. In fact, the paper describing Verena claims that Verena relies on Mylar for confidentiality, so the security claims are circular.
We did not perform a detailed analysis, but Verena appears to focus on protecting integrity of the data returned by the server to the clients. None of our attacks violate this integrity, thus Verena does not appear to do anything to prevent our attacks.
Last update: September 21, 2016