GlobaLeaks Application Security

Design and Details

Release 2.0 as of June 2013 (updated August 2016)

Goal

Introduction

Key concepts

Tip

Recipient interactions

Whistleblower interactions

Tip elements

Authentication

Authentication Matrix

Authentication Methods

Password

Receipt

Technical Implementation

Receipt

Password

Bruteforce protection

Password Security

Password Strength

Password Change on First Login

Password Lockout

Password Recovery (Not yet implemented)

Password Storage

Entropy sources

Web Application Security

Session Management

XSRF Prevention

Input Validation (Server)

(File) Content-Type Validation

Form Autocomplete OFF (Client)

Input Validation (Client)

CORS Security

Enhanced HTTP Security Headers

Privacy related HTTP headers

Anchor tags and external URLs

Crawlers Policy

DoS resiliency approach

Delivery task

Notification task

Server side data protection

File encryption

Secure file delete

Secure db delete

Exception logging and redaction

Related Project Documentation

Goal

This document describes design choices and details all of the Application Security features implemented in the GlobaLeaks software.

Please note that this document does not include details already part of the GlobaLeaks Threat Model.

Introduction

The GlobaLeaks software conforms to industry standard best practices for application security by following OWASP Security Guidelines.

Specifically, the software follows the OWASP guidelines for REST-driven interfaces and its extensions.

GlobaLeaks is made up of two main software components: GLBackend and GLClient. GLBackend is a Python backend that runs on a physical server.

The GLBackend exposes a REST API which the GLClient interacts with.

GLClient is a client side web application that interacts with GLBackend only through XHR.

This means that the GLClient may be served from a domain that is different from that on which GLBackend is running – it could even be installed as a browser plugin.

For more details on the security considerations of this approach see the “CORS Security” section.

Key concepts

This section explains the key security concepts when a Whistleblower submits a Tip to a number of Recipients through GlobaLeaks.

Tip

A Tip created by a Whistleblower is stored in the GLBackend’s database.

The Tip holds references to the files that have been uploaded, any comments or private messages, and metadata about the tip like the time of creation, which recipients were given access, and the number of accesses.

After a configurable amount of time a Tip will expire and be deleted by the GLBackend.

After deletion neither the Recipients nor the Whistleblower will be able to access the Tip.

The number of times the Tip has been viewed by Recipients can be shown to both the other Recipients and the Whistleblower.

If this feature is enabled, it lets the Whistleblower understand the progress of his submission.

If the Whistleblower does not access the Tip for a configurable amount of time, the Whistleblower's access to the Tip will expire.

Recipient interactions

Access to a Tip is granted to every Recipient through a globally unique randomly generated access token.

The Node Administrator may also define a limit on the number of times Recipients can view the Tip.

When a Recipient views a Tip, they see the Whistleblower's responses to the questionnaire, some of the Tip’s metadata, and forms to download files associated with the Tip.

If a Receiver has configured a PGP encryption key then the files attached to the Tip will be encrypted with their public key.

Whistleblower interactions

Every time a Whistleblower performs a Submission on a GlobaLeaks node, a Tip is created and the Whistleblower is given a randomly generated 16 digit Receipt (e.g. 5454 0115 3135 3982).

This Receipt is a secret authentication token that lets a Whistleblower view the Tip.

Further details on the security properties of the Receipt are discussed in Authentication Methods->Receipt.

Tip elements

The Whistleblower questionnaire answers – text and other fields filled in a Form by the WB.

The Metadata – includes time of Tip creation, date of last access, number of views, number of files etc.

The Files – constrained to a max size by the Node Administrator.

The Comments – text written by either the WB or by the recipients, which is visible to all the Recipients and to the WB.

The Recipients – a list of node users allowed to access the Tip.

The Messages – one to one messages between the WB and a Recipient.

Authentication

This section describes how authentication is handled by the GLBackend.

Because the GLBackend may be deployed as a Tor Hidden Service, authentication to the GLBackend works in different modes depending on the type of user.

Authentication Matrix

The table below summarizes the ways different Users – described in the General Architecture Specification – authenticate themselves to the GLBackend:

User type | Authentication method
Node Administrator | username and password
Node Administrator | API token
Recipient | username and password
Whistleblower | receipt

Authentication Methods

Supported authentication methods are described as follows.

Password

Node Administrators and Recipients MUST use password based authentication.

On the GlobaLeaks login interface, Node Administrators and Receivers enter their respective username and password. If the submitted password is valid, the system grants access to the functionality available to that user.

Receipt

A Whistleblower MAY access his or her Tip by using a Receipt, which is a randomly generated 16 digit number created by the GLBackend when the Tip is first submitted.

This Receipt is not stored by the GLBackend; instead, the output of a one way function (of the form: F(Receipt) → X) is stored.

Every time a Whistleblower with a valid Receipt wishes to view their Tip, the GLClient can compute the output of that function and pass it to the GLBackend.

Since the GLBackend only ever sees the output of the function and not the Receipt itself, the Receipt acts as an authentication token for the Whistleblower.

Technical Implementation

Receipt

The Receipt is generated by the GLBackend immediately after a Whistleblower’s submission and is used by the Whistleblower to authenticate as the author of the Tip.

If the secrecy of the Receipt is compromised, it is possible to access the Tip with the privileges of a Whistleblower.

The security properties of a Receipt can be customized by the Node Admin by specifying a regexp that the random Receipt should match.

This will be implemented by using https://bitbucket.org/leapfrogdevelopment/rstr, but replacing the standard python random function with Crypto.Random.random.

For example, specifying:

[a-zA-Z0-9]{2}-[a-zA-Z0-9]{2}-[a-zA-Z0-9]{3}

generates receipts such as:

aD-h8-Hn0

h2-j8-bj1

By default the Receipt generated by the system for a Whistleblower is a 16 digit number.

Note: The reason for choosing a 16 digit number as the default is that it resembles a standard phone number, making it easier for the Whistleblower to conceal the Receipt of their submission and giving them plausible deniability about the significance of the digits.
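As a rough illustration of the default Receipt format, the sketch below draws 16 random digits and groups them for display. It uses Python's standard `secrets` module as a stand-in for Crypto.Random.random, and the function names `generate_receipt` and `format_receipt` are hypothetical, not part of GLBackend.

```python
import secrets

DIGITS = "0123456789"


def generate_receipt(length=16):
    """Return a numeric receipt whose digits come from the OS CSPRNG.

    Assumption: `secrets` is used here in place of Crypto.Random.random.
    """
    return "".join(secrets.choice(DIGITS) for _ in range(length))


def format_receipt(receipt, group=4):
    """Group the digits for display, e.g. '5454 0115 3135 3982'."""
    return " ".join(receipt[i:i + group] for i in range(0, len(receipt), group))


if __name__ == "__main__":
    print(format_receipt(generate_receipt()))
```

A custom receipt format defined by a regular expression would need a generator driven by that expression; the sketch covers only the numeric default.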

The receipt regular expression is a context-based variable, so a node can have different receipt formats in use at the same time.

Authentication with a receipt is performed with the login/password mechanism, using a fixed username of “wb”.

If the login is valid, the authentication API returns a cookie with a valid Session ID, used as described in the section: “Session Management”.

Password

The username and password fields are sent as payload of an HTTP POST request.

Here is an example:

Request:

POST http://example.onion/authentication

{"username": "alice", "password": "mypassword"}

Response:

{"session_id": RANDOM_SESSION_ID}

If authentication is successful a session identifier for user Alice is generated and is handled as described in the section: “Session Management”.

Security considerations

The confidentiality of the transmitted password is provided by either Tor Hidden Services or SSL.

It should be noted, however, that if the GlobaLeaks node is accessed through Tor2web, client passwords are disclosed to the Tor2web node. This would allow a rogue Tor2web node to impersonate the user.

The Privacy Badge described in the Threat Model advises the user of this kind of risk.

Receiver Password

Access to a Receiver Tip is granted by entering a password. When a Receiver accesses their own Tip, received through the notification service, the username field is already filled in and the Receiver only has to enter their password.

Bruteforce protection

Brute force protection is applied to the Whistleblower’s Receipt and to the Receivers’ and Admin’s Passwords.

If a threshold of more than 10 globally failed login attempts within 60 seconds is detected, the following additional security measures are applied to make the brute force attack less efficient:

The goal is to slow an attacker down to the point where a brute force attack cannot succeed within a reasonable amount of time.
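The detection threshold (more than 10 failed logins across all users within 60 seconds) can be tracked with a sliding window. The following is a minimal sketch under that assumption; the class name `FailedLoginMonitor` and its methods are hypothetical and do not describe the actual GLBackend code.

```python
import time
from collections import deque


class FailedLoginMonitor:
    """Counts failed logins across all users inside a sliding time window."""

    def __init__(self, threshold=10, window=60.0, clock=time.monotonic):
        self.threshold = threshold   # failures tolerated inside the window
        self.window = window         # window length in seconds
        self._clock = clock          # injectable clock, useful for testing
        self._failures = deque()     # timestamps of recent failures

    def _prune(self, now):
        # Drop failures that have fallen out of the window.
        while self._failures and now - self._failures[0] > self.window:
            self._failures.popleft()

    def record_failure(self):
        now = self._clock()
        self._failures.append(now)
        self._prune(now)

    def under_attack(self):
        """True when more than `threshold` failures are inside the window."""
        self._prune(self._clock())
        return len(self._failures) > self.threshold
```

When `under_attack()` returns True, the login handler would switch on whatever additional measures the node applies.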

Password Security

The following password security measures are put in place in order to enforce the security of the system.

Password Strength

For password strength we follow: https://xkcd.com/936/

We use https://github.com/lowe/zxcvbn as a meter for evaluating the strength of a user's password. A password is considered acceptable if the estimated time to crack is > 10^8 (zxcvbn strength level 3); the password is marked as strong if the estimated time to crack is > 10^10 (zxcvbn strength level 4; max level).

The estimation of the time to crack is based on presence of common words, patterns and estimated entropy of the password.

Password Change on First Login

This feature forces the Receiver to change their password on first login. It lets the administrator prepare Receiver accounts on the GlobaLeaks platform with a default password or an administrator-chosen password while avoiding insecure use of the platform.

Password Lockout

When a user (Receiver or Admin) fails to authenticate, the response is delayed in proportion to the number of failures.
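A delay that grows with the number of failures could look like the sketch below. The one second step and the 30 second cap are illustrative assumptions (the cap keeps the delay bounded, so the delay is proportional only up to that point), and `login_delay` is a hypothetical name.

```python
def login_delay(failed_attempts, step_seconds=1.0, max_seconds=30.0):
    """Seconds to wait before answering a login request.

    The delay grows linearly with the number of failures and is capped.
    Both `step_seconds` and `max_seconds` are assumed values.
    """
    failures = max(int(failed_attempts), 0)
    return min(failures * step_seconds, max_seconds)
```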

Password Recovery (Not yet implemented)

If a Password Lockout event occurs, the Receiver is notified via their configured notification system and is invited to reset their password by visiting a URL.

If the Receiver has more than one notification method configured, the URL for resetting the password is sent to one address, whereas a randomly generated token for accessing their Tip is sent to the other. The first message will contain details as to where they will be able to retrieve the token.

The user will then visit the provided URL and, if a token was also sent, will be prompted to enter it into the system.

Password Storage

The Receiver's password is stored hashed with scrypt, using a random 128 bit salt that is unique for each user. The salt derivation also includes the Receiver’s username.

The Node Administrator password follows a different rule: because the Administrator has no username usable for this operation, 128 bits of entropy are collected the first time a node is initialized, and this random salt is used to hash the Admin password.

The same approach is used for the Receipt. Receipts are randomly generated strings, but are kept hashed using 128 random bits of salt (different from the Admin salt, but also collected at node initialization).

The 128 bits of entropy are obtained using the Crypto.Random Python module, which interfaces the application with the kernel's secure random source.

We use the py-scrypt python bindings to scrypt (https://bitbucket.org/mhallin/py-scrypt).
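The general scheme (a 128 bit random salt plus scrypt) can be sketched with Python's standard library. This is only an illustration: it uses `hashlib.scrypt` rather than py-scrypt, the cost parameters are assumed values, and the helper names are hypothetical.

```python
import hashlib
import hmac
import os

# Cost parameters are illustrative assumptions, not GlobaLeaks settings.
SCRYPT_N = 2 ** 14
SCRYPT_R = 8
SCRYPT_P = 1


def new_salt():
    """Return 128 bits of salt from the kernel random source."""
    return os.urandom(16)


def hash_secret(secret, salt):
    """Hash a password or receipt with scrypt and the given salt."""
    return hashlib.scrypt(
        secret.encode("utf-8"),
        salt=salt,
        n=SCRYPT_N,
        r=SCRYPT_R,
        p=SCRYPT_P,
        dklen=64,
    )


def verify_secret(secret, salt, expected):
    """Recompute the hash and compare it in constant time."""
    return hmac.compare_digest(hash_secret(secret, salt), expected)
```

Only the salt and the hash would be stored; the password or receipt itself is discarded after hashing.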

Entropy sources

The main source of entropy for the platform is /dev/urandom.

In order to increase the entropy available to the system, the platform integrates the haveged algorithm: https://github.com/globaleaks/GlobaLeaks/issues/720.

The solution integrated for this is the haveged daemon: http://www.issihosts.com/haveged/

Web Application Security

This section describes the Web Application Security functionalities implemented by following the OWASP REST Security Cheat Sheet.

Session Management

We follow OWASP Session Management Cheat Sheet security guidelines.

Once a user is authenticated a random Session ID is generated for them.

A Session will expire according to a default user idle timeout:

- Whistleblower 10 hours

- Receiver 10 minutes

- Admin 3 minutes

The user may also explicitly terminate the session by logging out or just by closing a Web Browser Window.

The Session ID is generated randomly by the server via Crypto.Random.random (hex(random.getrandbits(256))) and is 256 bits long (hex encoded).

The Session ID is sent as an HTTP header (X-Session).
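The session properties described so far (a 256 bit hex encoded identifier and role-specific idle timeouts) can be sketched as follows. The `Session` class is hypothetical, and `secrets.token_hex` is used here as a stand-in for Crypto.Random.random.

```python
import secrets
import time

# Idle timeouts in seconds, taken from the list above.
IDLE_TIMEOUT = {
    "whistleblower": 10 * 60 * 60,
    "receiver": 10 * 60,
    "admin": 3 * 60,
}


class Session:
    """A server-side session with a random ID and an idle timeout."""

    def __init__(self, role, clock=time.monotonic):
        if role not in IDLE_TIMEOUT:
            raise ValueError("unknown role: %r" % (role,))
        self.session_id = secrets.token_hex(32)  # 256 random bits, hex encoded
        self.role = role
        self._clock = clock
        self._last_seen = clock()

    def touch(self):
        """Record activity, restarting the idle timer."""
        self._last_seen = self._clock()

    def expired(self):
        """True once the session has been idle longer than its role allows."""
        return self._clock() - self._last_seen > IDLE_TIMEOUT[self.role]
```

A request handler would look up the session by the value of the X-Session header, reject it if `expired()` is true, and call `touch()` otherwise.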

The user has to enter a password to access resources requiring certain privileges.

Example of session management with redis + tornado (almost identical to cyclone)

https://gist.github.com/1735032

XSRF Prevention

Every request must be protected by an XSRF token.

The XSRF mitigation strategy we implement is “Double Submit Cookies”: a cookie called “XSRF-TOKEN” is generated, and every request to be protected from XSRF sets the custom HTTP header “X-XSRFToken” to that value.

XSRF protection should be applied whenever a request is made by an Authenticated User.
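The server-side half of a double submit check reduces to comparing the cookie value with the header value. The sketch below assumes cookies and headers are available as plain dictionaries; `new_xsrf_token` and `xsrf_ok` are hypothetical names and this is not the cyclone implementation.

```python
import hmac
import secrets

COOKIE_NAME = "XSRF-TOKEN"
HEADER_NAME = "X-XSRFToken"


def new_xsrf_token():
    """Generate a random token to be set as the XSRF cookie."""
    return secrets.token_hex(16)


def xsrf_ok(cookies, headers):
    """Accept the request only if the header repeats the cookie value.

    A page on another origin cannot read the cookie, so it cannot
    copy the value into the custom header.
    """
    cookie = cookies.get(COOKIE_NAME, "")
    header = headers.get(HEADER_NAME, "")
    if not cookie:
        return False
    # Constant-time comparison on bytes to avoid timing side channels.
    return hmac.compare_digest(cookie.encode("utf-8"), header.encode("utf-8"))
```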

We will be using the cyclone implementation of XSRF protection as seen here: https://github.com/fiorix/cyclone/blob/master/cyclone/web.py#L904

Paired with: AngularJS XSRF implementation of the client side part of it: https://github.com/angular/angular.js/blob/master/src/ng/http.js#L519

The two will work happily together as soon as this patch is merged upstream: https://github.com/fiorix/cyclone/pull/90 

Input Validation (Server)

In general we adopt a whitelist based input validation approach. The user supplied input is checked against a regular expression that it must conform to. If it does not match, a generic error is thrown.

The message exchange format used for communication between client and server is JSON.

The body of such requests is compared against a specification that it must conform to. Such messages are specified as a series of key, value pairs, where the value is a regular expression that the value of the corresponding key in the client supplied input must match.
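A whitelist check of this kind can be sketched as below. The specification `AUTH_SPEC`, its regular expressions, and the names `validate` and `InvalidInput` are all hypothetical; they only illustrate the key-to-regular-expression idea and are not the GLBackend message definitions.

```python
import json
import re


class InvalidInput(Exception):
    """Generic error raised for any input that fails validation."""


# Hypothetical specification: each key maps to the regexp its value must match.
AUTH_SPEC = {
    "username": r"[a-zA-Z0-9_.-]{1,64}",
    "password": r".{1,128}",
}


def validate(body, spec):
    """Parse a JSON body and check it against a key -> regexp specification."""
    try:
        message = json.loads(body)
    except ValueError:
        raise InvalidInput("invalid input")
    # The message must be an object with exactly the expected keys.
    if not isinstance(message, dict) or set(message) != set(spec):
        raise InvalidInput("invalid input")
    for key, pattern in spec.items():
        value = message[key]
        if not isinstance(value, str) or re.fullmatch(pattern, value) is None:
            raise InvalidInput("invalid input")
    return message
```

The same generic error is raised for every failure, so the response does not reveal which field or rule was violated.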

(File) Content-Type Validation

Files submitted by an anonymous Whistleblower are themselves a potential attack vector. The journalist (or Receiver, to use the generic name) may be an unskilled user who is unaware of security threats.

For this reason, Node Administrators need to make it harder to exploit such users.

To this end GlobaLeaks offers Node Administrators the possibility to enable a FileProcess layer and to configure some validation checks.

A file is provided to Receivers only if it passes all the configured security checks.

Every file submitted by a Whistleblower is passed through a filter to detect potential risks associated with the opening of the file by the Receiver.

If the uploaded file is considered to be potentially harmful to Receivers, a Disclaimer is displayed; Receivers must accept the Disclaimer before being able to download the file.

The hash of each uploaded file is also always displayed to the Receiver, allowing them to check through third party virus scanners whether the file is well known malware.

As an example, some extensions that could be considered potentially harmful are: .exe, .com, .pdf. The platform will come with a built in list of harmful extensions and will give the node admin the ability to extend the extension blacklist.
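The two checks mentioned here, the extension blacklist and the file hash, can be sketched as follows. The built-in list reuses the example extensions from the text, SHA-256 is an assumed choice of hash (the text does not name one), and the function names are hypothetical.

```python
import hashlib
import os

# Example extensions from the text; a real built-in list would be longer.
HARMFUL_EXTENSIONS = frozenset({".exe", ".com", ".pdf"})


def needs_disclaimer(filename, extra_extensions=()):
    """True if the file extension is on the built-in or admin-extended list."""
    extension = os.path.splitext(filename)[1].lower()
    blacklist = HARMFUL_EXTENSIONS | {e.lower() for e in extra_extensions}
    return extension in blacklist


def file_digest(data):
    """Hex digest shown to the Receiver (SHA-256 is an assumption)."""
    return hashlib.sha256(data).hexdigest()
```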

Input Validation (Client)

GLClient communicates with the backend via JSON. All input received from the server is considered untrusted, and all message payloads that are rendered in the user's browser are automatically sanitized by the Angular.js Web Application framework.

Every string that is to be rendered in the user's browser passes through the $sanitize filter. For more details see: http://docs.angularjs.org/api/ngSanitize.$sanitize and https://github.com/angular/angular.js/blob/master/src/ngSanitize/sanitize.js.

CORS Security

Since we want to be able to deliver the GlobaLeaks web application (GLClient) from a source that is different from the one running GLBackend, we will be using very lax CORS headers on all resources.

In particular we will allow requests from any Origin:

Access-Control-Allow-Origin: *

The impact of maliciously crafted cross origin requests, though, is mitigated by using XSRF tokens in all authenticated resources.

Without this in place we would not be able to serve GLClient from newswebsite.org and have the requests be made to theglobaleaksnode.com.

Moreover, allowing CORS on all requests to theglobaleaksnode.com will allow us to generate cover traffic toward the GLBackend instance by users that are not whistleblowers, providing some degree of plausible deniability (i.e. having performed a DNS lookup to theglobaleaksnode.com is not in itself an indication that you are a whistleblower, because every user of newswebsite.org also performs such lookups).

Security related HTTP headers

When GLBackend is accessed via HTTPS, we set the Strict Transport Security header to:

Strict-Transport-Security: max-age=8640000; includeSubDomains

Web browsers usually attach a referrer to their HTTP requests as the user follows links. To ensure that the referrer is stripped when a user follows a link out of the domain, the following header is set:


Referrer-Policy: no-referrer

When setting up Content-Type for the specific output, we avoid the automatic mime detection logic of the browser by setting up the following header:

X-Content-Type-Options: nosniff

In addition, in order to explicitly instruct browsers to enable XSS protections, the GLBackend injects the following header:

X-XSS-Protection: 1; mode=block

Crawlers Policy

In order to instruct crawlers to not index or cache node data, the GLBackend injects the following HTTP header:

X-Robots-Tag: noindex
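Collected in one place, the headers listed in this section and the previous one could be produced by a small helper like the one below. The function name `security_headers` is hypothetical; the header values are the ones given in the text.

```python
def security_headers(https):
    """Return the security and crawler headers described above.

    Strict-Transport-Security is added only for HTTPS responses.
    """
    headers = {
        "Referrer-Policy": "no-referrer",
        "X-Content-Type-Options": "nosniff",
        "X-XSS-Protection": "1; mode=block",
        "X-Robots-Tag": "noindex",
    }
    if https:
        headers["Strict-Transport-Security"] = "max-age=8640000; includeSubDomains"
    return headers
```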


Web Browser Privacy

The Tor browser strives to remove as much identifiable information from requests as possible, but it is still not perfect. For normal web browsers the situation is much worse. The goals here are twofold: reduce the amount of application data and metadata stored on the client’s machine, and reduce the amount of information about the client shared with the server.

Privacy related HTTP headers

The GlobaLeaks server by default sends the following headers to instruct clients’ browsers not to store resources in their cache. For browsers that comply with these headers, including the Tor browser, this prevents resources served by the server from reaching the disk via the client’s caching mechanism.

Cache-control: no-cache, no-store, must-revalidate

Pragma: no-cache

Expires: -1

It is worth noting that the User-agent header is examined by the application and the UAs associated with known web downloading tools like curl and wget are blocked.

Additionally, if an unhandled exception is thrown by the client application loaded in the user's browser, the UA is recorded by the backend for failure analysis when the failure is reported.

Anchor tags and external URLs

In order to protect user privacy, attention has been given to the various ways a user may leave the application for an external website and thereby leak information about their activity. The number of clickable external anchor tags is kept to a minimum. The content generated by whistleblowers, recipients, and translators is strictly escaped to prevent the insertion of malicious links.

For links that point outward to external hosts the following safeguards are in place:

<meta name="referrer" content="never">

<a href="https://external_url" rel="noreferrer">external_url</a>

No use of local store of cookies

To prevent the potential abuse of origin violations in the transfer of cookies in HTTP headers, cookies are never set by the GLBackend or GLClient.

Form Autocomplete OFF (Client)

Forms implemented by the platform make use of the HTML5 autocomplete attribute to instruct the browser not to cache user data for predicting and autocompleting forms on subsequent submissions.

The implementation involves setting autocomplete="off" on the relevant forms or fields.

https://www.w3.org/TR/html5/forms.html#autofilling-form-controls:-the-autocomplete-attribute

Standard font usage

The GLClient intentionally does not add any new fonts to the clients and instead relies on the ubiquitous Helvetica.

font: "Helvetica Neue", Helvetica, Arial, sans-serif

The set of loaded system fonts is one of the most effective indicators for fingerprinting an individual browser. See appendix A Table 2 of the EFF’s paper.

https://panopticlick.eff.org/static/browser-uniqueness.pdf

DoS resiliency approach

To avoid application and database denial of service, GlobaLeaks tries as far as possible to follow this pattern:

This approach prevents CPU intensive operations from being executed with a timing and volume chosen by an anonymous user, which could otherwise be used for a resource exhaustion DoS.

Three tasks run periodically: delivery, cleaning, and notification.

These tasks operate on database objects based on the ‘internal status’ of each object. For example, a submission performed by the Whistleblower uses three statuses:

Delivery task

The delivery task reads the recent submissions marked as ‘finalized’ and prepares them for the Receivers: it encrypts the files, creates the Receivers' access, and switches the submission to the ‘first’ status.

Notification task

The notification task runs over the comments, files, Tips and messages, looking for events not yet notified to the Receiver. It extracts the information on each event and queues an email.

Server side data protection

File encryption

A Receiver can submit a personal PGP key; the key is validated prior to a successful import by the GPG software (the interaction is wrapped by the python-gnupg package).

Every uploaded file is saved on disk encrypted with a random symmetric AES key. This key is stored in a ramdisk.

This guarantees that no persistent key is available on disk, so stealing or seizing the GLBackend server does not permit recovery of the encrypted data.

This encryption is used only while a submission is in the ‘submission’ or ‘finalize’ status.

If a Receiver has a PGP key configured, in the delivery task the files are decrypted and re-encrypted on the fly with the Receiver's public key. This process does not use the filesystem, so a plaintext file is never stored on the system.

If one or more Receivers have no PGP key configured, the file is stored in plaintext for them.

A context can be configured to accept only Receivers with a valid PGP key configured. This ensures that a plaintext file is never stored during the node's lifetime.

Secure file delete

Every file deleted by the application is overwritten before its space on the disk is released.

The overwrite routine is performed by a periodic scheduler and acts as follows:

Secure db delete

The platform enables the SQLite secure_delete capability (1678), which makes the database automatically overwrite deleted data on each delete query.
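With Python's standard `sqlite3` module, the setting is a single pragma issued on each new connection. The helper name `connect_secure` is hypothetical; the pragma itself is standard SQLite.

```python
import sqlite3


def connect_secure(path):
    """Open a SQLite database with secure_delete switched on.

    With secure_delete enabled, SQLite overwrites deleted content
    with zeros instead of only marking the pages as free.
    """
    connection = sqlite3.connect(path)
    connection.execute("PRAGMA secure_delete = ON")
    return connection
```

The pragma applies per connection, so it has to be issued every time the database is opened.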

Exception logging and redaction

In order to quickly diagnose potential problems in the software when exceptions in clients are generated, they are automatically reported to the backend. The backend server temporarily caches these exceptions and sends them to the server administrator via email.

In order to prevent inadvertent information leaks, the logs are run through filters that redact email addresses and UUIDs (1799).
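A redaction filter of this kind can be approximated with two regular expressions. The patterns and replacement markers below are illustrative assumptions, not the filters used by the backend.

```python
import re

# Deliberately simple patterns; real filters may be stricter or broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def redact(text):
    """Replace email addresses and UUIDs in a log message with markers."""
    text = EMAIL_RE.sub("[redacted email]", text)
    return UUID_RE.sub("[redacted uuid]", text)
```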

Related Project Documentation

The documentation below is an integral part of the project.
