Running a Login System with an Account Chooser
http://goo.gl/WnwzA to open the document for commenting
http://goo.gl/R8pMP to open the document for viewing
This is a guide for anyone responsible for managing the user account system of a web site. Each chapter will help you create a more sophisticated user account system. You can jump straight to Chapter 1 if you are ready to start modifying your website. The Introduction will give you background on the problems with most user account systems, and the history of how we got there.
This document is constantly evolving. You can follow my Google Plus profile to get notices about changes, or about related issues in Internet Identity research. If you have questions or suggestions, feel free to contact me through my profile, or open the document in commenting mode and use the Insert-Comments feature to add them directly to the document.
Most of the people who will read this guide are computer scientists, and generally computer scientists like to think that account systems should be simple tools. Unfortunately these systems are used by people who will not constrain themselves to our rules, either because they are non-technical users who don’t understand them or because they are hackers trying to break them. Add in legal, privacy, and security issues, and you unavoidably end up with a complex set of requirements that are very hard to fulfill, and in many cases the requirements are conflicting. Hopefully this guide will help you balance and meet those requirements.
The most common computer science requirement of a User Account system is to provide a unique numeric ID for an account. In a “simple” computer science world, there would be one global user account system, similar to DNS, where every person was assigned a single unique numerical ID at birth, and each person also had a perfect way to prove who they were. Every website could then use those user IDs to store information associated with the person.
Obviously that does not exist, and for decades every user account system issued its own IDs (and sometimes usernames) to users. Such systems were “simple” to write, but painful for users.
Then during the early 90s, a “hack” was found that created the foundation of most user account systems on the web. That “hack” was the idea of logging into a website with your email address, and proving you were the owner of that email address by having the site send you an SMTP message with a hyperlink back to the site which contained a long code. For the few of us who considered ourselves identity geeks at the time, this did not seem like an approach that would last a long time. Even SMTP seemed like a fad when we already had high-end systems like Lotus Notes relying on things like PKI, digital certificates, signing, encryption, etc.
However twenty years later, that “hack” is still the most powerful technique that we rely on to build user account systems. From a purist perspective, emails have some downsides as identifiers. Much of this guide will discuss those downsides, and how to handle them. However nothing else, even phone numbers or social network IDs, has come even close to being as powerful an identifier.
Two of the downsides are that users change email addresses over time, and that the same email address is sometimes assigned to different people at different time periods. Because of those, and other complications, almost every website still maintains its own “local ID” system just as user accounts did before the 90s. The one key addition is that those local IDs are then mapped to and from a user’s email address.
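The mapping between local IDs and email addresses can be sketched roughly as follows. This is only an illustration, not a production design: the class name `AccountDirectory`, its methods, and the case-insensitive comparison of addresses are all assumptions made for the example.

```python
import itertools
from typing import Dict, Optional


class AccountDirectory:
    """Maps email addresses to stable local IDs and back (illustrative only)."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self._email_to_id: Dict[str, int] = {}
        self._id_to_email: Dict[int, str] = {}

    @staticmethod
    def _normalize(email: str) -> str:
        # Assumption: addresses are compared case-insensitively.
        return email.strip().lower()

    def register(self, email: str) -> int:
        """Create a new local ID for an email that is not yet registered."""
        key = self._normalize(email)
        if key in self._email_to_id:
            raise ValueError("email already registered")
        local_id = next(self._next_id)
        self._email_to_id[key] = local_id
        self._id_to_email[local_id] = key
        return local_id

    def lookup(self, email: str) -> Optional[int]:
        """Return the local ID for an email, or None if it is unknown."""
        return self._email_to_id.get(self._normalize(email))

    def change_email(self, local_id: int, new_email: str) -> None:
        """The account keeps its local ID when the user's address changes."""
        new_key = self._normalize(new_email)
        if new_key in self._email_to_id:
            raise ValueError("email already registered")
        old_key = self._id_to_email[local_id]
        del self._email_to_id[old_key]
        self._email_to_id[new_key] = local_id
        self._id_to_email[local_id] = new_key
```

Because the data stored for a user hangs off the local ID rather than the address, a changed address only touches the mapping. If the old address is later assigned to a different person, that person registers under a new local ID.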
In our “simple” computer science world, every person would be assigned a “single” ID. In reality, this quickly breaks down because users want to separate their personal and work life. In addition, some people compartmentalize their life further. One compartment might be the way most people know them, but another compartment might be for having affairs, or criminal activity, or dissident behavior. A celebrity might have a compartment for their “regular self,” similar to the Superman character’s “Clark Kent” compartment.
Email addresses turn out to be an amazingly good way for users to create a virtual identity that maps to each compartment in their life. In a large percentage of cases, users try to avoid linking these different identities. One common technique is to use different webmail providers for different email addresses, because the providers look so visually different that it reduces the chance that the user might accidentally perform an action in the wrong account.
So most websites don’t map user accounts to humans, they map them to email addresses, and only the actual human person knows all of their different compartments, along with the email address used to identify each of those compartments in the virtual world.
There are of course some websites that try to get closer to mapping to an actual person, such as government websites for taxes and social services. This will be discussed more in the Identity Verification chapter. There are other websites that try to avoid even knowing a user’s email address to avoid the potential for two websites to correlate a user’s actions across them. That will be discussed in the Social Login and Other IDPs chapter.
So humans have a map of the emails they use, and websites map an email to a local ID. A website’s user account system also has the critical role of authenticating the owner of that email address. Note we did not say authenticate the human, but rather the owner of the email address. The difference is important, as well as powerful, but it also adds complexity.
The simplest way to authenticate the owner of the email address is to use the “hack” of sending them a URL with a code every time they want to login. However when that “hack” first became popular, email services had significant downtime, so websites did not want to be reliant on them. So instead we relied on a scheme that had been used for user account systems that issued their own user IDs instead of relying on email addresses, and that scheme was passwords.
Combining the “hack” with passwords seemed great. The best part was that if the user forgot their password, the website could just use the “hack” again to verify the owner of that email address and let them pick a new password.
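As a rough sketch of that “hack,” the code below creates a link containing a long random code, remembers which email address the code was issued for, and accepts the code only once and only for a limited time. The function names, the one-hour lifetime, and the in-memory dictionary are assumptions for illustration; sending the SMTP message itself is left out.

```python
import hashlib
import secrets
import time
from typing import Dict, Optional, Tuple
from urllib.parse import parse_qs, urlencode, urlparse

TOKEN_LIFETIME_SECONDS = 3600  # assumption: links expire after one hour

# Hash of the code -> (email, expiry time). A real site would use a database.
_pending: Dict[str, Tuple[str, float]] = {}


def _digest(code: str) -> str:
    # Only a hash of the code is stored, so a leaked table cannot be replayed.
    return hashlib.sha256(code.encode("utf-8")).hexdigest()


def create_verification_link(email: str, base_url: str,
                             now: Optional[float] = None) -> str:
    """Return the URL to put in the message sent to the email address."""
    issued_at = time.time() if now is None else now
    code = secrets.token_urlsafe(32)  # the "long code" in the hyperlink
    _pending[_digest(code)] = (email, issued_at + TOKEN_LIFETIME_SECONDS)
    return base_url + "?" + urlencode({"code": code})


def extract_code(link: str) -> Optional[str]:
    """Pull the code back out of the URL when the user clicks the link."""
    values = parse_qs(urlparse(link).query).get("code")
    return values[0] if values else None


def redeem_code(code: str, now: Optional[float] = None) -> Optional[str]:
    """Return the verified email, or None if the code is unknown, used, or expired."""
    checked_at = time.time() if now is None else now
    entry = _pending.pop(_digest(code), None)  # pop makes the code single-use
    if entry is None:
        return None
    email, expires_at = entry
    return email if checked_at <= expires_at else None
```

The same two functions serve both cases described above: verifying a newly registered address, and letting the owner of an address pick a new password after forgetting the old one.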
Everyone loves to hate passwords. Users hate them, and generally give up and use the same small set of passwords across many websites. Security people hate them because there are so many ways to break them. That hatred has steadily increased the last 10 years. Starting around 2008 though, things got much worse.
The short version is that the security of a user’s accounts on the Internet has become equivalent to the security of the least secure website where the user types their password. Or to put it another way, the security of the Internet as a whole is now equivalent to the security level of websites with the worst security. And there are plenty of websites with little to no security.
If you are reading this guide, you are responsible for one of those websites. Unless you work for a firm with hundreds of dedicated security personnel, there generally is no reason for your site to require that users are authenticated with passwords. It reduces the security of your own website to the level of websites with worse security, and it also reduces the security of other websites that have tried to build stronger security.
If there is one thing you should take away from this guide, it is the need to eliminate passwords from your user account system as much as possible. If you are ready to do that, you can skip to Chapter one, but you should probably keep reading to understand why the situation has become so bad.
By that time, SPAM had become a large criminal business generating huge profits. They mostly sent SPAM by creating fake email accounts, and using them to send messages. It was around that time that criminals realized that they could make more money by taking over a real user’s email account, and use it to send SPAM. It was much harder for the abuse systems used by large Internet providers to detect accounts in this state, and to get rid of the SPAM.
The main cost to the criminals was the cost of hijacking a user’s email account. That had mostly been done through techniques like phishing, malware, and dictionary attacks that were targeted at the user’s main email provider. What the hackers then realized is they could apply those same techniques against any website that let users login with an email address. Since most people reused passwords across sites, the hackers just needed to collect a list of the passwords associated with an email address on other websites, and then try those passwords to login to the user’s main email account.
Most websites were easily compromised using techniques such as dictionary attacks. In many cases hackers could also partially break into a website and gain control of the web page that showed its login form. Then whenever users typed their email address and password on the form, it was logged by the hackers. In a lot of cases the hackers broke completely into the website and stole their entire user account system, including the list of email addresses and passwords. Even if the passwords were encrypted, there are special techniques that let the hackers reverse engineer those password lists.
The sad response of much of the computer science and security community was to put the burden of solving this problem back on the user. Users were told to use different passwords on each site, or at least their “most important” sites. Then vendors started selling software to manage these different passwords, but invariably they failed when the users ended up on a computer (or frequently a mobile phone) that did not have the software and thus they could not login to websites.
A small community of Identity geeks had seen this problem coming, and had been working on alternatives since around 2000, but they were generally “solutions designed by computer scientists that could only be used by other computer scientists.”
There was one sign of hope during that time. Some websites started to launch “service providers” that exposed APIs to the private information that the website stored with the user’s account. Service providers were created for photos, blogs, files, ecommerce, address books, etc.
Other websites, or desktop apps, could ask a user for the email address and password that they used to log into these websites, and then use that to access those APIs. However those service providers disliked the idea of those other websites/apps having that much access to those accounts, so they started to develop new protocols to enable the user to grant permission for their data at one site to be accessed by another website/app. Over time those protocols evolved into the OAuth standard.
At a general level, OAuth allowed one website to access information from a service provider without knowing a user’s password. If a website saw value in accessing that service provider, it was required to use this technique. So for the first time, there was a strong business reason for websites to rely on a protocol instead of passwords.
As more and more websites experimented with this approach, it soon became obvious that one type of service provider was an “identity provider” who provided information to help a user log into other websites. The rest of this guide explains how you can evolve the user account systems of your website to leverage identity providers, and service providers.
Notice this chapter is NOT about identity providers. Before we get there, it is necessary to get in the mindset of your website’s users. While users may hate passwords, they have been using them for a very long time. Getting users to migrate to a new approach needs to be done with a lot of thought, or your website’s help desk (and soon your bosses) will hate you even more than they hate passwords!
Feel free to skip to the next section if you just want “the answer.” Otherwise, let us try to get in the mindset of the end users.
There are a few requirements we need to accept.
In a “simple” computer science world, every device would be used by at most one ID. However the reality is that many devices are used to access multiple IDs, such as a person’s personal accounts and work accounts. Or the person might have multiple jobs or additional personal “compartments.” In some cases, the device will even be shared with other people. Computer scientists frequently try to force back the “one ID” simplification by using other tricks like having different operating system level accounts, or browser accounts. However they fall into the category of “solutions designed by computer scientists that could only be used by other computer scientists.” On mobile phones in particular they completely fail.
It took a long time and a lot of user experience experimentation (most of which failed) to find an approach that works. Some of that earlier work:
Thoughts on combining Google & Yahoo OpenID UX research
An early draft proposal for a Personal Discovery Service to bootstrap IDP discovery without a browser extension
Eventually the identity community found one key technique. While people are lazy, they are willing to invest in a longer one-time task to make their lives easier in the future. That task can involve customizing the login experience for them on each device, for each account. Once they have done that, then on a customized device they can “choose” the account they want to use to login to a website. That is the idea behind an “account chooser.”
For the latest information, see accountchooser.net
If you have never used an account chooser, try watching one of the videos of the experience.
Today when a user clicks the Sign In button on your site, they are shown a login box. What if instead they saw something like this?
In the common case, the user would simply choose the account they wanted to use on your site by clicking the picture. By leveraging other techniques discussed in this guide, the user would then immediately be logged into your site. It doesn’t get much simpler than that for the end user. However let us ignore those techniques for a moment.
In the most basic implementation, once the user chooses an account, they are shown a form asking for the password of the account they selected. In some cases they won’t even need to type that because their browser automatically fills it in, so they just hit return and are logged in.
The first step in modernizing your user account system will be to deploy an account chooser. It does not require large changes to a website, but it provides a huge amount of future flexibility to add further enhancements to your website.
There are two modes for an account chooser, Central, and Local.
This service is not yet formally launched, but you can learn more about it by monitoring the account chooser working group.
The domain name accountchooser.com is the location of the central account chooser. When a user visits that domain in their browser, they would see this list of accounts. They can also delete and modify entries. The account list is not stored on a server, but instead simply uses HTML5 to store the list in the browser.
When a user clicks the SignIn button on a website, it simply redirects the user to this list of accounts. The user chooses the account they want to use on that site, and the email address of the account is sent back to the site (the name and photo URL are also provided so the site can use those if it wants). It is then left up to the site to decide how to authenticate whether the person at the device is the owner of that account. In the most common case, that will be done by asking the user for their password.
However the website may detect that there is no local user ID for that email address. In that case the website would send the user through a traditional account registration form, including picking a password, and possibly verifying the owner of the email address. The site might also save the name and photoURL that were passed in from the accountchooser.com for that email address.
If the browser has not been used with the account chooser before, there might not be any entries in the accountchooser.com account list. In that case the user would be automatically redirected back to the website. The website could then show its traditional login page. However after the user logs in, the site can redirect the user back to the accountchooser.com site, and pass in the email (as well as name and photo URL if known). The accountchooser.com site will then show a page like the one below <UI is not yet finalized> so the user can confirm they want to add it to the list of accounts.
If the person uses multiple accounts on the device, or multiple people use the device, then the process of adding an account will need to be repeated. That is why there is a button at the bottom of the list of accounts to add another.
Most people only have a small number of accounts, so they will quickly get all of them added to the device. From that point on, on any website they visit (even if they have never been to it before), they will encounter exactly the same user experience (even on a mobile browser). They simply click Sign In, and choose their account. They don’t even need to remember if they have registered before with the website. If they have not, then the site will be able to detect that and start the registration flow. So websites that use an account chooser don’t even need a separate “sign up” button.
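The decision a site makes when the user comes back from the central account chooser can be sketched as below. This is not the accountchooser.js API; `ChooserSelection`, `NextStep`, and `handle_chooser_return` are hypothetical names, and the only inputs assumed are the email, name, and photo URL that the text says are passed back.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class NextStep(Enum):
    SHOW_TRADITIONAL_LOGIN = "show_traditional_login"
    ASK_FOR_PASSWORD = "ask_for_password"
    START_REGISTRATION = "start_registration"


@dataclass
class ChooserSelection:
    """What the chooser hands back for the chosen account (hypothetical shape)."""
    email: str
    display_name: Optional[str] = None
    photo_url: Optional[str] = None


def handle_chooser_return(selection: Optional[ChooserSelection],
                          local_ids: Dict[str, int]) -> NextStep:
    """Decide what to show next. local_ids maps lowercase email -> local user ID."""
    if selection is None:
        # The chooser had no entries on this browser and redirected straight back.
        return NextStep.SHOW_TRADITIONAL_LOGIN
    if selection.email.strip().lower() in local_ids:
        # Known account: authenticate the owner, in the basic case by password.
        return NextStep.ASK_FOR_PASSWORD
    # No local ID for this email yet, so no separate "sign up" button is needed.
    return NextStep.START_REGISTRATION
```

After a successful traditional login, the site would redirect to the chooser again so the user can confirm adding the account to the list.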
Hopefully you are now ready to try adding an account chooser to your site. Fortunately the work required is just 4 small steps using a standard accountchooser.js file. Those steps and the details can be found here on the accountchooser.net website.
The initial version of the Account Chooser did not use a central domain. Instead, each website kept its own list of accounts. That requires the user to repeat the “add account” process at each website for each of the accounts they want to use.
Some websites might still choose to use that approach. For example, a website where people login with usernames instead of email addresses. One special type of website is a free webmail provider. They have to handle the situation of a user who is going to the site to create a new email address. That is obviously an activity that the average person will perform very infrequently, so the central account chooser does not support it. However the webmail provider can still run its own local account chooser, but add an additional option for a user to not just “sign in to another EXISTING account” but also to create a new email address.
A more advanced technique is for a website to merge the two models. In that model the website still uses the central Account Chooser to enable the user to login. However the site also keeps a local list of the accounts that have logged into that site on that browser. That list can be used for some advanced features. For example, the site might replace its sign out link with a drop down menu that lets the user either (1) sign out and choose another account from the central account chooser, or (2) immediately choose another account from that local list. Unfortunately for security/privacy reasons the list from option #2 cannot come from the central account chooser, so the local list is needed instead.
#2 is referred to as Account Switching. If your site is used on a single computer by multiple accounts, such as a husband and wife, then this can be a very useful feature. Instead of the two of them having to log each other out to switch accounts, they simply choose the other account from a drop down list. However to enable this to work well, your site will need to have the ability to store information about multiple sessions in your session cookie (or cookies). One simple technique is that when the account is changed, you copy the current session cookie to a different cookie name, such as one whose name is based on the local user ID of the account. Then look for such a cookie for the local user ID of the other selected account and move its contents to your standard session cookie. However if you do this, then any time the user deletes an entry for the account chooser on your site, or chooses sign out, then you should delete the appropriate cookie(s).
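The cookie shuffle described above might look roughly like this. The sketch models the browser's cookies as a plain dictionary; the cookie names `session` and `session_<local user ID>` and both function names are assumptions for the example, and a real site would set and clear cookies through its web framework.

```python
from typing import Dict

SESSION_COOKIE = "session"  # hypothetical name of the standard session cookie


def _parked_name(local_user_id: int) -> str:
    # Cookie name derived from the local user ID of the account.
    return f"session_{local_user_id}"


def switch_account(cookies: Dict[str, str], current_id: int, target_id: int) -> bool:
    """Park the current session and restore the target account's session.

    Returns True if a parked session for the target was restored, and False
    if none exists, in which case the user still has to authenticate.
    """
    current = cookies.pop(SESSION_COOKIE, None)
    if current is not None:
        cookies[_parked_name(current_id)] = current
    parked = cookies.pop(_parked_name(target_id), None)
    if parked is None:
        return False
    cookies[SESSION_COOKIE] = parked
    return True


def forget_account(cookies: Dict[str, str], local_user_id: int, is_active: bool) -> None:
    """Delete the cookies for an account that signed out or was removed from the list."""
    cookies.pop(_parked_name(local_user_id), None)
    if is_active:
        cookies.pop(SESSION_COOKIE, None)
```

Keeping each parked session under a name tied to the local user ID is what lets the site delete exactly the right cookie when one entry is removed.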
If your website is an email provider (especially a free webmail provider) then there is some additional complexity to integrate with account choosers. On most websites there are only 2 ways a user might login; (1) register an account on the site with their EXISTING email address or (2) sign into the account they previously registered on the site with their EXISTING email address. Note that both options involve an EXISTING email address. On the login page of an email provider there is a 3rd option, i.e. create a NEW email address. The email provider can implement a local account chooser, but the page that lists the accounts needs to make it very obvious that the user can either (1) select an account from the list, (2) add an existing account to the list, or (3) create a new email address. If the user chooses option 2 or 3, then the email provider can choose to send the user to the Central Account Chooser to confirm that the account should be listed there as well.
The model above still involves the use of passwords. In Chapter 4 we will talk about how the Account Chooser provides a very smooth upgrade path to the use of identity providers.
If your site does use passwords, then you also need to consider what protections you provide against security threats such as “dictionary attacks” and “account harvesting.” One of the big advantages of becoming a relying party as described in other chapters is that your site will not need to worry about these things for users who have an identity provider.
“Account harvesting” is a technique where a hacker tries to determine what email addresses are registered at your site by seeing whether submitting an email brings them to a login form or signup form. Most websites leak this information in multiple ways, including their password recovery flows. Even when it is leaked, the information the hacker gets is not guaranteed to be accurate because anyone can register an email address on a site that uses passwords (maliciously or by just mistyping their email address), they just won’t be able to verify it unless they own that email.
If you are going to accept passwords from some or all of your users, then you should evaluate the use of CAPTCHAs as a first line of defense against those attacks. For “dictionary attacks” you could refer to the best practices published by the OpenID Foundation. For “account harvesting” you can also add CAPTCHAs to any “password reset” flows.
There is one user experience for combining an Account Chooser and CAPTCHAs that works reasonably well with a lower engineering cost than traditional schemes. That approach involves always asking the user to answer a CAPTCHA if they are trying to sign in with an account that is not yet in your site’s local account chooser list. If they answer the CAPTCHA correctly, then you can let them register (if there is no account for the email) or attempt to type a password if the account exists (but limit them to only a few tries before giving an error and sending them back to the start of the account chooser flow).
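A minimal sketch of that policy follows. It covers only the two decisions in the paragraph above: when to demand a CAPTCHA, and when to stop accepting password guesses. The limit of three tries and all names are assumptions; rendering and checking the CAPTCHA itself is left to whatever service the site uses.

```python
from typing import Dict, Set

MAX_PASSWORD_TRIES = 3  # assumption: the text only says "a few tries"


def needs_captcha(email: str, local_chooser_emails: Set[str]) -> bool:
    """Require a CAPTCHA unless the account is already in this site's local chooser list."""
    known = {e.strip().lower() for e in local_chooser_emails}
    return email.strip().lower() not in known


class PasswordAttemptLimiter:
    """Counts failed password attempts per email within one chooser flow."""

    def __init__(self, max_tries: int = MAX_PASSWORD_TRIES) -> None:
        self._max_tries = max_tries
        self._failures: Dict[str, int] = {}

    def record_failure(self, email: str) -> bool:
        """Return True if another try is allowed, False if the flow must restart."""
        key = email.strip().lower()
        self._failures[key] = self._failures.get(key, 0) + 1
        if self._failures[key] >= self._max_tries:
            # Restarting the flow means the next attempt faces a new CAPTCHA.
            del self._failures[key]
            return False
        return True

    def record_success(self, email: str) -> None:
        self._failures.pop(email.strip().lower(), None)
```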
Are you itching to get rid of passwords? If you have already implemented an Account Chooser, and you don’t have any mobile or desktop apps (and don’t plan to offer any) then you can jump ahead to Chapter 4 or 5. Chapter 4 offers some “baby steps” to eliminating passwords, and Chapter 5 will let you take the big step of letting users login without a password.
However if you have mobile or desktop apps, or devices, then you have some homework to do first. Those apps are probably configured today to ask the user for their email address and password. That will not work for websites with complex login systems that use techniques like identity providers because you won’t know how to validate the user’s password.
This Mobile apps for complex login systems article describes a technique you can use instead for apps/devices that can launch a web browser. For apps/devices that cannot launch a browser, the common approach is to keep authenticating them with a password/PIN managed by your system, but try to restrict what access that password/PIN has.
Auto-detecting OAuth approval from a desktop app
Videos of that desktop prototype with different federated login and strong authentication mechanisms
If you have deployed an Account Chooser on your site, and you have updated your mobile/desktop apps, then you are now prepared to start eliminating passwords on your website.
If your site does not ask the user for their password, then some other website needs to do that job. Those other websites are traditionally called “identity providers.” Those sites frequently have very large security teams, and they use sophisticated schemes to protect accounts. They generally do not rely on passwords alone to authenticate users. They may even have been audited against certification checklists to evaluate their security.
You will need to choose the identity providers that your website will support. If you want to try to authenticate an account using an identity provider, you will redirect the user to the identity provider using a protocol they support such as OAuth. When you perform such a flow, your website is referred to as a “relying party.”
We strongly suggest you do NOT try to write your own code to support these identity provider protocols. Most identity providers offer libraries to help, and there are also vendors of software products and web services that help a website become a relying party. If you plan to support more than one identity provider, then if at all possible you should use a vendor’s product. The website openid.net is a good place to start to find those vendors.
The identity provider will make sure the user is logged in to the provider’s site, and then if necessary will ask the user to consent to sharing their identity back to your site. Below is an example of a page where the owner of account email@example.com (shown in the top right) is consenting for their email address to be shared with the website UserID TV. The identity provider can avoid showing this page if your site only asks for verification of the information it already got from the account chooser, i.e. the user’s email address.
That consent page is a key part of the role played by the identity provider. In the Secure mashups part of the introduction chapter, we discuss the idea of a service provider. They also use a similar consent mechanism.
In fact, the Identity Provider actually has a mix of four components:
In a large identity provider, those components are frequently run by different teams. In the later chapter on Relationship Managers we will discuss how those components can actually be split across different companies.
Chapter 4 will let you take the big step of letting users login without a password. However there are a number of techniques that can be used which are less disruptive to your websites. Even if you skip now to the techniques in Chapter 4, you should come back later and implement the techniques in Chapter 3.
In Chapter 1 we talked about the hack that makes Internet Identity possible. It relies on an email provider’s ability to verify an email address it hosts. When an email provider is also an identity provider, it creates one very powerful capability. The Email/Identity provider can assert an email address that it hosts, and a relying party can trust that assertion just like it trusts the email provider to deliver an SMTP message with an authentication code to the right mailbox.
So if your website wants to verify the owner of an email address, and that email domain runs an identity provider, then your site can redirect the user to that Email/Identity provider. That technique can replace the use of SMTP messages with authentication codes.
Even if your website continues to use passwords to log in registered users, it can still replace the mechanism of verifying the owner of an email. There are actually a number of related techniques that can also be used to greatly increase signup rates and returning-user rates. This OpenID without a Login Box article describes a number of those techniques.
In this chapter we will finally describe how to enable users to login to your website without needing passwords. We suggest breaking this change down into two steps:
In the next chapter we will discuss step two, but let us start with step one. Even if you eventually plan to only support non Email/Identity providers, you should use the techniques in this chapter to get a few accounts working with an identity provider.
If you have deployed an Account Chooser on your site, then when your users want to login to your site with an account, they will choose that account in the account chooser. Your site will get the user’s desired email address, and will then need to authenticate the owner of that email address. So far your website has done that by asking the user for their password, but that will now need to change.
What you should now do is pick at least one email provider that runs an identity provider which you are willing to trust. Examples include Hotmail, Yahoo mail, AOL mail, Gmail, etc. You can even set up a single Google Apps test domain that you use for testing since all Google Apps domains have an identity provider operated by Google. For some percentage of emails in that domain, instead of asking the user for their password, your site can redirect them to that identity provider.
Some identity providers will let your site pass the email address that you are trying to authenticate. The identity providers will be able to detect that the user selected that email address in the accountchooser.com, and thus will know they have given their consent to share their email address with your website. Assuming the user is logged into that account on the identity provider, the user will be invisibly redirected back to your site using a security protocol like OAuth that will enable your site to confirm that the identity provider verified that email address. Your site can then log the user in.
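One way to send only some percentage of a domain's addresses to the identity provider is to bucket each address by a hash, so the same address always gets the same answer. The sketch below assumes a hypothetical configuration mapping an email domain to an identity provider key and a rollout percentage; none of these names come from a real library, and the actual redirect should still be done by a vendor or provider library.

```python
import hashlib
from typing import Dict, Optional, Tuple

# Hypothetical configuration: email domain -> (identity provider key, percent of
# addresses in that domain that skip the password and go to the provider).
RolloutConfig = Dict[str, Tuple[str, int]]


def rollout_bucket(email: str) -> int:
    """Map an email address to a stable bucket from 0 to 99."""
    digest = hashlib.sha256(email.strip().lower().encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 100


def choose_idp(email: str, rollout: RolloutConfig) -> Optional[str]:
    """Return the provider to redirect to, or None to keep asking for a password."""
    domain = email.strip().rsplit("@", 1)[-1].lower()
    entry = rollout.get(domain)
    if entry is None:
        return None  # domain has no identity provider this site trusts
    idp, percent = entry
    return idp if rollout_bucket(email) < percent else None
```

Starting a test domain at 100 percent and a production domain at a small percentage lets the site gain confidence before raising the number.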
Even if the identity provider is not able to invisibly return the user to your site, it can prompt the user to give their consent to verify their email address (and login to the identity provider if necessary).
When a user is redirected back to your site from an identity provider, you should use the required protocol such as OAuth to confirm there was no security problem at the protocol level itself. There are two edge cases you need to handle:
Once you have this code in place, any further logins by that account can now be performed without a password. So your site now has accounts that no longer need a password!
At the beginning of this chapter, we suggested that the first step was to support combined Email/Identity providers. That simplifies the initial logic in your website, because if the email provider for domain xyz.com asserts an email address in the same domain, then your servers can trust it. However if that email provider asserted an email address in a different domain, such as firstname.lastname@example.org or email@example.com then your servers should not trust it.
However when your site extracts the email address asserted by the identity provider, you need to make sure that the domain name of the email is one you trust the identity provider to assert. For example you might decide to start by just using AOL, and only trusting them to assert aol.com addresses. If they assert an email address in any other domain, you need to show an error to the user. In the next chapter we talk about ways to handle this scenario without showing an error.
In addition, your website should check that the email address that the identity provider asserted was the same as the email address the user selected in the account chooser, and if not show an error to the user. Fortunately the protocol standards for identity providers are evolving quickly enough that this problem should not happen much in the future.
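The two checks described above (trusting an identity provider only for the domains it hosts, and confirming the asserted address matches the one chosen in the account chooser) can be sketched in Python. This is a minimal illustration under stated assumptions, not a definitive implementation: the names `TRUSTED_DOMAINS`, `email_domain` and `check_asserted_email` and the reason strings are hypothetical, addresses are compared case-insensitively as a simplification, and the protocol-level validation of the identity provider's response is assumed to have already succeeded.

```python
# Hypothetical table: the email domains each identity provider is trusted to assert.
TRUSTED_DOMAINS = {
    "aol.com": {"aol.com"},
}


def email_domain(email):
    """Return the lower-cased domain of an email address, or None if malformed."""
    local, sep, domain = email.strip().rpartition("@")
    if not sep or not local or not domain:
        return None
    return domain.lower()


def check_asserted_email(idp, asserted_email, selected_email):
    """Return (ok, reason) for an email address asserted by an identity provider.

    idp            -- identifier of the identity provider (a key of TRUSTED_DOMAINS)
    asserted_email -- address the identity provider asserted
    selected_email -- address the user selected in the account chooser
    """
    domain = email_domain(asserted_email)
    if domain is None:
        return False, "malformed email"
    # Only trust the provider for domains it is known to host.
    if domain not in TRUSTED_DOMAINS.get(idp, set()):
        return False, "untrusted domain"
    # Assumption: addresses are compared case-insensitively.
    if asserted_email.strip().lower() != selected_email.strip().lower():
        return False, "email mismatch"
    return True, "ok"
```

A site would show an error page for any result other than `(True, "ok")`, and log the user in otherwise.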
If you are happy with your initial identity provider support, you can now do one of two things. If you want to provide broad support to your users for combined email/identity providers, then you can follow the steps in this section. However if you have chosen a set of identity providers you want to support, and all of them have the potential to assert email addresses they don’t host (or not to even assert an email address), then skip to the next chapter.
Once you have support for one combined email/identity provider, it is pretty easy to add support for more. However there are some other techniques you can use to take better advantage of these identity providers:
In step 2 you changed what happens when a user clicks “Sign in to another account” and caused the user to simply be asked for the email address they want to add. A more advanced option is to show a page like this one:
If the user sees a button for their identity provider, they can simply click it instead of having to type their email address from scratch. Theoretically though, it will be very unlikely for a user to see this page on your website. As long as they have used their email address to sign into any other website that uses the central account chooser, then there would be no need for them to click the button that would take them to this screen.
However, you can still create this type of page if you would like. If you are using a vendor’s product, it probably has built in configurations for adding buttons for identity providers. The software may even be advanced enough to dynamically decide what buttons to show based on mapping the user’s IP address to a rough geographic location, and looking at the default language settings in their browser. For example, if the computer was in Russia, and the language was Russian, then it might be a good idea to show a button for the popular mail.ru service if they have an identity provider your site trusts.
One question an RP must consider is how to manage sessions AFTER a user has first logged into the site via an identity provider.
When a user logs into your site, you have the choice of setting a session cookie or persistent cookie to identify their account. A persistent cookie is kept on the user's computer even if they restart their web browser, while a session cookie is removed when they restart their browser. So the simplest approach for the RP is to always set a single cookie of one of the two types. A persistent cookie is more common.
However there is a more advanced approach. Nearly all good IDPs support invisible logins on repeat visits to the same site. That means they will assert the user's identity back to the RP if the user is still logged into the IDP. So a more advanced approach is for the RP to always use a session cookie to track login state for an IDP user, but use a persistent cookie to save information on the account. If the session cookie is missing (such as after the user restarts the browser), then the website can redirect the user to their IDP and generally they will be invisibly logged in if the user is still logged into their IDP.
For either persistent or session cookies, it is possible to stamp them with the time they were issued, and then decide to force the user to re-authenticate after a certain time period. For users who have an IDP, the “re-authentication” process generally does not involve asking for a password, but instead involves redirecting the user to their IDP to make sure they are still logged in there. The IDP will ask them for their password if they are not yet logged in.
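The cookie handling described above can be sketched as a small decision function. This is an illustrative sketch only: the cookie contents, the `MAX_AGE_SECONDS` interval, the function name `next_login_step` and the action strings are all assumptions, and a real site would also need to sign or encrypt its cookies.

```python
import time

# Hypothetical re-authentication interval: force a check with the IDP after one day.
MAX_AGE_SECONDS = 24 * 60 * 60


def next_login_step(session_cookie, account_cookie, now=None):
    """Decide how to treat an incoming request.

    session_cookie -- dict such as {"account": ..., "issued_at": ...}, or None
                      if the browser did not send one (e.g. after a restart)
    account_cookie -- persistent dict such as {"account": ..., "idp": ...}, or None
    Returns "logged_in", "redirect_to_idp" or "show_sign_in".
    """
    now = time.time() if now is None else now
    if session_cookie is not None:
        age = now - session_cookie["issued_at"]
        if age <= MAX_AGE_SECONDS:
            return "logged_in"
    # Session missing or too old: let the IDP confirm the user is still logged in.
    if account_cookie is not None and account_cookie.get("idp"):
        return "redirect_to_idp"
    # No IDP is known for this browser, so fall back to the normal sign-in flow.
    return "show_sign_in"
```

Stamping the session cookie with `issued_at` is what makes the time-based re-authentication possible; the persistent cookie only remembers which account and IDP to try.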
One big advantage of relying on IDPs to perform this “re-authentication” is that many of them have very sophisticated systems to decide when to invalidate sessions because of things such as lack of activity or activity that appears fraudulent. If they invalidate that compromised session, and your site redirects to the IDP, the user will not be able to get into your site without authenticating again to the IDP.
So that raises the question of how frequently to check that the user still has an active session at the IDP. There are four main methods where all sites perform #1, and then a mix of the others.
After reading this list, you may wonder if there is a way for your RP website to immediately detect when the user’s IDP session is invalidated. That is a challenging technical issue covered next.
In a perfect world, there would be a simple technical method for an RP to detect that the user’s session at the IDP was no longer valid. Unfortunately the design of browsers makes that extremely difficult. The only near-perfect form of single sign out is for a user to purge all the cookies in their browser manually, or by using the incognito mode of some browsers.
This chapter covers the second step of supporting identity providers to enable your users to login without a password. In the previous section we started with the simpler step of supporting combined email/identity providers. However there are many websites that are popular with consumers and they run an identity provider. Many of these sites host email for some, but not all of their users. Many of these popular identity providers are social networks, and the term “Social Login” is used to refer to the process of enabling a user to sign into a website with one of them.
Good: You might decide that your website’s needs are met by only supporting combined email/identity providers. But what about your user accounts that are in email domains that do not have an identity provider? Those accounts will continue to need a password. However you could give those users (as well as any of your users) the ability to login to an account by linking it to one of these popular identity providers. That will improve the security of your site, and your users, as well as increase registration and login rates. The next chapters will discuss many other valuable techniques you can use with these popular identity providers.
Bad: These popular identity providers sometimes assert email addresses they do not host. Generally speaking it is not a good idea for a website to trust those assertions. For example, a user might create an account at one of these websites, but they might have used a really poor password if they think the security of the account is not that important, for example if they only use it to read information posted by other people. To deal with this limitation, if a relying party gets an assertion from an identity provider for an email like firstname.lastname@example.org, and there is already an account for that address, then the relying party will try to confirm that the person who controls both accounts is the same by asking them to type the password of that legacy account. Once that is done, the identity provider is linked to the account, and the user can use that identity provider to login to the relying party in the future.
Ugly: There are unfortunately a ton of edge cases that relying party sites need to handle to support these popular identity providers. One way to deal with them is to let users call your help desk. Unfortunately that can get very expensive, and can even exceed the value of increased registration and login rates. Some commercial products are starting to appear on the market that handle many of these edge cases. Examples include:
This RP account linking article has more details about how to handle account linking, as well as discussions of the hard edge cases.
If you are willing to be an early adopter, then you can implement code (or buy a commercial product) to handle “the bad.” For “the ugly” parts you can rely on a mix of your help desk and custom code that you evolve over time based on the most expensive help desk requests. If you are not willing to be an early adopter, then you could at least start with the first step of supporting combined email/identity providers, and then keep an eye out for new vendor products that handle “the bad” and “the ugly” of popular identity providers.
If you have deployed an Account Chooser on your site, then when your users want to login to your site with an account, they will choose that account in the account chooser. Your site will get the user’s desired email address, and will then need to authenticate the owner of that email address. So far your website has done that by asking the user for their password (except possibly for some users supported by the techniques in the previous chapter).
Once you add support for one or more popular identity providers, you can see if there is an account for that email address that is already linked to an identity provider, and if so redirect the user there. If your site only supports one identity provider, then you can redirect all sign in attempts to that identity provider, and then move to the next chapter in this guide.
However the logic is more complex if you support more than one identity provider and the chosen account is not linked to a single identity provider. To help with that, a future version of the central accountchooser.com will keep not only the email/name/photoURL associated with an account, but also the popular identity providers associated with the account. That meta-data about the chosen account will be provided to the relying party website to help the user login. That list will never be perfect though, so even once it exists, the website will still need some additional logic as described in the rest of this chapter.
That version will also allow the storage of entries that do NOT have an email address. See the last section of this chapter on IDPs that do not assert Email address.
A good starting point for that logic is to modify the page a user sees when they click “Sign in to another account.” Also note that they will get redirected to this same destination if the list of accounts in accountchooser.com is empty. In either situation, the website should show a page like this one (which you might have already created in the previous chapter):
If the user sees a button for their identity provider, they can simply click it instead of having to manually type their email address. Theoretically though, it will be very unlikely for a user to see this page on your website. As long as they have used their email address to sign into any other website that uses the central account chooser, then there would be no need for them to click the button that would take them to this screen. However, you should still create this page. If you are using a vendor’s product, it probably has built in configurations for adding buttons for identity providers. The software may even be advanced enough to dynamically decide what buttons to show based on mapping the user’s IP address to a rough geographic location, and looking at the default language settings in their browser. For example, if the computer was in Brazil, and the language was Portuguese, then it might be a good idea to show a button for the popular orkut.com social network if you support that popular identity provider.
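The idea of choosing buttons from a rough location and the browser language can be sketched as follows. This is only a sketch under assumptions: the `REGIONAL_IDPS` and `DEFAULT_IDPS` tables, the placeholder provider names ending in `.example`, and the function names are hypothetical, and the country code is assumed to come from some IP-to-location lookup that is not shown here.

```python
# Hypothetical mapping from (country code, language) to region-specific IDP buttons.
REGIONAL_IDPS = {
    ("RU", "ru"): ["mail.ru"],
    ("BR", "pt"): ["orkut.com"],
}
# Hypothetical buttons offered everywhere.
DEFAULT_IDPS = ["idp-one.example", "idp-two.example"]


def primary_language(accept_language):
    """Return the first language subtag of an Accept-Language header value."""
    first = accept_language.split(",")[0].split(";")[0].strip()
    return first.split("-")[0].lower()


def idp_buttons(country_code, accept_language, supported):
    """Return the ordered list of IDP buttons to show.

    country_code    -- two-letter code from an IP-to-location lookup (assumed)
    accept_language -- value of the browser's Accept-Language header
    supported       -- set of identity providers this site trusts
    """
    key = (country_code.upper(), primary_language(accept_language))
    ordered = REGIONAL_IDPS.get(key, []) + DEFAULT_IDPS
    buttons = []
    for idp in ordered:
        # Only show providers the site supports, and show each one once.
        if idp in supported and idp not in buttons:
            buttons.append(idp)
    return buttons
```

Regional providers are listed first so that the button a local user is most likely to recognize appears at the top of the page.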
If the user is sent to an identity provider, and comes back with a valid assertion, your site will need to decide how to handle it. Those assertions will sometimes contain an email address, but will always contain a string that represents an ID for the user’s identity provider (IDP) account. Whenever a user links an account on your relying party (RP) site to an account at an IDP, you should store that IDP user ID. You need to be able to map both your RP user ID to that IDP user ID, and vice versa.
If the IDP asserts an IDP ID that is already mapped to an RP user ID, then usually you can just log them right into that account. However as noted in “the ugly” section above, you have to decide what to do if the IDP asserts an email address, and it does not match the email address associated with your RP user ID.
If the IDP asserts an IDP ID that is not yet mapped to an RP user ID, then you need to decide whether to try to link it to an existing account, or register a new account. If the IDP asserts an email address, then that is very helpful because you can use it to see if an account already exists with that email. If it does, you can show some type of account linking wizard to confirm the same person owns both accounts (see “the good” and “the ugly” section above). If the email does not match, or an email is not provided, then you can show an account registration wizard. The simplest option for such a wizard is to always create a new account and just log the user in. However in most cases the better user experience is to ask the user if they already have an account at the site, or whether they want to create a new one.
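The mapping between IDP user IDs and RP user IDs, and the decision about what to do with a validated assertion, can be sketched like this. The `AccountStore` class is a hypothetical in-memory stand-in for the site's account database, and `handle_assertion` and its action strings are illustrative names, not part of any protocol. What to do when a mapped account's email differs from the asserted one is left to the site, as discussed above.

```python
class AccountStore:
    """In-memory stand-in for the RP's account database (hypothetical)."""

    def __init__(self):
        self.rp_by_idp = {}    # (idp, idp_user_id) -> rp_user_id
        self.idps_by_rp = {}   # rp_user_id -> set of (idp, idp_user_id)
        self.rp_by_email = {}  # lower-cased email -> rp_user_id

    def link(self, rp_user_id, idp, idp_user_id):
        """Record the link in both directions."""
        self.rp_by_idp[(idp, idp_user_id)] = rp_user_id
        self.idps_by_rp.setdefault(rp_user_id, set()).add((idp, idp_user_id))


def handle_assertion(store, idp, idp_user_id, email=None):
    """Return (action, rp_user_id) for an assertion that already passed
    protocol-level validation."""
    rp_user_id = store.rp_by_idp.get((idp, idp_user_id))
    if rp_user_id is not None:
        # Already linked. A site may add its own email-mismatch policy here.
        return "log_in", rp_user_id
    if email is not None:
        existing = store.rp_by_email.get(email.lower())
        if existing is not None:
            # An account exists for this email: confirm ownership before linking.
            return "show_linking_wizard", existing
    return "show_registration_wizard", None
```

Once the linking wizard succeeds, the site would call `store.link(...)` so that later assertions with the same IDP user ID log the user straight in.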
Once you have the previous two steps working, you are now prepared to handle some common edge cases. What will happen a lot is that a user will select an email entry in the central account chooser. They can then be in one of four states:
1) There is a single identity provider already linked to an account with that email
The website can redirect the user to that identity provider
2) There is no account associated with that email
In that case the website can show an account registration wizard. However the user experience is generally better if the site promotes the option for the person to use a popular identity provider. For example, it can show a variant of the page with the IDP buttons, and ask them to either click one, or choose to register a password based account. Some sites will even choose to require all new accounts to use an identity provider.
3) There is an account with that email, but it is not linked to an identity provider
The simplest option is to just ask the user to login with a password. Some sites will promote the option for the person to use a popular identity provider once they are logged in. For example, it can show a variant of the page with the IDP buttons, and ask them to either click one, or choose to continue to login with a password. Some sites will even choose to require all existing accounts to switch to using an identity provider.
4) There is more than one IDP linked to the account
Many IDPs support the ability to “poll” them invisibly to see if a particular user is logged in, assuming the user has previously linked their IDP account to this RP. If the user’s account is linked to many IDPs, that may get slow, but it will work fine in many cases. If the RP is not able to use that technique, then it can show a variant of the page with the IDP buttons, and ask them to choose their preferred IDP. The RP might just list the IDPs already associated with the account, or also give the user the option to start the linking wizard for another IDP. Once the user is able to login to this RP account, the RP should generally store some data in a cookie or HTML5 storage to remember to try that IDP again if the user tries to login to that same RP account on this computer.
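The four states above can be sketched as a single decision function. This is a simplified illustration: the account is represented as a plain dict with a `linked_idps` list, the function name and action strings are hypothetical, and the invisible polling of IDPs mentioned in state 4 is not modeled; only the remembered-IDP hint and the button fallback are.

```python
def step_for_selected_email(account, remembered_idp=None):
    """Choose the next step after the user selects an email in the account chooser.

    account        -- None if no account exists for the email, otherwise a dict
                      with a "linked_idps" list
    remembered_idp -- IDP stored in a cookie or HTML5 storage from a previous
                      login to this account on this computer, if any
    Returns an (action, idp) tuple.
    """
    if account is None:
        # State 2: no account yet, so offer IDP buttons or password registration.
        return "show_registration_options", None
    linked = account["linked_idps"]
    if not linked:
        # State 3: account exists but has no linked IDP.
        return "ask_for_password", None
    if len(linked) == 1:
        # State 1: exactly one linked IDP.
        return "redirect_to_idp", linked[0]
    if remembered_idp in linked:
        # State 4: several IDPs, but one was used here before.
        return "redirect_to_idp", remembered_idp
    # State 4: several IDPs and no hint, so let the user pick.
    return "show_idp_buttons", None
```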
Some popular identity providers do not assert an email address in their federation protocols. If you don’t plan to support any of them, you can skip to the next chapter.
These identity providers are tricky because there may be situations where the account chooser needs to show an entry that does not have an email address. If a user visits accountchooser.com directly, they can see a list of all their accounts and add/edit/delete them. In the future when they edit, or add an entry, they will have the option of leaving the email address field blank. They will also have the option of providing the domain name of one IDP associated with the account, and they will also have to provide their user ID at that IDP. Every entry will need to have either an email address, an IDP domain name, or both. The display of these entries in the account chooser list will display all of that meta-data so the user is clear about the information they are sharing with a website. While email address will be shown in the standard “email@example.com” format, IDP entries will be shown as “IDPdomain.com: UserID.”
When a website redirects a user to accountchooser.com, by default only the entries with an email will be shown. However the website will be able to pass in parameters to specify the IDP domains it supports, and any accounts for those domains will also be included. If a user chooses an account, any metadata associated with it will be passed back to the website. The website can then use that metadata to help log the user in. If the website redirects the user to a popular IDP, it can pass the UserID parameter to help specify the target account.
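The entry rules described above can be sketched as a small data model. Since this describes a planned feature, the sketch is only an assumption about how such entries might behave: the `Entry` class, its field names, and `visible_entries` are hypothetical, and joining the two labels with a comma when an entry has both an email and an IDP is an illustrative choice rather than the actual display.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One account chooser entry (hypothetical field names)."""
    email: Optional[str] = None
    idp_domain: Optional[str] = None
    idp_user_id: Optional[str] = None
    name: Optional[str] = None
    photo_url: Optional[str] = None

    def __post_init__(self):
        # Every entry needs an email address, an IDP domain, or both.
        if not self.email and not self.idp_domain:
            raise ValueError("entry needs an email address, an IDP domain, or both")
        # An IDP domain must come with the user ID at that IDP.
        if self.idp_domain and not self.idp_user_id:
            raise ValueError("an IDP domain requires the user ID at that IDP")

    def label(self):
        """Text shown in the list, so the user sees what will be shared."""
        parts = []
        if self.email:
            parts.append(self.email)
        if self.idp_domain:
            parts.append(f"{self.idp_domain}: {self.idp_user_id}")
        return ", ".join(parts)


def visible_entries(entries, supported_idp_domains=()):
    """Entries with an email are always shown; email-less entries are shown
    only if the website listed their IDP domain as supported."""
    supported = set(supported_idp_domains)
    return [e for e in entries if e.email or e.idp_domain in supported]
```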
A website can help a user add an entry to this list by redirecting them to accountchooser.com with the metadata and a request to add the account. The user will then see a variant of this page <this is not the final UI>:
If that metadata matches an existing entry in all fields except name/photoURL, then the existing entry will be updated with the new name/photoURL. If the metadata does not match, a new entry will be created.
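The update-or-create rule above can be sketched with plain dicts. The field names in `IDENTITY_FIELDS` and the function `add_or_update` are hypothetical; the only behavior taken from the text is that entries matching in every field except name/photoURL are updated, and anything else creates a new entry.

```python
# Hypothetical names for the fields that identify an entry (all except name/photo).
IDENTITY_FIELDS = ("email", "idp_domain", "idp_user_id")


def add_or_update(entries, new_entry):
    """Add new_entry to the list, or refresh the name/photo of a matching entry.

    Entries are dicts; a missing field is treated the same as None.
    Returns "updated" or "created".
    """
    for entry in entries:
        if all(entry.get(f) == new_entry.get(f) for f in IDENTITY_FIELDS):
            entry["name"] = new_entry.get("name")
            entry["photo_url"] = new_entry.get("photo_url")
            return "updated"
    # No entry matches on every identifying field, so store a copy as a new entry.
    entries.append(dict(new_entry))
    return "created"
```

Because an entry with an email only and an entry with the same email plus an IDP differ in an identifying field, both are kept, which is how a user can end up with two entries for the same IDP account.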
Even if the IDP does not provide the website with the user’s email address in the federation protocol assertion, the website still might get the email address another way, such as simply asking the user for it. So the user might end up with two entries for the same IDP account, but whenever they sign into a site, they can choose whether to select the account that provides their email address or the one that does not.
If a user has an account at an RP that is linked to multiple IDP accounts, then when they click SignIn on that site and see the Account Chooser, they may see different entries for those different IDP accounts, even though they all connect to the same account at the RP. However by selecting one, they are helping the RP know which IDP to use to try to authenticate the user.
The UserID parameter in the previous section is assumed to be the same one that the identity provider asserts as part of the protocol it uses to help someone login to a relying party. However some IDPs enable a user to login to multiple websites and assert a different UserID to each of them. That technique does not work well with the central account chooser. If an RP wants to support IDPs in that mode, it should rely on a Local Account Chooser instead of the central account chooser.
In the What is an Identity Provider? chapter we explained that some identity providers also have two other components:
One of the main reasons a website might add support for identity providers is to access those other services and attributes. For example, a person might visit your site and choose an entry from the Account Chooser that did not match an existing account at your site. If you redirect them to an identity provider that also provides additional attributes/services, then you can request access to that information as well. The main downside of this approach is that the user may be overwhelmed/confused/concerned by the page the identity provider shows to get their consent to release that information.
Another option is to wait until a user is logged into your site, and then promote the option of connecting the local account to a service/attribute provider. That gives you more of a chance to explain the value. Some websites will even decide not to modify their login systems at all, and only promote the option of connecting accounts after a user is logged in.
There is a large variety in the types of services/attributes that are available, so this guide will not try to cover them. However if you are integrating with multiple identity providers, many vendor products have built in support for popular services/attributes. You can visit openid.net to help find those vendors.
One popular type of service provider are those that give access to a user’s address book. Some websites use those address books to provide simple type-ahead features for when a user wants to share some information on the site with someone whose name and email is already in their address book. Other websites use that information to build a social graph of how the users on their site are connected.
There are a few things that websites sometimes want to do that greatly increase the complexity of their local social graph:
Adding that support is quite complex. There is limited support for these features in vendor products. However websites like digg.com serve as a good model for how a website might provide these features to end users.
In the previous chapters we discussed the need for websites to choose the list of identity providers (and potentially attribute & service providers) that they want to trust/support. As the number of good quality providers increases, it will become harder for websites to make that decision.
Some groups of providers have started to be certified based on the features and operation of their services. That enables a website to decide to trust any website that meets a certain certification. The Open Identity Exchange (OIX) is one group that tries to help create the frameworks for such certification to help websites make these trust decisions. For example, they already have a page that lists some providers that have been audited against particular certifications.
After reading all the chapters of this guide so far, you might have started to think about becoming a service provider, attribute provider, and/or identity provider yourself. If not, simply skip to the next chapter.
If your website provides consumer email services, you should definitely do so. If your website is associated with a business, then you should strongly consider setting up an identity provider for your own employees. Even if your website does not fall into those two categories, you might be able to create high value for your users, and your site/business, by providing APIs to 3rd party developers. In many cases you can charge access to those APIs.
This guide will not provide all the information on how to expose APIs. You might want to start by simply learning about the OAuth2 standard for APIs, as well as looking at the API platforms of other companies. The openid.net website also has information about vendor products that can help you with the security details of exposing your APIs.
However the next few chapters discuss some techniques that may help you greatly increase the usage of your APIs.
In Chapter 3 we noted that the Identity Provider is a mix of four components that can be split across different companies.
For example, a popular identity provider might let users opt-in to being authenticated by a different service that specializes in high security authentication schemes (discussed in Chapter 14).
A less obvious model is for Company A to expose an API, but allow Company B to handle the user consent aspect. The API can either be to a service (Service Provider) or to user attributes (Attribute Provider).
Why would Company A want to do this? The most common reason is an improved user experience that leads to increased use of the API from Company A. A common example is that a user logs into a website that wants access to both APIs offered by Company B and APIs offered by Company A. Normally this would require two different user consent flows. But instead the user can provide all the necessary consent in one flow. If that website is already a relying party to Company B, then Company A can leverage that relationship to simply have one more item added to the consent page a user sees when first logging into the website. The website might even want access to APIs from more than 2 companies, and thus the user experience is further improved by using a single consent page.
While that is the most common reason, the second main reason is that operating a Relationship Manager is a complicated task, and so Company A may be happy to offload it to other companies that specialize in being a Relationship Manager. That in turn allows for Relationship Managers who provide more advanced user controls, such as:
The harder challenge is why would a company build a generic Relationship Manager that supports other attribute/service providers, and how would the company attract end-users and service providers to use it. However there are some examples:
The consent technique used by protocols such as OAuth is incredibly powerful, and helps overcome a lot of legal and privacy issues. However it requires users to make a choice, and that is frequently not an easy thing to do. Each company involved in the flow has an incentive to help the user make the choice, whether it is the service provider, relationship manager, etc.
However one potential model is to add another company into the equation whose job is solely focused on providing advice to the user. As an example, take a company like Consumer Reports that helps users make trust decisions about products. What if every time a user saw a consent page, they could refer to a Consumer Reports type guide to help them decide whether to give their consent?
Different users might trust different vendors to help provide them with this type of guidance. Some users might want a much more conservative guide, and others would prefer guidance which places more weight on value instead of potential privacy concerns. Other people might want a guide that takes into account their particular society’s norms. Some users may be willing to pay for premium guides, others might only want free advice.
However given the large number of websites, it would be very expensive for these relationship guides to analyze all of them. That creates an economic incentive for those websites to pay these guides to evaluate them. As the number of guides increases, that model may have trouble scaling, but websites might choose to be formally certified against certain trust frameworks, and those guides could use public information on those certifications to help users make trust decisions. There is always the potential for the money to create bad incentives that lead to relationship guides giving bad advice to users just to collect more money.
But the key point to take away is that as an industry, we are not stuck with a one-size-fits-all consent mechanism. The use of identity providers and relationship managers creates companies with specialized skills who over time can help users make better trust decisions, and/or help connect them with other people/organizations that can assist them in making those decisions.
The Internet Identity community has built up strong momentum the last few years getting consumer oriented websites to replace passwords by using popular identity providers.
There is now a growing demand from these websites, and end users, for those identity providers to provide the websites with more trustworthy information about the user, such as
However the popular consumer identity providers are generally not considered trustworthy sources of these attributes. Other companies like postal services, banks, mobile operators, cable operators, Paypal, etc. are better “attribute providers.” Many of those companies tried to run their own identity providers, but that has had limited success because of the usability problems discussed in Chapter 11.
Some of those companies are now partnering with popular identity providers using the Relationship Manager model from Chapter 11 where these companies act as Attribute providers instead of always running a full Identity provider. To learn more, refer to the OpenAXN working group.
In the introduction of this guide the section on The Weakest link talked about the fact that there is no such thing as a strong password if it is reused on a website with low security. Then in Chapter 3 we discussed What is an Identity Provider? and described how that role might include multiple pieces, one of which is an authentication provider. There is a large industry of firms focused on “strong authentication” and “multi-factor authentication,” and many offer “authentication provider” services to enterprises.
One of the reasons for identity providers to support stronger authentication is what is referred to as the “keys to the kingdom problem.” As more and more websites become relying parties, it means that if a hacker compromises a user’s identity provider account, then the hacker can also break into the user’s accounts on those relying parties. This problem already exists to some extent even without identity providers. Most websites let a user reset the password on their account by sending a reset message to the user’s email address. So if a hacker takes over a user’s email address, they can reset their passwords on those other websites. However the goal of federated login is to improve security, not be stuck with what we have.
Unfortunately there has been very little adoption of strong authentication in the consumer space. The most common consumer “strong authentication” technique is a variant of The hack that makes Internet Identity possible where a one-time-use code is sent to the user’s mobile phone that they then need to enter in addition to their password. Most popular consumer identity providers have added that type of approach including Google, Yahoo, Microsoft, Facebook, Paypal, and AOL. If a website becomes a relying party (RP) to one of these identity providers, then the RP “inherits” that stronger authentication.
Hackers can still use phishing style techniques to trick a user into providing both their password and one of these codes. So the industry is constantly looking at ways to avoid those types of social engineering attacks. The most common approach involves a technology called certificates or “certs” for short. A cert is either software or hardware that can authenticate itself using hidden keys that the user cannot even accidentally provide to a hacker.
For example, many smartphones rely on a small SIM card that can be placed in a phone, and the phone will then automatically be linked to the user’s phone number. The user is not asked for a password to enable the phone, instead the SIM card has a “cert” it uses to authenticate to the mobile carrier. However there has to have been a process ahead of time where the SIM card was connected to the user’s mobile account. That process involves humans, and thus is susceptible to social engineering attacks. However those attacks are generally assumed to be harder or more expensive than phishing a user for a password or even a password plus one-time-use code.
So imagine every time you bought a computer, you also plugged something like a SIM card into it, and it immediately had access to your account information without you needing to type a password. That would help protect your account, but ONLY IF you were willing to require that all the machines you tried to use to access your account had such a card. Without that limitation, the hacker could simply use a machine without a card.
Unfortunately there is no global standard for how to plug such a card into computers. A huge number of people and companies have invested in this approach without finding a technique that could be adopted by average consumers. And even if such a technique was found, how would a user log into the web browser on their phone? Would it need both a slot for a SIM card and for this other card?
An alternative approach many industry players have been researching is not quite as secure as always connecting a card with a cert to each phone/computer. Instead, the idea is that a user has at least one device with a cert, and they use it to bootstrap other devices. For example, a user might have either a phone or laptop with a cert. If they bought a new computer at home and needed to login, they would have to perform some action on the phone or laptop to authorize the new device. There are many potential approaches with different levels of security ranging from high security modes like connecting the devices via USB/bluetooth/etc. down to lower security modes by using the phone/laptop to scan a QR code displayed by the new device.
All these schemes tend to involve a lot of complexity, and that tends to lead to significant cost for the company performing the strong authentication. In the enterprise space the employer generally incurs those costs. But in the consumer space it is less clear who would pay. Users are generally not willing to pay, and IDPs are limited in how much they might pay.
If an RP website relies completely on the IDP to handle authentication, then it does not need to know how the user was authenticated. However some websites would like to know, or are legally required to know, how the user was authenticated in a particular browser session. That creates the potential for those RPs to pay the user’s authentication provider to get confirmation of how the user was authenticated. The authentication provider might be the user’s IDP, or might be a specialist authentication provider.
Some of those potential authentication providers are now partnering with popular identity providers using the Relationship Manager model from Chapter 11, in which these companies act as authentication providers instead of always running a full identity provider. To learn more, refer to the OpenAXN working group.
In the previous chapter we discussed using stronger authentication than passwords to authenticate user accounts. However in some cases the thing being authenticated is not a user, but code running on another server that is making an API call. The term “robot” is sometimes used to refer to instances of such code making API calls. We discussed some APIs in Chapter 7: Consuming APIs and in Chapter 10: Exposing APIs. In most cases those types of API calls between systems on the Internet are authenticated with just a password. And like user passwords, those passwords can leak, be phished, be guessed, etc. So what is the “strong authentication” equivalent for robots?
The most common goal is to use a software or hardware cert just like the ones described in Chapter 14. However setting up and managing these certs is a very complex job, and it is very common for them to be deployed in a way that does not provide any stronger security than a password.
One solution is for a company that operates “robots” to have one central identity provider for all of its robots. The “robots” authenticate to that central identity provider, and the corporate identity provider then handles authenticating the robot to APIs running at other companies. There are evolving standards in this area that do not use exactly the same approach as identity providers for end-users, but they use the same underlying OAuth2 standard.
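As a rough sketch of that central identity provider model, the code below has the provider issue a short-lived signed token naming the robot and the API it may call, and has the API verify that token. This is not the wire format of any OAuth2 profile: the token layout, the function names, and the key are all hypothetical, and an HMAC is used only to keep the example within the Python standard library. A real deployment would more likely sign with an asymmetric key so that APIs hold only the public half.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical signing key held by the company's central identity provider.
IDP_KEY = b"demo-only-signing-key"


def issue_robot_token(robot_id, audience, key, ttl=300, now=None):
    """Central identity provider: vouch for a robot toward one API for a short time."""
    now = int(time.time()) if now is None else now
    claims = {"sub": robot_id, "aud": audience, "exp": now + ttl}
    payload = base64.urlsafe_b64encode(
        json.dumps(claims, sort_keys=True).encode()
    ).decode()
    sig = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"


def verify_robot_token(token, audience, key, now=None):
    """API side: return the robot's ID if the token is valid, otherwise None."""
    now = int(time.time()) if now is None else now
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(key, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode(), sig.encode()):
        return None  # not signed by the identity provider we trust
    claims = json.loads(base64.urlsafe_b64decode(payload))
    if claims["aud"] != audience or claims["exp"] <= now:
        return None  # issued for a different API, or already expired
    return claims["sub"]
```

The point of the design is that the individual robot never holds a long-lived password for the remote API. It only proves itself to its own company's identity provider, which can revoke or rotate credentials in one place.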
Another variant is for Cloud PaaS and IaaS providers to offer a central identity provider that can be used by all the “robot” apps they host on behalf of other companies.
While the combination of robots and certs improves security, it also provides one other big potential advantage. Generally before a robot can call an API, it needs to register the password that it will use to authenticate itself to that API endpoint. However when a cert is used, there are techniques that allow that registration to happen automatically.
Imagine for a moment a situation where there are a large number of websites that want to be relying parties to a large number of identity providers. The protocols used for federated login actually require both a method for the identity provider to authenticate the user, AND a method for the identity provider to authenticate the robot at the RP website that uses the protocol. If a password is used for the identity provider to authenticate each RP website, then it requires a unique password to be manually registered for every combination of website and identity provider. That approach would be much too painful for potential RP websites, and would likely lead to them only supporting a few identity providers.
However by using certs, it is possible to avoid that manual registration requirement, and thus make it feasible for websites to become RPs to a large number of identity providers. Today the number of good identity providers is small enough that this is not a big problem. However in the future the industry expects to use this cert technique to grow the ecosystem of RPs and identity providers.
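The scaling difference described in the two paragraphs above can be made concrete with a little arithmetic. The figures below are made up for illustration, and the cert count assumes each RP and each identity provider manages a single key pair that any counterpart can verify, which is a simplification of real deployments.

```python
def manual_password_registrations(num_rps: int, num_idps: int) -> int:
    """One manually registered password per (RP website, identity provider) pair."""
    return num_rps * num_idps


def key_pairs_with_certs(num_rps: int, num_idps: int) -> int:
    """Assumption: each party manages exactly one key pair of its own."""
    return num_rps + num_idps


# Hypothetical ecosystem: 1,000 RP websites and 20 identity providers.
print(manual_password_registrations(1000, 20))  # 20000
print(key_pairs_with_certs(1000, 20))           # 1020
```

The password approach grows with the product of the two populations, while the cert approach grows with their sum, which is why the gap widens as more identity providers appear.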
Hopefully this guide helped you better understand how to manage the user account system of a web site. This document is constantly evolving as best-practices improve, and as new challenges/opportunities arise. You can follow my Google Plus profile to get notices about changes, or about related issues in Internet Identity research.
Hopefully you will suggest ways to improve this guide. If you have questions/suggestions, feel free to contact me through my profile, or open the document in commenting mode and use the Insert-Comments feature to add questions/suggestions directly to the document.