The art of authentication

Moving on in the present series regarding TOSP’s stance on security matters, I would like to stop discussing file access control and process sandboxing for a bit and discuss a somewhat higher-level, but equally important concept: authentication.

For those new to the term, here’s a short reminder: when you log into an operating system, much like on any other computer service, some process occurs where the operating system checks that you are the user who you claim to be. The OS does this by having you prove that you possess a piece of information which only you should know about, such as a password. This process is what is called user authentication.

In this post, I will discuss the various options which the modern world offers for authentication, their respective advantages and drawbacks, and the role which an operating system could play not only for the authentication of its own users, but also for authentication of users to third-party services.

A tour of authentication methods

When discussing authentication, it is worth stressing the point that from the point of view of computers, human beings are really similar to one another. A program can send output to a user, and receive input from said user, but since user input comes in a standardized form, there is no intrinsic way for computer programs to tell users apart.

Building an authentication system involves changing this by having users send a piece of input that only they can produce. Since digital information is trivial to copy, this means that said input must either be considered a secret, or be mathematically derived from some secret information.

The simplest form of authentication available today, passwords, involves having users directly remember a secret piece of information, in the form of a bunch of text called a password or passphrase, and input it on request. A modern variation, with similar strengths and weaknesses, involves having users input some visual information instead, using a mouse or a touchscreen, as in Android’s “pattern lock”. Such direct memorization and subsequent input of secret information by users is the only form of authentication that can be implemented without using any extra computer hardware.

Since users cannot remember very complex information, it is sometimes desirable to offload the storage of authentication secrets to a hardware data storage device. The simplest device that one can envision is your average USB pen drive, which could very well be leveraged for password storage, building the computer equivalent of a door key. Of course, unlike actual door keys, the secret information stored inside of a USB pen drive is very easy to duplicate, which is why extra protection measures should probably be applied instead of using this technique in isolation. I will elaborate on this in a bit.

Moving one step further away from passwords, an alternate approach is to not rely on OS- or user-generated random data for authentication, but instead use some naturally random phenomenon, such as the growth of human fingerprints, the pattern followed by retina blood vessels in one’s eye, or some behavioral characteristic of users such as their typing patterns and mouse motions. The concept of measuring this data and using the resulting information for authentication is called biometrics.

Finally, it may be argued that having users directly transmit secret authentication data to the computer is a security risk. So a last class of authentication methods uses specialized peripherals with some limited information processing abilities, called authentication tokens, to avoid this. These peripherals, ranging from “smart cards” to “secure ID generators”, use fancy mathematics such as public-key cryptography to prove to the OS that they know a piece of secret information without directly transmitting said information. The idealized form of this is what cryptographers call a zero-knowledge proof of knowledge.
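To make the idea of proving knowledge of a secret without transmitting it more concrete, here is a minimal sketch of the Schnorr identification protocol, a textbook proof of knowledge. It is not what any particular token implements. The group parameters below are toy values chosen for readability, and the function names are my own illustrative choices.

```python
import secrets

# Toy group parameters, chosen only so the arithmetic is easy to follow:
# G = 2 generates a subgroup of prime order Q = 11 modulo P = 23.
# Real systems use groups that are hundreds of bits wide.
P, Q, G = 23, 11, 2


def make_keypair():
    """Return (secret, public) where public = G**secret mod P."""
    secret = secrets.randbelow(Q - 1) + 1      # 1 .. Q-1, never leaves the token
    return secret, pow(G, secret, P)


def commit():
    """Token, step 1: pick a fresh nonce and send only its commitment."""
    nonce = secrets.randbelow(Q)
    return nonce, pow(G, nonce, P)


def respond(secret, nonce, challenge):
    """Token, step 3: mask the secret with the nonce before answering."""
    return (nonce + challenge * secret) % Q


def verify(public, commitment, challenge, response):
    """OS, step 4: check the response using public values only."""
    return pow(G, response, P) == (commitment * pow(public, challenge, P)) % P


secret, public = make_keypair()
nonce, commitment = commit()
challenge = secrets.randbelow(Q)               # OS, step 2: random challenge
response = respond(secret, nonce, challenge)
print(verify(public, commitment, challenge, response))   # True
```

The OS only ever sees the public key, the commitment and the response. With parameters this small an attacker could of course just try all ten possible secrets, so the sketch only illustrates the message flow, not a usable security level.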

Meet the vulnerabilities

Now, let’s make a quick psychology experiment. Take an average child, tell him that something is secret, and observe his reaction. Most children will react by either trying to make you tell the secret, or trying to find out about it on their own, and some people will just remain that way as adults. Unfortunately, in the realm of authentication, a nosy someone getting a hold of your authentication secret is called identity theft for good reasons. It is very bad news, because it means that the person can act as you and do things that normally only you could, such as withdrawing all the money from your bank account. So it is to be avoided as much as possible.

A first way that authentication secrets can fall into the wrong hands is by being simple enough for an attacker to guess. Unfortunately, when asked to choose their own passwords, humans tend to use extremely predictable data such as regular sequences of numbers, keyboard rows, or a part of their birth date. And even when people expend a genuine effort at producing convincingly random data, they tend to make it somewhat too short to resist modern password cracking tools. For this reason, a responsible authentication system should either propose a memorizable but secure password to the user, or kindly direct them to a secure passphrase generation method such as Diceware.
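As a rough illustration of why a method like Diceware resists guessing, the sketch below simulates the dice rolls and computes the resulting entropy. Diceware itself is meant to be done with physical dice and a published list of 7776 words. Using Python's `secrets` module in their place, and the `{key: word}` mapping passed to `passphrase`, are assumptions made for the sake of the example.

```python
import math
import secrets

DICE_PER_WORD = 5                      # Diceware looks up each word with 5 dice
LIST_SIZE = 6 ** DICE_PER_WORD         # 7776 entries in a Diceware word list


def roll_key():
    """Simulate five fair dice and return a lookup key such as '41326'."""
    return "".join(str(secrets.randbelow(6) + 1) for _ in range(DICE_PER_WORD))


def passphrase(wordlist, n_words=6):
    """Build a passphrase from a {key: word} mapping (loading it is not shown)."""
    return " ".join(wordlist[roll_key()] for _ in range(n_words))


def entropy_bits(n_words):
    """Entropy of n independently chosen words, in bits."""
    return n_words * math.log2(LIST_SIZE)


print(round(entropy_bits(6), 1))       # 77.5
```

Each word contributes about 12.9 bits, so the strength comes from the number of words and the randomness of the rolls, not from the words being obscure.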

Another unfortunately simple and common way to weaken authentication security is to share authentication secrets between multiple services. If one uses a given password for their e-mail account and their online banking account, for example, they implicitly give their bank access to their e-mail account and, more worryingly, their e-mail provider access to their bank account. Even assuming perfectly honest service providers, doing so still means that a breach of security somewhere can collapse into a more global security problem for someone. So protecting multiple services with a single authentication secret is dangerous, and to be avoided as well.

It’s actually worth pausing for a moment and considering the implications of this fact. The problem of authentication secret reuse also occurs when multiple authentication secrets are stored on the same hardware device, and when multiple passwords are managed by the same password management software. It also applies when “single sign on” services, such as OpenID or Facebook Connect, are used. In today’s world, where computer users need many more authentication secrets than they can realistically remember, it is actually very hard to avoid creating a single point of failure in one’s authentication methods. I have yet to find a way to fully achieve this goal myself.

The problem of shared authentication secrets also applies, in a particularly disturbing manner, to biometrics. There is only so much biometric data that a computer can measure, and since the most reliable forms of biometrics rely on specialized, relatively expensive input devices, it is unlikely that a given user will have access to more than one form of it. So it is effectively impossible to achieve secure authentication to more than a couple of services using biometrics in isolation. This means that biometrics cannot directly replace passwords without being combined with other authentication techniques.

Unfortunately, this is not just a problem with biometrics. Users cannot remember many secure passwords either. All mitigation techniques against this problem involve protecting multiple authentication secrets with a single password, which as discussed above creates a major security vulnerability in the form of a single point of failure in the authentication system.

Another bunch of vulnerabilities appears in authentication methods where secret data is transmitted to and processed by software, as is the case for passwords, external storage, and biometric data. Unless it is running on dedicated hardware with simplistic internals and a limited, well-defined interface, software cannot be made perfectly robust against exploits. This means that it should be possible to change any authentication secret that ends up hosted by software running on general-purpose hardware at some point. This is not possible with biometrics, in what is perhaps the greatest failing of the concept.

It is not just software that is vulnerable to privacy breaches, either. Hardware can sometimes leak all the cleartext information that passes through it. A textbook example of this for passwords is keyloggers, which memorize all the input typed on a keyboard using a small device that is secretly hidden either in the back of the computer, or within the keyboard itself. These are now available for a couple of bucks. If dedicated authentication hardware became mainstream, it is likely that specialized hardware exploits sniffing their own bus traffic would become mainstream as well. For this reason, any form of authentication hardware which sends secret data over cleartext hardware buses should be frowned upon by security-conscious persons.

It’s not just software and hardware that can be compromised when secrets are transmitted around. Yet another vulnerability is phishing, in which human users are lured into inputting authentication secrets into a form that looks like the real system being authenticated to, but is actually under the control of an attacker. Phishing is a growing problem online, but can actually be applied in any situation where log-in windows are not special and a malicious program can display a perfect clone of them. The best defense found so far is to have log-in forms filled not by humans, but by software which can better discriminate the fake from the real. However, this involves again entrusting one single piece of software with many authentication credentials, which as mentioned before is a security issue in and of itself.

Finally, even though dedicated authentication tokens can be made highly resilient to many of the aforementioned vulnerabilities, they still are vulnerable to being stolen (causing identity theft if they are the only form of authentication), broken or lost (causing loss of service availability if they are not replaceable). Again, door key analogies turn out to hold very well here. In such events, whether authentication tokens are easy to duplicate or not turns out to be mostly irrelevant. Once an authentication token leaves the pocket of its rightful owner, it can be misused for identity theft, and so it should be possible to immediately disable it without any loss of service availability.

An authentication proposal for TOSP

As discussed above, all authentication methods which are available today are flawed in some way, although some are more flawed than others. Methods which involve transmitting a secret directly to the entity being authenticated to are all vulnerable to hardware and software compromises of some sort, and whenever another device is added to waive this restriction, there is always a risk of said device being lost or stolen. Moreover, the problem of authentication secret reuse is serious, and to date has yet to find a good resolution.

I don’t think this problem will be solved anytime in the near future, because it would involve somehow modifying human beings so that they can all prove that they know multiple secrets, one per digital service they could authenticate to, without actually passing the secret around, and using inexpensive hardware that is difficult to lose or steal. Perhaps one day, we will do that using some sort of biological engineering or NFC smart card implant. But for now, it appears that public opinion would not be favorable to such authentication techniques becoming mainstream.

Consequently, security best practices have instead moved in the direction of suggesting that people combine multiple methods of authentication to access a system. The core idea is that if only one of the authentication methods is compromised, it does not yet represent an identity theft, and the faulty authentication secret can be safely replaced, foiling the evil plans of the attacker whose stolen secret is now useless. Of course, this assumes that authentication secret compromises can be detected, which may be a bit optimistic. But so far, I believe it’s the best that we have got.

At the risk of hurting the sensitivity of people who have spent years researching biometrics, it seems to me that the underlying concept is somewhat flawed. After all, practical biometrics are about always using the same authentication secret, one that you cannot change and that you display publicly to the world around you in cleartext form, to authenticate to a wide range of services by just passing it around through vulnerable software. So far, biometrics research has failed to answer the most salient question of what will happen if it begins to be used massively, attackers get interested in it, and this results in a massive breach of biometric data from vulnerable software.

Some people from the area of biometrics will counter that biometric data cannot be copied. I believe that this idea results from a fundamental misunderstanding of how digital information and physical measurements work. While the exact physical data that is at the root of biometrics may be indeed hard to reproduce, all biometric systems have to measure a simplified form of it in order to operate. And the authentication secret that is used in biometrics is this simplified data, not the underlying physical signal. If this data, which must itself be easy to copy for the biometric system to operate, is compromised, then it makes no difference from the case where the biometric data itself is compromised, because the biometric authentication system cannot tell the difference.

To the best of my understanding, this flaw lies in the fundamental concept of biometrics, and thus cannot be fixed. But I will gladly challenge the biometrics community to prove me wrong. In the meantime, however, I believe that the use of biometrics cannot be recommended for general use, outside of very controlled contexts.

For local authentication that can work when network connectivity is low or nonexistent, which operating systems require, this leaves us with secret data that is stored on an external computer peripheral (tokens) and in the user’s mind (passwords). This combination of “something you know” and “something you have” is also, unsurprisingly, one of the most developed forms of two-factor authentication available today, and it is what I will focus on in the remainder of this post.

Ideally, as mentioned above, the physical authentication token should not transmit its secret to the computer directly, nor send it to the operating system in encrypted form, because then this secret would become vulnerable to the same software compromises as the password, and this would reduce the effectiveness of multiple authentication factors. Instead, zero-knowledge protocols should be used. Thankfully, devices allowing for this are finally becoming mainstream, as the FIDO U2F protocol, which is one way to do this, is becoming standardized and supported by large companies such as Google or Microsoft.

A practical point to keep in mind is that hardware authentication tokens, even in the absence of compromise, will ultimately break and can be lost. Much like door keys, one should thus always have two of them, the second of which is kept in a place considered safe from attackers. While these tokens are somewhat expensive today, making such duplication cumbersome, one can only hope that in the future, as the technology is used more widely, prices will drop quite a bit. Today, one can already buy FIDO U2F tokens for around $18, making them comparable to high-end door keys in price.
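The bookkeeping that makes a spare token useful can be sketched in a few lines. This is only an illustration of the "two door keys" idea, under the assumption that each token is known to the OS by some identifier (in practice this would be its registered public key). The `TokenRegistry` class and the token names are hypothetical.

```python
class TokenRegistry:
    """Tracks which hardware tokens may authenticate one account."""

    def __init__(self):
        self._active = set()

    def enroll(self, token_id):
        self._active.add(token_id)

    def revoke(self, token_id):
        """Disable a lost or stolen token; other tokens keep working."""
        self._active.discard(token_id)

    def is_accepted(self, token_id):
        return token_id in self._active


registry = TokenRegistry()
registry.enroll("everyday-token")
registry.enroll("backup-token")        # kept in a safe place

registry.revoke("everyday-token")      # reported lost
print(registry.is_accepted("everyday-token"))   # False
print(registry.is_accepted("backup-token"))     # True
```

Because the two tokens are enrolled independently rather than being copies of one another, disabling the lost one does not lock the user out.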

Finally, for situations where specialized hardware authentication tokens are not available or not trusted by the user, a weaker form of hardware authentication that may be used is to store secondary authentication credentials on a regular storage drive, in encrypted form, possibly using an encryption passphrase that is directly derived from the authentication password. This solution obviously has a couple more vulnerabilities, such as again software compromises, but it has the advantage that it can be made to rely solely on off-the-shelf storage hardware and open-source software.
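Deriving the encryption key from the authentication password could look roughly like the sketch below, which uses PBKDF2 from Python's standard library. The iteration count, the purpose labels and the function names are assumptions made for illustration. The encryption step itself is left out, since it would need an authenticated cipher from a third-party library.

```python
import hashlib
import secrets

ITERATIONS = 600_000    # assumed work factor; tune for the target hardware


def new_salt():
    """Random 16-byte salt, stored in cleartext next to the encrypted file."""
    return secrets.token_bytes(16)


def derive_key(password, salt, purpose, iterations=ITERATIONS):
    """Stretch a password into a 32-byte key bound to a purpose label."""
    # The label is prepended to the fixed-length salt, so keys derived for
    # different purposes differ even though the password is the same.
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), purpose + salt, iterations, dklen=32
    )


salt = new_salt()
login_check = derive_key("correct horse", salt, b"login")
storage_key = derive_key("correct horse", salt, b"storage")
print(login_check != storage_key)      # True
```

Using distinct labels means that whatever value the OS stores to check the login password does not double as the key protecting the credentials on the drive.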

Authentication scalability issues

As mentioned above, building a good authentication system for one single service is one thing, but building one that scales up to the dozens of services that computer users interact with these days is another matter entirely. As of today, computer users cannot be expected to remember a lot of strong passwords, or carry around lots of hardware authentication tokens. So even if it creates a single point of failure, which is bad for security, the use of “single sign on” services that protect multiple authentication secrets with a single one appears necessary today.

My question, then, will be whether it would be worthwhile to integrate such a functionality into personal computer operating systems.

I think that it would be, because the alternative is for users to use third-party software that runs on top of the operating system. Either way, users must rely on their operating system to protect their secret credential data. It thus appears needlessly risky to have users also rely on yet another third-party piece of software, especially for something as critical as protecting their digital identity.

This does mean, however, that this part of the operating system must be designed and implemented with even more care than usual, in a way that matches the best that third-party software can do in terms of information security. For example, software interfaces must be as immune as feasible to the confused deputy problem, implementation should be written in secure programming languages that are immune to buffer overflow vulnerabilities (to avoid the kind of credential leakage that happened with Heartbleed), and design and code reviews must be conducted by as many independent actors as possible.

Because hardware compromises cannot be fully protected against, the possibility that a breach of the single sign-on system can occur would also have to be accounted for. The design of such a single sign-on system should not only include threat modeling, but also some level of prior planning and user education allowing for responsible action to be taken if and when breaches occur.

In order for users to actually want it, such a system should also have state-of-the-art usability, including being transparently integrated in web browsers and any other user software that requires or performs authentication.

In short, it can be done, and in my view it should be done… But it would also be some serious piece of work ;)

Conclusion

Like many computer-based services, operating systems need to verify that the human operating them is indeed the legitimate user that he claims to be. It has to happen, at the very least, in two scenarios: when a computer is turned on by an unknown user, and when it is left powered on but in a “locked screen” state by the user as some computation or network operation is being performed.

Since all that OSs know about users is the output they send to them and the input that they reply with, checking their identity requires the existence of some secret piece of information that only the legit user has access to, and whose possession the operating system can query whenever it needs to. This process is called authentication.

Many methods of authentication have been used over the ages, with various advantages and drawbacks. Passwords are an old-time classic and require minimal setup effort, but their large number of vulnerabilities makes them somewhat unsafe to use in isolation. Biometrics have an undeniable “cool” and practical feel to them, but so far their security does not seem to live up to their promises. Dedicated hardware authentication tokens can be used to reach higher levels of security, but like door keys, they are vulnerable to breakage, loss, and theft, requiring duplication. All these methods have difficulties scaling to the many services that computer users log into these days.

In the end, all things considered, it appears to me that the best option for OS authentication is to use a combination of a master password, a pair of hardware authentication tokens (one of which is kept in a safe place), and the provision of some single sign-on service allowing the OS to log the user into extra services should he so desire. Of course, said single sign-on service should match the state of the art in security software in order to be useful. The added value is that it does not require users to trust two pieces of software in order to manage their digital identity, the OS and a third-party password manager, when only one suffices.

3 thoughts on “The art of authentication”

  1. Tom Novelli May 5, 2015 / 7:59 pm

    I think the problem is that most people (including in IT and standards bodies) don’t care. They’re too busy wanking around with HTTPS to come up with a simple solution for authentication.

    If browsers and websites supported SSH keys, we could use them instead of password managers. One key protected by a weak password for stupid websites, another key (stronger pw) for email etc, and maybe one more for banking (along with a second factor). The second factor could be as simple as a second key on a more secure device (like a super-secure non-“smart” phone).

    Biometrics could be ok as shortcuts for low/medium security access, e.g. screen unlocking.

    I don’t like hardware tokens, except maybe for high security purposes, but loss/failure tends to be just as big a risk as hacking. Also, when you can’t trust your hardware or software, adding another hardware+software device isn’t reassuring.

  2. Hadrien May 5, 2015 / 9:32 pm

    I agree that more authentication based on public key cryptography would be nice, both for security and usability. Especially considering the bunch of SSL implementation breaches that have occurred recently, putting the safety of directly transmitting authentication secrets, even via HTTPS, into question.

    Also, I feel VERY uneasy with respect to the public key infrastructure used by HTTPS. Actually, I have a rant about it planned in one of my future posts. There are just too many CAs which can sign SSL certificates for the model of third-party signature to remain realistically secure. And as the breaches at Comodo and Diginotar highlighted, the black hats are very interested in getting their own MITM certificates (when Lenovo’s not giving some to them for free). Simultaneously, getting SSL certificates signed by CAs for money also has much of a protection racket feel to it. In the end I would much prefer something more decentralized that doesn’t feature godlike authorities, like PGP’s web of trust model.

    Regarding the use of biometrics for low-security logins, I guess I could agree with that, although I’m not entirely sold on the idea that they represent a worthwhile improvement over the currently established mechanisms in this area, such as 4-digit PINs. I guess my worries are twofold. First, I think that one should not neglect lock screen security, because we never know what hides behind a lock screen. The lock screen can very well be the only protection of something like a root shell or an opened online banking application. Second, I’m worried about the false sense of security that biometrics can give: movies present them like some sort of super-high-tech, super-reliable authentication, and it’s that view that users are likely to keep, whereas the truth is a lot more nuanced than that… It’s really just a sort of long password that users won’t be able to change once it is compromised.

    As for hardware tokens, I actually kind of like them myself, as I guess the tone of this post can somewhat suggest, because…

    • With redundancy and some prior planning, loss/failure is not that big of a problem, see door keys
    • Most of them are designed to be resilient to physical access and cold boot attacks, unlike general-purpose computer hardware where physical access is usually a security game over
    • As the device is single-purpose, the hardware interface can be extremely narrow, limiting the available avenues for exploits
    • As a hardware device, the token only interacts directly with the OS drivers, unlike software which can usually be directly contacted by untrusted user-mode processes
    • The device has no direct network connection, so even if an exploit succeeds, actually getting the secrets out of it is harder
    • Devices which only do one thing are usually better engineered than devices which do many things

    Then again, I also agree that the solution is still not ideal, because hardware costs money, having to plug it in is needlessly cumbersome, and since these tokens are usually closed-source, one cannot rule out the possibility of a backdoor such as a faulty key generator.

  3. Tom Novelli May 5, 2015 / 11:12 pm

    Yep! I’m all for well-engineered, single-purpose, verifiable, open-source hardware. Meanwhile, without that, it’s pretty silly to use computers for anything where security really matters.

    You don’t need to rant about the CA infrastructure now – even the HTTPS-everywhere weenies aren’t defending it anymore. They’re saying “opportunistic encryption” will be the answer. Only problem is it won’t be supported for 5-10 years and HTTPS still isn’t end-to-end encryption. :D
