Imagine that one day, as you arrive at the office, you find a mysterious muscular man at the entrance, performing something that looks like an ID check. Mumbling about this world’s pandemic tendency towards security psychosis, you search through your bag for your ID card or driver’s license. But when you show it to him, he says that he does not recognize it as valid.
Instead, he directs you towards the services of Don Corleone, Inc., which sells *certified* IDs (basically a cheap copy of your ID with a shiny stamp on it) for a hefty price, and says that he won’t let you enter without one of these.
Obviously, this scenario would be illegal in most countries, and you would be within your rights to call the police. The state is the only legitimate authority trusted with the power to issue ID documents, and when it is democratically elected and subjected to the scrutiny of millions of citizens, one can expect that it won’t abuse its ID-issuing powers for fun and profit.
Yet for some reason, this is exactly the kind of protection racket that we deal with daily as we connect to the world wide web using HTTPS and have to interact with the current-generation Public Key Infrastructure (PKI), which is based on the concept of Certification Authorities (CAs). And as I will elaborate, it gets worse. But first, let’s discuss why we are putting up with that.
The need for public-key infrastructures
The existence of certification authorities originates from the need for computer users to assess the authenticity of software resources, such as websites or software binaries, which are accessed over a network. Indeed, when these resources are not authenticated, nothing prevents an attacker with access to your network connection from performing a man-in-the-middle (MitM) attack and impersonating the service which you try to connect to, with potentially disastrous results.
To keep things clear, I will focus in the remainder of this post on the authentication of pieces of data, which is called message authentication. Since all computer network communications are based on passing around pieces of data, this restriction comes with no loss of generality, but it will help keep the wording clearer.
Message authentication, like any other form of authentication, may only be achieved through the use of some secret information. Indeed, there is nothing that looks more like a set of ones and zeroes than another set of ones and zeroes, unless it has some kind of special flavor to it which only one computer in the world can produce.
Also, resource authentication comes with an extra caveat compared to user authentication: the person sending the message (i.e. the service provider or software provider) cannot trust their customers to behave nicely. This precludes the use of any form of authentication based on shared secrets, such as passwords, and mandates the use of protocols based on public key cryptography instead. Otherwise, users with knowledge of the authentication secret could very easily impersonate the service and carry out MitM attacks on other users, which is obviously not a desirable outcome.
Many so-called digital signature schemes have been devised to authenticate messages using public-key cryptography. All of them follow the same basic pattern:
- Extract a manageable fingerprint of the message by hashing it
- Perform various tweaks on the resulting hash, such as padding it and mixing in random salt (as RSASSA-PSS does), in order to thwart a number of subtle attacks
- Encrypt the result with the sender’s private key, using a well-chosen asymmetric cipher.
Provided that the underlying hash function and asymmetric cipher are cryptographically secure, and that the hash-tweaking step is performed correctly, this is a provably secure way to authenticate a digital message for anyone who possesses the public key associated with the private key used to encrypt the hash. Indeed, it is then not possible for an attacker to modify the message without significantly changing its hash, nor is it possible to subsequently generate a valid signature from the new hash, as the private key required for this purpose is unknown.
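As a concrete illustration, the hash-then-“encrypt” steps above can be sketched with deliberately tiny, textbook RSA. The primes, exponent and absence of padding below are toy assumptions chosen for readability; a real scheme would use 2048-bit keys and a padded construction such as RSASSA-PSS.

```python
import hashlib

# Toy textbook-RSA parameters -- FOR ILLUSTRATION ONLY, trivially breakable
p, q = 1000003, 1000033            # small primes (real keys: ~2048 bits)
n = p * q                          # public modulus
e = 65537                          # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)

def sign(message: bytes) -> int:
    # 1. Extract a manageable fingerprint of the message by hashing it
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    # 2. (The padding/salting step is omitted in this toy)
    # 3. "Encrypt" the fingerprint with the private key
    return pow(h, d, n)

def verify(message: bytes, signature: int) -> bool:
    h = int.from_bytes(hashlib.sha256(message).digest(), "big") % n
    # Decrypting the signature with the public key must give the hash back
    return pow(signature, e, n) == h

sig = sign(b"pay Alice 10 euros")
assert verify(b"pay Alice 10 euros", sig)        # genuine message accepted
assert not verify(b"pay Mallory 10 euros", sig)  # tampered message rejected
```

Anyone holding `(n, e)` can check the signature, but only the holder of `d` can produce it, which is exactly the asymmetry the scheme relies on.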
Unfortunately, in the general case, the message recipient does not know the sender’s public key in advance. And therein lies a potential for man-in-the-middle attacks. If the public key is sent directly as cleartext, nothing prevents an attacker with good enough access to the network from intercepting it, sending his own public key to the user instead, and then intercepting and modifying all subsequent communications, like men in the middle do.
Technological solutions to this problem are called public key infrastructures (PKIs), and are all based on the observation that if the public key of one trusted person makes its way safely to the user, then that person is subsequently able to digitally sign a message containing the public key of someone else, alongside a statement that this key does belong to the right person. This process is called certification.
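In code, certification boils down to signing a (name, public key) statement with an already-trusted key. Here is a minimal sketch in the same toy textbook-RSA spirit; the names, primes and message layout are illustrative assumptions, not any real certificate format.

```python
import hashlib

def keypair(p, q, e=65537):
    """Toy textbook-RSA key pair -- illustration only, trivially breakable."""
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))  # (public, private)

def digest(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(priv, data: bytes) -> int:
    n, d = priv
    return pow(digest(data, n), d, n)

def verify(pub, data: bytes, sig: int) -> bool:
    n, e = pub
    return pow(sig, e, n) == digest(data, n)

ca_pub, ca_priv = keypair(1000003, 1000033)      # key users already trust
site_pub, site_priv = keypair(1000037, 1000039)  # key nobody knows yet

# The certificate: "this key belongs to example.org", signed by the trusted key
statement = f"example.org:{site_pub}".encode()
certificate = (statement, sign(ca_priv, statement))

# A recipient who only knows ca_pub can now safely start trusting site_pub
stmt, cert_sig = certificate
assert verify(ca_pub, stmt, cert_sig)
```

The point is that only `ca_pub` needs to reach the user safely; every other key can then be vouched for over an untrusted channel.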
Where the various forms of public key infrastructure differ is in which persons are trusted to issue certificates, for which purposes, which personally identifying information is used, and how the resulting certificates are transmitted.
The certification authority concept
Now, to understand how the dominant Certification Authority (CA) model came to power, let us go back in time to somewhere in the 80s and 90s, back when personal computers were still a niche business and the World Wide Web as we know it was still in the process of being invented.
At the time, there were only a small number of entities which were both capable of producing digital signatures and trusted by users. These included government authorities, OS developers, computer OEMs, and some specialized companies in the banking, insurance, and legal business.
In this situation, it seemed reasonable to go at the PKI problem in the following way:
- Every company which is blessed with general user trust calls itself a Certification Authority and gets its public key embedded in all popular operating systems and web browsers
- OSs and web browsers are then configured to consider anything that a CA says as absolute truth
- Newly minted CAs begin to sell key-signing services to other companies and individuals who aren’t as lucky, making their customers’ keys as trusted as the CA itself by issuing certificates attesting to their authenticity
This business model proved quite successful in practice, and managed to hold as the personal computer and Web service market grew beyond any expectation. Soon, other companies smelled the scent of easy money in the air and joined the CA business, to the point where today, a typical web browser will trust about a hundred CAs just for website authentication.
Over time, the power of CAs has remained boundless. No one ever succeeded at introducing a policy limiting what a CA can certify to a reasonable extent, nor at holding CAs accountable for most of their actions. And as the growing threat of man-in-the-middle attacks was perpetually invoked as a universal bogeyman against people opposing CAs’ practices, we recently reached the point where it is nearly impossible to maintain a commercial website or distribute software without buying into the CA mafia’s protection racket. After all, no one wants their users to be scared away from their product by security warnings.
When the model breaks down
As of today, it seems that the certification authority model is gradually reaching its breaking point, as it proves increasingly unable to hide its two most basic flaws. The first one originates from the fact that there are too many CAs, and the second from the fact that there are too few, highlighting that there is no right number of CAs in the current design, and thus that the concept of a universal certification authority is itself fundamentally broken.
First, why are there too many CAs? Well, as mentioned above, we have a few hundred companies around, selling certification services to various computer-related businesses. They need to certify millions of services, for the benefit of billions of users. At this volume of requests, corners will be cut on quality, errors are bound to occur, and the more companies there are, the higher the chance that one of them will ultimately fail to secure its universally trusted, godlike private key appropriately.
Two examples of such a CA security breach have already been disclosed publicly. The first one occurred in March 2011, when certification authority Comodo was found to have issued rogue certificates allowing one to carry out man-in-the-middle attacks on Google, GMail, Yahoo, Skype, Mozilla, and Windows Live. The certificates were found to originate from the compromise of a reseller’s account, before the company’s PR department went on to distract the general public with allegations about the attack originating from Iran. As no key leak occurred and the Iran PR stunt worked, the company was able to revoke the certificates and silently resume its operations, without any significant action taken.
DigiNotar weren’t so lucky. When rogue Google and Yahoo certificates were issued in their name in July 2011, they were unable to convince their angry clients, who notably included the Dutch government, that their certification infrastructure was safe and that their signing key wasn’t compromised. This led all web browsers and operating systems which previously trusted that certification authority to reject it in a software update, effectively putting the company out of business.
With these incidents in mind, one should indeed be critical of more CAs entering what is an already very crowded business, as it creates more points of failure in an already fragile infrastructure. Yet simultaneously, we can also observe evidence that there are not enough CAs around, considering that companies in a CA position will gladly laugh at the laws of market economics by merrily abusing their privileged position for fun and profit, as no one is really able to displace them and propose a better alternative.
Consider, for example, recent headlines regarding computer OEM Lenovo. Like all PC OEMs, it derives a significant fraction of its income from installing various software on the machines it sells in exchange for financial compensation. One of them, Superfish, made the headlines this year because of the way it enrolled a custom root CA into the computer so as to perform a man-in-the-middle attack on users’ HTTPS traffic for targeted advertising purposes. As the associated private key was inadequately secured, it was promptly extracted by crackers, who will now routinely use it to perform MitM attacks on any Lenovo user who doesn’t remove the software.
Consider, also, the practices of Apple regarding software targeting its operating systems iOS and OSX. Apple are the de facto certification authority for signed software on their platforms, and made such signing mandatory on iOS and hard to get around on OSX. Since they have an effective monopoly on signing, they were able to set absolutely egregious conditions in exchange for the privilege of getting one’s software signed for their platform, ranging from the mandatory purchase of a Mac to the payment of a $100 yearly fee and a 30% share of all iOS software sales.
And finally, for an example that hits even closer to home, consider locked bootloaders on emerging mobile platforms, and their Secure Boot cousin in the UEFI standard. In these designs, computer users are effectively prevented from installing the software they want on general-purpose hardware they own, unless they know about some optional, hardware-specific, and ill-documented bypass procedure. Their computer is effectively taken over by a rogue CA over which they have no control. The fix proposed by Microsoft? Have their competitors in the OS business rely on a digital certificate signed by Microsoft in order to keep their software working without hassle on the average user’s machine.
Some would say that in the face of these events, we need to get tighter control on certification authorities through things like public security audits and regulations that restrict their absurd amount of power. Myself, I would rather question whether the CA model really suits today’s personal computer market at all.
Outgrowing certification authorities
Indeed, the reliance on a small mafia of godlike, occasionally abusive and largely illegitimate certification authorities is not an absolute prerequisite for the deployment of a public key infrastructure. It was only necessary back when there were only a few personal computer users, who didn’t know each other.
The tightly interconnected network of personal computers that we have today offers more decentralized alternatives to CAs, with less problematic politics. One of them actually isn’t new, though it remained in obscurity for most of its existence, and that is the idea of a web of trust (WoT), introduced by email encryption software PGP in the early 90s.
The theoretical basis of webs of trust is as follows: as a computer user, you know other computer users, who themselves know other computer users, and all things considered, you can find a link between yourself and any other person in the world by following only about half a dozen direct links. This is known as the “small-world problem” in social sciences.
The web of trust concept suggests that people encode that graph of trust relationships in a digital form, by proceeding as follows. First, everyone gets themselves a public/private key pair suitable for digital signature purposes. Next, they use their signing key to certify that the public key of every other person they know does belong to them. One way to do this, for example, is for them to write a signed message which features both that public key, and a human-readable identifier of the person the key is certified to belong to, such as an e-mail address in PGP. Over time, this will result in people across the world building a highly redundant (and thus liar-tolerant), decentralized certification database.
This collective knowledge may subsequently be shared from one user of the web of trust to another, either by having a public key owner provide the digital signatures of others alongside his key to prove his identity (which has the drawback of being somewhat wasteful in terms of bandwidth), or by publishing it on publicly available databases called “key servers”.
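The resulting certification database is essentially a directed graph of “signer vouches for subject” edges, and deciding whether to accept a stranger’s key amounts to finding a chain of certifications from yourself to them. A minimal sketch, with plain names standing in for already-verified (identity, public key) pairs:

```python
from collections import deque

# Each entry means "signer has certified subject's key"; in a real web of
# trust every edge would be a digital signature, not a plain string.
certifications = {
    "me":    ["alice", "bob"],
    "alice": ["carol"],
    "bob":   ["carol", "dave"],
    "carol": ["erin"],
}

def trusted(start: str, target: str, max_hops: int = 5) -> bool:
    """Accept target's key if a chain of certifications links start to it."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        person, hops = queue.popleft()
        if person == target:
            return True
        if hops < max_hops:
            for nxt in certifications.get(person, []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, hops + 1))
    return False

assert trusted("me", "erin")         # me -> alice -> carol -> erin
assert not trusted("me", "mallory")  # nobody has certified mallory
```

Because the graph is highly redundant, several independent chains usually exist, which is what makes the model tolerant to the occasional liar.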
WoT implementation considerations
Since it is computationally infeasible to forge digital signatures, the key server does not need to be trusted very deeply (it only needs to guarantee availability, not integrity), which means that it can be located on any public web hosting service with decent monthly bandwidth, or even on a peer-to-peer network. However, a problem with public key servers is that they centralize in one place a huge database of relationships between many people online, which can be a privacy concern. Since these relationships are at the heart of the web of trust model, there is no clear fix for this.
Also, the centralized nature of key servers makes them vulnerable to attacks such as distributed denial of service, though this can be fixed by using peer-to-peer data stores, in civilized countries where ISPs won’t block them.
Finally, accessing key servers requires a functional network connection, which was fine for PGP’s e-mail use case, but may not be for more general uses. A partial fix would be for computers to keep a local copy of the key server’s database, regularly kept in sync with the remote one as part of system updates, or more generally whenever a network connection is available. But depending on the database’s size, this may not always be practical.
Another issue is that the line between trusted and untrusted is much fuzzier between real-world persons than between the idealized actors in a cryptographer’s mind. To account for this, real-world implementations of webs of trust like GnuPG’s will let users express variable amounts of trust in the public key that they are signing, and transmit that information as part of the certification. Depending on that “trust level”, they will then require a variable number of certificates in order to fully trust a key.
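GnuPG’s default policy illustrates the idea: a key is considered valid if it carries a signature from one fully trusted introducer, or from three marginally trusted ones. A sketch of that rule:

```python
# GnuPG-style key validation defaults: 1 "full" or 3 "marginal" signatures
# make a key valid (the thresholds are configurable in the real tool).
FULL_NEEDED, MARGINAL_NEEDED = 1, 3

def key_valid(signers, trust_db):
    """Decide key validity from who signed it and how much we trust them."""
    full = sum(1 for s in signers if trust_db.get(s) == "full")
    marginal = sum(1 for s in signers if trust_db.get(s) == "marginal")
    return full >= FULL_NEEDED or marginal >= MARGINAL_NEEDED

trust_db = {"alice": "full", "bob": "marginal",
            "carol": "marginal", "dave": "marginal"}

assert key_valid(["alice"], trust_db)                 # one fully trusted signer
assert key_valid(["bob", "carol", "dave"], trust_db)  # three marginal signers
assert not key_valid(["bob", "carol"], trust_db)      # not enough evidence
```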
Most of these issues are either somewhat minor or already addressed today. However, one unsolved problem with all decentralized systems like webs of trust is that they require many users to work well, which is at odds with their current relative obscurity. So in the early days, a centralized fallback is desirable.
This could take the form of the OS developers taking the role of an interim certification authority until a critical mass has been reached by the web of trust. The justification for this would be that users have to trust their OS in pretty much any task that they perform on a computer where it is installed anyway.
The identifier problem
Finally, let us note that the choice of identifier used to authenticate people is also very important. Ideally, it should be as close as possible to the problem domain that requires authentication, like PGP’s email-based authentication, or TLS’ domain-based website authentication. But sometimes, one may wish that multiple signing keys used by a single person be traceable back to a unique “master” public/private key pair, which represents that person’s digital identity. In this case, there should be a way to associate that master key pair with the identity of the person owning it.
For example, a developer may wish to provide users with a way to trust all of their digitally signed software at once, while still keeping all the benefits of using one unique signing key per software.
Since digital certificates are long-lived and distributed worldwide, using people’s full names as identifiers turns out to be insufficient. The reason is that names can change, be duplicated (for example, according to the French phonebook, there are 32 people called “Michel Petit” in Paris), and may not be readable by part of the target audience. I for one cannot read Mandarin Chinese names, and I’m pretty sure that a vast part of the world has the same thing to say about my French name.
X.509’s take on this was to use extra physical location information such as addresses, company names and phone numbers. But for non-corporate use, these identifiers are to be strictly avoided, first because they are *guaranteed* to change, and second because disclosing them is a significant privacy risk. If you need to be convinced of the latter, just look up on the Web what happens when 4chan gets its hands on the address of someone it has an axe to grind with.
A more promising track is perhaps to use people’s handles on online communication networks, such as the URL of their website, one or more of their e-mail addresses, or their nickname on chat services such as Skype and WhatsApp. The rationale for this is that on online services, people can have as many accounts as they want, close (most of) them whenever they need, and block unwanted communications with reasonable ease. Online identifiers thus provide a much better protection against the threat of online bullies than information about someone’s physical location, or some other identifier that is hard to change such as a phone number.
This could be put together by proposing a flexible identity certification format, in which public key owners could specify as many fields as they feel comfortable with, and people signing the certificate could single out the fields which they personally know about. For example, a colleague could certify the authenticity of someone’s professional e-mail address, but not that of a Skype handle which he has never used to communicate with that person.
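Such a format could look like the following sketch. It is a hypothetical design of this post, not an existing standard, and the actual signing step is abbreviated to a bare digest for brevity:

```python
import hashlib

# Hypothetical per-field identity certificate: the key owner publishes a
# set of identity fields, and each endorser vouches only for the fields
# they personally know about.
identity = {
    "email_work": "j.doe@example.com",
    "website":    "https://example.org",
    "skype":      "jdoe42",
}

def field_digest(field: str, value: str) -> str:
    # Each field is hashed separately so it can be endorsed separately
    return hashlib.sha256(f"{field}={value}".encode()).hexdigest()

# A colleague endorses only the professional e-mail address; here the
# "signature" is just the digest, standing in for a real signing step.
endorsement = {
    "signer": "colleague",
    "fields": {"email_work": field_digest("email_work",
                                          identity["email_work"])},
}

def endorsed_fields(identity, endorsement):
    """Return the identity fields whose endorsement actually checks out."""
    return [f for f, d in endorsement["fields"].items()
            if field_digest(f, identity[f]) == d]

assert endorsed_fields(identity, endorsement) == ["email_work"]
```

A verifier can then aggregate such partial endorsements from many signers, accepting each field of the identity only once enough people have vouched for it.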
Conclusions
There is value in having a way to trust, with reasonable confidence, that a computer message or service does come from the person it claims to come from, without directly knowing that person yourself. Applications include pretty much any digital activity which requires trust, from online shopping to software installation and updating.
Modern cryptography allows for this when the person sending the message is known, using digital signature schemes based on public-key cryptography. But when the persons exchanging a message do not know each other, one needs to ensure that on first contact, the public key of the message’s sender reaches its recipient safely. This is the purpose of public key infrastructures, which work by having some previously known people attest to the authenticity of the public key of other people, in a process called certification.
Today, most PKIs are based on the existence of a number of universally trusted third parties called certification authorities. There are hundreds of them around, specialized in various realms, and this is an uncomfortable number: too many to offer good protection against security breaches, and too few to prevent CAs from gaining excessive political power and trying to abuse it.
An alternative is offered by the web of trust model, as originally introduced in email encryption software PGP, which digitally encodes existing trust relationships between people to build a decentralized certification database.
Practical issues to consider when building a web of trust include the protection of people’s privacy, the relative reliance of that model on network connectivity and a working key server infrastructure, and the choice of identifiers used to authenticate people. Nevertheless, in the face of the growing number of CA-related security incidents, I think it is worthwhile to help the growth of this decentralized form of public-key infrastructure, while providing a centralized fallback in the early days until a critical user mass is reached.
And now, on these thoughts, I am going to upgrade the self-signed certificate of my home server to SHA-256 signatures. And perhaps generate a couple of man-in-the-middle certificates along the way, too, just for the fun of playing pranks on people who accessed my website once.