Measured and trusted boot

What they give you – and don’t.

Sometimes I’m looking around for a subject to write about, and realise that there’s one which I assume that I’ve covered, but, on searching, discover that I haven’t. One such pair is “measured boot” and “trusted boot” – sometimes, misleadingly, referred to as “secure boot”. There are specific procedures which use these terms with capital letters – e.g. Secure Boot – which I’m going to try to avoid discussing in this post. I’m more interested in the generic processes, and a major potential downfall, than in trying to go into the ins and outs of specifics. What follows is a (heavily edited) excerpt from my forthcoming book on Trust in Computing and the Cloud for Wiley.

In order to understand what measured boot and trusted boot aim to achieve, let’s have a look at the Linux virtualisation stack: the components you run if you want to be using virtual machines (VMs) on a Linux machine. This description is arguably over-simplified, but, as I noted above, we’re not interested here in the specifics so much as in what we’re trying to achieve. We’ll concentrate on the bottom four layers (at a rather simple level of abstraction): CPU/management engine; BIOS/EFI; Firmware; and Hypervisor, but we’ll also consider a layer just above the CPU/management engine, where we interpose a TPM (a Trusted Platform Module) and some instructions for how to perform one of our two processes. Once the system starts to boot, the TPM is triggered, and then starts its work (alternative roots of trust such as HSMs might also be used, but TPMs are the most common example in this context, so they’re the example we’ll use).

In both cases, the basic flow starts with the TPM performing a measurement of the BIOS/EFI layer. This measurement involves checking the binary instructions to be carried out by this layer, and then creating a cryptographic hash of the binary image. The hash that’s produced is then stored in one of several “PCR slots” (Platform Configuration Registers) in the TPM. These can be thought of as pieces of memory which can be read later on, either by the TPM for its own purposes, or by entities external to the TPM, but which cannot be changed once they have been written. This provides assurance that once a value is written to a PCR by the TPM, it can be considered constant for the lifetime of the system until power-off or reboot.
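As a very rough sketch of what a single measurement looks like – the hash algorithm and the file path here are just assumptions for illustration, though SHA-256 is one of the algorithms real TPMs support – the idea is simply this:

    import hashlib

    def measure_layer(image_path):
        """A 'measurement' of a boot layer: a cryptographic hash of the
        binary image that is about to be executed."""
        with open(image_path, "rb") as f:
            return hashlib.sha256(f.read()).digest()

    # Hypothetical path to a BIOS/EFI image; the resulting digest is what
    # would be recorded in one of the TPM's PCRs.
    bios_measurement = measure_layer("/path/to/bios_efi.bin")
    print(bios_measurement.hex())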

After measuring the BIOS/EFI layer, the next layer (Firmware) is measured. In this case, the resulting hash is combined with the previous hash (which was stored in the PCR slot) and then itself stored in a PCR slot. The process continues until all of the layers involved in the process have been measured, and the results of the hashes stored. There are (sometimes quite complex) processes to set up the original TPM values (I’ve missed out some of the more low-level steps in the process for simplicity) and to allow (hopefully authorised) changes to the layers for upgrading or security patching, for example. What this process – measured boot – allows is for entities to query the TPM after the process has completed, and check whether the values in the PCR slots correspond to the expected values, pre-calculated with “known good” versions of the various layers – that is, pre-checked versions whose provenance and integrity have already been established. Various protocols exist to allow parties external to the system to check (e.g. via a network connection) the values that the TPM attests to being correct: the process of receiving and checking such values from an external system is known as “remote attestation”.
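To make the chaining concrete, here’s a minimal Python sketch of the extend-and-compare idea. The placeholder byte strings stand in for the real layer images, and SHA-256 is an assumption for the example; in practice the TPM performs the extend operation internally, and remote attestation involves signed quotes rather than a simple comparison:

    import hashlib

    def extend_pcr(pcr_value, measurement):
        """New PCR value = hash(old PCR value || new measurement), so the
        final value depends on every measured layer, in order."""
        return hashlib.sha256(pcr_value + measurement).digest()

    # Placeholder 'images' standing in for the real BIOS/EFI, firmware and
    # hypervisor binaries.
    layers = [b"bios-efi-image", b"firmware-image", b"hypervisor-image"]

    # Start from an all-zero PCR (as at power-on) and extend it with each
    # layer's measurement, in boot order.
    pcr = bytes(32)
    for image in layers:
        pcr = extend_pcr(pcr, hashlib.sha256(image).digest())

    # A verifier (local or remote) pre-calculates the same chain from
    # "known good" versions of the layers and compares the reported value.
    expected = bytes(32)
    for image in layers:
        expected = extend_pcr(expected, hashlib.sha256(image).digest())

    print("match" if pcr == expected else "mismatch")

Because each value is folded into the next, changing any layer – or the order in which the layers are measured – changes the final PCR value.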

This process – measured boot – allows us to find out whether the underpinnings of our system – the lowest layers – are what we think they are, but what if they’re not? Measured boot (unsurprisingly, given the name) only measures: it doesn’t perform any other actions. The alternative, “trusted boot”, goes a step further. When a trusted boot process is performed, the process not only takes each measurement, but also checks it against a known (and expected!) good value at the same time. If the check fails, then the process will halt, and the booting of the system will fail. This may sound like a rather extreme approach to take with a system, but sometimes it is absolutely the right one. Where the system under consideration may have been compromised – which is one likely inference you can make from the failure of a trusted boot process – it is better for it not to be available at all than for it to be running on the basis of flawed expectations.
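By contrast with the passive sketch above, a trusted boot process checks as it goes. Here’s a minimal sketch in the same vein – the expected values and placeholder images are hypothetical, and the only point being made is that a mismatch stops the boot rather than merely being recorded:

    import hashlib
    import sys

    # Hypothetical table of expected ("known good") measurements,
    # pre-calculated by whoever provisioned the system.
    EXPECTED = {
        "bios-efi": hashlib.sha256(b"bios-efi-image").hexdigest(),
        "firmware": hashlib.sha256(b"firmware-image").hexdigest(),
        "hypervisor": hashlib.sha256(b"hypervisor-image").hexdigest(),
    }

    def trusted_boot(layers):
        """Measure each layer and check it before handing over control;
        halt the boot on the first mismatch."""
        for name, image in layers:
            if hashlib.sha256(image).hexdigest() != EXPECTED[name]:
                sys.exit(f"trusted boot failure: {name} does not match its expected value")
            # ...only now would control be passed to this layer...

    trusted_boot([
        ("bios-efi", b"bios-efi-image"),
        ("firmware", b"firmware-image"),
        ("hypervisor", b"hypervisor-image"),
    ])
    print("all layers matched: boot continues")

In a real system the checking and halting is done by firmware or the bootloader before control is handed over, not by code running in the operating system, but the principle is the same.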

This is all very well if I’m the owner of the system which is being measured, have checked all of the various components being measured (and the measurements), and so can be happy that what’s being booted is what I want[1]. But what if I’m actually using a system on the cloud, for instance, or any system owned and managed by someone else? In that case, I’m trusting the cloud provider (or owner/manager) with two things:

  1. to do all the measuring correctly, and to report correct results to me;
  2. actually to have built something which I should be trusting in the first place!

This is the problem with the nomenclature “trusted boot”, and, even worse, “secure boot”. Both imply that an absolute, objective property of a system has been established – it is “trusted” or “secure” – when this is clearly not the case. Obviously, it would be unfair to expect the designers of such processes to name them after the failure states – “untrusted boot” or “insecure boot” – but unless I can be very certain that I trust the owner of the system to do step 2 entirely correctly (and in my best interests, as user of the system, rather than in theirs, as owner) then we can make no stronger assertions. There is an enormous temptation to take a system which has gone through a trusted boot process and to label it a “trusted system”, when the very best assertion we can make is that the particular layers measured in the measured and/or trusted boot process have been asserted to be those which the process expected to be present. Such a process says nothing at all about the fitness of the layers to provide assurances of behaviour, nor about the correctness (or fitness to provide assurances of behaviour) of any subsequent layers on top of those.

It’s important to note that the designers of TPMs are quite clear about what is being asserted, and that assertions about trust should be made carefully and sparingly. Unfortunately, however, the complexities of systems, the general low level of understanding of trust, and the complexities of context and transitive trust make it very easy for designers and implementors of systems to do the wrong thing, and to assume that any system which has successfully performed a trusted boot process can be considered “trusted”. It is also extremely important to remember that TPMs, as hardware roots of trust, offer us one of the best mechanisms we have for establishing a chain of trust in systems that we may be designing or implementing, and I plan to write an article about them soon.


1 – although this turns out to be much harder to do than you might expect!

The Backdoor Fallacy: explaining it slowly for governments

… I literally don’t know a single person with a modicum of technical understanding who thinks this is a good idea …

I should probably avoid this one, because a) everyone will be writing about it; and b) it makes me really, really cross; but I just can’t*.  I’m also going to restate the standard disclaimer that the opinions expressed here are mine, and may not represent those of my employer, Red Hat, Inc. (although I hope that they do).

Amber Rudd, UK Home Secretary, has embraced what I’m going to call the Backdoor Fallacy.  This is basically a security-by-obscurity belief that it’s necessary for encryption providers to provide the police and security services with a “hidden” method by which they can read all encrypted communications**.  The Home Secretary’s espousal of this popular position is a predictable reaction to the terrorist attack in London last week, but it won’t help.  I literally don’t know a single person with a modicum of technical understanding who thinks this is a good idea.  Or remotely practicable.  Obviously, therefore, I’m not the only person who’s going to be writing about this, but I thought it would be an interesting exercise to collect some of the reasons that this is a monumentally bad idea in one short article, so let’s examine this fallacy from a few angles.

  • It always fails – because a backdoor isn’t just a backdoor for authorised users: it’s a backdoor for anyone who can find it.  And keeping these sorts of things hidden is difficult, because:
    • academic researchers look for them
    • criminals look for them
    • “unfriendly state actors” (governments we don’t like at the moment) look for them
    • previously friendly state actors (governments we used to like, but we don’t like so much anymore) look for them
    • police and security services mess up and leak them by accident
    • insiders within police and security services decide to leak them
    • source code gets leaked, giving clues to how they’re implemented – for those apps which aren’t Open Source in the first place
    • the people writing them don’t always get it right, and you end up with more holes than you expected***
    • techniques that seem safe now often seem laughably insecure in a few years’ time.

There is just no safe way to protect these backdoors.

  • You can’t identify all the providers – today it’s Whatsapp.  And Facebook, and Twitter, and Instagram, and Tumblr, and …  But if I’d asked you for a list a year, or five years ago, what would that list have looked like?  And can you tell me what should be on the list for next week, or next year?  No, you can’t.  And I suspect you (as a learned reader of this blog) are a lot more clued up than the UK Home Office.
  • You can’t convince all the providers – and that’s assuming that all of the providers are interested, or can be convinced to care enough to sign up at all.
  • You can’t hit all channels – even if you could identify all the providers, what about online gaming?  And email.  And ssh.  I mean, really.
  • The obviousness issue – presumably, in order to make this work, governments need to publish a list of approved applications.  I suspect, just suspect, that the sort of bad people who want to get around this will choose to use different apps, or different channels, to the approved ones … but so will people who aren’t “bad people”, but just have legitimate reasons for encrypting their communications.
  • The business problem – there are legitimate uses for encryption.  Many, many of them.  And they far outnumber the illegitimate uses.  So, if you’re a government, you have two options:
    1. you can convince all legitimate businesses, including banks, foreign corporations and human rights organisations, and everyone who communicates with them, to use your compromised, “backdoor-enhanced”**** encryption scheme.  Good luck with this: it’s not going to work.
    2. you can institute a simple, fast, unabuseable, red-tape-free process by which you hand out exemptions to “legitimate” businesses who you can trust to use non-compromised, backdoor-unenhanced encryption schemes.*****

I’m guessing that we don’t expect either of these to fly.

  • The “nothing to hide” sub-fallacy – the “but if you have nothing to hide, then you have nothing to fear” argument.  Well, I may have nothing to hide from the current government.  But what about future governments?  Have the past 100 years of world history taught us nothing?  Hitler, Franco, Stalin, Perón, … the list goes on and on.  From “previously friendly state actors”?  And from the criminals who are the main reason most of us use encryption in the first place?  Puh-lease.
  • The who-do-you-trust question – this leads on from the “police and security services mess up” sub-bullet above.  The fewer people to whom you give the backdoor details, the more hard work and expense there is in using that backdoor for your purposes.  So there’s an obvious move to reduce costs by spreading knowledge of the backdoor.  And governments tend towards any policy which reduces costs, so…  And, of course, the more widely spread the knowledge, the more likely it is to leak.
  • Once it’s gone, it’s gone – and once it’s leaked, it’s leaked, whether by accident or intention (Chelsea Manning, Julian Assange, Edward Snowden, …).  You can’t put this genie back in the bottle.  The cost and complication of re-keying a communications channel for which the key has leaked is phenomenal.  I’m assuming that this is just a re-keying exercise, but if it’s a recoding exercise, it’s even harder.  And how do you enforce that only the new version is used, anyway?
  • The jurisdiction issue – do all governments agree on the same key?  No?  Well, then I have to have different versions of all apps I might use, and choose the correct one for each country I travel in?  And ensure that neither I nor any businesses ever communicate across jurisdictional boundaries.  Or we could have multiple backdoors, each for a different jurisdiction?  Let’s introduce the phrase “combinatorial explosion” here, shall we?

Let’s work as an industry to disabuse governments of the idea that this is ever a good idea. And we also need to work with them to come up with other techniques to help them catch criminals and stop terrorist attacks: let’s do that, too.


*believe me: I tried.  Not that hard, but I tried.

**they probably want all “at-rest” keys as well as all transport keys.  This is even more stupid.

***don’t get me wrong: this is going to happen anyway, but why add to the problem?

****inverted commas for irony, which I hope is obvious by this stage in the proceedings

*****“I can’t even”, to borrow from popular parlance.  This is the UK government, after all.