Measured and trusted boot

What they give you – and don’t.

Sometimes I’m looking around for a subject to write about, and realise that there’s one which I assume that I’ve covered, but, on searching, discover that I haven’t. Such a one is “measured boot” and “trusted boot” – sometimes, misleadingly, referred to as “secure boot”. There are specific procedures which use these terms with capital letters – e.g. Secure Boot – which I’m going to try to avoid discussing in this post. I’m more interested in the generic processes, and a major potential downfall, than in trying to go into the ins and outs of specifics. What follows is a (heavily edited) excerpt from my forthcoming book on Trust in Computing and the Cloud for Wiley.

In order to understand what measured boot and trusted boot aim to achieve, let’s have a look at the Linux virtualisation stack: the components you run if you want to be using virtual machines (VMs) on a Linux machine. This description is arguably over-simplified, but we’re not interested here in the specifics (as I noted above), but in what we’re trying to achieve. We’ll concentrate on the bottom four layers (at a rather simple level of abstraction): CPU/management engine; BIOS/EFI; Firmware; and Hypervisor, but we’ll also consider a layer just above the CPU/management engine, where we interpose a TPM (a Trusted Platform Module) and some instructions for how to perform one of our two processes. Once the system starts to boot, the TPM is triggered, and then starts its work (alternative roots of trust such as HSMs might also be used, but we will use TPMs, the most common example in this context, as our example).

In both cases, the basic flow starts with the TPM performing a measurement of the BIOS/EFI layer. This measurement involves checking the binary instructions to be carried out by this layer, and then creating a cryptographic hash of the binary image. The hash that’s produced is then stored in one of several “PCR slots” in the TPM. These can be thought of as pieces of memory which can be read later on, either by the TPM for its purposes, or by entities external to the TPM, but which cannot be changed once they have been written. This provides assurances that once a value is written to a PCR by the TPM, it can be considered constant for the lifetime of the system until power-off or reboot.

After measuring the BIOS/EFI layer, the next layer (Firmware) is measured. In this case, the resulting hash is combined with the previous hash (which was stored in the PCR slot) and then itself stored in a PCR slot. The process continues until all of the layers involved in the process have been measured, and the results of the hashes stored. There are (sometimes quite complex) processes to set up the original TPM values (I’ve missed out some of the more low-level steps in the process for simplicity) and to allow (hopefully authorised) changes to the layers for upgrading or security patching, for example. What this “measured boot” process allows is for entities to query the TPM after the process has completed, and check whether the values in the PCR slots correspond to the expected values, pre-calculated with “known good” versions of the various layers – that is, pre-checked versions whose provenance and integrity have already been established. Various protocols exist to allow parties external to the system to check the values (e.g. via a network connection) that the TPM attests to being correct: the process of receiving and checking such values from an external system is known as “remote attestation”.
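The measure-and-combine flow described above can be sketched in a few lines of Python. This is a simplified illustration rather than a real TPM interface: the function names, the all-zero starting value and the byte strings standing in for binary images are all assumptions made for the example, and it leaves out details such as multiple PCR banks and signed attestation.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def measure(image: bytes) -> bytes:
    """Return the cryptographic hash (the "measurement") of a layer's binary image."""
    return hashlib.sha256(image).digest()


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Combine the current PCR value with a new measurement: H(old || new)."""
    return hashlib.sha256(pcr + measurement).digest()


def measured_boot(layers: list[bytes]) -> bytes:
    """Measure each layer in boot order and fold it into a single PCR value."""
    pcr = bytes(PCR_SIZE)  # assumption: the PCR starts as all zeros at power-on
    for image in layers:
        pcr = extend(pcr, measure(image))
    return pcr


# Hypothetical stand-ins for the real binary images of each layer.
known_good = [b"bios-efi v1", b"firmware v1", b"hypervisor v1"]
expected_pcr = measured_boot(known_good)

# A verifier compares the reported value against the pre-calculated one.
reported_pcr = measured_boot([b"bios-efi v1", b"firmware TAMPERED", b"hypervisor v1"])
print(reported_pcr == expected_pcr)  # False: one layer differs
```

Because each new value depends on the previous one, a change to any layer (or to the order of the layers) changes the final value, which is what lets a single comparison cover the whole chain.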

This process – measured boot – allows us to find out whether the underpinnings of our system – the lowest layers – are what we think they are, but what if they’re not? Measured boot (unsurprisingly, given the name) only measures, but doesn’t perform any other actions. The alternative, “trusted boot”, goes a step further. When a trusted boot process is performed, the process not only measures each value, but also performs a check against a known (and expected!) good value at the same time. If the check fails, then the process will halt, and the booting of the system will fail. This may sound like a rather extreme approach to take to a system, but sometimes it is absolutely the right one. Where the system under consideration may have been compromised – which is one likely inference that you can make from the failure of a trusted boot process – then it is better that it not be available at all than to be running based on flawed expectations.
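To make the difference from measured boot concrete, here is a minimal Python sketch of the check-and-halt behaviour. It is an illustration only: the `trusted_boot` function, the `BootHalted` exception, the layer names and the byte strings standing in for images are all hypothetical, and a real implementation would live in firmware rather than in a script.

```python
import hashlib


class BootHalted(Exception):
    """Raised when a layer's measurement does not match its known-good value."""


def trusted_boot(layers, known_good):
    """Measure each (name, image) layer in order; halt at the first mismatch."""
    booted = []
    for name, image in layers:
        digest = hashlib.sha256(image).hexdigest()
        if digest != known_good.get(name):
            raise BootHalted(f"{name}: unexpected measurement, refusing to boot")
        booted.append(name)  # only reached if the check passed
    return booted


# Hypothetical images; the known-good digests are pre-calculated from them.
images = [
    ("bios-efi", b"bios-efi v1"),
    ("firmware", b"firmware v1"),
    ("hypervisor", b"hypervisor v1"),
]
known_good = {name: hashlib.sha256(image).hexdigest() for name, image in images}
```

With the unmodified `images`, `trusted_boot(images, known_good)` returns all three layer names; swap in a different firmware image and it raises `BootHalted` before the hypervisor is ever reached.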

This is all very well if I’m the owner of the system which is being measured, have checked all of the various components being measured (and the measurements), and so can be happy that what’s being booted is what I want[1]. But what if I’m actually using a system on the cloud, for instance, or any system owned and managed by someone else? In that case, I’m trusting the cloud provider (or owner/manager) with two things:

  1. to do all the measuring correctly, and to report correct results to me;
  2. actually to have built something which I should be trusting in the first place!

This is the problem with the nomenclature “trusted boot”, and, even worse, “secure boot”. Both imply that an absolute, objective property of a system has been established – it is “trusted” or “secure” – when this is clearly not the case. Obviously, it would be unfair to expect the designers of such processes to name them after the failure states – “untrusted boot” or “insecure boot” – but unless I can be very certain that I trust the owner of the system to do step 2 entirely correctly (and in my best interests, as user of the system, rather than theirs, as owner) then we can make no stronger assertions. There is an enormous temptation to take a system which has gone through a trusted boot process and to label it a “trusted system”, where the very best assertion we can make is that the particular layers measured in the measured and/or trusted boot process have been asserted to be those which the process expected to be present. Such a process says nothing at all about the fitness of the layers to provide assurances of behaviour, nor about the correctness (or fitness to provide assurances of behaviour) of any subsequent layers on top of those.

It’s important to note that designers of TPMs are quite clear what is being asserted, and that assertions about trust should be made carefully and sparingly. Unfortunately, however, the complexities of systems, the general low level of understanding of trust, and the complexities of context and transitive trust make it very easy for designers and implementors of systems to do the wrong thing, and to assume that any system which has successfully performed a trusted boot process can be considered “trusted”. It is also extremely important to remember that TPMs, as hardware roots of trust, offer us one of the best mechanisms we have for establishing a chain of trust in systems that we may be designing or implementing, and I plan to write an article about them soon.

1 – although this turns out to be much harder to do than you might expect!

Formal verification … or Ken Thompson?

“You can’t trust code that you did not totally create yourself” – Ken Thompson.

This article is an edited excerpt from my forthcoming book on Trust in Computing and the Cloud for Wiley.

How can we be sure that the code we’re running does what we think it does? One of the answers – or partial answers – to that question is “formal verification.” Formal verification is an important field of study, applying mathematics to computing, and it aims to start with proofs – at best, with an equivalent level of assurance to that of formal mathematical proofs – of the correctness of algorithms to be implemented in code to ensure that they perform the operations expected and set forth in a set of requirements. Though implementation of code can often fall down in the actual instructions created by a developer or set of developers – the programming – mistakes are equally possible at the level of the design of the code to be implemented in the first place, and so this must be a minimum step before looking at any actual implementations. What is more, these types of mistakes can be all the harder to spot, as even if the developer has introduced no bugs in the work they have done, the implementation will be flawed by virtue of it being incorrectly defined in the first place. It is with an acknowledgement of this type of error, and an intention of reducing or eliminating it, that formal verification starts, but some areas go much further, with methods to examine concrete implementations and make statements about their correctness with regard to the algorithms which they are implementing.

Where we can make these work, they are extremely valuable, and the sort of places that they are applied are exactly where we would expect: for systems where security is paramount, and to prove the correctness of cryptographic designs and implementations. Another major focus of formal verification is software for safety systems, where the “correct” operation of the system – by which we mean “as designed and expected” – is vital. Examples might include oil refineries, fire suppression systems, nuclear power station management, aircraft flight systems and electrical grid management – unsurprisingly, given the composition of such systems, formal verification of hardware is also an important field of study. The practical application of formal verification methods to software is, however, more limited than we might like. As Alessandro Abate notes in a paper on formal verification of software:

“Two known shortcomings of standard techniques in formal verification are the limited capability to provide system-level assertions, and the scalability to large, complex models.”

To these shortcomings we can add another, extremely significant one: how sure can you be that what you are running is what you think you are running? Surely knowing what you are running is exactly why we write software, look at the source, and then compile it under our control? That, certainly, is the basic starting point for software that we care about.

The problem is arguably one of layers and dependencies, and was outlined by Ken Thompson, one of the founders of modern computing, in the lecture he gave at his acceptance of the Turing Award in 1983. It is short, stands as one of the establishing artefacts of computing security, and has weathered the tests of time: I have no hesitation in recommending that all readers of this blog read it: Reflections on Trusting Trust. In it, he describes how careful placing of malicious code in the C standard compiler could lead to vulnerabilities (his specific example is in account login code) which are not only undetectable by those without access to the source code, but also not removable. The final section of the paper is entitled “Moral”, and Thompson starts with these words:

“The moral is obvious. You can’t trust code that you did not totally create yourself. (Especially code from companies that employ people like me.) No amount of source-level verification or scrutiny will protect you from using untrusted code.”

However, as he goes on to point out, there is nothing special about the compiler:

“I could have picked on any program-handling program such as an assembler, a loader, or even hardware microcode. As the level of program gets lower, these bugs will be harder and harder to detect. A well-installed microcode bug will be almost impossible to detect.”

It is for the reasons noted by Thompson that open source software – and hardware – is so vital to the field of computer security, and to our task of defining and understanding what “trust” means in the context of computing. Just relying on the “open source-ness” of your code is not enough: there is more work to be done in understanding your stack, the community and your requirements, but without the ability to look at the source code of all the layers of software and hardware on which you are running code, you can have only reduced trust that what you are running is what you think you should be running, whether you have performed formal verification on it or not.

An Enarx milestone: binaries

Demoing the same binary in very different TEEs.

This week is Red Hat Summit, which is being held virtually for the first time because of the Covid-19 crisis. The lock-down has not affected the productivity of the Enarx team, however (at least not negatively), as we have a very exciting demo that we will be showing at Summit. This post should be published at 1100 EDT, 1500 BST, 1400 GMT on Tuesday, 2020-04-28, which is the time that the session which Nathaniel McCallum and I recorded will be released to the world. I hope to be able to link to that once it’s released to the world. But what will we be showing?

Well, to set the scene, and to discover a little more about the Enarx project, you might want to read these articles first (also available in Japanese – visit each article for a link):

Enarx, as you’ll discover, is about running workloads in TEEs (Trusted Execution Environments), using WebAssembly, in what we call “Keeps”. It’s a mammoth job, particularly as we’re abstracting away the underlying processor architectures (currently two: Intel’s SGX and AMD’s SEV), so that you, the user, don’t need to worry about them: all you need to do is write and compile your application, then request that it be deployed. Enarx, then, has lots of moving parts, and one of the key tasks for us has been to start the work to abstract away the underlying processor architectures so that we can prepare the runtime layers on top. Here’s a general picture of the software layers, and how they sit on top of the hardware platforms:

What we’re announcing – and demoing – today is that we have an initial implementation of code to allow us to abstract away process-based and VM-based types of architecture (with examples for SGX and SEV), so that we can do this:

This seems deceptively simple, but what’s actually going on under the covers is rather more than is exposed in the picture above. The reality is more like this:

This gives more detail: the application that’s running on both architectures (SGX on the left, SEV on the right) is the very same ELF static-PIE binary. To be clear, this is not only the same source code, compiled for different platforms, but exactly the same binary, with the very same hash signature. What’s pretty astounding about this is that in order to make it run on both platforms, the engineering team has had to write two sets of seriously low-level code, including more than a little Assembly language, providing the “plumbing” to allow the binary to run on both.
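If you wanted to check the “very same binary” claim for yourself, one simple approach would be to compare cryptographic digests of the file deployed to each platform. The sketch below is a generic illustration and not part of Enarx: the helper names are made up for the example, and SHA-256 is simply an assumed choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of the file at `path`."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def same_binary(a: Path, b: Path) -> bool:
    """True only if both files have the same digest, i.e. the same bytes."""
    return sha256_of(a) == sha256_of(b)
```

Calling `same_binary` on the copy of the application taken from the SGX host and the copy taken from the SEV host (both paths being whatever you have to hand) should return `True` if, as described above, the two platforms really are running the identical file.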

This is a very big deal, because although we’ve only implemented a handful of syscalls on each platform – enough to make our simple binary run and print out a message – we now have a framework on which we know we can build. And what’s next? Well, we need to expand that framework so that we can then build the WebAssembly layers which will allow WebAssembly applications to run on top:

There’s a long way to go, but this milestone shows that we have an initial framework which we can improve, and on which we can build.

What’s next?

What’s exciting about this milestone from our point of view is that we think it puts Enarx at a stage where more people can join and take part. There’s still lots of low-level work to be done, but it’s going to be easier to split up now, and also to start some of the higher level work, too. Enarx is completely open source, and we do all of our design work in the open, along with our daily stand-ups. You’re welcome to browse our documentation, RFCs (mostly in draft at the moment), raise issues, and join our calls. You can find loads more information on the Enarx wiki: we look forward to your involvement in the project.

Last, and not least, I’d like to take a chance to note that we now have testing/CI/CD resources available for the project with both Intel SGX and AMD SEV systems available to us, all courtesy of Packet. This is amazingly generous, and we both thank them and encourage you to visit them and look at their offerings for yourself!

Isolationism – not a 4 letter word (in the cloud)

Things are looking up if you’re interested in protecting your workloads.

In the world of international relations, economics and fiscal policy, isolationism doesn’t have a great reputation. I could go on, I suppose, if I did some research, but this is a security blog[1], and international relations, fascinating area of study though it is, isn’t my area of expertise: what I’d like to do is borrow the word and apply it to a different field: computing, and specifically cloud computing.

In computing, isolation is a set of techniques to protect a process, application or component from another (or a set of the former from a set of the latter). This is pretty much always a good thing – you don’t want another process interfering with the correct workings of your one, whether that’s by design (it’s malicious) or in error (because it’s badly designed or implemented). Isolationism, therefore, however unpopular it may be on the world stage, is a policy that you generally want to adopt for your applications, wherever they’re running.

This is particularly important in the “cloud”. Cloud computing is where you run your applications or processes on shared infrastructure. If you own that infrastructure, then you might call that a “private cloud”, and infrastructure owned by other people a “public cloud”, but when people say “cloud” on its own, they generally mean public clouds, such as those operated by Amazon, Microsoft, IBM, Alibaba or others.

There’s a useful adage around cloud computing: “Remember that the cloud is just somebody else’s computer”. In other words, it’s still just hardware and software running somewhere, it’s just not being run by you. Another important thing to remember about cloud computing is that when you run your applications – let’s call them “workloads” from here on in – on somebody else’s cloud (computer), they’re unlikely to be running on their own. They’re likely to be running on the same physical hardware as workloads from other users (or “tenants”) of that provider’s services. These two realisations – that your workload is on somebody else’s computer, and that it’s sharing that computer with workloads from other people – are where isolation comes into the picture.

Workload from workload isolation

Let’s start with the sharing problem. You want to ensure that your workloads run as you expect them to do, which means that you don’t want other workloads impacting on how yours run. You want them to be protected from interference, and that’s where isolation comes in. A workload running in a Linux container or a Virtual Machine (VM) is isolated from other workloads by hardware and/or software controls, which try to ensure (generally very successfully!) that your workload receives the amount of computing time it should have, that it can send and receive network packets, write to storage and the rest without interruption from another workload. Equally important, the confidentiality and integrity of its resources should be protected, so that another workload can’t look into its memory and/or change it.

The means to do this are well known and fairly mature, and the building blocks of containers and VMs, for instance, are augmented by software like KVM or Xen (both open source hypervisors) or like SELinux (an open source mandatory access control framework). The cloud service providers are definitely keen to ensure that you get a fair allocation of resources and that they are protected from the workloads of other tenants, so providing workload from workload isolation is in their best interests.

Host from workload isolation

Next is isolating the host from the workload. Cloud service providers absolutely do not want workloads “breaking out” of their isolation and doing bad things – again, whether by accident or design. If one of a cloud service provider’s host machines is compromised by a workload, not only can that workload possibly impact other workloads on that host, but also the host itself, other hosts and the more general infrastructure that allows the cloud service provider to run workloads for their tenants and, in the final analysis, make money.

Luckily, again, there are well-known and mature ways to provide host from workload isolation using many of the same tools noted above. As with workload from workload isolation, cloud service providers absolutely do not want their own infrastructure compromised, so they are, of course, going to make sure that this is well implemented.

Workload from host isolation

Workload from host isolation is more tricky. A lot more tricky. This is protecting your workload from the cloud service provider, who controls the computer – the host – on which your workload is running. The way that workloads run – execute – is such that such isolation is almost impossible with standard techniques (containers, VMs, etc.) on their own, so providing ways to ensure and prove that the cloud service provider – or their sysadmins, or any compromised hosts on their network – cannot interfere with your workload is difficult.

You might expect me to say that providing this sort of isolation is something that cloud service providers don’t care about, as they feel that their tenants should trust them to run their workloads and just get on with it. Until sometime last year, that might have been my view, but it turns out to be wrong. Cloud service providers care about protecting your workloads from the host because it allows them to make more money. Currently, there are lots of workloads which are considered too sensitive to be run on public clouds – think financial, health, government, legal, … – often due to industry regulation. If cloud service providers could provide sufficient isolation of workloads from the host to convince tenants – and industry regulators – that such workloads can be safely run in the public cloud, then they get more business. And they can probably charge more for these protections as well! That doesn’t mean that isolating your workloads from their hosts is easy, though.

There is good news, however, for both cloud service providers and their tenants, which is that there’s a new set of hardware techniques called TEEs – Trusted Execution Environments – which can provide exactly this sort of protection[2]. This is rapidly maturing technology, and TEEs are not easy to use – in that it can not only be difficult to run your workload in a TEE, but also to ensure that it’s running in a TEE – but when done right, they do provide the sorts of isolation from the host that a workload wants in order to maintain its integrity and confidentiality[3].

There are a number of projects looking to make using TEEs easier – I’d point to Enarx in particular – and even an industry consortium to promote open TEE adoption, the Confidential Computing Consortium. Things are looking up if you’re interested in protecting your workloads, and the cloud service providers are on board, too.

1 – sorry if you came here expecting something different, but do stick around and have a read: hopefully there’s something of interest.

2 – the best known are Intel’s SGX and AMD’s SEV.

3 – availability – ensuring that it runs fairly – is more difficult, but as this is a property that is also generally in the cloud service provider’s best interest, and something that they can control, it’s not generally too much of a concern[4].

4 – yes, there are definitely times when it is, but that’s a story for another article.

A cybersecurity tip from Hazzard County

Don’t place that bet in Boss Hogg’s betting saloon: you know he’s up to no good!

It’s a slightly guilty secret, but I used to love watching The Dukes of Hazzard in the early 80’s (the first series started in late 1979, but I suspect that it didn’t make it to the UK until the next year at the earliest).  It all seemed very glamorous, and there were lots of fast car chases.  And a basset hound, which was an extra win.  To say this was early days for cybersecurity would be an understatement, and though there are references in the Wikipedia plot summaries to computers, I can’t honestly say that I remember any of those particular episodes.

One episode has stuck with me, however, for reasons that I can’t fathom.  It’s called “Hazzard Hustle” and (*SPOILER ALERT*) in it, Boss Hogg sets up a crooked betting saloon.  The swindle (if I remember it correctly) is that he controls and delays the supposedly live feeds to the TVs in the saloon, which means that he has access to results before they come in.  Needless to say, the Duke boys (probably aided and abetted by Daisy Duke) get the better of him in the end, and everything turns out OK (for them, not Boss Hogg).

“What can this have to do with cybersecurity?” you have every right to ask.  Well, the answer is reporting and monitoring channels.  Monitoring is important because without it, there is no way for us to check that what we believe should be happening actually is.  The opportunities for direct sensory monitoring of actions in computer-based systems are limited: if I request via a web browser that a banking application transfers funds between one account and another, the only visible effect that I am likely to see is an acknowledgement on the screen. Until I actually try to spend or withdraw that money, I realistically have no way to be assured that the transaction has taken place.

Let’s take an example from the human realm.  It is as if I have a trust relationship with somebody around the corner of a street, out of view, that she will raise a red flag at the stroke of noon, and I have a friend, standing on the corner, who will watch her, and tell me when and if she raises the flag. I may be happy with this arrangement, but only because I have a trust relationship to the friend: that friend is acting as a trusted channel for information.

The word “friend” was chosen carefully above, because there is a trust relationship already implicit in the term. The same is not true for the word “somebody”, which I used to describe the person who was to raise the flag. The situation as described above is likely to make our minds presume that there is a fairly high probability that the trust relationship I have to the friend is sufficient to assure me that he will pass the information correctly. But what if my friend is actually a business partner of the flag-waver? Given our human understanding of the trust relationships typically involved with business partnerships, we may immediately begin to assume that my friend’s motivations in respect to correct reporting are not neutral.

The channels for reporting on actions – monitoring them – are vitally important within cybersecurity, and it is both easy and dangerous to fall into the trap of assuming that they are neutral, and that the only important one is between me and the acting party. In reality, the trust relationship that I have to a set of channels is key to the maintenance of the trust relationships that I have to the key entity that they monitor. In trust relationships involving computer systems, there are often multiple entities or components involved in actions, and these form a “chain of trust”, where each link depends on the other, and the chain is typically only as strong as the weakest of its links.  Don’t forget that.  Oh, and don’t place that bet in Boss Hogg’s betting saloon: you know he’s up to no good!





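I also had other work that needed doing: for example, customer meetings, work with IBM (which acquired my employer, Red Hat, in July), Kubernetes security and collaboration with partner companies, among various other important things. But Enarx was the highlight of 2019.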



In response to that challenge, we gave a demo on AMD’s SEV chip at Red Hat Summit in Boston in May, and made an announcement on this blog.

We followed up with Intel’s SGX chipset at Open Source Summit in Lyon in October. I consider this to have been very important for the development of Enarx in 2019.




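Enarx is, of course, not mine alone. I am very proud to be one of the co-founders of the project, together with Nathaniel McCallum. What we have achieved so far is thanks to the many team members and, as this is an open source project, to everyone who contributes to it and uses it. Many members are named on the contributors page, but not everyone is listed there yet. The advice, support and sponsorship that the project has received from a number of people inside and outside Red Hat have also been very important. I do not have permission to name them, so I will not do so here, and will treat the matter with care. I am very grateful for everyone’s support and time.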









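The major event of 2019 was the announcement of the Confidential Computing Consortium at the Linux Foundation’s Open Source Summit. We at Red Hat believe that Enarx is a perfect fit for this new group, and we are pleased to have become a premier member at the official launch in October. At the time of writing, 31 December 2019, there are 21 members, and it has become clear that the consortium addresses an area of concern and interest across a wide range of industries. This is an endorsement of the principles and aims of Enarx.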




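Last but not least: we are taking the project public. We are working to move from an internal project to one that encourages participation from outside Red Hat. For details, see the blog post of 17 December.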














31 December 2019, Mike Bursell




2019: a year of Enarx

We have big plans for demos and more in 2020


This year has, for me, been pretty much all about the Enarx project.  I’ve had other work that I’ve been doing, including meeting with customers, participating in work with IBM (who acquired the company I work for, Red Hat, in July), looking at Kubernetes security, interacting with partners and a variety of other important pieces, but it’s been Enarx that has defined 2019 for me from a work point of view.

We started off the year with a belief that we could do something, and a challenge from our internal leadership to prove that it was possible.  We did that with a demo on AMD’s SEV chipset at Red Hat Summit in Boston, MA in May, and an announcement of the project on this blog.  We followed up with a demo on Intel’s SGX chipset at Open Source Summit Europe in Lyon in October.  I thought I would mention some of the most important components for the development (in the broadest sense) of Enarx this year.


Enarx is not mine: far from it.  I’m proud to be counted one of the co-founders of the project with Nathaniel McCallum, but we wouldn’t be where we are without a broader team, and as an open source project, it belongs to everyone who contributes and to everyone who uses it.  You’ll find many of the members on the contributors page, but not everybody is up there yet, and there have been some very important people whose contribution has been advice, support and sponsorship of the project both within Red Hat and outside it.  I don’t have permission to mention everybody’s name, so I’m going to play it safe and mention none of them.  You know who you are, and we really appreciate your time.

Use cases – and partners

One of the most important things that we’ve done this year is to work out how people might want to use Enarx “in the wild”, as it were, and to perform some fairly detailed analysis and write-ups.  Not enough of these are externally available yet, which is down to me, but the fact that we had done the work was vital in finding partners who are actually interested in using Enarx for real.  I can’t talk about any of these in public yet, but we have some really interesting use cases from a number of multi-national organisations of whom you will definitely have heard, as well as some smaller start-ups about whom you may well be hearing more in the future.  Having this kind of interest was vital to get buy-in to the project and showed that Enarx wasn’t just a flight of fancy by a bunch of enthusiastic engineers.

Looking outside

The most significant event in the project’s year was the announcement of the Confidential Computing Consortium at the Linux Foundation’s Open Source Summit this year.  We at Red Hat realised that Enarx was a great match for this new group, and were very pleased to be a premier member at the official launch in October.  At time of writing, there are 21 members, and it’s becoming clear that the consortium has identified an area of concern and interest for the wider industry: this is another great endorsement of the aims and principles of Enarx.

Joining the Consortium hasn’t been the only activity in which we’ve been involved this year.  We’ve spoken at conferences, had articles published (on Alice, Eve and Bob, on now + Next and on), spoken to press, recorded webcasts and more.  Most important (arguably), we have hex stickers (if you’re interested, get in touch!).

Last, but not least, we’ve gone external.  From being an internal project (though we always had our code as open source), we’ve taken a number of measures to try to encourage and simplify involvement by non-Red Hat contributors – see 7 tips for kicking off an open source project for a little more information.

Architecture and code

What else?  Oh, there’s code, and an increasingly mature set of architectures for the various components.  We absolutely plan to make all of this externally visible, and the reason that we haven’t yet is that we’re just running to stand still at the moment: there’s just so much to do.  Our focus is on getting code out there for people to use and contribute to themselves and, without giving anything away, we have some pretty big plans for demos and more in 2020.


There’s one other thing that’s been important, of course, and that’s the fact that I’m writing a book for Wiley on trust, but I actually see that as very much related to Enarx.  Fundamentally, although the technology is cool, and we think that the Enarx project meets an existing need, both Nathaniel and I believe that there’s a real opportunity for it to change how people manage trust for workloads in the cloud, in IoT, at the Edge and wherever else sensitive data and algorithms need to be executed.

This blog is supposed to be about security, and I’m strongly of the opinion that trust is a very important part of that.  Enarx fits into that, so don’t be surprised to see more posts around trust and about Enarx over the coming year.  Please keep an eye out here and on the Enarx project site for the latest information.



Timely risk or risky times?

Being aware of “the long game”.

On Friday, 29th November 2019, Jack Merritt and Saskia Jones were killed in a terrorist attack.  A number of members of the public (some with improvised weapons) and of the emergency services acted with great heroism.  I wanted to mention the names of the victims and to praise those involved in stopping the attacker before mentioning his name: Usman Khan.  The victims and the attacker were taking part in an offender rehabilitation conference to help offenders released from prison to reintegrate into society: Khan had been sentenced to 16 years in prison for terrorist offences.

There’s an important formula that everyone involved in risk – and given that IT security is all about mitigating risk, that’s anyone involved in security – should know. It’s usually expressed thus:

Risk = likelihood x impact

Likelihood is sometimes expressed as “probability”, impact as “consequence” or “loss”, and I’ve seen some other variants as well, but the version above is generally sufficient for most purposes.

Using the formula

How should you use the formula? Well, it’s most useful for comparing risks and deciding how to mitigate them. Humans are terrible at calculating risk, and any tool that helps them[1] is good.  In order to use this formula correctly, you want to compare risks over the same time period.  You could say that almost any eventuality may come to pass over the lifetime of the universe, but comparing the risk of losing broadband access to the risk of your lead developer quitting for another company between the Big Bang and the eventual heat death of the universe is probably not going to give you much actionable information.

Let’s look at the two variables that we need to have in order to calculate risk.  We’ll start with the impact, because I want to devote most of this article to the other part: likelihood.

Impact is the damage that will result if the risk materialises.  In a business context, you might want to look at the risk of your order system being brought down for a week by malicious attackers.  You might calculate that you would lose £15,000 in orders.  On top of that, there might be a loss of reputation which you might value at £30,000.  Fixing the problem might add £10,000.  Add these together, and the impact is £55,000.

What’s the likelihood?  Well, remember that we need to consider a particular time period.  What you choose will depend on what you’re interested in, but a classic use is for budgeting, and so the length of time considered is often a year.  “What is the likelihood of my order system being brought down for a week by malicious attackers over the next twelve months?” is the question you want to ask.  If you decide that it’s 0.005 (or 0.5%), then your risk is calculated thus:

Risk = 0.005 x 55,000

Risk = 275

The units don’t really matter, because what you want to do is compare risks.  If the risk of your order system being brought down through hardware failure is higher (say 500), then you should probably balance the amount of resources you assign to mitigate these risks accordingly.
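
For anyone who prefers to see the arithmetic laid out, here is a minimal sketch of the calculation above in Python.  The figures are the ones used in this article; the function and variable names are my own illustrative choices, not part of any standard risk tool.

```python
# Worked example of Risk = likelihood x impact, using the figures from the text.
# All names here are illustrative assumptions.

def risk(likelihood: float, impact: float) -> float:
    """Return likelihood x impact for one risk over a fixed time period."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must be a probability between 0 and 1")
    return likelihood * impact

# Impact of the order system being down for a week, in pounds.
lost_orders = 15_000
reputation_loss = 30_000
fix_cost = 10_000
attack_impact = lost_orders + reputation_loss + fix_cost

# Likelihood over the next twelve months, as estimated in the text.
attack_risk = risk(0.005, attack_impact)

# The hardware failure figure is quoted directly in the text.
hardware_risk = 500.0

# Rank the risks, largest first, over the same twelve-month period.
risks = {"malicious attack": attack_risk, "hardware failure": hardware_risk}
ranked = sorted(risks.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(f"{name}: {value:.0f}")
```

The ranking is the useful output: the individual numbers mean little on their own, but the ordering suggests where mitigation effort might go first.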

Time, reputation, trust and risk

What I’m interested in is a set of rather more complicated risks, however: those associated with human behaviour.  I’m very interested in trust, and one of the interesting things about trust is how we decide to trust people.  One way is by their reputation: if someone keeps behaving well over a long period, then we tend to trust them more – or if badly, then to trust them less[2].  If we trust someone more, our calculation of risk is likely to be strongly based on that trust, as our view of the likelihood of a behaviour at odds with the reputation that person holds will be informed by that.

This makes sense: in the absence of perfect information about humans, their motivations and intentions, our view of risk must be based on something, and reputation is actually a fairly good measure for that.  We might say that the likelihood of a customer defaulting on payment terms reduces year by year as we start to think of them as a “trusted customer”.  As the likelihood reduces, we may decide to increase the amount we lend to them – and thereby the impact of defaulting – to keep the risk about the same, year on year.

The risk here is what is sometimes called “playing the long game”.  Humans sometimes manipulate their reputation, or build up a reputation, in order to perform an action once they have gained trust.  Online sellers may make lots of “good” sales in order to get a 5 star rating over time, only to wait and then make a set of “bad” sales, where they don’t ship goods at all, and then just pocket the money.  Or, they may make many small sales in order to build up a good reputation, and then use that reputation to make one big sale which they have no intention of fulfilling.  Online selling sites are wise to some of these tricks, and have algorithms to try to protect buyers (in fact, the same behaviour can be used by sellers in some cases), but these are not perfect.

I’d like to come back to the London Bridge attack.  In this case, it seems likely that the attacker bided his time over many years, behaving well, and raising his “reputation” among those who knew him – the prison staff, parole board, rehabilitation conference organisers, etc. – so that he had the opportunity to perform one major action at odds with that reputation.  The heroism of those around him stopped him being as successful as he may have hoped, but still at the cost of two innocent lives and several serious injuries.

There is no easy way to deal with such issues.  We need reputation, and we need to allow people to show that they have changed and can be integrated into society, but when we make risk calculations based on reputation in any sphere, we should take care to consider whether actors are playing a long game, and what the possible ramifications would be if they were to act at odds with that reputation.

I noted above that humans are bad at calculating risk, and to follow our example of the non-defaulting customer, one mistake might be to increase the credit we give to that customer beyond the balance of the increase of reputation: actually accepting higher risk than we would have done previously, because we consider them trustworthy.  If we do this, we’ve ceased to use the risk formula, and have started to act irrationally.  Don’t do that.
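
To make the defaulting-customer example concrete, here is a rough sketch of how you might check whether a proposed credit increase stays within the risk you were already accepting.  All the figures are hypothetical, as are the function names: the only thing taken from the discussion above is the formula itself.

```python
# Sketch: how far can credit rise as a customer's reputation improves,
# if we want risk (likelihood x impact) to stay the same year on year?
# Every figure below is hypothetical and chosen only for illustration.

def max_exposure(baseline_risk: float, likelihood: float) -> float:
    """Largest impact (credit extended) that keeps risk at the baseline."""
    if not 0.0 < likelihood <= 1.0:
        raise ValueError("likelihood must be greater than 0 and at most 1")
    return baseline_risk / likelihood

def within_baseline(credit: float, likelihood: float, baseline_risk: float) -> bool:
    """True if extending this credit does not exceed the baseline risk."""
    return credit * likelihood <= baseline_risk

# Year one: a new customer, with an assumed 4% chance of default.
first_year_likelihood = 0.04
first_year_credit = 10_000
baseline = first_year_likelihood * first_year_credit

# Later: a "trusted customer", with the assumed chance of default halved.
later_likelihood = 0.02
limit = max_exposure(baseline, later_likelihood)

print(f"baseline risk: {baseline:.0f}")
print(f"credit limit at the same risk: {limit:.0f}")
print("15,000 acceptable?", within_baseline(15_000, later_likelihood, baseline))
print("25,000 acceptable?", within_baseline(25_000, later_likelihood, baseline))
```

Halving the likelihood justifies roughly doubling the credit, and no more.  Anything beyond that limit means that you are accepting more risk than before, however trustworthy the customer feels.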


1 – OK, then: “us”.

2 – I’m writing this in the lead up to a UK General Election, and it occurs to me that we actually don’t apply this to most of our politicians.

Who do you trust on trust?

(I’m hoping it’s me.)

I’ve been writing about trust on this blog for a little over two years now. It’s not the only topic, but it’s one about which I’m passionate. I’ve been thinking about issues around trust, particularly in regards to computing and security, for nearly 20 years, and it’s something I care about a lot. I care about it so much that I’m writing a book about it.

In fact, I care about it maybe a little too much. I was at a conference earlier this year and – in a move that will come as little surprise to regular readers of this blog[1] – actually ended up getting quite cross about it. The problem is that lots of people talk about trust, but they either don’t really know what they’re talking about, or they really don’t know what they’re talking about. To be clear, I mean different things by those two statements. Some people know their subject, but their subject isn’t really trust. Other people don’t know their subject, but then again, the thing they think they’re talking about often isn’t trust either. Some people talk about “zero trust”, when we really need to look beyond that concept, and discuss implicit vs explicit trust. People ignore the importance of establishing trust. People ignore the importance of decaying trust. People assume that transitive trust is the same as direct trust. People ignore context. All of these are important, and arguably, it’s not their fault. There’s actually very little detailed writing about trust outside the social sciences. Given how much discussion there is of trust, trusted computing, trusted systems and the like within the world of IT security, there’s astonishingly little theoretical underpinning of the concept, which means that there’s very little agreement as to what is really meant. And, it turns out, although it seems that trust within the social sciences is quite like trust within computing, it really isn’t.

Anyway, there were people at this conference earlier this year who said things about trust which strongly suggested to me that it would be helpful if there were a good underpinning that people could read and discuss and disagree with: a book, in fact, about trust in computing. I got so annoyed that I made a decision to tell two people – my boss and one of the editors of – that I planned to write a book about it. I’m not sure whether they really believed me, but I ended up putting together a Table of Contents. And then looking for a publisher, and then sending several publishers a copy of the ToC and some further thoughts about what a book might look like, and word count estimates, and a list of possible reader types and markets.

And then someone offered me a contract. This was a little bit of a surprise, but after some discussion and negotiation, I’m now contracted to write a book on trust for Wiley. I’m absolutely going to continue to publish this blog, and I’ll continue to write about trust here. And, on occasion, something a little bit more random. I don’t pretend to know everything about the subject, and writing about it here allows me to explore some of the more tricky issues. I hope you’ll join me for the ride – and if you have suggestions or questions, I’d love to hear about them.

1 – or my wife and kids.

Enarx goes multi-platform

Now with added SGX!

Yesterday, Nathaniel McCallum and I presented a session “Confidential Computing and Enarx” at Open Source Summit Europe. As well as some new information on the architectural components for an Enarx deployment, we had a new demo. What’s exciting about this demo is that it shows off attestation and encryption on Intel’s SGX. Our initial work focussed on AMD’s SEV, so this is our first working multi-platform workflow. We’re very excited, particularly as this week a number of the team will be attending the first face-to-face meetings of the Confidential Computing Consortium, at which we’ll be submitting Enarx as a project for contribution to the Consortium.

The demo was the work of several people, but I’d like to call out Lily Sturmann in particular, who got things working late at night her time, with little time to spare.

What’s particularly important about this news is that SGX has a very different approach to providing a TEE compared with the other technology on which Enarx was previously concentrating, SEV. Whereas SEV provides a VM-based model for a TEE, SGX works at the process level. Each approach has different advantages and offers different challenges, and the very different models that they espouse mean that developers wishing to target TEEs have some tricky decisions to make about which to choose: the run-time models are so different that developing for both isn’t really an option. Add to that the significant differences in attestation models, and there’s no easy way to address more than one silicon platform at a time.

Which is where Enarx comes in. Enarx will provide platform independence both for attestation and run-time, on process-based TEEs (like SGX) and VM-based TEEs (like SEV). Our work on SEV and SGX is far from done, but we also plan to support more silicon platforms as they become available. On the attestation side (which we demoed yesterday), we’ll provide software to abstract away the different approaches. On the run-time side, we’ll provide a W3C standardised WebAssembly environment to allow you to choose at deployment time what host you want to execute your application on, rather than having to choose at development time where you’ll be running your code.

This article has sounded a little like a marketing pitch, for which I apologise. As one of the founders of the project, alongside Nathaniel, I’m passionate about Enarx, and would love you, the reader, to become passionate about it, too. Please visit the Enarx project site for more information – we’d love to tell you more about our passion.