Sorry – busy and recovering from illness.
Now with added SGX!
Yesterday, Nathaniel McCallum and I presented a session, “Confidential Computing and Enarx”, at Open Source Summit Europe. As well as some new information on the architectural components for an Enarx deployment, we had a new demo. What’s exciting about this demo is that it shows off attestation and encryption on Intel’s SGX. Our initial work focussed on AMD’s SEV, so this is our first working multi-platform workflow. We’re very excited, particularly as this week a number of the team will be attending the first face-to-face meetings of the Confidential Computing Consortium, at which we’ll be submitting Enarx as a project for contribution to the Consortium.
The demo was the work of several people, but I’d like to call out Lily Sturmann in particular, who got things working late at night, her time, with little time to spare.
What’s particularly important about this news is that SGX has a very different approach to providing a TEE compared with the other technology on which Enarx was previously concentrating, SEV. Whereas SEV provides a VM-based model for a TEE, SGX works at the process level. Each approach has different advantages and offers different challenges, and the very different models that they espouse mean that developers wishing to target TEEs have some tricky decisions to make about which to choose: the run-time models are so different that developing for both isn’t really an option. Add to that the significant differences in attestation models, and there’s no easy way to address more than one silicon platform at a time.
Which is where Enarx comes in. Enarx will provide platform independence for both attestation and run-time, on process-based TEEs (like SGX) and VM-based TEEs (like SEV). Our work on SEV and SGX is far from done, and we also plan to support more silicon platforms as they become available. On the attestation side (which we demoed yesterday), we’ll provide software to abstract away the different approaches. On the run-time side, we’ll provide a W3C-standardised WebAssembly environment to allow you to choose at deployment time what host you want to execute your application on, rather than having to choose at development time where you’ll be running your code.
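To make the attestation abstraction concrete, here’s a minimal sketch – in Python, with entirely hypothetical names, and emphatically not Enarx’s actual API – of the shape such platform independence might take: one backend per silicon platform, behind a single interface that the workload owner codes against.

```python
import hashlib
from abc import ABC, abstractmethod

class AttestationBackend(ABC):
    """One implementation per silicon platform; callers never see the difference."""

    @abstractmethod
    def quote(self, report_data: bytes) -> bytes:
        """Return evidence binding report_data to this platform's TEE."""

class SgxBackend(AttestationBackend):
    def quote(self, report_data: bytes) -> bytes:
        # Stand-in for asking SGX's quoting enclave (process-based TEE)
        # to produce a signed quote.
        return b"SGX-QUOTE:" + hashlib.sha256(report_data).digest()

class SevBackend(AttestationBackend):
    def quote(self, report_data: bytes) -> bytes:
        # Stand-in for requesting a signed report from the AMD secure
        # processor (VM-based TEE).
        return b"SEV-REPORT:" + hashlib.sha256(report_data).digest()

def attest(backend: AttestationBackend, workload: bytes) -> bytes:
    # The tenant's code is identical whichever silicon is underneath:
    # hash the workload, then ask the platform to vouch for it.
    return backend.quote(hashlib.sha256(workload).digest())

for backend in (SgxBackend(), SevBackend()):
    print(attest(backend, b"my-wasm-workload")[:20])
```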
This article has sounded a little like a marketing pitch, for which I apologise. As one of the founders of the project, alongside Nathaniel, I’m passionate about Enarx, and would love you, the reader, to become passionate about it, too. Please visit enarx.io for more information – we’d love to tell you more about our passion.
Tamper-evidence, auditability and measurability are three important properties.
A few months ago, in an article called “Turtles – and chains of trust”, I briefly mentioned Trusted Compute Bases, or TCBs, but then didn’t go any deeper. I had a bit of a search across the articles on this blog, and realised that I’ve never gone into this topic in much detail, which feels like a mistake, so I’m going to do it now.
First of all, let’s think about computer systems. When I talk about systems, I’m being both quite specific (see Systems security – why it matters) and quite broad (I don’t just mean the computer that sits on your desk or in a data centre, but include phones, routers, aircraft navigation devices – pretty much anything that has a set of chips inside it). There are surely some systems that you don’t rely on too much to do important things, but in most cases, you’re going to care if they go wrong, or, more relevant to this discussion, if they get compromised. Even the most benign of systems – a smart light-bulb, for instance – can become a nightmare if compromised. Even if you don’t particularly care whether you can continue to use it in the way it was intended, there are still worries about its misuse in the case of compromise:
- it may become a “jumping off point” for malicious attacks into your network or other systems;
- it may be used as part of a botnet, piggybacking on your network to attack other systems (leading to sanctions against your legitimate systems from outside);
- it may be used as part of a botnet, using up resources such as network bandwidth, storage or electricity (leading to resource constraints or increased charges).
For any systems dealing with sensitive data – anything from your messages to loved ones on your phone, through intellectual property secrets for a manufacturing organisation, to National Security data for a government department – these issues are compounded. In order to protect your system, you can’t just say “this system is secure” (lovely as that would be). What can you do to start making statements about the general security of a system?
Systems consist of multiple components, and modern computing systems are typically composed from multiple layers (one of my favourite xkcd comics, Stack, shows some of them). What’s relevant from the point of view of this article is that, on the whole, the different layers of the stack start up – boot up – from the bottom upwards. This means, following the “bottom turtle” rule (see the Turtles article referenced above), that we need to ensure that the bottom layer is as secure as possible. In fact, in order to build a system in which we can have assurance that it will behave as expected and designed (in other words, a system in which we can have a trust relationship), we need to build a Trusted Compute Base. This should have at least the following set of properties: tamper-evidence, auditability and measurability, all of which are related to each other.
We want to know if the TCB – on which we are building everything else – has a problem. Specifically, we need a set of layers or components that we are pretty sure have not been compromised, or which, if compromised, will be tamper-evident:
- fail in expected ways,
- refuse to start, or
- flag that they have been compromised.
It turns out that this is not easy, and typically becomes more difficult as you move up the stack – partly because you’re adding more layers, and partly because those layers tend to get more complex.
Our TCB should have the properties listed above (around failure, refusing to start or compromise-flagging), and be as small as possible. This seems the wrong way around: surely you would want to ensure that as much of your system was trusted as possible? In fact, what you want is a small, easily measurable and easily auditable TCB on which you can build the rest of your system – from which you can build a “chain of trust” to the other parts of your system about which you care. Auditability and measurability are the other two properties that you want in a TCB, and these two properties are why open source is a very useful tool in your toolkit when building a TCB.
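To illustrate how a chain of trust grows from a small TCB, here’s a toy sketch – my own construction, not any particular implementation – of the hash-extension idea used by measured boot schemes such as a TPM’s PCR extend: each layer is folded into a running measurement before the next layer starts, so the final value attests to the whole stack.

```python
import hashlib

def extend(measurement: bytes, component: bytes) -> bytes:
    # Fold the next layer's hash into the running measurement, so the
    # final value depends on every layer and on the order in which the
    # layers started - the same idea as a TPM's PCR "extend" operation.
    return hashlib.sha256(measurement + hashlib.sha256(component).digest()).digest()

measurement = bytes(32)  # well-known starting state
for layer in (b"firmware", b"bootloader", b"kernel", b"userspace"):
    measurement = extend(measurement, layer)

print(measurement.hex())  # changing any layer changes this value
```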
Auditability (and open source)
Auditability means that you – or someone else who you trust to do the job – can look into the various components of the TCB and assure yourself that they have been written, compiled and are executing properly. As I explained in Of projects, products and (security) community, the person may not always be you, or even someone in your organisation, but if you’re using widely deployed open source software, the rest of the community can be doing that auditing for you, which is a win for you and – if you contribute your knowledge back into the community – for everybody else as well.
Auditability typically gets harder the further you go down the stack – partly because you’re getting closer and closer to bits – ones and zeros – and to actual electrons, and partly because there is very little truly open source hardware out there at the moment. However, the more that we can see and audit of the TCB, the more confidence we can have in it as a building block for the rest of our system.
Measurability (and open source)
The other thing you want to be able to do is ensure that your TCB really is your TCB. Tamper-evidence is related to this, but that’s a run-time property only (for software components, at least). Being able to measure your system when you provision it, and then to check, when you boot it, that what you originally loaded is still what is running, is a very important property of a TCB. If what you’re running is open source, you can check it yourself, against your own measurements and those of the community, and if changes are made – by you or others – those changes can be checked (as part of auditing) and then propagated through measurement checking to the rest of the community. Equally important – and much more difficult – is run-time measurability. This turns out to be very difficult to do, although there are some techniques emerging which are beginning to get traction – for now, we tend to rely on tamper-evidence, which is easier in hardware than software.
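A minimal sketch of that provision-time/boot-time check might look like this (Python; the component names and contents are invented for illustration):

```python
import hashlib

def measure(blob: bytes) -> str:
    return hashlib.sha256(blob).hexdigest()

# Recorded at provisioning time, and stored somewhere the running system
# cannot silently rewrite (component names and contents are invented):
expected = {
    "bootloader": measure(b"bootloader build 1.2"),
    "kernel": measure(b"kernel build 5.3"),
}

def check_at_boot(name: str, blob: bytes) -> None:
    actual = measure(blob)
    if actual != expected[name]:
        # Refuse to start rather than run a TCB we can no longer trust.
        raise RuntimeError(f"{name}: measurement mismatch: {actual}")

check_at_boot("kernel", b"kernel build 5.3")  # matches, so no complaint
```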
Trusted Compute Bases (TCBs) are a key concept in building systems that we hope will behave in ways we expect – or allow us to find out when they are not. Tamper-evidence, auditability and measurability are three important properties that they should display, and it turns out that open source is an important factor in helping us ensure two of those.
Not all open source is created (and maintained) equal.
Open source is a good thing. Open source is a particularly good thing for security. I’ve written about this before (notably in Disbelieving the many eyes hypothesis and The commonwealth of Open Source), and I’m going to keep writing about it. In this article, however, I want to talk a little more about a feature of open source which is arguably both a possible disadvantage and a benefit: the difference between a project and a product. I’ll come down firmly on one side (spoiler alert: for organisations, it’s “product”), but I’d like to start with a little disclaimer. I am employed by Red Hat, and we are a company which makes money from supporting open source. I believe this is a good thing, and I approve of the model that we use, but I wanted to flag any potential bias early in the article.
The main reason that open source is good for security is that you can see what’s going on when there’s a problem, and you have a chance to fix it. Or, more realistically, unless you’re a security professional with particular expertise in the open source project in which the problem arises, somebody else has a chance to fix it. We hope that there are sufficient security folks with the required expertise to fix security problems and vulnerabilities in software projects about which we care.
It’s a little more complex than that, however. As an organisation, there are two main ways to consume open source:
- as a project: you take the code, choose which version to use, compile it yourself, test it and then manage it.
- as a product: a vendor takes the project, chooses which version to package, compiles it, tests it, and then sells support for the package, typically including docs, patching and updates.
Now, there’s no denying that consuming a project “raw” gives you more options. You can track the latest version, compiling and testing as you go, and you can take security patches more quickly than the product version may supply them, selecting those which seem most appropriate for your business and use cases. On the whole, this seems like a good thing. There are, however, downsides which are specific to security. These include:
- some security fixes come with an embargo, to which only a small number of organisations (typically the vendors) have access. Although you may get access to fixes at the same time as the wider ecosystem, you will need to check and test these yourself (unless you blindly apply them – don’t do that), work which the vendors will already have performed.
- the huge temptation to make changes to the code that don’t necessarily – or immediately – make it into the upstream project means that you are likely to be running a fork of the code. Even if you do manage to get these changes upstream in time, during the period that you’re running the changes but they’re not upstream, you run a major risk that any security patches will not be immediately applicable to your version (this is, of course, true for non-security patches, but security patches are typically more urgent). One option, of course, if you believe that your version is likely to be consumed by others, is to make an official fork of the project, and try to encourage a community to grow around that, but in the end, you will still have to decide whether to support the new version internally or externally.
- unless you ensure that all instances of the software are running the same version in your deployment, any back-porting of security fixes to older versions will require you to invest in security expertise equal or close to equal to that of the people who created the fix in the first place. In this case, you are giving up the “commonwealth” benefit of open source, as you need to pay experts who duplicate the skills of the community.
What you are basically doing, by choosing to deploy a project rather than a product, is taking the decision to do internal productisation of the project. You lose not only the commonwealth benefit of security fixes, but also the significant economies of scale that are intrinsic to the vendor-supported product model. There may also be economies of scope that you miss: many vendors will have multiple products that they support, and will be able to apply security expertise across those products in ways which may not be possible for an organisation whose core focus is not on product support.
These economies are reflected in another possible benefit to the commonwealth of using a vendor: the very fact that multiple customers are consuming their products means that vendors have an incentive and a revenue stream to spend on security fixes and general features. There are other types of fixes and improvements on which they may apply resources, but the relative scarcity of skilled security experts means that the principle of comparative advantage suggests that they should be in the best position to apply them for the benefit of the wider community.
What if a vendor you use to provide a productised version of an open source project goes bust, or decides to drop support for that product? Well, this is a problem in the world of proprietary software as well, of course. But in the case of proprietary software, there are three likely outcomes:
- you now have no access to the software source, and therefore no way to make improvements;
- you are provided access to the software source, but it is not available to the wider world, and therefore you are on your own;
- everyone is provided with the software source, but no existing community exists to improve it, and it either dies or takes significant time for a community to build around it.
In the case of open source, however, if the vendor you have chosen goes out of business, there is always the option to use another vendor, encourage a new vendor to take it on, productise it yourself (and supply it to other organisations) or, if the worst comes to the worst, take the internal productisation route while you search for a scalable long-term solution.
In the modern open source world, we (the community) have got quite good at managing these options, as the growth of open source consortia shows. In a consortium, groups of organisations and individuals cluster around a software project or set of related projects to encourage community growth, alignment around feature and functionality additions, general security work and productisation for use cases which may as yet be ill-defined, all the while trying to exploit the economies of scale and scope outlined above. An example of this would be the Linux Foundation’s Confidential Computing Consortium, to which the Enarx project aims to be contributed.
Choosing to consume open source software as a product instead of as a project involves some trade-offs, but from a security point of view at least, the economics for organisations are fairly clear: unless you are in a position to employ ample security experts yourself, products are most likely to suit your needs.
1 – note: I’m not an economist, but I believe that this holds in this case. Happy to have comments explaining why I’m wrong (if I am…).
2 – “consortiums” if you really must.
Why “signing parties” were never a good idea.
I went to a party recently, and it reminded me of quite how bad humans are at trust. It was a work “mixer”, and an attempt to get people who didn’t know each other well to chat and exchange some information. We were each given two cards to hang around our necks: one on which to write our own name, and the other on which we were supposed to collect the initials of those to whom we spoke (in their own hand). At the end of the event, the plan was to hand out rewards whose value was related to the number of initials collected. Pens/markers were provided.
I gamed the system by standing by the entrance, giving out the cards, controlling the markers and ensuring that everybody signed my card, hence ending up with easily the largest number of initials of anyone at the party. But that’s not the point. Somebody – a number of people, in fact – pointed out the similarities between this and “key signing parties”, and that got me thinking. For those of you not old enough – or not security-geeky enough – to have come across these, they were events, popular in the late nineties and the early parts of the first decade of the twenty-first century, where people would get together, typically at a tech show, and sign each other’s PGP keys. PGP keys are an interesting idea whereby you maintain a public-private key pair which you use to sign emails, assert your identity, etc., in the online world. In order for this to work, however, you need to establish that you are who you say you are, which means convincing someone else of this fact.
There are two easy ways to do this:
- meet someone IRL, get them to validate your public key, and sign it with theirs;
- have someone who knows the person you met in step 1 trust you at second hand: the person in step 1 vouches for you, and they trust that person.
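Mechanically, “signing” someone’s key just means signing a statement that binds their identity to their public key. Here’s a minimal sketch of the idea – using raw Ed25519 signatures via the third-party cryptography library rather than actual PGP, with invented names:

```python
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey
from cryptography.hazmat.primitives.serialization import Encoding, PublicFormat

# Bob holds an identity key pair; Alice, having met him and checked who he
# is, signs a statement binding his name to his public key.
bob_key = Ed25519PrivateKey.generate()
bob_public = bob_key.public_key().public_bytes(Encoding.Raw, PublicFormat.Raw)

alice_key = Ed25519PrivateKey.generate()
statement = b"identity:Bob|key:" + bob_public
endorsement = alice_key.sign(statement)

# Anyone who already trusts Alice's key can now check her claim - and note
# that the claim is only about identity, nothing else.
try:
    alice_key.public_key().verify(endorsement, statement)
    print("Alice vouches that this key belongs to Bob")
except InvalidSignature:
    print("endorsement invalid")
```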
Both of these are forms of trust based on reputation, and it turns out that reputation is a terrible model for trust. Let’s talk about some of the reasons for it not working. There are four main ones:
- context
- decay
- transitive trust
- peer pressure.
Let’s evaluate these briefly.
Context
I can’t emphasise this enough: trust is always, always contextual (see “What is trust?” for a quick primer). When people signed other people’s key-pairs, all they should really have been saying was “I believe that the identity of this person is as stated”, but signatures and encryption based on these keys were (and are) frequently misused to make statements about, or claim access to, capabilities that were not necessarily related to identity.
I lay some of the blame for this on US alcohol consumption policy. Many (US) Americans use their driving licence/license as a form of authorisation: I am over this age, and am therefore entitled to purchase alcohol. It was designed to prove that they were authorised to drive, and nothing more than that, but you can now get a US driving licence to prove your age even if you can’t drive, and it can be used, for instance, as security identification for getting on aircraft at airports. This is crazy, but partly explains why there is such confusion between identification, authentication and authorisation.
Decay
Trust, as I’ve noted before in many articles, decays. Just because I trust you now (within a particular context) doesn’t mean that I should trust you in the future (in that or any other context). Mechanisms exist within the PGP framework to expire keys, but it was (I believe) typical for someone to re-sign a new set of keys just because they’d signed the previous set. If the keys were only being used for identity, then that’s probably OK – most people rarely change their identity, after all – but, as explained above, these key pairs were often used more widely.
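If you wanted to model that decay, one entirely made-up but illustrative approach is to halve your confidence in a trust relationship for every fixed period that passes without re-validation:

```python
def decayed_trust(initial: float, age_days: float, half_life_days: float = 365.0) -> float:
    # Halve confidence for every "half-life" that passes without the
    # relationship being re-validated. The numbers are arbitrary.
    return initial * 0.5 ** (age_days / half_life_days)

print(f"{decayed_trust(0.9, 0):.3f}")    # 0.900 - freshly validated
print(f"{decayed_trust(0.9, 730):.3f}")  # 0.225 - two years without re-checking
```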
Transitive trust
This is the whole “trusting someone because I trust you” problem. Again, if this were only about identity, then I’d be less worried, but given people’s lack of ability to specify context, and their equal inability to communicate that to others, the “fuzziness” of the trust relationships being expressed was only going to increase with the level of transitiveness, reducing the efficacy of the system as a whole.
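One way to see why transitiveness dilutes assurance is to model each hop’s confidence as a number in (0, 1] and compose hops multiplicatively – a toy model of my own, not anything PGP itself defines:

```python
from math import prod

def chained_trust(edge_trusts: list[float]) -> float:
    # Compose per-hop confidence multiplicatively: every extra hop can
    # only reduce the assurance you end up with.
    return prod(edge_trusts)

print(f"{chained_trust([0.9]):.3f}")            # 0.900 - I validated you myself
print(f"{chained_trust([0.9, 0.9, 0.9]):.3f}")  # 0.729 - a friend of a friend of a friend
```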
Peer pressure
Honestly, this one occurred to me due to my gaming of the system described in the second paragraph at the top of this article. I remember meeting people at events and agreeing to endorse their key-pairs basically because everybody else was doing it. I didn’t really know them, though (I hope) I had at least heard of them (“oh, you’re Denny’s friend, I think he mentioned you”), and I certainly shouldn’t have been signing their key-pairs. I am certain that I was not the only person to fall into this trap, and it’s a trap because humans are generally social animals, and they like to please others. There was ample opportunity for people to game the system much more cynically than I did at the party, and I’d be surprised if this didn’t happen from time to time.
Stepping back a bit
To be fair, it is possible to run a model like this properly. It’s possible to avoid all of these by insisting on proper contextual trust (with multiple keys for different contexts), by re-evaluating trust relationships on a regular basis, by being very careful about trusting people just due to their trusting someone else (or refusing to do so at all), and by refusing just to agree to trust someone because you’ve met them and they “seem nice”. But I’m not aware of anyone – anyone – who kept to these rules, and it’s why I gave up on this trust model over a decade ago. I suspect that I’m going to get some angry comments from people who assert that they used (and use) the system properly, and I’m sure that there are people out there who did and do: but as a widespread system, it was only going to work if the large majority of all users treated it correctly, and given human nature and failings, that never really happened.
I’m also not suggesting that we have many better models – but we really, really need to start looking for some, as this is important, and difficult stuff.
1 – I refuse to refer to these years as the “aughts”.
2 – In Real Life – this used to be an actual distinction from “online”.
3 – even a large enough percentage of IT folks to make this a problem.
Your environment is n-dimensional – your trust must be, too.
One of the security principles by which we live is that security is only as strong as the weakest link in a chain. That link is variously identified as:
- your employees
- external threat actors
- all humans
- lack of training
- auditing capabilities
- the development lifecycle
- waterfall methodology
- any other authentication mechanisms
- electrical wiring
- and pestilence.
Actually, I don’t think I’ve ever seen the last one mentioned, but it’s only a matter of time. However, very rarely does anybody bother to identify exactly what the chain is that gets broken when the weakest link splinters into a thousand pieces.
There are a number of candidates that spring to mind:
- your application flow. This is rather an old-fashioned way of thinking of applications: that a program is started, goes through a set of actions, and then terminates. But to think more broadly about it, any action which causes an application to behave in unexpected or unintended ways is a possible security flaw, whether that application is a monolithic application, a set of microservices or an app on a mobile device.
- your software stack. Depending on how you think about your stack, there are likely to be at least 5, maybe a dozen, or maybe even scores of layers in your software stack (for an example with just a few simple layers, see Announcing Enarx). However you think about it, you need to trust all of those layers to do what you expect them to do. If one of them is compromised, malicious, or just poorly implemented or maintained, then you have a security issue.
- your hardware stack. There was a time, barely five years ago, when most people (excepting us, of course) assumed that hardware did what we thought it was supposed to do, all of the time. In fact, we should all have known better, given the Clipper Chip and the Pentium bug (to name just two famous examples), but with Spectre, Meltdown and a growing realisation that hardware isn’t as trustworthy as was previously thought, everybody needs to decide exactly what security they can trust in which components.
- your operational processes. You can have the best software and hardware in the world, but if you don’t maintain it and operate it properly, it’s going to be full of holes. Failing to invest in operations, monitoring, logging, auditing and the rest leaves you wide open.
- your supply chain. There’s a growing understanding in the industry that our software and hardware supply chains are possible points of failure. Whether your vendor is entirely proprietary (in which case their security is largely opaque) or open source (in which case you’ve got a chance to be able to see what’s going on), errors or maliciousness in the supply chain can scupper any hopes you had of security for your deployment.
- your software and hardware lifecycle. Developing software? Patching it? Upgrading hardware (or software)? Unit testing? We all know that a failure to manage the lifecycle of our environment can lead to security problems.
The point I’m trying to make above is that there’s no single chain. Your environment is n-dimensional – your trust must be, too. If you don’t think about all of these contexts – and there will be more beyond the half-dozen that I’ve just noted – then you can’t have a good chance of managing security in your environment. I honestly don’t think that there’s any single weakest link in the chain, because there are always multiple chains in play: our job is to think about as many of them as possible, and then manage and mitigate the risks associated with each.
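To put some toy numbers on that (entirely my own construction): treat each dimension as its own chain, let the weakest link set each chain’s strength, and note how weaknesses compound across chains:

```python
# Toy numbers, not measurements: each security dimension is its own chain,
# a chain is only as strong as its weakest link, and weaknesses compound
# across chains.
chains = {
    "application flow": [0.95, 0.90, 0.99],
    "software stack": [0.90, 0.80, 0.95],
    "hardware stack": [0.85, 0.90],
    "operations": [0.90, 0.70],
    "supply chain": [0.95, 0.85],
    "lifecycle": [0.90, 0.90],
}

weakest = {name: min(links) for name, links in chains.items()}
overall = 1.0
for strength in weakest.values():
    overall *= strength

for name, strength in sorted(weakest.items(), key=lambda kv: kv[1]):
    print(f"{name:18} weakest link: {strength:.2f}")
print(f"{'overall':18} {overall:.3f}")  # far lower than any single chain
```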
1 – the mythical “IT security community”.
2 – you’re right: “which we live by” would sound much more natural.
3 – and a growing industry to try to provide fixes.
The news from the UK is amazing today: the Supreme Court has ruled that the Prime Minister has failed to “prorogue” Parliament – in other words, that the members of the House of Commons and the House of Lords are still in session. The words in the title come from the judgment that they have just handed down.
I’m travelling this week, and wasn’t expecting to write a post today, but this triggered a thought in me: what provisions are in place in your organisation to cope with abuses of power and possible illegal actions by managers and executives?
Having a whistle-blowing policy and an independent appeals process is vital. This is true for all employees, but having specific rules in place for employees who are involved in such areas as compliance, or in implementations involving regulatory requirements, is particularly important. Robust procedures protect not only an employee who finds themself in a difficult position but, in the long view, the organisation itself. They can also act as a deterrent to managers and executives considering actions which, in the absence of such procedures, might well go unreported.
Such procedures are not enough on their own – they fall into the category of “necessary, but not sufficient” – and a culture of ethical probity also needs to be encouraged. But without such a set of procedures, your organisation is at real risk.