Isolationism – not a 4 letter word (in the cloud)

Things are looking up if you’re interested in protecting your workloads.

In the world of international relations, economics and fiscal policy, isolationism doesn’t have a great reputation. I could go on, I suppose, if I did some research, but this is a security blog[1], and international relations, fascinating area of study though it is, isn’t my area of expertise: what I’d like to do is borrow the word and apply it to a different field: computing, and specifically cloud computing.

In computing, isolation is a set of techniques to protect a process, application or component from another (or a set of the former from a set of the latter). This is pretty much always a good thing – you don’t want another process interfering with the correct workings of your one, whether that’s by design (it’s malicious) or in error (because it’s badly designed or implemented). Isolationism, therefore, however unpopular it may be on the world stage, is a policy that you generally want to adopt for your applications, wherever they’re running.

This is particularly important in the “cloud”. Cloud computing is where you run your applications or processes on shared infrastructure. If you own that infrastructure, then you might call that a “private cloud”, and infrastructure owned by other people a “public cloud”, but when people say “cloud” on its own, they generally mean public clouds, such as those operated by Amazon, Microsoft, IBM, Alibaba or others.

There’s a useful adage around cloud computing: “Remember that the cloud is just somebody else’s computer”. In other words, it’s still just hardware and software running somewhere, it’s just not being run by you. Another important thing to remember about cloud computing is that when you run your applications – let’s call them “workloads” from here on in – on somebody else’s cloud (computer), they’re unlikely to be running on their own. They’re likely to be running on the same physical hardware as workloads from other users (or “tenants”) of that provider’s services. These two realisations – that your workload is on somebody else’s computer, and that it’s sharing that computer with workloads from other people – is where isolation comes into the picture.

Workload from workload isolation

Let’s start with the sharing problem. You want to ensure that your workloads run as you expect them to do, which means that you don’t want other workloads impacting on how yours run. You want them to be protected from interference, and that’s where isolation comes in. A workload running in a Linux container or a Virtual Machine (VM) is isolated from other workloads by hardware and/or software controls, which try to ensure (generally very successfully!) that your workload receives the amount of computing time it should have, that it can send and receive network packets, write to storage and the rest without interruption from another workload. Equally important, the confidentiality and integrity of its resources should be protected, so that another workload can’t look into its memory and/or change it.

The means to do this are well known and fairly mature, and the building blocks of containers and VMs, for instance, are augmented by software like KVM or Xen (both open source hypervisors) or like SELinux (an open source capabilities management framework). The cloud service providers are definitely keen to ensure that you get a fair allocation of resources and that they are protected from the workloads of other tenants, so providing workload from workload isolation is in their best interests.

Host from workload isolation

Next is isolating the host from the workload. Cloud service providers absolutely do not want workloads “breaking out” of their isolation and doing bad things – again, whether by accident or design. If one of a cloud service provider’s host machines is compromised by a workload, not only can that workload possibly impact other workloads on that host, but also the host itself, other hosts and the more general infrastructure that allows the cloud service provider to run workloads for their tenants and, in the final analysis, make money.

Luckily, again, there are well-known and mature ways to provide host from workload isolation using many of the same tools noted above. As with workload from workload isolation, cloud service providers absolutely do not want their own infrastructure compromised, so they are, of course, going to make sure that this is well implemented.

Workload from host isolation

Workload from host isolation is more tricky. A lot more tricky. This is protecting your workload from the cloud service provider, who controls the computer – the host – on which your workload is running. The way that workloads run – execute – is such that such isolation is almost impossible with standard techniques (containers, VMs, etc.) on their own, so providing ways to ensure and prove that the cloud service provider – or their sysadmins, or any compromised hosts on their network – cannot interfere with your workload is difficult.

You might expect me to say that providing this sort of isolation is something that cloud service providers don’t care about, as they feel that their tenants should trust them to run their workloads and just get on with it. Until sometime last year, that might have been my view, but it turns out to be wrong. Cloud service providers care about protecting your workloads from the host because it allows them to make more money. Currently, there are lots of workloads which are considered too sensitive to be run on public clouds – think financial, health, government, legal, … – often due to industry regulation. If cloud service providers could provide sufficient isolation of workloads from the host to convince tenants – and industry regulators – that such workloads can be safely run in the public cloud, then they get more business. And they can probably charge more for these protections as well! That doesn’t mean that isolating your workloads from their hosts is easy, though.

There is good news, however, for both cloud service providers and their teants, which is that there’s a new set of hardware techniques called TEEs – Trusted Execution Environments – which can provide exactly this sort of protection[2]. This is rapidly maturing technology, and TEEs are not easy to use – in that it can not only be difficult to run your workload in a TEE, but also to ensure that it’s running in a TEE – but when done right, they do provide the sorts of isolation from the host that a workload wants in order to maintain its integrity and confidentiality[3].

There are a number of projects looking to make using TEEs easier – I’d point to Enarx in particular – and even an industry consortium to promote open TEE adoption, the Confidential Computing Consortium. Things are looking up if you’re interested in protecting your workloads, and the cloud service providers are on board, too.


1 – sorry if you came here expecting something different, but do stick around and have a read: hopefully there’s something of interest.

2 – the best known are Intel’s SGX and AMD’s SEV.

3 – availability – ensuring that it runs fairly – is more difficult, but as this is a property that is also generally in the cloud service provider’s best interest, and something that can can control, it’s not generally too much of a concern[4].

4 – yes, there are definitely times when it is, but that’s a story for another article.

A cybersecurity tip from Hazzard County

Don’t place that bet in Boss Hogg’s betting saloon: you know he’s up to no good!

It’s a slightly guilty secret, but I used to love watching The Dukes of Hazzard in the early 80’s (the first series started in late 1979, but I suspect that it didn’t make it to the UK until the next year at the earliest).  It all seemed very glamourous, and there were lots of fast car chases.  And a basset hound, which was an extra win.  To say this was early days for cybersecurity would be an understatement, and though there are references in the Wikipedia plot summaries to computers, I can’t honestly say that I remember any of those particular episodes.

One episode has stuck with me, however, for reasons that I can’t fathom.  It’s called “Hazzard Hustle” and (*SPOILER ALERT*) in it, Boss Hogg sets up a crooked betting saloon.  The swindle (if I remember it correctly) is that he controls and delays the supposedly live feeds to the TVs in the saloon, which means that he has access to results before they come in.  Needless to say, the Duke boys (probably aided and abetted by Daisy Duke) get the better of him in the end, and everything turns out OK (for them, not Boss Hogg).

“What can this have to do with cybersecurity?” you have every right to ask.  Well, the answer is reporting and monitoring channels.  Monitoring is important because without it, there is no way for us to check that what we believe should be happening actually is.  The opportunities for direct sensory monitoring of actions in computer-based systems are limited: if I request via a web browser that a banking application transfers funds between one account and another, the only visible effect that I am likely to see is an acknowledgement on the screen. Until I actually try to spend or withdraw that money, I realistically have no way to be assured that the transactions has taken place.

Let’s take an example from the human realm.  It is as if I have a trust relationship with somebody around the corner of a street, out of view, that she will raise a red flag at the stroke of noon, and I have a friend, standing on the corner, who will watch her, and tell me when and if she raises the flag. I may be happy with this arrangement, but only because I have a trust relationship to the friend: that friend is acting as a trusted channel for information.

The word “friend” was chosen carefully above, because there is a trust relationship already implicit in the term. The same is not true for the word “somebody”, which I used to describe the person who was to raise the flag. The situation as described above is likely to make our minds presume that there is a fairly high probability that the trust relationship I have to the friend is sufficient to assure me that he will pass the information correctly. But what if my friend is actually a business partner of the flag-waver? Given our human understanding of the trust relationships typically involved with business partnerships, we may immediately begin to assume that my friend’s motivations in respect to correct reporting are not neutral.

The channels for reporting on actions – monitoring them – are vitally important within cybersecurity, and it is both easy and dangerous to fall into the trap of assuming that they are neutral, and that the only important one is between me and the acting party. In reality, the trust relationship that I have to a set of channels is key to the maintenance of the trust relationships that I have to the key entity that they monitor. In trust relationships involving computer systems, there are often multiple entities or components involved in actions, and these form a “chain of trust”, where each link depends on the other, and the chain is typically only as strong as the weakest of its links.  Don’t forget that.  Oh, and don’t place that bet in Boss Hogg’s betting saloon: you know he’s up to no good!

Timely risk or risky times?

Being aware of “the long game”.

On Friday, 29th November 2019, Jack Merritt and Saskia Jones were killed in a terrorist attack.  A number of members of the public (some with with improvised weapons) and of the emergency services acted with great heroism.  I wanted to mention the mention the names of the victims and to praise those involved in stopping him before mentioning the name of the attacker: Usman Khan.  The victims, the attacker were taking part in an offender rehabilitation conference to help offenders released from prison to reintegrate into society: Khan had been convicted to 16 years in prison for terrorist offences.

There’s an important formula that everyone involved in risk – and given that IT security is all about mitigating risk, that’s anyone involved in security – should know. It’s usually expressed thus:

Risk = likelihood x impact

Sometimes likelihood is sometimes expressed as “probability”, impact as “consequence” or “loss”, and I’ve seen some other variants as well, but the version above is generally sufficient for most purposes.

Using the formula

How should you use the formula? Well, it’s most useful for comparing risks and deciding how to mitigate them. Humans are terrible at calculating risk, and any tools that help them[1] is good.  In order to use this formula correctly, you want to compare risks over the same time period.  You could say that almost any eventuality may come to pass over the lifetime of the universe, but comparing the risk of losing broadband access to the risk of your lead developer quitting for another company between the Big Bang and the eventual heat death of the universe is probably not going to give you much actionable information.

Let’s look at the two variables that we need to have in order to calculate risk.  We’ll start with the impact, because I want to devote most of this article to the other part: likelihood.

Impact is what the damage will be if the risk happens.  In a business context, you want to look at the risk of your order system being brought down for a week by malicious attackers.  You might calculate that you would lose £15,000 in orders.  On top of that, there might be a loss of reputation which you might calculate at £30,000.  Fixing the problem might add £10,000.  Add these together, and the impact is £55,000.

What’s the likelihood?  Well, remember that we need to consider a particular time period.  What you choose will depend on what you’re interested in, but a classic use is for budgeting, and so the length of time considered is often a year.  “What is the likelihood of my order system being brought down for a week by malicious attackers over the next twelve months?” is the question you want to ask.  If you decide that it’s 0.005 (or 0.5%), then your risk is calculated thus:

Risk = 0.005 x 55,000

Risk = 275

The units don’t really matter, because what you want to do is compare risks.  If the risk of your order system being brought down through hardware failure is higher (say 500), then you should probably balance the amount of resources you assign to mitigate these risks accordingly.

Time, reputation, trust and risk

What I’m interested in is a set of rather more complicated risks, however: those associated with human behaviour.  I’m very interested in trust, and one of the interesting things about trust is how we decide to trust people.  One way is by their reputation: if someone keeps behaving well over a long period, then we tend to trust them more – or if badly, then to trust them less[2].  If we trust someone more, our calculation of risk is likely to be strongly based on that trust, as our view of the likelihood of a behaviour at odds with the reputation that person holds will be informed by that.

This makes sense: in the absence of perfect information about humans, their motivations and intentions, our view of risk must be based on something, and reputation is actually a fairly good measure for that.  We might say that the likelihood of a customer defaulting on payment terms reduces year by year as we start to think of them as a “trusted customer”.  As the likelihood reduces, we may decide to increase the amount we lend to them – and thereby the impact of defaulting – to keep the risk about the same, year on year.

The risk here is what is sometimes called “playing the long game”.  Humans sometimes manipulate their reputation, or build up a reputation, in order to perform an action once they have gained trust.  Online sellers my make lots of “good” sales in order to get a 5 star rating over time, only to wait and then make a set of “bad” sales, where they don’t ship goods at all, and then just pocket the money.  Or, they may make many small sales in order to build up a good reputation, and then use that reputation to make one big sale which they have no intention of fulfilling.  Online selling sites are wise to some of these tricks, and have algorithms to try to protect buyers (in fact, the same behaviour can be used by sellers in some cases), but these are not perfect.

I’d like to come back to the London Bridge attack.  In this case, it seems likely that the attacker bided his time over many years, behaving well, and raising his “reputation” among those who knew him – the prison staff, parole board, rehabilitation conference organisers, etc. – so that he had the opportunity to perform one major action at odds with that reputation.  The heroism of those around him stopped him being as successful as he may have hoped, but still at the cost of two innocent lives and several serious injuries.

There is no easy way to deal with such issues.  We need reputation, and we need to allow people to show that they have changed and can be integrated into society, but when we make risk calculations based on reputation in any sphere, we should take care to consider whether actors are playing a long game, and what the possible ramifications would be if they were to act at odds with that reputation.

I noted above that humans are bad at calculating risk, and to follow our example of the non-defaulting customer, one mistake might be to increase the credit we give to that customer beyond the balance of the increase of reputation: actually accepting higher risk than we would have done previously, because we consider them trustworthy.  If we do this, we’ve ceased to use the risk formula, and have started to act irrationally.  Don’t do that.

 


1 – OK, then: “us”.

2 – I’m writing this in the lead up to a UK General Election, and it occurs to me that we actually don’t apply this to most of our politicians.

Who do you trust on trust?

(I’m hoping it’s me.)

I’ve been writing about trust on this blog for a little over two years now. It’s not the only topic, but it’s one about which I’m passionate. I’ve been thinking about issues around trust, particularly in regards to computing and security, for nearly 20 years, and it’s something I care about a lot. I care about it so much that I’m writing a book about it.

In fact, I care about it maybe a little too much. I was at a conference earlier this year and – in a move that will come as little surprise to regular readers of this blog[1] – actually ended up getting quite cross about it. The problem is that lots of people talk about trust, but they either don’t really know what they’re talking about, or they really don’t know what they’re talking about. To be clear, I mean different things by those two statements. Some people know their subject, but their subject isn’t really trust. Other people don’t know their subject, but then again, the thing they think they’re talking about often isn’t trust either. Some people talk about “zero trust“, when I really need to look beyond that concept, and discuss implicit vs explicit trust. People ignore the importance of establishing trust. People ignore the importance of decaying trust. People assume that transitive trust is the same as direct trust. People ignore context. All of these are important, and arguably, its not their fault. There’s actually very little detailed writing about trust outside the social sciences. Given how much discussion there is of trust, trusted computing, trusted systems and the like within the world of IT security, there’s astonishingly little theoretical underpinning of the concept, which means that there’s very little agreement as to what is really meant. And, it turns out, although it seems that trust within the social sciences is quite like trust within computing, it really isn’t.

Anyway, there were people at this conference earlier this year who said things about trust which strongly suggested to me that it would be helpful if there were a good underpinning that people could read and discuss and disagree with: a book, in fact, about trust in computing. I got so annoyed that I made a decision to tell two people – my boss and one of the editors of Opensource.com – that I planned to write a book about it. I’m not sure whether they really believed me, but I ended up putting together a Table of Contents. And then looking for a publisher, and then sending several publishers a copy of the ToC and some further thoughts about what a book might look like, and word count estimates, and a list of possible reader types and markets.

And then someone offered me a contract. This was a little bit of surprise, but after some discussion and negotiation, I’m now contracted to write a book on trust for Wiley. I’m absolutely going to continue to publish this blog, and I’ll continue to write about trust here. And, on occasion, something a little bit more random. I don’t pretend to know everything about the subject, and writing about it here allows me to explore some of the more tricky issues. I hope you’ll join me for the ride – and if you have suggestions or questions, I’d love to hear about them.


1 – or my wife and kids.

What’s a Trusted Compute Base?

Tamper-evidence, auditability and measurability are three important properties.

A few months ago, in an article called “Turtles – and chains of trust“, I briefly mentioned Trusted Compute Bases, or TCBs, but then didn’t go any deeper.  I had a bit of a search across the articles on this blog, and realised that I’ve never gone into this topic in much detail, which feels like a mistake, so I’m going to do it now.

First of all, let’s think about computer systems.  When I talk about systems, I’m being both quite specific (see Systems security – why it matters) and quite broad (I don’t just mean computer that sits on your desk or in a data centre, but include phones, routers, aircraft navigation devices – pretty much anything that has a set of chips inside it).  There are surely some systems that you don’t rely on too much to do important things, but in most cases, you’re going to care if they go wrong, or, more relevant to this discussion, if they get compromised.  Even the most benign of systems – a smart light-bulb, for instance – can become a nightmare if compromised.  Even if you don’t particularly care whether you can continue to use it in the way it was intended, there are still worries about its misuse in the case of compromise:

  1. it may become a “jumping off point” for malicious attacks into your network or other systems;
  2. it may be used as part of a botnet, piggybacking on your network to attack other systems (leading to sanctions against your legitimate systems from outside);
  3. it may be used as part of a botnet, using up resources such as network bandwidth, storage or electricity (leading to resource constraints or increased charges).

For any systems dealing with sensitive data – anything from your messages to loved ones on your phone through intellectual property secrets for a manufacturing organisation through to National Security data for government department – these issues are compounded.  In order to protect your system, you can’t just say “this system is secure” (lovely as that would be).  What can you do to start making statement about the general security of a system?

The stack

Systems consist of multiple components, and modern computing systems are typically composed from multiple layers (one of my favourite xkcd comics, Stack, shows some of them).  What’s relevant from the point of view of this article is that, on the whole, the different layers of the stack start up – boot up – from the bottom upwards.  This means, following the “bottom turtle” rule (see the Turtles article referenced above), that we need to ensure that the bottom layer is as secure as possible.  In fact, in order to build a system in which we can have assurance that it will behave as expected and designed (in other words, a system in which we can have a trust relationship), we need to build a Trusted Compute Base.  This should have at least the following set of properties: tamper-evidence, auditability and measurability, all of which are related to each other.

Tamper-evidence

We want to know if the TCB – on which we are building everything else – has a problem.  Specifically, we need a set of layers or components that we are pretty sure have not been compromised, or which, if compromised, will be tamper-evident:

  • fail in expected ways,
  • refuse to start, or
  • flag that they have been compromised.

It turns out that this is not easy, and typically becomes more difficult as you move up the stack – partly because you’re adding more layers, and partly because those layers tend to get more complex.

Our TCB should have the properties listed above (around failure, refusing to start or compromise-flagging), and be as small as possible.  This seems the wrong way around: surely you would want to ensure that as much of your system was trusted as possible?  In fact, what you want is a small, easily measurable and easily auditable TCB on which you can build the rest of your system – from which you can build a “chain of trust” to the other parts of your system about which you care.  Auditability and measurability are the other two properties that you want in a TCB, and these two properties are why open source is a very useful tool in your toolkit when building a TCB.

Auditability (and open source)

Auditability means that you – or someone else who you trust to do the job – can look into the various components of the TCB and assure yourself that they have been written, compiled and  are executing properly.  As I explained in Of projects, products and (security) community, the person may not always be you, or even someone in your organisation, but if you’re using widely deployed open source software, the rest of the community can be doing that auditing for you, which is a win for you and – if you contribute your knowledge back into the community – for everybody else as well.

Auditability typically gets harder the further you go down the stack – partly because you’re getting closer and closer to bits – ones and zeros – and to actual electrons, and partly because there is very little truly open source hardware out there at the moment.  However, the more that we can see and audit of the TCB, the more confidence we can have in it as a building block for the rest of our system.

Measurability (and open source)

The other thing you want to be able to do is ensure that your TCB really is your TCB.  Tamper-evidence is related to this, but that’s a run-time property only (for software components, at least).  Being able to measure when you provision your system and then to check that what you originally loaded is still what you think it should be when you boot it is a very important property of a TCB.  If what you’re running is open source, you can check it yourself, against your own measurements and those of the community, and if changes are made – by you or others – those changes can be checked (as part of auditing) and then propagated through measurement checking to the rest of the community.  Equally important – and much more difficult – is run-time measurability.  This turns out to be very difficult to do, although there are some techniques emerging which are beginning to get traction – for now, we tend to rely on tamper-evidence, which is easier in hardware than software.

Summary

Trusted Compute Bases (TCBs) are a key concept in building systems that we hope will behave in ways we expect – or allow us to find out when they are not.  Tamper-evidence, auditability and measurability are three important properties that they should display, and it turns out that open source is an important factor in helping us ensure two of those.

 

 

 

Humans and (being bad at) trust

Why “signing parties” were never a good idea.

I went to a party recently, and it reminded of quite how bad humans are at trust. It was a work “mixer”, and an attempt to get people who didn’t know each other well to chat and exchange some information. We were each given two cards to hang around our necks: one on which to write our own name, and the other on which we were supposed to collect the initials of those to whom we spoke (in their own hand). At the end of the event, the plan was to hand out rewards whose value was related to the number of initials collected. Pens/markers were provided.

I gamed the system by standing by the entrance, giving out the cards, controlling the markers and ensuring that everybody signed card, hence ending up with easily the largest number of initials of anyone at the party. But that’s not the point. Somebody – a number of people, in fact – pointed out the similarities between this and “key signing parties”, and that got me thinking. For those of you not old enough – or not security-geeky enough – to have come across these, they were events which were popular in the late nineties and early parts of the first decade of the twenty-first century[1] where people would get together, typically at a tech show, and sign each other’s PGP keys. PGP keys are an interesting idea whereby you maintain a public-private key pair which you use to sign emails, assert your identity, etc., in the online world. In order for this to work, however, you need to establish that you are who you say you are, and in order for this to work, you need to convince someone of this fact.

There are two easy ways to do this:

  1. meet someone IRL[2], get them to validate your public key, and sign it with theirs;
  2. have someone who knows the person you met in step 1 agree that they can probably trust you, as the person in step 1 did, and they trust them.

This is a form of trust based on reputation, and it turns out that it is a terrible model for trust. Let’s talk about some of the reasons for it not working. There are four main ones:

  • context
  • decay
  • transitive trust
  • peer pressure.

Let’s evaluate these briefly.

Context

I can’t emphasise this enough: trust is always, always contextual (see “What is trust?” for a quick primer). When people signed other people’s key-pairs, all they should really have been saying was “I believe that the identity of this person is as stated”, but signatures and encryption based on these keys was (and is) frequently misused to make statements about, or claim access to, capabilities that were not necessarily related to identity.

I lay some of the fault of this at the US alcohol consumption policy. Many (US) Americans use their driving licence/license as a form of authorisation: I am over this age, and am therefore entitled to purchase alcohol. It was designed to prove that their were authorised to drive, and nothing more than that, but you can now get a US driving licence to prove your age even if you can’t drive, and it can be used, for instance, as security identification for getting on aircraft at airportsThis is crazy, but partly explains why there is such a confusion between identification, authentication and authorisation.

Decay

Trust, as I’ve noted before in many articles, decays. Just because I trust you now (within a particular context) doesn’t mean that I should trust you in the future (in that or any other context). Mechanisms exist within the PGP framework to expire keys, but it was (I believe) typical for someone to resign a new set of keys just because they’d signed the previous set. If they were only being used for identity, then that’s probably OK – most people rarely change their identity, after all – but, as explained above, these key pairs were often used more widely.

Transitive trust

This is the whole “trusting someone because I trust you” problem. Again, if this were only about identity, then I’d be less worried, but given people’s lack of ability to specify context, and their equal inability to communicate that to others, the “fuzziness” of the trust relationships being expressed was only going to increase with the level of transitiveness, reducing the efficacy of the system as a whole.

Peer pressure

Honestly, this occurred to me due to my gaming of the system, as described in the second paragraph at the top of this article. I remember meeting people at events and agreeing to endorse their key-pairs basically because everybody else was doing it. I didn’t really know them, though (I hope) I had at least heard of them (“oh, you’re Denny’s friend, I think he mentioned you”), and I certainly shouldn’t have been signing their key-pairs. I am certain that I was not the only person to fall into this trap, and it’s a trap because humans are generally social animals[3], and they like to please others. There was ample opportunity for people to game the system much more cynically than I did at the party, and I’d be surprised if this didn’t happen from time to time.

Stepping back a bit

To be fair, it is possible to run a model like this properly. It’s possible to avoid all of these by insisting on proper contextual trust (with multiple keys for different contexts), by re-evaluating trust relationships on a regular basis, by being very careful about trusting people just due to their trusting someone else (or refusing to do so at all), and by refusing just to agree to trust someone because you’ve met them and they “seem nice”. But I’m not aware of anyone – anyone – who kept to these rules, and it’s why I gave up on this trust model over a decade ago. I suspect that I’m going to get some angry comments from people who assert that they used (and use) the system properly, and I’m sure that there are people out there who did and do: but as a widespread system, it was only going to work if the large majority of all users treated it correctly, and given human nature and failings, that never really happened.

I’m also not suggesting that we have many better models – but we really, really need to start looking for some, as this is important, and difficult stuff.


1 – I refuse to refer to these years the “aughts”.

2 – In Real Life – this used to be an actual distinction to online.

3 – even a large enough percentage of IT folks to make this a problem.

Breaking the security chain(s)

Your environment is n-dimensional – your trust must be, too.

One of the security principles by which we[1] live[2] is that security is only as strong as the weakest link in a chain.  That link is variously identified as:

  • your employees
  • external threat actors
  • all humans
  • lack of training
  • cryptography
  • logging
  • anti-virus
  • auditing capabilities
  • the development lifecycle
  • waterfall methodology
  • passwords
  • any other authentication mechanisms
  • electrical wiring
  • hurricanes
  • earthquakes
  • and pestilence.

Actually, I don’t think I’ve ever seen the last one mentioned, but it’s only a matter of time.  However, very rarely does anybody bother to identify exactly what the chain is that it is being broken by the weakest link splintering into a thousand pieces.

There are a number of candidates that spring to mind:

  1. your application flow.  This is rather an old-fashioned way of thinking of applications: that a program is started, goes through a set of actions, and then terminates, but to think more broadly about it, any action which causes an application to behave in unexpected or unintended ways is a possible security flow, whether that is a monolithic application, a set of microservices or an app on  mobile device.
  2. your software stack.  Depending on how you think about your stack, there are likely to be at least 5, maybe a dozen or maybe even scores of layers in your software stack (for an example with just a few simple layers, see Announcing Enarx).  However you think about it, you need to trust all of those layers to do what who expect them to do.  If one of them is compromised, malicious, or just poorly implemented or maintained, then you have a security issue.
  3. your hardware stack.  There was a time, barely five years ago, when most people (excepting us[1], of course), assumed that hardware did what we thought it was supposed to do, all of the time.  In fact, we should all have known better, given the Clipper Chip and the Pentium bug (to name just to famous examples), but with Spectre, Meltdown and a growing realisation that hardware isn’t as trustworthy as was previously thought, everybody needs to decide exactly what security they can trust in which components.
  4. your operational processes.  You can have the best software and hardware in the world, but if you don’t maintain it and operate it properly, it’s going to be full of holes.  Failing to invest in operations, monitoring, logging, auditing and the rest leaves you wide open.
  5. your supply chain. There’s a growing understanding in the industry that our software and hardware supply chains are possible points of failure[3].  Whether your vendor is entirely proprietary (in which case their security is largely opaque) or open source (in which case you’ve got a chance to be able to see what’s going on), errors or maliciousness in the supply chain can scupper any hopes you had of security for your deployment.
  6. your software and hardware lifecycle.  Developing software?  Patching it?  Upgrading hardware (or software)?  Unit testing?  Unit testingg?  We all know that a failure to manage the lifecycle of our environment can lead to security problems.

The point I’m trying to make above is that there’s no single chain.  Your environment is n-dimensional – your trust must be, too.  If you don’t think about all of these contexts – and there will be more beyond the half-dozen that I’ve just noted – then you can’t have a good chance of managing security in your environment.  I honestly don’t think that there’s any single weakest link in the chain, because there are always already multiple chains in play: our job is to think about as many of them as possible, and then manage an mitigate the risks associated with each.


1 – the mythical “IT security community”.

2 – you’re right: “which we live by” would sound much more natural.

3 – and a growing industry to try to provide fixes.

What is confidential computing?

Industry interest has been high, and overwhelmingly positive.

On Wednesday, 21st August, 2019 (just under a week ago, at time of writing), Jim Zemlin of the Linux Foundation announced the intent to form the Confidential Computing Consortium, with members including Alibaba, Arm, Baidu, Google Cloud, IBM, Intel, Microsoft, Red Hat, Swisscom and Tencent.  I’m particularly proud as Red Hat (my employer) is one of those[1], and I spent the preceding few weeks and days working very hard to ensure that we would be listed as one of the planned founding members.

“Confidential Computing” sounds like a lofty goal, and it is.  We’ve known for ages that you should encrypt sensitive data at rest (in storage), in transit (on the network), but confidential computing, as defined by the consortium, is about doing the same for sensitive data – and algorithms – in use.  The consortium plans to encourage industry to use hardware technologies generally called Trust Execution Environments to allow applications and processes to be encrypted as they are running.

This may sound somewhat familiar to those who follow my blog, and it should: Enarx, an open source project launched by Red Hat, was announced as one of the projects that should be part of the initial launch.  I’ve written about Enarx in several places:

Additionally, you’ll find lots of information on the introduction page of the Enarx wiki.

The press release from the Linux Foundation lists the following goals for the Confidential Computing Consortium (my emboldening):

The Confidential Computing Consortium will bring together hardware vendors, cloud providers, developers, open source experts and academics to accelerate the confidential computing market; influence technical and regulatory standards; and build open source tools that provide the right environment for TEE development. The organization will also anchor industry outreach and education initiatives.

Enarx, of course, fits perfectly into this description, as per the text in bold.  Beyond that, however, is the alignment that there is with the other aims of the Enarx project, and the opportunities with which a wider consortium presents us.  The addition of hardware vendors gives us – and the other participants – opportunities to discuss implementations (hardware and software) in an open environment, cloud providers and other users will give us great use cases, and academic involvement broadens the likelihood of quick access to new ideas and research.

We also expect industry and regulatory standards to be forthcoming, and a need for education as the more sectors and industries engage with confidential computing: the consortium provides a framework to engage in related activities.

It’s early days for the Confidential Computing Consortium, but I’m really hopeful and optimistic.  Already, the openness displayed between the planned members on both technical and non-technical collaboration has gone far beyond what I would have expected.  The industry interest – as evidenced by press and community activities – has been high, and overwhelmingly positive. Fans of Enarx – and confidential computing generally – should be excited by the prospect of greater visibility and collaboration.  After all, isn’t that what open source is about in the first place?


1 – this seems like a good place to point out that the views in this article and blog are my own, and may not represent those of my employer, of the Confidential Computing Consortium, the Linux Foundation or any other body.

Turtles – and chains of trust

There’s a story about turtles that I want to tell.

One of the things that confuses Brits is that many Americans[1] don’t know the difference between tortoises and turtles[2], whereas we (who have no species of either type which are native to our shores) seem to no have no problem differentiating them[3].  This is the week when Americans[1] like to bash us Brits over the little revolution they had a couple of centuries ago, so I don’t feel too bad about giving them a little hassle about this.

As it happens, there’s a story about turtles that I want to tell. It’s important to security folks, to the extent that you may hear a security person just say “turtles” to a colleague in criticism of a particular scheme, which will just elicit a nod of agreement: they don’t like it. There are multiple versions of this story[4]: here’s the one I tell:

A learned gentleman[5] is giving a public lecture.  He has talked about the main tenets of modern science, such as the atomic model, evolution and cosmology.  At the end of the lecture, an elderly lady comes up to him.

“Young man,” she says.

“Yes,” says he.

“That was a very interesting lecture,” she continues.

“I’m glad you enjoyed it,” he replies.

“You are, however, completely wrong.”

“Really?” he says, somewhat taken aback.

“Yes.  All that rubbish about the Earth hovering in space, circling the sun.  Everybody knows that the Earth sits on the back of a turtle.”

The lecturer, spotting a hole in her logic, replies, “But madam, what does the turtle sit on?”

The elderly lady looks at him with a look of disdain.  “What a ridiculous question!  It’s turtles all the way down, of course!”

The problem with the elderly lady’s assertion, of course, is one of infinite regression: there has to be something at the bottom. The reason that this is interesting to security folks is that they know that systems need to have a “bottom turtle” at some point.  If you are to trust a system, it needs to sit on something: this is typically called the “TCB”, or Trusted Compute Base, and, in most cases, needs to be rooted in hardware.  Even saying “rooted in hardware” is not enough: exactly what hardware you trust, and where, depends on a number of factors, including what you know about your hardware supply chain; what you feel about motherboards; what your security posture is; how realistic it is that State Actors might try to attack you; how deeply you want to delve into the hardware stack; and, ultimately, just how paranoid you are.

Principles for chains of trust

When you are building a system which you need to have some trust in, you will typically talk about the chain of trust, from the bottom up.  This idea of a chain of trust is very important, and very pervasive, within security.  It allows for some important principles:

  • there has to be a root of trust somewhere (the “bottom turtle”);
  • the chain is only as strong as its weakest link (and attackers will find it);
  • be explicit about each of the links in the chain;
  • realise that some of the links in the chain may change (e.g. if software is updated);
  • be aware that once you have lost trust in a chain, you need to rebuild it from at least the layer below the one in which you have lost trust;
  • simple chains (with no “joins” with other chains of trust) are much, much simpler to validate and monitor than more complex ones.

Software/hardware systems are not the only place in which you will encounter chains of trust: in fact, you come across them every time you make a TLS[6] connection to a web site (you know: that green padlock icon in the address bar).  In this case, there’s a chain (sometimes short, sometimes long) of certificates from a “root CA” (a trusted party that your browser knows about) down to the organisation (or department or sub-organisation) running the web site to which you’re connecting.  Assuming that each link in the chain trusts the next link to be who they say they are, the chain of signatures (turned into a certificate) can be checked, to give an indication, at least, that the site you’re visiting isn’t a spoof one by somebody pretending to be, for example, your bank.  In this case, the bottom turtle is the root CA[7], and its manifestation in the chain of trust is its root certificate.

And chains of trust aren’t restricted to the world of IT, either: supply chains care a lot about chains of trust.  Can you be sure that the diamond in the ring you bought from your local jewellery store, who got it from an artisan goldsmith, who got it from a national diamond chain, did not originally come from a “blood diamond” nation?  Can you be sure that the replacement part for your car, which you got from your local independent dealership, is an original part, and can the manufacturer be sure of the quality of the materials they used?  Blockchains are offering some interesting ways to help track these sorts of supply chains, and can even be applied to supply chains in software.

Chains of trust are everywhere we look.  Some are short, and some are long.  In most cases, there will be a need to employ transitive trust – I need to believe that whoever created my browser checked the root CA, just as you need to believe that your local dealership verified that the replacement part came from the right place – because the number of links that we can verify ourselves is typically low.  This may be due to a variety of factors, including time, expertise and visibility.  But the more we are aware of the fact that there is a chain of trust in any particular situation, the more we can make conscious decision about the amount of trust we should put in it, rather than making assumptions about the safety, security or validation of something we are buying or using.


1 – citizens of the US of A,

2 – have a look on a stock photography site like Pixabay if you don’t believe me.

3 – tortoises are land-based, turtles are aquatic, I believe.

4 – Wikipedia has a good article explaining both the concept and the story’s etymology.

5 – the genders of the protagonists are typically as I tell, which tells you a lot about the historical context, I’m afraid.

6 – this used to be “SSL”, but if you’re still using SSL, you’re in trouble: it’s got lots of holes in it!

7 – or is it?  You could argue that the HSM that (hopefully) houses the root CA, or the processes that protect it, could be considered the bottom turtle.  For the purposes of this discussion, however, the extent of the “system” is the certificate chain and its signers.

Trust & choosing open source

Your impact on open source can be equal to that of others.

信頼と、オープンソースを選ぶということ

A long time ago, in a standards body far, far away, I was involved in drafting a document about trust and security. That document rejoices in the name ETSI GS NFV-SEC 003: Network Functions Virtualisation (NFV);NFV Security; Security and Trust Guidance[1], and section 5.1.6.3[2] talks about “Transitive trust”.  Convoluted and lengthy as the document is, I’m very proud of it[3], and I think it tackles a number of  very important issues including (unsurprisingly, given the title), a number of issues around trust.  It defines transitive trust thus:

“Transitive trust is the decision by an entity A to trust entity B because entity C trusts it.”

It goes on to disambiguate transitive trust from delegated trust, where C knows about the trust relationship.

At no point in the document does it mention open source software.  To be fair, we were trying to be even-handed and show no favour towards any type of software or vendors – many of the companies represented on the standards body were focused on proprietary software – and I wasn’t even working for Red Hat at the time.

My move to Red Hat, and, as it happens, generally away from the world of standards, has led me to think more about open source.  It’s also led me to think more about trust, and how people decide whether or not to use open source software in their businesses, organisations and enterprises.  I’ve written, in particular, about how, although open source software is not ipso facto more secure than proprietary software, the chances of it being more secure, or made more secure, are higher (in Disbelieving the many eyes hypothesis).

What has this to do with trust, specifically transitive trust?  Well, I’ve been doing more thinking about how open source and trust are linked together, and distributed trust is a big part of it.  Distributed trust and blockchain are often talked about in the same breath, and I’m glad, because I think that all too often we fall into the trap of ignoring the fact that there definitely trust relationships associated with blockchain – they are just often implicit, rather than well-defined.

What I’m interested in here, though, is the distributed, transitive trust as a way of choosing whether or not to use open source software.  This is, I think, true not only when talking about non-functional properties such as the security of open source but also when talking about the software itself.  What are we doing when we say “I trust open source software”?  We are making a determination that enough of the people who have written and tested it have similar requirements to mine, and that their expertise, combined, is such that the risk to my using the software is acceptable.

There’s actually a lot going on here, some of which is very interesting:

  • we are trusting architects and designers to design software to meet our use cases and requirements;
  • we are trusting developers to implement code well, to those designs;
  • we are trusting developers to review each others’ code;
  • we are trusting documentation folks to document the software correctly;
  • we are trusting testers to write, run and check tests which are appropriate to my use cases;
  • we are trusting those who deploy the code to run in it ways which are similar to my use cases;
  • we are trusting those who deploy the code to report bugs;
  • we are trusting those who receive bug reports to fix them as expected.

There’s more, of course, but that’s definitely enough to get us going.  Of course, when we choose to use proprietary software, we’re trusting people to do that, but in this case, the trust relationship is much clearer, and much tighter: if I don’t get what I expect, I can choose another vendor, or work with the original vendor to get what I want.

In the case of open source software, it’s all more nebulous: I may be able to identify at least some of the entities involved (designers, software engineers and testers, for example), but the amount of power that I as a consumer of the software have over their work is likely to be low.  There’s a weird almost-paradox here, though: you can argue that for proprietary software vendors, my power over the direction of the software is higher (I’m paying them or not paying them), but my direct visibility into what actually goes on, and my ability to ensure that I get what I want is reduced when compared to the open source case.

That’s because, for open source, I can be any of the entities outlined above.  I – or those in my organisation – can be architect, designer, document writer, tester, and certainly deployer and bug reporter.  When you realise that your impact on open source can be equal to that of others, the distributed trust becomes less transitive.  You understand that you have equal say in the creation, maintenance, requirements and quality of the software which you are running to all the other entities, and then you become part of a network of trust relationships which are distributed, but at less of a remove to that which you’ll experience when buying proprietary software.

Why, then, would anybody buy or license open source software from a vendor?  Because that way, you can address other risks – around support, patching, training, etc. – whilst still enjoying the benefits of the distributed trust network that I’ve outlined above.  There’s a place for those who consume directly from the source, but it doesn’t mean the risk appetite of all software consumers – including those who are involved in the open source community themselves.

Trust is a complex issue, and the ways in which we trust other things and other people is complex, too (you’ll find a bit of an introduction in Of different types of trust), but I think it’s desperately important that we examine and try to understand the sorts of decisions we make, and why we make them, in order to allow us to make informed choices around risk.


1 – if you’ve not been involved in standards creation, this may fill you with horror, but if you have so involved, this sort of title probably feels normal.  You may need help.

2 – see 1.

3 – I was one of two “rapporteurs”, or editors of the document, and wrote a significant part of it, particularly the sections around trust.