Breaking the security chain(s)

Your environment is n-dimensional – your trust must be, too.

One of the security principles by which we[1] live[2] is that security is only as strong as the weakest link in a chain.  That link is variously identified as:

  • your employees
  • external threat actors
  • all humans
  • lack of training
  • cryptography
  • logging
  • anti-virus
  • auditing capabilities
  • the development lifecycle
  • waterfall methodology
  • passwords
  • any other authentication mechanisms
  • electrical wiring
  • hurricanes
  • earthquakes
  • and pestilence.

Actually, I don’t think I’ve ever seen the last one mentioned, but it’s only a matter of time.  However, very rarely does anybody bother to identify exactly what the chain is that is being broken by the weakest link splintering into a thousand pieces.

There are a number of candidates that spring to mind:

  1. your application flow.  This is rather an old-fashioned way of thinking of applications: that a program is started, goes through a set of actions, and then terminates.  Thinking more broadly, however, any action which causes an application to behave in unexpected or unintended ways is a possible security flaw, whether that is a monolithic application, a set of microservices or an app on a mobile device.
  2. your software stack.  Depending on how you think about your stack, there are likely to be at least five, maybe a dozen, or maybe even scores of layers in your software stack (for an example with just a few simple layers, see Announcing Enarx).  However you think about it, you need to trust all of those layers to do what you expect them to do.  If one of them is compromised, malicious, or just poorly implemented or maintained, then you have a security issue.
  3. your hardware stack.  There was a time, barely five years ago, when most people (excepting us[1], of course) assumed that hardware did what it was supposed to do, all of the time.  In fact, we should all have known better, given the Clipper Chip and the Pentium bug (to name just two famous examples), but with Spectre, Meltdown and a growing realisation that hardware isn’t as trustworthy as was previously thought, everybody needs to decide exactly what security they can trust in which components.
  4. your operational processes.  You can have the best software and hardware in the world, but if you don’t maintain it and operate it properly, it’s going to be full of holes.  Failing to invest in operations, monitoring, logging, auditing and the rest leaves you wide open.
  5. your supply chain. There’s a growing understanding in the industry that our software and hardware supply chains are possible points of failure[3].  Whether your vendor is entirely proprietary (in which case their security is largely opaque) or open source (in which case you’ve got a chance to be able to see what’s going on), errors or maliciousness in the supply chain can scupper any hopes you had of security for your deployment.
  6. your software and hardware lifecycle.  Developing software?  Patching it?  Upgrading hardware (or software)?  Unit testing?  We all know that a failure to manage the lifecycle of our environment can lead to security problems.

The point I’m trying to make above is that there’s no single chain.  Your environment is n-dimensional – your trust must be, too.  If you don’t think about all of these contexts – and there will be more beyond the half-dozen that I’ve just noted – then you don’t stand a good chance of managing security in your environment.  I honestly don’t think that there’s any single weakest link in the chain, because there are always multiple chains in play: our job is to think about as many of them as possible, and then manage and mitigate the risks associated with each.
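To make this concrete, here’s a toy sketch in Python – the contexts, links and strength scores are entirely made up for illustration – showing that each chain has its own weakest link, and that shoring up one chain does nothing for the others.

```python
# Toy sketch: there is no single chain, so there is no single weakest link.
# The contexts, links and strength scores below are made up for illustration.
from typing import Dict, List, Tuple

# Each "chain" is a named context: a list of (link, notional strength 0-10).
chains: Dict[str, List[Tuple[str, int]]] = {
    "application flow": [("input validation", 4), ("session handling", 7)],
    "software stack":   [("kernel", 8), ("middleware", 5), ("application", 6)],
    "hardware stack":   [("CPU", 3), ("firmware", 5)],
    "operations":       [("patching", 2), ("monitoring", 6), ("auditing", 4)],
    "supply chain":     [("vendor A", 7), ("vendor B", 3)],
    "lifecycle":        [("unit testing", 5), ("upgrades", 4)],
}

# Every chain has its *own* weakest link: you have to find and manage
# each of them, not just the single weakest link overall.
for name, links in chains.items():
    weakest_link, strength = min(links, key=lambda link: link[1])
    print(f"{name}: weakest link is {weakest_link} (strength {strength})")
```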


1 – the mythical “IT security community”.

2 – you’re right: “which we live by” would sound much more natural.

3 – and a growing industry to try to provide fixes.

I’m turning off your security.

“Don’t worry, I know what I’m doing.”

Today’s security story is people turning security off.  For me, the fact that it’s even a story is the story.  This particular story is covered in The Register, who explain (to nobody’s surprise) that some of the patches to fix issues identified in CPUs (think Spectre, Meltdown, etc.) can actually slow down the applications running on them.  The problem is that, in some cases, they don’t slow them down a little bit, but rather a lot.  By which I mean up to 50%.  And if you’ve bought expensive hardware – or rented it[1] – then you’d generally prefer it if it runs your applications/programs/workloads quickly, rather than just half as fast as they might run.

And so you turn off the security patches.  Your decision: fine.

No, stop: this isn’t what has happened.

The mythical “you”, the person running the workload, isn’t the person who makes the decision, in most cases, because it’s been made for you.  This is the real story.

Linus Torvalds and a bunch of other experts in the Linux kernel[2] have decided that although the patch that could make your workloads secure is available, the functionality that does it should be “off” by default.  They reason – quite correctly, in my opinion – that the vast majority of people running workloads won’t easily be able to turn this functionality on themselves.

They also reason – again, correctly, in my opinion – that most people will care more about how quickly their workloads run than about how secure they are.  I’m not happy about this, but that’s the way it is.
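As an aside: if you’re running a reasonably recent Linux kernel, you can at least see what it believes about your CPU’s vulnerabilities and which mitigations are active, via sysfs.  Here’s a minimal sketch – the sysfs directory is real, though exactly which entries appear depends on your kernel version and hardware:

```python
# Minimal sketch: report the CPU vulnerability mitigation status that the
# kernel exposes under /sys/devices/system/cpu/vulnerabilities/ (available
# on reasonably recent Linux kernels; entries vary by kernel and hardware).
from pathlib import Path

vuln_dir = Path("/sys/devices/system/cpu/vulnerabilities")

if not vuln_dir.is_dir():
    print("This kernel does not expose mitigation status via sysfs.")
else:
    for entry in sorted(vuln_dir.iterdir()):
        # Each file is named for a vulnerability (e.g. spectre_v2) and holds
        # a short status string such as "Mitigation: ..." or "Vulnerable".
        print(f"{entry.name}: {entry.read_text().strip()}")
```

Each status string tells you whether you’re exposed, which is exactly the sort of risk exposure I’ll argue for below.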

What I worry about is the final step in the logic to making the decision.  I’m going to quote Linus:

“Have you seen any actual realistic attacks for normal human users?” he asked. “Things where the kernel should actually care? The JavaScript thing is for the browser to fix up, not for the kernel to say ‘now everything should run up to 50 per cent slower.'”

I get the reasoning behind this, but I don’t like it.  To give some context, somebody came up with an example attack which could compromise certain workloads, and Linus points out that there are better ways to fix this attack than fixing it in the kernel. My concerns are two-fold:

  1. although there may be better places to fix that particular attack, a kernel-level fix is likely to fix an entire class of attacks, meaning better protection for users who are using any application which might include an attack vector.
  2. pointing out that there haven’t been any attacks yet not only ignores the fact that there is a future out there[3] but also points malicious actors in the direction of a likely attack vector.

Now, I know that the more dedicated malicious actors are already looking for these things, but do we really need to advertise?

What’s my fix?

I don’t have one, or at least not an easy one.

Somebody, somewhere, needs to decide whether security is turned on or off.  What I’d honestly like to see is an easier set of controls to allow people to turn security on or off, and to understand the trade-offs when they do so.  The problems with that are:

  • the trade-offs are often much more complex than just “fast and insecure” or “slow and secure”, and are really difficult to explain.
  • in order to make a sensible decision about trade-offs, people need to understand risk.  And people are awful at understanding risk.

And there’s a “chicken and egg problem”[7] here: people won’t understand risk until they are offered the chance to make decisions, but there’s little incentive to offer them complex decisions unless they understand risk.

My plea?  Where possible, expose risk, and explain what it is.  And if you’re turning off security-related functionality, make it easy to turn back on for those who need it.
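What might that look like?  Here’s a toy sketch – not a real API, and all the names are mine – of the shape I’d like such controls to take: turning a security feature off forces you to see, and acknowledge, the risk, while turning it back on is trivial.

```python
# Toy sketch (not a real API): make the trade-off visible at the moment
# somebody turns a security feature off, and make re-enabling it trivial.
from dataclasses import dataclass

@dataclass
class SecurityFeature:
    name: str
    enabled: bool
    risk_if_off: str   # what you're exposed to with the feature off
    cost_if_on: str    # what the feature costs you when it's on

    def disable(self, acknowledge_risk: bool = False) -> None:
        # Refuse to disable silently: the caller must acknowledge the risk.
        if not acknowledge_risk:
            raise ValueError(
                f"Disabling {self.name!r} exposes you to: {self.risk_if_off}. "
                "Pass acknowledge_risk=True if you accept this."
            )
        self.enabled = False

    def enable(self) -> None:
        # Turning security back on should never require a justification.
        self.enabled = True

# Illustrative values only, loosely based on the story above.
mitigation = SecurityFeature(
    name="speculative execution mitigation",
    enabled=False,
    risk_if_off="information leaks between processes via speculative execution",
    cost_if_on="up to 50% slowdown on some workloads",
)
```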


1 – a quick heads-up: this is what “deploying to the cloud” actually is.

2 – what sits at the bottom of many of the workloads that are running in servers.

3 – hopefully.  If the Three Minute Warning[4] sounds while you’re reading this, you may wish to duck and cover.  You can come back to it later[6].

4 – “… sounds like this …”[5].

5 – 80s reference.

6 – or not.  See [3].

7 – for non-native English readers, this means “a problem where the solution requires two pieces, both of which are dependent on each other”.