Encryption “backdoor”? No, it’s a gaping archway.

Backdoors are just a non-starter.

Note: this is probably one of those posts where I should point out that the views expressed in this article aren’t necessarily those of my employer, Red Hat.  Though I hope that they are.

I understand that governments don’t like encryption. Well, to be fair, they like encryption for their stuff, but they don’t want criminals, or people who might be criminals, to have it. The problem is that “people who might be criminals” means you and me[1]. I need encryption, and you need encryption. For banking, for business, for health records, for lots of things. This isn’t the first time I’ve blogged on this issue, and I actually compiled a list in a previous post, giving some examples of perfectly legal, perfectly appropriate reasons for “us” to be using encryption. I’ve even written about the importance of helping governments go about their business.

Unfortunately, it seems that the Government of Australia has been paying insufficient attention to the points that I have[2] been making. It seems that they are hell-bent on passing a law that would require relevant organisations (the types of organisations listed are broad and ill-defined, in the coverage that I’ve seen) to provide a backdoor into individuals’ encrypted messages. Only for specific individuals, you’ll note, not blanket decryption.  Well, that’s a relief. And that was sarcasm.

The problem? Mathematics. Cryptography is based on mathematics. Much of it is actually quite simple, though some of it is admittedly complex. But you don’t argue with mathematics, and the mathematics says that you can’t just create a backdoor and have the rest of the scheme remain as secure as it was before.

Most existing encryption/decryption schemes[3] allow one party to send encrypted data to another with a single shared key. To decrypt, you need that key. In order to get that key, you either need to be one of the two parties (typically referred to as “Alice” and “Bob”), or hope that, as a malicious[5] third party (typically referred to as “Eve”[6]), you can do one of the following:

  1. get Alice or Bob to give you their key;
  2. get access to the key by looking at some or all of the encrypted messages;
  3. use a weakness in the encryption process to decrypt the messages.

Now, number 1 isn’t great if you don’t want Alice or Bob to know that you’re snooping on their messages. Number 2 is a protocol weakness, and designers of cryptographic protocols try very, very hard to avoid them. Number 3 is an implementation weakness, and reputable application developers will try very, very hard to avoid those. What’s more, for applications which are open source, anyone can have a look at them, so a weakness put in on purpose isn’t likely to stay hidden for long.

Both 2 and 3 can lead to backdoors. But they’re not single-use backdoors, they’re gaping archways that anyone can find out about and exploit.
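
To make the Alice, Bob and Eve setup above a little more concrete, here’s a minimal sketch using Python’s cryptography package (purely illustrative: it isn’t a model of any real messaging application). With a single shared key, Alice can encrypt and Bob can decrypt; Eve, without that key, gets nowhere unless option 2 or 3 is available to her.

# A minimal sketch of the shared-key scheme described above, using the Python
# "cryptography" package (pip install cryptography). Illustration only.
from cryptography.fernet import Fernet, InvalidToken

shared_key = Fernet.generate_key()          # known only to Alice and Bob

# Alice encrypts with the shared key...
ciphertext = Fernet(shared_key).encrypt(b"Meet at the usual place at 19:00")

# ...and Bob, holding the same key, decrypts.
assert Fernet(shared_key).decrypt(ciphertext) == b"Meet at the usual place at 19:00"

# Eve, without the key, is reduced to guessing; a guessed key simply fails.
try:
    Fernet(Fernet.generate_key()).decrypt(ciphertext)
except InvalidToken:
    print("Eve's guessed key fails: she needs option 2 or 3 from the list above.")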

Would it be possible to design protocols that allowed a third party to hold a key for each encryption session, allowing individual sessions to be decrypted by a “trusted party” such as law enforcement? Yes, it would. But a) no-one with half a brain would knowingly use such a scheme[7]; b) the operational overhead of running such a scheme would be unmanageable; and c) it would only be a matter of time before untrusted parties got access to the systems behind the scheme and misused it.
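
For what it’s worth, here’s a hypothetical sketch of what such a “key per session, lodged with a trusted party” scheme looks like (the names and structure are invented for illustration). Point c) above is the killer: whoever gets hold of the escrow store gets every session, not just the one a warrant covers.

# A hypothetical key-escrow sketch (invented purely for illustration).
from cryptography.fernet import Fernet

escrow = {}                                  # the "trusted party's" key store

def send_with_escrow(session_id, message):
    session_key = Fernet.generate_key()
    escrow[session_id] = session_key         # a copy lodged with the third party
    return Fernet(session_key).encrypt(message)

send_with_escrow("alice-bob-001", b"payroll details")
send_with_escrow("carol-dave-042", b"medical records")

# One warrant is supposed to unlock one session. But anyone who compromises
# the escrow store - insider, attacker or careless backup - unlocks them all.
print(f"{len(escrow)} sessions readable by whoever holds the escrow table")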

Backdoors are just a non-starter. Governments need to find sensible ways to perform legally approved surveillance, but encryption backdoors are not one of them.


1 – and I’m not even intentionally addressing any criminals who might be reading this article.

2 – quite eloquently, in my humble opinion.

3 – the two tend to go together as there isn’t much point in one without the other[4].

4 – in most cases. You’d be surprised, though.

5 – at least as far as Alice and Bob are concerned.

6 – guess where the name of this blog originated?

7 – hint: not me, not you, and certainly not criminals.

I’m turning off your security.

“Don’t worry, I know what I’m doing.”

Today’s security story is people turning security off.  For me, the fact that it’s even a story is the story.  This particular story is covered in The Register, who explain (to nobody’s surprise) that some of the patches to fix issues identified in CPUs (think Spectre, Meltdown, etc.) can actually slow down the applications running on them.  The problem is that, in some cases, they don’t slow them down a little bit, but rather a lot.  By which I mean up to 50%.  And if you’ve bought expensive hardware – or rented it[1] – then you’d generally prefer it to run your applications/programs/workloads quickly, rather than at half the speed they might otherwise manage.

And so you turn off the security patches.  Your decision: fine.

No, stop: this isn’t what has happened.

The mythical “you”, the person running the workload, isn’t the person who makes the decision, in most cases, because it’s been made for you.  This is the real story.

Linus Torvalds, and a bunch of other experts in the Linux kernel[2], have decided that although the patch that could make your workloads secure is available, the functionality that does it should be “off” by default.  They reason – quite correctly, in my opinion – that the vast majority of people running workloads won’t easily be able to turn this functionality on themselves.

They also reason – again, correctly, in my opinion – that most people will care more about how quickly their workloads run than about how secure they are.  I’m not happy about this, but that’s the way it is.

What I worry about is the final step in the logic to making the decision.  I’m going to quote Linus:

“Have you seen any actual realistic attacks for normal human users?” he asked. “Things where the kernel should actually care? The JavaScript thing is for the browser to fix up, not for the kernel to say ‘now everything should run up to 50 per cent slower.'”

I get the reasoning behind this, but I don’t like it.  To give some context, somebody came up with an example attack which could compromise certain workloads, and Linus points out that there are better ways to fix this attack than fixing it in the kernel. My concerns are two-fold:

  1. although there may be better places to fix that particular attack, a kernel-level fix is likely to fix an entire class of attacks, meaning better protection for users who are using any application which might include an attack vector.
  2. pointing out that there haven’t been any attacks yet not only ignores the fact that there is a future out there[3] but also points malicious actors in the direction of a likely attack vector.

Now, I know that the more dedicated malicious actors are already looking for these things, but do we really need to advertise?

What’s my fix?

I don’t have one, or at least not an easy one.

Somebody, somewhere, needs to decide whether security is turned on or off.  What I’d honestly like to see is an easier set of controls to allow people to turn on or off security, and to understand the trade-offs when they do that.  The problems with that are:

  • the trade-offs are often much more complex than just “fast and insecure” or “slow and secure”, and are really difficult to explain.
  • in order to make a sensible decision about trade-offs, people need to understand risk.  And people are awful at understanding risk.

And there’s a “chicken and egg problem”[7] here: people won’t understand risk until they are offered the chance to make decisions, but there’s little incentive to offer them complex decisions unless they understand risk.

My plea?  Where possible, expose risk, and explain what it is.  And if you’re turning off security-related functionality, make it easy to turn back on for those who need it.
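
As a small gesture in that direction, here’s a minimal sketch (assuming a reasonably recent Linux kernel, which exposes mitigation status under sysfs) of the sort of “expose the risk” tooling I mean: it won’t make the decision for you, but it will at least tell you what you’re currently running with.

# A minimal sketch: report the CPU vulnerability mitigation status that recent
# Linux kernels expose under sysfs. Read-only - it changes nothing.
from pathlib import Path

VULNS = Path("/sys/devices/system/cpu/vulnerabilities")

if VULNS.is_dir():
    for entry in sorted(VULNS.iterdir()):
        status = entry.read_text().strip()
        flag = "REVIEW" if status.startswith("Vulnerable") else "ok"
        print(f"[{flag:6}] {entry.name}: {status}")
else:
    print("This kernel does not expose mitigation status under sysfs.")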


1 – a quick heads-up: this is what “deploying to the cloud” actually is.

2 – what sits at the bottom of many of the workloads that are running in servers.

3 – hopefully.  If the Three Minute Warning[4] sounds while you’re reading this, you may wish to duck and cover.  You can come back to it later[6].

4 – “… sounds like this …”[5].

5 – 80s reference.

6 – or not.  See [3].

7 – for non-native English readers, this means “a problem where the solution requires two pieces, both of which are dependent on each other”.

Why security policies are worthless

A policy, to have any utility at all, needs to exist in a larger context.

“We need a policy on that.” This is a phrase that seems to act as a universal panacea to too many managers. When a problem is identified, and the blame game has been exhausted, the way to sort things out, they believe, is to create a policy. If you’ve got a policy, then everything will be fine, because everything will be clear, and everyone will obey the policy, and nothing can go wrong.

Right[1].

The problem is that policy, on its own, is worthless.

A policy, to have any utility at all, needs to exist in a larger context, or, to think of it in a different way, to sit in a chain of artefacts.  It is its place in this chain that actually gives it meaning.  Let me explain.

When that manager said that they wanted a policy, what did that actually mean?  That rather depends on how wise the manager is[2].  Hopefully, the reason that the manager identified the need for a policy was because:

  1. they noticed that something had gone wrong that shouldn’t have done; and
  2. they wanted to have a way to make sure it didn’t happen again.

The problem with policy on its own is that it doesn’t actually help with either of those points.  What use does it have, then?

Governance

Let’s look at those pieces separately.  When we say that “something had gone wrong that shouldn’t have done”, what we’re saying is that there is some sort of model for what the world should look like, giving us general, and often quite high-level, advice on our preferred state.  Sometimes this is a legal framework, sometimes it’s a regulatory framework, and sometimes it’s a looser governance model.  The sort of thing I’d expect to see at this level would be statements like:

  • patient data must be secured against theft;
  • details of unannounced mergers must not be made available to employees who are not directors of the company;
  • only authorised persons may access the military base[4].

These are high level requirements, and are statements of intent.  They don’t tell you what to do in order to make these things happen (or not happen), they just tell you that you have to do them (or not do them).  I’m going to call collections of these types of requirements “governance models”.

Processes

At the other end of the spectrum, you’ve got the actual processes (in the broader sense of the term) required to make the general intent happen.  For the examples above, these might include:

  • AES-256 encryption using OpenSSL version 1.1.1 FIPS[5], with key patient-sym-current for all data on database patients-20162019;
  • documents Indigo-1, Indigo-3, Indigo-4 to be collected after meeting and locked in cabinet Socrates by CEO or CFO;
  • guards on duty at post Alpha must report any expired passes to the base commander and refuse entry to all those producing them.

These are concrete processes that must be followed, which will hopefully allow the statements of intent that we noted above to be carried out.

Policies and audit

The problem with both the governance statements and the processes identified is that they’re both insufficient.  The governance statements don’t tell you how to do what needs to be done, and the processes are context-less, which means that you can’t tell what they relate to, or how they’re helping.  It’s also difficult to know what to do if they fail: what happens, for example, if the base commander turns up with an expired pass?

Policies are what sit in the middle.  They provide guidance as to how to implement the governance model, and provide context for the processes.  They should give you enough detail to cover all eventualities, even if that’s to say “consult the Legal Department when unsure[6]”.  What’s more, they should be auditable.  If you can’t audit your security measures, then how can you be sure that your governance model is being followed?  But governance models, as we’ve already discovered, are at the level of intent – not the sort of thing that can be audited.

Policies, then, should provide enough detail that they can be auditable, but they should also preferably be separated enough from implementation that rules can be written that are applicable in different circumstances.  Here are a few policies that we might create to follow the first of the governance model examples above:

  • patient data must be secured against theft;
    • Policy 1: all data at rest must be encrypted by a symmetric key of at least 256 bits or equivalent;
    • Policy 2: all storage keys must be rotated at least once a month;
    • Policy 3: in the event of a suspected key compromise, all keys at the same level in that department must be rotated.

You can argue that these policies are too detailed, or you can argue that they’re not detailed enough: both arguments are fair, and the level of detail (or granularity) should depend on the context and the use to which they are being put.  However, though I’m sure that all of the example policies I’ve given could be improved, I hope that they are all:

  • auditable;
  • able to be implemented by one or more well-defined processes;
  • understandable both by those concerned with the governance level and by those involved at the process implementation and operations level.
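
To make “auditable” concrete, here’s a minimal sketch of an automated check against Policies 1 and 2 above.  The key inventory is entirely invented for illustration; in practice you would pull it from whatever key management system you use.

# A minimal sketch of an automated audit check against Policies 1 and 2 above.
# The key inventory below is invented purely for illustration.
from datetime import date, timedelta

MIN_KEY_BITS = 256                     # Policy 1: "at least 256 bits or equivalent"
ROTATION_PERIOD = timedelta(days=31)   # Policy 2: "at least once a month"

key_inventory = [
    {"name": "patient-sym-current", "bits": 256, "last_rotated": date(2019, 5, 2)},
    {"name": "patient-sym-archive", "bits": 128, "last_rotated": date(2019, 1, 15)},
]

audit_date = date(2019, 5, 20)         # fixed so the example is reproducible
for key in key_inventory:
    findings = []
    if key["bits"] < MIN_KEY_BITS:
        findings.append("Policy 1 violation: key too short")
    if audit_date - key["last_rotated"] > ROTATION_PERIOD:
        findings.append("Policy 2 violation: not rotated this month")
    print(key["name"], "-", "; ".join(findings) or "compliant")

Output like this is exactly the sort of evidence an auditor can work with, and it maps directly back to the governance statement about securing patient data.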

The value of auditing

I’ve written about auditing before (Confessions of an auditor).  I think it’s really important.  Done well, it should allow you to discover whether:

  1. your processes are covering all of the eventualities of your policies;
  2. your policies are actually being implemented correctly.

Auditing may also address whether your policies fully meet your governance model.  Auditing well is a skill, but in order to help your auditor – whether they are good at it or bad at it – having a clearly defined set of policies is a must.  But, as I pointed out at the beginning of this article, policies for policies’ sake are worthless: put them together with governance and processes, however, and they provide technical and business value.


1 – this is sarcasm.

2 – yes, I know.  But let’s not be rude if we can avoid it.  We want to help managers, because the more clue we deliver to them, the easier our lives will be[3].

3 – and you never know: one day, even you might be a manager.

4 – which is probably not a US military base, given the spelling of “authorised”.

5 – example only…

6 – this, in my experience, is the correct answer to many questions.

The 3 things you need to know about disk encryption

Use software encryption, preferably an open-source and audited solution.

It turns out that somebody – well, lots of people, in fact – failed to implement a cryptographic standard very well.  This isn’t a surprise, I’m afraid, but it’s bad news.  I’ve written before about how important it is to be using disk encryption, but it turns out that the advice I gave wasn’t sufficient, or detailed enough.

Here’s a bit of background.  There are two ways to do disk encryption:

  1. let the disk hardware (and firmware) manage it: HDD (hard disk drive), SSD (solid state drive) and hybrid (a mix of HDD and SSD technologies) manufacturers create drives which have encryption built in.
  2. allow your Operating System (e.g. Linux[1], OSX[2], Windows[3]) to do the job: the O/S will have a little bit of itself on the disk unencrypted, which will allow it to decrypt the rest of the disk (which is encrypted) when provided with a password or key.

You’d think, wouldn’t you, that option 1 would be the safest?  It should be quick, as it’s done in hardware, and, well, the companies who manufacture these disks will know what they’re doing, right?

No.

A paper (link opens a PDF file) written by some researchers in the Netherlands reveals some work that they did on several SSDs to try to work out how good a job had been done on the encryption security.  They are all supposed to have implemented a fairly complex standard from the TCG[4] called Opal, but it seems that none of them did it right.  It turns out that someone with physical access to your hardware can, fairly trivially, decrypt what’s on your drive.  And they can do this without the password that you use to lock it or any associated key(s).  The simple lesson from this is that you shouldn’t trust hardware disk encryption.

So, software disk encryption is OK, then?

Also no.

Well, actually yes, as long as you’re not using Microsoft’s BitLocker in its default mode.  It turns out that BitLocker will just use hardware encryption if the drive it’s using supports it.  In other words, BitLocker’s default behaviour leaves you with hardware encryption, and all the problems noted above, unless you tell it not to do so.

What about other options?  Well, you can tell BitLocker not to use hardware encryption, but only for a new installation: it won’t change on an existing disk.  The best option[5] is to use a software encryption solution which is open source and audited by the wider community.  LUKS is the default for most Linux distributions.  One suggested by the paper’s authors for Windows is VeraCrypt.  Can we be certain that there are no holes or mistakes in the implementation of these solutions?  No, we can’t, but the chances of security issues being found and fixed are much, much higher than for proprietary software[6].
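
If you want to check what you’re actually relying on, here’s a minimal sketch (assuming a Linux host with cryptsetup installed and enough privilege to read the block device; the device path is an example only):

# A minimal sketch: check whether a block device carries a LUKS header, i.e.
# whether software disk encryption is actually in place on it.
import subprocess
import sys

def is_luks(device):
    # "cryptsetup isLuks" exits with 0 if the device contains a LUKS header.
    result = subprocess.run(["cryptsetup", "isLuks", device], capture_output=True)
    return result.returncode == 0

device = sys.argv[1] if len(sys.argv) > 1 else "/dev/sda2"   # example path only
if is_luks(device):
    print(f"{device}: LUKS (software) encryption detected")
else:
    print(f"{device}: no LUKS header found - check what you are relying on")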

What, then, are my recommendations?

  1. Don’t use hardware disk encryption.  It’s been shown to be flawed in many implementations.
  2. Don’t use proprietary software.  For anything, honestly, particularly anything security-related, but specifically not for disk encryption.
  3. If you have to use Windows, and are using BitLocker, run with VeraCrypt on top.


1 – GNU Linux.

2 – I’m not even sure if this is the OS that Macs run anymore, to be honest.

3 – not my thing either, but I’m pretty sure this is what it’s called.  Couldn’t be certain of the version, though.

4 – Trusted Computing Group.

5 – as noted by the paper’s authors, and heartily endorsed by me.

6 – I’m not aware of any problems with Macintosh-based implementations, but open source is just better – read the article linked from earlier in the sentence.

On being acquired – a personal view

It’s difficult to think of a better fit than IBM.

First off, today is one of those days when I need to point you at the standard disclaimer that the views expressed in this post are my own, and not necessarily those of my employers.  That said, I think that many of them probably align, but better safe than sorry[1].  Another note: I believe that all of the information in this article is public knowledge.

The news came out two days ago (last Sunday, 2018-10-28) that Red Hat, my employer, is being acquired by IBM for $34bn.  I didn’t know about the deal in advance (I’m not that exalted within the company hierarchy, which is probably a good thing, as all those involved needed to keep very tight-lipped about it, and that would have been hard), so the first intimation I got was when people started sharing stories from various news sites on internal chat discussions.  They (IBM) are quite clear about the fact that they are acquiring us for the people, which means that each of us (including me!) is worth around $2.6m, based on our current headcount.  Sadly, I don’t think it works quite like this, and certainly nobody has (yet) offered to pay me that amount[2].  IBM have also said that they intend to keep Red Hat operating as a separate entity within IBM.

How do I feel?  My initial emotion was shock.  It’s always a surprise when you get news that you weren’t expecting, and the message that we’d carried for a long time was that Red Hat would attempt to keep ploughing its own furrow[3] for as long as possible.  But I’d always known that, as a public company, we were available to be bought, if the money was good enough.  It appears on this occasion that it was.  And that emotion turned to interest in what was going to happen next.

And do you know what?  It’s difficult to think of a better fit than IBM.  I’m not going to enumerate the reasons that I feel that other possible acquirers would have been worse, but here are some of the reasons that IBM, at least in this arrangement, is good:

  • they “get” open source, and have a long history of encouraging its use;
  • they seem to understand that Red Hat has a very distinctive culture, and want to encourage that, post-acquisition;
  • they have a hybrid cloud strategy and products, Red Hat has a hybrid cloud strategy and products: they’re fairly well-aligned;
  • we’re complementary in a number of sectors and markets;
  • they’re a much bigger player than us, and suddenly, we’ll have access to more senior people in new and exciting companies.

What about the impact on me, though?  Well, IBM takes security seriously.  IBM has some fantastic research and academic connections.  The group in which I work has some really bright and interesting people in it, and it’s difficult to imagine IBM wanting to break it up.  A number of the things I’m working on will continue to align with both Red Hat’s direction and IBM’s.  The acquisition will take up to a year to complete – assuming no awkward regulatory hurdles along the way – and not much is going to change in the day-to-day.  Except that I hope to get even better access to my soon-to-be-colleagues working in similar fields to me, but within IBM.

Will there be issues along the way?  Yes.  Will there be uncertainty?  Yes.  But do I trust that the leadership within Red Hat and IBM have an honest commitment to making things work in a way that will benefit Red Hatters?  Yes.

And am I looking to jump ship?  Oh, no.  Far too much interesting stuff to be doing.  We’ve got an interesting few months and years ahead of us.  My future looked red, until Sunday night.  Then maybe blue.  But now I’m betting on something somewhere between the two: go Team Purple.


1 – because, well, lawyers, the SEC, etc., etc.

2 – if it does, then, well, could somebody please contact me?

3 – doing its own thing independently.

“All systems nominal” – borrowing a useful phrase

Wouldn’t it be lovely if everything were functioning exactly as it should?

Wouldn’t it be lovely if everything were functioning exactly as it should? All the time. Alas, that is not to be our lot: we in IT security know that there’s always something that needs attention, something that’s not quite right, something that’s limping along and needs to be fixed.

The question I want to address in this article is whether that’s actually OK. I’ve written before about managed degradation (the idea that planning for when things go wrong is a useful and important practice[1]) in Service degradation: actually a good thing. The subject of this article, however, is living in a world where everything is running almost normally – or, more specifically, “nominally”. Most definitions of the word “nominal” focus on its meaning of “theoretical” or “token” value. A quick search of online dictionaries provided two definitions which were more relevant to the usage I’m going to be looking at today:

  • informal (chiefly in the context of space travel) functioning normally or acceptably[2].
  • being according to plan: satisfactory[3].

I’d like to offer a slightly different one:

  • within acceptable parameters for normal system operation.

I’ve seen “tolerances” used instead of “parameters”, and that works, too, but it’s not a word that I think we use much within IT security, so I lean towards “parameters”[4].

Utility

Why do I think that this is a useful concept? Because, as I noted above, we all know that it’s a rare day when everything works perfectly. But we find ways to muddle through, or we find enough bandwidth to make the backups happen without significant impact on database performance, or we only lose 1% of the credit card details collected that day[5]. All of these (except the last one) are fine, and if we are wise, we might start actually defining what counts as acceptable operation – nominal operation – for all of our users. This is what we should be striving for, and exactly how far off perfect operation we are will give us clues as to how much effort we need to expend to sort things out, and how quickly we need to perform mitigations.

In fact, many organisations which provide services do this already: that’s where SLAs (Service Level Agreements) come from. I remember, at school, doing some maths[6] around how food companies ensure that they are in little danger of getting into trouble for under-filling containers: they look at standard deviations to understand what the likely amount in each will be. This is similar, and the likelihood of hardware failures, based on similar calculations, is often factored into uptime planning.

So far, much of the above seems to be about resilience: where does security come in? Well, your security components, features and functionality are also subject to the same resiliency issues as any other part of your system. The problem is that if a security piece fails, it may well be a single point of failure, which means that although the rest of the system is operating at 99% performance, your security just hit zero.

These are the reasons that we perform failure analysis, and why we consider defence in depth. But when we’re looking at defence in depth, do we remember to perform second order analysis? For instance, if my back-up LDAP server for user authentication is running on older hardware, what are the chances that it will fail when put under load?
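
Here’s a back-of-the-envelope sketch of that second-order point, with invented numbers: the backup looks fine in isolation, but its failure probability is much higher in exactly the situation where you need it.

# Invented numbers, purely to illustrate second-order failure analysis.
p_primary_fails = 0.01         # chance the primary auth server fails in a month
p_backup_fails_idle = 0.02     # backup failure probability when lightly loaded
p_backup_fails_loaded = 0.20   # backup failure probability under full load

naive = p_primary_fails * p_backup_fails_idle
second_order = p_primary_fails * p_backup_fails_loaded

print(f"Naive estimate of losing authentication entirely:  {naive:.4f}")
print(f"Estimate allowing for load on the ageing backup:   {second_order:.4f}")
# Ten times worse - and authentication is a single point of failure for the
# security of the whole system, even if everything else is running nominally.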

Broader usage

It should come as no surprise to regular readers that I want to expand the scope of the discussion beyond just hardware and software components in systems to the people who are involved, and beyond that to processes in general.

Train companies are all too aware of the impact on their services if a bad flu epidemic hits their drivers – or if the weather is good, so their staff prefer to enjoy the sunshine with their families, rather than take voluntary overtime[7]. You may have considered the impact of a staff member or two being sick, but have you gone as far as modelling what would happen if four members of your team were sick, and, just as important, how likely that is? Equally vital to consider may be issues of team dynamics, or even terrorist attacks or union disputes. What about external factors, like staff not being able to get into work because of train cancellations? What are the chances of broadband failures occurring at the same time, scuppering your fall-back plan of allowing key staff to work from home?
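
On the “how likely is that?” question, even a crude model is better than a shrug. A quick sketch, with invented numbers and a deliberately naive independence assumption:

# How likely is it that four or more of a ten-person team are off sick on the
# same day, if each has (say) a 5% chance of being off and we naively assume
# independence? Numbers invented for illustration.
from math import comb

n, p = 10, 0.05
p_four_or_more = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(4, n + 1))
print(f"P(4 or more of {n} off sick on the same day) = {p_four_or_more:.5f}")
# Roughly 0.1% - rare, but not negligible over a few years of working days.
# And sickness is not independent: a flu epidemic is precisely the case where
# the naive assumption breaks down, which is rather the point.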

We can go deeper and deeper into this, and at some point it becomes pointless to go any further. But I believe that it’s useful to consider how far to go with it, and also to spend some time considering exactly what you consider “nominal” operation to be, particularly for your security systems.


1 – I nearly wrote “art”.

2 – Oxford Dictionaries: https://en.oxforddictionaries.com/definition/nominal.

3 – Merriam-Webster: https://www.merriam-webster.com/dictionary/nominal.

4 – this article was inspired by the Public Service Broadcasting song Go. Listen to it: it rocks. And they’re great live, too.

5 – Note: this is a joke, and not a very funny one. You’ve probably just committed a GDPR breach, and you need to tell someone about it. Now.

6 – in the UK, and most countries speaking versions of Commonwealth English, we do more than one math. Because “mathematics”.

7 – this happened: I saw it on TV[8].

8 – so it must be true.