Knowing me, knowing you: on Russian spies and identity

Who are you, and who tells me so?

Who are you, and who tells me so?  These are questions which are really important for almost any IT-related system in use today.  I’ve previously discussed the difference between identification and authentication (and three other oft-confused terms) in Explained: five misused security words, and what I want to look at in this post is the shady hinterland between identification and authentication.

There has been a lot in the news recently about the poisoning in the UK of two Russian nationals and two British nationals, leading to the tragic death of Dawn Sturgess.  I’m not going to talk about that, or even about the alleged perpetrators, but about the problem of identity – their identity – and how that relates to IT.  The two men who travelled to Salisbury, and were named by British police as the perpetrators, travelled under Russian passports.  These documents provided their identities, as far as the British authorities – including UK Border Control – were aware, and led to their being allowed into the country.

When we set up a new user in an IT system or allow them physical access to a building, for instance, we often ask for “Government-issued ID” as the basis for authenticating the identity that they have presented, in preparation for deciding whether to authorise them to perform whatever action they have requested.  There are two issues here – one obvious, and one less so.  The first, obvious one, is that it’s not always easy to tell whether a document has actually been issued by the authority by which it appears to have been issued – document forgers have been making a prosperous living for hundreds, if not thousands of years.  The problem, of course, is that the more tell-tale signs of authenticity you reveal to those whose job it is to check a document, the more clues you give to would-be forgers for how to improve the quality of the false versions that they create.

The second, and less obvious problem, is that just because a document has been issued by a government authority doesn’t mean that it is real.  Well, actually, it does, and there’s the issue.  Its issuance by the approved authority makes it real – that is to say “authentic” – but it doesn’t mean that it is correct.  Correctness is a different property to authenticity. Any authority may decide to issue identification documents that may be authentic, but do not truly represent the identity of the person carrying them. When we realise that a claim of identity is backed up by an authority which is issuing documents that we do not believe to be correct, that means that we should change our trust relationship with that authority.  For most entities, IDs which have been authentically issued by a government authority are quite sufficient, but it is quite plausible, for instance, that the UK Border Force (and other equivalent entities around the world) may choose to view passports issued by certain government authorities as suspect in terms of their correctness.

What impact does this have on the wider IT security community?  Well, there are times when we are accepting government-issued ID when we might want to check with relevant home nation authorities as to whether we should trust them[1].  More broadly than that, we must remember that every time that we authenticate a user, we are making a decision to trust the authority that represented that user’s identity to us.  The level of trust we place in that authority may safely be reduced as we grow to know that user, but it may not – either because our interactions are infrequent, or maybe because we need to consider that they are playing “the long game”, and are acting as so-called “sleepers”.

What does this continuous trust mean?  What it means is that if we are relying on an external supplier to provide contractors for us, we need to keep remembering that this is a trust relationship, and one which can change.  If one of those contractors turns out to have faked educational qualifications, then we need to doubt the authenticity of the educational qualifications of all of the other contractors – and possibly other aspects of the identity which the external supplier has presented to us.  This is because we have placed transitive trust in the external supplier, which we must now re-evaluate.  What other examples might there be?  The problem is that the particular aspects of identity that we care about are hugely varied, and differ between the different roles that we perform.  Sometimes we might not care about educational qualifications, but about credit score, or criminal background, or blood type.  In the film Gattaca[2], identity is tied to physical and mental ability to perform a particular task.

There are various techniques available to allow us to tie a particular individual to a set of pieces of information: DNA, iris scans and fingerprints are pretty good at telling us that the person in front of us now is the person who was in front of us a year ago.  But tying that person to other information relies on trust relationships with external entities, and apart from a typically small set of inferences that we can draw from our direct experiences of this person, it’s difficult to know exactly what is truly correct unless we can trust those external entities.


1 – That assumes, of course, that we trust our home nation authorities…

2 – I’m not going to put a spoiler in here, but it’s a great film, and really makes you think about identity: go and watch it!

3 laptop power mode options

Don’t suspend your laptop.

I wrote a post a couple of weeks ago called 7 security tips for travelling with your laptop.  The seventh tip was “Don’t suspend”: in other words, when you’re finished doing what you’re doing, either turn your laptop off, or put it into “hibernate” mode.  I thought it might be worth revisiting this piece of advice, partly to explain the difference between these different states, and partly to explain exactly why it’s a bad idea to use the suspend mode.  A very bad idea indeed.  In fact, I’d almost go as far as saying “don’t suspend your laptop”.

So, what are the three power modes usually available to us on a laptop?  Let’s look at them one at a time.  I’m going to assume that you have disk encryption enabled (the second of the seven tips in my earlier article), because you really, really should.

Power down

This is what you think it is: your laptop has powered down, and in order to start it up again, you’ve got to go through an entire boot process.  Any applications that you had running before will need to be restarted[1], and won’t come back in the same state that they were in before[2].  If somebody has access to your laptop when you’re not there, then there’s no immediate way that they can get at your data, as it’s encrypted[3].  See the conclusion for a couple of provisos, but powering down your laptop when you’re not using it is pretty safe, and rebooting a modern laptop with a decent operating system on it is usually pretty quick these days.

It’s worth noting that for some operating systems – Microsoft Windows, at least – when you tell your laptop to power down, it doesn’t.  It actually performs a hibernate without telling you, in order to speed up the boot process.  There are (I believe – as a proud open source user, I don’t run Windows, so I couldn’t say for sure) ways around this, but most of the time you probably don’t care: see below on why hibernate mode is pretty good for many requirements and use cases.
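
For what it’s worth – and remember that this is hearsay from a non-Windows user, so please check it before relying on it – I understand the behaviour is tied to Windows’ “fast startup” feature, and that the usual ways around it are a command-line shutdown or turning the feature off:

    REM Reportedly performs a full shutdown, bypassing fast startup:
    shutdown /s /t 0

    REM Or disable hibernation entirely, which also disables fast startup
    REM (note that this removes the hibernate option discussed below, too):
    powercfg /h off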

Hibernate

Confusingly, hibernate is sometimes referred to as “suspend to disk”.  What actually happens when you hibernate your machine is that the contents of RAM (your working memory) are copied and saved to your hard disk.  The machine is then powered down, leaving the state of the machine ready to be reloaded when you reboot.  When you do this, the laptop notices that it was hibernated, looks for saved state, and loads it into RAM[4].  Your session should come back pretty much as it was before – though if you’ve moved to a different wifi network or a session on a website has expired, for instance, your machine may have to do some clever bits and pieces in the background to make things as nice as possible as you resume working.

The key thing about hibernating your laptop is that while you’ve saved state to the hard drive, it’s encrypted[3], so anyone who manages to get at your laptop while you’re not there will have a hard time getting any data from it.  You’ll need to unlock your hard drive before your session can be resumed, and given that your attacker won’t have your password, you’re good to go.

Suspend

The key difference between suspend and the other two power modes we’ve examined above is that when you choose to suspend your laptop, it’s still powered on.  The various components are put into low-power mode, and it should wake up pretty quickly when you need it, but, crucially, all of the applications that you were running beforehand are still running, and are still in RAM.  I mentioned in my previous post that this increases the attack surface significantly, but there are some protections in place to improve the security of your laptop when it’s in suspend mode.  Unluckily, they’re not always successful, as was demonstrated a few days ago by an attack described by the Register.  Even if your laptop is not at risk from this particular attack, my advice is just not to use suspend.

There are two uses of suspend that are difficult to manage.  The first is when you have your machine set to suspend after a long period of inactivity.  Typically, you’ll set the screen to lock after a certain period of time, and then the system will suspend.  Normally, this is only set for when you’re on battery – in other words, when you’re not sat at your desk with the power plugged in.  My advice would be to change this setting so that your laptop goes to hibernate instead.  It takes a bit more time to boot up, but if you’re leaving your laptop unused for a while, and it’s not plugged in, then it’s most likely that you’re travelling, and you need to be careful.
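
On a Linux distribution running systemd, for instance, this is a couple of lines in logind’s configuration – a sketch only: other operating systems have their own equivalents, your desktop environment’s power settings may override these values, and the lid-switch lines anticipate the second case, below:

    # /etc/systemd/logind.conf (excerpt) - a sketch, not a recipe

    # What to do when the lid is closed, on battery and on mains power:
    HandleLidSwitch=hibernate
    HandleLidSwitchExternalPower=hibernate

    # What to do after a period of inactivity:
    IdleAction=hibernate
    IdleActionSec=30min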

The second is when you get up and close the lid to move elsewhere.  If you’re moving around within your office or home, then that’s probably OK, but for anything else, try training yourself to hibernate or power down your laptop instead.

Conclusion

There are two important provisos here.

The first I’ve already mentioned: if you don’t have disk encryption turned on, then someone with access to your laptop, even for a fairly short while, is likely to have quite an easy time getting at your data.  It’s worth pointing out that you want full disk encryption turned on, and not just “home directory” encryption.  That’s because if someone has access to your laptop for a while, they may well be able to make changes to the boot-up mechanism in such a way that they can wait until you log in and either collect your password for later use or have the data sent to them over the network.  This is much less easy with full disk encryption.
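
If you’re not sure which you have, it’s worth checking before you travel.  On a Linux system, for instance, something like the following will show whether the device holding your root filesystem sits on top of an encryption layer – the device name in the second command is only an example, and will vary from system to system:

    # Look for a "crypt" entry underneath the device that holds /
    lsblk -o NAME,TYPE,FSTYPE,MOUNTPOINT

    # For a LUKS volume, you can ask for details directly:
    cryptsetup status luks-root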

The second is that there are definitely hardware and firmware attacks on your machine that may be successful even with full disk encryption.  Some of these are easy to spot – don’t turn on your machine if there’s anything in the USB port that you don’t recognise[5] – but others, where hardware may be attached or even soldered to the motherboard, or firmware changed, are very difficult to spot.  We’re getting into some fairly sophisticated attacks here, and if you’re worried about them, then consider my first security tip: “Don’t take a laptop”.


1 – some of them automatically, either as system processes (you rarely have to remember to turn networking back on, for instance), or as “start-up” applications which most operating systems will allow you to specify as auto-starting when you log in.

2 – this isn’t actually quite true for all applications: it might have been more accurate to say “unless they’re set up this way”.  Some applications (web browsers are typical examples) will notice if they weren’t shut down “nicely”, and will attempt to get back into the state they were in beforehand.

3 – you did enable disk encryption, right?

4 – assuming it’s there, and hasn’t been corrupted in some way, in which case the laptop will just run a normal boot sequence.

5 – and don’t just use random USB sticks from strangers or that you pick up in the carpark, but you knew that, right?

7 security tips for travelling with your laptop

Our laptop is a key tool that we tend to keep with us.

I do quite a lot of travel, and although I had a quiet month or two over the summer, I’ve got several trips booked over the next few months.  For many of us, our laptop is a key tool that we tend to keep with us, and most of us will have sensitive material of some type on our laptops, whether it’s internal emails, customer, partner or competitive information, patent information, details of internal processes, strategic documents or keys and tools for accessing our internal network.  I decided to provide a few tips around security and your laptop[1]. Of course, a laptop presents lots of opportunities for attackers – of various types.  Before we go any further, let’s think about some of the types of attacker you might be worrying about.  The extent to which you need to be paranoid will depend somewhat on what attackers you’re most concerned about.

Attackers

Here are some types of attackers that spring to my mind: bear in mind that there may be overlap, and that different individuals may take on different roles in different situations.

  • opportunistic thieves – people who will just steal your hardware.
  • opportunistic viewers – people who will have a good look at information on your screen.
  • opportunistic probers – people who will try to get information from your laptop if they get access to it.
  • customers, partners, competitors – it can be interesting and useful for any of these types to gain information from your laptop.  The steps they are willing to take to get that information may vary based on a variety of factors.
  • hackers/crackers – whether opportunistic or targeted, you need to be aware of where you – and your laptop – are most vulnerable.
  • state actors – these are people with lots and lots of resources, for whom access to your laptop, even for a short amount of time, gives them lots of chances to do short-term and long-term damage to your data and organisation.

 

7 concrete suggestions

  1. Don’t take a laptop.  Do you really need one with you?  There may be occasions when it’s safer not to travel with a laptop: leave it in the office, at home, in your bag or in your hotel room.  There are risks associated even with your hotel room (see below), but maybe a bluetooth keyboard with your phone, a USB stick or an emailed presentation will be all you need.  Not to suggest that any of those are necessarily safe, but you are at least reducing your attack surface.  Oh, and if you do travel with your laptop, make sure you keep it with you, or at least secured at all times.
  2. Ensure that you have disk encryption enabled.  If you have disk encryption, then if somebody steals your laptop, it’s very difficult for them to get at your data.  If you don’t, it’s really, really easy.  Turn on disk encryption: just do.
  3. Think about your screen. When your screen is on, people can see it.  Many laptop screens have a very broad viewing angle, so people to either side of you can easily see what’s on it.  The availability of high resolution cameras on mobile phones means that people don’t need long to capture what’s on your screen, so this is a key issue to consider.  What are your options?  The most common is to use a privacy screen, which fits over your laptop screen, typically reducing the angle from which it can be viewed.  These don’t stop people being able to view what’s on it, but they do mean that viewers need to be almost directly behind you.  This may sound like a good thing, but in fact, that’s the place you’re least likely to notice a surreptitious viewer, so employ caution.  I worry that these screens can give you a false sense of security, so I don’t use one.  Instead, I make a conscious decision never to have anything sensitive on my screen in situations where non-trusted people might see it.  If I really need to do some work, I’ll find a private place where nobody can look at my screen – and even try to be aware of the possibility of reflections in windows.
  4. Lock your screen.  Even if you’re stepping away for just a few seconds, always, always lock your screen.  Even if it’s just colleagues around.  Colleagues sometimes find it “funny” to mess with your laptop, or send emails from your account.  What’s more, there can be a certain kudos to having messed with “the security guy/gal’s” laptop.  Locking the screen is always a good habit to get into, and rather than thinking “oh, I’ll only be 20 seconds”: think how often you get called over to chat to someone, or decide that you want a cup of tea/coffee, or just forget what you were doing, and just wander off.
  5. Put your laptop into airplane mode.  There are a multitude of attacks which can piggy-back on the wifi and bluetooth capabilities of your laptop (and your phone).  If you don’t need them, then turn them off (there’s a short sketch of one way to do this after the end of this list).  In fact, turn off bluetooth anyway: there’s rarely a reason to leave it on.  There may be times to turn on wifi, but be careful about the networks you connect to: there are lots of attacks which pretend to be well-known wifi APs such as “Starbucks” which will let your laptop connect and then attempt to do Bad Things to it.  One alternative – if you have sufficient data on your mobile phone plan and you trust the provider you’re using – is to set your mobile (cell) phone up as a mobile access point and to connect to that instead.
  6. Don’t forget to take upgrades.  Just because you’re on the road, don’t forget to take software upgrades.  Obviously, you can’t do that with wifi off – unless you have Ethernet access – but when you are out on the road, you’re often more vulnerable than when you’re sitting behind the corporate firewall, so keeping your software patched and updated is a sensible precaution.
  7. Don’t suspend.  Yes, the suspend mode[2] makes it easy to get back to what you were doing, and doesn’t take much battery, but leaving your laptop in suspend increases the attack surface available to somebody who steals your laptop, or just has access to it for a short while (the classic “evil maid” attack of someone who has access to your hotel room, for instance).  If you turn off your laptop, and you’ve turned on disk encryption (see above), then you’re in much better shape.
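
On the subject of point 5, if you’re running Linux then rfkill gives you a quick way to switch the relevant radios off (and to check that they really are off) from the command line – a sketch, and your desktop’s network settings will do the same job graphically:

    # Turn the radios off...
    rfkill block bluetooth
    rfkill block wlan        # i.e. wifi

    # ...check what is and isn't blocked...
    rfkill list

    # ...and turn wifi back on only when you actually need it:
    rfkill unblock wlan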

Are there more things you can do?  Yes, of course.  But all of the above are simple ways to reduce the chance that you or your laptop are put at risk by the sorts of attacker I described above.


1 – After a recent blog post, a colleague emailed me with a criticism.  It was well-intentioned, and I took it as such.  The comment he made was that although he enjoys my articles, he would prefer it if there were more suggestions on how to act, or things to do.  I had a think about it, and decided that this was entirely apt, so this week I’m going to provide some thoughts and some suggestions.  I can’t promise to be consistent in meeting this aim, but this is at least a start.

2 – edited: I did have “hibernate” mode in here as well, but a colleague pointed out that hibernate saves state to the (encrypted) disk, so should be safer than suspend.  I never use either, as booting from cold is usually so quick these days.

Is homogeneity bad for security?

Can it really be good for security to have such a small number of systems out there?

For the last three years, I’ve attended the Linux Security Summit (though it’s not solely about Linux, actually), and that’s where I am for the first two days of this week – the next three days are taken up with the Open Source Summit.  This year, both are being run in North America and in Europe – and there was a version of the Open Source Summit in Asia, too.  This is all good, of course: the more people, and the more diversity we have in the community, the stronger we’ll be.

The question of diversity came up at the Linux Security Summit today, but not in the way you might necessarily expect.  As with most of the industry, women, ethnic minorities and people with disabilities are very under-represented at this very technical conference (there’s a very strong Linux kernel developer bias).  It’s a pity, and something we need to address, but when a question came up after someone’s talk, it wasn’t diversity of people’s backgrounds that was being questioned, but of the systems we deploy around the world.

The question was asked of a panel who were talking about open firmware and how making it open source will (hopefully) increase the security of the system.  We’d already heard how most systems – laptops, servers, desktops and beyond – come with a range of different pieces of firmware from a variety of different vendors.  And when we talk about a variety, this can easily hit over 100 different pieces of firmware per system.  How are you supposed to trust a system with so many different pieces?  And, as one of the panel members pointed out, many of the vendors are quite open about the fact that they don’t see themselves as security experts, and are actually asking the members of open source projects to design APIs, make recommendations about design, etc.

This self-knowledge is clearly a good thing, and the main focus of the panel’s efforts has been to try to define a small core of well-understood and better designed elements that can be deployed in a more trusted manner.   The question that was asked from the audience was in response to this effort, and seemed to me to be a very fair one.  It was (to paraphrase slightly): “Can it really be good for security to have such a small number of systems out there?”  The argument – and it’s a good one in general – is that if you have a small number of designs which are deployed across the vast majority of installations, then there is a real danger that a small number of vulnerabilities can impact on a large percentage of that install base.

It’s a similar problem in the natural world: a population with a restricted genetic pool is at risk from a successful attacker: a virus or fungus, for instance, which can attack many individuals due to their similar genetic make-up.

In principle, I would love to see more diversity of design within computing, and particularly security, but there are two issues with this:

  1. management: there is a real cost to managing multiple different implementations and products, so organisations prefer to have a smaller number of designs, reducing the number of tools needed to manage them, and the number of people required to be trained.
  2. scarcity of resources: there is a scarcity of resources within IT security.  There just aren’t enough security experts around to design good security into systems, to support them and then to respond to attacks as vulnerabilities are found and exploited.

To the first issue, I don’t see many easy answers, but to the second, there are three responses:

  1. find ways to scale the impact of your resources: if you open source your code, then the number of expert resources available to work on it expands enormously.  I wrote about this a couple of years ago in Disbelieving the many eyes hypothesis.  If your code is proprietary, then the number of experts you can leverage is small: if it is open source, you have access to almost the entire worldwide pool of experts.
  2. be able to respond quickly: if attacks on systems are found, and vulnerabilities identified, then the ability to move quickly to remedy them allows you to mitigate significantly the impact on the installation base.
  3. design in defence in depth: rather than relying on one defence to an attack or type of attack, try to design your deployment in such a way that you have layers of defence. This means that you have some time to fix a problem that arises before catastrophic failure affects your deployment.

I’m hesitant to overplay the biological analogy, but the second and third of these seem quite similar to defences we see in nature.  The equivalent to quick response is to have multiple generations in a short time, giving a species the opportunity to develop immunity to a particular attack, and defence in depth is a typical defence mechanism in nature – think of humans’ ability to recognise bad meat by its smell, taste its “off-ness” and then vomit it up if swallowed.  I’m not quite sure how this particular analogy would map to the world of IT security (though some of the practices you see in the industry can turn your stomach), but while we wait for a bigger – and more diverse – pool of security experts, let’s keep being open source, let’s keep responding quickly, and let’s make sure that we design for defence in depth.

 

Single point of failure

Any failure which completely brings down a system for over 12 hours counts as catastrophic.

Yesterday[1], Gatwick Airport suffered a catastrophic failure. It wasn’t Air Traffic Control, it wasn’t security scanners, it wasn’t even check-in desk software, but the flight information boards. Catastrophic? Well, maybe the functioning of the airport wasn’t catastrophically affected, but the system itself was. For my money, any failure which completely brings down a system for over 12 hours (from 0430 to 1700 BST, reportedly), counts as catastrophic.

The failure has been blamed on damage to a fibre optic cable. It turned out that if this particular component of the system was brought down, then the system failed to operate as expected: it was a single point of failure. Now, in this case, it could be argued that the failure did not have a security impact: this was a resilience problem. Setting aside the fact that resilience and security are often bedfellows[2], many single points of failure absolutely are security issues, as they become obvious points of vulnerability for malicious actors to attack.

A key skill that needs to be grown in IT in general, but security in particular, is systems thinking, as I’ve discussed elsewhere, including in my first post on this blog: Systems security – why it matters. We need more systems engineers, and more systems architects. The role of systems architects, specifically, is to look beyond the single components that comprise a system, and to consider instead the behaviour of the system as a whole. This may mean looking past our first area of focus to consider, for instance, what the impact of failure, damage or compromise of hardware or externally managed systems would be on the system’s overall operation.

Single points of failure are particularly awkward.  They crop up in all sorts of places, and they are a very good example of why diversity is important within IT security, and why you shouldn’t trust a single person – including yourself – to be the only person who looks at the security of a system.  My particular biases are towards crypto and software, for instance, so I’m more likely to miss a hardware or network point of failure than somebody with a different background to me.  Not to say that we shouldn’t try to train ourselves to think outside of whatever little box we come from – that’s part of the challenge and excitement of being a systems architect – but an acknowledgement of our own lack of expertise is in itself a realisation of our expertise: if you realise that you’re not an expert, you’re part way to becoming one.

I wanted to finish with an example of a single point of failure that is relevant to security, and exposes a process vulnerability.  The Register has a good write-up of the Foreshadow attack and its impact on SGX, Intel’s Trusted Execution Environment (TEE) capability.  What’s interesting, if the write-up is correct, is that what seems like a small break to a very specific part of the entire security chain means that you suddenly can’t trust anything.  The trust chain is broken, and you have to distrust everything you think you know.  This is a classic security problem – trust is a very tricky set of concepts – and one of the nasty things about it is that it may be entirely invisible to the user that an attack has taken place at all, particularly as the user, at this point, may have no visibility of the chain of trust that has been established – or not – up to the point that they are involved.  There’s a lot more to write about on this subject, but that’s for another day.  For now, if you’re planning to visit an airport, ensure that you have an app on your phone which will tell you your flight departure time and the correct gate.


1 – at time of writing, obviously.

2 – for non-native readers[3], what I mean is that they are often closely related and should be considered together.

3 – and/or those unacquainted with my somewhat baroque language and phrasing habits[4].

4 – I prefer to double-dot when singing or playing Purcell, for instance[5].

5 – this is a very, very niche comment, for which slight apologies.

16 ways in which users are(n’t) like kittens

I’m going to exploit you all with an article about kittens and security.

It’s summer[1], it’s hot[2], nobody wants to work[3].  What we all want to do is look at pictures of cute kittens[5] and go “ahhh”.  So I’m going to exploit you all with an article about kittens and (vaguely about) security.  It’s light-hearted, it’s fluffy[6], and it has a picture of two of our cats at the top of it.  What’s not to like?

Warning: this article includes extreme footnoting, and may not be suitable for all readers[7].

Now, don’t get me wrong: I like users.  I realise the importance of users, really I do.  They are the reason we have jobs.  Unluckily, they’re often the reason we wish we didn’t have the jobs we do.  I’m surprised that nobody has previously bothered to compile a list comparing them with kittens[7.5], so I’ve done it for you.   For ease of reading, I’ve grouped ways in which users are like kittens towards the top of the table, and ways in which they’re unlike kittens towards the bottom[7.8].

Please enjoy this post, share it inappropriately on social media and feel free to suggest other ways in which kittens and users are similar or dissimilar.

Research findings

Hastily compiled table

Property | Users | Kittens
Capable of circumventing elaborate security measures | Yes | Yes
Take up all of your time | Yes | Yes
Do things they’re not supposed to | Yes | Yes
Forget all training instantly | Yes | Yes
Damage sensitive equipment | Yes | Yes
Can often be found on Facebook | Yes | Yes
Constantly need cleaning up after | Yes | Yes
Often seem quite stupid, but are capable of extreme cunning at inopportune moments | Yes | Yes
Can turn savage for no obvious reason | Yes | Yes
Can be difficult to tell apart[10] | Yes | Yes
Fluffy | No[8] | Yes
Fall asleep a lot | No[8] | Yes
Wake you up at night | No[9] | Yes
Like to have their tummy tickled | No[8] | Yes
Generally fun to be around | No[8] | Yes
Generally lovable | No[8] | Yes

1 – at time of writing, in the Northern Hemisphere, where I’m currently located.  Apologies to all those readers for whom it is not summer.

2 – see 1.

3 – actually, I don’t think this needs a disclaimer[4].

4 – sorry for wasting your time[4].

5 – for younger readers, “kittehs”.

6 – like the kittens.

7 – particularly those who object to footnotes.  You know who you are.

7.5 – actually, they may well have done, but I couldn’t be bothered to look[7.7]

7.7 – yes, I wrote the rest of the article first and then realised that I needed another footnote (now two), but couldn’t be bothered to renumber them all.  I’m lazy.

7.8 – you’re welcome[7.9].

7.9 – you know, this reminds me of programming BASIC in the old days, when it wasn’t easy to renumber your program, and you’d start out numbering in 10s, and then fill in the blanks and hope you didn’t need too many extra lines[7.95].

7.95 – b*gger.

8 – with some exceptions.

9 – unless you’re on support duty.  Then you can be pretty sure that they will.

10 – see picture.

11 – unused.

12 – intentionally left blank.

13 – unintentionally left blank.

Mitigate or remediate?

What’s the difference between mitigate and remediate?

I very, very nearly titled this article “The ‘aters gonna ‘ate”, and then thought better of it. This is a rare event, and I put it down to the extreme temperatures that we’re having here in the UK at the moment[1].

What prompted this article was reading something online today where I saw the word mitigate, and thought to myself, “When did I start using that word? It’s not a normal one to drop into conversation. And what about remediate? What’s the difference between mitigate and remediate? In fact, how well could I describe the difference between the two?” Both are quite jargon-y words, so, in the spirit of my recent article Jargon – a force for good or ill? here’s my attempt at describing the difference, but also pointing out how important both are – along with a couple of other “-ate” words.

Let’s work backwards.

Remediate

Remediation is a set of actions to get things back the way they should be. It’s the last step in the process of recovery from an attack or other failure. The re- prefix here is the give-away: like resetting and reconciliation. When you’re remediating, there may be an expectation that you’ll be returning your systems to the same state they were in before, for example, a power failure, but that’s not necessarily the case. What you should be focussing on is the service you’re providing, rather than the system(s) that are providing it. A set of steps for remediation might require you to replace your database with another one from a completely different provider[3], to add a load-balancer and to change all of your hardware, but you’re still remediating the problem that hit you. At the very least, if you’ve suffered an attack, you should make sure that you plug any hole that allowed the attacker in to start with[4].

Mitigate

Mitigation isn’t about making things better – it’s about reducing the impact of an attack or failure. Mitigation is the first set of steps you take when you realise that you’ve got a problem. You could argue that these things are connected, but mitigation isn’t about returning the service to normal (remediation), but about taking steps to reduce the impact of an attack. Those mitigations may be external – adding load-balancers to help deal with a DDoS attack, maybe – or internal – shutting down systems that have been actively compromised.

In fact, I’d argue that some mitigations might quite properly actually have an adverse effect on the service: there may be short term reductions in availability to ensure that long-term remediations can be performed. Planning for this is vitally important, as I’ve discussed previously, in Service degradation: actually a good thing.

The other -ates: update and operate

As I promised above, there are a couple of other words that are part of your response to an attack – or possible attack. The first is update. Updating is one of the key measures that you can put in place to reduce the chance of a successful attack. It’s the step you take before mitigation – because, if you’re lucky, you won’t need mitigation, because you’ll be immune from attack[5].
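
In concrete terms, this is as unglamorous as letting your package manager do its job regularly – the exact commands depend on your distribution, and these are only examples:

    # Fedora / RHEL-style systems:
    sudo dnf upgrade --refresh

    # Debian / Ubuntu-style systems:
    sudo apt update && sudo apt upgrade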

The second of these is operate. Operation is your normal state: it’s where you want to be. But operation doesn’t mean that you can just sit back and feel secure: you need to be keeping an eye on what’s going on, planning upgrades, considering mitigations and preparing for remediations. We too often think of the operate step as our Happy Place, where all is rosy, and we can sit back and watch the daisies grow. As DevOps (and DevSecOps) is teaching us, this is absolutely not the case: operate is very much a dynamic state, and not a passive one.


1 – 30C (~86F) is hot for the UK. Oh, and most houses (ours included) don’t have air conditioning[2].

2 – and I work from an office in the garden. Direct sunlight through the mainly glass wall is a blessing in the winter. Less so in the summer.

3 – hopefully open source, of course.

4 – hint: patch, patch, patch.

5 – well, that attack, at least. You’re never immune from all attacks: sorry.

Who do you trust: your data, or your enemy’s?

Our security logs define our organisational memory.

Imagine, just imagine, that you’re head of an organisation, and you suspect that there’s been some sort of attack on your systems[1].  You don’t know for sure, but a bunch of your cleverest folks are pretty sure that there are sufficient signs to merit an investigation.  So, you agree to allow that investigation, and it’s pretty clear from the outset – and becomes yet more clear as the investigation unfolds – that there was an attack on your organisation.  And as this investigation draws close to its completion, you happen to meet the head of another organisation, which happens to be not only your competitor, but also the party that your investigators are certain was behind the attack.  That person – the leader of your competitor – tells you that they absolutely didn’t perform an attack: no, sirree.  Who do you believe?  Your people, who have been laboring[2] away for months, or your competitor?

What a ridiculous question: of course you’d believe your own people over your competitor, right?

So, having set up such an absurd scenario[4], let’s look at a scenario which is actually much more likely.  Your systems have been acting strangely, and there seems to be something going on.  Based on the available information, you believe that you’ve been attacked, but you’re not sure.  Your experts think it’s pretty likely, so you approve an investigation.  And then one of your investigatory team comes to you to tell you some really bad news about the data.  “What’s the problem?” you ask.  “Is there no data?”

“No,” they reply, “it’s worse than that.  We’ve got loads of data, but we don’t know which is real.”

Our logs are our memory

There is a literary trope – one of my favourite examples is Margery Allingham’s The Traitor’s Purse – where a character realises that he or she is somebody other than who they think they are when they start to question their memories.  We are defined by our memories, and the same goes for organisational security.  Our security logs define our organisational memory.  If you cannot prove that the data you are looking at is[5] correct, then you cannot be sure what led to the state you are in now.

Let’s give a quick example.  In my organisation, I am careful to log every time I upgrade a piece of software, but I begin to wonder whether a particular application is behaving as expected.  Maybe I see some traffic to an external IP address which I’ve never seen before, for instance.  Well, the first thing I’m going to do is to check the logs to see whether somebody has updated the software with a malicious version.  Assuming that somebody has installed a malicious version, there are three likely outcomes at this point:

  1. you check the logs, and it’s clear that a non-authorised version was installed.  This isn’t good, because you now know that somebody has unauthorised access to your system, but at least you can do something about it.
  2. at least some of the logs are missing.  This is less good, because you really can’t be sure what’s gone on, but you now have a pretty strong suspicion that somebody with unauthorised access has deleted some logs to cover up their tracks, which means that you have a starting point for remediation.
  3. there’s nothing wrong.  All the logs look fine.  You’re now really concerned, as you can’t be sure of your own data – your organisation’s memories.  You may be looking at correct data, or you may be looking at incorrect data: data which has been written by your enemy.  Attackers can – and do – go into log files and change the data in them to obscure the fact that they have performed particular actions.  It’s actually one of the first steps that a competent attacker will perform.

In the scenario as defined, things probably aren’t too bad: you can check the checksum or hash of the installed software, and check that it’s the version you expect.  But what if the attacker has also changed the version of your checksum- or hash-checker so that, for packages associated with this particular application, it always returns what you expect to see?  This is not a theoretical attack, and nor is it the only approach that attackers have to try to muddy the waters as to what they’ve done. And if it has happened, then you have no way of confirming that you have the correct version of the application.  You can try updating the checksum- or hash-checker, but what if the attacker has meddled with the software installer to ensure that it always installs their version…?
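
As a minimal illustration of the sort of check I have in mind – a sketch with a made-up usage, not a recommendation of any particular tool – here’s how you might compare a file’s SHA-256 hash against a known-good value in Python.  The important part isn’t the code, it’s where the expected value comes from: ideally somewhere the attacker can’t reach, such as a vendor’s signed checksum file or a record held off-system.

    #!/usr/bin/env python3
    """Sketch: compare a file's SHA-256 hash against a known-good value.

    Usage: verify_hash.py <path-to-file> <expected-sha256-hex>
    """
    import hashlib
    import sys

    def sha256_of(path):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            # Read in chunks so that large binaries don't exhaust memory.
            for chunk in iter(lambda: f.read(65536), b""):
                h.update(chunk)
        return h.hexdigest()

    if __name__ == "__main__":
        path, expected = sys.argv[1], sys.argv[2].lower()
        actual = sha256_of(path)
        if actual == expected:
            print(f"OK: {path} matches the expected hash")
        else:
            print(f"MISMATCH: {path} hashes to {actual}, expected {expected}")
            sys.exit(1)

And, as above, even this only helps if you can trust the machine running the check: an attacker who controls the checker as well as the package can make anything appear to “verify”.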

It’s a slippery slope, and bar wiping the entire system and reinstalling[6], there’s little you can do to be certain that you’ve cleared things up properly.  And in some cases, it may be too late: what if the attacker’s main aim was to exfiltrate some of your proprietary data, for example?

Lessons to learn, actions to take

Well, the key lesson is that your logs are important, and they need to be protected.  If you can’t trust your logs, then it can be very, very difficult not only to identify the extent of an attack, but also to remediate it or, in the worst case, even to know that you’ve been attacked at all.

And what can you do?  Well, there are many techniques that you can employ, and the best combination will depend on a number of questions, including your regulatory regime, your security posture, and what attackers you decide to defend against.  I’d start with a fairly simple combination:

  • move your most important logs off-system.  Where possible, host logs on different systems to the ones that are doing the reporting (there’s a small sketch of one way to do this after the list).  If you’re an attacker, it’s more difficult to jump from one system to another than it is to make changes to logs on a system which you’ve already compromised;
  • make your logs write-only.  This sounds crazy – how are you supposed to check logs if they can’t be read?  What this really means is that you separate read and write privileges on your logs so that only those with a need to read them can do so.  Writing is less worrisome – though there are attacks here, including filesystem starvation – because if you can’t see what you need to change, then it’s almost impossible to do so.  If you’re an attacker, you might be able to wipe some logs – see our case 2 above – but obscuring what you’ve actually done is more difficult.
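
As a small illustration of the first of these, most logging frameworks can send events to a remote collector as well as (or instead of) the local disk.  Here’s a minimal Python sketch using the standard library’s syslog handler – the hostname is invented, and in practice you’d also want an authenticated, encrypted transport and a properly hardened collector at the other end:

    import logging
    import logging.handlers
    import socket

    # Hypothetical collector address - substitute your own log host.
    REMOTE_LOG_HOST = ("loghost.internal.example", 514)

    logger = logging.getLogger("myapp")
    logger.setLevel(logging.INFO)

    # Ship events to a separate machine over TCP, so that compromising
    # this host isn't, by itself, enough to rewrite history.
    remote = logging.handlers.SysLogHandler(
        address=REMOTE_LOG_HOST, socktype=socket.SOCK_STREAM
    )
    remote.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(remote)

    logger.info("package upgraded: example-app 1.2.3 -> 1.2.4")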

Exactly what steps you take are up to you, but remember: if you can’t trust your logs, you can’t trust your data, and if you can’t trust your data, you don’t know what has happened.  That’s what your enemy wants: confusion.


1 – like that’s ever going to happen.

2 – on this, very rare, occasion, I’m going to countenance[3] a US spelling.  I think you can guess why.

3 – contenance?

4 – I know, I know.

5 – are?

6 – you checked the firmware, right?  Hmm – maybe safer just to buy completely new hardware.

 

7 steps to security policy greatness

… we’ve got a good chance of doing the right thing, and keeping the auditors happy.

Security policies are something that everybody knows they should have, but which are deceptively easy to get wrong.  A simple set of steps to check when you’re implementing them is something that I think it’s important to share.  You know when you come up with a set of steps, and you suddenly realise that you’ve got a couple of vowels in there and you think, “wow, I can make an acronym!”?[1] This was one of those times.  The problem was that when I looked at the different steps, I decided that DDEAVMM doesn’t actually have much of a ring to it.  I’ve clearly still got work to do before I can name major public sector projects, for instance[2].  However, I still think it’s worth sharing, so let’s go through them in order.  Order, as for many sets of steps, is important here.

I’m going to give an example and walk through the steps for clarity.  Let’s say that our CISO, in his or her infinite wisdom, has decided that they don’t want anybody outside our network to be able to access our corporate website via port 80.  This is the policy that we need to implement.

1. Define

The first thing I need is a useful definition.  We nearly have that from the request[3] noted above, when our CISO said “I don’t want anybody outside our network to be able to access our corporate website via port 80”.  So, let’s make that into a slightly more useful definition.

“Access to IP address mycorporate.network.com on port 80 must be blocked to all hosts outside the 10.0.x.x to 10.3.x.x network range.”

I’m assuming that we already know that our main internal network is within 10.0.x.x to 10.3.x.x.  It’s not exactly a machine readable definition, but actually, that’s the point: we’re looking for a definition which is clear and human understandable.  Let’s assume that we’re happy with this, at least for now.

2. Design

Next, we need to design a way to implement this.  There are lots of ways of doing this – from iptables[4] to a full-blown, commercially supported firewall[6] – and I’m not fluent in any of them these days, so I’m going to assume that somebody who is better equipped than I has created a rule or set of rules to implement the policy defined in step 1.
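
Just to make the example concrete – and with the caveat, as I say, that I’m not fluent in these tools any more, so treat this as a sketch rather than what our better-equipped expert would actually produce – an iptables version of the design, assuming for simplicity that the rules sit on the web server itself, might look something like this (10.0.x.x to 10.3.x.x conveniently collapses to 10.0.0.0/14):

    # Sketch only - not production-ready.
    # Allow port 80 from the internal range (10.0.0.0 - 10.3.255.255)...
    iptables -A INPUT -p tcp --dport 80 -s 10.0.0.0/14 -j ACCEPT
    # ...and drop it from everywhere else.
    iptables -A INPUT -p tcp --dport 80 -j DROP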

We also need to define some tests – we’ll come back to these in step 5.

3. Evaluate

But we need to check.  What if they’ve mis-implemented it?  What if they misunderstood the design requirement?  It’s good practice – in fact, it’s vital – to do some evaluation of the design to ensure it’s correct.  For this rather simple example, it should be pretty easy to check by eye, but we might want to set up a test environment to evaluate that it meets the policy definition or take other steps to evaluate its correctness.  And don’t forget: we’re not checking that the design does what the person/team writing it thinks it should do: we’re checking that it meets the definition.  It’s quite possible that at this point we’ll realise that the definition was incorrect.  Maybe there’s another subnet – 10.5.x.x, for instance – that the security policy designer didn’t know about.  Or maybe the initial request wasn’t sufficiently robust, and our CISO actually wanted to block all access on any port other than 443 (for HTTPS), which should be allowed.  Now is a very good time to find that out.

4. Apply

We’ve ascertained that the design does what it should do – although we may have iterated a couple of times on exactly what “what it should do” means – so now we can implement it.  Whether that’s ssh-ing into a system, uploading a script, using some sort of GUI or physically installing a new box, it’s done.

Excellent: we’re finished, right?

No, we’re not.  The problem is that often, that’s exactly what people think.  Let’s move to our next step: arguably the most important, and the most often forgotten or ignored.

5. Validate

I really care about this one: it’s arguably the point of this post.  Once you’ve implemented a security policy, you need to validate that it actually does what it’s supposed to do.  I’ve written before about this, in my post If it isn’t tested, it doesn’t work, and it’s absolutely true of security policy implementations.  You need to check all of the different parts of the design.  This, you might hope, would be really easy, but even in the simple case that we’re working with, there are lots of tests you should be doing.  Let’s say that we took the two changes mentioned in step 3.  Here are some tests that I would want to be doing, with the expected result:

  • FAIL: access port 80 from an external IP address
  • FAIL: access port 8080 from an external IP address
  • FAIL: access port 80 from a 10.4.x.x address
  • PASS: access port 443 from an external IP address
  • UNDEFINED: access port 80 from a 10.5.x.x address
  • UNDEFINED: access port 80 from a 10.0.x.x address
  • UNDEFINED: access port 443 from a 10.0.x.x address
  • UNDEFINED: access port 80 from an external IP address but with a VPN connection into 10.0.x.x
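
Here’s a very rough Python sketch of what automating a few of the external-address checks might look like.  The hostname is the made-up one from the definition in step 1, the cases are illustrative rather than complete, and because the expected results depend on where you connect from, you’d run something like this from a host in each of the network ranges you care about (or drive a proper scanner from them) and compare against your table:

    #!/usr/bin/env python3
    """Sketch: check the port-80 policy from an *external* vantage point."""
    import socket

    TARGET = "mycorporate.network.com"   # from the (made-up) definition above

    # (port, expected outcome when connecting from an external address)
    CASES = [
        (80, "FAIL"),    # should be blocked
        (8080, "FAIL"),  # should be blocked
        (443, "PASS"),   # should be allowed
    ]

    def can_connect(host, port, timeout=5.0):
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    for port, expected in CASES:
        actual = "PASS" if can_connect(TARGET, port) else "FAIL"
        flag = "ok" if actual == expected else "MISMATCH - investigate"
        print(f"port {port}: expected {expected}, got {actual} -> {flag}")

If you record the output and re-run the same checks from cron, you also get a crude version of the monitoring in step 6: any change from the recorded results – including changes to results you’d marked “UNDEFINED” – is worth investigating.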

Of course, we’d want to be performing these tests on a number of other ports, and from a variety of IP addresses, too.

What’s really interesting about the list is the number of expected results that are “UNDEFINED”.  Unless we have a specific requirement, we just can’t be sure what’s expected.  We can guess, but we can’t be sure.  Maybe we don’t care?  I particularly like the last one, because the result we get may lead us much deeper into our IT deployment than we might expect.

The point, however, is that we need to check that the actual results meet our expectations, and maybe even define some new requirements if we want to remove some of the “UNDEFINED”.  We may be fine to leave some of the expected results as “UNDEFINED”, particularly if they’re out of scope for our work or our role.  Obviously, if the results don’t meet our expectations, then we also need to make some changes and apply them and then re-validate.  We also need to record the final results.

When we’ve got more complex security policy – multiple authentication checks, or complex SDN[7] routing rules – then our tests are likely to be much, much more complex.

6. Monitor

We’re still not done.  Remember those results we got in our previous tests?  Well, we need to monitor our system and see if there’s any change.  We should do this on a fairly frequent basis.  I’m not going to say “regular”, because regular practices can lead to sloppiness, and also leave windows of opportunity open to attackers.  We also need to perform checks whenever we make a change to a connected system.  Oh, if I had a dollar for every time I’ve heard “oh, this won’t affect system X at all.”…[8]

One interesting point is that we should also note when results whose expected value remains “UNDEFINED” change.  This may be a sign that something in our system has changed, it may be a sign that a legitimate change has been made in a connected system, or it may be a sign of some sort of attack.  It may not be quite as important as a change in one of our expected “PASS” or “FAIL” results, but it certainly merits further investigation.

7. Mitigate

Things will go wrong.  Some of them will be our fault, some of them will be our colleagues’ fault[10], some of them will be accidental, or due to hardware failure, and some will be due to attacks.  In all of these cases, we need to act to mitigate the failure.  We are in charge of this policy, so even if the failure is out of our control, we want to make sure that mitigating mechanisms are within our control.  And once we’ve completed the mitigation, we’re going to have to go back at least to step 2 and redesign our implementation.  We might even need to go back to step 1 and redefine what our definition should be.

Final steps

There are many other points to consider – one of the most important is the question of responsibility, touched on in step 7 (and particularly important during holiday seasons), along with special circumstances and decommissioning – but if we can keep these steps in mind when we’re implementing – and running – security policies, we’ve got a good chance of doing the right thing, and keeping the auditors happy, which is always worthwhile.


1 – I’m really sure it’s not just me.  Ask your colleagues.

2 – although, back when Father Ted was on TV, a colleague of mine and I came up with a database result which we named a “Fully Evaluated Query”.  It made us happy, at least.

3 – if it’s from the CISO, and he/she is my boss, then it’s not a request, it’s an order, but you get the point.

4 – which would be my first port[5] of call, but might not be the appropriate approach in this context.

5 – sorry, that was unintentional.

6 – just because it’s commercially supported doesn’t mean it has to be proprietary: open source is your friend, boys and girls, ladies and gentlemen.

7 – Software-Defined Networking.

8 – I’m going to leave you hanging, other than to say that, at current exchange rates, and assuming it was US dollars I was collecting, then I’d have almost exactly 3/4 of that amount in British pounds[9].

9 – other currencies are available, but please note that I’m not currently accepting bitcoin.

10 – one of the best types.

Happy 4th of July – from Europe

If I were going to launch a cyberattack on the US, I would do it on the 4th July.

There’s a piece of received wisdom from the years of the Cold War that if the Russians[1] had ever really wanted to start a land war in Europe, they would have done it on Christmas Day, when all of the US soldiers in Germany were partying and therefore unprepared for an attack.  I’m not sure if this is actually fair – I’m sure that US commanders had considered this eventuality – but it makes for a good story.

If I were going to launch a cyberattack on the US, I would do it on the 4th of July.  Now, to be entirely clear, I have no intentions of performing any type of attack – cyber or not – on our great ally across the Pond[2]: not today (which is actually the 3rd July) or tomorrow.  Quite apart from anything else, I’m employed by[5] a US company, and I also need to travel to the US on business quite frequently.  I’d prefer to be able to continue both these activities without undue attention from the relevant security services.

The point, however, is that the 4th of July would be a good time to do it.  How do I know this?  I know it because it’s one of my favourite holidays.  This may sound strange to those of you who follow or regularly read this blog, who will know – from my spelling, grammar and occasional snide humour[6] – that I’m a Brit, live in the UK, and am proud of my Britishness.  The 4th of July is widely held, by residents and citizens of the USA, to be a US holiday, and, specifically, one where they get to cock a snook at the British[7].  But I know, and my European colleagues know – in fact, I suspect that the rest of the world outside the US knows – that if you are employed by, are partners of, or otherwise do business with the US, then the 4th of July is a holiday for you as well.

It’s the day when you don’t get emails.  Or phone calls.  There are no meetings arranged.

It’s the day when you can get some work done.  Sounds a bit odd for a holiday, but that’s what most of us do.

Now, I’m sure that, like the US military in the Cold War, some planning has taken place, and there is a phalanx of poor, benighted sysadmins ready to ssh into servers around the US in order to deal with any attacks that come in and battle with the unseen invaders.  But I wonder if there are enough of them, and I wonder whether the senior sysadmins, the really experienced ones who are most likely to be able to repulse the enemy, haven’t ensured that it’s their junior colleagues who are the ones on duty so that they – the senior ones – can get down to some serious barbecuing and craft beer consumption[8].  And I wonder what the chances are of getting hold of the CISO or CTO when urgent action is required.

I may be being harsh here: maybe everything’s completely under control across all organisations throughout the USA, and nobody will take an extra day or two of holiday this week.  In fact, I suspect that many sensible global organisations – even those based in the US – have ensured that they’ve readied Canadian, Latin American, Asian or European colleagues to deal with any urgent issues that come up.  I really, really hope so.  For now, though, I’m going to keep my head down and hope that the servers I need to get all that work done on my favourite holiday stay up and responsive.

Oh, and roll on Thanksgiving.


1 – I suppose it should really be “the Soviet Union”, but it was also “the Russians”: go figure.

2 – the Atlantic ocean – this is British litotes[3].

3 – which is, like, a million times better than hyperbole[4].

4 – look them up.

5 – saying “I work for” sets such a dangerous precedent, don’t you think?

6 – litotes again.

7 – they probably don’t cock a snook, actually, as that’s quite a British phrase.

8 – I’m assuming UNIX or Linux sysadmins: therefore most likely bearded, and most likely craft beer drinkers.  Your stereotypes may vary.