Do you know what’s lurking on your system?

Every utility, every library, every executable … increases your attack surface.

I had a job once which involved designing hardening procedures for systems that we were going to be using for security-related projects.  It was fascinating.  This was probably 15 years ago, and not only were guides a little thin on the ground for the Linux distribution I was using, but what we were doing was quite niche.  At first, I think I’d assumed that I could just write a script to close down a few holes that originated from daemons[1] that had been left running for no reason: httpd, sendmail, stuff like that.  It did involve that, of course, but then I realised there was more to do, and started to dive down the rabbit hole.

So, after those daemons, I looked at users and groups.  And then at file systems, networking, storage.  I left the two scariest pieces to last, for different reasons.  The first was the kernel.  I ended up hand-crafting a kernel, removing anything that I thought it was unlikely we’d need, and then restarting several times when I discovered that the system wouldn’t boot because the things I thought I understood were more … esoteric than I’d realised.  I’m not a kernel developer, and this was a salutary lesson in quite how skilled those folks are.  At least, at the time I was doing it, there was less code, and fewer options, than there are today.  On the other hand, I was having to hack back to a required state, and there are more cut-down kernels and systems to start with than there were back then.

The other piece that I left for last was just pruning the installed Operating System applications and associated utilities.  Again, there are cut-down options that are easier to use now than then, but I also had some odd requirements – I believe that we needed Java, for instance, which has, or had … well, let’s say a lot of dependencies.  Most modern Linux distributions[3] start off by installing lots of pieces so that you can get started quickly, without having to worry about trying to work out dependencies for every piece of external software you want to run.

This is understandable, but we need to realise when we do this that we’re making a usability/security trade-off[5].  Every utility, every library, every executable that you add to a system increases your attack surface, and increases the likelihood of vulnerabilities.

The problem isn’t just that you’re introducing vulnerabilities, but that once they’re there, they tend to stay there.  Not just in code that you need, but, even worse, in code that you don’t need.  It’s a rare but praiseworthy hacker[6] who spends time going over old code removing dependencies that are no longer required.  It’s a boring, complex task, and it’s usually just easier to leave the cruft[7] where it is and ship a slightly bigger package for the next release.

Sometimes, code is refactored and stripped: most frequently, security-related code. This is a very Good Thing[tm], but it turns out that it’s far from sufficient.  The reason I’m writing this post is because of a recent story in The Register about the “beep” command.  This command used the small speaker installed on most PC-compatible motherboards to make a noise.  It was a useful little utility back in the day, but is pretty irrelevant now that most motherboards don’t ship with the relevant hardware.  The problem[8] is that installing and using the beep command on a system allows information to be leaked to users who lack the relevant permissions.  This can be a very bad thing.  There’s a good, brief overview here.

Now, “beep” isn’t installed by default on the distribution I’m using on the system on which I’m writing this post (Fedora 27), though it’s easily installable from one of the standard repositories that I have enabled.  Something of a relief, though it’s not a likely attack vector for this machine anyway.

What, though, do I have installed on this system that is vulnerable, and which I’d never have thought to check?  Forget all of the kernel parameters which I don’t need turned on, but which have been enabled by the distribution for ease of use across multiple systems.  Forget the apps that I’ve installed and use every day.  Forget, even, the apps that I installed once to try, and then neglected to remove.  What about the apps that I didn’t even know were there, and which I never realised might have an impact on the security posture of my system?  I don’t know, and have little way to find out.
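If you want at least a starting point, here’s a minimal sketch of the kind of stock-take I mean – it assumes an RPM-based distribution such as Fedora, and the “packages I know I need” list is entirely hypothetical:

```python
#!/usr/bin/env python3
"""A rough stock-take, not a hardening tool: list everything that's
installed and flag whatever isn't on a list of packages you know you
need.  Assumes an RPM-based system; 'rpm -qa' is the only command run,
and the known_needed set below is purely illustrative."""

import subprocess

# Packages you have consciously decided you need - illustrative only.
known_needed = {"kernel", "systemd", "openssh-server", "sudo", "bash"}

# Ask RPM for every installed package, names only.
result = subprocess.run(
    ["rpm", "-qa", "--queryformat", "%{NAME}\n"],
    capture_output=True, text=True, check=True,
)
installed = sorted(set(result.stdout.split()))
unexplained = [pkg for pkg in installed if pkg not in known_needed]

print(f"{len(installed)} packages installed")
print(f"{len(unexplained)} with no recorded reason to be here:")
for pkg in unexplained:
    print(f"  {pkg}")
```

It won’t tell you what’s vulnerable, of course, but it does at least turn “I have little way to find out” into a list you can start asking questions about.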

This system doesn’t run business-critical operations.  When I first got it and installed the Operating System, I decided to err towards usability, and to be ready to trash[9] it and start again if I had problems with compromise.  But that’s not the case for millions of other systems out there.  I urge you to consider what you’re running on a system, what’s actually installed on it, and what you really need.  Patch what you need, remove what you don’t.   It’s time for a Spring clean[10].


1 – I so want to spell this word dæmons, but I think that might be taking my Middle English obsession too far[2].

2 – I mentioned that I studied Middle English, right?

3 – I’m most interested in (GNU) Linux here, as I’m a strong open source advocate and because it’s the Operating System that I know best[4].

4 – oh, and I should disclose that I work for Red Hat, of course.

5 – as with so many things in our business.

6 – the good type, or I’d have said “cracker”, probably.

7 – there’s even a word for it, see?

8 – beyond a second order problem that a suggested fix seems to have made the initial problem worse…

9 – physically, if needs be.

10 – In the Northern Hemisphere, at least.

Confessions of an auditor

Moving to a positive view of security auditing.

So, right up front, I need to admit that I’ve been an auditor.  A security auditor.  Professionally.  It’s time to step up and be proud, not ashamed.  I am, however, dialled into a two-day workshop on compliance today and tomorrow, so this is front and centre of my thinking right at the moment.  Hence this post.

Now, I know that not everybody has an entirely … positive view of auditors.  And particularly security auditors[1].  They are, for many people, a necessary evil[2].  The intention of this post is to try to convince you that auditing[3] is, or can and should be, a Force for Good[tm].

The first thing to address is that most people have auditors in because they’re told that they have to, and not because they want to.  This is often – but not always – associated with regulatory requirements.  If you’re in the telco space, the government space or the financial services space, for instance, there are typically very clear requirements for you to adhere to particular governance frameworks.  In order to be compliant with the regulations, you’re going to need to show that you meet the frameworks, and in order to do that, you’re going to need to be audited.  For pretty much any sensible auditing regime, that’s going to require an external auditor who’s not employed by or otherwise accountable to the organisation being audited.

The other reason that audit may happen is when a C-level person (e.g. the CISO) decides that they don’t have a good enough idea of exactly what the security posture of their organisation – or specific parts of it – is like, so they put in place an auditing regime.  If you’re lucky, they choose an industry-standard regime, and/or one which is actually relevant to your organisation and what it does.  If you’re unlucky, … well, let’s not go there.

I think that both of the reasons above – for compliance and to better understand a security posture – are fairly good ones.  But they’re just the first order reasons, and the second order reasons – or the outcomes – are where it gets interesting.

Sadly, one of the key outcomes of auditing seems to be blame.  It’s always nice to have somebody else to blame when things go wrong, and if you can point to an audited system which has been compromised, or a system which has failed audit, then you can start pointing fingers.  I’d like to move away from this.

Some positive suggestions

What I’d like to see would be a change in attitude towards auditing.  I’d like more people and organisations to see security auditing as a net benefit to their organisations, their people, their systems and their processes[4].  This requires some changes to thinking – changes which many organisations have, of course, already made, but which more could make.  These are the second order reasons that we should be considering.

  1. Stop tick-box[5] auditing.  Too many audits – particularly security audits – seem to be solely about ticking boxes.  “Does this product or system have this feature?”  Yes or no.  That may help you pass your audit, but it doesn’t give you a chance to go further, and think about really improving what’s going on.  In order to do this, you’re going to need to find auditors who actually understand the systems they’re looking at, and are trained to do something beyond tick-box auditing.  I was lucky enough to be encouraged to audit in this broader way, alongside a number of tick-boxes, and I know that the people I was auditing always preferred this approach, because they told me that they had to do the other type, too – and hated it.
  2. Employ internal auditors.  You may not be able to get approval for actual audit sign-off if you use internal auditors, so internal auditors may have to operate alongside – or before – your external auditors, but if you find and train people who can do this job internally, then you get to have people with deep knowledge (or deeper knowledge, at least, than the external auditors) of your systems, people and processes looking at what they all do, and how they can be improved.
  3. Look to be proactive, not just reactive.  Don’t just pick or develop products, applications and systems to meet audit, and don’t wait until the time of the audit to see whether they’re going to pass.  Auditing should be about measuring and improving your security posture, so think about posture, and how you can improve it instead.
  4. Use auditing for risk-based management.  Last, but not least – far from least – think about auditing as part of your risk-based management planning.  An audit shouldn’t be something you do, pass[6] and then put away in a drawer.  You should be pushing the results back into your governance and security policy model, monitoring, validation and enforcement mechanisms.  Auditing should be about building, rather than destroying – however often it feels like it.

 


1 – you may hear phrases like “a special circle of Hell is reserved for…”.

2 – in fact, many other people might say that they’re an unnecessary evil.

3 – if not auditors.

4 – I’ve been on holiday for a few days: I’ve maybe got a little over-optimistic while I’ve been away.

5 – British usage alert: for US readers, you’ll probably call this a “check-box”.

6 – You always pass every security audit first time, right?

What’s a State Actor, and should I care?

The bad thing about State Actors is that they rarely adhere to an ethical code.

How do you know what security measures to put in place for your organisation and the systems that you run?  As we’ve seen previously, There are no absolutes in security, so there’s no way that you’re ever going to make everything perfectly safe.  But how much effort should you put in?

There are a number of parameters to take into account.  One thing to remember is that there are always trade-offs with security.  My favourite one is a three-way: security, cost and usability.  You can, the saying goes, have two of the three, but not the third: choose security, cost or usability.  If you want security and usability, it’s going to cost you.  If you want a cheaper solution with usability, security will be reduced.  And if you want security cheaply, it’s not going to be easily usable.  Of course, cost stands in for time spent as well: you might be able to make things more secure and usable by spending more time on the solution, but that time is costly.

But how do you know what’s an appropriate level of security?  Well, you need to think about what you’re protecting, the impact of it being:

  • exposed (confidentiality);
  • changed (integrity);
  • inaccessible (availability).

These are the three classic security properties, often recorded as C.I.A.[1].  Who would be impacted?  What mitigations might you put in place? What vulnerabilities might exist in the system, and how might they be exploited?
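To make that a little more concrete, here’s a minimal sketch of how you might record such an impact assessment – the assets, ratings and scale are invented purely for illustration:

```python
from dataclasses import dataclass

# Crude impact scale for illustration: 1 = minor, 2 = serious, 3 = severe.
@dataclass
class ImpactAssessment:
    asset: str
    confidentiality: int   # impact if it is exposed
    integrity: int         # impact if it is changed
    availability: int      # impact if it is inaccessible

    def worst_case(self) -> int:
        return max(self.confidentiality, self.integrity, self.availability)

# Hypothetical entries - your own register will look quite different.
register = [
    ImpactAssessment("customer database", confidentiality=3, integrity=3, availability=2),
    ImpactAssessment("public website", confidentiality=1, integrity=2, availability=2),
]

for entry in sorted(register, key=ImpactAssessment.worst_case, reverse=True):
    print(f"{entry.asset}: worst-case impact {entry.worst_case()}")
```

Even something this simple forces you to answer the C.I.A. questions per asset, rather than in the abstract.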

One of the big questions to ask alongside all of these is “who exactly might be wanting to attack my systems?”  There’s a classic adage within security that “no system is secure against a sufficiently motivated and resourced attacker”.  Luckily for us, there are actually very few attackers who fall into this category.  Some examples of attackers might be:

  • insiders[3]
  • script-kiddies
  • competitors
  • trouble-makers
  • hacktivists[4]
  • … and more.

Most of these will either not be that motivated, or not particularly well-resourced.  There are two types of attackers for whom that is not the case: academics and State Actors.  The good thing about academics is that they have to adhere to an ethical code, and shouldn’t be trying anything against your systems without your permission.  The bad thing about State Actors is that they rarely adhere to an ethical code.

State Actors have the resources of a nation state behind them, and are therefore well-resourced.  They are also generally considered to be well-motivated – if only because they have many people available to perform attacks, and those people are well-protected from legal process.

One thing that State Actors may not be, however, is government departments or parts of the military.  While some nations may choose to attack their competitors or enemies (or even, sometimes, partners) with “official” parts of the state apparatus, others may choose a “softer” approach, directing attacks in a more hands-off manner.  This help may take a variety of forms, from encouragement and logistical support to tools, money or even staff.  The intent, here, is to combine direction with plausible deniability: “it wasn’t us, it was this group of people/these individuals/this criminal gang working against you.  We’ll certainly stop them from performing any further attacks[5].”

Why should this matter to you, a private organisation or public company?  Why should a nation state have any interest in you if you’re not part of the military or government?

The answer is that there are many reasons why a State Actor may consider attacking you.  These may include:

  • to provide a launch point for further attacks
  • to compromise electoral processes
  • to gain Intellectual Property information
  • to gain competitive information for local companies
  • to gain competitive information for government-level trade talks
  • to compromise national infrastructure, e.g.
    • financial
    • power
    • water
    • transport
    • telecoms
  • to compromise national supply chains
  • to gain customer information
  • as revenge for perceived slights against the nation by your company – or just by your government
  • and, I’m sure, many others.

All of these examples may be reasons to consider State Actors when you’re performing your attacker analysis, but I don’t want to alarm you.  Most organisations won’t be a target, and for those that are, there are few measures that are likely to protect you from a true State Actor beyond those you should be taking anyway: frequent patching, following industry practice on encryption, and so on.  Equally important is monitoring for possible compromise, which, again, you should be doing anyway.  The good news is that if you might be on the list of possible State Actor targets, most countries provide good advice and support, both before and after the event, for organisations which are based or operate within their jurisdiction.


1 – I’d like to think that they tried to find a set of initials for G.C.H.Q. or M.I.5., but I suspect that they didn’t[2].

2 – who’s “they” in this context?  Don’t ask.  Just don’t.

3 – always more common than we might think: malicious, bored, incompetent, bankrupt – the reasons for insider-related security issues are many and varied.

4 – one of those portmanteau words that I want to dislike, but find rather useful.

5 – yuh-huh, right.

Why I should have cared more about lifecycle

Every deployment is messy.

I’ve always been on the development and architecture side of the house, rather than on the operations side. In the old days, this distinction was a useful and acceptable one, and wasn’t too difficult to maintain. From time to time, I’d get involved with discussions with people who were actually running the software that I had written, but on the whole, they were a fairly remote bunch.

This changed as I got into more senior architectural roles, and particularly as I moved through some pre-sales roles which involved more conversations with users. These conversations started to throw up[1] an uncomfortable truth: not only were people running the software that I had helped to design and write[3], but they didn’t just set it up the way we did in our clean test install rig, run it with well-behaved, well-structured data input by well-meaning, generally accurate users in a clean deployment environment, and then turn it off when they were done with it.

This should all seem very obvious, and I had, of course, been on the receiving end of requests from support people which exposed that there were odd things that users did to my software, but that’s usually all it felt like: odd things.

The problem is that odd is normal.  There is no perfect deployment, no clean installation, no well-structured data, and certainly very few generally accurate users.  Every deployment is messy, and nobody just turns off the software when they’re done with it.  If it’s become useful, it will be upgraded, patched, left to run with no maintenance, ignored or a combination of all of those.  And at some point, it’s likely to become “legacy” software, and somebody’s going to need to work out how to transition to a new version or a completely different system.  This all has major implications for security.

I was involved in an effort a few years ago to describe the functionality and lifecycle of a proposed new project.  I was on the security team, which, for all the usual reasons[4], didn’t always interact very closely with some of the other groups.  When the group working on error and failure modes came up with their state machine model and presented it at a meeting, we all looked on with interest.  And then with horror.  All the modes were “natural” failures: not one reflected what might happen if somebody intentionally caused a failure.  “Ah,” they responded, when called on it by the first of the security team able to form a coherent sentence, “those aren’t errors, those are attacks.”  “But,” one of us blurted out, “don’t you need to recover from them?”  “Well, yes,” they conceded, “but you can’t plan for that.  It’ll need to be on a case-by-case basis.”

This is thinking that we need to stamp out.  We need to design our systems so that, wherever possible, we consider not only what attacks might be brought to bear on them, but also how users – real users – can recover from them.

One way of doing this is to consider security as part of your resilience planning, and bake it into your thinking about lifecycle[5].  Failure happens for lots of reasons, and some of those will be because of bad people doing bad things.  It’s likely, however, that as you analyse the sorts of conditions that these attacks can lead to, a number of them will be similar to “natural” errors.  Maybe you could lose network connectivity to your database because of a loose cable, or maybe because somebody is performing a denial of service attack on it.  In both these cases, you may well start off with similar mitigations, though the steps to fix it are likely to be very different.  But considering all of these side by side means that you can help the people who are actually going to be operating those systems plan and be ready to manage their deployments.
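As a sketch of what considering these side by side might look like, here’s a hypothetical failure-mode catalogue entry that records both the natural and the malicious causes of the same observable condition, along with the shared first response – all the entries are invented:

```python
from dataclasses import dataclass, field

@dataclass
class FailureMode:
    condition: str                                             # what operators actually see
    natural_causes: list[str] = field(default_factory=list)
    malicious_causes: list[str] = field(default_factory=list)
    first_response: list[str] = field(default_factory=list)    # shared initial mitigations
    recovery_notes: str = ""                                   # where the two paths diverge

# Entirely hypothetical catalogue entry.
catalogue = [
    FailureMode(
        condition="database unreachable",
        natural_causes=["loose or failed network cable", "switch failure"],
        malicious_causes=["denial of service attack against the database"],
        first_response=["fail over to the read replica", "page the on-call operator"],
        recovery_notes="Cable fault: physical fix.  DoS: filter traffic, "
                       "contact the upstream provider, preserve logs.",
    ),
]

# A cheap review check: every entry should have considered attackers too.
for mode in catalogue:
    if not mode.malicious_causes:
        print(f"'{mode.condition}' lists no malicious cause - review it!")
```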

So the lesson from today is the same as it so often is: make sure that your security folks are involved from the beginning of a project, in all parts of it.  And an extra one: if you’re a security person, try to think not just about the attackers, but also about all those poor people who will be operating your software.  They’ll thank you for it[6].


1 – not literally, thankfully[2].

2 – though there was that memorable trip to Singapore with food poisoning… I’ll stop there.

3 – a fact of which I actually was aware.

4 – some due entirely to our own navel-gazing, I’m pretty sure.

5 – exactly what we singularly failed to do in the project I’ve just described.

6 – though probably not in person.  Or with an actual gift.  But at least they’ll complain less, and that’s got to be worth something.

Moving to DevOps, what’s most important? 

Technology, process or culture? (Clue: it’s not the first two)

You’ve been appointed the DevOps champion in your organisation: congratulations.  So, what’s the most important issue that you need to address?

It’s the technology – tools and the toolchain – right?  Everybody knows that unless you get the right tools for the job, you’re never going to make things work.  You need integration with your existing stack – though whether you go with tight or loose integration will be an interesting question – a support plan (vendor, 3rd party or internal), and a bug-tracking system to go with your source code management system.  And that’s just the start.

No!  Don’t be ridiculous: it’s clearly the process that’s most important.  If the team doesn’t agree on how stand-ups are run, who participates, the frequency and length of the meetings, and how many people are required for a quorum, then you’ll never be able to institute a consistent, repeatable working pattern.

In fact, although both the technology and the process are important, there’s a third component which is equally important, but typically even harder to get right: culture.  Yup, it’s that touchy-feely thing that we techies tend to struggle with[1].

Culture

I was visiting a medium-sized government institution a few months ago (not in the UK, as it happens), and we arrived a little early to meet the CEO and CTO.  We were ushered into the CEO’s office and waited for a while as the two of them finished participating in the daily stand-up.  They apologised for being a minute or two late, but far from being offended, I was impressed.  Here was an organisation where the culture of participation was clearly infused all the way up to the top.

Not that culture can be imposed from the top – nor can you rely on it percolating up from the bottom[3] – but these two C-level execs were not only modelling the behaviour they expected from the rest of their team, but also seemed, from the brief discussion we had about the process afterwards, to be truly invested in it.  If you can get management to buy into the process – and to be seen to buy in – you are at least less likely to have problems with other groups finding plausible excuses to keep their distance and getting away with it.

So let’s say that management believes that you should give DevOps a go.  Where do you start?

Developers, tick?[5]

Developers may well be your easiest target group.  Developers are often keen to try new things, and to find ways to move things along faster, so they are often the group that can be expected to adopt new technologies and methodologies.  DevOps has arguably been mainly driven by the development community. But you shouldn’t assume that all developers will be keen to embrace this change.  For some, the way things have always been done – your Rick Parfitts of dev, if you will[7] – is fine.  Finding ways to help them work efficiently in the new world is part of your job, not just theirs.  If you have superstar developers who aren’t happy with change, you risk alienating them and losing them if you try to force them into your brave new world.  What’s worse, if they dig their heels in, you risk the adoption of your DevSecOps vision being compromised when they explain to their managers that things aren’t going to change if it makes their lives more difficult and reduces their productivity.

Maybe you’re not going to be able to move all the systems and people to DevOps immediately.  Maybe you’re going to need to choose which apps to start with, and who will be your first DevOps champions.  Maybe it’s time to move slowly.

Not maybe: definitely

No – I lied.  You’re definitely going to need to move slowly.  Trying to change everything at once is a recipe for disaster.

This goes for all elements of the change – which people to choose, which technologies to choose, which applications to choose, which user base to choose, which use cases to choose – bar one.  For all of those elements, if you try to move everything in one go, you will fail.  You’ll fail for a number of reasons.  You’ll fail for reasons I can’t imagine, and, more importantly, for reasons you can’t imagine, but some of the reasons will include:

  • people – most people – don’t like change;
  • technologies don’t like change (you can’t just switch and expect everything to work still);
  • applications don’t like change (things worked before, or at least failed in known ways: you want to change everything in one go?  Well, they’ll all fail in new and exciting[9] ways);
  • users don’t like change;
  • use cases don’t like change.

The one exception

You noticed that, above, I wrote “bar one”, when discussing which elements you shouldn’t choose to change all in one go?  Well done.

What’s that exception?  It’s the initial team.  When you choose your initial application to change, and you’re thinking about choosing the team to make that change, select the members carefully, and select a complete set.  This is important.  If you choose just developers, just test folks, just security folks, just ops folks, or just management – if you leave out any functional group – then you won’t actually have proved anything at all.  Well, you might have proved to a small section of your community that it kind of works, but you’ll have missed out on a trick.  And that trick is that if you choose keen people from across your functional groups, it’s much harder to fail.

Say that your first attempt goes brilliantly.  How are you going to convince other people to replicate your success and adopt DevOps?  Well, the company newsletter, of course.  And that will convince how many people, exactly?  Yes, that number[12].  If, on the other hand, you have team members from across the functional parts of the organisation, then when you succeed, they’ll tell their colleagues, and you’ll get more buy-in next time.

If, conversely, it fails, well, if you’ve chosen your team wisely, and they’re all enthusiastic, and know that “fail often, fail fast” is good, then they’ll be ready to go again.

So you need to choose enthusiasts from across your functional groups.  They can work on the technologies and the process, and once that’s working, it’s the people who will create that cultural change.  You can just sit back and enjoy.  Until the next crisis, of course.


1 – OK, you’re right.  It should be “with which we techies tend to struggle”[2]

2 – you thought I was going to qualify that bit about techies struggling with touchy-feely stuff, didn’t you?  Read it again: I put “tend to”.  That’s the best you’re getting.

3 – is percolating a bottom-up process?  I don’t drink coffee[4], so I wouldn’t know.

4 – do people even use percolators to make coffee anymore?  Feel free to let me know in the comments. I may pretend interest if you’re lucky.

5 – for US readers (and some other countries, maybe?), please substitute “check” for “tick” here[6].

6 – for US techie readers, feel free to perform “s/tick/check/;”.

7 – this is a Status Quo[8] reference for which I’m extremely sorry.

8 – for Millennial readers, please consult your favourite online reference engine or just roll your eyes and move on.

9 – for people who say, “but I love excitement”, try being on call at 2am on a Sunday morning at the end of the quarter when your Chief Financial Officer calls you up to ask why all of last month’s sales figures have been corrupted with the letters “DEADBEEF”[10].

10 – for people not in the know, this is a string often used by techies as test data because a) it’s non-numerical; b) it’s numerical (in hexadecimal); c) it’s easy to search for in debug files and d) it’s funny[11].

11 – though see [9].

12 – it’s a low number, is all I’m saying.

If it isn’t tested, it doesn’t work

Testing isn’t just coming up with tests for desired use cases.

Huh.  Shouldn’t that title be “If it isn’t tested, it’s not going to work”?

No.

I’m asserting something slightly different here – in fact, two things.  The first can be stated thus:

“In order for a system to ‘work’ correctly, and to defined parameters, test cases for all plausible conditions must be documented, crafted – and passed – before the system is considered to ‘work’.”

The second is a slightly more philosophical take on the question of what a “working system” is:

“An instantiated system – including software, hardware, data and wetware[1] components – may be considered to be ‘working’ if both its current state, and all known plausible future states from the working state have been anticipated, documented and appropriately tested.”

Let’s deal with these one by one, starting with the first[3].

Case 1 – a complete test suite

I may have given away the basis for my thinking by the phrasing in the subtitle above.  What I think we need to be looking for, when we’re designing a system, is to ensure that we have a test case for every plausible condition.  I considered “possible” here, but I think that may be going too far: for most systems, for instance, you don’t need to worry too much about meteor strikes.  This is an extension of the Agile methodology dictum: “a feature is not ‘done’ until it has a test case, and that test case has been passed.”  Each feature should be based on a use case, and a feature is considered correctly implemented when the test cases that are designed to test that feature are all correctly passed.

It’s too easy, however, to leave it there.  Defining features is, well, not easy, but something we know how to do.  “When a user enters a valid username/password combination, the splash-screen should appear.”  “When a file has completed writing, a tick should appear on the relevant icon.”  “If a user cancels the transaction, no money should be transferred between accounts.”  The last is a good one, in that it deals with an error condition.  In fact, that’s the next step beyond considering test cases for features that implement functionality to support actions that are desired: considering test cases to manage conditions that arise from actions that are undesired.

The problem is that many people, when designing systems, only consider one particular type of undesired action: accidental, non-malicious action.  This is the reason that you need to get security folks[4] in when you’re designing your system, and the related test cases.  In order to ensure that you’re reaching all plausible conditions, you need to consider intentional, malicious actions.  A system which has not considered these, and been tested against them, cannot, in my opinion, truly be said to be “working”.
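As an illustration – the account objects and amounts here are invented for the purpose – a pytest-style suite that only covers the desired behaviour from the examples above isn’t complete until it also covers deliberately malicious input:

```python
# A toy transfer function and its tests, sketched to show the difference
# between testing desired behaviour and testing abuse.  Purely illustrative.

class Account:
    def __init__(self, balance: int):
        self.balance = balance

def transfer(src: Account, dst: Account, amount: int, cancelled: bool = False) -> None:
    if cancelled:
        return                                  # user cancelled: nothing moves
    if amount <= 0 or amount > src.balance:
        raise ValueError("invalid transfer amount")
    src.balance -= amount
    dst.balance += amount

# Desired behaviour: cancelling the transaction moves no money.
def test_cancelled_transaction_moves_no_money():
    a, b = Account(100), Account(0)
    transfer(a, b, 50, cancelled=True)
    assert (a.balance, b.balance) == (100, 0)

# Abuse case: a malicious caller passes a negative amount to pull money
# the "wrong" way.  The system must refuse, and state must be unchanged.
def test_negative_amount_is_rejected():
    a, b = Account(100), Account(0)
    try:
        transfer(a, b, -50)
        raise AssertionError("negative transfer should have been rejected")
    except ValueError:
        assert (a.balance, b.balance) == (100, 0)
```

The second test is the one that tends to go missing when only the accidental is considered.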

Case 2 – the bigger systems picture

I write fairly frequently[5] about the importance of systems and systems thinking, and one of the interesting things about a system, from my point of view, is that it’s arguably not really a system until it’s up and running: “instantiated”, in the language I used in my definition above.

Case 1 dealt, basically, with test cases and the development cycle.  That, by definition, is before you get to a fully instantiated system: one which is operating in the environment for which it was designed – you really, really hope – and is in situ.  Part of it may be quiescent, and that is hopefully as designed, but it is instantiated.

A system has a current state; it has a set of defined (if not known[7]) past states; and a set of possible future states that it can reach from there.  Again, I’m not going to insist that all possible states should be considered, for the same reasons I gave above, but I think that we do need to talk about all known plausible future states.

These types of conditions won’t all be security-related.  Many of them may be more appropriately thought of as to do with assurance or resilience.  But if you don’t get the security folks in, and early in the planning process, then you’re likely to miss some.

Here’s how it works.  If I am a business owner, and I am relying on a system to perform the tasks for which it was designed, then I’m likely to be annoyed if some IT person comes to me and says “the system isn’t working”.  However, if the answer to my question, “and did it fail due to something we had considered in our design and deployment of the system?”, is “yes”, then I’m quite likely to move beyond annoyed to a state which, if we’re honest, the IT person could easily have considered, nay predicted, and which is closer to “incandescent” than “contented”[8].

Because if we’d considered a particular problem  – it was “known”, and “plausible” – then we should have put in place measures to deal with it. Some of those will be preventative measures, to stop the bad thing happening in the first place, and others will be mitigations, to deal with the effects of the bad thing that happened.  And there may also be known, plausible states for which we may consciously decide not to prepare.  If I’m a small business owner in Weston-super-mare[9], then I may be less worried about industrial espionage than if I’m a multi-national[10].  Some risks aren’t worth the bother, and that’s fine.

To be clear: the mitigations that we prepare won’t always be technical.  Let’s say that we come up with a scenario where an employee takes data from the system on a USB stick and gives it to a competitor.  It may be that we can’t restrict all employees from using USB sticks with the system, so we have to rely on legal recourse if that happens.  If, in that case, we call in the relevant law enforcement agency, then the system is working as designed if that was our plan to deal with this scenario.

Another point is that not all future conditions can be reached from the current working state, and if they can’t, then it’s fair to decide not to deal with them.  Once a TPM is initialised, for instance, taking it back to its factory state basically requires resetting it, so any system which is relying on it has also been reset.

What about the last bit of my definition?  “…[A]nticipated, documented and appropriately tested.”  Well, you can’t test everything fully.  Consider that the following scenarios are all known and plausible for your system:

  • a full power-down for your entire data centre;
  • all of your workers are incapacitated by a ‘flu virus;
  • your main sysadmin is kidnapped;
  • an employee takes data from the system on a USB stick and gives it to a competitor.

You’re really not going to want to test all of these.  But you can at least perform paper exercises to consider what steps you should take, and also document them.  You might ensure that you know which law enforcement agency to call, and what the number is, for instance, instead of actually convincing an employee to leak information to a competitor and then having them arrested[11].

Conclusion

Testing isn’t just coming up with tests for desired use cases.  It’s not even good enough just to prepare for accidental undesired use cases on top of that.  We need to consider malicious use cases, too.   And testing in development isn’t good enough either: we need to test with live systems, in situ.  Because if we don’t, something, somewhere, is going to go wrong.

And you really don’t want to be the person telling your boss that, “well, we thought it might, but we never tested it.”

 

 


1 – “wetware” is usually defined as human components of a system (as here), but you might have non-human inputs (from animals or aliens), or even from flora[2], I suppose.

2 – “woodware”?

3 – because I, for one, need a bit of a mental run-up to the second one.

4 – preferably the cynical, suspicious types.

5 – if not necessarily regularly: people often confuse the two words.  A regular customer may only visit once a year, but always does it on the same day, whereas a frequent customer may visit on average once a week, but may choose a different day each week.[6]

6 – how is this relevant?  It’s not.

7 – yes, I know: Schrödinger’s cat, quantum effects, blah, blah.

8 – Short version: if the IT person says “it broke, and it did it in a way we had thought of before”, then I’m going to be mighty angry.

9 – I grew up nearby.  Windy, muddy, donkeys.

10 – which might, plausibly, also be based in Weston-super-mare, though I’m not aware of any.

11 – this is, I think, probably at least bordering on the unethical, and might get you in some hot water with your legal department, and possibly some other interested parties[12].

12 – your competitor might be pleased, though, so there is that.

Security patching and vaccinations: a surprising link

Learning from medicine, but recognising differences.

I’ve written a couple of times before about patching, and in one article (“The Curious Incident of the Patch in the Night-Time”), I said that I’d return to the question of how patches and vaccinations are similar.  Given the recent flurry of patching news since Meltdown and Spectre, I thought that now would be a good time to do that.

Now, one difference that I should point out up front is that nobody believes that applying security patches to your systems will give them autism[1].  Let’s counter that with the first obvious similarity, though: patching your systems makes them resistant to attacks based on particular vulnerabilities.  Equally, a particular patch may provide resistance to multiple types of attack of the same family, as do some vaccinations.  Also similarly, as new attacks emerge – or bacteria or viruses change and evolve – new patches are likely to be required to deal with the problem.

We shouldn’t overplay the similarities, of course.  Just because some types of malware are referred to as “viruses” doesn’t mean that their method of attack, or the mechanisms by which computer systems defend against them, are even vaguely alike[2].  Computer systems don’t have complex immune systems which adapt and learn how to deal with malware[3].  On the other hand, there are also lots of different types of vulnerability for which patches are efficacious but which are very different from bacterial or viral attacks: a buffer overflow attack or SQL injection, for instance.  So, it’s clearly possible to over-egg this pudding[4].  But there is another similarity that I do think is worth drawing, though it’s not perfect.

There are some systems which, for whatever reason, are actually quite risky to patch.  The business risk associated with patching them might be down to a number of factors, including:

  • projected downtime as the patch is applied and system rebooted is unacceptable;
  • side effects of the patch (e.g. performance impact) are too severe;
  • risk of the system not rebooting after patch application is too high;
  • other components of the system (e.g. hardware or other software) may be incompatible with the patch.

In these cases, a decision may be made that the business risk of patching the system outweighs the business risk of leaving it unpatched. Alternatively, it may be that you are running some systems which are old and outdated, and for which there is no patch available.

Here’s where there’s another surprising similarity with vaccinations.  There are, in any human population, individuals for whom the dangers of receiving a vaccination may outweigh the benefits.  The reasons for this are different from the computer case, and are generally down to weakened immune systems and/or poor health.  However, it turns out that as the percentage of a human population[6] that is vaccinated rises, the threat to the unvaccinated individuals reduces, as there are fewer infection vectors from whom those individuals can receive the infection.

We need to be careful with how closely we draw the analogy here, because we’re on shaky ground if we go too far, but there are types of system vulnerability – particularly malware – for which this is true for computer systems.  If you patch all the systems that you can, then the number of possible “jump-off” points for malware will reduce, meaning that the unpatched systems are less likely to be affected.  To a lesser degree, it’s probably true that as unsophisticated attackers notice that a particular attack vector is diminishing, they’ll ignore it and move to something else.  Over-stretching this thread, however, is particularly dangerous: a standard approach for any motivated attacker is to attempt attack vectors which are “old”, but to which unpatched systems are likely to be vulnerable.

Another difference is that in the computing world, attacks never die off.  In the biological world, some viruses and bacteria do become extinct in the general population over time (though stockpiles of some are maintained for various reasons[7]).  In the world of IT, pretty much every vulnerability ever discovered will have been documented somewhere, whether or not an “infected” system still exists, and so is still available for re-use or re-purposing.

What is the moral of this article?  Well, it’s this: even if you are unable to patch all of your systems, it’s still worth patching as many of them as you can.  It’s also worth considering whether there are some low-risk systems that you can patch immediately, and which require less business analysis before deciding whether they can be patched in a second or third round of patching.  It’s probably worth keeping a list of these somewhere.  Even better, you can maintain lists of high-, medium- and low-risk systems – both in terms of business risk and infection vulnerability – and use this to inform your patching, both automatic and manual.  But, dear reader: do patch.
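To go back to those lists for a moment: one very simple way of keeping them – the system names, ratings and scoring rule below are all invented for illustration – is to record the business risk of patching and the exposure of leaving each system unpatched, and let that ordering drive your patching rounds:

```python
# A minimal sketch of a patching-priority list.  Ratings: 1 = low, 3 = high.
systems = [
    # (name, business_risk_of_patching, exposure_if_unpatched)
    ("build server",        1, 3),
    ("public web frontend", 2, 3),
    ("legacy billing box",  3, 2),   # e.g. no vendor patch available
    ("internal wiki",       1, 1),
]

def patch_round(business_risk: int, exposure: int) -> int:
    """Round 1: safe to patch now.  Round 2: soon.  Round 3: needs proper analysis."""
    if business_risk == 1:
        return 1 if exposure >= 2 else 2
    return 3

for name, risk, exposure in sorted(systems, key=lambda s: patch_round(s[1], s[2])):
    print(f"round {patch_round(risk, exposure)}: {name}")
```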


1 – if you believe that – or, in fact, if you believe that vaccinations give children autism – then you’re reading the wrong blog.  I seriously suggest that you go elsewhere (and read some proper science on the subject).

2 – pace the attempts of Hollywood CGI departments to make us believe that they’re exactly the same.

3 – though this is obviously an interesting research area.

4 – “overextend this analogy”.  The pudding metaphor is a good one though, right?[5]

5 – and I like puddings, as my wife (and my waistline) will testify.

6 – or, come to think of it, animal (I’m unclear on flora).

7 – generally, one hopes, philanthropic.