Thinking like a (systems) architect

“…I know it when I see it, and the motion picture involved in this case is not that.” – Mr Justice Stewart.

My very first post on this blog, some six months ago*, was entitled “Systems security – why it matters”, and it turns out that posts where I talk** about architecture are popular, so I thought I’d go a bit further into what I think being an architect is about.  First of all, I’d like to reference two books which helped me come to some sort of understanding about the art of being an architect.  I read them a long time ago****, but I still dip into them from time to time.  I’m going to link to the publisher’s***** website, rather than to any particular bookseller:

What’s interesting about them is that they both have multiple points of view expressed in them: some contradictory – even within each book.  And this rather reflects the fact that I believe that being a systems architect is an art, or a discipline.  Different practitioners will have different views about it.  You can talk about Computer Science being a hard science, and there are parts of it which are, but much of software engineering (lower case intentional) goes beyond that.  The same, I think, is even more true for systems architecture: you may be able to grok what it is once you know it, but it’s very difficult to point to something – even a set of principles – and say, “that is systems architecture”.  Sometimes, the easiest way to define something is by defining what it’s not: search for “I know it when I see it, and the motion picture involved in this case is not that.”******

Let me, however, try to give some examples of the sort of things you should expect to see when someone (or a group of people) is doing good systems architecture:

  • pictures: if you can’t show the different components of a system in a picture, I don’t believe that you can fully describe what each does, or how they interact.  If you can’t separate them out, you don’t have a properly described system, so you have no architecture.  I know that I’m heavily visually oriented, but for me this feels like a sine qua non********.
  • a data description: if you don’t know what data is in your system, you don’t know what it does.
  • an entity description: components, users, printers, whatever: you need to know what’s doing what so that you can describe what the what is that’s being done to it, and what for*********.
  • an awareness of time: this may sound like a weird one, but all systems (of any use) process data through time.  If you don’t think about what will change, you won’t understand what will do the changing, and you won’t be able to consider what might go wrong if things get changed in ways you don’t expect, or by components that shouldn’t be doing the changing in the first place.
  • some thinking on failure modes: I’ve said it before, and I’ll say it again: “things will go wrong.”  You can’t be expected to imagine all the things that might go wrong, but you have a responsibility to consider what might happen to different components and data – and therefore to the operation of the system as a whole – if********** they fall over.

There are, of course, some very useful tools and methodologies (the use of UML views is a great example) which can help you with all of these.  But you don’t need to be an expert in all of them – or even any one of them – to be a good systems architect.

One last thing I’d add, though, and I’m going to call it the “bus and amnesiac dictum”***********:

  • In six months’ time, you’ll have forgotten the details or been hit by a bus: document it.  All of it.

You know it makes sense.


*this note is for nothing other than to catch those people who go straight to this section, hoping for a summary of the main article.  You know who you are, and you’re busted.

**well, write, I guess, but it feels like I’m chatting at people, so that’s how I think of it***

***yes, I’m going out of my way to make the notes even less info-filled than usual.  Deal with it.

****seven years is a long time, right?

*****almost “of course”, though I believe they’re getting out of the paper book publishing biz, sadly.

******this is a famous comment from a case from the U.S. called “Jacobellis v. Ohio” which I’m absolutely not going to quote in full here, because although it might generate quite a lot of traffic, it’s not the sort of traffic I want on this blog*******.

*******I did some searching and found the word: it’s apophasis.  I love that word.  Discovered it during some study once, forgot it.  Glad to have re-found it.

********I know: Greek and Latin in one post.  Sehr gut, ja?

*********I realise that this is a complete mess of a sentence.  But it does have a charm to it, yes?  And you know what it means, you really do.

**********when.  I meant “when”.

***********more Latin.

Why I love technical debt

… if technical debt can be named, then it can be documented.

Not necessarily a title you’d expect for a blog post, I guess*, but I’m a fan of technical debt.  There are two reasons for this: a Bad Reason[tm] and a Good Reason[tm].  I’ll be up front about the Bad Reason first, and then explain why even that isn’t really a reason to love it.  I’ll then tackle the Good Reason, and you’ll nod along in agreement.

The Bad Reason I love technical debt

We’ll get this out of the way, then, shall we?  The Bad Reason is that, well, there’s just lots of it, it’s interesting, it keeps me in a job, and it always provides a reason, as a security architect, for me to get involved in** projects that might give me something new to look at.  I suppose those aren’t all bad things, but it can also be a bit depressing, because there’s always so much of it, it’s not always interesting, and sometimes I need to get involved even when I might have better things to do.

And what’s worse is that it almost always seems to be security-related, and it’s always there.  That’s the bad part.

Security, we all know, is the piece that so often gets left out, or tacked on at the end, or done in half the time that it deserves, or done by people who have half an idea, but don’t quite fully grasp it.  I should be clear at this point: I’m not saying that this last reason is those people’s fault.  That people know they need security is fantastic.  If we (the security folks) or we (the organisation) haven’t done a good enough job in making security resources – whether people, training or sufficient visibility – available to those people who need it, then the fact that they’re trying is great, and something we can work on.  Let’s call that a positive.  Or at least a reason for hope***.

The Good Reason I love technical debt

So let’s get on to the other reason: the legitimate reason.  I love technical debt when it’s named.

What does that mean?

So, we all get that technical debt is a bad thing.  It’s what happens when you make decisions for pragmatic reasons which are likely to come back and bite you later in a project’s lifecycle.  Here are a few classic examples that relate to security:

  • not getting round to applying authentication or authorisation controls on APIs which might at some point be public;
  • lumping capabilities together so that it’s difficult to separate out appropriate roles later on;
  • hard-coding roles in ways which don’t allow for customisation by people who may use your application in different ways to those that you initially considered;
  • hard-coding cipher suites for cryptographic protocols, rather than putting them in a config file where they can be changed or selected later (there’s a quick sketch of this one below).
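To make that last example concrete, here’s a minimal sketch in Python (the function name and the particular cipher suite are made up purely for illustration): the suite is hard-coded, so changing it later means a source change, a re-test and a new release rather than a config edit.

```python
# A minimal sketch of hard-coded cipher suite selection; the function name
# and the suite itself are illustrative, not recommendations.
import ssl

def make_client_context():
    ctx = ssl.create_default_context()
    # Hard-coded: nobody deploying this can choose a different suite
    # without patching the source and waiting for the next release.
    ctx.set_ciphers("ECDHE-RSA-AES256-GCM-SHA384")
    return ctx
```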

There are lots more, of course, but those are just a few which jump out at me, and which I’ve seen over the years.  Technical debt means making decisions which will mean more work later on to fix them.  And that can’t be good, can it?

Well, there are two words in the preceding paragraph or two which should make us happy: they are “decisions” and “pragmatic”.  Because in order for something to be named technical debt, I’d argue, it needs to have been subject to conscious decision-making, and for trade-offs to have been made – hopefully for rational reasons.  Those reasons may be many and various – lack of qualified resources; project deadlines; lack of sufficient requirement definition – but if they’ve been made consciously, then the technical debt can be named, and if technical debt can be named, then it can be documented.

And if they’re documented, then we’re half-way there.  As a security guy, I know that I can’t force everything that goes out of the door to meet all the requirements I’d like – but the same goes for the High Availability gal, the UX team, the performance folks, et al.  What we need – what we all need – is for there to exist documentation about why decisions were made, because then, when we return to the problem later on, we’ll know that it was thought about.  And, what’s more, the recording of that information might even make it into product documentation, too.  “This API is designed to be used in a protected environment, and should not be exposed on the public Internet” is a great piece of documentation.  It may not be what a customer is looking for, but at least they know how to deploy the product now, and, crucially, it’s an opportunity for them to come back to the product manager and say, “we’d really like to deploy that particular API in this way: could you please add this as a feature request?”.  Product managers like that.  Very much****.

The best thing, though, is not just that named technical debt is visible technical debt, but that if you encourage your developers to document the decisions in code*****, then there’s a halfway-decent chance that they’ll record some ideas about how this should be done in the future.  If you’re really lucky, they might even add some hooks in the code to make it easier (an “auth” parameter on the API which is unused in the current version, but will make API compatibility so much simpler in new releases; or a cipher entry in the config file which only accepts one option now, but is at least checked by the code).
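In case it helps, here’s a sketch of the kind of hooks I mean (the parameter, file and option names are all invented for illustration): an “auth” parameter that’s accepted but ignored for now, and a cipher entry in a config file which only accepts one value today but is at least checked by the code.

```python
# A sketch of "hooks for the future"; parameter, file and option names are
# invented for illustration only.
import configparser
import ssl

ALLOWED_CIPHERS = {"ECDHE-RSA-AES256-GCM-SHA384"}   # only one option... for now

def make_client_context(config_path="service.ini"):
    cfg = configparser.ConfigParser()
    cfg.read(config_path)
    # The cipher comes from config rather than being hard-coded, even though
    # only a single value is currently accepted.
    cipher = cfg.get("tls", "cipher", fallback="ECDHE-RSA-AES256-GCM-SHA384")
    if cipher not in ALLOWED_CIPHERS:
        raise ValueError("unsupported cipher: " + cipher)
    ctx = ssl.create_default_context()
    ctx.set_ciphers(cipher)
    return ctx

def list_reports(auth=None):
    # 'auth' is unused in this version, but keeping it in the signature now
    # makes API compatibility much simpler when authentication arrives later.
    return [{"name": "quarterly"}]
```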

I’ve been a bit disingenuous, I know, by defining technical debt as named technical debt.  But honestly, if it’s not named, then you can’t know what it is, and until you know what it is, you can’t fix it*******.  My advice is this: when you’re doing a release close-down (or in your weekly stand-up: EVERY weekly stand-up), have an item to record technical debt.  Name it, document it, be proud, sleep at night.

 


*well, apart from the obvious clickbait reason – for which I’m (a little) sorry.

**I nearly wrote “poke my nose into”.

***work with me here.

****if you’re software engineer/coder/hacker – here’s a piece of advice: learn to talk to product managers like real people, and treat them nicely.  They (the better ones, at least) are invaluable allies when you need to prioritise features or have tricky trade-offs to make.

*****do this.  Just do it.  Documentation which isn’t at least mirrored in code isn’t real documentation******.

******don’t believe me?  Talk to developers.  “Who reads product documentation?”  “Oh, the spec?  I skimmed it.  A few releases back.  I think.”  “I looked in the header file: couldn’t see it there.”

*******or decide not to fix it, which may also be an entirely appropriate decision.

Security for the rest of us (them…)

“I think I’ve just been got by a phishing email…”

I attended Black Hat USA a few weeks ago in Las Vegas*.  I also spent some time at B-Sides LV and DEFCON.  These were my first visits to all of them, and they were interesting experiences.  There were some seriously clever people talking about some seriously complex things, some of which were way beyond my level of knowledge.   There were also some seriously odd things and odd people**.  There was one particular speaker who did a great job, and whose keynote made me think: Alex Stamos, CSO of Facebook.

The reason that I want to talk about Stamos’ talk is that I got a phone call a few minutes back from a member of my family.  It was about his iCloud account, which he was having problems accessing.  Now: I don’t use Apple products***, so I wasn’t able to help. But the background was the interesting point.  I’d had a call last week from the same family member.  He’s not … techno-savvy.   You know the one I’m talking about: that family member.  He was phoning me in something of a fluster.

“I think I’ve just been got by a phishing email,” he started.

Now: this is a win.  Somebody – whether me or the media – has got him to understand what a phishing email is.  I’m not saying he could spell it correctly, mind you – or that he’s not going to get hit by one – but at least he knows.

“OK,” I said.

“It said that it was from Apple, and if I didn’t change my password within 72 hours, I’d lose all of my data,” he explained.

Ah, I thought, one of those.

“So I clicked on the link and changed my password.  But I realised after about 5 minutes and changed it again,” he continued.

“Where did you change it that time?” I asked.

“On the Apple site.”

“apple.com?”

“Yes.”

“Then you’re probably OK.”  I gave him some advice on things to check, and suggested ringing Apple and maybe his bank to let them know.  I also gave him the Stern Talk[tm] that we’ve all given users – the one about never clicking through a link on an email, and always entering it by hand.*****  He called me back a few hours later to tell me that the guy he’d spoken to at Apple had reassured him that his bank details weren’t in danger, and that a subsequent notification he’d got that someone was trying to use his account from an unidentified device was a good sign, because it meant that the extra layers of security that Apple had put in place were doing their job.  He was significantly (and rightly) relieved.

“So what has this to do with Stamos’ keynote?” you’re probably asking.  Well, Stamos talked about how many of the attacks and vulnerabilities that we worry about much of the time – zero days, privilege escalations, network segment isolation – make up the tiniest tip of the huge pyramid of security issues that affect our users.  Most of the problems are around misuse of accounts or services.  And most of the users in the world aren’t uberhackers or even script kiddies – they’re not even people like those in the audience****** – but people with sub-$100******* smartphones.

And he’s right.  We need to think about these people, too.  I’m not saying that we shouldn’t worry about all the complex and scary technical issues that we’re paid to understand, fix and mitigate.  They are important, and if we don’t fix them, we’re in for a world of pain.  But our jobs – what we get paid for – should also include thinking about the other people: people who Facebook and Apple make a great deal of money from, and who they quite rightly care about.  The question is: do we, the rest of the industry?  And how are we going to know that we’re thinking like a 68-year-old woman in India or a 15-year-old boy in Brazil?  (Hint: part of the answer is around diversity in our industry.)

Apple didn’t do too bad a job, I think – though my family member is still struggling with the impact of the password reset.  And the organisation I talked about in my previous post on the simple things we should do absolutely didn’t.  So, from now on, I’m going to try to think a little harder about what impact the recommendations, architectures and designs I come up with might have on the “hidden users” – not the sysadmins, not the developers, not the expert users, but people like my family members.  We need to think about security for them just as much as we do for people like us.


*weird place, right?  And hot.  Too hot.

**I walked out of one session at DEFCON after six minutes as it was getting more and more difficult to resist the temptation to approach the speaker at the podium and punch him on the nose.

***no, I’m not going to explain.  I just don’t: let’s leave it at that for now, OK?  I’m not judging you if you do.****

****of course I’m judging you.  But you’ll be fine.

*****clearly whoever had explained about phishing attacks hadn’t done quite as good a job as I’d hoped.

******who, he seemed to assume, were mainly Good Guys & Gals[tm].

*******approximately sub-€85 or sub-£80 at time of going to press********: please substitute your favoured currency here and convert as required.

********I’m guessing around 0.0000000000000001 bitcoins.  I don’t follow the conversion rate, to be brutally honest.

 

 

The simple things, sometimes…

I (re-)learned an important lesson this week: if you’re an attacker, start at the front door.

This week I’ve had an interesting conversation with an organisation with which I’m involved*.  My involvement is as a volunteer, and has nothing to do with my day job – in other words, I have nothing to do with the organisation’s security.  However, I got an email from them telling me that in order to perform a particular action, I should now fill in an online form, which would then record the information that they needed.

So this week’s blog entry will be about entering information on an online form.  One of the simplest tasks that you might want to design – and secure – for any website.  I wish I could say that it’s going to be a happy tale.

I had a look at this form, and then I looked at the URL they’d given me.  It wasn’t a fully qualified URL, in that it had no protocol component, so I copied and pasted it into a browser to find out what would happen. I had a hope that it might automatically redirect to an https-served page.  It didn’t.  It was an http-served page.

Well, not necessarily so bad, except that … it wanted some personal information.  Ah.

So, I cheated: I changed the http:// … to an https:// and tried again**.  And got an error.  The certificate was invalid.  So even if they changed the URL, it wasn’t going to help.
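As an aside, the manual poking-about I did is easy enough to script.  Here’s a rough sketch using Python’s requests library – the URL is a placeholder, not the organisation’s real address – which checks whether the http page redirects to https, and whether the https version presents a valid certificate.

```python
# A rough sketch of the two checks described above; the URL is a placeholder.
import requests

def check_form_url(host_and_path="example.org/member-form"):
    # 1. Does the plain-http URL redirect us to an https-served page?
    resp = requests.get("http://" + host_and_path, allow_redirects=True, timeout=10)
    if not resp.url.startswith("https://"):
        print("No redirect to https: form data would travel in the clear")

    # 2. Does the https version present a valid certificate?
    try:
        requests.get("https://" + host_and_path, timeout=10)
    except requests.exceptions.SSLError as err:
        print("Certificate problem:", err)

check_form_url()
```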

So what did I do?  I got in touch with my contact at the organisation, advising them that there was a possibility that they might be in breach of their obligations under Data Protection legislation.

I got a phone call a little later.  Not from a technical person – though there was a techie in the background.  They said that they’d spoken with the IT and security departments, and that there wasn’t a problem.  I disagreed, and tried to explain.

The first argument was whether there was any confidential information being entered.  They said that there was no linkage between the information being entered and the confidential information held in a separate system (a database, I’m assuming). So I stepped back, and asked about the first piece of information requested on the form: my name.  I tried a question: “Could the fact that I’m a member of this organisation be considered confidential in any situation?”

“Yes, it could.”

So, that’s one issue out of the way.

But it turns out that the information is stored encrypted on the organisation’s systems.  “Great,” I said, “but while it’s in transit, while it’s being transmitted to those systems, then somebody could read it.”

And this is where communication stopped.  I tried to explain that unless the information from the form is transmitted over https, then people could read it.  I tried to explain that if I sent it over my phone, then people at my mobile provider could read it.  I tried a simple example: I tried to explain that if I transmitted it from a laptop in a Starbucks, then people who run the Starbucks systems – or even possibly other Starbucks customers – could see it.  But I couldn’t get through.

In the end, I gave up.  It turns out that I can avoid using the form if I want to.  And the organisation is of the firm opinion that it’s not at risk: that all the data that is collected is safe.  It was quite clear that I wasn’t going to have an opportunity to argue this with their IT or security people: although I did try to explain that this is an area in which I have some expertise, they’re not going to let any Tom, Dick or Harry*** bother their IT people****.

There’s no real end to this story, other than to say that sometimes it’s the small stuff we need to worry about.  The issues that, as security professionals, we feel are cut and dried, are sometimes the places where there’s still lots of work to be done.  I wish it weren’t the case, because frankly, I’d like to spend my time educating people on the really tricky things, and explaining complex concepts around cryptographic protocols, trust domains and identity, but I (re-)learned an important lesson this week: if you’re an attacker, start at the front door.  It’s probably not even closed: let alone locked.


*I’m not going to identify the organisation: it wouldn’t be fair or appropriate.  Suffice to say that they should know about this sort of issue.

**I know: all the skillz!

***Or “J. Random User”.  Insert your preferred non-specific identifier here.

****I have some sympathy with this point of view: you don’t want to have all of their time taken up by random “experts”.  The problem is when there really _are_ problems – and when the people calling really do know their stuff.

Diversity in IT security: not just a canine issue

“People won’t listen to you or take you seriously unless you’re an old white man, and since I’m an old white man I’m going to use that to help the people who need it.” —Patrick Stewart, Actor

A couple of weeks ago, I wrote an April Fool’s post: “Changing the demographic in IT security: a radical proposal”.  It was “guest-written” by my dog, Sherlock, and suggested that dogs would make a good demographic from which to recruit IT professionals.  It went down quite well, and I had a good spike of hits, which was nice, but I wish that it hadn’t been necessary, or resonated so obviously with where we really are in the industry*.

Of special interest to me is the representation of women within IT security, and particularly within technical roles.  This is largely due to the fact that I have two daughters of school age**, and although I wouldn’t want to force them into a technical career, I want to be absolutely sure that they have as much chance both to try it – and then to succeed – as anybody else does, regardless of their gender.

But I think other issues of under-representation should be of equal concern.  Professionals from ethnic minorities and with disabilities are also under-represented within IT security, and this is, without a doubt, a Bad Thing[tm].  I suspect that the same goes for people in the LGBTQ+ demographics.  From my perspective, diversity is something which is an unalloyed good within pretty much any organisation.  Different viewpoints don’t just allow us to reflect what our customers see and do, but also bring different perspectives to anything from perimeter defence to user stories, from UX to threading models.  Companies and organisations are just more flexible – and therefore more resilient – if they represent a wide-ranging set of perspectives and views.  Not only because they’re more likely to be able to react positively if they come under criticism, but because they are less likely to succumb to groupthink and the “yes-men”*** mentality.

Part of the problem is that we hire ourselves.  I’m a white male in a straight marriage with a Western university education and a nuclear family.  I’ve got all of the privilege starting right there, and it’s really, really easy to find people like me to work with, and to hire, and to trust in business relationships.  And I know that I sometimes get annoyed with people who approach things differently to me, whose viewpoint leads them to consider alternative solutions or ideas.  And whether there’s a disproportionate percentage of annoyances associated with people who come from a different background to me, or I’m just less likely to notice such annoyances when they come from someone who shares my background, there’s a real danger of prejudice kicking in and privilege – my privilege – taking over.

So, what can we do?  Here are some ideas:

  • Go out of our way to read, listen to and engage with people from different backgrounds to our own, particularly if we disagree with them, and particularly if they’re in our industry
  • Make a point of including the views of non-majority members of teams and groups in which you participate
  • Mentor and encourage those from disparate backgrounds in their careers
  • Consider positive discrimination – this is tricky, particularly with legal requirements in some contexts, but it’s worth considering, if only to recognise what a difference it might make.
  • Encourage our companies to engage in affirmative groups and events
  • Encourage our companies only to sponsor events with positive policies on harassment, speaker and panel selection, etc.
  • Consider refusing to speak on industry panels made up of people who are all in our demographic****
  • Interview outliers
  • Practise “blind CV” selection

These are my views. The views of someone with privilege.  I’m sure they’re not all right.  I’m sure they’re not all applicable to everybody’s situation.  I’m aware that there’s a danger of my misappropriating a fight which is not mine, and of the dangers of intersectionality.

But if I can’t stand up from my position of privilege***** and say something, then who can?


*Or, let’s face it, society.

**I’m also married to a very strong proponent of equal rights and feminism.  It’s not so much that it rubbed off on me, but that I’m pretty sure she’d have little to do with me if I didn’t feel the same way.

***And I do mean “men” here, yes.

****My wife challenged me to put this in.  Because I don’t do it, and I should.

*****“People won’t listen to you or take you seriously unless you’re an old****** white man, and since I’m an old white man I’m going to use that to help the people who need it.” —Patrick Stewart, Actor

******Although I’m not old.*******

*******Whatever my daughters may say.

Diversity – redux

One of the recurring arguments against affirmative action from majority-represented groups is that it’s unfair that the under-represented group has comparatively special treatment.

Fair warning: this is not really a blog post about IT security, but about issues which pertain to our industry.  You’ll find social sciences and humanities – “soft sciences” – referenced.  I make no excuses (and I should declare previous form*).

Warning two: many of the examples I’m going to be citing are to do with gender discrimination and imbalances.  These are areas that I know the most about, but I’m very aware of other areas of privilege and discrimination, and I’d specifically call out LGBTQ, ethnic minority, age, disability and non-neurotypical discrimination.  I’m very happy to hear (privately or in comments) from people with expertise in other areas.

You’ve probably read the leaked internal document (a “manifesto”) from a Google staffer challenging affirmative action as a way to address diversity, and complaining about a liberal/left-leaning monoculture at the company.  If you haven’t, you should: take the time now.  It’s well-written, with some interesting points, but I have some major problems with it that I think are worth addressing.  (There’s a very good rebuttal of certain aspects available from an ex-Google staffer.)  If you’re interested in where I’m coming from on this issue, please feel free to read my earlier post: Diversity in IT security: not just a canine issue**.

There are two issues that concern me specifically:

  1. no obvious attempt to acknowledge the existence of privilege and power imbalances;
  2. the attempt to advance the gender essentialism argument by alleging an overly leftist bias in the social sciences.

I’m not sure whether these approaches are intentional or unconscious, but they’re both insidious, and if ignored, allow more weight to be given to the broader arguments put forward than I believe they merit.  I’m not planning to address those broader issues: there are other people doing a good job of that (see the rebuttal I referenced above, for instance).

Before I go any further, I’d like to record that I know very little about Google, its employment practices or its corporate culture: pretty much everything I know has been gleaned from what I’ve read online***.  I’m not, therefore, going to try to condone or condemn any particular practices.  It may well be that some of the criticisms levelled in the article/letter are entirely fair: I just don’t know.  What I’m interested in doing here is addressing those areas which seem to me not to be entirely open or fair.

Privilege and power imbalances

One of the recurring arguments against affirmative action from majority-represented groups is that it’s unfair that the under-represented group has comparatively special treatment.  “Why is there no march for heterosexual pride?”  “Why are there no men-only colleges in the UK?”  The generally accepted argument is that until there is equality in the particular sphere in which a group is campaigning, then the power imbalance and privilege afforded to the majority-represented group means that there may be a need for action to help members of the under-represented group achieve parity.  That doesn’t mean that members of that group are necessarily unable to reach positions of power and influence within that sphere, just that, on average, the effort required will be greater than that for those in the majority-privileged group.

What does all of the above mean for women in tech, for example?  That it’s generally harder for women to succeed than it is for men.  Not always.  But on average.  So if we want to make it easier for women (in this example) to succeed in tech, we need to find ways to help.

The author of the Google piece doesn’t really address this issue.  He (and I’m just assuming it’s a man who wrote it) suggests that women (who seem to be the key demographic with whom he’s concerned) don’t need to be better represented in all parts of Google, and therefore affirmative action is inappropriate.  I’d say that even if the first part of that thesis is true (and I’m not sure it is: see below), then affirmative action may still be required for those who do.

The impact of “leftist bias”

Many of the arguments presented in the manifesto are predicated on the following thesis:

  • the corporate culture at Google**** is generally leftist-leaning
  • many social sciences are heavily populated by leftist-leaning theorists
  • these social scientists don’t accept the theory of gender essentialism (that women and men are suited to different roles)
  • THEREFORE corporate culture is overly inclined to reject gender essentialism
  • HENCE if a truly diverse culture is to be encouraged within corporate culture, leftist theories such as the rejection of gender essentialism should themselves be rejected.

There are several flaws here, one of which is the assumption that diversity means accepting views which are anti-diverse.  It’s a reflection of a similar right-leaning fallacy that in order to show true tolerance, the views of intolerant people should be afforded the same privilege as those who are aiming for greater tolerance.*****

Another flaw is the argument that a set of theories must be suspect just because it’s espoused by a political movement to which one doesn’t subscribe.

Conclusion

As I’ve noted above, I’m far from happy with much of the so-called manifesto from what I’m assuming is a male Google staffer.  This post hasn’t been an attempt to address all of the arguments, but to attack a couple of the underlying arguments, without which I believe the general thread of the document is extremely weak.  As always, I welcome responses either in comments or privately.

 


*my degree is in English Literature and Theology.  Yeah, I know.

**it’s the only post on which I’ve had some pretty negative comments, which appeared on the reddit board from which I linked it.

***and is probably therefore just as far off the mark as anything else that you or I read online.

****and many other tech firms, I’d suggest.

*****an appeal is sometimes made to the left’s perceived poster child of postmodernism: “but you say that all views are equally valid”.  That’s not what postmodern (deconstructionist, post-structuralist) theory actually says.  I’d characterise it more as:

  • all views are worthy of consideration;
  • BUT we should treat with suspicion those views held by those with privilege, or which privilege those with power.

The Curious Incident of the Patch in the Night-Time

Gregory: “The patch did nothing in the night-time.”
Holmes: “That was the curious incident.”

To misquote Sir Arthur Conan-Doyle:

Gregory (cyber-security auditor) “Is there any other point to which you would wish to draw my attention?”
Holmes: “To the curious incident of the patch in the night-time.”
Gregory: “The patch did nothing in the night-time.”
Holmes: “That was the curious incident.”

I considered a variety of (munged) literary titles to head up this blog, and settled on the one above or “We Need to Talk about Patching”.  Either way round, there’s something rotten in the state of patching*.

Let me start with what I hope is a fairly uncontroversial statement: “we all know that patches are important for security and stability, and that we should really take them as soon as they’re available and patch all of our systems”.

I don’t know about you, but I suspect you’re the same as me: I run ‘sudo dnf --refresh upgrade’** on my home machines and work laptop at least once every day that I turn them on.  I nearly wrote that when an update comes out to patch my phone, I take it pretty much immediately, but actually, I’ve been burned before with dodgy patches, and I’ll often have a check of the patch number to see if anyone has spotted any problems with it before downloading it. This feels like basic due diligence, particularly as I don’t have a “staging phone” which I could use to test pre-production and see if my “production phone” is likely to be impacted***.

But the overwhelming evidence from the industry is that people really don’t apply patches – including security patches – even though they understand that they ought to.  I plan to post another blog entry at some point about similarities – and differences – between patching and vaccinations, but let’s take as read, for now, the assumption that organisations know they should patch, and look at the reasons they don’t, and what we might do to improve that.

Why people don’t patch

Here are the legitimate reasons that I can think of for organisations not patching****.

  1. they don’t know about patches
    a. not all patches are advertised well enough
    b. organisations don’t check for patches
  2. they don’t know about their systems
    a. incomplete knowledge of their IT estate
  3. legacy hardware
    a. patches not compatible with legacy hardware
  4. legacy software
    a. patches not compatible with legacy software
  5. known impact with up-to-date hardware & software
  6. possible impact with up-to-date hardware & software

Some of these are down to the organisations, or their operating environment, clearly: 1b, 2, 3 and 4.  The others, however, are down to us as an industry.  What it comes down to is a balance of risk: the IT operations department doesn’t dare to update software with patches because they know that if the systems that they maintain go down, they’re in real trouble.  Sometimes they know there will be a problem (typically because they test patches in a staging environment of some type), and sometimes because they just don’t dare.  This may be because they are in the middle of their own software update process, and the combination of Operating System, middleware or integrated software updates with their ongoing changes just can’t be trusted.

What we can do

Here are some thoughts about what we as an industry can do to try to address this problem – or set of problems.

Staging

Staging – what is a staging environment for?  It’s for testing changes before they go into production, of course.  But what changes?  Changes to your software, or your suppliers’ software?  The answer has to be “both”, I think.  You may need separate estates so that you can look at changes of these two sets of software separately before seeing what combining them does, but in the end, it is the combination of the two that matters.  You may consider using the same estate at different times to test the different options, but that’s not an option for all organisations.

DevOps

DevOps shouldn’t just be about allowing agile development practices to become part of the software lifecycle: it should also be about allowing agile operational practices to become part of the software lifecycle.  DevOps can really help with patching strategy if you think of it this way.  Remember, in DevOps, everybody has responsibility.  So your DevOps pipeline is the perfect way to test how changes in your software are affected by changes in the underlying estate.  And because you’re updating regularly, and have unit tests to check all the key functionality*****, any changes can be spotted and addressed quickly.

Dependencies

Patches sometimes have dependencies.  We should be clear when a patch requires other changes, resulting in a large patchset, and when a large patchset just happens to be released because multiple patches are available.  Some dependencies may be outside the control of the vendor.  This is easier to test when your patch has dependencies on an underlying Operating System, for instance, but more difficult if the dependency is in the opposite direction.  If you’re the one providing the underlying update and the customer is using software that you don’t explicitly test, then it’s incumbent on you, I’d argue, to use some of the other techniques that I’ve outlined to help your customers understand likely impact.

Visibility of likely impact

One obvious option available to those providing patches is a good description of areas of impact.  You’d hope that everyone did this already, of course, but a brief line something like “this update is for the storage subsystem, and should affect only those systems using EXT3”, for instance, is a great help in deciding the likely impact of a patch.  You can’t always get it right – there may always be unexpected consequences, and vendors can’t test for all configurations.  But they should at least test all supported configurations…

Risk statements

This is tricky, and maybe political, but is it time that we started giving those customers who need it a little more detail about the likely impact of the changes within a patch?  It’s difficult to quantify, of course: a one-character change may affect 95% of the flows through a module, whereas what may seem like a simple functional addition to a customer may actually require thousands of lines of code.  But as vendors, we should have an idea of the impact of a change, and we ought to be considering how we expose that to customers.
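To make that slightly more concrete, here’s a sketch of what a machine-readable impact-and-risk statement might look like.  It doesn’t correspond to any existing advisory format – the fields and values are invented – it’s just an illustration of the kind of information vendors could expose alongside a patch.

```python
# An invented, illustrative structure for a patch impact-and-risk statement;
# this is not any existing advisory format.
from dataclasses import dataclass

@dataclass
class PatchAdvisory:
    patch_id: str
    subsystem: str            # e.g. "storage"
    affected_configs: list    # e.g. ["EXT3"]: configurations likely to see impact
    security_fix: bool
    estimated_impact: float   # vendor's estimate, 0.0 (trivial) to 1.0 (invasive)
    notes: str = ""

advisory = PatchAdvisory(
    patch_id="2017-123",
    subsystem="storage",
    affected_configs=["EXT3"],
    security_fix=False,
    estimated_impact=0.1,
    notes="Should affect only those systems using EXT3.",
)
```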

Combinations

Beyond that, however, I think there are opportunities for customers to understand what the impact of not having accepted a previous patch is.  Maybe the risk of accepting patch A is low, but the risk of not accepting patch A and patch B is much higher.  Maybe it’s safer to accept patch A and patch C, but wait for a successor to patch B.  I’m not sure quite how to quantify this, or how it might work, but I think there’s grounds for research******.
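I don’t have a proper model for this, but here’s a toy sketch of the kind of comparison I mean.  The risk numbers are entirely invented, and the assumption that risks combine independently is certainly wrong for real patches: it’s only meant to illustrate weighing the risk of applying patches against the risk of leaving them unapplied, across different combinations.

```python
# A toy comparison of patch combinations.  The numbers are invented and the
# independence assumption is a simplification: illustration only.
patch_risk = {"A": 0.05, "B": 0.30, "C": 0.10}      # risk of applying each patch
unpatched_risk = {"A": 0.20, "B": 0.15, "C": 0.05}  # risk of leaving it unapplied

def combined_risk(applied):
    # Chance that at least one thing goes wrong, naively assuming independence.
    nothing_goes_wrong = 1.0
    for patch in patch_risk:
        risk = patch_risk[patch] if patch in applied else unpatched_risk[patch]
        nothing_goes_wrong *= (1.0 - risk)
    return 1.0 - nothing_goes_wrong

for option in [set(), {"A"}, {"A", "C"}, {"A", "B", "C"}]:
    print(sorted(option), round(combined_risk(option), 3))
```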

Conclusion

Businesses have every right not to patch.  There are business reasons to balance the risk of patching against not patching.  But the balance is currently often tipped too far in the direction of not patching.  Much too far.  And if we’re going to improve the state of IT security, we, the industry, need to do something about it.  By helping organisations with better information, by encouraging them to adopt better practices, by training them in how to assess risk, and by adopting better practices ourselves.

 


*see what I did there?

**your commands may vary.

***this almost sounds like a very good excuse for a second phone, though I’m not sure that my wife would agree.

****I’d certainly be interested to hear of others: please let me know via comments.

*****you do have these two things, right?  Because if you don’t, you’re really not doing DevOps.  Sorry.

******as soon as I wrote this, I realised that somebody’s bound to have done research on this issue.  Please let me know if you have: or know somebody who has.