The “invisible” trade-off? Security.

“For twenty years, people have been leaving security till last.”

Colleague (in a meeting): “For twenty years, people have been leaving security till last.”

Me (in response): “You could have left out those last two words.”

This article will be a short one, and it’s a plea.  It’s also not aimed at my regular readership, because if you’re part of my regular readership, then you don’t need telling.  Many of the articles on this blog, however, are written with the express intention of meeting two criteria:

  1. they should be technically credible[1].
  2. you should be able to show them to your parents or to your manager[2].

I suspect that it’s your manager, this time round, who I’ll be targeting, but I don’t want to make assumptions about your parents’ roles or influence, so let’s leave it open.

The issue I want to address this week is the impact of not placing security firmly at the beginning, middle and end of any system or application design process.  As we all know, security isn’t something that you can bolt onto the end of a project and hope that you’ll be OK.  Equally, if you think about it only at the beginning, you’ll find that by the end, your requirements, use cases, infrastructure or personae will have changed[3], and what you planned at the beginning is no longer fit for purpose.  After all, if you know that your functional requirements will change (and everybody knows this), then why wouldn’t your non-functional requirements be subject to the same drift?

The problem is that security, being a non-functional requirement[4], doesn’t get the up-front visibility that it needs.  And, because it’s difficult to do well, and it’s often the responsibility of a non-core team member “flown in” as a consultant or expert for a small percentage of design meetings, security is the area that it’s easy to decide to let slide a bit.  Or a lot.  Or completely.

If there’s a trade-off around features, functionality or resource location, it’s likely to be security that loses out, and often, nobody even raises the point that there has been a trade-off: it’s completely invisible (this is one of the reasons Why I love technical debt).  This is also the reason that whenever I look at a system, I try to think “what were the decisions made about security?”, because, too often, no decisions were made about security at all.

So, if you’re a manager[6], and you’re involved with designing a system or application, don’t let security be the invisible trade-off.  I’m not saying that it needs to be the be-all and end-all of the project, but at least ensure that you think about it.  Thank you.


1 – they should be accurate, to be honest, but I also try not to dive deeper into technical topics than is absolutely required for context.

2 – to be clear, this isn’t about making them work- and parent-safe, but about presenting the topics in a manner that is approachable by non-experts.

3 – or, equally likely, all of them.

4 – I don’t mean that security doesn’t function correctly[5], but rather that it’s not one of the key functions of the system or application that’s being designed.

5 – though, now you mention it…

6 – or parent – see above.

Is homogeneity bad for security?

Can it really be good for security to have such a small number of systems out there?

For the last three years, I’ve attended the Linux Security Summit (though it’s not solely about Linux, actually), and that’s where I am for the first two days of this week – the next three days are taken up with the Open Source Summit.  This year, both are being run both in North America and in Europe – and there was a version of the Open Source Summit in Asia, too.  This is all good, of course: the more people, and the more diversity we have in the community, the stronger we’ll be.

The question of diversity came up at the Linux Security Summit today, but not in the way you might necessarily expect.  As with most of the industry, this very technical conference (there’s a very strong Linux kernel developer bias) is very under-represented by women, ethnic minorities and people with disabilities.  It’s a pity, and something we need to address, but when a question came up after someone’s talk, it wasn’t diversity of people’s background that was being questioned, but of the systems we deploy around the world.

The question was asked of a panel who were talking about open firmware and how making it open source will (hopefully) increase the security of the system.  We’d already heard how most systems – laptops, servers, desktops and beyond – come with a range of different pieces of firmware from a variety of different vendors.  And when we talk about a variety, this can easily hit over 100 different pieces of firmware per system.  How are you supposed to trust a system with so many different pieces?  And, as one of the panel members pointed out, many of the vendors are quite open about the fact that they don’t see themselves as security experts, and are actually asking the members of open source projects to design APIs, make recommendations about design, etc.

This self-knowledge is clearly a good thing, and the main focus of the panel’s efforts has been to try to define a small core of well-understood and better designed elements that can be deployed in a more trusted manner.   The question that was asked from the audience was in response to this effort, and seemed to me to be a very fair one.  It was (to paraphrase slightly): “Can it really be good for security to have such a small number of systems out there?”  The argument – and it’s a good one in general – is that if you have a small number of designs which are deployed across the vast majority of installations, then there is a real danger that a small number of vulnerabilities can impact on a large percentage of that install base.

It’s a similar problem in the natural world: a population with a restricted genetic pool is at risk from a successful attacker: a virus or fungus, for instance, which can attack many individuals due to their similar genetic make-up.

In principle, I would love to see more diversity of design within computing, and particularly security, but there are two issues with this:

  1. management: there is a real cost to managing multiple different implementations and products, so organisations prefer to have a smaller number of designs, reducing the number of the tools to manage them, and the number of people required to be trained.
  2. scarcity of resources: there is a scarcity of resources within IT security.  There just aren’t enough security experts around to design good security into systems, to support them and then to respond to attacks as vulnerabilities are found and exploited.

To the first issue, I don’t see many easy answers, but to the second, there are three responses:

  1. find ways to scale the impact of your resources: if you open source your code, then the number of expert resources available to work on it expands enormously.  I wrote about this a couple of years ago in Disbelieving the many eyes hypothesis.  If your code is proprietary, then the number of experts you can leverage is small: if it is open source, you have access to almost the entire worldwide pool of experts.
  2. be able to respond quickly: if attacks on systems are found, and vulnerabilities identified, then the ability to move quickly to remedy them allows you to mitigate significantly the impact on the installation base.
  3. design in defence in depth: rather than relying on one defence to an attack or type of attack, try to design your deployment in such a way that you have layers of defence. This means that you have some time to fix a problem that arises before catastrophic failure affects your deployment.

I’m hesitant to overplay the biological analogy, but the second and third of these seem quite similar to defences we see in nature.  The equivalent to quick response is to have multiple generations in a short time, giving a species the opportunity to develop immunity to a particular attack, and defence in depth is a typical defence mechanism in nature – think of humans’ ability to recognise bad meat by its smell, taste its “off-ness” and then vomit it up if swallowed.  I’m not quite sure how this particular analogy would map to the world of IT security (though some of the practices you see in the industry can turn your stomach), but while we wait to have a bigger – and more diverse – pool of security experts, let’s keep being open source, let’s keep responding quickly, and let’s make sure that we design for defence in depth.

 

Single point of failure

Any failure which completely brings down a system for over 12 hours counts as catastrophic.

Yesterday[1], Gatwick Airport suffered a catastrophic failure. It wasn’t Air Traffic Control, it wasn’t security scanners, it wasn’t even check-in desk software, but the flight information boards. Catastrophic? Well, maybe the impact on the functioning of the airport wasn’t catastrophically affected, but the system itself was. For my money, any failure which completely brings down a system for over 12 hours (from 0430 to 1700 BST, reportedly), counts as catastrophic.

The failure has been blamed on damage to a fibre optic cable. It turned out that if this particular component of the system was brought down, then the system failed to operate as expected: it was a single point of failure. Now, in this case, it could be argued that the failure did not have a security impact: this was a resilience problem. Setting aside the fact that resilience and security are often bedfellows[2], many single points of failure absolutely are security issues, as they become obvious points of vulnerability for malicious actors to attack.
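
As a rough illustration of what “single point of failure” means in practice, a system can be modelled as a graph of components, and each component removed in turn to see whether the service still reaches its users.  The sketch below does this by brute force.  The topology and component names are invented for the example (they are an assumption, not a description of the airport’s real architecture).

```python
from collections import deque


def reachable(graph, start, goal, removed=None):
    """Return True if goal can be reached from start, ignoring `removed`."""
    if start == removed or goal == removed:
        return False
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for neighbour in graph.get(node, ()):
            if neighbour != removed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False


def single_points_of_failure(graph, start, goal):
    """List components whose loss alone disconnects start from goal."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    candidates = nodes - {start, goal}
    return sorted(n for n in candidates
                  if not reachable(graph, start, goal, removed=n))


# Hypothetical topology: two servers, but only one cable to the displays.
SYSTEM = {
    "flight_data": ["server_a", "server_b"],
    "server_a": ["fibre_link"],
    "server_b": ["fibre_link"],
    "fibre_link": ["display_boards"],
}

print(single_points_of_failure(SYSTEM, "flight_data", "display_boards"))
# prints ['fibre_link']
```

In this made-up topology the servers are redundant, but the one link is not, which is exactly the kind of thing that is easy to miss when looking at components one at a time.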

A key skill that needs to be grown within IT in general, but security in particular, is systems thinking, as I’ve discussed elsewhere, including in my first post on this blog: Systems security – why it matters. We need more systems engineers, and more systems architects. The role of systems architects, specifically, is to look beyond the single components that comprise a system, and to consider instead the behaviour of the system as a whole. This may mean looking past our first focus to, for instance, hardware or externally managed systems, to consider what the impact of failure, damage or compromise would be on the system’s overall operation.

Single points of failure are particularly awkward.  They crop up in all sorts of places, and they are a very good example of why diversity is important within IT security, and why you shouldn’t trust a single person – including yourself – to be the only person who looks at the security of a system.  My particular biases are towards crypto and software, for instance, so I’m more likely to miss a hardware or network point of failure than somebody with a different background to me.  Not to say that we shouldn’t try to train ourselves to think outside of whatever little box we come from – that’s part of the challenge and excitement of being a systems architect – but an acknowledgement of our own lack of expertise is in itself a realisation of our expertise: if you realise that you’re not an expert, you’re part way to becoming one.

I wanted to finish with an example of a single point of failure that is relevant to security, and exposes a process vulnerability.  The Register has a good write-up of the Foreshadow attack and its impact on SGX, Intel’s Trusted Execution Environment (TEE) capability.  What’s interesting, if the write-up is correct, is that what seems like a small break to a very specific part of the entire security chain means that you suddenly can’t trust anything.  The trust chain is broken, and you have to distrust everything you think you know.  This is a classic security problem – trust is a very tricky set of concepts – and one of the nasty things about it is that it may be entirely invisible to the user that an attack has taken place at all, particularly as the user, at this point, may have no visibility of the chain of trust that has been established – or not – up to the point that they are involved.  There’s a lot more to write about on this subject, but that’s for another day.  For now, if you’re planning to visit an airport, ensure that you have an app on your phone which will tell you your flight departure time and the correct gate.


1 – at time of writing, obviously.

2 – for non-native readers[3], what I mean is that they are often closely related and should be considered together.

3 – and/or those unacquainted with my somewhat baroque language and phrasing habits[4].

4 – I prefer to double-dot when singing or playing Purcell, for instance[5].

5 – this is a very, very niche comment, for which slight apologies.

What’s an attack surface?

“Reduce your attack surface,” they say. But what is it?

“Reduce your attack surface,” they[1] say.  But what is it?  The instruction to reduce your attack surface is one of the principles of IT security, so it must be a Good Thing[tm].  The problem is that it’s not always clear what an attack surface actually is.

I’m going to go for the broadest possible description I can think of, or nearly, because I’m pretty paranoid, and because I’m not convinced that the Wikipedia definition[2] is sufficient[3].  Although I’ll throw in a few examples of how to reduce attack surfaces, the purpose of this post is really to explain what one is, rather than to help protect you – but a good understanding really is required before you start with anything else, so hopefully this will be useful.

So, here’s my start at a definition:

  • The attack surface of a system is the sum of areas where attacks could be launched against it.

That feels a little bit circular – let’s define some terms.  First of all, what’s an “area” in this definition?  Well, I’d say that any particular component of a system may have many points of possible vulnerability – and therefore attack.  The sum of those points is an area – and the sum of the areas of the different components of a system gives us our system’s attack surface.
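
The definition above can be written down as a very small model: each component has a set of points of possible attack, its area is the size of that set, and the system’s attack surface is the sum of the areas.  This is only a sketch.  Counting points equally is a simplifying assumption (real points of attack differ hugely in severity), and the component and point names are invented.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    name: str
    attack_points: set = field(default_factory=set)

    @property
    def area(self):
        # The "area" of a component: how many points of possible attack it exposes.
        return len(self.attack_points)


def attack_surface(components):
    """Sum of the areas of all the components in the system."""
    return sum(component.area for component in components)


# Hypothetical system, purely for illustration.
system = [
    Component("network", {"port 22", "port 443"}),
    Component("usb", {"keyboard", "mouse"}),
    Component("hardware", {"physical access"}),
]

print(attack_surface(system))  # prints 5
```

Removing a point from any component (closing a port, say) reduces that component’s area and therefore the total.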

To understand better, we’re going to have to talk about systems – one of my favourite topics[4] – because I think it’s important to clarify a key difference between the attack surface of a component considered alone, and the area that a component adds when part of a system.  They will not generally be the same.

Here’s an example: you’re deploying an Operating System.  Let’s look at two options for deployment, and compare the attack surfaces.  In both cases, I’m going to take a fairly restricted look at points of vulnerability, excluding, for instance, human factors, as I don’t want to get bogged down in the details.

Deployment one – bare metal

You install your Operating System onto a physical machine, and plug it into the network.  What are some of the attack points?

  • your network connection
  • the physical hardware
  • services which are listening on the network connection
  • connections via USB – keyboard and mouse, for example.

There are more, but this should give us enough to do some comparisons.  I’d generally think of the attack surface as being associated with the physical bounds of the hardware, with the addition of the network port and USB connections.

How can we reduce the attack surface?  Well, we could unplug the network connection – though that might significantly reduce the efficacy of the system! – or we might take steps to reduce the number of services listening on the connection, to reduce the privilege level at which they run, or increase the authentication requirements for connecting to them.  We could reduce our surface area by using a utility such as “usbguard” to restrict USB connections, and, if we’re worried about physical access to the machine, we could put it in a locked cabinet somewhere.  These are all useful and appropriate ways to reduce our system’s attack surface.
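
Before reducing the number of services listening on the network, it helps to know which ones are there.  The following is a minimal sketch using Python’s standard `socket` module to see which TCP ports on a host accept connections.  The function name, default host and port choices are assumptions made for the example, and it should only ever be pointed at machines you are responsible for.

```python
import socket


def listening_tcp_ports(host="127.0.0.1", ports=(22, 80, 443), timeout=0.2):
    """Return the ports on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


if __name__ == "__main__":
    # Check a few well-known ports on the local machine.
    print(listening_tcp_ports())
```

Anything that turns up in the list and isn’t needed is a candidate for being switched off, and anything that is needed is a candidate for lower privileges or stronger authentication.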

Deployment two – a Virtual Machine

In this deployment scenario, we’re going to install the Operating System onto a Virtual Machine (VM), running on a physical host.  What does my attack surface look like now?  Well, that rather depends on how you define your system.  You could, of course, look at the wider system – the VM and the physical host – but for the purposes of this discussion, I’m going to consider that the operation of the Operating System is what we’re interested in, rather than the broader system[6].  So, what does our attack surface look like this time?  Here’s a quick list.

  • your network connection
  • the hypervisor
  • services which are listening on the network connection
  • connections via USB – keyboard and mouse, for example.

You’ll notice that “the physical hardware” is missing from this list, and that’s because it’s been replaced with “the hypervisor”.  This is a little simplistic, for a few reasons, including that the hypervisor is arguably implemented via a combination of software and hardware controls, but it’s certainly different from the entire physical hardware we were talking about before, and in fact, there’s not much you can do from the point of the Virtual Machine to secure it, other than recognise its restrictions, so we might want to remove it from our list at this level.

The other entries are also somewhat different from our first scenario, although you might not realise at first glance.  First, it’s quite likely (though not certain) that your network connection is in fact a virtual network connection provided by the hosting system, which means that some of the burden of defending it goes to the hosting system.  The same goes for the connections via USB – the hypervisor generally provides “virtual hardware” (via something like qemu, for example), which can be attached to – or removed from – virtual machines.

So, you still have the services which are listening on the network connection, but it’s definitely a different attack surface from the first deployment scenario.

Now, if you take the wider view, then there’s definitely an attack surface at the physical machine level as well, and that needs to be considered – but it’s quite likely that this will be under the control of somebody completely different (such as a Cloud Service Provider – CSP).
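
The two lists of attack points can be compared mechanically.  Here is a small sketch using Python sets, with labels abbreviated from the lists above.  As discussed, entries that share a label (such as the network connection) may still differ in practice between the two deployments, and a simple comparison like this will not show that.

```python
BARE_METAL = {
    "network connection",
    "physical hardware",
    "listening services",
    "USB connections",
}

VIRTUAL_MACHINE = {
    "network connection",
    "hypervisor",
    "listening services",
    "USB connections",
}


def compare(before, after):
    """Show which attack points disappear, appear, or remain by name."""
    return {
        "removed": sorted(before - after),
        "added": sorted(after - before),
        "shared": sorted(before & after),
    }


difference = compare(BARE_METAL, VIRTUAL_MACHINE)
print(difference["removed"])  # prints ['physical hardware']
print(difference["added"])    # prints ['hypervisor']
```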

Another quick example

When I deploy a webserver (using, for instance, Apache), I’ll need to consider a variety of attack vectors, from authentication to denial of service to storage attacks: these are part of our attack surface.  If I deploy it with a database (e.g. PostgreSQL or MySQL), the attack surface looks different, assuming that I care about the data in the database.  Whereas I might previously have been concerned to ensure that an HTTP “PUT” command didn’t overwrite or scramble a file on my filesystem, a malformed command to my database server could delete or corrupt multiple tables.  On the other hand, I might now be able to lock down some of the functions of my webserver so that I no longer need to worry about filesystem attacks.  The attack surface of my webserver is different when it’s combined in a system with other components[7].

Why do I want to reduce my attack surface?

Well, this is quite an easy one.  By looking back at my earlier definition, you’ll see that the smaller a system’s attack surface, the fewer points of attack there are available to malicious actors.  That’s got to be a piece of good news.

You will, of course, never be able to reduce your attack surface to zero (see There are no absolutes in security), but the more you reduce (and document, always document!), the better position you’ll be in.  It’s always about raising the bar to make it more difficult for malicious actors to affect you.


1 – the mythical IT Security Community, that’s who.

2 – to give one example.

3 – it only talks about data, and only about software: that’s not broad enough for me.

4 – as long-standing[5] readers of this blog will know.

5 – and long-suffering.

6 – yes, I know we can’t ignore that, but we’ll come back to it, honest.

7 – there are considerations around the attack surface of the database as well, of course.

What’s your availability? DoS attacks and more

In security we talk about intentional degradation of availability

A colleague of mine recently asked me about protection from DoS (Denial of Service) attacks[1] for a project with which he’s involved.  The first thing that sprang to mind, of course, was DDoS: Distributed Denial of Service attacks, where hundreds or thousands[2] of hosts are used to send vast amounts of network traffic to – or maybe more accurately “at” – servers in the hopes of bringing the servers to their knees and stopping them providing the service for which they’re designed.  These are the attacks that get into the news, and with good reason.

There are other types of DoS however, and the more I thought about it, the more I wondered whether he – and I – should be worrying about these other DoS attacks and also considering other related types of issue which could cause problems to systems.  And because I realised it was an interesting topic, I decided to write about it[3].

I’m going to return to the classic “C.I.A.” model of computer security: Confidentiality, Integrity and Availability.  The attacks we’re talking about here are those most often overlooked: attempts to degrade the availability of a service.  There’s an overlap with the related discipline of resilience here, but I think that the key differentiator is that in security we’re generally talking about intentional degradation of availability, whereas resilience also covers (and maybe focuses on) unintentional degradation.

So, what types of availability attacks might we want to consider?

Denial of service attacks

I think it’s worth linking to Wikipedia’s pretty awesome entry “Denial of service attack” – not something I often do, but I thought it was excellent.  Although they’re not mutually exclusive at all, here are some of the key types as I’d define them:

  • Distributed DoS – where you have lots of different hosts attacking at the same time, flooding the target with traffic.  These days, this can be easily automated, and it’s possible to rent compromised machines to perform a coordinated attack.
  • Application layer – where the attack is aimed at the service, rather than at the host beneath.  This may seem like an academic distinction, but it’s not: what it really means is that the attack is performed with knowledge of the application layer.  So, for instance, if you’re attacking a web server, you might initiate lots of HTTP sessions, or if you were attacking a Kerberos server, you might request lots of authentication tickets.  These types of attacks may be quite costly to perform, but they’re also difficult to protect against, as each attack looks like a “legal” interaction with the service, and unless you’re on the look-out in a way which is typically not automated at this level, they’re difficult to avoid.
  • Host level – this is a family of attacks which go for the host and/or associated Operating System, rather than the service itself.  A classic attack would be the SYN flood, which misuses the TCP protocol to use up resources on the host, thereby stopping any associated services from being able to respond.  Host attacks may be somewhat simpler to defend against, as it’s easier to invest in logic to detect them at this level (or maybe “set of layers”, if we adopt the OSI model), and to correlate responses across different hosts.  Firewalls and similar defences are also more likely to be able to be configured to help defend hosts which may be targeted.
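
Being “on the look-out” at the application layer can be partly automated.  One simple approach, which is an assumption of this sketch rather than anything prescribed above, is to count requests per client over a sliding time window and flag clients that exceed a threshold.  The class name, parameters and client identifiers are all hypothetical.

```python
import time
from collections import defaultdict, deque


class RequestMonitor:
    """Flag clients that make more than `limit` requests in `window` seconds."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock                  # injectable, which makes testing easy
        self._history = defaultdict(deque)  # client id -> recent request times

    def record(self, client_id):
        """Record a request; return True if the client is over the limit."""
        now = self.clock()
        timestamps = self._history[client_id]
        timestamps.append(now)
        # Drop timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] > self.window:
            timestamps.popleft()
        return len(timestamps) > self.limit


monitor = RequestMonitor(limit=100, window=60)
if monitor.record("203.0.113.7"):
    print("client is making an unusual number of requests")
```

This only catches a single noisy client.  A distributed attack spreads its requests across many clients, each of which may stay under the threshold, which is part of why these attacks are hard to defend against.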

Resource starvation

The term “resource starvation” most accurately refers[4] to situations where a process (or application) is denied sufficient CPU allocation to perform correctly.  How could this occur?  Well, it’s going to be rarer than in the DoS case, because in order to do it, you’re going to need some way to impact the underlying scheduling of the Operating System and/or virtualisation management (think hypervisor, typically).  That would normally mean that you’d need pretty low-level access to the machine, but there is a family of attacks known as “noisy neighbour”[5] where workloads – VMs or containers, typically – use up so many resources that other workloads are starved.

However, partly because of this case, I’d argue that resource starvation can usefully be associated with other types of availability attacks which occur locally to the machine hosting the targeted service, which might be related to CPU, file descriptor, network or other resources.

Generally, noisy neighbour attacks can be fairly easily mitigated by controls in the Operating System or virtualisation manager, though, of course, compromised or malicious components at this layer are very difficult to manage.

 

Dependency blocking

I’m not sure what the best term for this type of attack is, but what I’m thinking of is attacks which impact a service by reducing or removing access to external services on which it depends – remote components, if you will.  If, for instance, my web application requires access to a database, then an attack on that database – however performed – will impact my service.  As almost any kind of service will have external dependencies these days[6], this can be a very effective attack, as it allows knowledgeable attackers to target the weakest link in the “chain” of components that make up your service.

There are mitigations against some of these attacks – caching and later reconciliation/synching being one – but identifying and defending against these sorts of attacks depends largely on considering your service as a system, and realising the types of impact degradation of the different parts might have.
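
To make the caching mitigation a little more concrete, here is a minimal sketch of a wrapper that remembers the last good answer from a remote dependency and serves it if the dependency becomes unreachable, keeping a note of what will need reconciling later.  The class and its names are hypothetical, and treating `ConnectionError` as the only sign of failure is an assumption made for brevity.

```python
class CachedDependency:
    """Wrap a remote lookup and fall back to the last good value on failure."""

    def __init__(self, fetch):
        self._fetch = fetch        # callable that talks to the remote service
        self._cache = {}
        self.stale_keys = set()    # keys served from cache, to reconcile later

    def get(self, key):
        try:
            value = self._fetch(key)
        except ConnectionError:
            if key not in self._cache:
                raise              # nothing cached, so the failure is visible
            self.stale_keys.add(key)
            return self._cache[key]
        self._cache[key] = value
        self.stale_keys.discard(key)
        return value
```

Whether serving stale data is acceptable depends entirely on the service: for some data it is a reasonable way to degrade, and for other data (account balances, say) it may be worse than failing outright.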

 

Conclusion – managed degradation

Which leads me to a final point, which is that when considering availability attacks, understanding and planning for service degradation (see Service degradation: actually a good thing) is going to be invaluable – and when you’ve done that, you’re definitely going to need to test it, too (see If it isn’t tested, it doesn’t work).

 


1 – yes, I checked the capitalisation – he wasn’t worried about DRDOS, MS-DOS or any of those lovely 80s era command line Operating Systems.

2 – or millions or more, these days.

3 – here, for the avoidance of doubt.

4 – I believe.

5 – you know my policy on spellings by now.  I’m British, and we’ll keep it that way.

6 – unless you’re still using green-screen standalone machines to run your business, in which case either a) yikes or b) well done.

There are no absolutes in security

There is no “secure”.

Let’s stop using the word “secure”. There is no “secure” in IT.

I know that sounds crazy, but it’s true.

Sometimes, when I speak to colleagues and customers, there will be non-technical or non-security people there, and they ask how to get a secure system. So I explain how I’d make a system secure. It goes a bit like this.

  1. Remove any non-critical USB connections: in particular external or “thumb” drives.
  2. Turn off all bluetooth.
  3. Turn off all wifi.
  4. Remove any network cables.
  5. Remove any other USB connections, including mouse or keyboard.
  6. Disconnect any monitors.
  7. Disconnect any other cables that are connected to the system.
  8. Yes, that includes the power cable.
  9. Now take out any hard drives – SSD, HDD or other.
  10. Destroy them. My preferred method is to gouge tracks in all spinning media, break the heads, bash all pieces with a hammer and then throw them into Mount Doom, but any other volcano[1] will do. Thermite lances are probably acceptable. You should do the same with all other components that you removed in earlier steps.
  11. Destroy the motherboard, including all chips and RAM.
  12. Tip all remaining pieces down a well.
  13. Pour concrete down the well.[2]
  14. You probably now have a system which is about as secure as you’re going to get.

Yes, it’s a bit extreme, but the point is that all of the components there are possible threat vectors or information leakage channels.

Can we design and operate a system where we manage and mitigate the risks of threats and information leakage? Yes. That’s where we improve the security of a system. Is that a secure system? No, it’s not. What we’ve done is raise the bar, but we’ve not made it absolutely secure.

Part of the problem is that there’s just no way, these days[4], that any single person can be certain of the security of all parts of a system: there are just too many, and they are too complex. You may understand the application layer, but what about the virtualisation layer, for instance? I presented a simplified layer diagram in my post Isolationism a few months back, in which I listed the host as the bottom layer, but that was, of course, just asking for trouble. Along came Meltdown and Spectre, and now it’s clear (as if we didn’t know it already) that you should never ignore the fact that you can’t even trust the silicon you’re running on to do the thing you think it ought.

None of this, however, stops people and companies telling you that they’ll “secure your perimeter”, or provide you with “secure systems”. And it annoys me[5]. “We’ll help you secure your perimeter” isn’t too bad, but anything that suggests that you can have “secure systems” smacks to me of marketing – bad marketing.

So here you go: please stop using the word “secure” as an unqualified adjective or verb. We’re grown-ups, now, and we know it’s not real. So let’s not pretend.

Now – where was that well-cover? I need to deal with little Tommy.


1 – terrestrial/Middle Earth. I’m not sure about volcano temperatures on other planets or in the Undying Lands across the Western Sea.

2 – it should probably therefore be a disused well. Check there are no animals down there first[3]. In fact, before you throw anything down there.

3 – what’s that, Lassie? Little Tommy’s down the well? Well, I wonder whether little Tommy is waiting for us to throw the components down there so that he can do bad things. Bad Tommy.

4 – I’d like to think that maybe there was, once, in the distant past, but I’m probably kidding myself.

5 – you might be surprised at the number of things that annoy me[6].

6 – unless you’re my wife, in which case you probably won’t be[7].

7 – surprised. Or, in fact, reading this article.

Explained: five misused security words

Untangling responsibility, authority, authorisation, authentication and identification.

I took them out of the title, because otherwise it was going to be huge, with lots of polysyllabic words.  You might, therefore, expect a complicated post – but that’s not my intention*.  What I’d like to do is try to explain these five important concepts in security, as they’re often confused or bound up with one another.  They are, however, separate concepts, and it’s important to be able to disentangle what each means, and how they might be applied in a system.  Today’s words are:

  • responsibility
  • authority
  • authorisation
  • authentication
  • identification.

Let’s start with responsibility.

Responsibility

Confused with: function; authority.

If you’re responsible for something, it means that you need to do it, or to answer for it if something goes wrong.  You can be responsible for a product launching on time, or for the smooth functioning of a team.  If we’re going to ensure we’re really clear about it, I’d suggest using it only for people.  It’s not usually a formal description of a role in a system, though it’s sometimes used as short-hand for describing what a role does.  This short-hand can be confusing.  “The storage module is responsible for ensuring that writes complete transactionally” or “the crypto here is responsible for encrypting this set of bytes” is just a description of the function of the component, and doesn’t truly denote responsibility.

Also, just because you’re responsible for something doesn’t mean that you can make it happen.  One of the most frequent confusions, then, is with authority.  If you can’t ensure that something happens, but it’s your responsibility to make it happen, you have responsibility without authority***.

Authority

Confused with: responsibility, authorisation.

If you have authority over something, then you can make it happen****.  This is another word which is best restricted to use about people.  Just as it is possible to have responsibility without authority, it is possible to have authority but no responsibility*****.

Once we start talking about systems, phrases like “this component has the authority to kill these processes” really mean “has sufficient privilege within the system”, and are best avoided.  What we may need to check, however, is whether a component should be given authorisation to hold a particular level of privilege, or to perform certain tasks.

Authorisation

Confused with: authority; authentication.

If a component has authorisation to perform a certain task or set of tasks, then it has been granted power within the system to do those things.  It can be useful to think of roles and personae in this case.  If you are modelling a system on personae, then you will wish to grant a particular role authorisation to perform tasks that, in real life, the person modelled by that role has the authority to do.  Authorisation is an instantiation or realisation of that authority.  A component is granted the authorisation appropriate to the person it represents.  Not all authorisations can be so easily mapped, however, and may be more granular.  You may have a file manager which has authorisation to change a read-only permission to read-write: something you might struggle to map to a specific role or persona.
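As a minimal sketch of that distinction, here is one way the grants might be recorded, assuming a simple in-memory table.  All of the names here (`Task`, the `auditor` and `editor` roles, `file_manager`, `is_authorised`) are hypothetical, invented for illustration: role grants model a persona's real-life authority, while the file manager's grant is the more granular kind that doesn't map neatly to a person.

```python
from enum import Enum, auto


class Task(Enum):
    """Hypothetical tasks a role or component might be authorised to perform."""
    READ_FILE = auto()
    WRITE_FILE = auto()
    CHANGE_PERMISSIONS = auto()


# Role-based grants: each role models a persona, and carries the
# authorisations that match that person's real-life authority.
ROLE_AUTHORISATIONS = {
    "auditor": frozenset({Task.READ_FILE}),
    "editor": frozenset({Task.READ_FILE, Task.WRITE_FILE}),
}

# A more granular grant that does not map neatly to a persona: the
# file manager component may change read-only to read-write.
COMPONENT_AUTHORISATIONS = {
    "file_manager": frozenset({Task.CHANGE_PERMISSIONS}),
}


def is_authorised(name: str, task: Task) -> bool:
    """Return True if the named role or component has been granted the task."""
    granted = (
        ROLE_AUTHORISATIONS.get(name, frozenset())
        | COMPONENT_AUTHORISATIONS.get(name, frozenset())
    )
    return task in granted
```

Anything not explicitly granted is refused, so an unknown role or component has no authorisations at all.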

If authorisation is the granting of power or capability to a component representing a person, the question that precedes it is “how do I know that I should grant that power or capability to this person or component?”.  That process is authentication – authorisation should be the result of a successful authentication.

Authentication

Confused with: authorisation; identification.

If I’ve checked that you’re allowed to perform an action, then I’ve authenticated you: this process is authentication.  A system, then, before granting authorisation to a person or component, must check that they should be allowed the power or capability that comes with that authorisation – that it is appropriate to their role.  Successful authentication leads to authorisation.  Unsuccessful authentication leads to blocking of authorisation******.

With the exception of anonymous roles (which can be appropriate for some capabilities within some systems), the core of an authentication process is checking that the person or component is who or what they claim to be.  This checking of a claimed identity is authentication, whereas identification is the claim itself and the mapping of an identity to a role.

Identification

Confused with: authentication.

I can identify that a particular person exists without being sure that the specific person in front of me is that person.  They may identify themselves to me – this is identification – and the checking that they are who they profess to be is the authentication step.  In systems, we need to map a known identity to the appropriate capabilities, and the presentation of a component with identity allows us to apply the appropriate checks to instantiate that mapping.
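To make the split concrete, here is a rough sketch in which identification and authentication are separate calls.  The `IdentityStore` class and its method names are my own invention, and the shared-secret check (using Python's standard `hashlib.pbkdf2_hmac` and `hmac.compare_digest`) is just one assumed way of checking a claim: a real system might use certificates, tokens or something else entirely.

```python
import hashlib
import hmac
import os


def _derive(secret: str, salt: bytes) -> bytes:
    """Derive a key from the secret so the secret itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", secret.encode("utf-8"), salt, 100_000)


class IdentityStore:
    """Hypothetical in-memory store of known identities."""

    def __init__(self) -> None:
        self._records = {}  # identity -> (salt, derived key)

    def register(self, identity: str, secret: str) -> None:
        salt = os.urandom(16)
        self._records[identity] = (salt, _derive(secret, salt))

    def identify(self, claimed_identity: str) -> bool:
        """Identification: does the claimed identity exist at all?"""
        return claimed_identity in self._records

    def authenticate(self, claimed_identity: str, secret: str) -> bool:
        """Authentication: is the claimant really that identity?"""
        record = self._records.get(claimed_identity)
        if record is None:
            return False
        salt, expected = record
        # Constant-time comparison of the derived keys.
        return hmac.compare_digest(_derive(secret, salt), expected)
```

Note that `identify` can succeed while `authenticate` fails: knowing that an identity exists says nothing about whether the party in front of you is entitled to claim it.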

Bringing it all together

Just because you know who I am doesn’t mean that you’re going to let me do something.  I can identify my children over the telephone*******, but that doesn’t mean that I’m going to authorise them to use my credit card********.  Let’s say, however, that I might give my wife my online video account password over the phone, but not my children.  How might the steps in this play out?

First of all, I have responsibility to ensure that my account isn’t abused.  I also have authority to use it, as granted by the Terms and Conditions of the providing company (I’ve decided not to mention a particular service here, mainly in case I misrepresent their Ts&Cs).

“Hi, darling, it’s me, your darling wife**********. I need the video account password.” Identification – she has told me who she claims to be, and I know that such a person exists.

“Is it really you, and not one of the kids?  You’ve got a cold, and sound a bit odd.”  This is my trying to do authentication.

“Don’t be an idiot, of course it’s me.  Give it to me or I’ll pour your best whisky down the drain.”  It’s her.  Definitely her.

“OK, darling, here’s the password: it’s il0v3myw1fe.”  By giving her the password, I’ve performed authorisation.

It’s important to understand these different concepts, as they’re often conflated or confused: if you can’t separate them, it’s difficult not only to design systems to function correctly, but also to log and audit the different processes as they occur.
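For those who prefer code to phone calls, here is a hedged sketch of the same conversation, with each step kept separate and logged separately.  Everything in it is made up for illustration: the `KNOWN_IDENTITIES` table, the proofs, the capability name and the `request_capability` function are assumptions, and the plain-text proofs stand in for whatever credential check a real system would use.

```python
import hmac
import logging

logger = logging.getLogger("access")

# Hypothetical records: identity -> (proof expected at authentication,
# capabilities granted once authentication succeeds).
KNOWN_IDENTITIES = {
    "wife": ("threatens-the-whisky", {"video_account_password"}),
    "child": ("asks-for-pocket-money", set()),
}


def request_capability(claimed_identity: str, proof: str, capability: str) -> bool:
    """Run identification, authentication and authorisation as distinct, logged steps."""
    # Identification: a claim is made, and we check such an identity exists.
    record = KNOWN_IDENTITIES.get(claimed_identity)
    if record is None:
        logger.warning("identification failed: unknown identity %r", claimed_identity)
        return False
    logger.info("identification: claim to be %r", claimed_identity)

    # Authentication: check that the claimant really is who they say.
    expected_proof, capabilities = record
    if not hmac.compare_digest(proof.encode("utf-8"), expected_proof.encode("utf-8")):
        logger.warning("authentication failed for %r", claimed_identity)
        return False
    logger.info("authentication succeeded for %r", claimed_identity)

    # Authorisation: grant only what this identity is allowed to have.
    if capability not in capabilities:
        logger.warning("authorisation refused: %r may not have %r", claimed_identity, capability)
        return False
    logger.info("authorisation granted: %r receives %r", claimed_identity, capability)
    return True
```

Because each step has its own log line, an auditor can tell an unknown caller from a failed check from a perfectly genuine child who simply isn't getting the password.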


*we’ll have to see how well I manage, however.  I know that I’m prone to long-windedness.**

**ask my wife.  Or don’t.

***and a significant problem.

****in a perfect world.  Sometimes people don’t do what they ought to.

*****this is much, much nicer than responsibility without authority.

******and logging.  In both cases.  Lots of logging.  And possibly flashing lights, security guards and sirens on failure, if you’re into that sort of thing.

*******most of the time: sometimes they sound like my wife.  This is confusing.

********nor should you assume that I’m going to let my wife use it.*********

*********not to suggest that she can’t use a credit card: it’s just that we have separate ones, mainly for logging purposes.

**********we don’t usually talk like this on the phone.