Fallback is not your friend – learning from TLS

The problem with standards is that people standardise on them.

People still talk about “SSL encryption”, which is so last century.  SSL – Secure Sockets Layer – was superseded by TLS – Transport Layer Security – in 1999 (which is when the TLS 1.0 specification was published).  TLS 1.1 came out in 2006.  That’s 12 years ago, which is, like, well, ages and ages in Internet time.  The problem with standards, however, is that people standardise on them.  It turns out that TLS v1.0 and v1.1 have some security holes in them which mean that you shouldn’t use them – but not only did people use them, they wrote their specifications and implementations to do so, and to continue doing so.  In fact, as reported by The Register, the IETF is leading a call for TLS v1.0 and v1.1 to be fully deprecated: in other words, to tell people that they definitely, really, absolutely shouldn’t be using them.

Given that TLS v1.2 came out in August 2008, nearly ten years ago, and that TLS v1.3 is edging towards publication now, surely nobody is still using TLS v1.0 or v1.1 anyway, right?

Answer – it doesn’t matter whether they are or not

This sounds a bit odd as an answer, but bear with me.  Let’s say that you have implemented a new website, a new browser, or a client for some TLS-supporting server.  You’re a good and knowledgeable developer[1], or you’re following the designs from your good and knowledgeable architect or the tests from your good and knowledgeable test designer[2].  You are, of course, therefore using the latest version of TLS, which is v1.2.  You’ll be upgrading to v1.3 just as soon as it’s available[3].  You’re safe, right?  None of those nasty vulnerabilities associated with v1.0 or v1.1 can come and bite you, because all of the other clients or servers to which you’ll be connecting will be using v1.2 as well.

But what if they aren’t?  What about legacy clients or old server versions?  Wouldn’t it just be better to allow them to say “You know what?  I don’t speak v1.2, so let’s speak a version that I do support, say v1.0 instead?”

STOP.

This is fallback, and although it seems like the sensible, helpful thing to do, once you support it – and it’s easy, as many TLS implementations allow it – then you’ve got problems.  You’ve just made a decision to sacrifice security for usability – and worse, what you’re actually doing is opening the door for attackers not just to do bad things if they come across legitimate connections, but actually to pretend to be legitimate, force you to downgrade the protocol version to one they can break, and then do bad things.  Of course, those bad things will depend on a variety of factors, including the protocol version, the data being passed, cipher suites in use, etc., but this is a classic example of extending your attack surface when you don’t need to.
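
Refusing fallback is, happily, just as easy as allowing it.  Here’s a minimal sketch of a client connection using Python’s standard ssl module (Python 3.7 or later assumed, and example.com standing in for whatever host you’re actually talking to): anything below TLS v1.2 is simply off the table, so a downgrade can’t be forced on you.

    import socket
    import ssl

    # Refuse to negotiate anything below TLS v1.2: no fallback, no downgrade.
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2

    with socket.create_connection(("example.com", 443)) as sock:
        with ctx.wrap_socket(sock, server_hostname="example.com") as tls:
            print("negotiated:", tls.version())   # e.g. "TLSv1.2" or "TLSv1.3"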

The two problems are:

  1. it feels like the right thing to do;
  2. it’s really easy to implement.

The right thing to do?

Ask yourself the question: do I really need to accept connections from old clients, or to make connections to old servers?  If I do, should I not at least throw an error[4]?  Is this really required, or does it just feel like the right thing to do?  That’s a question which should be put to the security folks on your project team[5].  Would it not be safer to upgrade the old clients and servers?  In lots of cases, the answer is yes – or at least ensure that you deprecate support in one release, and then remove it in the next.
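
If you do decide to cut the old versions off, it’s worth making the rejection visible.  Here’s a rough server-side sketch – Python’s ssl module again (3.8 or later), with hypothetical certificate file names – that refuses anything below TLS v1.2 and logs the failed handshakes, so you have some data on who still needs upgrading before you remove support for good.

    import socket
    import ssl

    # Only speak TLS v1.2 and above; log rejected handshakes so you know who
    # is still stuck on older versions before support is removed entirely.
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.load_cert_chain("server.crt", "server.key")    # hypothetical paths

    with socket.create_server(("", 8443)) as server:
        while True:
            conn, addr = server.accept()
            try:
                with ctx.wrap_socket(conn, server_side=True) as tls:
                    tls.sendall(b"hello, modern client\n")
            except ssl.SSLError as err:
                # typically "unsupported protocol" from a v1.0/v1.1 client
                print(f"rejected handshake from {addr[0]}: {err}")

It’s a sketch, not a production server, but the principle – fail loudly, and measure before you remove – is the point.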

The easy implementation?

According to a survey performed by Qualys SSL Labs in May 2018 of around 150,000 of the most popular websites in the world, here is the breakdown of SSL/TLS versions supported:

  • TLS v1.2 – 92.3%
  • TLS v1.1 – 83.5%
  • TLS v1.0 – 83.4%
  • SSL v3.0 – 10.7%
  • SSL v2.0 – 2.8%

Yes, you’re reading that correctly: over ten percent of the most popular 150,000 websites in the world support SSL v3.0.  You don’t even want to think about that, believe me.  Now, there may be legitimate reasons for some of those websites to support legacy browsers, but you can bet your hat that a good number of them have left support in for some of these versions simply because it’s more effort to turn it off than to leave it on.  This is lazy programming, and it’s insecure practice.  Don’t do it.
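
If you want to check where your own sites stand, a quick-and-dirty probe is easy to sketch – Python’s ssl module once more, with example.com standing in for a host you actually operate.  One caveat: if your local OpenSSL build has already disabled the older versions, the probe will report them as refused (or unsupported) regardless of what the server would accept, so treat the results as indicative rather than definitive.

    import socket
    import ssl

    HOST = "example.com"    # substitute a host you operate

    for version in (ssl.TLSVersion.TLSv1, ssl.TLSVersion.TLSv1_1,
                    ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_3):
        ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE    # we only care whether the handshake works
        try:
            ctx.minimum_version = version  # pin the probe to exactly one version
            ctx.maximum_version = version
        except ValueError:
            print(f"{version.name}: not supported by the local OpenSSL build")
            continue
        try:
            with socket.create_connection((HOST, 443), timeout=5) as sock:
                with ctx.wrap_socket(sock, server_hostname=HOST):
                    result = "accepted"
        except (ssl.SSLError, OSError):
            result = "refused"
        print(f"{version.name}: {result}")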

Conclusion

TLS isn’t the only protocol that supports fallback[8], and there can be good reasons for allowing it.  But when the protocol you’re using is a security protocol, and the new version fixes security problems in the old version(s), then you should really avoid it unless you absolutely have to allow it.


1 – the only type of developer to read this blog.

2 – I think you get the idea.

3 – of course you will.

4 – note – throwing security-based errors that users understand and will act on is notoriously difficult.

5 – if you don’t have any security folks on your design team, find some[6].

6 – or at least one, for pity’s sake[7].

7 – or nominate yourself and do some serious reading…

8 – I remember having to hand-code SOCKS v2.0 to SOCKS v1.0 auto-fallback into a client implementation way back in 1996 or so, because the SOCKS v2.0 protocol didn’t support it.

What’s an attack surface?

“Reduce your attack surface,” they say. But what is it?

“Reduce your attack surface,” they[1] say.  But what is it?  The instruction to reduce your attack surface is one of the principles of IT security, so it must be a Good Thing[tm].  The problem is that it’s not always clear what an attack surface actually is.

I’m going to go for the broadest possible description I can think of, or nearly, because I’m pretty paranoid, and because I’m not convinced that the Wikipedia definition[2] is sufficient[3].  Although I’ll throw in a few examples of how to reduce attack surfaces, the purpose of this post is really to explain what one is, rather than to help protect you – but a good understanding really is required before you start with anything else, so hopefully this will be useful.

So, here’s my start at a definition:

  • The attack surface of a system is the sum of areas where attacks could be launched against it.

That feels a little bit circular – let’s define some terms.  First of all, what’s an “area” in this definition?  Well, I’d say that any particular component of a system may have many points of possible vulnerability – and therefore attack.  The sum of those points is an area – and the sum of the areas of the different components of a system gives us our system’s attack surface.
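
If it helps to make the “sum of areas” idea concrete, here’s a toy sketch in Python – the component names and attack points are entirely made up, and real areas are of course much harder to enumerate than this:

    def attack_surface(system):
        """A system's attack surface: the union of every component's attack points."""
        return set().union(*system.values())

    # Entirely made-up components and points, purely to illustrate the definition.
    system = {
        "network":  {"open port 443", "open port 22"},
        "hardware": {"USB sockets", "physical console"},
        "services": {"web admin UI", "ssh daemon"},
    }

    print(len(attack_surface(system)), "attack points")   # 6

    # "Reducing the attack surface" is then just shrinking those sets:
    system["services"].discard("web admin UI")             # turn off something unneeded
    print(len(attack_surface(system)), "attack points")   # 5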

To understand better, we’re going to have to talk about systems – one of my favourite topics[4] – because I think it’s important to clarify a key difference between the attack surface of a component considered alone, and the area that a component adds when part of a system.  They will not generally be the same.

Here’s an example: you’re deploying an Operating System.  Let’s look at two options for deployment, and compare the attack surfaces.  In both cases, I’m going to take a fairly restricted look at points of vulnerability, excluding, for instance, human factors, as I don’t want to get bogged down in the details.

Deployment one – bare metal

You install your Operating System onto a physical machine, and plug it into the network.  What are some of the attack points?

  • your network connection
  • the physical hardware
  • services which are listening on the network connection
  • connections via USB – keyboard and mouse, for example.

There are more, but this should give us enough to do some comparisons.  I’d generally think of the attack surface as being associated with the physical bounds of the hardware, with the addition of the network port and USB connections.

How can we reduce the attack surface?  Well, we could unplug the network connection – though that might significantly reduce the efficacy of the system! – or we might take steps to reduce the number of services listening on the connection, to reduce the privilege level at which they run, or increase the authentication requirements for connecting to them.  We could reduce our surface area by using a utility such as “usbguard” to restrict USB connections, and, if we’re worried about physical access to the machine, we could put it in a locked cabinet somewhere.  These are all useful and appropriate ways to reduce our system’s attack surface.
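
To get a feel for the “services listening” part of that area, it can be instructive simply to enumerate what’s listening.  Here’s a sketch using the third-party psutil library (you’ll need to install it, and on some systems you’ll need elevated privileges to see which process owns each socket):

    import psutil   # third-party: pip install psutil

    listening = [c for c in psutil.net_connections(kind="inet")
                 if c.status == psutil.CONN_LISTEN]

    for conn in sorted(listening, key=lambda c: c.laddr.port):
        owner = psutil.Process(conn.pid).name() if conn.pid else "unknown"
        print(f"{conn.laddr.ip}:{conn.laddr.port}  ({owner})")

Every line of that output is an attack point that you should either be able to justify or turn off.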

Deployment two – a Virtual Machine

In this deployment scenario, we’re going to install the Operating System onto a Virtual Machine (VM), running on a physical host.  What does my attack surface look like now?  Well, that rather depends on how you define your system.  You could, of course, look at the wider system – the VM and the physical host – but for the purposes of this discussion, I’m going to consider that the operation of the Operating System is what we’re interested in, rather than the broader system[6].  So, what does our attack surface look like this time?  Here’s a quick list.

  • your network connection
  • the hypervisor
  • services which are listening on the network connection
  • connections via USB – keyboard and mouse, for example.

You’ll notice that “the physical hardware” is missing from this list, and that’s because it’s been replaced by “the hypervisor”.  This is a little simplistic, for a few reasons, including that the hypervisor is arguably implemented via a combination of software and hardware controls, but it’s certainly different from the entire physical hardware we were talking about before, and in fact, there’s not much you can do from the point of view of the Virtual Machine to secure it, other than recognise its restrictions, so we might want to remove it from our list at this level.

The other entries are also somewhat different from our first scenario, although you might not realise it at first glance.  First, it’s quite likely (though not certain) that your network connection may in fact be a virtual network connection provided by the hosting system, which means that some of the burden of defending it falls to the hosting system.  The same goes for the connections via USB – the hypervisor generally provides “virtual hardware” (via something like qemu, for example), which can be attached to – or removed from – virtual machines.

So, you still have the services which are listening on the network connection, but it’s definitely a different attack surface from the first deployment scenario.

Now, if you take the wider view, then there’s definitely an attack surface at the physical machine level as well, and that needs to be considered – but it’s quite likely that this will be under the control of somebody completely different (such as a Cloud Service Provider – CSP).

Another quick example

When I deploy a webserver (using, for instance, Apache), I’ll need to consider a variety of attack vectors, from authentication to denial of service to storage attacks: these are part of our attack surface.  If I deploy it with a database (e.g. PostgreSQL or MySQL), the attack surface looks different, assuming that I care about the data in the database.  Whereas I might previously have been concerned to ensure that an HTTP “PUT” command didn’t overwrite or scramble a file on my filesystem, a malformed command to my database server could delete or corrupt multiple tables.  On the other hand, I might now be able to lock down some of the functions of my webserver, so that I no longer need to worry about filesystem attacks.  The attack surface of my webserver is different when it’s combined in a system with other components[7].

Why do I want to reduce my attack surface?

Well, this is quite an easy one.  By looking back at my earlier definition, you’ll see that the smaller a system’s attack surface, the fewer points of attack there are available to malicious actors.  That’s got to be a piece of good news.

You will, of course, never be able to reduce your attack surface to zero (see There are no absolutes in security), but the more you reduce (and document, always document!), the better position you’ll be in.  It’s always about raising the bar to make it more difficult for malicious actors to affect you.


1 – the mythical IT Security Community, that’s who.

2 – to give one example.

3 – it only talks about data, and only about software: that’s not broad enough for me.

4 – as long-standing[5] readers of this blog will know.

5 – and long-suffering.

6 – yes, I know we can’t ignore that, but we’ll come back to it, honest.

7 – there are considerations around the attack surface of the database as well, of course.

Defending our homes

Your router is your first point of contact with the Internet: how insecure is it?

I’ve always had a problem with the t-shirt that reads “There’s no place like 127.0.0.1”. I know you’re supposed to read it “home”, but to me, it says “There’s no place like localhost”, which just doesn’t have the same ring to it. And in this post, I want to talk about something broader: the entry-point to your home network, which for most people will be a cable or broadband router[1].  The UK and US governments just published advice that “Russia”[2] is attacking routers.  This attack will be aimed mostly, I suspect, at organisations (see my previous post What’s a State Actor, and should I care?), rather than homes, but it’s a useful wake-up call for all of us.

What do routers do?

Routers are important: they provide the link between one network (in this case, our home network) and another one (in this case, the Internet, via our ISP’s network).  In fact, for most of us, the box we think of as “the router”[3] is doing a lot more than that.  The “routing” bit is what it sounds like: it helps computers on your network to find routes to send data to computers outside the network – and vice-versa, for when you’re getting data back.  But most routers will actually be doing more than that.  The other purpose that many will be performing is that of a modem.  Most of us[4] connect to the Internet via a phoneline – whether cable or standard landline – though there is a growing trend for mobile Internet to the home.  Where you’re connecting via a phone line, there’s a need to convert the signals that we use for the Internet to something else and then (at the other end) back again.  For those of us old enough to remember the old “dial-up” days, that’s what the screechy box next to your computer used to do.

But routers often do more things as well.  Sometimes many more things, including traffic logging, being a WiFi access point, providing a VPN for external access to your internal network, child access controls, firewalling and all the rest.

Routers are complex things these days, and although state actors may not be trying to get into them, other people may.

Does this matter, you ask?  Well, if other people can get into your system, they have easy access to attacking your laptops, phones, network drives and the rest.  They can access and delete unprotected personal data.  They can plausibly pretend to be you.  They can use your network to host illegal data or launch attacks on others.  Basically, all the bad things.

Luckily, routers tend to come set up by your ISP, with the implication being that you can leave them as they are, and they’ll be nice and safe.

So we’re safe, then?

Unluckily, we’re really not.

The first problem is that the ISPs are working on a budget, and it’s in their best interests to provide cheap kit which just does the job.  The quality of ISP-provided routers tends to be pretty terrible.  They’re also high on the list of targets for malicious actors: if they know that a particular router model will be installed in several million homes, there’s a great incentive to find an attack, as an attack on that model will be very valuable to them.

Other problems that arise include:

  • slowness to fix known bugs or vulnerabilities – firmware updates can be costly for your ISP to roll out, so they may be slow to arrive (if they arrive at all);
  • easily-derived or default admin passwords, meaning that attackers don’t even need to find a real vulnerability – they can just log in.

 

Measures to take

Here’s a quick list of steps you can take to try to improve the security of your first hop to the Internet.  I’ve tried to order them in terms of ease – simplest first.  Before you do any of these, however, save the configuration data so that you can bring it back if you need it.

  1. Passwords – always, always, always change the admin password for your router.  It’s probably going to be one that you rarely use, so you’ll want to record it somewhere.  This is one of the few times where you might want to consider taping it to the router itself, as long as the router is in a secure place where only authorised people (you and your family[5]) have access.
  2. Internal admin access only – unless you have very good reasons, and you know what you’re doing, don’t allow machines to administer the router unless they’re on your home network.  There should be a setting on your router for this.
  3. Wifi passwords – once you’ve done 2., you need to ensure that wifi passwords on your network – whether set on your router or elsewhere – are strong.  It’s tempting to set a “friendly” password so that it’s easy for visitors to connect to your network, but if it’s guessed by a malicious person who happens to be nearby, the first thing they’ll do is look for routers on the network – and as they’re already on the internal network, they’ll have access to it (hence why 1 is important).
  4. Only turn on functions that you understand and need – as I noted above, modern routers have all sorts of cool options.  Disregard them.  Unless you really need them, and you actually understand what they do, and what the dangers of turning them on are, then leave them off.  You’re just increasing your attack surface.  (There’s a rough sketch after this list of how to check what your router is actually exposing.)
  5. Buy your own router – replace your ISP-supplied router with a better one.  Go to your local computer store and ask for suggestions.  You can pay an awful lot, but you can conversely get something fairly cheap that does the job and is more robust and performant, and easier to secure, than the one you have at the moment.  You may also want to buy a separate modem.  Generally, setting up your own modem or router is simple, and you can copy the settings from the ISP-supplied one and it will “just work”.
  6. Firmware updates – I’d love to have this further up the list, but it’s not always easy.  From time to time, firmware updates appear for your router.  Most routers will check automatically, and may prompt you to update when you next log in.  The problem is that failure to update correctly can cause catastrophic results[6], or lose configuration data that you’ll need to re-enter.  But you really do need to consider doing this, and to keep a look-out for firmware updates which fix severe security issues.
  7. Go open source – there are some great open source router projects out there which allow you to take an existing router and replace all of the firmware/software on it with an open source alternative.  You can find a list of at least some of them on Wikipedia – https://en.wikipedia.org/wiki/List_of_router_firmware_projects, and a search on “router” on Opensource.com will open your eyes to a set of fascinating opportunities.  This isn’t a step for the faint-hearted, as you’ll definitely void the warranty on your existing router, but if you want to have real control, open source is always the way to go.
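
As promised in point 4, here’s a rough sketch of how you might check what your router is actually exposing on your internal network – standard-library Python, with 192.168.1.1 as an assumed address that you should replace with your own router’s, and the usual caveat that you should only ever scan kit you own:

    import socket

    ROUTER = "192.168.1.1"    # assumption: substitute your router's LAN address
    COMMON_PORTS = {
        22: "ssh", 23: "telnet", 53: "dns", 80: "http admin UI",
        443: "https admin UI", 8080: "alt http", 8443: "alt https",
    }

    for port, label in sorted(COMMON_PORTS.items()):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(1.0)
            state = "OPEN" if s.connect_ex((ROUTER, port)) == 0 else "closed"
        print(f"{port:5d}  {label:15} {state}")

Anything showing as OPEN that you can’t explain is a candidate for turning off.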

Other issues…

I’d love to pretend that once you’ve improved the security of your router, all’s well and good on your home network – but it’s not.  What about IoT devices in your home (Alexa, Nest, Ring doorbells, smart lightbulbs, etc.)?  What about VPNs to other networks?  Malicious hosts via Wifi, malicious apps on your children’s phones…?

No – you won’t be safe.  But, as we’ve discussed before, although there is no “secure”, that doesn’t mean that we shouldn’t raise the bar and make it harder for the Bad Folks[tm].

 


1 – I’m simplifying – but read on, we’ll get there.

2 – “Russian State-Sponsored Cyber Actors”

3 – or, in my parents’ case, “the Internet box”, I suspect.

4 – this is one of these cases where I don’t want comments telling me how you have a direct 1 Terabit/s connection to your local backbone, thank you very much.

5 – maybe not the entire family.

6 – your router is now a brick, and you have no access to the Internet.

Isolationism

… what’s the fun in having an Internet if you can’t, well, “net” on it?

Sometimes – and I hope this doesn’t come as too much of a surprise to my readers – sometimes, there are bad people, and they do bad things with computers.  These bad things are often about stopping the good things that computers are supposed to be doing* from happening properly.  This is generally considered not to be what you want to happen**.

For this reason, when we architect and design systems, we often try to enforce isolation between components.  I’ve had a couple of very interesting discussions over the past week about how to isolate various processes from each other, using different types of isolation, so I thought it might be interesting to go through some of the different types of isolation that we see out there.  For the record, I’m not an expert on all different types of system, so I’m going to talk some history****, and then I’m going to concentrate on Linux*****, because that’s what I know best.

In the beginning

In the beginning, computers didn’t talk to one another.  It was relatively difficult, therefore, for the bad people to do their bad things unless they physically had access to the computers themselves, and even if they did the bad things, the repercussions weren’t very widespread because there was no easy way for them to spread to other computers.  This was good.

Much of the conversation below will focus on how individual computers act as hosts for a variety of different processes, so I’m going to refer to individual computers as “hosts” for the purposes of this post.  Isolation at this level – host isolation – is still arguably the strongest type available to us.  We typically talk about “air-gapping”, where there is literally an air gap – no physical network connection – between one host and another, but we also mean no wireless connection either.  You might think that this is irrelevant in the modern networking world, but there are classes of usage where it is still very useful, the most obvious being for Certificate Authorities, where the root certificate is so rarely accessed – and so sensitive – that there is good reason not to connect the host on which it is stored to any other computer, and to use other means, such as smart-cards, a printer, or good old pen and paper to transfer information from it.

And then…

And then came networks.  These allow hosts to talk to each other.  In fact, by dint of the Internet, pretty much any host can talk to any other host, given a gateway or two.  So along came network isolation to try to stop that.  Network isolation is basically trying to re-apply host isolation, after people messed it up by allowing hosts to talk to each other******.

Later, some smart alec came up with the idea of allowing multiple processes to be on the same host at the same time.  The OS and kernel were trusted to keep these separate, but sometimes that wasn’t enough, so then virtualisation came along, to try to convince these different processes that they weren’t actually executing alongside anything else, but had their own environment to do their own thing.  Sadly, the bad processes realised this wasn’t always true and found ways to get around this, so hardware virtualisation came along, where the actual chips running the hosts were recruited to try to convince the bad processes that they were all alone in the world.  This should work, only a) people don’t always program the chips – or the software running on them – properly, and b) people decided that despite wanting to let these processes run as if they were on separate hosts, they also wanted them to be able to talk to processes which really were on other hosts.  This meant that networking isolation needed to be applied not just at the host level, but at the virtual host level, as well*******.

A step backwards?

Now, in a move which may seem retrograde, it occurred to some people that although hardware virtualisation seemed like a great plan, it was also somewhat of a pain to administer, and introduced inefficiencies that they didn’t like: e.g. using up lots of RAM and lots of compute cycles.  These were often the same people who were of the opinion that processes ought to be able to talk to each other – what’s the fun in having an Internet if you can’t, well, “net” on it?  Now we, as security folks, realise how foolish this sounds – allowing processes to talk to each other just encourages the bad people, right? – but they won the day, and containers came along. Containers allow lots of processes to be run on a host in a lightweight way, and rely on kernel controls – mainly namespaces – to ensure isolation********.  In fact, there’s more you can do: you can use techniques like system call trapping to intercept the things that processes are attempting and stop them if they look like the sort of things they shouldn’t be attempting*********.
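
To give a flavour of how those kernel controls look from the inside, here’s a minimal sketch – Linux only, needing root or CAP_SYS_ADMIN, plus the iproute2 “ip” tool to show the result – that uses the same unshare() call that container runtimes build on to give a process its own, empty network namespace:

    import ctypes
    import os
    import subprocess

    CLONE_NEWNET = 0x40000000                  # from <sched.h>: new network namespace

    libc = ctypes.CDLL("libc.so.6", use_errno=True)
    if libc.unshare(CLONE_NEWNET) != 0:        # detach into a fresh namespace
        errno = ctypes.get_errno()
        raise OSError(errno, os.strerror(errno))

    # From here on, this process sees only a downed loopback interface:
    # no routes, no way to reach anything else on the network.
    subprocess.run(["ip", "addr"])

It’s a toy, of course – real container runtimes combine several namespace types, cgroups and more – but the point is that the isolation is enforced by the kernel, one layer down from the processes it contains.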

And, of course, you can write frameworks at the application layer to try to control what the different components of an application system can do – that’s basically the highest layer, and you’re just layering applications on applications at this point.

Systems thinking

So here’s where I get to the chance to mention one of my favourite topics: systems.  As I’ve said before, by “system” here I don’t mean an individual computer (hence my definition of host, above), but a set of components that work together.  The thing about isolation is that it works best when applied to a system.

Let me explain.  A system, at least as I’d define it for the purposes of this post, is a set of components that work together but don’t have knowledge of external pieces.  Most important, they don’t have knowledge of the layers below them.  Systems may impose isolation on applications at higher layers, because they provide abstractions which allow higher systems to ignore them, but by virtue of that, systems aren’t – or shouldn’t be – aware of the layers below them.

A simple description of the layers – and it doesn’t always hold, partly because networks are tricky things, and partly because there are various ways to assemble the stack – may look like this.

Application (top layer)
Container
System trapping
Kernel
Hardware virtualisation
Networking
Host (bottom layer)

As I intimated above, this is a (gross) simplification, but the basic rule holds: you can enforce isolation upwards in the layers of the stack, but you can’t enforce it downwards.  Lower layer isolation is therefore generally stronger than higher layer isolation.  This shouldn’t come as a huge surprise to anyone who’s used to considering network stacks – the principle is the same – but it’s helpful to lay out and explain the principles from time to time, and the implications for when you’re designing and architecting.

Because if you are considering trust models and are defining trust domains, you need to be very, very careful about defining whether – and how – these domains spread across the layer boundaries.  If you miss a boundary out when considering trust domains, you’ve almost certainly messed up, and need to start again.  Trust domains are important in this sort of conversation because the boundaries between trust domains are typically where you want to be able to enforce and police isolation.

The conversations I’ve had recently basically ran into problems because what people really wanted to do was apply lower layer isolation from layers above which had no knowledge of the bottom layers, and no way to reach into the control plane for those layers.  We had to remodel, and I think that we came up with some sensible approaches.  It was as I was discussing these approaches that it occurred to me that it would have been a whole lot easier to discuss them if we’d started out with a discussion of layers: hence this blog post.  I hope it’s useful.

 


*although they may well not be, because, as I’m pretty sure I’ve mentioned before on this blog, the people trying to make the computers do the good things quite often get it wrong.

**unless you’re one of the bad people. But I’m pretty sure they don’t read this blog, so we’re OK***.

***if you are a bad person, and you read this blog,  would you please mind pretending, just for now, that you’re a good person?  Thank you.  It’ll help us all sleep much better in our beds.

****which I’m absolutely going to present in an order that suits me, and generally neglect to check properly.  Tough.

*****s/Linux/GNU Linux/g; Natch.

******for some reason, this seemed like a good idea at the time.

*******for those of you who are paying attention, we’ve got to techniques like VXLAN and SR-IOV.

********kernel purists will try to convince you that there’s no mention of containers in the Linux kernel, and that they “don’t really exist” as a concept.  Try downloading the kernel source and doing a search for “container” if you want some ammunition to counter such arguments.

*********this is how SELinux works, for instance.