Defending our homes

Your router is your first point of contact with the Internet: how insecure is it?

I’ve always had a problem with the t-shirt that reads “There’s no place like 127.0.0.1”. I know you’re supposed to read it “home”, but to me, it says “There’s no place like localhost”, which just doesn’t have the same ring to it. And in this post, I want to talk about something broader: the entry-point to your home network, which for most people will be a cable or broadband router[1].  The UK and US governments just published advice that “Russia”[2] is attacking routers.  This attack will be aimed mostly, I suspect, at organisations (see my previous post What’s a State Actor, and should I care?), rather than homes, but it’s a useful wake-up call for all of us.

What do routers do?

Routers are important: they provide the link between one network (in this case, our home network) and another one (in this case, the Internet, via our ISP’s network.  In fact, for most of us, the box we think of as “the router”[3] is doing a lot more than that.  The “routing” bit is what is sounds like: it helps computers on your network to find routes to send data to computers outside the network – and vice-versa, for when you’re getting data back.  But most routers will actual be doing more than that.  The other purpose that many will be performing is that of a modem.  Most of us [4] connect to the Internet via a phoneline – whether cable or standard landline – though there is a growing trend for mobile Internet to the home.  Where you’re connecting via a phone line, there’s a need to convert the signals that we use for the Internet to something else and then (at the other end) back again.  For those of us old enough to remember the old “dial-up” days, that’s what the screechy box next to your computer used to do.

But routers often do more things as, well.  Sometimes many more things, including traffic logging, being an WiFi access point, providing a VPN for external access to your internal network, child access, firewalling and all the rest.

Routers are complex things these days, and although state actors may not be trying to get into them, other people may.

Does this matter, you ask?  Well, if other people can get into your system, they have easy access to attacking your laptops, phones, network drives and the rest.  They can access and delete unprotected personal data.  They can plausibly pretend to be you.  They can use your network to host illegal data or launch attacks on others.  Basically, all the bad things.

Luckily, routers tend to come set up by your ISP, with the implication being that you can leave them, and they’ll be nice and safe.

So we’re safe, then?

Unluckily, we’re really not.

The first problem is that the ISPs are working on a budget, and it’s in their best interests to provide cheap kit which just does the job.  The quality of ISP-provided routers tends to be pretty terrible.  It’s also high on the list of things to try to attack by malicious actors: if they know that a particular router model will be installed in a several million homes, there’s a great incentive to find an attack, as an attack on that model will be very valuable to them.

Other problems that arise include:

  • slowness to fix known bugs or vulnerabilities – updating firmware can be costly to your ISP, so they may be slow to arrive (if they do at all);
  • easily-derived or default admin passwords, meaning that attackers don’t even need to find a real vulnerability – they can just log in.

 

Measures to take

Here’s a quick list of steps you can take to try to improve the security of your first hop to the Internet.  I’ve tried to order them in terms of ease – simplest first.  Before you do any of these, however, save the configuration data so that you can bring it back if you need it.

  1. Passwords – always, always, always change the admin password for your router.  It’s probably going to be one that you rarely use, so you’ll want to record it somewhere.  This is one of the few times where you might want to consider taping it to the router itself, as long as the router is in a secure place where only authorised people (you and your family[5]) have access.
  2. Internal admin access only – unless you have very good reasons, and you know what you’re doing, don’t allow machines to administer the router unless they’re on your home network.  There should be a setting on your router for this.
  3. Wifi passwords – once you’ve done 2., you need to ensure that wifi passwords on your network – whether set on your router or elsewhere – are strong.  It’s easy to set a “friendly” password so that it’s easy for visitors to connect to your network, but if it’s guessed by a malicious person who happens to be nearby, the first thing they’ll do will be to look for routers on the network, and as they’re on the internal network they’ll have access to it (hence why 1 is important).
  4. Only turn on functions that you understand and need – as I noted above, modern routers have all sorts of cool options.  Disregard them.  Unless you really need them, and you actually understand what they do, and what the dangers of turning them on are, then leave them off.  You’re just increasing your attack surface.
  5. Buy your own router – replace your ISP-supplied router with a better one.  Go to your local computer store and ask for suggestions.  You can pay an awful lot, but you can conversely get something fairly cheap that does the job and is more robust, performant and easy to secure than the one you have at the moment.  You may also want to buy a separate modem.  Generally setting up your own modem or router is simple, and you can copy the settings from the ISP-supplied one and it will “just work”.
  6. Firmware updates – I’d love to have this further up the list, but it’s not always easy.  From time to time, firmware updates appear for your router.  Most routers will check automatically, and may prompt you to update when you next log in.  The problem is that failure to update correctly can cause catastrophic results[6], or lose configuration data that you’ll need to re-enter.  But you really do need to consider doing this, and keeping a look-out of firmware updates which fix severe security issues.
  7. Go open source – there are some great open source router projects out there which allow you to take an existing router and replace all of the firmware/software on it with an open source alternative.  You can find a list of at least some of them on Wikipedia – https://en.wikipedia.org/wiki/List_of_router_firmware_projects, and a search on “router” on Opensource.com will open your eyes to a set of fascinating opportunities.  This isn’t a step for the faint-hearted, as you’ll definitely void the warranty on your existing router, but if you want to have real control, open source is always the way to go.

Other issues…

I’d love to pretend that once you’ve improved the security of your router, that all’s well and good, but it’s not on your home network..  What about IoT devices in your home (Alexa, Nest, Ring doorbells, smart lightbulbs, etc.?)  What about VPNs to other networks?  Malicious hosts via Wifi, malicious apps on your childrens phones…?

No – you won’t be safe.  But, as we’ve discussed before, although there is no “secure”, that doesn’t mean that we shouldn’t raise the bar and make it harder for the Bad Folks[tm].

 


1 – I’m simplifying – but read on, we’ll get there.

2 -“Russian State-Sponsored Cyber Actors”

3 – or, in my parents’ case, “the Internet box”, I suspect.

4 – this is one of these cases where I don’t want comments telling me how you have a direct 1 Terabit/s connection to your local backbone, thank you very much.

5 – maybe not the entire family.

6 – your router is now a brick, and you have no access to the Internet.

Do you know what’s lurking on your system?

Every utility, every library, every executable … increases your attack surface.

I had a job once which involved designing hardening procedures for systems that we were going to be use for security-related projects.  It was fascinating.  This was probably 15 years ago, and not only were guides a little thin on the ground for the Linux distribution I was using, but what we were doing was quite niche.  At first, I think I’d assumed that I could just write a script to close down a few holes that originated from daemons[1] that had been left running for no reasons: httpd, sendmail, stuff like that.  It did involve that, of course, but then I realised there was more to do, and started to dive down the rabbit hole.

So, after those daemons, I looked at users and groups.  And then at file systems, networking, storage.  I left the two scariest pieces to last, for different reasons.  The first was the kernel.  I ended up hand-crafting a kernel, removing anything that I thought it was unlikely we’d need, and then restarting several times when I discovered that the system wouldn’t boot because the things I thought I understood were more … esoteric than I’d realised.  I’m not a kernel developer, and this was a salutary lesson in quite how skilled those folks are.  At least, at the time I was doing it, there was less code, and fewer options, than there are today.  On the other hand, I was having to hack back to a required state, and there are more cut-down kernels and systems to start with than there were back then.

The other piece that I left for last was just pruning the installed Operating System applications and associated utilities.  Again, there are cut-down options that are easier to use now than then, but I also had some odd requirements – I believe that we needed Java, for instance, which has, or had …. well let’s say a lot of dependencies.  Most modern Linux distributions[3] start off by installing lots of pieces so that you can get started quickly, without having to worry about trying to work out dependencies for every piece of external software you want to run.

This is understandable, but we need to realise when we do this that we’re making a usability/security trade-off[5].  Every utility, every library, every executable that you add to a system increases your attack surface, and increases the likelihood of vulnerabilities.

The problem isn’t just that you’re introducing vulnerabilities, but that once they’re there, they tend to stay there.  Not just in code that you need, but, even worse, in code that you don’t need.  It’s a rare, but praiseworthy hacker[6] who spends time going over old code removing dependencies that are no longer required.  It’s a boring, complex task, and it’s usually just easier to leave the cruft[7] where it is and ship a slightly bigger package for the next release.

Sometimes, code is refactored and stripped: most frequently, security-related code. This is a very Good Thing[tm], but it turns out that it’s far from sufficient.  The reason I’m writing this post is because of a recent story in The Register about the “beep” command.  This command used the little speaker that was installed on most PC-compatible motherboards to make a little noise.  It was a useful little utility back in the day, but is pretty irrelevant now that most motherboards don’t ship with the relevant hardware.  The problem[8] is that installing and using the beep command on a system allows information to be leaked to users who lack the relevant permissions.  This can be a very bad thing.  There’s a good, brief overview here.

Now, “beep” isn’t installed by default on the distribution I’m using on the system on which I’m writing this post (Fedora 27), though it’s easily installable from one of the standard repositories that I have enabled.  Something of a relief, though it’s not a likely attack vector for this machine anyway.

What, though, do I have installed on this system that is vulnerable, and which I’d never have thought to check?  Forget all of the kernel parameters which I don’t need turned on, but which have been enabled by the distribution for ease of use across multiple systems.  Forget the apps that I’ve installed and use everyday.  Forget, even the apps that I installed once to try, and then neglected to remove.  What about the apps that I didn’t even know were there, and which I never realised might have an impact on the security posture of my system?  I don’t know, and have little way to find out.

This system doesn’t run business-critical operations.  When I first got it and installed the Operating System, I decided to err towards usability, and to be ready to trash[9] it and start again if I had problems with compromise.  But that’s not the case for millions of other systems out there.  I urge you to consider what you’re running on a system, what’s actually installed on it, and what you really need.  Patch what you need, remove what you don’t.   It’s time for a Spring clean[10].


1- I so want to spell this word dæmons, but I think that might be taking my Middle English obsession too far[2].

2 – I mentioned that I studied Middle English, right?

3 – I’m most interested in (GNU) Linux here, as I’m a strong open source advocate and because it’s the Operating System that I know best[4].

4 – oh, and I should disclose that I work for Red Hat, of course.

5 – as with so many things in our business.

6 – the good type, or I’d have said “cracker”, probably.

7 – there’s even a word for it, see?

8 – beyond a second order problem that a suggested fix seems to have made the initial problem worse…

9 – physically, if needs be.

10 – In the Northern Hemisphere, at least.

Meltdown and Spectre: thinking about embargoes and disclosures

The technical details behind Meltdown and Spectre are complex and fascinating – and the possible impacts wide-ranging and scary.  I’m not going to go into those here, though there are some excellent articles explaining them.  I’d point readers in particular at the following URLs (which both resolve to the same page, so there’s no need to follow both):

I’d also recommend this article on the Red Hat blog, written by my colleague Jon Masters: What are Meltdown and Spectre? Here’s what you need to know.  It includes a handy cut-out-and-keep video[1] explaining the basics.   Jon has been intimately involved in the process of creating and releasing mitigations and fixes to Meltdown and Spectre, and is very well-placed to talk about it.

All that is interesting and important, but I want to talk about the mechanics and process behind how a vulnerability like this moves from discovery through to fixing.  Although similar processes exist for proprietary software, I’m most interested in open source software, so that’s what I’m going to describe.

 

Step 1 – telling the right people

Let’s say that someone has discovered a vulnerability in a piece of open source software.  There are a number of things they can do.  The first is ignore it.  This is no good, and isn’t interesting for our story, so we’ll ignore it in turn.  Second is to keep it to themselves or try to make money out of it.  Let’s assume that the discoverer is a good type of person[2], so isn’t going to do this[3].  A third is to announce it on the project mailing list.  This might seem like a sensible thing to do, but it can actually be very damaging to the project and the community at large. The problem with this is that the discoverer has just let all the Bad Folks[tm] know about a vulnerability which hasn’t been fixed, and they can now exploit.  Even if the discoverer submits a patch, it needs to be considered and tested, and it may be that the fix they’ve suggested isn’t the best approach anyway.  For a serious security issue, the project maintainers are going to want to have a really good look at it in order to decide what the best fix – or mitigation – should be.

It’s for this reason that many open source projects have a security disclosure process.  Typically, there’s a closed mailing list to which you can send suspected vulnerabilities, often something like “security@projectname.org“.  Once someone has emailed that list, that’s where the process kicks in.

Step 2 – finding a fix

Now, before that email got to the list, somebody had to decide that they needed a list in the first place, so let’s assume that you, the project leader, has already put a process of some kind in place.  There are a few key decisions to make, of which the most important are probably:

  • are you going to keep it secret[5]?  It’s easy to default to “yes” here, for the reasons that I mentioned above, but the number of times you actually put in place restrictions on telling people about the vulnerability – usually referred to as an embargo, given its similarity to a standard news embargo – should be very, very small.  This process is going to be painful and time-consuming: don’t overuse it.  Oh, and your project’s meant to be about open source software, yes?  Default to that.
  • who will you tell about the vulnerability?  We’re assuming that you ended up answering “yes” to te previous bullet, and that you don’t want everybody to know, and also that this is a closed list.  So, who’s the most important set of people?  Well, most obvious is the set of project maintainers, and key programmers and testers – of which more below.  These are the people who will get the fix completed, but there are two other constituencies you might want to consider: major distributors and consumers of the project. Why these people, and who are they?  Well, distributors might be OEMs, Operating System Vendors or ISVs who bundle your project as part of their offering.  These may be important because you’re going to want to ensure that people who will need to consume your patch can do so quickly, and via their preferred method of updating.  What about major consumers?  Well, if an organisation has a deployed estate of hundreds or thousands of instances of a project[6], then the impact of having to patch all of those at short notice – let alone the time required to test the patch – may be very significant, so they may want to know ahead of time, and they may be quite upset if they don’t.  This privileging of major over rank-and-file users of your project is politically sensitive, and causes much hand-wringing in the community.  And, of course, the more people you tell, the more likely it is that a leak will occur, and news of the vulnerability to get out before you’ve had a chance to fix it properly.
  • who should be on the fixing team?  Just because you tell people doesn’t mean that they need to be on the team.  By preference, you want people who are both security experts and also know your project software inside-out.  Good luck with this.  Many projects aren’t security projects, and will have no security experts attached – or a few at most.  You may need to call people in to help you, and you may not even know which people to call in until you get a disclosure in the first place.
  • how long are you going to give yourself?  Many discoverers of vulnerabilities want their discovery made public as soon as possible.  This could be for a variety of reasons: they have an academic deadline; they don’t want somebody else to beat them to disclosure; they believe that it’s important to the community that the project is protected as soon as possible; they’re concerned that other people have found – are are exploiting – the vulnerability; they don’t trust the project to take the vulnerability seriously and are concerned that it will just be ignored.  This last reason is sadly justifiable, as there are projects who don’t react quickly enough, or don’t take security vulnerabilities seriously, and there’s a sorry history of proprietary software vendors burying vulnerabilities and pretending they’ll just go away[7].  A standard period of time before disclosure is 2-3 weeks, but as projects get bigger and more complex, balancing that against issues such as pressure from those large-scale users to give them more time to test, holiday plans and all the rest becomes important.  Having an agreed time and sticking to it can be vital, however, if you want to avoid deadline slip after deadline slip.  There’s another interesting issue, as well, which is relevant to the Meltdown and Spectre vulnerabilities – you can’t just patch hardware.  Where hardware is involved, the fixes may involve multiple patches to multiple projects, and not just standard software but microcode as well: this may significantly increase the time needed.
  • what incentives are you going to provide?  Some large projects are in a position to offer bug bounties, but these are few and far between.  Most disclosers want – and should expect to be granted – public credit when the fix is finally provided and announced to the public at large.  This is not to say that disclosers should necessarily be involved in the wider process that we’re describing: this can, in fact, be counter-productive, as their priorities (around, for instance, timing) may be at odds with the rest of the team.

There’s another thing you might want to consider, which is “what are we going to do if this information leaks early?” I don’t have many good answers for this one, as it will depend on numerous factors such as how close are you to a fix, how major is the problem, and whether anybody in the press picks up on it.  You should definitely consider it, though.

Step 3 – external disclosure

You’ve come up with a fix?  Well done.  Everyone’s happy?  Very, very well done[8].  Now you need to tell people.  But before you do that, you’re going to need to decide what to tell them.  There are at least three types of information that you may well want to prepare:

  • technical documentation – by this, I mean project technical documentation.  Code snippets, inline comments, test cases, wiki information, etc., so that when people who code on your project – or code against it – need to know what’s happened, they can look at this and understand.
  • documentation for techies – there will be people who aren’t part of your project, but who use it, or are interested in it, who will want to understand the problem you had, and how you fixed it.  This will help them to understand the risk to them, or to similar software or systems.  Being able to explain to these people is really important.
  • press, blogger and family documentation – this is a category that I’ve just come up with.  Certain members of the press will be quite happy with “documentation for techies”, but, particularly if your project is widely used, many non-technical members of the press – or technical members of the press writing for non-technical audiences – are going to need something that they can consume and which will explain in easy-to-digest snippets a) what went wrong; b) why it matters to their readers; and c) how you fixed it.  In fact, on the whole, they’re going to be less interested in c), and probably more interested in a hypothetical d) & e) (which are “whose fault was it and how much blame can we heap on them?” and “how much damage has this been shown to do, and can you point us at people who’ve suffered due to it?” – I’m not sure how helpful either of these is).  Bloggers may also be useful in spreading the message, so considering them is good.  And generally, I reckon that if I can explain a problem to my family, who aren’t particularly technical, then I can probably explain it to pretty much anybody I care about.

Of course, you’re now going to need to coordinate how you disseminate all of these types of information.  Depending on how big your project is, your channels may range from your project code repository through your project wiki through marketing groups, PR and even government agencies[9].

 

Conclusion

There really is no “one size fits all” approach to security vulnerability disclosures, but it’s an important consideration for any open source software project, and one of those that can be forgotten as a project grows, until suddenly you’re facing a real use case.  I hope this overview is interesting and useful, and I’d love to hear thoughts and stories about examples where security disclosure processes have worked – and when they haven’t.


2 – because only good people read this blog, right?

3 – some government security agencies have a policy of collecting – and not disclosing – security vulnerabilities for their own ends.  I’m not going to get into the rights or wrongs of this approach.[4]

4 – well, not in this post, anyway.

5 – or try, at least.

6 – or millions or more – consider vulnerabilities in smartphone software, for instance, or cloud providers’ install bases.

7 – and even, shockingly, of using legal mechanisms to silence – or try to silence – the discloser.

8 – for the record, I don’t believe you: there’s never been a project – let alone an embargo – where everyone was happy.  But we’ll pretend, shall we?

9 – I really don’t think I want to be involved in any of these types of disclosure, thank you.

Top 5 resolutions for security folks – 2018

Yesterday, I wrote some jokey resolutions for 2018 – today, as it’s a Tuesday, my regular day for posts, I decided to come up with some real ones.

1 – Embrace the open

I’m proud to have been using Linux[1] and other open source software for around twenty years now.  Since joining Red Hat in 2016, and particularly since I started writing for Opensource.com, I’ve become more aware of other areas of open-ness out there, from open data to open organisations.  There are still people out there who are convinced that open source is less secure than proprietary software.  You’ll be unsurprised to discover that I disagree.  I encourage everyone to explore how embracing the open can benefit them and their organisations.

2 – Talk about risk

I’m convinced that we talk too much about security for security’s sake, and not about risk, which is what most “normal people” think about.  There’s education needed here as well: of us, and of others.  If we don’t understand the organisations we’re part of, and how they work, we’re not going to be able to discuss risk sensibly.  In the other direction, we need to be able to talk about security a bit, in order to explain how it will mitigate risk, so we need to learn how to do this in a way that informs our colleagues, rather than alienating them.

3 – Think about systems

I don’t believe that we[2] talk enough about systems.  We spend a lot of our time thinking about functionality and features, or how “our bit” works, but not enough about how all the bits fit together. I don’t often link out to external sites or documents, but I’m going to make an exception for NIST special publication 800-160 “Systems Security Engineering: Considerations for a Multidisciplinary Approach in the Engineering of Trustworthy Secure Systems”, and I particularly encourage you to read Appendix E “Roles, responsibilities and skills: the characteristics and expectations of a systems security engineer”.  I reckon this is an excellent description of the core skills and expertise required for anyone looking to make a career in IT security.

4 – Examine the point of conferences

I go to a fair number of conferences, both as an attendee and as a speaker – and also do my share of submission grading.  I’ve written before about how annoyed I get (and I think others get) by product pitches at conferences.  There are many reasons to attend the conferences, but I think it’s important for organisers, speakers and attendees to consider what’s most important to them.  For myself, I’m going to try to ensure that what I speak about is what I think other people will be interested in, and not just what I’m focussed on.  I’d also highlight the importance of the “hallway track”: having conversations with other attendees which aren’t necessarily directly related to the specific papers or talks. We should try to consider what conferences we need to attend, and which ones to allow to fall by the wayside.

5 – Read outside the IT security discipline

We all need downtime.  One way to get that is to read – on an e-reader, online, on your phone, magazines, newspapers or good old-fashioned books.  Rather than just pick up something directly related to work, choose something which is at least a bit off the beaten track.  Whether it’s an article on a topic to do with your organisation’s business,  a non-security part of IT[3], something on current affairs, or a book on a completely unrelated topic[4], taking the time to gain a different perspective on the world is always[5] worth it.

What have I missed?

I had lots of candidates for this list, and I’m sure that I’ve missed something out that you think should be in there.  That’s what comments are for, so please share your thoughts.


1 GNU Linux.

2 the mythical IT community

3 – I know, it’s not going to be as sexy as security, but go with it.  At least once.

4 – I’m currently going through a big espionage fiction phase.  Which is neither here nor there, but hey.

5 – well, maybe almost always.

Who’s saying “hello”? – agency, intent and AI

Who is saying “hello world?”: you, or the computer?

I don’t yet have one of those Google or Amazon talking speaker thingies in my house or office.  A large part of this is that I’m just not happy about the security side: I know that the respective companies swear that they’re only “listening” when you say the device’s trigger word, but even if that’s the case, I like to pretend[1] that I have at least some semblance of privacy in my life.  Another reason, however, is that I’m not sure that I like what happens to people when they pretend that there’s a person listening to them, but it’s really just a machine.

It’s not just Alexa and the OK, Google persona, however.  When I connect to an automated phone-answering service, I worry when I hear “I’ll direct your call” from a non-human.  Who is “I”?  “We’ll direct your call” is better – “we” could be the organisation with whom I’m interacting.  But “I”?  “I” is the pronoun that people use.  When I hear “I”, I’m expecting sentience: if it’s a machine I’m expecting AI – preferably fully Turing-compliant.

There’s a more important point here, though.  I’m entirely aware that there’s no sentience behind that “I”[2], but there’s an important issue about agency that we should unpack.

What, then, is “agency”?  I’m talking about the ability of an entity to act on its or another’s behalf, and I touched on this this in a previous post, “Wow: autonomous agents!“.  When somebody writes some code, what they’re doing is giving ability to the system that will run that code to do something – that’s the first part.  But the agency doesn’t really occur, I’d say, until that code is run/instantiated/executed.  At this point, I would argue, the software instance has agency.

But whose agency, exactly?  For whom is this software acting?

Here are some answers.  I honestly don’t think that any of them is right.

  1. the person who owns the hardware (you own the Alexa hardware, right?  You paid Amazon for it…  Or what about running applications on the cloud?).
  2. the person who started the software (you turned on the Alexa hardware, which started the software…  And don’t forget software which is automatically executed in response to triggers or on a time schedule.)
  3. the person who gave the software the instructions (what do you mean, “gave it the instructions”?  Wrote its config file?  Spoke to it?  Set up initial settings?  Typed in commands?  And even if you gave it instructions, do you think that your OK Google hardware is implementing your wishes, or Google’s?  For whom is it actually acting?  And what side effects (like recording your search history and deciding what to suggest in your feed) are you happy to believe are “yours”?)
  4. the person who installed the software (your phone comes with all sorts of software installed, but surely you are the one who imbues it with agency?  If not, whom are you blaming: Google (for the Android apps) or Samsung (which actually put them on the phone)?)
  5. the person who wrote the software (I think we’ve already dealt with this, but even then, is it a single person, or an organisation?  What about open source software, which is typically written, compiled and documented by many different people?  Ascribing “ownership” or “authorship” is a distinctly tricky (and intentionally tricky) issue when you discuss open source)

Another way to think of this problem is to ask: when you write and execute a program, who is saying “hello world?”: you, or the computer?

There are some really interesting questions that come out of this.  Here are a couple that come to mind, which seem to me to be closely connected.

  • In the film Wargames[3], is the automatic dialling that Matthew Broderick’s character’s computer carries out an act with agency?  Or is it when it connects to another machine?  Or when it records the details of that machine?  I don’t think anyone would argue that the computer is acting with agency once David Lightman actually gets it to complete a connection and interact with it, but what about before?
  • Google used to run automated programs against messages received as part of the Gmail service looking for information and phrases which it could use to serve ads.  They were absolutely adamant that they, Google, weren’t doing the reading: it was just a computer program.  I’m not sure how clear or safe a distinction that is.

Why does this all matter?  Well, one of the more pressing reasons is because of self-driving cars.  Whose fault is it when one goes wrong and injures or kills someone?  What about autonomous defence systems?

And here’s the question that really interests – and vexes – me: is this different when the program which is executing can learn.  I don’t even mean strong AI: just that it can change what it does based on the behaviour it “sees”, “hears” or otherwise senses.  It feels to me that there’s a substantive difference between:

a) actions carried out at the explicit (asynchronous) request of a human operator, or according to sets of rules coded into a program

AND

b) actions carried out in response to rules that have been formed by the operation of the program itself.  There is what I’d called synchronous intent within the program.

You can argue that b) has pretty much always been around, in basic forms, but it seems to me to be different when programs are being created with the expectation that humans will not necessarily be able to decode the rules, and where the intent of the human designers is to allow rulesets to be created in this way.

There is some discussion about at the moment as to how and/or whether rulesets generated by open source projects should be shared.  I think the general feeling is that there’s no requirement for them to be – in the same way that material I write using an open source text editor shouldn’t automatically be considered open source – but open data is valuable, and finding ways to share it is a good idea, IMHO.

In Wargames, that is the key difference between the system as originally planned, and what it ends up doing: Joshua has synchronous intent.

I really don’t think this is all bad: we need these systems, and they’re going to improve our lives significantly.  But I do feel that it’s important that you and I start thinking hard about what is acting for whom, and how.

Now, if you wouldn’t mind opening the Pod bay doors, HAL…[5]

 


1. and yes, I know it’s a pretense.

2. yet…

3. go on – re-watch it: you know you want to[4].

4. and if you’ve never watched it, then stop reading this article and go and watch it NOW.

5. I think you know the problem just as well as I do, Dave.

Entropy

… algorithms, we know, are not always correctly implemented …

Imagine that you’re about to play a boardgame which involves using dice.  I don’t know: Monopoly, Yahtzee, Cluedo, Dungeons & Dragons*.  In most cases, at least where you’re interested in playing a fair game, you want to be pretty sure that there’s a random distribution of the dice roll results.  In other words, for a 6-sided dice, you’d hope that, for each roll, there’s an equal chance that any of the numbers 1 through 6 will appear.  This seems like a fairly simple thing to want to define, and, like many things which seem to be simple when you first look at them, mathematicians have managed to conjure an entire field of study around it, making it vastly complicated in the process****.

Let’s move to computers.  As opposed to boardgames, you generally want computers to do the same thing every time you ask them to do it, assuming that give them the same inputs: you want their behaviour to be deterministic when presented with the same initial conditions.  Random behaviour is generally not a good thing for computers.  There are, of course, exceptions to this rule, and the first is when you want to use computers to play games, as things get very boring very quickly if there’s no variation in gameplay.

There’s another big exception: cryptography.  In fact, it’s not all of cryptography: you definitely want a single plaintext to be encrypted to a single ciphertext under the same key in almost all cases.  But there is one area where randomness is important: and that’s in the creation of the cryptographic key(s) you’re going to be using to perform those operations.  It turns out that you need to have quite a lot of randomness available to create a key which is truly unique – and keys really need to be truly unique – and that if you don’t have enough randomness, then not only will you possible generate the same key (or set of them) repeatedly, but other people may do so as well, allowing them to guess what keys you’re using, and thereby be able do things like read your messages or pretend to be you.

Given that these are exactly the sorts of things that cryptography tries to stop, it is clearly very important that you do have lots of randomness.

Luckily, mathematicians and physicists have come to our rescue.  Their word for randomness is “entropy”.  In fact, what mathematicians and physicists mean when they talk about entropy is – as far as my understanding goes – to be a much deeper and complex issue than just randomness.  But if we can find a good source of entropy, and convert it into something that computers can use, then we should have enough randomness to do all things that we want to do with cryptographic key generation*****.  The problem in the last sentence is the “if” and the “should”.

First, we need to find a good source of entropy, and prove that it is good.  The good thing about this is that there are, in fact, lots of natural sources of entropy.  Airflow is often random enough around computers that temperature variances can be measured that will provide good enough entropy.  Human interactions with peripherals such as mouse movements or keyboard strokes can provide more entropy.  In the past, variances between network packets receive times were used, but there’s been some concern that these are actually less random than previously thought, and may be measurable by outside parties******.  There are algorithms that allow us to measure quite how random entropy sources are – though they can’t make predictions about future randomness, of course.

Let’s assume, though, that we have a good source of entropy.  Or let’s not: let’s assume that we’ve got several pretty good sources of entropy, and that we believe that when we combine them, they’ll be good enough as a group.

And this is what computers – and Operating Systems such –  generally do.  They gather data from various entropy sources, and then convert it to a stream of bits – your computer’s favourite language of 1s and 0s – that can then be used to provide random numbers. The problem arises when they don’t do it well enough.

This can occur for a variety of reasons, the main two being bad sampling and bad combination.  Even if your sources of entropy are good, if you don’t sample them in an appropriate manner, then what you actually get won’t reflect the “goodness” of that entropy source: that’s a sampling problem.  This is bad enough, but the combination algorithms are supposed to smooth out this sort of issue, assuming it’s not too bad and you have enough sources of entropy.  However, when you have an algorithm which isn’t actually doing that, or isn’t combining even well-sampled, good sources, then you have a real issue.  And algorithms, we know, are not always correctly implemented – and there have even been allegations that some government security services have managed to introduce weakened algorithms – with weaknesses that only they know about, and can exploit – into systems around the world.  There have been some very high profile examples of poor implementation in both the proprietary and open source worlds, which have led to real problems in actual deployments.  At least, when you have an open source implementation, you have the chance to fix it.

That problem is compounded when – as is often the case – these algorithms are embedded in hardware such as a chip on a motherboard.   In this case, it’s very difficult to fix, as you generally can’t just replace all the affected chips, and may also be difficult to trace.  Whether you are operating in hardware or software, however, the impact of a bad algorithm which isn’t spotted – at least by the Good Guys and Gals[tm] – for quite a while is that you may have many millions of weak keys out there, which are doing a very bad job of protecting identities or private data.   Even if you manage to replace these keys, what about all of the historical encryptions which, if recorded, can now be read?  What if I could forge the identity of the person who signed a transaction buying a house several years ago, to make it look like I now owned it, for instance?

Entropy, then, can be difficult to manage, and when we have a problem, the impact of that problem can be much larger than we might immediately imagine.


*I’m sure that there are trademarks associated with these games**

**I’m also aware that Dungeons & Dragons*** isn’t really a boardgame

***I used to be a Dungeon Master!

****for an example, try reading just the first paragraph of the entry for  stochastic process on Wikipedia.

*****and gaming.

******another good source of entropy is gained by measuring radioactive decay, but you generally don’t want to be insisting that computers – or there human operators – require a radioactive source near enough to them to be useful.

The commonwealth of Open Source

This commonwealth does not apply to proprietary software: what stays hidden does not enlighten or enrich the world.

“But surely Open Source software is less secure, because everybody can see it, and they can just recompile it and replace it with bad stuff they’ve written?”

Hands up who’s heard this?*  I’ve been talking to customers – yes, they let me talk to customers sometimes – and to folks in the Field**, and this is one that comes up, it turns out, quite frequently.  I talked in a previous post (“Disbelieving the many eyes hypothesis“) about how Open Source software – particularly security software – doesn’t get to be magically more secure than proprietary software, and talked a little bit there about how I’d still go with Open Source over proprietary every time, but the way that I’ve heard the particular question – about OSS being less secure – suggests to me that we there are times when we don’t just need to a be able to explain why Open Source needs work, but also to be able to engage actively in Apologetics***.  So here goes.  I don’t expect it to be up to Newton’s or Wittgenstein’s levels of logic, but I’ll do what I can, and I’ll summarise at the bottom so that you’ve got a quick list of the points if you want it.

The arguments

First of all, we should accept that no software is perfect******.  Not proprietary software, not Open Source software.  Second, we should accept that there absolutely is good proprietary software out there.  Third, on the other hand, there is some very bad Open Source software.  Fourth, there are some extremely intelligent, gifted and dedicated architects, designers and software engineers who create proprietary software.

But here’s the rub.  Fifth – the pool of people who will work on or otherwise look at that proprietary software is limited.  And you can never hire all the best people.  Even in government and public sector organisations – who often have a larger talent pool available to them, particularly for *cough* security-related *cough* applications – the pool is limited.

Sixth – the pool of people available to look at, test, improve, break, re-improve, and roll out Open Source software is almost unlimited, and does include the best people.  Seventh – and I love this one: the pool also includes many of the people writing the proprietary software.  Eighth – many of the applications being written by public sector and government organisations are open sourced anyway these days.

Ninth – if you’re worried about running Open Source software which is unsupported, or comes from dodgy, un-provenanced sources, then good news: there are a bunch of organisations******* who will check the provenance of that code, support, maintain and patch it.  They’ll do it along the same type of business lines that you’d expect from a proprietary software provider.  You can also ensure that the software you get from them is the right software: the standard technique is for them to sign bundles of software so that you can check that what you’re installing isn’t just from some random bad person who’s taken that code and done Bad Things[tm] with it.

Tenth – and here’s the point of this post – when you run Open Source software, when you test it, when you provide feedback on issues, when you discover errors and report them, you are tapping into, and adding to, the commonwealth of knowledge and expertise and experience that is Open Source.  And which is only made greater by your doing so.  If you do this yourself, or through one of the businesses who will support that Open Source software********, you are part of this commonwealth.  Things get better with Open Source software, and you can see them getting better.  Nothing is hidden – it’s, well, “open”.  Can things get worse?  Yes, they can, but we can see when that happens, and fix it.

This commonwealth does not apply to proprietary software: what stays hidden does not enlighten or enrich the world.

I know that I need to be careful about the use of the “commonwealth” as a Briton: it has connotations of (faded…) empire which I don’t intend it to hold in this case.  It’s probably not what Cromwell*********, had in mind when he talked about the “Commonwealth”, either, and anyway, he’s a somewhat … controversial historical figure.  What I’m talking about is a concept in which I think the words deserve concatenation – “common” and “wealth” – to show that we’re talking about something more than just money, but shared wealth available to all of humanity.

I really believe in this.  If you want to take away a religious message from this blog, it should be this**********: the commonwealth is our heritage, our experience, our knowledge, our responsibility.  The commonwealth is available to all of humanity.  We have it in common, and it is an almost inestimable wealth.

 

A handy crib sheet

  1. (Almost) no software is perfect.
  2. There is good proprietary software.
  3. There is bad Open Source software.
  4. There are some very clever, talented and devoted people who create proprietary software.
  5. The pool of people available to write and improve proprietary software is limited, even within the public sector and government realm.
  6. The corresponding pool of people for Open Source is virtually unlimited…
  7. …and includes a goodly number of the talent pool of people writing proprietary software.
  8. Public sector and government organisations often open source their software anyway.
  9. There are businesses who will support Open Source software for you.
  10. Contribution – even usage – adds to the commonwealth.

*OK – you can put your hands down now.

**should this be capitalised?  Is there a particular field, or how does it work?  I’m not sure.

***I have a degree in English Literature and Theology – this probably won’t surprise some of the regular readers of this blog****.

****not, I hope, because I spout too much theology*****, but because it’s often full of long-winded, irrelevant Humanities (US Eng: “liberal arts”) references.

*****Emacs.  Every time.

******not even Emacs.  And yes, I know that there are techniques to prove the correctness of some software.  (I suspect that Emacs doesn’t pass many of them…)

*******hand up here: I’m employed by one of them, Red Hat, Inc..  Go have a look – fun place to work, and we’re usually hiring.

********assuming that they fully abide by the rules of the Open Source licence(s) they’re using, that is.

*********erstwhile “Lord Protector of England, Scotland and Ireland” – that Cromwell.

**********oh, and choose Emacs over vi variants, obviously.