Entropy

… algorithms, we know, are not always correctly implemented …

Imagine that you’re about to play a boardgame which involves using dice.  I don’t know: Monopoly, Yahtzee, Cluedo, Dungeons & Dragons*.  In most cases, at least where you’re interested in playing a fair game, you want to be pretty sure that there’s a random distribution of the dice roll results.  In other words, for a 6-sided dice, you’d hope that, for each roll, there’s an equal chance that any of the numbers 1 through 6 will appear.  This seems like a fairly simple thing to want to define, and, like many things which seem to be simple when you first look at them, mathematicians have managed to conjure an entire field of study around it, making it vastly complicated in the process****.

Let’s move to computers.  As opposed to boardgames, you generally want computers to do the same thing every time you ask them to do it, assuming that give them the same inputs: you want their behaviour to be deterministic when presented with the same initial conditions.  Random behaviour is generally not a good thing for computers.  There are, of course, exceptions to this rule, and the first is when you want to use computers to play games, as things get very boring very quickly if there’s no variation in gameplay.

There’s another big exception: cryptography.  In fact, it’s not all of cryptography: you definitely want a single plaintext to be encrypted to a single ciphertext under the same key in almost all cases.  But there is one area where randomness is important: and that’s in the creation of the cryptographic key(s) you’re going to be using to perform those operations.  It turns out that you need to have quite a lot of randomness available to create a key which is truly unique – and keys really need to be truly unique – and that if you don’t have enough randomness, then not only will you possible generate the same key (or set of them) repeatedly, but other people may do so as well, allowing them to guess what keys you’re using, and thereby be able do things like read your messages or pretend to be you.

Given that these are exactly the sorts of things that cryptography tries to stop, it is clearly very important that you do have lots of randomness.

Luckily, mathematicians and physicists have come to our rescue.  Their word for randomness is “entropy”.  In fact, what mathematicians and physicists mean when they talk about entropy is – as far as my understanding goes – to be a much deeper and complex issue than just randomness.  But if we can find a good source of entropy, and convert it into something that computers can use, then we should have enough randomness to do all things that we want to do with cryptographic key generation*****.  The problem in the last sentence is the “if” and the “should”.

First, we need to find a good source of entropy, and prove that it is good.  The good thing about this is that there are, in fact, lots of natural sources of entropy.  Airflow is often random enough around computers that temperature variances can be measured that will provide good enough entropy.  Human interactions with peripherals such as mouse movements or keyboard strokes can provide more entropy.  In the past, variances between network packets receive times were used, but there’s been some concern that these are actually less random than previously thought, and may be measurable by outside parties******.  There are algorithms that allow us to measure quite how random entropy sources are – though they can’t make predictions about future randomness, of course.

Let’s assume, though, that we have a good source of entropy.  Or let’s not: let’s assume that we’ve got several pretty good sources of entropy, and that we believe that when we combine them, they’ll be good enough as a group.

And this is what computers – and Operating Systems such –  generally do.  They gather data from various entropy sources, and then convert it to a stream of bits – your computer’s favourite language of 1s and 0s – that can then be used to provide random numbers. The problem arises when they don’t do it well enough.

This can occur for a variety of reasons, the main two being bad sampling and bad combination.  Even if your sources of entropy are good, if you don’t sample them in an appropriate manner, then what you actually get won’t reflect the “goodness” of that entropy source: that’s a sampling problem.  This is bad enough, but the combination algorithms are supposed to smooth out this sort of issue, assuming it’s not too bad and you have enough sources of entropy.  However, when you have an algorithm which isn’t actually doing that, or isn’t combining even well-sampled, good sources, then you have a real issue.  And algorithms, we know, are not always correctly implemented – and there have even been allegations that some government security services have managed to introduce weakened algorithms – with weaknesses that only they know about, and can exploit – into systems around the world.  There have been some very high profile examples of poor implementation in both the proprietary and open source worlds, which have led to real problems in actual deployments.  At least, when you have an open source implementation, you have the chance to fix it.

That problem is compounded when – as is often the case – these algorithms are embedded in hardware such as a chip on a motherboard.   In this case, it’s very difficult to fix, as you generally can’t just replace all the affected chips, and may also be difficult to trace.  Whether you are operating in hardware or software, however, the impact of a bad algorithm which isn’t spotted – at least by the Good Guys and Gals[tm] – for quite a while is that you may have many millions of weak keys out there, which are doing a very bad job of protecting identities or private data.   Even if you manage to replace these keys, what about all of the historical encryptions which, if recorded, can now be read?  What if I could forge the identity of the person who signed a transaction buying a house several years ago, to make it look like I now owned it, for instance?

Entropy, then, can be difficult to manage, and when we have a problem, the impact of that problem can be much larger than we might immediately imagine.


*I’m sure that there are trademarks associated with these games**

**I’m also aware that Dungeons & Dragons*** isn’t really a boardgame

***I used to be a Dungeon Master!

****for an example, try reading just the first paragraph of the entry for  stochastic process on Wikipedia.

*****and gaming.

******another good source of entropy is gained by measuring radioactive decay, but you generally don’t want to be insisting that computers – or there human operators – require a radioactive source near enough to them to be useful.

Stop reading, start patching

… in order to correct this problem, BOTH the client AND the router must be patched

This is an emergency post: normal* service will resume next week**.

So, over the past 48 hours or so, news of the KRACK vulnerability for Wifi has started spreading.  This vulnerability makes is pretty trivially easy to snoop on information sent between a device (mobile phone, laptop, etc.) and a wifi router, in some cases allowing changes to that information.  This is not a bug in code, but a mis-design in the crypto algorithm that’s used by the vast majority of Wifi connections: WPA2.

Some key facts:

  • WPA2 personal and WPA2 enterprise are vulnerable
  • the vulnerability is in the design of the code, not the implementation
    • however, Linux and Android 6.0+ implementations (which use wpa_supplicant 2.4 or higher, are even more easily attacked)
  • in order to correct this problem, BOTH the client AND the router must be patched.
    • this means that it’s not good enough just to update your laptop, but also the router in your house, business, etc.
  • Android phones typically take a long time to get patches (if at all)
    • unless you have evidence to the contrary, assume that your phone is vulnerable
  • many hotels, businesses, etc., rarely update or patch their routers
    • assume that any wifi connection that you use from now on is vulnerable unless you know that it’s been patched
  • you can continue to rely on VPNs
  • you can continue to rely on website encryption***
    • but remember that you may be betraying lots of metadata, including, of course, the address of the website that you’re visiting, to any snoopers
  • the security of IoT devices is going to continue to be a problem
    • unless their firmware can easily be patched, it’s difficult to believe that they will be safe

For my money, it’s worth investing in that VPN solution you were always promising yourself, and starting to accept the latency hit that it may cost you.

For more information in an easily readable form, I suggest heading over to The Register, which is pretty much always a good place to start.

For information in the initial publisher of this attack, visit https://www.krackattacks.com/


*well, as normal as it ever was

**hopefully

***as much as you ever could before…

That Backdoor Fallacy revisited – delving a bit deeper

…if it breaks just once that becomes always,..

A few weeks ago, I wrote a post called The Backdoor Fallacy: explaining it slowly for governments.  I wish that it hadn’t been so popular.  Not that I don’t like the page views – I do – but because it seems that it was very timely, and this issue isn’t going away.  The German government is making the same sort of noises that the British government* was making when I wrote that post**.  In other words, they’re talking about forcing backdoors in encryption.  There was also an amusing/worrying story from slashdot which alleges that “US intelligence agencies” attempted to bribe the developers of Telegram to weaken the encryption in their app.

Given some of the recent press on this, and some conversations I’ve had with colleagues, I thought it was worth delving a little deeper***.  There seem to be three sets of use cases that it’s worth addressing, and I’m going to call them TSPs, CSPs and Other.  I’d also like to make it clear here that I’m talking about “above the board” access to encrypted messages: access that has been condoned by the relevant local legal system.  Not, in other words, the case of the “spooks”.  What they get up to is for another blog post entirely****.  So, let’s look at our three cases.

TSPs – telecommunications service providers

In order to get permission to run a telecommunications service(wired or wireless) in most (all?) jurisdictions, you need to get approval from the local regulator: a licence.  This licence is likely to include lots of requirements: a typical one is that you, the telco (telecoms company) must provide access at all times to emergency numbers (999, 911, 112, etc.).  And another is likely to be that, when local law enforcement come knocking with a legal warrant, you must give them access to data and call information so that they can basically do wire-taps.  There are well-established ways to do this, and fairly standard legal frameworks within which it happens: basically, if a call or data stream is happening on a telco’s network, they must provide access to it to legal authorities.  I don’t see an enormous change to this provision in what we’re talking about.

CSPs – cloud service providers

Things get a little more tricky where cloud service providers are concerned.  Now, I’m being rather broad with my definition, and I’m going to lump your Amazons, Googles, Rackspaces and such in with folks like Facebook, Microsoft and other providers who could be said to be providing “OTT” (Over-The-Top – in that they provide services over the top of infrastructure that they don’t own) services.  Here things are a little greyer*****.  As many of these companies (some of who are telcos, how also have a business operating cloud services, just to muddy the waters further) are running messaging, email services and the like, governments are very keen to apply similar rules to them as those regulating the telcos. The CSPs aren’t keen, and the legal issues around jurisdiction, geography and what the services are complicate matter.  And companies have a duty to their shareholders, many of whom are of the opinion that keeping data private from government view is to be encouraged.  I’m not sure how this is going to pan out, to be honest, but I watch it with interest.  It’s a legal battle that these folks need to fight, and I think it’s generally more about cryptographic key management – who controls the keys to decrypt customer information – than about backdoors in protocols or applications.

Other

And so we come to other.  This bucket includes everything else.  And sadly, our friends the governments want their hands on all of that everything else.    Here’s a little list of some of that everything else.  Just a subset.  See if you can see anything on the list that you don’t think there should be unfettered access to (and remember my previous post about how once access is granted, it’s basically game over, as I don’t believe that backdoors end up staying secret only to “approved” parties…):

  • the messages you send via apps on your phone, or tablet, or laptop or PC;
  • what you buy on Amazon;
  • your banking records – whether on your phone or at the bank;
  • your emails via your company VPN;
  • the stored texts on your phone when you enquired about the woman’s shelter
  • your emails to your doctor;
  • your health records – whether stored at your insurers, your hospital or your doctor’s surgery;
  • your browser records about emergency contraception services;
  • access to your video doorbell;
  • access to your home wifi network;
  • your neighbour’s child’s chat message to the ChildLine (a charity for abused children in the UK – similar exist elsewhere)
  • the woman’s shelter’s records;
  • the rape crisis charity’s records;
  • your mortgage details.

This is a short list.  I’ve chosen emotive issues, of course I have, but they’re all legal.  They don’t even include issues like extra-marital affairs or access to legal pornography or organising dissent against oppressive regimes, all of which might well edge into any list that many people might copmile.  But remember – if a backdoor is put into encryption, or applications, then these sorts of information will start leaking.  And they will leak to people you don’t want to have them.

Our lives revolve around the Internet and the services that run on top of it.  We have expectations of privacy.  Governments have an expectation that they can breach that privacy when occasion demands.  And I don’t dispute that such an expectation is valid.  The problem that this is not the way to do it, because of that phrase “when occasion demands”.  If the occasion breaks just once, then that becomes always, and not just to “friendly” governments.  To unfriendly governments, to criminals, to abusive partners and abusive adults and bad, bad people.  This is not a fight for us to lose.


*I’m giving the UK the benefit of the doubt here: as I write, it’s unclear whether we really have a government, and if we do, for how long it’ll last, but let’s just with it for now.

**to be fair, we did have a government then.

***and not just because I like the word “delving”.  Del-ving.  Lovely.

****one which I probably won’t be writing if I know what’s good for me.

*****I’m a Brit, so I use British spelling: get over it.

Disbelieving the many eyes hypothesis

There is a view that because Open Source Software is subject to review by many eyes, all the bugs will be ironed out of it. This is a myth.

Writing code is hard.  Writing secure code is harder: much harder.  And before you get there, you need to think about design and architecture.  When you’re writing code to implement security functionality, it’s often based on architectures and designs which have been pored over and examined in detail.  They may even reflect standards which have gone through worldwide review processes and are generally considered perfect and unbreakable*.

However good those designs and architectures are, though, there’s something about putting things into actual software that’s, well, special.  With the exception of software proven to be mathematically correct**, being able to write software which accurately implements the functionality you’re trying to realise is somewhere between a science and an art.  This is no surprise to anyone who’s actually written any software, tried to debug software or divine software’s correctness by stepping through it.  It’s not the key point of this post either, however.

Nobody*** actually believes that the software that comes out of this process is going to be perfect, but everybody agrees that software should be made as close to perfect and bug-free as possible.  It is for this reason that code review is a core principle of software development.  And luckily – in my view, at least – much of the code that we use these days in our day-to-day lives is Open Source, which means that anybody can look at it, and it’s available for tens or hundreds of thousands of eyes to review.

And herein lies the problem.  There is a view that because Open Source Software is subject to review by many eyes, all the bugs will be ironed out of it.  This is a myth.  A dangerous myth.  The problems with this view are at least twofold.  The first is the “if you build it, they will come” fallacy.  I remember when there was a list of all the websites in the world, and if you added your website to that list, people would visit it****.  In the same way, the number of Open Source projects was (maybe) once so small that there was a good chance that people might look at and review your code.  Those days are past – long past.  Second, for many areas of security functionality – crypto primitives implementation is a good example – the number of suitably qualified eyes is low.

Don’t think that I am in any way suggesting that the problem is any lesser in proprietary code: quite the opposite.  Not only are the designs and architectures in proprietary software often hidden from review, but you have fewer eyes available to look at the code, and the dangers of hierarchical pressure and groupthink are dramatically increased.  “Proprietary code is more secure” is less myth, more fake news.  I completely understand why companies like to keep their security software secret – and I’m afraid that the “it’s to protect our intellectual property” line is too often a platitude they tell themselves, when really, it’s just unsafe to release it.  So for me, it’s Open Source all the way when we’re looking at security software.

So, what can we do?  Well, companies and other organisations that care about security functionality can – and have, I believe a responsibility to – expend resources on checking and reviewing the code that implements that functionality.  That is part of what Red Hat, the organisation for whom I work, is committed to doing.  Alongside that, we, the Open Source community, can – and are – finding ways to support critical projects and improve the amount of review that goes into that code*****.  And we should encourage academic organisations to train students in the black art of security software writing and review, not to mention highlighting the importance of Open Source Software.

We can do better – and we are doing better.  Because what we need to realise is that the reason the “many eyes hypothesis” is a myth is not that many eyes won’t improve code – they will – but that we don’t have enough expert eyes looking.  Yet.


* Yeah, really: “perfect and unbreakable”.  Let’s just pretend that’s true for the purposes of this discussion.

** …and which still relies on the design and architecture actually to do what you want – or think you want – of course, so good luck.

*** nobody who’s actually written more than about 5 lines of code (or more than 6 characters of Perl)

**** I added one.  They came.  It was like some sort of magic.

***** see, for instance, the Linux Foundation‘s Core Infrastructure Initiative