This isn’t one of those police dramas where a suspect parcel arrives at the precinct and someone realises just in time that it may be a bomb – what we’re talking about here is open source software packages (though the impact on your application may be similar if you’re not sufficiently suspicious). Open source software is everywhere these days – which is great – but how can you be sure that you should trust the software you’ve downloaded to do what you want? The area of software supply chain management – of which this discussion forms a part – is fairly newly visible in the industry, but is growing in importance. We’re going to consider a particular example.
There’s a huge conversation to be had here about what trust means (see my article “What is trust?” as a starting point, and I have a forthcoming book on Trust in Computing and the Cloud for Wiley), but let’s assume that you have a need for a library which provides some cryptographic protocol implementation. What do you need to know, and what are you choices? We’ll assume, for now, that you’ve already made what is almost certainly the right choice, and gone with an open source implementation (see many of my articles on this blog for why open source is just best for security), and that you don’t want to be building everything from source all the time: you need something stable and maintained. What should be your source of a new package?
Option 1 – use a vendor
There are many vendors out there now who provide open source software through a variety of mechanisms – typically subscription. Red Hat, my employer (see standard disclosure) is one of them. In this case, the vendor will typically stand behind the fitness for use of a particular package, provide patches, etc.. This is your easiest and best choice in many cases. There may be times, however, when you want to use a package which is not provided by a vendor, or not packaged by your vendor of choice: what do you do then? Equally, what decisions do vendors need to make about how to trust a package?
Option 2 – delve deeper
This is where things get complex. So complex, in fact, that I’m going to be examining them at some length in my book. For the purposes of this article, though, I’ll try to be brief. We’ll start with the assumption that there is a single maintainer of the package, and multiple contributors. The contributors provide code (and tests and documentation, etc.) to the project, and the maintainer provides builds – binaries/libraries – for you to consume, rather than your taking the source code and compiling it yourself (which is actually what a vendor is likely to do, though they still need to consider most of the points below). This is a library to provide cryptographic capabilities, so it’s fairly safe to assume that we care about its security. There are at least five specific areas which we need to consider in detail, all of them relying on the maintainer to a large degree (I’ve used the example of security here, though very similar considerations exist for almost any package): let’s look at the issues.
- build – how is the package that you are consuming created? Is the build process performed on a “clean” (that is, non-compromised) machine, with the appropriate compilers and libraries (there’s a turtles problem here!)? If the binary is created with untrusted tools, then how can we trust it at all, so what measures does the maintainer take to ensure the “cleanness” of the build environment? It would be great if the build process is documented as a repeatable build, so that those who want to check it can do so.
- integrity – this is related to build, in that we want to be sure that the source code inputs to the build process – the code coming, for instance, from a git repository – are what we expect. If, somehow, compromised code is injected into the build process, then we are in a very bad position. We want to know exactly which version of the source code is being used as the basis for the package we are consuming so that we can track features – and bugs. As above, having a repeatable build is a great bonus here.
- responsiveness – this is a measure of how responsive – or not – the maintainer is to changes. Generally, we want stable features, tied to known versions, but a quick response to bug and (in particular) security patches. If the maintainer doesn’t accept patches in a timely manner, then we need to worry about the security of our package. We should also be asking questions like, “is there a well-defined security disclosure of vulnerability management process?” (See my article Security disclosure or vulnerability management?), and, if so, “is it followed”?
- provenance – all code is not created equal, and one of the things of which a maintainer should be keeping track is the provenance of contributors. If a large amount of code in a part of the package which provides particularly sensitive features is suddenly submitted by an unknown contributor with a pseudonymous email address and no history of contributions of security functionality, this should raise alarm bells. On the other hand, if there is a group of contributors employed by a company with a history of open source contributions and well-reviewed code who submit a large patch, this is probably less troublesome. This is a difficult issue to manage, and there are typically no definite “OK” or “no-go” signs, but the maintainer’s awareness and management of contributors and their contributions is an important point to consider.
- expertise – this is the most tricky. You may have a maintainer who is excellent at managing all of the points above, but is just not an expert in certain aspects of the functionality of the contributed code. As a consumer of the package, however, I need to be sure that it is fit for purpose, and that may include, in the case of the security-related package we’re considering, being assured that the correct cryptographic primitives are used, that bounds-checking is enforced on byte streams, that proper key lengths are used or that constant time implementations are provided for particular primitives. This is very, very hard, and the job of maintainer can easily become a full-time one if they are acting as the expert for a large and/or complex project. Indeed, best practice in such cases is to have a team of trusted, experienced experts who work either as co-maintainers or as a senior advisory group for the project. Alternatively, having external people or organisations (such as industry bodies) perform audits of the project at critical junctures – when a major release is due, or when an important vulnerability is patched, for instance – allows the maintainer to share this responsibility. It’s important to note that the project does not become magically “secure” just because it’s open source (see Disbelieving the many eyes hypothesis), but that the community, when it comes together, can significantly improve the assurance that consumers of a project can have in the packages which is produces.
Once we consider these areas, we then need to work out how we measure and track each of them. Who is in a position to judge the extent to which any particular maintainer is fulfilling each of the areas? How much can we trust them? These are a set of complex issues, and one about which much more needs to be written, but I am passionate about exposing the importance of explicit trust in computing, particularly in open source. There is work going on around open source supply chain management, for instance the new (at time of writing Project Rekor – but there is lots of work still to be done.
Remember, though: when you take a package – whether library or executable – please consider what you’re consuming, what about it you can trust, and on what assurances that trust is founded.