Trust book – chapter index and summary

I thought it might be interesting to provide the chapter index and a brief summary of what each chapter addresses.

In a previous article, I presented the publisher’s blurb for my upcoming book with Wiley, Trust in Computer Systems and the Cloud. I thought it might be interesting, this time around, to provide the chapter index of the book and to give a brief summary of what each chapter addresses.

While it’s possible to read many of the chapters on their own, I have tried to maintain a logical progression of thought through the book, building on earlier concepts to provide a framework that can be used in the real world. It’s worth noting that the book is not about how humans trust – or don’t trust – computers (there’s a wealth of literature around this topic), but about how to consider the issue of trust between computing systems, and what we can say about the assurances that computing systems can make, or that can be made about them. This may sound complex, and it is – which is pretty much why I decided to write the book in the first place!

  • Introduction
    • Why I think this is important, and how I came to the subject.
  • Chapter 1 – Why Trust?
    • Trust as a concept, and why it’s important to security, organisations and risk management.
  • Chapter 2 – Humans and Trust
    • Though the book is really about computing and trust, and not humans and trust, we need a grounding in how trust is considered, defined and talked about within the human realm if we are to look at it in our context.
  • Chapter 3 – Trust Operations and Alternatives
    • What are the main things you might want to do around trust, how can we think about them, and what tools/operations are available to us?
  • Chapter 4 – Defining Trust in Computing
    • In this chapter, we delve into the factors which are specific to trust in computing, comparing and contrasting them with the concepts in chapter 2 and looking at what we can and can’t take from the human world of trust.
  • Chapter 5 – The Importance of Systems
    • Regular readers of this blog will be unsurprised that I’m interested in systems. This chapter examines why systems are important in computing and why we need to understand them before we can talk in detail about trust.
  • Chapter 6 – Blockchain and Trust
    • This was initially not a separate chapter, but is an important – and often misunderstood or misrepresented – topic. Blockchains don’t exist or operate in a logical or computational vacuum, and this chapter looks at how trust is important to understanding how blockchains work (or don’t) in the real world.
  • Chapter 7 – The Importance of Time
    • One of the important concepts introduced earlier in the book is the consideration of different contexts for trust, and none is more important to understand than time.
  • Chapter 8 – Systems and Trust
    • Having introduced the importance of systems in chapter 5, we move to considering what it means to establish a trust relationship from or to a system, and how the extent of what is considered part of the system is vital.
  • Chapter 9 – Open Source and Trust
    • Another topic whose inclusion is unlikely to surprise regular readers of this blog, this chapter looks at various aspects of open source and how it relates to trust.
  • Chapter 10 – Trust, the Cloud, and the Edge
    • Definitely a core chapter in the book, this addresses the complexities of trust in the modern computing environments of the public (and private) cloud and Edge networks.
  • Chapter 11 – Hardware, Trust, and Confidential Computing
    • Confidential Computing is a growing and important area within computing, but to understand its strengths and weaknesses, there needs to be a solid theoretical underpinning of how to talk about trust. This chapter also covers areas such as TPMs and HSMs.
  • Chapter 12 – Trust Domains
    • Trust domains are a concept that allow us to apply the lessons and frameworks we have discussed through the book to real-world situations at large scale. They also allow for modelling at the business level and for issues like risk management – introduced at the beginning of the book – to be considered more explicitly.
  • Chapter 13 – A World of Explicit Trust
    • Final musings on what a trust-centric (or at least trust-inclusive) view of the world enables, and hopes for future work in the field.
  • References
    • List of works cited within the book.

Trust book preview

What it means to trust in the context of computer and network security

Just over two years ago, I agreed a contract with Wiley to write a book about trust in computing. It was a long road to get there, starting over twenty years ago, but what pushed me to commit to writing something was a conference I’d been to earlier in 2019 where there was quite a lot of discussion around “trust”, but no obvious underlying agreement about what was actually meant by the term. “Zero trust”, “trusted systems”, “trusted boot”, “trusted compute base” – all terms referencing trust, but with varying levels of definition, and differing understanding of what was being expected, by what components, and to what end.

I’ve spent a lot of time thinking about trust over my career and also have a major professional interest in security and cloud computing, specifically around Confidential Computing (see Confidential computing – the new HTTPS? and Enarx for everyone (a quest) for some starting points), and although the idea of a book wasn’t a simple one, I decided to go for it. This week, we should have the copy-editing stage complete (technical editing already done), with the final stage being proof-reading. This means that the book is close to done. I can’t share a definitive publication date yet, but things are getting there, and I’ve just discovered that the publisher’s blurb has made it onto Amazon. Here, then, is what you can expect.


Learn to analyze and measure risk by exploring the nature of trust and its application to cybersecurity 

Trust in Computer Systems and the Cloud delivers an insightful and practical new take on what it means to trust in the context of computer and network security and the impact on the emerging field of Confidential Computing. Author Mike Bursell’s experience, ranging from Chief Security Architect at Red Hat to CEO at a Confidential Computing start-up, grounds the reader in fundamental concepts of trust and related ideas before discussing the more sophisticated applications of these concepts to various areas in computing.

The book demonstrates the importance of understanding and quantifying risk and draws on the social and computer sciences to explain hardware and software security, complex systems, and open source communities. It takes a detailed look at the impact of Confidential Computing on security, trust and risk and also describes the emerging concept of trust domains, which provide an alternative to standard layered security.

  • Foundational definitions of trust from sociology and other social sciences, how they evolved, and what modern concepts of trust mean to computer professionals 
  • A comprehensive examination of the importance of systems, from open-source communities to HSMs, TPMs, and Confidential Computing with TEEs
  • A thorough exploration of trust domains, including explorations of communities of practice, the centralization of control and policies, and monitoring 

Perfect for security architects at the CISSP level or higher, Trust in Computer Systems and the Cloud is also an indispensable addition to the libraries of system architects, security system engineers, and master’s students in software architecture and security. 

Does my TCB look big in this?

The smaller your TCB the less there is to attack, and that’s a good thing.

This isn’t the first article I’ve written about Trusted Compute Bases (TCBs), so if the concept is new to you, I suggest that you have a look at What’s a Trusted Compute Base? to get an idea of what I’ll be talking about here. In that article, I noted the importance of the size of the TCB: “what you want is a small, easily measurable and easily auditable TCB on which you can build the rest of your system – from which you can build a ‘chain of trust’ to the other parts of your system about which you care.” In this article, I want to take some time to discuss the importance of the size of a TCB, how we might measure it, and how difficult it can be to reduce the TCB size. Let’s look at all of those issues in order.

Size does matter

However you measure it – and we’ll get to that below – the size of the TCB matters for two reasons:

  1. the larger the TCB is, the more bugs there are likely to be;
  2. the larger the TCB is, the larger the attack surface.

The first of these is true of any system, and although there are ways of reducing the number of bugs – proving the correctness of all or, more likely, part of the system, for instance – bugs are both tricky to remove and resilient: if you remove one, you may well be introducing another (or worse, several). Now, the kinds of bugs you have, and the number of them, can be reduced through a multitude of techniques, from language choice (choosing Rust over C/C++ to reduce memory allocation errors, for instance) to better specification and on to improved test coverage and fuzzing. In the end, however, the smaller the TCB, the less code (or hardware – we’re considering the broader system here, don’t forget) you have to trust, and the less space there is for bugs in it.
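
To make the language-choice point concrete, here’s a tiny (illustrative, not prescriptive) Rust sketch: an out-of-bounds access – a classic source of C/C++ vulnerabilities – is an explicit, safely handled case rather than undefined behaviour.

```rust
fn main() {
    let data = vec![1, 2, 3];
    // In C, reading past the end of an array is undefined behaviour; in
    // Rust, .get() makes the out-of-bounds case explicit and safe.
    match data.get(5) {
        Some(value) => println!("value: {value}"),
        None => println!("index 5 is out of bounds - handled safely"),
    }
}
```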

The concept of an attack surface is important, and, like TCBs, one I’ve introduced before (in What’s an attack surface?). Like bugs, there may be no absolute measure of the ratio of danger:attack surface, but the smaller your TCB, well, the less there is to attack, and that’s a good thing. As with bug reduction, there are a number of techniques you may want to apply to reduce your attack surface, but the smaller it is, then, by definition, the fewer opportunities attackers have to try to compromise your system.

Measurement

Measuring the size of your TCB is really, really hard – or, maybe I should say that coming up with an absolute measure that you can compare to other TCBs is really, really hard. The problem is that there are so many measurements that you might take. The ones you care about are probably those that can be related to attack surface – but there are so many different attack vectors that might be relevant to a TCB that there are likely to be multiple attack surfaces. Let’s look at some of the possible measurements:

  • number of API methods
  • amount of data that can be passed across each API method
  • number of parameters that can be passed across each API method
  • number of open network sockets
  • number of open local (e.g. UNIX) sockets
  • number of files read from local storage
  • number of dynamically loaded libraries
  • number of DMA (Direct Memory Access) calls
  • number of lines of code
  • amount of compilation optimisation carried out
  • size of binary
  • size of executing code in memory
  • amount of memory shared with other processes
  • use of various caches (L1, L2, etc.)
  • number of syscalls made
  • number of strings visible using strings command or similar
  • number of cryptographic operations not subject to constant time checks

This is not meant to be an exhaustive list, but just to show the range of different areas in which vulnerabilities might appear. Designing your application to reduce one may increase another – one very simple example being an attempt to reduce the number of API calls exposed by increasing the number of parameters on each call, another being to reduce the size of the binary by using more dynamically linked libraries.

This leads us to an important point which I’m not going to address in detail in this article, but which is fundamental to understanding TCBs: that without a threat model, there’s actually very little point in considering what your TCB is.
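
To make that concrete, here’s a hypothetical sketch – the struct, field names and weights are all invented for illustration, not drawn from any standard – of how the same set of measurements can yield quite different “sizes” once a threat model weights them:

```rust
// Invented names throughout: a toy model of weighting TCB measurements
// by threat model, not any standard metric.
struct TcbMetrics {
    api_methods: u32,
    open_network_sockets: u32,
    lines_of_code: u32,
    dynamically_loaded_libs: u32,
}

struct ThreatModel {
    name: &'static str,
    // Relative weight each metric carries under this model.
    weights: [f64; 4],
}

fn score(m: &TcbMetrics, t: &ThreatModel) -> f64 {
    let values = [
        m.api_methods as f64,
        m.open_network_sockets as f64,
        m.lines_of_code as f64,
        m.dynamically_loaded_libs as f64,
    ];
    values.iter().zip(t.weights).map(|(v, w)| v * w).sum()
}

fn main() {
    let tcb = TcbMetrics {
        api_methods: 12,
        open_network_sockets: 2,
        lines_of_code: 40_000,
        dynamically_loaded_libs: 5,
    };
    // A remote attacker cares about sockets and APIs; a local attacker
    // cares more about what gets loaded alongside the process.
    let remote = ThreatModel { name: "remote attacker", weights: [5.0, 50.0, 0.001, 1.0] };
    let local = ThreatModel { name: "local attacker", weights: [2.0, 1.0, 0.001, 20.0] };
    for model in [&remote, &local] {
        println!("{}: exposure score {:.1}", model.name, score(&tcb, model));
    }
}
```

The same TCB scores differently under each model – which is exactly why a single “my TCB is smaller than yours” number tells you so little.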

Reducing the TCB size

We’ve just seen one of the main reasons that reducing your TCB size is difficult: it’s likely to involve trade-offs between different measures. If all you’re trying to do is produce competitive marketing material where you say “my TCB is smaller than yours”, then you’re likely to miss the point. The point of a TCB is to have a well-defined computing base which can protect against specific threats. This requires you to be clear about exactly what functionality needs to be trusted, where it sits in the system, and how the other components in the system rely on it: what trust relationships they have. I was speaking to a colleague just yesterday who was relaying a story of a software project whose team said, “we’ve reduced our TCB to this tiny component by designing it very carefully and checking how we implement it”, but who had overlooked the fact that the rest of the stack – which contained a complete Linux distribution and applications – could be no more trusted than before. The threat model (if there was one – we didn’t get into details) seemed to assume that only the TCB would be attacked, which missed the point entirely: it just added another “turtle” to the stack without actually fixing the problem presumably at issue – improving the security of the system.

Reducing the TCB by artificially defining what the TCB is to suit your capabilities or particular beliefs around what the TCB specifically should be protecting against is not only unhelpful but actively counter-productive. This is because it ignores the fact that a TCB is there to serve the needs of a broader system, and if it is considered in isolation, then it becomes irrelevant: what is it acting as a base for?

In conclusion, it’s all very well saying “we have a tiny TCB”, but you need to know what you’re protecting, from what, and how.

Dependencies and supply chains

A dependency is a component that an application or another component needs in order to work

Supply chain security is really, really hot right now. It’s something which folks in the “real world” of manufactured things have worried about for years – you might be surprised (and worried) how much attention aircraft operators need to pay to “counterfeit parts”, and I wish I hadn’t just searched online for the words “counterfeit pharmaceutical” – but the world of software had a rude wake-up call recently with the SolarWinds hack (or crack, if you prefer). This isn’t the place to go over that: you’ll be able to find many, many articles if you search on it. In fact, many companies have been worrying about this for a while, but the change is that it’s an issue which is now out in the open, giving more leverage to those who want more transparency around what software they consume in their products or services.

When we in computing (particularly software) think about supply chains, we generally talk about “dependencies”, and I thought it might be useful to write a short article explaining the main dependency types.

What is a dependency?

A dependency is a component that an application or another component needs in order to work, and dependencies are generally considered to come in two types:

  • build-time dependencies
  • run-time dependencies.

Let’s talk about those two types in turn.

Build-time dependencies

These are components which are required in order to build (typically compile and package) your application or library. For example, if I’m writing a program in Rust, I have a dependency on the compiler if I want to create an application. I’m actually likely to have many more build-time dependencies than that, however. How those dependencies are made visible to me will depend on the programming language and the environment that I’m building in.

Some languages, for instance, may have filesystem support built in, but others will require you to “import” one or more libraries in order to read and write files. Importing a library basically tells your build-time environment to look somewhere (local disk, online repository, etc.) for a library, and then bring it into the application, allowing its capabilities to be used. In some cases, you will be taking pre-built libraries, and in others, your build environment may insist on building them itself. Languages like Rust have clever environments which can look for new versions of a library, download them and compile them without your having to worry about it yourself (though you can if you want!).

To get back to our filesystem example, even if the language does come with built-in filesystem support, you may decide to import a different library – maybe you need some fancy distributed, sharded file system, for instance – from a different supplier. Other capabilities may not be provided by the language, or may be higher-level capabilities: JSON serialisation or HTTPS support, for instance. Whether that library is available in open source may have a large impact on your decision as to whether or not to use it.
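
As a simple illustration of a build-time dependency, here’s a fragment which uses the serde_json crate for JSON serialisation; once it’s declared in Cargo.toml, Cargo fetches and compiles it at build time:

```rust
// serde_json is a build-time dependency here: declared in Cargo.toml
// (e.g. serde_json = "1") and fetched and compiled by Cargo when we build.
use serde_json::json;

fn main() {
    let record = json!({ "component": "logger", "version": "1.2.3" });
    println!("{record}");
}
```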

Build-time dependencies, then, require you to have the pieces you need – either pre-built or in source code form – at the time that you’re building your application or library.

Run-time dependencies

Run-time dependencies, as the name implies, only come into play when you actually want to run your application. We can think of there being two types of run-time dependency:

  1. service dependency – this may not be the official term, but think of an application which needs to write some data to a window on a monitor screen: in most cases, there’s already a display manager and a window manager running on the machine, so all the application needs to do is contact them and communicate the right data over an API. Sometimes, the underlying operating system may need to start these managers first, but it’s not the application itself which is having to do that. These are local services, but remote services – accessing a database, for instance – operate in the same sort of way. As long as the application can contact and communicate with the database, it doesn’t need to do much itself. There’s still a dependency, and things can definitely go wrong, but it’s a weak coupling to an external application or service.
  2. dynamic linking – this is where an application needs access to a library at run-time, but rather than having added it at build-time (“static linking”), it relies on the underlying operating system to provide a link to the library when it starts executing. This means that the application doesn’t need to be as large (it’s not “carrying” the functionality with it when it’s installed), but it does require that the version that the operating system provides is compatible with what’s expected, and does the right thing.
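
To make the second type concrete, here’s a minimal sketch using the libloading crate to do by hand what the operating system’s loader normally does for a dynamically linked binary. The library name assumes a typical Linux system, and this is an illustration rather than recommended practice:

```rust
// libloading (declared in Cargo.toml, e.g. libloading = "0.8") lets us do
// by hand what the loader does for dynamically linked binaries. If the
// library is missing or incompatible, we only find out now, at run-time -
// exactly the risk described above.
use libloading::{Library, Symbol};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    unsafe {
        let libm = Library::new("libm.so.6")?; // assumes a typical Linux system
        let cos: Symbol<unsafe extern "C" fn(f64) -> f64> = libm.get(b"cos")?;
        println!("cos(0.0) = {}", cos(0.0));
    }
    Ok(())
}
```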

Conclusion

I’ve resisted the temptation to go into the impact of these different types of dependency on security. That’s a much longer article – or set of articles – but it’s worth considering where in the supply chain we consider these dependencies to live, and who controls them. Do I expect application developers to check every single language dependency, or just imported libraries? To what extent should application developers design in protections from malicious (or just poorly-written) dynamically-linked libraries? Where does the responsibility lie for service dependencies – particularly remote ones?

These are all complex questions: the supply chain is not a simple topic (partly because there is not just one supply chain, but many of them), and organisations need to think hard about how they work.

They won’t get security right

Save users from themselves: make it difficult to do the wrong thing.

I’m currently writing a book about trust in computing and the cloud – I’ve mentioned it before – and I confidently expect to reach 50% of my projected word count today, as I’m on holiday, have more time to write, and got within about 850 words of the goal yesterday. That little boast aside, one of the topics that I’ve been writing about is the need to consider the contexts in which the systems you design and implement will be used.

When we design systems, there’s a temptation – a laudable one, in many cases – to provide all of the features and functionality that anyone could want, to implement all of the requests from customers, to accept every enhancement request that comes in from the community. Why is this? Well, for a variety of reasons, including:

  • we want our project or product to be useful to as many people as possible;
  • we want our project or product to match the capabilities of another competing one;
  • we want to help other people and be seen as responsive;
  • it’s more interesting implementing new features than marking an existing set complete, and settling down to bug fixing and technical debt management.

These are all good – or at least understandable – reasons, but I want to argue that there are times that you absolutely should not give in to this temptation: that, in fact, on every occasion that you consider adding a new feature or functionality, you should step back and think very hard whether your product would be better if you rejected it.

Don’t improve your product

This seems, on the face of it, to be insane advice, but bear with me. One of the reasons is that many techies (myself included) are more interested in getting code out of the door than in weighing up alternative implementation options. Another reason is that every opportunity to add a new feature is also an opportunity to deal with technical debt or improve the documentation and architectural information about your project. But the other reasons are to do with security.

Reason 1 – attack surface

Every time that you add a feature, a new function, a parameter on an interface or an option on the command line, you increase the attack surface of your code. Whether by fuzzing, targeted probing or careful analysis of your design, the larger the attack surface of your code, the more opportunities there are for attackers to find vulnerabilities, create exploits and mount attacks on instances of your code. Strange as it may seem, adding options and features for your customers and users can often be doing them a disservice: making them more vulnerable to attacks than they would have been if you had left well enough alone.

If we do not need an all-powerful administrator account after initial installation, then it makes sense to delete it after it has done its job. If logging of all transactions might yield information of use to an attacker, then it should be disabled in production. If older versions of cryptographic functions might lead to protocol attacks, then it is better to compile them out than just to turn them off in a configuration file. All of these lead to reductions in attack surface which ultimately help safeguard users and customers.
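
As a sketch of the “compile it out” approach, here’s how a Cargo feature flag might be used to exclude legacy functionality from a build entirely; the feature name legacy-crypto is invented for this example:

```rust
// The invented feature name "legacy-crypto" would be declared under
// [features] in Cargo.toml. Built without it, the legacy code path does
// not exist in the binary at all - there is nothing there to attack.

#[cfg(feature = "legacy-crypto")]
fn negotiate_legacy_cipher() {
    println!("negotiating legacy cipher suite");
}

fn main() {
    #[cfg(feature = "legacy-crypto")]
    negotiate_legacy_cipher();

    println!("running with the reduced, default attack surface");
}
```

Turning the same thing off in a configuration file would leave the code present and reachable by anyone who could change that file; compiling it out removes it from the attack surface altogether.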

Reason 2 – saving users from themselves

The other reason is also about helping your users and customers: saving them from themselves. There is a dictum – somewhat unfair – within computing that “users are stupid”. While this overstates the case, it is fairer to note that Murphy’s Law holds in computing as it does everywhere else: “Anything that can go wrong, will go wrong”. Specific to our point, some user somewhere can be counted upon to use the system that you are designing, implementing or operating in ways which are at odds with your intentions. IT security experts, in particular, know that we cannot stop people doing the wrong thing, but where there are opportunities to make it difficult to do the wrong thing, then we should embrace them.

Not adding features, disabling capabilities and restricting how your product is used might seem counter-intuitive, but if it leads to a safer user experience and fewer vulnerabilities associated with your product or project, then in the end, everyone benefits. And you can use the time to go and write some documentation. Or go to the beach. Enjoy!

An Enarx milestone: binaries

Demoing the same binary in very different TEEs.

This week is Red Hat Summit, which is being held virtually for the first time because of the Covid-19 crisis. The lock-down has not affected the productivity of the Enarx team, however (at least not negatively), as we have a very exciting demo that we will be showing at Summit. This post should be published at 1100 EDT, 1500 BST, 1400 GMT on Tuesday, 2020-04-28, which is the time that the session which Nathaniel McCallum and I recorded will be released to the world. I hope to be able to link to the recording once it’s available. But what will we be showing?

Well, to set the scene, and to discover a little more about the Enarx project, you might want to read these articles first (also available in Japanese – visit each article for a link):

Enarx, as you’ll discover, is about running workloads in TEEs (Trusted Execution Environments), using WebAssembly, in what we call “Keeps”. It’s a mammoth job, particularly as we’re abstracting away the underlying processor architectures (currently two: Intel’s SGX and AMD’s SEV), so that you, the user, don’t need to worry about them: all you need to do is write and compile your application, then request that it be deployed. Enarx, then, has lots of moving parts, and one of the key tasks for us has been to start the work to abstract away the underlying processor architectures so that we can prepare the runtime layers on top. Here’s a general picture of the software layers, and how they sit on top of the hardware platforms:

What we’re announcing – and demoing – today is that we have an initial implementation of code to allow us to abstract away process-based and VM-based types of architecture (with examples for SGX and SEV), so that we can do this:

This seems deceptively simple, but what’s actually going on under the covers is rather more than is exposed in the picture above. The reality is more like this:

This gives more detail: the application that’s running on both architectures (SGX on the left, SEV on the right) is the very same ELF static-PIE binary. To be clear, this is not only the same source code, compiled for different platforms, but exactly the same binary, with the very same hash signature. What’s pretty astounding about this is that in order to make it run on both platforms, the engineering team has had to write two sets of seriously low-level code, including more than a little Assembly language, providing the “plumbing” to allow the binary to run on both.

This is a very big deal, because although we’ve only implemented a handful of syscalls on each platform – enough to make our simple binary run and print out a message – we now have a framework on which we know we can build. And what’s next? Well, we need to expand that framework so that we can then build the WebAssembly layers which will allow WebAssembly applications to run on top:

There’s a long way to go, but this milestone shows that we have an initial framework which we can improve, and on which we can build.

What’s next?

What’s exciting about this milestone from our point of view is that we think it puts Enarx at a stage where more people can join and take part. There’s still lots of low-level work to be done, but it’s going to be easier to split up now, and also to start some of the higher level work, too. Enarx is completely open source, and we do all of our design work in the open, along with our daily stand-ups. You’re welcome to browse our documentation, RFCs (mostly in draft at the moment), raise issues, and join our calls. You can find loads more information on the Enarx wiki: we look forward to your involvement in the project.

Last, and not least, I’d like to take this chance to note that we now have testing/CI/CD resources available for the project, with both Intel SGX and AMD SEV systems available to us, all courtesy of Packet. This is amazingly generous, and we both thank them and encourage you to visit them and look at their offerings for yourself!

No security without an architecture

Your diagrams don’t need to be perfect. But they do need to be there.

I attended a virtual demo this week. It didn’t work, but none of us was stressed by that: it was an internal demo, and these things happen. Luckily, the members of the team presenting the demo had lots of information about what it would have shown us, and a particularly good architectural diagram to discuss. We’ve all been in the place where the demo doesn’t work, and all felt for the colleague who was presenting the slidedeck, and on whose screen a message popped up a few slides in, saying “Demo NO GO!” from one of her team members.

After apologies, she asked if we wanted to bail completely, or to discuss the information they had to hand. We opted for the latter – after all, most demos which aren’t foregrounding user experience components don’t show much beyond terminal windows that most of us could fake up in half an hour or so anyway. She answered a couple of questions, and then I piped up with one about security.

This article could have been about the failures in security in a project which was showing an early demo: another example of security being left till late (often too late) in the process, at which point it’s difficult and expensive to integrate. However, it’s not. It was clear that thought had been given to specific aspects of security, both on the network (in transit) and in storage (at rest), and though there was probably room for improvement (and when isn’t there?), a team member messaged me more documentation during the call which allowed me to understand the choices the team had made.

What this article is about is the fact that we were able to have a discussion at all. The slidedeck included an architecture diagram showing all of the main components, with arrows showing the direction of data flows. It was clear, colour-coded to show the provenance of the different components, which were sourced from external projects, which from internal, and which were new to this demo. The people on the call – all technical – were able to see at a glance what was going on, and the team lead, who was providing the description, had a clear explanation for the various flows. Her team members chipped in to answer specific questions or to provide more detail on particular points. This is how technical discussions should work, and there was one thing in particular which pleased me (beyond the fact that the project had thought about security at all!): that there was an architectural diagram to discuss.

There are not enough security experts in the world to go around, which means that not every project will have the opportunity to get every stage of their design pored over by a member of the security community. But when it’s time to share, a diagram is invaluable. I hate to think about the number of times I’ve been asked to look at a project in order to give my thoughts about security aspects, only to find that all that’s available is a mix of code and component documentation, with no explanation of how it all fits together and, worse, no architecture diagram.

When you’re building a project, you and your team are often so into the nuts and bolts that you know how it all fits together, and can hold it in your head, or describe the key points to a colleague. The problem comes when someone needs to ask questions of a different type, or review the architecture and design from a different slant. A picture – an architectural diagram – is a great way to educate external parties (or new members of the project) in what’s going on at a technical level. It also has a number of extra benefits:

  • it forces you to think about whether everything can be described in this way;
  • it forces you to consider levels of abstraction, and what should be shown at what levels;
  • it can reveal assumptions about dependencies that weren’t previously clear;
  • it helps show data flows between the various components;
  • it allows for simpler conversations with people whose first language is not that of your main documentation.

To be clear, this isn’t just a security problem – the same can go for other non-functional requirements such as high-availability, data consistency, performance or resilience – but I’m a security guy, and this is how I experience the issue. I’m also aware that I have a very visual mind, and this is how I like to get my head around something new, but even for those who aren’t visually inclined, a diagram at least offers the opportunity to orient yourself and work out where you need to dive deeper into code or execution. I also believe that it’s next to impossible for anybody to consider all the security implications (or any of the higher-order emergent characteristics and qualities) of a system of any significant complexity without architectural diagrams. And that includes the people who designed the system, because no system exists on its own (or there’s no point to it), so you can’t hold all of those pieces in your head for any length of time.

I’ve written before about the book Building Evolutionary Architectures, which does a great job in helping projects think about managing requirements which can morph or change their priority, and which, unsurprisingly, makes much use of architectural diagrams. Enarx, a project with which I’m closely involved, has always had lots of diagrams, and I’m aware that there’s an overhead involved here, both in updating diagrams as designs change and in considering which abstractions to provide for different consumers of our documentation, but I truly believe that it’s worth it. Whenever we introduce new people to the project or give a demo, we ensure that we include at least one diagram – often more – and when we get questions at the end of a presentation, they are almost always preceded with a phrase such as, “could you please go back to the diagram on slide x?”.

I nearly published this article without adding another point: this is part of being “open”. I’m a strong open source advocate, but source code isn’t enough to make a successful project, or even, I would add, to be a truly open source project: your documentation should not just be available to everybody, but accessible to everyone. If you want to get people involved, then providing a way in is vital. But beyond that, I think we have a responsibility (and opportunity!) towards diversity within open source. Providing diagrams helps address four types of diversity (at least!):

  • people whose first language is not the same as that of your main documentation (noted above);
  • people who have problems reading lots of text (e.g. those with dyslexia);
  • people who think more visually than textually (like me!);
  • people who want to understand your project from different points of view (e.g. security, management, legal).

If you’ve ever visited a project on github (for instance), with the intention of understanding how it fits into a larger system, you’ll recognise the sigh of relief you experience when you find a diagram or two on (or easily reached from) the initial landing page.

And so I urge you to create diagrams, both for your benefit, and also for anyone who’s going to be looking at your project in the future. They will appreciate it (and so should you). Your diagrams don’t need to be perfect. But they do need to be there.

Enarx goes multi-platform

Now with added SGX!

Yesterday, Nathaniel McCallum and I presented a session “Confidential Computing and Enarx” at Open Source Summit Europe. As well as some new information on the architectural components for an Enarx deployment, we had a new demo. What’s exciting about this demo is that it shows off attestation and encryption on Intel’s SGX. Our initial work focussed on AMD’s SEV, so this is our first working multi-platform work flow. We’re very excited, particularly as this week a number of the team will be attending the first face-to-face meetings of the Confidential Computing Consortium, at which we’ll be submitting Enarx as a project for contribution to the Consortium.

The demo had been the work of several people, but I’d like to call out Lily Sturmann in particular, who got things working late at night her time, with little time to spare.

What’s particularly important about this news is that SGX has a very different approach to providing a TEE compared with the other technology on which Enarx was previously concentrating, SEV. Whereas SEV provides a VM-based model for a TEE, SGX works at the process level. Each approach has different advantages and offers different challenges, and the very different models that they espouse mean that developers wishing to target TEEs have some tricky decisions to make about which to choose: the run-time models are so different that developing for both isn’t really an option. Add to that the significant differences in attestation models, and there’s no easy way to address more than one silicon platform at a time.

Which is where Enarx comes in. Enarx will provide platform independence both for attestation and run-time, on process-based TEEs (like SGX) and VM-based TEEs (like SEV). Our work on SEV and SGX is far from done, and we plan to support more silicon platforms as they become available. On the attestation side (which we demoed yesterday), we’ll provide software to abstract away the different approaches. On the run-time side, we’ll provide a W3C-standardised WebAssembly environment to allow you to choose at deployment time what host you want to execute your application on, rather than having to choose at development time where you’ll be running your code.

This article has sounded a little like a marketing pitch, for which I apologise. As one of the founders of the project, alongside Nathaniel, I’m passionate about Enarx, and would love you, the reader, to become passionate about it, too. Please visit enarx.io for more information – we’d love to tell you more about our passion.

What’s a Trusted Compute Base?

Tamper-evidence, auditability and measurability are three important properties.

A few months ago, in an article called “Turtles – and chains of trust“, I briefly mentioned Trusted Compute Bases, or TCBs, but then didn’t go any deeper.  I had a bit of a search across the articles on this blog, and realised that I’ve never gone into this topic in much detail, which feels like a mistake, so I’m going to do it now.

First of all, let’s think about computer systems.  When I talk about systems, I’m being both quite specific (see Systems security – why it matters) and quite broad (I don’t just mean the computer that sits on your desk or in a data centre, but include phones, routers, aircraft navigation devices – pretty much anything that has a set of chips inside it).  There are surely some systems that you don’t rely on too much to do important things, but in most cases, you’re going to care if they go wrong, or, more relevant to this discussion, if they get compromised.  Even the most benign of systems – a smart light-bulb, for instance – can become a nightmare if compromised.  Even if you don’t particularly care whether you can continue to use it in the way it was intended, there are still worries about its misuse in the case of compromise:

  1. it may become a “jumping off point” for malicious attacks into your network or other systems;
  2. it may be used as part of a botnet, piggybacking on your network to attack other systems (leading to sanctions against your legitimate systems from outside);
  3. it may be used as part of a botnet, using up resources such as network bandwidth, storage or electricity (leading to resource constraints or increased charges).

For any systems dealing with sensitive data – anything from your messages to loved ones on your phone, through intellectual property secrets for a manufacturing organisation, to National Security data for a government department – these issues are compounded.  In order to protect your system, you can’t just say “this system is secure” (lovely as that would be).  What can you do to start making statements about the general security of a system?

The stack

Systems consist of multiple components, and modern computing systems are typically composed from multiple layers (one of my favourite xkcd comics, Stack, shows some of them).  What’s relevant from the point of view of this article is that, on the whole, the different layers of the stack start up – boot up – from the bottom upwards.  This means, following the “bottom turtle” rule (see the Turtles article referenced above), that we need to ensure that the bottom layer is as secure as possible.  In fact, in order to build a system in which we can have assurance that it will behave as expected and designed (in other words, a system in which we can have a trust relationship), we need to build a Trusted Compute Base.  This should have at least the following set of properties: tamper-evidence, auditability and measurability, all of which are related to each other.

Tamper-evidence

We want to know if the TCB – on which we are building everything else – has a problem.  Specifically, we need a set of layers or components that we are pretty sure have not been compromised, or which, if compromised, will be tamper-evident:

  • fail in expected ways,
  • refuse to start, or
  • flag that they have been compromised.

It turns out that this is not easy, and typically becomes more difficult as you move up the stack – partly because you’re adding more layers, and partly because those layers tend to get more complex.

Our TCB should have the properties listed above (around failure, refusing to start or compromise-flagging), and be as small as possible.  This seems the wrong way around: surely you would want to ensure that as much of your system was trusted as possible?  In fact, what you want is a small, easily measurable and easily auditable TCB on which you can build the rest of your system – from which you can build a “chain of trust” to the other parts of your system about which you care.  Auditability and measurability are the other two properties that you want in a TCB, and these two properties are why open source is a very useful tool in your toolkit when building a TCB.

Auditability (and open source)

Auditability means that you – or someone else who you trust to do the job – can look into the various components of the TCB and assure yourself that they have been written, compiled and  are executing properly.  As I explained in Of projects, products and (security) community, the person may not always be you, or even someone in your organisation, but if you’re using widely deployed open source software, the rest of the community can be doing that auditing for you, which is a win for you and – if you contribute your knowledge back into the community – for everybody else as well.

Auditability typically gets harder the further you go down the stack – partly because you’re getting closer and closer to bits – ones and zeros – and to actual electrons, and partly because there is very little truly open source hardware out there at the moment.  However, the more that we can see and audit of the TCB, the more confidence we can have in it as a building block for the rest of our system.

Measurability (and open source)

The other thing you want to be able to do is ensure that your TCB really is your TCB.  Tamper-evidence is related to this, but that’s a run-time property only (for software components, at least).  Being able to measure when you provision your system and then to check that what you originally loaded is still what you think it should be when you boot it is a very important property of a TCB.  If what you’re running is open source, you can check it yourself, against your own measurements and those of the community, and if changes are made – by you or others – those changes can be checked (as part of auditing) and then propagated through measurement checking to the rest of the community.  Equally important – and much more difficult – is run-time measurability.  This turns out to be very difficult to do, although there are some techniques emerging which are beginning to get traction – for now, we tend to rely on tamper-evidence, which is easier in hardware than software.
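
Here’s a minimal sketch of that provision-then-check pattern, using the sha2 crate; the path and the recorded digest are placeholders, and a real TCB measurement would be anchored in hardware (a TPM extending PCRs, for instance) rather than in a userspace program:

```rust
// sha2 (e.g. sha2 = "0.10" in Cargo.toml) provides the hash; the component
// path and the recorded digest below are placeholders for this sketch.
use sha2::{Digest, Sha256};
use std::fs;

fn main() -> std::io::Result<()> {
    let image = fs::read("/boot/vmlinuz")?; // placeholder component
    let measured = format!("{:x}", Sha256::digest(&image));
    let expected = "<digest recorded at provisioning time>"; // placeholder
    if measured == expected {
        println!("measurement matches - this is what we provisioned");
    } else {
        println!("measurement mismatch - do not extend trust to this component");
    }
    Ok(())
}
```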

Summary

Trusted Compute Bases (TCBs) are a key concept in building systems that we hope will behave in ways we expect – or allow us to find out when they are not.  Tamper-evidence, auditability and measurability are three important properties that they should display, and it turns out that open source is an important factor in helping us ensure two of those.


Enarx for everyone (a quest)

In your backpack, the only tool that you have to protect you is Enarx…

You are stuck in a deep, dark wood, with spooky noises and roots that seem to move and trip you up.  Behind every tree malevolent eyes look out at you.  You look in your backpack and realise that the only tool that you have for your protection is Enarx, the trusty open source project given you by the wizened old person at the beginning of your quest.  Hard as you try, you can’t remember what it does, or how to use it.  You realise that now is the time to find out.

What do you do next?

  • If you are a business person, go to 1. Why I need Enarx to reduce business risk.
  • If you are an architect, go to 2. How I can use Enarx to protect sensitive data.
  • If you are a techy, go to 3. Tell me more about Enarx technology (I can take it).

1. Why I need Enarx to reduce business risk

You are the wise head upon which your business relies to consider and manage risk.  One of the problems that you run into is that you have sensitive data that needs to be protected.  Financial data, customer data, legal data, payroll data: it’s all at risk of compromise if it’s not adequately protected.  Who can you trust, however?  You want to be able to use public clouds, but the risks of keeping and processing information on systems which are not under your direct control are many and difficult to quantify.  Even your own systems are vulnerable to outdated patches, insider attacks or compromises: confidentiality is difficult to ensure, but vital to your business.

Enarx is a project which allows you to run applications in the public cloud, on your premises – or wherever else – with significantly reduced and better quantifiable risk.  It uses hardware-based security called “Trusted Execution Environments” from CPU manufacturers, and cuts out many of the layers that can be compromised.  The only components that do need to be trusted are fully open source software, which means that they can be examined and audited by industry experts and your own teams.

Well done: you found out about Enarx.  Continue to 6. Well, what’s next?


2. How I can use Enarx to protect sensitive data

You are the expert architect who has to consider the best technologies and approaches for your organisation.  You worry about where best to deploy sensitive applications and data, given the number of layers in the stack that may have been compromised, and the number of entities – human and machine – that have the opportunity to peek into or mess with the integrity of your applications.  You can’t control the public cloud, nor know exactly what the stack it’s running is, but equally, the resources required to ensure that you can run sufficient numbers of hardened systems on premises are growing.

Enarx is an open source project which uses TEEs (Trusted Execution Environments) to allow you to run applications within “Keeps” on systems that you don’t trust.  Enarx manages the creation of these Keeps, providing cryptographic confidence that the Keeps are using valid CPU hardware and then encrypting and provisioning your applications and data to the Keep using one-time cryptographic keys.  Your applications run without any of the layers in the stack (e.g. hypervisor, kernel, user-space, middleware) being able to look into the Keep.  The Keep’s run-time can accept applications written in many different languages, including Rust, C, C++, C#, Go, Java, Python and Haskell.  It allows you to run on TEEs from various CPU manufacturers without having to worry about portability: Enarx manages that for you, along with attestation and deployment.

Well done: you found out about Enarx.  Continue to 6. Well, what’s next?


3. Tell me more about Enarx technology (I can take it)

You are a wily developer with technical skills beyond the ken of most of your peers.  A quick look at the github pages tells you more: Enarx is an open source project to allow you to deploy applications within TEEs (Trusted Execution Environments).

  • If you’d like to learn about how to use Enarx, proceed to 4. I want to use Enarx.
  • If you’d like to learn about contributing to the Enarx project, proceed to 5. I want to contribute to Enarx.

Well done: you found out about Enarx.  Continue to 6. Well, what’s next?


4. I want to use Enarx

You learn good news: Enarx is designed to be easy to use!

If you want to run applications that process sensitive data, or which implement sensitive algorithms themselves, Enarx is for you.  Enarx is a deployment framework for applications, rather than a development framework.  What this means is that you don’t have to write to particular SDKs, or manage the tricky attestation steps required to use TEEs.  You write your application in your favourite language, and as long as it has WebAssembly as a compile target, it should run within an Enarx “Keep”.  Enarx even manages portability across hardware platforms, so you don’t need to worry about that, either.  It’s all open source, so you can look at it yourself, audit it, or even contribute (if you’re interested in that, you might want to proceed to 5. I want to contribute to Enarx).
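
As a sketch of what that looks like in practice – with the caveat that the build steps are my assumptions, so check the current Enarx documentation for the exact toolchain – an ordinary Rust program needs nothing Enarx-specific:

```rust
// Nothing Enarx-specific here: just ordinary Rust with WebAssembly as a
// compile target. Build steps are assumptions - check the current Enarx
// documentation for the exact toolchain:
//   rustup target add wasm32-wasi
//   cargo build --release --target wasm32-wasi
fn main() {
    println!("Hello from a future Keep!");
}
```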

Well done: you found out about Enarx.  Continue to 6. Well, what’s next?


5. I want to contribute to Enarx

Enarx is an open source project (under the Apache 2.0 licence), and we welcome contributions, whether you are a developer, tester, documentation guru or other enthusiastic bod with an interest in providing a way for the rest of the world to up the security level of the applications they’re running with minimal effort.  There are various components to Enarx, including attestation, hypervisor work, uni-kernel and WebAssembly run-time pieces.  We want to provide a simple and flexible framework to allow developers and operations folks to deploy applications to TEEs on any supported platform without recompilation, having to choose an obscure language or write to a particular SDK.  Please have a look around our github site and get in touch if you’re in a position to contribute.

Well done: you found out about Enarx.  Continue to 6. Well, what’s next?


6. Well, what’s next?

You now know enough to understand how Enarx can help you: well done!  At time of writing, Enarx is still in development, but we’re working hard to make it available to all.

We’ve known for a long time that we need encryption for data at rest and in transit: Enarx helps you do encryption for data in use.

For more information, you may wish to visit: