Defining the Edge

How might we differentiate Edge computing from Cloud computing?

This is an edited excerpt from my forthcoming book on Trust in Computing and the Cloud for Wiley.

There’s been a lot of talk about the Edge, and almost as many definitions as there are articles out there. As usual on this blog, my main interest is around trust and security, so this brief look at the Edge concentrates on those aspects, and particularly on how we might differentiate Edge computing from Cloud computing.

The first difference we might identify is that Edge computing addresses use cases where consolidating compute resource in a centralised location (the typical Cloud computing case) is not necessarily appropriate, and pushes some or all of the computing power out to the edges of the network, where computing resources can process data which is generated at the fringes, rather than having to transfer all the data over what may be low-bandwidth networks for processing. There is no generally accepted single industry definition of Edge computing, but examples might include:

  • placing video processing systems in or near a sports stadium for pre-processing to reduce the amount of raw footage that needs to be transmitted to a centralised data centre or studio
  • providing analysis and safety control systems on an ocean-based oil rig to reduce reliance and contention on an unreliable and potentially low-bandwidth network connection
  • creating an Internet of Things (IoT) gateway to process and analyse data from environmental sensor units (IoT devices)
  • mobile edge computing, or multi-access edge computing (both abbreviated to MEC), where telecommunications services such as location and augmented reality (AR) applications are run on cellular base stations, rather than in the telecommunication provider’s centralised network location.

Unlike Cloud computing, where the hosting model is generally that computing resources are consumed by tenants – customers of a public cloud, for instance – in the Edge case, the consumer of computing resources is often the owner of the systems providing them (though this is not always the case). Another difference is the size of the host providing the computing resources, which may range from very large to very small (in the case of an IoT gateway, for example). One important factor about most modern Edge computing environments is that they employ the same virtualisation and orchestration techniques as the cloud, allowing more flexibility in deployment and lifecycle management than bare-metal deployments provide.

A table comparing the various properties typically associated with Cloud and Edge computing shows us a number of differences.


| | Public cloud computing | Private cloud computing | Edge computing |
|---|---|---|---|
| Location | Centralised | Centralised | Distributed |
| Hosting model | Tenants | Owner | Owner or tenant(s) |
| Application type | Generalised | Generalised | May be specialised |
| Host system size | Large | Large | Large to very small |
| Network bandwidth | High | High | Medium to low |
| Network availability | High | High | High to low |
| Host physical security | High | High | Low |

Differences between Edge computing and public/private cloud computing

In the table, I’ve described two different types of cloud computing: public and private. The latter is sometimes characterised as on premises or on-prem computing, but the point here is that rather than deploying applications to dedicated hosts, workloads are deployed using the same virtualisation and orchestration techniques employed in the public cloud, the key difference being that the hosts and software are owned and managed by the owner of the applications. Sometimes these services are actually managed by an external party, but in this case there is a close commercial (and concomitant trust) relationship with this managed services provider, and, equally important, single tenancy is assured (assuming that security is maintained), as only applications from the owner of the service are hosted[1]. Many organisations will mix and match workloads to different cloud deployments, employing both public and private clouds (a deployment model known as hybrid cloud) and/or different public clouds (a deployment model known as multi-cloud). All these models – public cloud computing, private cloud computing and Edge computing – share an approach in common: in most cases, workloads are not deployed to bare-metal servers, but to virtualisation platforms.

Deployment model differences

What is special about each of the models and their offerings if we are looking at trust and security?

One characteristic that the three approaches share is scale: they all assume that hosts will have multiple workloads per host – though the number of hosts and the actual size of the host systems is likely to be highest in the public cloud case, and lowest in the Edge case. It is this high workload density that makes public cloud computing in particular economically viable, and one of the reasons that it makes sense for organisations to deploy at least some of their workloads to public clouds, as Cloud Service Providers can employ economies of scale which allow them to schedule workloads onto their servers from multiple tenants, balancing load and bringing sets of servers in and out of commission (a computation- and time-costly exercise) infrequently. Owners and operators of private clouds, in contrast, need to ensure that they have sufficient resources available for possible maximum load at all times, and do not have the opportunity to balance loads from other tenants unless they open up their on premises deployment to other organisations, transforming themselves into Cloud Service Providers and putting themselves into direct competition with existing CSPs.

It is this push for high workload density which drives the need for strong workload-from-workload (type 1) isolation: in order to maintain high density, cloud owners need to be able to mix workloads from multiple tenants on the same host. Tenants are mutually untrusting; they are in fact likely to be completely unaware of each other, and, if the host is doing its job well, unaware of the presence of other workloads on the same host as them. More important than this property, however, is a strong assurance that their workloads will not be negatively impacted by workloads from other tenants. Although negative impact can occur in contexts other than computation – such as storage contention or network congestion – the focus here is mainly on the isolation that hosts can provide.

The likelihood of malicious workloads increases with the number of tenants, but reduces significantly when the tenant is the same as the host owner – the case for private cloud deployments and some Edge deployments. Thus, the need for host-from-workload (type 2) isolation is higher for the public cloud – though the possibility of poorly written or compromised workloads means that it should not be neglected for the other types of deployment.

One final difference between the models is that for both public and private cloud deployments the physical vulnerability of hosts is generally considered to be low[2], whereas the opportunities for unauthorised physical access to Edge computing hosts are considered to be much higher. You can read a little more about the importance of hardware as part of the Trusted Compute Base in my article Turtles – and chains of trust, and it is a fundamental principle of computer security that if an attacker has physical access to a system, then the system must be considered compromised, as it is, in almost all cases, possible to compromise the confidentiality, integrity and availability of workloads executing on it.

All of the above are good reasons to apply Confidential Computing techniques not only to cloud computing, but to Edge computing as well: that’s a topic for another article.


1 – this is something of a simplification, but is a useful generalisation.

2 – Though this assumes that people with authorised access to physical machines are not malicious, a proposition which cannot be guaranteed, but for which monitoring can at least be put in place.

Masks, vaccinations, social distancing – and cybersecurity

The mapping between the realms of cybersecurity and epidemiology is not perfect, but metaphors can be useful.

Waaaaay back in 2018 (which seems a couple of decades ago now), I wrote an article called Security patching and vaccinations: a surprising link. In those days, Spectre and Meltdown were still at the front of our minds, and the impact of something like Covid-19 was, if not unimaginable, then far from what most of us considered in our day-to-day lives. In the article, I argued that patching of software and vaccination (of humans or, I guess, animals[1]) had some interesting parallels (the clue’s in the title, if I’m honest). As we explore the impact of new variants of the Covid-19 virus, the note that “a particular patch may provide resistance to multiple types of attack of the same family, as do some vaccinations” seems particularly relevant. I also pointed out that although there are typically some individuals in a human population for whom vaccination is too risky, a broad effort to vaccinate the rest of the population has a positive impact for them, in a similar way to how patching most systems in a deployment can restrict the number of “jumping off points” for attack.

I thought it might be interesting to explore other similarities between disease management in the human sphere and how we do things in cybersecurity, not because they are exact matches, but because they can be useful metaphors to explain to colleagues, family and friends what we do.

Vaccinations

We’ve looked at vaccinations a bit above: the key point here is that once a vulnerability is discovered, software vendors can release patches which, if applied correctly, protect the system from those attacks. This is not dissimilar in effect to the protection provided by vaccinations in human settings, though the mechanism is very different. Computer systems don’t really have an equivalent to antibodies or immune systems, but patches may – like vaccines – provide protection from more than one specific attack (think virus strain) if other attacks exploit the same type of weakness.

Infection testing

As we have discovered since the rise of Covid-19, testing of the population is a vital measure to understand what other mechanisms need to be put in place to control infection. The same goes for cybersecurity. Testing the “health” of a set of systems, monitoring their behaviour and understanding which may be compromised, by which attacks, leveraging which vulnerabilities, is a key part of any cybersecurity strategy, and easily overlooked when everything seems to be OK.

Masks

I think of masks as acting a little like firewalls, or mechanisms like SELinux which act to prevent malicious programs from accessing parts of the system which you want to protect. Like masks, these mechanisms reduce the attack surface available to bad actors by stopping up certain “holes” in the system, making it more difficult to get in. In the case of firewalls, it’s network ports, and in the case of SELinux, it’s blocking activities like unauthorised system calls (syscalls). We know that masks are not wholly effective in preventing transmission of Covid-19 – they don’t give 100% protection from oral transmission, and if someone sneezes into your eye, for instance, that could lead to infection – but we also know that if two people are meeting, and they both wear masks, the chance of transmission from an infected to an uninfected person is reduced. Most cybersecurity controls aim mainly to protect the systems on which they reside, but a well thought-out deployment may also put in controls to prevent attackers from jumping from one system to another.

Social distancing

This last point leads us to our final metaphor: social distancing. Here, we put in place controls to try to make it difficult for an attacker (or virus) to jump from one system to another (or human to another). While the rise of zero trust architectures has led to something of a down-playing of some of these techniques within cybersecurity, mechanisms such as DMZs, policies such as no USB drives and, at the extreme end, air-gapping of systems (where there is no direct network connection between them) all aim to create physical or logical barriers to attacks or transmission.

Conclusion

The mapping between controls in the realms of cybersecurity and epidemiology is not perfect, but metaphors can be useful in explaining the mechanisms we use and also in considering differences (is there an equivalent of “virus load” in computer systems, for instance?). If there are lessons we can learn from the world of disease management, then we should be keen to do so.


1 – it turns out that you can actually vaccinate plants, too: neat.

Does my TCB look big in this?

The smaller your TCB the less there is to attack, and that’s a good thing.

This isn’t the first article I’ve written about Trusted Compute Bases (TCBs), so if the concept is new to you, I suggest that you have a look at What’s a Trusted Compute Base? to get an idea of what I’ll be talking about here. In that article, I noted the importance of the size of the TCB: “what you want is a small, easily measurable and easily auditable TCB on which you can build the rest of your system – from which you can build a ‘chain of trust’ to the other parts of your system about which you care.” In this article, I want to take some time to discuss the importance of the size of a TCB, how we might measure it, and how difficult it can be to reduce the TCB size. Let’s look at all of those issues in order.

Size does matter

However you measure it – and we’ll get to that below – the size of the TCB matters for two reasons:

  1. the larger the TCB is, the more bugs there are likely to be;
  2. the larger the TCB is, the larger the attack surface.

The first of these is true of any system, and although there may be ways of reducing the number of bugs – such as proving the correctness of all or, more likely, part of the system – bugs are both tricky to remove and resilient: if you remove one, you may well be introducing another (or worse, several). Now, the kinds of bugs you have, and the number of them, can be reduced through a multitude of techniques, from language choice (choosing Rust over C/C++ to reduce memory allocation errors, for instance) to better specification and on to improved test coverage and fuzzing. In the end, however, the smaller the TCB, the less code (or hardware – we’re considering the broader system here, don’t forget) you have to trust, and the less space there is for bugs in it.

The concept of an attack surface is important, and, like TCBs, one I’ve introduced before (in What’s an attack surface?). Like bugs, there may be no absolute measure of the ratio of danger to attack surface, but the smaller your TCB, well, the less there is to attack, and that’s a good thing. As with bug reduction, there are a number of techniques you may want to apply to reduce your attack surface, but the smaller it is, then, by definition, the fewer opportunities attackers have to try to compromise your system.

Measurement

Measuring the size of your TCB is really, really hard – or, maybe I should say that coming up with an absolute measure that you can compare to other TCBs is really, really hard. The problem is that there are so many measurements that you might take. The ones you care about are probably those that can be related to attack surface – but there are so many different attack vectors that might be relevant to a TCB that there are likely to be multiple attack surfaces. Let’s look at some of the possible measurements:

  • number of API methods
  • amount of data that can be passed across each API method
  • number of parameters that can be passed across each API method
  • number of open network sockets
  • number of open local (e.g. UNIX) sockets
  • number of files read from local storage
  • number of dynamically loaded libraries
  • number of DMA (Direct Memory Access) calls
  • number of lines of code
  • amount of compilation optimisation carried out
  • size of binary
  • size of executing code in memory
  • amount of memory shared with other processes
  • use of various caches (L1, L2, etc.)
  • number of syscalls made
  • number of strings visible using strings command or similar
  • number of cryptographic operations not subject to constant time checks

This is not meant to be an exhaustive list, but just to show the range of different areas in which vulnerabilities might appear. Designing your application to reduce one may increase another – one very simple example being an attempt to reduce the number of API calls exposed by increasing the number of parameters on each call, another being to reduce the size of the binary by using more dynamically linked libraries.
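By way of illustration, here is a deliberately naive Rust sketch (my own, not a standard tool) which collects just two of the measures above for a given executable: the size of the binary, and the number of visible strings, approximated in the manner of the strings command. The minimum string length of 4 is an arbitrary choice.

```rust
use std::{env, fs};

// Count runs of printable ASCII at least `min_len` bytes long: a rough
// approximation of what the `strings` command reports.
fn count_strings(data: &[u8], min_len: usize) -> usize {
    let mut count = 0;
    let mut run = 0;
    for &byte in data {
        if (0x20..=0x7e).contains(&byte) {
            run += 1;
        } else {
            if run >= min_len {
                count += 1;
            }
            run = 0;
        }
    }
    if run >= min_len {
        count += 1;
    }
    count
}

fn main() -> std::io::Result<()> {
    let path = env::args().nth(1).expect("usage: tcb-metrics <path-to-binary>");
    let data = fs::read(&path)?;
    println!("size of binary: {} bytes", data.len());
    println!("visible strings: {}", count_strings(&data, 4));
    Ok(())
}
```

Even these two trivially collectable numbers demonstrate the problem: stripping symbols will reduce both, but tells you nothing about, say, the number of API methods or open network sockets.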

This leads us to an important point which I’m not going to address in detail in this article, but which is fundamental to understanding TCBs: that without a threat model, there’s actually very little point in considering what your TCB is.

Reducing the TCB size

We’ve just seen one of the main reasons that reducing your TCB size is difficult: it’s likely to involve trade-offs between different measures. If all you’re trying to do is produce competitive marketing material where you say “my TCB is smaller than yours”, then you’re likely to miss the point. The point of a TCB is to have a well-defined computing base which can protect against specific threats. This requires you to be clear about exactly what functionality requires that it be trusted, where it sits in the system, and how the other components in the system rely on it: what trust relationships they have. I was speaking to a colleague just yesterday who was relaying a story of a software project whose team said, “we’ve reduced our TCB to this tiny component by designing it very carefully and checking how we implement it”, but who had overlooked the fact that the rest of the stack – which contained a complete Linux distribution and applications – could be no more trusted than before. The threat model (if there was one – we didn’t get into details) seemed to assume that only the TCB would be attacked, which missed the point entirely: it just added another “turtle” to the stack, without actually fixing the problem that was presumably at issue: that of improving the security of the system.

Reducing the TCB by artificially defining what the TCB is to suit your capabilities or particular beliefs around what the TCB specifically should be protecting against is not only unhelpful but actively counter-productive. This is because it ignores the fact that a TCB is there to serve the needs of a broader system, and if it is considered in isolation, then it becomes irrelevant: what is it acting as a base for?

In conclusion, it’s all very well saying “we have a tiny TCB”, but you need to know what you’re protecting, from what, and how.

Intentional laziness

Identifying a time and then protecting that time is vital.

Over the past year[1], since Covid-19 struck us, one of the things that has been notable is … well, the lack of notable things. Specifically, we’ve been deprived of many of the occasions that we’d use to mark our year, or to break the day-to-day grind: family holidays, the ability to visit a favourite restaurant, festivals, concerts, sporting events, even popping round to enjoy drinks or a barbecue at a friend’s house. The things we’d look forward to as a way of breaking the monotony of working life – or even of just providing something a bit different to a job we actively enjoy – have been difficult to come by.

This has led to a rather odd way of being. It’s easy either to get really, really stuck into work tasks (whether that’s employed work, school work, voluntary work or unpaid work such as childcare or household management), or to find yourself just doing nothing substantive for long stretches of time. You know: just scrolling down your favourite social media feed, playing random games on your phone – all the while feeling guilty that you’re not doing what you should be doing. I’ve certainly found myself doing the latter from time to time when I feel I should be working, and have overcompensated by forcing myself to work longer hours, or to check emails at 10pm when I should be thinking about heading to bed, for instance. So, like many of us, I think, I get stuck into one of two modes:

  1. messing around on mindless tasks, or
  2. working longer and harder than I should be.

The worst thing about the first of these is that I’m not really relaxing when I’m doing them, partly because much of my mind is on things which I feel I ought to be doing.

There are ways to try to fix this, one of which is to be careful about the hours you work or the tasks you perform, if you’re more task-oriented in the role you do, and then to set yourself non-work tasks to fill up the rest of the time. Mowing the lawn, doing the ironing, planting bulbs, doing the shopping, putting the washing out – the tasks that need to get done, but which you might prefer to put off, or which you just can’t quite find time to do because you’re stuck in the messing/working cycle. This focus on tasks that actually need to be done, but which aren’t work (and which divert you from the senseless non-tasks), has a lot to be said for it, particularly if you live with other people: it’s likely to provide social benefits as well (you’ll improve the quality of the environment you live in, or you’ll just get shouted at less), but it misses something: it’s not “down-time”.

By down-time, I mean time set aside specifically not to do things. It’s a concept associated with the word “Sabbath”, an Anglicisation of the Hebrew word “shabbat”, which can be translated as “rest” or “cessation”. I particularly like the second translation (though given my lack of understanding of Hebrew, I’m just going to have to accept the Internet’s word for the translation!), as the idea of ceasing what you’re doing, and making a conscious decision to do so, is something I think that it’s easy to miss. That’s true even in normal times, but with fewer markers in our lives for when to slow down and take time for ourselves – a feature of many of our lives in the world of Covid-19 – it’s all too simple just to forget, or to kid ourselves that those endless hours of brainless tapping or scrolling are actually some sort of rest for our minds and souls.

Whether you choose a whole day to rest/cease every week, set aside an hour after the kids have gone to bed, get up an hour early, give yourself half an hour over lunch to walk or cycle or do something else, it doesn’t matter. What I know I need to do (and I think it’s true of others, too), is to practise intentional laziness. This isn’t the same as doing things which you may find relaxing to some degree (I enjoy ironing, I know people who like cleaning the kitchen), but which need to be done: it’s about giving yourself permission not to do something. This can be really, really hard, particularly if you care for other people, have a long commute or a high pressure job, but it’s also really important for our longer-term well-being.

You also need to plan to be lazy. This seems counter-intuitive, at least to me, but if you haven’t set aside time and given yourself permission to relax and cease your other activities, you’ll feel guilty, and then you won’t be relaxing properly. Identifying a time to read a book, watch some low-quality boxsets, ring up a friend for a gossip on the phone or just have a “sneaky nap”, and then protecting that time is worthwhile. No – it’s more than worthwhile: it’s vital.

I’m aware, as I write this, that I’m in the very privileged position of being able to do this fairly easily[2], when for some people, it’s very difficult. Sometimes, we may need to defer these times and to plan a weekend away from the kids, a night out or an evening in front of the television for a week, or even a month or more from now. Planning it gives us something to hold on to, though: a break from the “everyday-ness” which can grind us down. But if we don’t have something to look forward to, a time that we protect, for ourselves, to be intentionally lazy, then our long-term physical, emotional and mental health will suffer.


1 – or two years, or maybe a decade. No-one seems to know.

2 – this doesn’t mean that I do, however.

Dependencies and supply chains

A dependency is a component which an application or another component needs in order to work.

Supply chain security is really, really hot right now. It’s something which folks in the “real world” of manufactured things have worried about for years – you might be surprised (and worried) how much attention aircraft operators need to pay to “counterfeit parts”, and I wish I hadn’t just searched online for the words “counterfeit pharmaceutical” – but the world of software had a rude wake-up call recently with the Solarwinds hack (or crack, if you prefer). This isn’t the place to go over that: you’ll be able to find many, many articles if you search on that. In fact, many companies have been worrying about this for a while, but the change is that it’s an issue which is now out in the open, giving more leverage to those who want more transparency around what software they consume in their products or services.

When we in computing (particularly software) think about supply chains, we generally talk about “dependencies”, and I thought it might be useful to write a short article explaining the main dependency types.

What is a dependency?

A dependency is a component which an application or another component needs in order to work, and dependencies are generally considered to come in two types:

  • build-time dependencies
  • run-time dependencies.

Let’s talk about those two types in turn.

Build-time dependencies

These are components which are required in order to build (typically compile and package) your application or library. For example, if I’m writing a program in Rust, I have a dependency on the compiler if I want to create an application. I’m actually likely to have many more build-time dependencies than that, however. How those dependencies are made visible to me will depend on the programming language and the environment that I’m building in.

Some languages, for instance, may have filesystem support built in, but others will require you to “import” one or more libraries in order to read and write files. Importing a library basically tells your build-time environment to look somewhere (local disk, online repository, etc.) for a library, and then bring it into the application, allowing its capabilities to be used. In some cases, you will be taking pre-built libraries, and in others, your build environment may insist on building them itself. Languages like Rust have clever environments which can look for new versions of a library, download them and compile them without your having to worry about it yourself (though you can if you want!).

To get back to our filesystem example, even if the language does come with built-in filesystem support, you may decide to import a different library – maybe you need some fancy distributed, sharded file system, for instance – from a different supplier. Other capabilities may not be provided by the language, or may be higher-level capabilities: JSON serialisation or HTTPS support, for instance. Whether that library is available in open source may have a large impact on your decision as to whether or not to use it.
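As a concrete illustration (the code is mine, though the crates are real and widely used), here is what taking JSON serialisation as a build-time dependency in Rust might look like, assuming serde and serde_json are declared in the project’s Cargo.toml:

```rust
// Declared as build-time dependencies in Cargo.toml, along the lines of:
//   [dependencies]
//   serde = { version = "1", features = ["derive"] }
//   serde_json = "1"
use serde::Serialize;

#[derive(Serialize)]
struct SensorReading {
    device_id: String,
    temperature_c: f64,
}

fn main() -> Result<(), serde_json::Error> {
    let reading = SensorReading {
        device_id: String::from("gateway-01"),
        temperature_c: 21.5,
    };
    // Cargo resolves, downloads and compiles serde_json at build time; by
    // the time this runs, the library's code is simply part of the binary.
    println!("{}", serde_json::to_string(&reading)?);
    Ok(())
}
```

From a supply chain point of view, the interesting questions are which versions of those crates were pulled in, from which repository, and whether you (or your build environment) verified them.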

Build-time dependencies, then, require you to have the pieces you need – either pre-built or in source code form – at the time that you’re building your application or library.

Run-time dependencies

Run-time dependencies, as the name implies, only come into play when you actually want to run your application. We can think of there being two types of run-time dependency:

  1. service dependency – this may not be the official term, but think of an application which needs to write some data to a window on a monitor screen: in most cases, there’s already a display manager and a window manager running on the machine, so all the application needs to do is contact them, and communicate the right data over an API. Sometimes, the underlying operating system may need to start these managers first, but it’s not the application itself which is having to do that. These are local services, but remote services – accessing a database, for instance – operate in the same sort of way. As long as the application can contact and communicate with the database, it doesn’t need to do much itself. There’s still a dependency, and things can definitely go wrong, but it’s a weak coupling to an external application or service.
  2. dynamic linking – this is where an application needs access to a library at run-time, but rather than having added it at build-time (“static linking”), it relies on the underlying operating system to provide a link to the library when it starts executing. This means that the application doesn’t need to be as large (it’s not “carrying” the functionality with it when it’s installed), but it does require that the version that the operating system provides is compatible with what’s expected, and does the right thing.
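Here is a minimal Rust sketch of the second type in action, using the libloading crate (itself, of course, a build-time dependency!) to load a shared library and look up a symbol at run-time. The library name is Linux-specific, and the example is illustrative rather than taken from any particular project:

```rust
use libloading::{Library, Symbol};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Ask the operating system for the library at run-time: if the installed
    // version is missing or incompatible, we only find out now.
    let lib = unsafe { Library::new("libm.so.6")? };
    // Look up a symbol by name; the type annotation is a promise about the
    // function's signature which the loader cannot check for us.
    let cos: Symbol<unsafe extern "C" fn(f64) -> f64> = unsafe { lib.get(b"cos")? };
    println!("cos(0.0) = {}", unsafe { cos(0.0) });
    Ok(())
}
```

The unsafe blocks are telling: the application is trusting that whatever the operating system hands it behaves as expected, which is exactly the supply chain question in miniature.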

Conclusion

I’ve resisted the temptation to go into the impacts of these different types of dependency in terms of their impact on security. That’s a much longer article – or set of articles – but it’s worth considering where in the supply chain we consider these dependencies to live, and who controls them. Do I expect application developers to check every single language dependency, or just imported libraries? To what extent should application developers design in protections from malicious (or just poorly-written) dynamically-linked libraries? Where does the responsibility lie for service dependencies – particularly remote ones?

These are all complex questions: the supply chain is not a simple topic (partly because there is not just one supply chain, but many of them), and organisations need to think hard about how they work.

Leaving space, making balance

No substantive article this week.

I had an interesting idea for an article this week (I even took a picture to illustrate it) – but it’s going to have to wait for another time. I’ve got a busy week, and a couple of (non-urgent) medical appointments which have just got in the way a bit. This is one of those times when I’ve decided to set things aside, and not add to my pile of things to do by writing a proper article.

I should be back soon: in the meantime, follow my example and consider dropping a non-critical task and give yourself some breathing space.

Arm joins the Confidential Computing party

Arm’s announcement of Realms isn’t just about the Edge

The Confidential Computing Consortium is a Linux Foundation project designed to encourage open source projects around confidential computing. Arm has been part of the consortium for a while – in fact, the company is a Premier Member – but things got interesting on the 30th March, 2021. That’s when Arm announced their latest architecture: Armv9. Armv9 includes a new set of features, called Realms. There’s not a huge amount of information in the announcement about Realms, but Arm is clear that this is their big play into Confidential Computing:

To address the greatest technology challenge today – securing the world’s data – the Armv9 roadmap introduces the Arm Confidential Compute Architecture (CCA).

I happen to live about 30 minutes’ drive from the main Arm campus in Cambridge (UK, of course), and know a number of Arm folks professionally and socially – I think I may even have interviewed for a job with them many moons ago – but I don’t want to write a puff piece about the company or the technology[1]. What I’m interested in, instead, is the impact this announcement is likely to have on the Confidential Computing landscape.

Arm has had an element in their architecture for a while called TrustZone which provides a number of capabilities around security, but TrustZone isn’t a TEE (Trusted Execution Environment) on its own. A TEE is the generally accepted unit of confidential computing – the minimum building block on which you can build. It is arguably possible to construct TEEs using TrustZone, but that’s not what it’s designed for, and Arm’s decision to introduce Realms strongly suggests that they want to address this. This is borne out by the press release.

Why is all this important? I suspect that few of you have laptops or desktops that run on Arm (Raspberry Pi machines apart – see below). Few of the servers in the public cloud run Arm, and Realms are probably not aimed particularly at your mobile phone (for which TrustZone is a better fit). Why, then, is Arm bothering to make a fuss about this and to put such an enormous design effort into this new technology? There are two answers, it seems to me, one of which is probably pretty much a sure thing, and the other of which is more of a competitive gamble.

Answer 1 – the Edge

Despite recent intrusions by both AMD and Intel into the Edge space, the market is dominated by Arm-based[3] devices. And Edge security is huge, partly because we’re just seeing a large increase in the number of Edge devices, and partly because security is really hard at the Edge, where devices are more difficult to defend, both logically (they’re on remote networks, more vulnerable to malicious attack) and physically (many are out of the control of their owners, living on customer premises, up utility poles, on gas pipelines or in sports stadia, just to give a few examples). One of the problems that confidential computing aims to solve is the issue that, traditionally, once an attacker has physical access to a system, it should be considered compromised. TEEs allow some strong mitigations against that problem (at least against most attackers and timeframes), so making it easy to create and use TEEs on the Edge makes a lot of sense. With the addition of Realms to the Armv9 architecture, Arm is signalling its intent to address security on the Edge, and to defend and consolidate its position as leader in the market.

Answer 2 – the Cloud

I mentioned above that few public cloud hosts run Arm – this is true, but it’s likely to change. Arm would certainly like to see it change, and to see its chipsets move into the cloud mainstream. There has been a lot of work to improve support for server-scale Arm within Linux (in fact, open source support for Arm is generally excellent, not least because of the success of Arm-based chips in Raspberry Pi machines). Amazon Web Services (AWS) started offering Arm-based servers to customers as long ago as 2018. This is a market in which Arm would clearly love to be more active and carve out a larger share, and the growing importance of confidential computing in the cloud (both public and private) means that having a strong story in this space was important: Realms are Arm’s answer to this.

What next?

An announcement of an architecture is not the same as availability of hardware or software to run on it. We can expect it to be quite a few months before we see production chips running Armv9, though evaluation hardware should be available to trusted partners well before that, and software emulation for various components of the architecture will probably come even sooner. This means that those interested in working with Realms should be able to get things moving and have something ready pretty much by the time of availability of production hardware. We’ll need to see how easy they are to use, what performance impact they have, etc., but Arm do have an advantage here: as they are not the first into the confidential computing space, they’ve had the opportunity to watch Intel and AMD and see what has worked, and what hasn’t, both technically and in terms of what the market seems to like. I have high hopes for Arm Realms, and Enarx, the open source confidential computing project with which I’m closely involved, has plans to support them when we can: our architecture was designed with multi-platform support from the beginning.


1 – I should also note that I participated in a panel session on Confidential Computing which was put together by Arm for their “Arm Vision Day”, but I was in no way compensated for this[2].

2 – in fact, the still for the video is such a terrible picture of me that I think maybe I have grounds to sue for it to be taken down.

3 – Arm doesn’t manufacture chips itself: it licenses its designs to other companies, who create, manufacture and ship devices themselves.

GET/SET methods for open source projects

Or – how to connect with open source

I’m aware that many of the folks who read my blog already know lots about open source, but I’m also aware that there are many who know little if anything about it. I’m a big, big proponent of open source software (and beyond, such as open hardware), and there are lots of great resources you can find to learn more about it: a very good starting point is Opensource.com. It’s run for the broader community by a bunch of brilliant people at my current employer, Red Hat (I should add a disclaimer that I’m not only employed by Red Hat, but also a “Correspondent” at Opensource.com – a kind of frequent contributor/Elder Thing), and has articles on pretty much every aspect of open source that you can imagine.

I was thinking about APIs today (they’re in the news this week after a US Supreme Court judgment on an argument between Google and Oracle), and it occurred to me that if I were interested in understanding how to interact with open source at the project level, but didn’t know much about it, then a quick guide might be useful. The same goes if I were involved in an open source project (such as Enarx) which was interested in attracting contributors (particularly techie contributors) who aren’t already knowledgeable about open source. Given that most programmers will understand what GET and SET methods do (one reads data, the other writes data), I thought this might be a useful way to consider engagement[1]. I’ll start with GET, as that’s how you’re likely to be starting off, as well – finding out more about the project – and then move to SET. This is far from an exhaustive list, but I hope that I’ve hit most of the key ways you’re most likely to start getting involved/encourage others to get involved. The order I’ve chosen reflects what I suspect is a fairly typical approach to finding out more about a project, particularly for those who aren’t open source savvy already, but, as they say, YMMV[3].

I’ve managed to stop myself using Enarx as the sole source of examples, but have tried to find a variety of projects to give you a taster. Disclaimer: their inclusion here does not mean that I am a user or contributor to the project, nor is it any guarantee of their open source credentials, code quality, up-to-date-ness, project maturity or community health[4].

GET methods

  • Landing page – the first encounter that you may have with a project will probably be its landing page. Some projects go for something basic, others apply more design, but you should be able to use this as the starting point for your adventures around the project. You’d generally hope to find links to various of the other resources listed below from this page. Sigstore
  • Wiki – in many cases, the project will have a wiki. This could be simple, it could be complex. It may allow editing by anyone, or only by a select band of contributors to the project, and its relevance as source-of-truth may be impacted by how up to date it is, but the wiki is usually an excellent place to start. Fedora Project
  • Videos – some projects maintain a set of videos about their project. These may include introductions to the concepts, talking head interviews with team members, conference sessions, demos, HOW-TOs and more. It’s also worth looking for videos put up by contributors to the project, but which aren’t necessarily officially owned by the project. Rust Language
  • Code of Conduct – many projects insist that their project members follow a code of conduct, to reduce harassment, reduce friction and generally make the project a friendly, more inclusive and more diverse place to be. Linux kernel
  • Binary downloads – as projects get more mature, they may choose to provide pre-compiled binary downloads for users. More technically-inclined users may choose to compile their own from the code base (see below) even before this, but binary downloads can be a quick way to try out a project and see whether it does what you want. Chocolate Doom (a Doom port)
  • Design documentation – without design documentation, it can be very difficult to get really into a project (I’ve written about the importance of architecture diagrams on this blog before). This documentation is likely to include everything from an API definition up to complex use cases and threat models. Kubernetes
  • Code base – you’ve found out all you need to get going: it’s time to look at the code! This may vary from a few lines to many thousands, may include documentation in comments, may include test cases: but if it’s not there, then the project can’t legitimately call itself open source. Rocket Rust web framework[5]
  • Email/chat – most projects like to have a way for contributors to discuss matters asynchronously. The preferred medium varies between projects, but most will choose an email list, a chat server or both. Here’s where to go to get to know other users and contributors, ask questions, celebrate successful compiles, and just hang out. Enarx chat
  • Meet-ups, video conferences, calls, etc. – though physical meetings are tricky for many at the moment (I’m writing as Covid-19 still reduces travel opportunities for many), having ways for community members and contributors to get together synchronously can be really helpful for everybody. Sometimes these are scheduled on a daily, weekly or monthly basis, sometimes they coincide with other, larger meet-ups, sometimes a project gets big enough to have its own meet-ups, and sometimes so big that there are meet-ups of sub-projects or internal interest groups. Linux Security Summit Europe

SET methods

  • Bug reports – for many of us, the first time we contribute anything substantive back to an open source project is when we file a bug report. These types of bug reports – from new users – can be really helpful for projects, as they not only expose bugs which may not already be known to the project, but they also give clues as to how actual users of the project are trying to use the code. If the project already publishes binary downloads (see above), then you don’t even need to have compiled the code to try it and submit a bug report, but bug reports related to compilation and build can also be extremely useful to the project. Sometimes, the mechanism for bug reporting also provides a way to ask more general questions about the project, or to ask for new features. exa (replacement for the ls command)
  • Tests – once you’ve started using the project, another way to get involved (particularly once you start contributing code) can be to design and submit tests for how the project ought to work. This can be a great way to unearth not only your own assumptions (and lack of knowledge!) about the project, but also the project’s design assumptions (some of which may well be flawed). Tests are often part of the code repository, but not always. Gnome Shell
  • Wiki – the wiki can be a great way to contribute to the project whether you’re coding or not. Many projects don’t have as much information available as they should do, and that information may often not be aimed at people coming to the project “fresh”. If this is what you’ve done, then you’re in a great position to write material which will help other “newbs” to get into the project faster, as you’ll know what would have helped you if it had been there. Wine (Windows compatibility layer for Linux)
  • Code – last, but not least, you can write code. You may take hours, months or years to get to this stage – or may never reach it – but open source software is nothing without its code. If you’ve paid enough attention to the other steps, got involved in the community, understood what the project aims to do, and have the technical expertise (which you may well develop as you go!), then writing code may be the way you want to go. Enarx (again)

1 – I did consider standard RESTful verbs – GET, PUT, POST and DELETE, but that felt rather contrived[2].

2 – And I don’t like the idea of DELETE in this context!

3 – “Your Mileage May Vary”, meaning, basically, that your experience may be different, and that’s to be expected.

4 – that said, I do use lots of them!

5 – I included this one because I’ve spent far too much of my time looking at this over the past few months…

3 vital traits for an open source leader

The world is not all about you.

I’ve written a few articles on how to do something badly, or how not to do things, as I think they’re a great way of adding a little humour to the process of presenting something. Some examples include:

The last, in particular, was very popular, and ended up causing so much furore on a mailing list that the mailing list had to be deleted. Not my fault, arguably (I wasn’t the one who posted it to the list), but some measure of fame (infamy?) anyway. I considered writing this article in a similar vein, but decided that although humour can work as a great mechanism to get people engaged, it can also sometimes confuse the message, or muddy the waters of what I’m trying to say. I don’t want that: this is too important.

I’m writing this in the midst of a continuing storm around the re-appointment to the board of the Free Software Foundation (FSF) of Richard Stallman. We’re also only a couple of years on from Linus Torvalds deciding to make some changes to his leadership style, and apologising for his past behaviour. Beyond noting those events, I’m not going to relate them to any specific points in this article, though I will note that they have both informed parts of it.

The first thing I should say about these tips is that they’re not specific to open source, but are maybe particularly important to it. Open source, though much more “professional” than it used to be, with more people paid to work on it, is still about voluntary involvement. If people don’t like how a project is being run, they can leave, fork it, or the organisation for which they work may decide to withdraw funding (and/or its employees’ time). This is different to most other modes of engagement in projects. Many open source projects also require maintenance, and lots of it: you don’t just finish it, and then hand it over. In order for it to continue to grow, or even to continue to be safe and relevant to use, it needs to keep running, preferably with a core group of long-term contributors and maintainers. This isn’t unique to open source, but it is key to the model.

What all of the above means is that for an open source project to thrive in the long-term, it needs a community. The broader open source world (community in the larger sense) is moving to models of engagement and representation which more closely model broader society, acknowledging the importance of women, neuro-diverse members, older, younger, disabled members and other under-represented groups (in particular some ethnic groups). It has become clear to most, I believe, that individual projects need to embrace this shift if they are to thrive. What, then, does it mean to be a leader in this environment?

1. Empathise

The world is not all about you. The project (however much it’s “your baby”) isn’t all about you. If you want your project to succeed, you need other people, and if you want them to contribute, and to continue to contribute to your project, you need to think about why they might want to do so, and what actions might cause them to stop. When you do something, say something or write something, think not just about how you feel about it, but about how other people may feel about it.

This is hard. Putting yourself in other people’s shoes can be really, really difficult, the more so when you don’t have much in common with them, or feel that your differences (ethnicity, gender, political outlook, sexuality, nationality, etc.) define your relationship more than your commonalities. Remember, however, that you do share something, in fact, the most important thing in this context, which is a desire to work on this project. Ask others for their viewpoints on tricky problems – no, strike that – just ask others for their viewpoints, as often as possible, particularly when you assume that there’s no need to do so. If you can see things at least slightly from other people’s point of view, you can empathise with them, and even the attempt to do so shows that you’re making an effort, and that helps other people make an effort to empathise, too, and you’re already partway to meeting in the middle.

2. Apologise

You will get things wrong. Others will get things wrong. Apologise. Even if you’re not quite sure what you’ve done wrong. Even if you think you’re in the right. If you’ve hurt someone, whether you meant to or not, apologise. This can be even harder than empathising, because once two (or more) parties have entrenched themselves in positions on a particular topic, if they’re upset or angry, then the impulse to empathise will be significantly reduced. If you can empathise, it will become easier to apologise, because you will be able to see others’ points of view. But even if you can’t see their point of view, at least realise that they have another point of view, even if you don’t agree with it, or think it’s rational. Apologising for upsetting someone is only a start, but it’s an important one.

3. Don’t rely on technical brilliance and vision

You may be the acknowledged expert in the field. You may have written the core of the project. It may be that no-one will ever understand what you have done, and its brilliance, quite like you. Your vision may be a guiding star, bringing onlookers from near and far to gaze on your project.

Tough.

That’s not enough. People may come to your project to bask in the glory of your technical brilliance, or to wrap themselves in the vision you have outlined. Some may even stay. But if you can’t empathise, if you can’t apologise when you upset them, those people will represent only a fraction of the possible community that you could have had. The people who stay may be brilliant and visionary, too, but your project is the weaker for not encouraging (not to mention possibly actually discouraging) broader, more inclusive involvement of those who are not like you, in that they don’t value brilliance and genius sufficiently to overlook the deficits in your leadership. It’s not just that you won’t get people who aren’t like you: you will even lose people who are like you, but are unwilling to accept a leadership style which excludes and alienates.

Conclusion

It’s important, I think, to note that the first two points above require active work. Fostering a friendly environment, encouraging involvement, removing barriers: these are all important. They’re also (at least in theory) fairly simple, and don’t require hard choices and emotional investment. Arguably, the third point also requires work, in that, for many, there is an assumption that if your project is technically exciting enough (and, by extension, so is your leadership), then that’s enough: casting away this fallacy can be difficult to do.

Also, I’m aware that there’s something of an irony that I, a white, fairly neuro-typical, educated, middle-aged, Anglo adult male in a long-term heterosexual relationship, am writing about this, because many – too many! – of the leaders in this space (as in many others) are very much like me in many of their attributes. And I need to do a better job of following my own advice above. But I can try to model it, and I can shout about how important it is, and I can be an ally to those who want to change, and to those worst affected when that change does not come. I cannot pretend that inertia, a lack of change and a resistance to it, affects me as much as it does others, due to my position of privilege within society (and the communities about which I’ve been writing), but I can (and must) stand up when I can.

There are also times to be quiet and leave space for other voices (despite the fact that even the ability to grant that space is another example of privilege). I invite others to point me at other voices, and if I get enough feedback to do so, I’ll compile an article in the next few weeks designed to point at them from this blog.

In the meantime, one final piece of advice for leaders: be kind.

3 types of application security

There are security applications and there are applications which have security.

I’m indebted to my friend and colleague, Matt Smith, for putting me on the road to this article: he came up with a couple of the underlying principles that led me to what you see below. Thanks, Matt: we’ll have a beer (and maybe some very expensive wine) – one day.

There are security applications and there are applications which have security: specifically, security features or functionality. I’m not saying that there’s no cross-over, and you’d hope that security applications have security features, but they’re used for different things. I’m going to use open source examples here (though many of them may have commercial implementations), so let’s say: OpenVPN is a security product, whose reason for existence is to provide a security solution, whereas Kubernetes is not a security application, though it does include security features. That gives us two types of application security, but I want to add another distinction. It may sound a little arbitrary, at least from the point of view of the person designing or implementing the application, but I think it’s really important for those who are consuming – that is, buying, deploying or otherwise using – an application. Here are the three:

  1. security application – an application whose main purpose is to solve a security-related problem;
  2. compliance-centric security – security within an application which aims to meet certain defined security requirements;
  3. risk-centric security – security within an application which aims to allow management or mitigation of particular risk.

Types 2 and 3 apply to non-security applications (though security applications may well include them!), and may not seem that different from each other, until you look at them from an application user’s point of view. Those differences are what I want to concentrate on in this article.

Compliance-centric

You need a webserver – what do you buy (or deploy)? Well, let’s assume that you work in a regulated industry, or even just that you’ve spent a decent amount of time working on your risk profile, and you decide that you need a webserver which supports TLS 1.3 encryption. This is basically a “tick box” (or “check box” for North American readers): when you consider different products, any which do not meet this requirement are not acceptable options for your needs. They must be compliant – not necessarily to a regulatory regime, but to your specific requirements. There may also be more complex requirements such as FIPS compliance, which can be tested and certified by a third party – this is a good example of a compliance feature which has moved from a regulatory requirement in certain industries and sectors to a well-regarded standard which is accepted in others.

I think of compliance-centric security features as “no” features. If you don’t have them, the response to a sales call is “no”.
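To make the “tick box” nature of this concrete, here’s a toy Rust sketch (product names, feature strings and the selection logic are all invented for illustration) of how compliance-driven selection works:

```rust
struct Product {
    name: &'static str,
    features: Vec<&'static str>,
}

// Compliance-centric evaluation is pass/fail: a product missing any
// required feature is excluded, whatever else it offers.
fn is_acceptable(product: &Product, required: &[&str]) -> bool {
    required.iter().all(|r| product.features.contains(r))
}

fn main() {
    let required = ["tls1.3"];
    let candidates = [
        Product { name: "webserver-a", features: vec!["tls1.3", "fips-140-2"] },
        Product { name: "webserver-b", features: vec!["tls1.2"] },
    ];
    for candidate in &candidates {
        let verdict = if is_acceptable(candidate, &required) {
            "still in the running"
        } else {
            "no"
        };
        println!("{}: {}", candidate.name, verdict);
    }
}
```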

Risk-centric

You’re still trying to decide which webserver to buy, and you’ve come down to a few options, all of which meet your compliance requirements: which to choose? Assuming that security is going to be the deciding factor, what security features or functionality do they provide which differentiate them? Well, security is generally about managing risk (I’ve written a lot about this before, see Don’t talk security: talk risk, for example), so you look at features which allow you to manage risks that are relevant to you: this is the list of security-related capabilities which aren’t just compliance. Maybe one product provides HSM integration for cryptographic keys, another One Time Password (OTP) integration, another integrity-protected logging. Each of these allows you to address different types of risk:

  • HSM integration – protect against compromise of private keys
  • OTP integration – protect against compromise of user passwords, re-use of user passwords across sites
  • integrity-protected logging – protect against compromise of logs.

The importance of these features to you will depend on your view of these different risks, and the possible mitigations that you have in place, but they are ways that you can differentiate between the various options. Also, popular risk-centric security features are likely to morph into compliance-centric features as they are commoditised and more products support them: TLS is a good example of this, as is password shadowing in Operating Systems. In a similar way, regulatory regimes (in, for instance, the telecommunications, government, healthcare or banking sectors) help establish minimum risk profiles for applications where customers or consumers of services are rarely in a position to choose their provider based on security capabilities (typically because they’re invisible to them, or the consumers do not have the expertise, or both).

I think of risk-centric security features as “help me” features: if you have them, the response to a sales call is “how will they help me?”.
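Continuing the sketch above (again, the weights and feature names are invented), risk-centric features don’t gate the decision: they differentiate the candidates which survived the compliance filter, weighted by the risks you most want to manage:

```rust
// Risk-centric evaluation: weight each differentiating feature by how much
// it matters to your risk profile, then rank the compliant candidates.
fn risk_score(features: &[&str], weights: &[(&str, u32)]) -> u32 {
    weights
        .iter()
        .filter(|(feature, _)| features.contains(feature))
        .map(|(_, weight)| weight)
        .sum()
}

fn main() {
    // Say key compromise worries us more than log tampering does.
    let weights = [
        ("hsm-integration", 10),
        ("otp-integration", 6),
        ("integrity-logging", 4),
    ];
    let webserver_a = ["tls1.3", "hsm-integration", "integrity-logging"];
    let webserver_c = ["tls1.3", "otp-integration"];
    println!("webserver-a scores {}", risk_score(&webserver_a, &weights));
    println!("webserver-c scores {}", risk_score(&webserver_c, &weights));
}
```

A different organisation, with different mitigations already in place, would choose different weights – which is precisely the point.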

Why is this important to me?

If you are already a buyer of services, and care about security – you’re a CISO, or part of a procurement department, for instance – you probably consider these differences already. You buy security products to address security problems or meet specific risk profiles (well, assuming they work as advertised…), but you have other applications/products for which you have compliance-related security checks. Other security features are part of how you decide which product to buy.

If you are a developer, architect, product manager or service owner, though, think: what am I providing here? Am I providing a security application, or an application with security features? If the latter, how do I balance where I put my effort: in providing and advertising compliance-centric or risk-centric features? In order to get acceptance in multiple markets, I am going to need to address all of their compliance requirements, but that may not leave me enough resources to provide differentiating features against other products (which may be specific to that industry). On the other hand, if I focus too much on differentiation, I may miss important compliance features, and not get in the door at all. If you want to get to the senior decision makers in a company or organisation and to be seen as a supplier of a key product – one which is not commoditised, but is differentiated from your competitors and really helps organisations manage risk – then you need to think about moving to risk-centric security features. But you really, really need to know what compliance-centric features are expected, as otherwise you’re just not going to get in the door.