Thursday, June 30, 2022

The first complete SBOM tool!


I’ve noted a number of times that there is no “complete” open source SBOM consumption tool available – one that provides an organization with a constantly updated list of exploitable vulnerabilities in the software products it operates. Specifically, I mean a tool that:

1.      Ingests SBOMs and retrieves the identifiers of all components;

2.      Looks up each of those components in the National Vulnerability Database (NVD) or the OSS Index database (or another vulnerability database), and identifies any open CVEs that apply to them;

3.      Stores each of these CVEs and temporarily assigns them a status of “exploitable”. Note the CVEs are stored by product and version number, but not by the component they apply to (since that information isn’t needed, once the vulnerability has been identified);

4.      Ingests VEX documents as they come in and removes from the list any CVE that is shown to have a status of “not affected” – meaning it isn’t exploitable in the product itself, usually because of how the component was installed (or wasn’t installed);[i]

5.      Repeats steps 2-4 regularly (ideally, multiple times a day); and

6.      Provides the user an almost continuously updated list of exploitable vulnerabilities in software products and versions that they operate.
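The six steps above can be sketched in a few lines of Python. Everything here is illustrative – the function names, data shapes, and lookup logic are assumptions, not any real tool’s API – and, following the note in footnote [i], non-exploitable CVEs are flagged rather than deleted:

```python
# Hypothetical sketch of steps 1-6: all names and data shapes are
# illustrative, not a real SBOM tool's API.

def exploitable_vulns(sbom_components, vuln_db, vex_statuses):
    """Return CVEs with their status after applying VEX data.

    sbom_components: component identifiers taken from an ingested SBOM
    vuln_db: maps a component identifier to a list of open CVE IDs
    vex_statuses: maps a CVE ID to a VEX status such as "not_affected"
    """
    # Steps 1-3: look up each component and assume "exploitable" by default
    open_cves = set()
    for component in sbom_components:
        open_cves.update(vuln_db.get(component, []))

    # Step 4: flag (rather than delete) anything a VEX marks not affected
    return {cve: ("not_exploitable"
                  if vex_statuses.get(cve) == "not_affected"
                  else "exploitable")
            for cve in open_cves}

# Example run with made-up data
db = {"pkg:maven/org.example/lib@1.0": ["CVE-2022-0001", "CVE-2022-0002"]}
vex = {"CVE-2022-0002": "not_affected"}
result = exploitable_vulns(db.keys(), db, vex)
# result: CVE-2022-0001 stays "exploitable", CVE-2022-0002 is flagged
```

Steps 5 and 6 simply amount to re-running this lookup on a schedule and presenting only the CVEs still marked “exploitable” to the user.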

Why is it so important that the final product of the tool be a list of exploitable vulnerabilities in software products, not all vulnerabilities identified from components in an SBOM? Because there are so few of the former, compared to the latter. I’ve always used the estimate that 90-95% of component vulnerabilities are not exploitable in the product that includes the component, but that was always based just on guesses I’d heard. However, just this morning, I read a Dark Reading article that quotes a researcher to the effect that 97% of open source component vulnerabilities aren’t exploitable in the product itself. I’ll also point out that about 90% of components are open source, so including commercial components probably won’t change this percentage very much.

And why is it important that the tool provide just a list of exploitable vulnerabilities? Because, if this isn’t done, end users are going to waste a lot of time trying to track down vulnerabilities that aren’t exploitable in the first place. Suppliers’ help desks will be overwhelmed with calls and emails that ask when the supplier will patch CVE-2022-XXXXX, yet more than 19 of every 20 calls will be about non-exploitable vulnerabilities – damaging morale on the help desk and wasting lots of professional time.

Until very recently, there was no tool – whether open source or commercial – that performed all the above steps and output a list of exploitable vulnerabilities. For a long time, there have been commercial tools that perform steps 1-3, especially software composition analysis (SCA) tools. And there has been one open source tool, Dependency-Track (an OWASP project), that also performs steps 1-3. In fact, this tool is currently used by software developers about 200 million times a month to identify vulnerabilities in components of products they’re developing, allowing them to remediate the vulnerabilities or replace the components before they ship the product. But no open source tool performed steps 4-6.

However, I’m pleased to report that Dependency-Track can now ingest VEX documents and flag non-exploitable vulnerabilities, so that the list provided to the user will only include the 3% of component vulnerabilities that are exploitable in the product, but not the 97% that aren’t[ii]. This means that Steve Springett, who developed Dependency-Track ten years ago (about five years before the term SBOM even came into use) – and who also is co-leader of the CycloneDX OWASP project – is the winner of the coveted Alrich Prize, for the developer of the first complete SBOM consumption tool (either open source or commercial).

Now, you may wonder what huge award comes to the winner of the Alrich Prize. Here it is: lunch with Tom Alrich! This might be pretty expensive if Steve lived in London or in Australia, where Patrick Dwyer, Steve’s co-leader of the CycloneDX project, lives. However, it just happens that Steve lives about eight miles from where I do, in suburban Chicago (and note that I waited until now to announce the prize. I wasn’t going to take the chance that Patrick or someone else in a far-flung region would win it!). I’ll let Steve choose the McDonald’s where he’ll receive his award 😊.

But there’s one other important piece of news on the SBOM Tools front. I haven’t mentioned that Dependency-Track ingests only CycloneDX, not SPDX, SBOMs. Up until last week, there wasn’t an open source tool that could perform any of the above steps with SPDX SBOMs. However, at the Linux Foundation Open Source Summit in Austin last week, Jennings Aske, CISO of New York Presbyterian Hospital, announced that they have made their Daggerboard SBOM consumption tool available as an open source project, through the Linux Foundation; other NYP people demonstrated the tool.

Daggerboard ingests both SPDX and CycloneDX SBOMs and outputs lists of component vulnerabilities, meaning it performs steps 1-3 above. Since this is the first open source tool that ingests SPDX SBOMs, that’s good. However, I’ve had to notify Jennings that he didn’t win the prize because a) I can’t afford to fly to New York to buy him lunch, and b) Daggerboard doesn’t ingest VEXes, so it doesn’t output exploitable vulnerabilities.

Of course, Jennings was devastated when he heard this news, so I did promise that, once Daggerboard is modified to ingest VEXes (which is definitely on NYP’s feature list for version 2), I’ll let him know the next time I’m in town visiting my sister, and we can have lunch then. After all, it’s the least I can do (it’s also the most I can do).

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] Actually, the non-exploitable CVEs shouldn’t be removed from the list for the product, but they should be flagged as not exploitable. They need to be retained because a) some users require their suppliers to patch all vulnerabilities, even non-exploitable ones; and b) sometimes, if a CVE isn’t exploitable because of other code in the product that prevents an attacker from accessing the vulnerability, there’s a possibility that a future code revision may inadvertently undo that protection and make the CVE exploitable after all.

[ii] Of course, this assumes that suppliers will provide VEX documents, as well as SBOMs. Fortunately, I think that’s likely to happen, since suppliers are anxious to avoid the mass help desk resignations that may occur if customers start calling about all component vulnerabilities, not just the non-exploitable ones. In fact, two of the largest software organizations in the world were the main drivers behind development of the VEX format under the NTIA – and the continuation of that work under CISA.

Wednesday, June 29, 2022

Home on the (version) range


One nice thing about working in a new field of endeavor – SBOMs and VEXes – is that I get to worry about new problems that I never even considered to be problems before. Such is the case with version ranges.

Like probably 90% of people in the cybersecurity field, it never even occurred to me that there would be a problem with specifying a version range. For example, if I say, “Vulnerability XYZ is found in versions 2.2 through (and including) 3.0 of product ABC” and someone asks me if version “2.2a” or “2.5 patch 2” or “3.0 build 4” falls in that range, I would probably answer in the affirmative in all three cases. Moreover, I wouldn’t worry that some people would think differently. This is because I assume that any letter appended to a number can be ignored when determining whether or not the version falls within the range, and I also assume that any version string[i] that includes a patch or build number would also fall into the range, assuming the version number by itself would. Furthermore, I assume that everyone else thinks the same way.

However, computers are very literal-minded individuals. A computer can’t be expected to reason in the way I just described unless it’s been explicitly instructed to do so. In fact, there are probably countless other rules that humans apply all the time without thinking about them, which a computer will never apply unless it’s told to.

The problem came up when the CISA VEX committee (which was under the NTIA until the end of last year) was working on this document (which is the best document on VEX, by the way). The document includes (via links to GitHub) examples of about ten use cases for VEX, in both the CSAF and CycloneDX VEX formats. When we considered a use case that included a version range, we realized we couldn’t naively assume that all ranges would be correctly interpreted by the tool on the user end.

Why was the committee concerned with version ranges? Because a VEX document is a machine-readable vulnerability notification. Often, vulnerabilities apply to a range of versions of a product, since the vulnerable code didn’t change over a number of versions. In cases like this, being able to specify a version range can be a huge time saver vs enumerating every version in the range.

For example, you might want to say, “No versions of product ABC before version 4.0 are affected by vulnerability CVE-2022-12345.” There might be 150 or more versions in that range (both patch and build levels need to be included as separate versions, since a new SBOM should be issued whenever there has been any change in the software); enumerating every one of them wouldn’t be a lot of fun, to say the least.

What’s even worse, it’s common that nobody at the current supplier of the product can enumerate all the early versions, since the product was originally sold by another supplier. Thus, enumerating all versions before v4.0 may be somewhere between hard and impossible. It seems like a gift from God to be able to encompass all those versions with a simple statement like “all versions before v4.0”.

Of course, that’s simple for humans, but not for computers. Unless a computer sees an enumeration of all the versions within a range, it’s not at all certain that it will be able to determine whether every version string presented to it falls within the range or not, even if a human would have no problem deciding that question. Even worse, the computer might make the wrong decision and identify a version string (e.g., “v3.23 build 3”) as being outside of the range, even though almost any human would identify it as being within the range.

Here's how the CycloneDX VEX format treats version ranges: It supports any range that is listed in the Version Range Specifier. This document is widely followed by various projects that require version ranges, including the new CVE 5.0 specification. Rather than attempt the impossible task of stating a universal algorithm for interpreting all version ranges no matter how the version strings themselves are constructed, this document provides a generalized format for specifying a range in one of a number of versioning schemes.

These schemes are all relatively simple to make calculations in. For example, one of the most popular of the versioning schemes is “semantic versioning”. This scheme requires version numbers in the form “X.Y.Z”, where X = major version number, Y = minor version number and Z = patch number. X, Y, and Z all must be integers, and they must increment by 1. Thus, it is relatively easy to determine whether any version string in the “X.Y.Z” form falls inside or outside of a range with endpoints that follow semantic versioning.

A version range in the CycloneDX VEX format reads (for example) "vers:semver/>=2.9.0|<=4.1.0". “semver” refers to semantic versioning, of course. If the range is specified in a different versioning scheme listed in the Version Range Specifier, that scheme will be substituted for “semver”. This example can be read to mean “the range between and including versions 2.9.0 and 4.1.0, interpreted using semantic versioning”.
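As an illustration, a naive checker for the semver case of a range string like this might look as follows. This is a simplified sketch: the real Version Range Specifier grammar has more operators and edge cases (exclusions, wildcards, pre-release tags) than are handled here:

```python
def parse_semver(v):
    """Split 'X.Y.Z' into a tuple of integers for comparison."""
    return tuple(int(part) for part in v.split("."))

def in_vers_range(version, vers_string):
    """Check a semver version against a range like 'vers:semver/>=2.9.0|<=4.1.0'.

    Only handles >=, <=, >, < constraints joined by '|'; the full
    Version Range Specifier supports more than this.
    """
    scheme, _, constraints = vers_string.partition("/")
    assert scheme == "vers:semver", "only the semver scheme is sketched here"
    v = parse_semver(version)
    for c in constraints.split("|"):
        # Check two-character operators before one-character ones
        for op, test in ((">=", lambda a, b: a >= b), ("<=", lambda a, b: a <= b),
                         (">", lambda a, b: a > b), ("<", lambda a, b: a < b)):
            if c.startswith(op):
                if not test(v, parse_semver(c[len(op):])):
                    return False
                break
    return True

# "3.0.0" falls inside >=2.9.0 and <=4.1.0; "4.2.0" falls outside it
```

Because every semver version string maps cleanly onto a tuple of integers, the in-range decision reduces to ordinary tuple comparison – which is exactly the property the non-standard versioning schemes discussed below lack.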

A supplier (which can be an open source project) that is using one of the versioning schemes supported by the Version Range Specifier will have no problem specifying version ranges in the CycloneDX VEX format. That’s the good news. The bad news is that, when it comes to commercial software, a large percentage (and perhaps the majority) of suppliers don’t follow one of the schemes in the Version Range Specifier.

For example, here’s a version string for Cisco IOS™: “12.2(33)SXI9”. Suppose someone asked you to develop an algorithm that would determine whether “12.2(35)SXI7” is more or less recent than that version[ii]. How would you do this? One algorithm might decide that the second version string is more recent, because 35 is greater than 33. But another algorithm might decide that the first string is more recent, because 9 is greater than 7. Obviously, this question can’t be answered in any way other than to say, “Ask someone from Cisco to explain this to you.” That doesn’t work too well if you’re designing a computer algorithm.
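The ambiguity can be demonstrated directly: two equally plausible parsing rules reach opposite conclusions about the same pair of strings. This is a contrived sketch modeled on the IOS example – neither rule reflects Cisco’s actual scheme:

```python
import re

V1, V2 = "12.2(33)SXI9", "12.2(35)SXI7"

def numbers(s):
    """Extract every run of digits from a version string, in order."""
    return [int(n) for n in re.findall(r"\d+", s)]

# numbers(V1) -> [12, 2, 33, 9]; numbers(V2) -> [12, 2, 35, 7]

# Rule A: compare the parenthesized maintenance number (third digit run)
rule_a_newer = V2 if numbers(V2)[2] > numbers(V1)[2] else V1   # 35 > 33

# Rule B: compare the trailing rebuild number (last digit run)
rule_b_newer = V2 if numbers(V2)[-1] > numbers(V1)[-1] else V1  # 7 < 9

# The two rules disagree about which version is newer, so no generic
# algorithm can order these strings without vendor-specific knowledge
```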

In other words, Cisco’s versioning scheme, at least for IOS, may not be amenable to algorithmic interpretation at all. And even if it is, there would need to be special code – probably provided by Cisco – in the software that interprets the IOS range. But Cisco is not unique at all. I’m writing this post on a laptop running Windows 11 Home version 21H2 build 22000.739, running Experience Pack 1000.22000.739.0. How would you figure out a version range, given all of these numbers? I wouldn’t even try to do that.

So, the CycloneDX VEX format supports version ranges for certain products, that are much more likely to be open source than commercial. If a product uses one of the supported versioning schemes, the party creating a VEX can be certain that any version string that falls within a specified range will be interpreted by the user’s tool to fall within that range. However, suppliers like Cisco and Microsoft are out of luck, unless they want to provide some custom code to include in tools that interpret CycloneDX VEX documents.

How does the CSAF VEX format support version ranges? It also supports the versioning schemes identified in the Version Range Specifier. But for all other ranges, the word is caveat emptor. The VEX creator can’t be certain that a range in any other versioning scheme will always be correctly interpreted by the tool on the user end.  For those schemes, it is better to enumerate every version in a range, rather than specify the range itself. In other words, the CSAF-based VEX format doesn’t differ much at all from the CycloneDX-based VEX format regarding versioning. Sorry, Cisco.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] This is the more general term for “version number”, since it includes the possibility of versions that are referred to by more than numbers. 

[ii] Of course, an algorithm to interpret version ranges would need to include this as one of its calculations.

Friday, June 24, 2022

No good deed goes unpunished


In my most recent post, I pointed out that it’s literally counterproductive to buy only software products for which no vulnerabilities have ever been reported – especially if the product doesn’t have a CPE name, so it’s not listed in the NVD at all. This is because not reporting any vulnerabilities is almost certainly not a sign that the product is perfect, but that the supplier doesn’t think it’s worthwhile to even look for vulnerabilities. After all, if you’re sure your software is perfect, why even bother to look? I provided a vivid example of a device that’s not even listed in the NVD, but probably has about 40,000 vulnerabilities (2,000 of which are probably exploitable) in its software and firmware, according to Tom Pace of NetRise.

However, I also pointed out that Steve Springett had mentioned that, in his day job, they not only don’t get fooled by products with zero vulnerabilities, but they literally favor products for which a lot of vulnerabilities have been reported. The reasoning is simple: There are a ___-load (excuse me, “a large quantity”) of vulnerabilities in just about any software or firmware product. Since most vulnerabilities are reported by the supplier of the vulnerable product, they can exercise discretion in the ones they report and the ones they don’t report.

In the post, I did point to one type of vulnerability that probably doesn’t need to be reported: vulnerabilities that the supplier discovers and immediately fixes. Since these vulnerabilities never even have the chance to be found in a shipped product from the supplier, I don’t see why they would need to be reported. The whole reason for reporting is to inform software-using organizations of vulnerabilities in software they use, not as a form of public flagellation of the supplier for not being perfect in all circumstances, even if there’s no possibility of harm to anyone.

On the other hand, a supplier definitely should report any vulnerability that is found in a product they have already shipped. And Steve is saying it’s inevitable there will be a lot of vulnerabilities that meet this condition (of course, the biggest reason for this is that new vulnerabilities are always being discovered, so a few lines of code that seemed totally innocuous when the software was written suddenly are identified as a vulnerability). So, if a product has a CPE name, but only a few vulnerabilities have ever been reported for it, this is usually an indication that the supplier isn’t looking very hard for vulnerabilities (the fact that there’s a CPE name means they’ve at least looked once, but that’s not much better than never having looked in the first place).

This brings me to an excellent blog post I just read, by Walter Haydock. Walter is quite an interesting fellow. I connected with him on LinkedIn last year, and a couple of months ago, he mentioned me in a list of five people that his LinkedIn followers should connect to. I was of course pleased with that, but I wasn’t expecting what came next: within two days, about 300 people requested to be connected with me, all due to Walter (whereas I’d probably never had more than two or three requests in one day, and most days none at all). I found this almost scary; I asked Walter what would happen if he told all of his followers to drink poisoned Kool-Aid™. He didn’t know, but also added that he didn’t plan to try that. Good thing! With great power comes great responsibility.

You get the point: This guy’s blog is really good. I highly recommend you subscribe (although I don’t think my recommendation will bring Walter 300 new subscribers in two days!). His latest post makes a point something like my point in my previous post: that the number of vulnerabilities itself isn’t a very meaningful metric (he writes both for suppliers and end users, so you get both perspectives on this question).

What struck me most in Walter’s post was this paragraph:

Some compliance regimes, such as the federal government’s FedRAMP standard…(establish) clear perverse incentives with respect to how organizations look at the total number of findings. For example, after initial authorization, organizations that identify an increase of 20% or more from the baseline, or 10 unique vulnerabilities, whichever is greater, are subject to a “Detailed Finding Review.” By punishing companies who identify more than an arbitrary number of vulnerabilities - regardless of their severity - during a given time frame, the government disincentivizes its vendors from looking too closely for these vulnerabilities in the first place.

This was a real eye-opener to me: not only does a supplier get no credit for reporting a healthy number of vulnerabilities, but they can actually be penalized for doing so. We clearly need to re-think how we treat software vulnerabilities.

Any opinions expressed in this blog post are strictly mine, and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


Wednesday, June 22, 2022

My most recent podcast


I’m pleased to report that Cybellum, an Israel-based company whose mission is “…to enable manufacturers and their suppliers to develop and maintain products that aren’t just safe, but are also secure”, has just posted a podcast I taped with them a few months ago. I’m quite pleased with the results, which are due as much to their good questions as to anything I said.

The podcast – which is only 26 minutes long – ended up being very focused on what I believe is the biggest issue preventing widespread adoption of SBOMs by end user organizations (i.e. organizations whose primary mission isn’t developing software for other organizations. Developers are already heavily using SBOMs for their internal purposes, thank you very much): the current lack of tools and scalable third-party services to utilize SBOMs and VEX documents for software risk management purposes, as well as the relative dearth of written guidance on how non-developers can use SBOMs.

I’ll warn you that, if you’d rather hear happy stories about how SBOMs and VEXes are already being widely used and how it will just take a little more of what we’re currently doing to reach component security nirvana, perhaps you need to look for another podcast. There is a huge amount of work to be done, and even what I know to be in the pipeline at the current moment is totally inadequate to address what’s needed.

However, I’m also quite optimistic that what’s needed will come, and in the not-distant future. I’m optimistic because – as a student of Milton Friedman during his heyday at the University of Chicago – I believe that free markets will ultimately both a) allow consumer demand for SBOMs to rapidly grow from its current close-to-nonexistent level, and b) “monetize” the so-far-nascent (at best) sub-markets for tools and services for widespread distribution and utilization of SBOMs for vulnerability risk management purposes.

I also want to point out that the previous podcast in this same series featured Steve Springett, the creator of Dependency-Track (which I mention in the podcast, and have referred to multiple times in these posts) and leader of the OWASP CycloneDX project (Steve is also the brains behind the current effort to solve the naming problem, one of the primary inhibitors of widespread production and utilization of SBOMs).

I recommend you also listen to that podcast, since Steve provides some very good insights into the current and future state of SBOMs and VEXes. In my opinion, Steve is the intellectual leader of the SBOM and VEX communities. The rest of us are just trying to visit the places where he’s already arrived, made a big difference, then moved on to his next challenge.

Any opinions expressed in this blog post are strictly mine, and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


Sunday, June 19, 2022

Seriously…never buy vulnerability-free software!


In this recent post (and a couple subsequent ones), I discussed an interesting presentation by Tom Pace of NetRise. In it, he described how he’d found 1,237 vulnerabilities by identifying components in the firmware in a device that’s used in many ICS environments.

Having vulnerabilities – and sometimes a lot of them – is certainly not unique to this product. Why did I write three posts about this discovery? The problem is that, were you to look for vulnerabilities for that product in the National Vulnerability Database (NVD), you would get the message “There are 0 matching records.” Sounds good, doesn’t it? No vulnerabilities at all! That conclusion would be reassuring, except it’s off by 1,237 vulnerabilities.

Now, suppose you were comparing three products in advance of a procurement decision, and you look up vulnerabilities for all three. For two of them, you find a handful of vulnerabilities, but for the third product, you get the above message. Would you tell the first two vendors, “Thanks, but no thanks”, and write up your PO for the third vendor? I’m sure many organizations would.

Of course, this would be a mistake, since Tom found 1,237 vulnerabilities listed for components in just the firmware of this device. But it turns out the true story is worse: Tom said last week that after further analysis, he identified 2,200 vulnerabilities in components included in the device’s firmware. Even worse, after analyzing all of the software installed in the device, he estimates there are around 40,000 component vulnerabilities in the whole device. That’s a lot.

To be sure, these aren’t all exploitable vulnerabilities. As I’ve mentioned often, probably 90-95% of vulnerabilities identified in software and firmware components within a device aren’t exploitable in the device itself, often because of how the component was implemented. Let’s say that the percentage for this device is 95%, meaning only 5% of the identified vulnerabilities are exploitable. That’s still 2,000 exploitable vulnerabilities in a single device.

This is obviously not good, but what’s even worse is the fact that not a single one of these 2,000 exploitable vulnerabilities appears in the NVD. They don’t appear because the supplier never registered the product. If the supplier (or somebody else) doesn’t register a product, it doesn’t get a CPE name. And if it doesn’t have a CPE name, nobody can report vulnerabilities for the product to the NVD. The product will appear to be perfect – no vulnerabilities, either current or reported in the past.

In fact, this supplier is so good that they’ve never registered any of their 50 or so products – meaning everything this supplier makes has a perfect record! Moreover, they don’t even mention the word “security” or “vulnerability” on their website. Why should they, given that all of their products are perfect?

Of course, that company’s products aren’t perfect – just the opposite. And the company is hardly unique. There are lots of other companies that haven’t registered some or all of their products on the NVD, meaning that anyone searching for vulnerabilities in those products will also get the message, “There are 0 matching records.”

What does all this mean? I hope you’re sitting down, since I need to give you some bad news: There are no perfect products or perfect suppliers (there’s also no Santa Claus or Easter Bunny. Might as well give you all the bad news at once). You should never interpret the fact that you can’t identify vulnerabilities for a software product (or intelligent device) in the NVD (or any other vulnerability database, of course) to mean that the product doesn’t have vulnerabilities.

But there’s more to it than that. Not only should you stay away from “perfect” products, but you should also deliberately favor products that show a lot of vulnerabilities in the NVD. Why is this? Remember, vulnerabilities are almost always reported by the suppliers themselves. Would you rather buy a product from a supplier that has only reported a few vulnerabilities in the past year or two, or from one that has reported a lot of them? If a supplier has only reported a few vulnerabilities, this doesn’t mean they’re good; on the contrary, it probably means they’re clueless in cybersecurity matters. It means the supplier isn’t looking very hard – or not at all – for vulnerabilities, so they’re not finding many.

Steve Springett, who I’ve written about a number of times and who is tasked with helping 2,000 coders produce secure software in his day job, said last week that his company deliberately favors products for which there are a lot of reported vulnerabilities. They consider this a sign that the supplier is diligently seeking out vulnerabilities, not waiting for their product to be hacked.

So not only should you avoid “perfect” products, but you should actually seek out suppliers that have reported a lot of vulnerabilities. Of course, you also want to make sure that such a supplier hasn’t left serious vulnerabilities unpatched. My guess is, if a supplier has found and reported a lot of exploitable vulnerabilities, they’ve also done a good job of patching them. In fact, the supplier should report vulnerabilities, even if they’re patched.[i] That’s the only way the rest of the world will learn about the real impact of particular vulnerabilities.

Any opinions expressed in this blog post are strictly mine, and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] There are limits to this suggestion. I know that suppliers discover vulnerabilities in products under development all the time, and patch them immediately. I don’t think they need to report those. But when a vulnerability is discovered in a product that’s already on the market, they always need to report it – along with providing the patch, of course.

Monday, June 13, 2022

It’s time for the VEX talk…


I’ve wanted to have this talk with you for a long time, but I’ll admit I’ve been putting it off. However, there are some important things you need to know about VEX, and I’d rather you hear them from me than from someone else (such as one of your friends):

1.      There are now two VEX formats. They’re very different, although they have the same purpose. The first is based on the new CSAF version 2 vulnerability reporting standard from OASIS (it uses a subset of the capabilities in that standard, called a profile). The other is based on the OWASP CycloneDX SBOM format, although any SBOMs referred to in the VEX can be in either CycloneDX or SPDX format.

2.      There is no direct connection between SBOMS and VEX documents. Even though the VEX format was developed to address the issue of non-exploitable vulnerabilities identified using SBOMs, the VEX itself doesn’t need to refer to an SBOM or to a component. The question of the exploitability of a vulnerability, that is found in a particular component, only applies to the product as a whole: Is the vulnerability exploitable in the product or not?

3.      There’s no way to do a direct translation between the two formats; by that I mean not only that there aren’t translation tools, but that I don’t even see a good way to do a direct translation, other than to extract the data from one format and create a new VEX in the other.

4.      Currently, automated tools to create VEXes[i] in either format are scarce. This recently approved document from the CISA VEX committee provides links to JSON code for various use cases, in both VEX formats.

5.      However, unlike with SBOMs, where the format used should be a concern for end users, end users really shouldn’t care in what format a VEX is provided, as long as the tool or third-party service they are using to track vulnerabilities in software components can interpret it. This is because a VEX document, even though it’s comprehensible to a human who wants to invest some time learning about the format it’s written in, will always be created with machine readability in mind.

6.      Suppliers are already providing vulnerability notifications to their users in various ways now – in emailed PDFs, website postings, etc. So, VEXes don’t give them a new capability (as SBOMs do)[ii]. When suppliers start providing VEXes, they’ll be expressly intended to be read by machines; the user will seldom read a VEX themselves. If they do, it will probably be to satisfy their curiosity about what the VEX looks like, not actually to absorb information.

7.      And speaking of machine readability, what tools or third party services are currently available to utilize VEX documents? The tool will need to do more than ingest VEXes, since VEXes alone won’t usually provide any useful information. Instead, a tool will need to do all of the following:

a.      Ingest an SBOM about a particular software product (or intelligent device) – call it product A – and retrieve all the component information from it;

b.      Using the NVD or another vulnerability database like OSS Index, create a list of vulnerabilities currently applicable to components listed in the SBOM for A;

c.      Ingest a VEX document or documents that apply to A;

d.      Based on the information in the VEX documents, remove from the list of vulnerabilities applicable to A all vulnerabilities that aren’t exploitable in A; and

e.      Repeat this process daily, if not more often.
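Steps a. to e. can be sketched in a few lines of code. Everything here is hypothetical – the component identifiers and the in-memory vulnerability "database" are stand-ins for a real SBOM parser and an NVD or OSS Index client:

```python
from dataclasses import dataclass, field

# A sketch of steps a-e above. All names are mine, not from any real tool.
@dataclass
class VulnerabilityTracker:
    # product name -> set of CVE IDs currently considered exploitable
    exploitable: dict = field(default_factory=dict)

    def ingest_sbom(self, product, components, vuln_db):
        # Steps a and b: list the components, look each one up, and treat
        # every CVE found as exploitable until a VEX says otherwise.
        cves = set()
        for comp in components:
            cves |= vuln_db.get(comp, set())
        self.exploitable[product] = cves

    def ingest_vex(self, product, not_affected_cves):
        # Steps c and d: remove CVEs the VEX marks "not affected".
        self.exploitable[product] -= set(not_affected_cves)

# Step e would simply re-run both ingestion methods on a schedule.
vuln_db = {"libfoo@1.2": {"CVE-2022-0001", "CVE-2022-0002"}}
tracker = VulnerabilityTracker()
tracker.ingest_sbom("Product A", ["libfoo@1.2"], vuln_db)
tracker.ingest_vex("Product A", ["CVE-2022-0001"])
print(tracker.exploitable["Product A"])  # {'CVE-2022-0002'}
```

Note that, just as described in step d, the tracker stores vulnerabilities by product, not by component: once a CVE has been identified, the component that introduced it no longer matters.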

8.      As VEXes are received, the list will be whittled down so that it contains only exploitable vulnerabilities. Thus, the user is much less likely to waste time trying to identify and patch non-exploitable vulnerabilities than they would be if they didn’t receive VEXes at all.

9.      Currently, there are no tools or subscription-based third-party services that perform steps a. to e. above. However, I know that the Dependency-Track open source tool will soon be able to ingest CycloneDX VEXes (it can create VEXes now, in the CycloneDX VEX format). Dependency-Track has for at least ten years been able to read SBOMs (in the CycloneDX format) and look up vulnerabilities in the NVD or OSS Index. It is now getting a huge amount of use from developers who want to learn about vulnerabilities affecting components in a software product they’re developing. The VEX ingestion capability (which I’ve seen demonstrated) should be available soon.[iii]

10.   Even with the current lack of tools for utilizing VEXes by end-users, I’m optimistic that the tools and the VEXes will be available before long. This is because software suppliers are the main drivers behind the VEX format. Two very large suppliers were the most interested in getting work on the format started in 2020 and are still heavily involved with that work today. Both of these suppliers – as well as many more – are very concerned about the heavy load that will be placed on their help desks if customers start calling in about the large number of vulnerabilities that will be found in the NVD for components of their products. The customers will naturally think of all these vulnerabilities as exploitable – when in fact only a small percentage of them are. One of the two large suppliers said they expected that producing VEXes along with SBOMs will end up saving them thousands of false positive help desk calls every month.

11.   In this post, I suggested that suppliers should take responsibility for a) looking up component vulnerabilities in the NVD every day, and b) “whittling down” the list of vulnerabilities so that only exploitable ones remain on it. This would confer the added advantage that VEX documents would never have to be sent outside the supplier (or outside the third party the supplier engaged to perform this work for them), and end users wouldn’t have to look up component vulnerabilities themselves.

12.   If you think about it, it’s somewhat silly that, if a supplier has for example 10,000 customers, every one of those customers should have to look up the same components in the NVD and get the same list of vulnerabilities, vs. the supplier just doing it once for all of them. Following the five steps listed above, the supplier would provide their customers a continually updated list of exploitable vulnerabilities in particular versions of their product or device (whichever versions the customer is using). I think this idea makes a lot of sense and will probably be realized one of these days, just due to market pressures.

13.   Another thing I would like to see happen is what I call “real-time VEX”. These are simple VEX documents focused on a single product (although perhaps multiple versions of that product) that simply list non-exploitable component vulnerabilities in the product. Because they are so simple, the supplier could put one out as soon as they discover that a particular component vulnerability isn’t exploitable in the product. Receiving these right away will let end users avoid what could be a lot of wasted effort looking for vulnerabilities that simply aren’t there, as opposed to waiting a week or more to receive that information.

Any opinions expressed in this blog post are strictly mine, and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] By “automated tool for producing VEXes”, I mean a tool that will ask the user (typically a supplier producing a VEX for their product) the specific questions that need to be answered in a VEX, including “What is the product?” “What is the vulnerability?” “What is the status of that vulnerability in each of the versions that you wish to report on?” etc. With the answers to those questions, the tool will produce a VEX in one of the formats (CSAF-based or CycloneDX-based). I don’t think it will be hard to develop such tools (at least as far as one of the formats is concerned), and I’m sure some will be developed soon. 

[ii] VEX was developed specifically to provide a “negative vulnerability notification”. The normal “positive” notification that we’ve all seen from software suppliers usually says something like “This vulnerability (CVE-2022-12345) is exploitable in our product. Please apply patch XYZ or upgrade to version XX.XX.” However, a negative notification essentially says, “This vulnerability is not exploitable in our product. You don’t have to do anything about it.” Negative notifications have always been an option, but they were not often needed until SBOMs started becoming available. However, since there’s agreement in the software community that a majority of vulnerabilities found in software components aren’t exploitable in the product itself, the advent of SBOMs suddenly created an immediate need for machine-readable negative notifications. Of course, both VEX formats can produce positive notifications as well. 

[iii] Fairly soon, there will also be a subscription-based third-party service that will ingest both SBOMs and VEXes and provide users with a continually updated list of exploitable component vulnerabilities in products of concern.

 

Saturday, June 11, 2022

80 percent!


In my last post, I described how a group of friends and I are meeting weekly to discuss roadblocks that are currently preventing widespread distribution and use of SBOMs. (SBOMs are already produced in huge quantities by suppliers, but the suppliers aren’t distributing them to customers; they’re using them to improve the security of their own products.)

In that post, I pointed out that we had decided to do what we could to remove what might be the most difficult of the roadblocks, which is usually called the “naming problem” in the SBOM community. I also pointed out that we had already made progress on identifying the cause of the problem and identifying at least a partial solution.

I’m pleased to report that this week we agreed on what looks like it could be a real solution – and even better, it probably won’t be technically hard to implement. It will be politically hard (which we always believed would be the case), but we think that, given that this problem is holding up the widespread rollout of SBOMs (federal agencies are required to start asking their software suppliers for them in August, because of Executive Order 14028), this may be the time when the walls finally come tumbling down.

The background of the problem is this: The primary use of SBOMs, when it comes to cybersecurity risk management, is providing a list of components in a software product (or in an intelligent device). The end user – or a third party acting on their behalf – can look up the components in the National Vulnerability Database (NVD), to find out which if any current vulnerabilities (CVEs) apply to them. They can then “coordinate with” (i.e., bug) their supplier to patch those vulnerabilities quickly.

Looking up a component in the NVD requires knowing the CPE name for the product, which is a long character string like “cpe:2.3:a:adobe:acrobat_reader:11.0.5:-:*:*:*:windows:*:*”. The problem is that, for a large percentage of components, there is no CPE name to be found – for one of a host of reasons.

Of course, the NVD has a search function that might yield results, but SBOMs need to be fully machine-readable. Given that many software products have thousands of components, having to search on even one of those will slow down processing of the SBOM. Searching for 100 CPE names would be a lot of work, and searching for say 4,000 CPEs would be a monumental task.
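To give a sense of what’s involved, here is a rough sketch of assembling a CPE 2.3 name like the Acrobat Reader example above. The real CPE specification has binding and escaping rules that this simplified, hypothetical function ignores:

```python
# Simplified sketch of building a CPE 2.3 name string. The real CPE 2.3
# binding specification has escaping rules not handled here; the "-" in
# the update field mirrors the Acrobat Reader example above.
def make_cpe(vendor, product, version, target_sw="*"):
    part = "a"  # "a" = application (vs. "o" for OS, "h" for hardware)
    # Field order: part, vendor, product, version, update, edition,
    # language, sw_edition, target_sw, target_hw, other
    fields = [part, vendor, product, version, "-",
              "*", "*", "*", target_sw, "*", "*"]
    return "cpe:2.3:" + ":".join(fields)

print(make_cpe("adobe", "acrobat_reader", "11.0.5", target_sw="windows"))
# cpe:2.3:a:adobe:acrobat_reader:11.0.5:-:*:*:*:windows:*:*
```

Constructing the string is the easy part; the hard part, as discussed below, is that the name you construct very often doesn’t match any name actually in the NVD.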

At the meeting this past week, I pointed out that I’d never heard an estimate of what percentage of components in an average SBOM don’t have CPE names – or at least, components for which constructing a CPE name according to the MITRE specification and searching on it doesn’t produce a hit. (There are many CPE names that aren’t constructed correctly, but to which CVEs may be attached. Unfortunately, the NVD doesn’t provide “close matches”; a search must be exact.) I wondered if anybody had an idea about that.

The Director of Product Security of a very large software supplier, who actively participates in this group, said that his company had also wondered about this. They studied the issue and found they were normally only able to find CPE names for about 20 percent of the components in the average SBOM for one of their products. In other words, it’s likely that around 80% of CPE names for software components can’t be found in the NVD, and therefore require a fair amount of “manual” effort to discover. And that’s assuming they can be discovered at all, since many software products and devices don’t have a CPE name at all, even though they may be loaded with vulnerabilities, as discussed in this post.

Of course, currently that problem just affects the supplier, not their customers - since they and almost all of their peers aren’t currently providing SBOMs to their customers. But because they do a lot of business with the federal government, in August they’ll have to start providing SBOMs to federal agencies. Given that so many of the component CPE names are unresolved, what will the agencies do? Will they have to throw the SBOMs away?

No. Remember, getting one SBOM with say 100 components, and therefore only 20 verified CPE names, is still better than getting zero SBOMs with zero components and zero verified CPE names. The agency can still find out about vulnerabilities in those 20 components and contact the supplier about any of those vulnerabilities that are shown to be serious. And if there are 5,000 components in the SBOM (and there are many products with thousands, or even tens of thousands, of components), they’ll still have verified CPE names for 1,000 of those.

But my friends and I are determined to do better. The solution isn’t to make CPE better, since the opportunities to improve CPE are quite limited. I mentioned in my last post that the solution will probably lie at least partially with the purl (package URL) identifier. While that isn’t the whole solution, it does seem to be a good part of it. I used to think this was a problem that would take ten years to solve, but now I’d say it will be substantially solved in 1-2 years, and perhaps even less than that.

Any opinions expressed in this blog post are strictly mine, and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

 

Tuesday, June 7, 2022

Going after the big one


About three months ago, a small group of friends (old and new ones) who are involved in the “SBOM business” got together to start talking about something that is on all of our minds: the small but important set of serious problems that are currently holding back widespread use of software bills of materials, and how these issues might be either mitigated or (preferably) solved altogether.

We all agreed that the problem of SBOM production by software suppliers is much smaller than the problem of use by general organizations, since many of the suppliers are already making extensive use of SBOMs today. But the suppliers are by and large producing SBOMs for their own use, to help them manage their own supply chain cybersecurity risks (i.e. the risks posed by the many components they include in their products). They’re not distributing them to their users, mostly because the users aren’t asking for them.

Why aren’t the users asking for them? There are a number of reasons, especially the lack of low-cost or open source tools and services that will help them identify risks found in components included in the software they use.

However, there’s one problem that overshadows all of the others. It’s one that was sometimes discussed by the NTIA Software Component Transparency Initiative, but which was generally considered to be insoluble – that is, insoluble without a persistent multiyear effort that would involve a lot of…well, lobbying of various government agencies and nonprofit organizations that would need to be involved in any permanent fix. Since there were other more easily soluble problems that could be addressed without such a huge effort, and since there are partial workarounds available that can make the big problem at least tolerable, the general consensus was to let this sleeping dog lie for the moment.

The problem has a number of names because it has many facets, but most people refer to it as “the naming problem”. Briefly, there are many issues with the “CPE” (common platform enumeration) names that are required in order to look up vulnerabilities (CVEs) that apply to a product in the National Vulnerability Database (NVD). Very often it will be difficult (or impossible) to look up a product, because the user is unable to find the CPE name under which it was entered.

More generally, it turns out that simply knowing the title and supplier name of a software product that you own won’t provide you with a universal name that will be valid in all times and places. A really striking example is a software supplier you might have heard of, named Microsoft. We may think of that supplier as “Microsoft”, but many different supplier names are used with the products that we would normally consider “Microsoft products”.

If you search on “Microsoft” in the NVD, you’ll miss a lot of products that are listed under a different supplier name, like Microsoft Corporation, Microsoft Europe, etc. In fact, someone who works a lot on this issue told me they had asked people at Microsoft what company they worked for, and they received something like 27 different responses. Even more interesting, there is no central location where you can go to find all products produced by the various “Microsoft” entities.  

I wrote about this problem in 2020, soon after I joined the NTIA initiative. However, at that time I agreed with the consensus that there were other fish to be fried before we turned to that one; so I didn’t write any more posts on it until today.

To be honest, when my friends and I started meeting weekly, we didn’t really intend to tackle the naming problem right away. But, in one of our first few meetings, Tom Pace of NetRise did a presentation on a very serious problem (along with two others, discussed here and here) that poses serious risks to intelligent device users; it turns out this is just one of the many facets of the naming problem.

The week after Tom’s presentation, the group decided (although I’m not sure “decided” is the right word; “stumbled into” might be better) to start exploring the naming problem further. In the four or so weeks since that meeting, to my surprise, we have made a lot of progress. Here are some of the things we’ve decided on.

The problem can be divided into short- and long-term aspects. While we explored the long-term problem for a week or two and made some progress on at least the outlines of a possible long-term solution, we decided that there are short-term steps we can take, that could lead to improvement in perhaps six months to a year. We’re going to focus on those steps for the time being.

While it’s tempting to think the solution is to have a central registry of suppliers or products – and while this would actually be a true solution to the problem – in practice, that would require a huge amount of resources, as well as close to endless lobbying, discussion, arm-twisting, etc. to put it into place and keep it operating. This simply isn’t going to happen.

The long-term solution has to be a distributed one, in which different groups are responsible for their own “namespaces”. What are those groups? They’re the groups responsible for different types of software. Since about 90% of software components are open source, most of these groups are open source repositories and package managers, including Maven, PyPI, npm, etc. These all have their own naming schemes.

However, it turns out that open source namespaces are the easiest to deal with. This is because each repository maintains its own namespace – i.e. a list of all the open source products stored in the repository. In theory, if you want to find the authoritative name for an open source component and you know the repository it came from (which is usually determined by the language it’s written in), you can just search there and find the name.

The hard type of software is proprietary software – i.e. software that you usually pay for, which comes from Microsoft, Oracle, etc. One would generally think that the supplier could tell you the authoritative name for one of their products, but in practice suppliers will often have multiple names for the same product. For example, the product might have been acquired by multiple suppliers over time; they each named it using their own naming scheme; the versions that were produced by Supplier A will all have completely different names than those produced by Supplier B, etc.

And, since suppliers themselves get acquired and divested, the product name (and of course the supplier name) will often have changed multiple times for that reason. The current owner of a product will likely have assigned it their own name; just knowing the name that a previous owner assigned to it won’t necessarily be any help in learning the current name – of either the product or the supplier.

If we’re not going to have a universal namespace maintained by an army of registrars, how are we going to find the name of a component we’re concerned about? While this isn’t necessarily the solution, there is a naming scheme known as purl (package URL), that in principle can cover all open source software. The core fields in the purl are:

  • type: the package "type" or package "protocol" such as maven, npm, nuget, gem, pypi, etc. Required.
  • namespace: some name prefix such as a Maven groupid, a Docker image owner, a GitHub user or organization. Optional and type-specific.
  • name: the name of the package. Required.
  • version: the version of the package. Optional.

If you’re not sure of the “real” name of an open source component, you can in theory use the information you have about it – these four items – to construct its purl. You should then be able to look it up in databases like Sonatype’s free OSS Index, the largest open source software database. And while I’m sure there are various gotchas, you should almost always be able to positively identify the software you’re looking for.
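As a rough illustration, here is how those four core fields combine into a purl string. This is a simplified sketch; the actual purl specification adds percent-encoding rules and optional qualifier and subpath fields that aren’t handled here:

```python
# Simplified purl builder covering only the four core fields listed above.
# The real purl spec also defines percent-encoding, qualifiers, and subpaths.
def make_purl(type_, name, namespace=None, version=None):
    purl = "pkg:" + type_          # required: package type/protocol
    if namespace:
        purl += "/" + namespace    # optional, type-specific prefix
    purl += "/" + name             # required: package name
    if version:
        purl += "@" + version      # optional: package version
    return purl

print(make_purl("maven", "log4j-core",
                namespace="org.apache.logging.log4j", version="2.14.1"))
# pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1
```

Notice that everything needed to build the purl is information the developer already has at build time – which is exactly why purls are so much easier to construct reliably than CPE names.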

Other vulnerability databases also use purl to identify software, but guess which database doesn’t use it? You’re right, it’s the NVD! The NVD only uses CPE names, meaning there’s a lot of open source software that isn’t referenced in the NVD at all. Plus, there’s more that isn’t referenced correctly, because of various problems constructing CPE names.

So guess what the next item is on my group’s agenda? It’s getting the NVD to start using purl, along with CPEs, to identify the products in the database. Technically, this isn’t very hard to implement, but politically it’s challenging, because of the various organizations that have to be involved in different ways. We’re now putting together our roadmap for doing this, and then we’ll start. I won’t give you an ETA for purl in the NVD (I’m sure it’s more than one year, and probably more than two years), but the point is someone is working on it.

As I mentioned, 90% of components are open source, so what about the 10% that are proprietary? While it’s not impossible that purl could be extended to them in some way, there are other naming schemes like SWID tags that are already in wide use, that might cover proprietary software. There will need to be cooperation from the large suppliers like Microsoft (which used to include SWID tags in all of its products), and there will need to be some group that handles products from suppliers that are out of business or just don’t respond to inquiries on this. So, this won’t be a piece of cake, either.

It will definitely be many years before every software product ever produced, and every product being produced now and in the future, will in theory be capable of being identified definitively with a single name (from one of only two or three naming schemes). But solving the naming problem isn’t an all-or-nothing proposition. The various incremental steps along the way (say implementing purl in just one part of the NVD) will make life a lot easier, both for software developers and for the users who are trying to identify and patch vulnerabilities (as well as other risks) in the software they use.

Update May 6, 2023: The SBOM Forum proposed at least a 70% solution to the NVD portion of the naming problem in September 2022. We're now discussing with the NIST team which runs the NVD how we can partner with them to make these and other improvements, including possibly getting funding and expertise from private industry and perhaps other sources. 

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

Thursday, June 2, 2022

The other Minimum Elements of an SBOM


Probably the most widely-read of the 20-odd documents that were produced by the NTIA on SBOMs was “The Minimum Elements for a Software Bill of Materials”. Of course, this document was mandated by Executive Order 14028, which required that federal agencies start requesting SBOMs from their suppliers of critical software, but didn’t itself provide further information about those SBOMs.

I think this is a very well-written paper, which provides a lot of good information on what should be in an SBOM, as well as how it should be produced.[i] But literally 80-90% of the minimum elements mentioned in the paper have been totally ignored by the software community. This becomes clear if you look at the table on page 3 of the document.

The table lists three areas of minimum elements that will be discussed in the document: Data Fields, Automation Support, and Practices and Processes. Yet I’m not exaggerating when I say that literally 100% of what I’ve seen written about the minimum elements of an SBOM just addresses the data fields – i.e., the seven fields that need to be in an SBOM. Everything I’ve read ignores the minimum elements found in the other two areas.

This is too bad, since the NTIA made some excellent observations in those two areas. The reason for this neglect is quite simple: On page 9 of the document, you can find a nice table of the seven minimum data fields (called “baseline data elements”). But there’s no equivalent table for Automation Support or for Practices and Processes. You have to read a bunch of – shudder! – text and make your own tables. With hindsight, it would have been good to include summary tables for the other two areas as well.

This post is an attempt to make up for that omission, although the information won’t be in tables, but in lists. However, I first want to point out that nothing in the Minimum Elements document is “required” of federal agencies this August, when compliance with Section 4(e) of the EO is due. The only actual requirement for SBOMs under the EO is that agencies start asking their suppliers for them; that doesn’t require any minimum elements – just an email saying, “Please send me an SBOM.”

On the other hand, everything in the document is worth considering as you develop your organization’s program to utilize or produce SBOMs – and that goes for all organizations that use software, not just federal agencies.

Automation Support

Automation Support is discussed on pages 10 and 11. There’s just one “element” in this section: An SBOM should be provided in one of the three machine-readable formats: SPDX, CycloneDX or SWID[ii].

I do want to point out that there are other important considerations for automation support that were probably left out of the Minimum Elements document because that’s all they are: considerations. One is that both SPDX and CycloneDX are available in different data representation schemes, the two most important of which are JSON and XML. Both SPDX and CycloneDX also support good ol’ XLS, which is machine-readable but is mainly intended for eyeballs – and can’t easily convey a lot of the information that’s provided using the JSON and XML schemes.
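For example, here is roughly what a minimal SBOM fragment looks like in the JSON representation, along with the trivial code a tool needs to read it. The shape is modeled on CycloneDX; treat the exact fields as illustrative rather than authoritative:

```python
import json

# A minimal CycloneDX-style SBOM fragment in JSON. The same content could
# equally be represented in XML; field names here are illustrative.
sbom = json.loads("""
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "components": [
    {"type": "library", "name": "log4j-core", "version": "2.14.1"}
  ]
}
""")

# Machine readability in practice: a tool walks the component list directly.
for comp in sbom["components"]:
    print(comp["name"], comp["version"])  # log4j-core 2.14.1
```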

Practices and Processes

Most minimum elements are found in this category. They include the following (with references to the appropriate page of the document):

Frequency (page 12): This is very simple, but also very important. An SBOM should be re-issued whenever the software has changed. Of course, this includes major and minor updates, but it also includes patches. In my opinion, the version string should also change whenever there’s been a software change, even just a patch. Having separate “patch levels” or “build levels” isn’t going to work anymore, since none of the automated SBOM tools is likely to support either of those. So, patch levels and build levels will need to be integrated into the version string. If that isn’t done, there will be multiple SBOMs with different content, that are considered by the SBOM tooling to be identical. That obviously won’t work.

Depth (page 12): I will admit that people who engage in the reprehensible practice of writing about SBOM practices (which includes me, I’m ashamed to say) often speak of an SBOM as an inherently multilayer document. That is, we often make it seem like everyone should expect every SBOM they receive to show the direct components (dependencies) of the product itself, as well as the second, third, fourth, etc. layers – yea, unto the 15th or 20th generation.

There are two problems with this. First, the idea of neat, nested layers of components isn’t accurate in practice. There’s nothing that prevents component B from being a dependency of component A, while at the same time a different instance of A is a dependency of B – within the same product. However, it’s still accurate to take a particular chain of dependencies and put those in hierarchical order. In practice, it’s impossible not to talk about “layers”; you just need to understand what that means.

The second problem is much more important: It’s very hard for a supplier to get information on anything more than the first “layer” of components. And here’s where NTIA’s advice kicks in: “At a minimum, all top-level dependencies must be listed with enough detail to seek out the transitive dependencies recursively.” This means that:

1.      The supplier needs to provide their users a complete list of first-level components in their product (after all, if a software supplier doesn’t even have a complete list of the components that the supplier themselves included in their product, perhaps they ought to think about finding another line of work…); and

2.      The supplier needs to name the first-level components with enough accuracy that the user will be able to seek out an SBOM for each of those components (see below). In my opinion, the supplier should also make their best effort to acquire the SBOM for each first-level component, so they can fill in at least some of the “second layer” components in their SBOM.[iii]
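That NTIA phrase – “enough detail to seek out the transitive dependencies recursively” – can be illustrated with a small sketch. The catalog of SBOMs below is entirely hypothetical; in practice each lookup would mean obtaining and parsing another supplier’s SBOM:

```python
# Hypothetical stand-in for "an SBOM exists for each named component":
# each product maps to its list of first-level components.
sbom_catalog = {
    "product-a": ["lib-b", "lib-c"],
    "lib-b": ["lib-d"],
    "lib-c": [],
    "lib-d": [],
}

def all_components(product, catalog, seen=None):
    """Walk the dependency "layers" recursively, starting from product."""
    seen = set() if seen is None else seen
    for comp in catalog.get(product, []):
        if comp not in seen:          # guard against dependency cycles
            seen.add(comp)
            all_components(comp, catalog, seen)
    return seen

print(sorted(all_components("product-a", sbom_catalog)))
# ['lib-b', 'lib-c', 'lib-d']
```

The `seen` set matters: as noted above, dependencies don’t always form neat nested layers, so a real tool has to tolerate cycles rather than assume a strict hierarchy.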

For the time being, it’s better for software users not to demand too much from their suppliers. Just getting a complete list of components from their supplier (with usable identifiers like CPE and purl for each one, so they can be looked up in vulnerability databases) will be a big improvement over the SBOMs that most users have currently, which are none at all.

Distribution and delivery (pp 12 & 13): The document doesn’t prescribe any means of distribution and delivery of SBOMs, but does say there are two components to this: “..how the existence and availability of the SBOM is (sic) made known (advertisement or discovery) and how the SBOM is retrieved by, or transmitted to, those who have the appropriate permissions (access).”

And here I disagree with the document. I believe that, once SBOMs and VEX documents are being widely produced and used (which might well be 5-10 years from now, I’ll admit), any distribution and delivery mechanism that depends on the end user retrieving the SBOM themselves, or even requesting it, will ultimately fail. What’s needed long-term is literally a mechanism to “stream” SBOMs and VEXes to users (more specifically, to the tool they operate to “process” the two documents for supply chain cyber risk management purposes), so that they don’t have to do anything at all to receive the documents they’re entitled to; they just show up. I described that mechanism (in not much detail) at the end of this post. 

I’ve gone through all the Minimum Elements listed in the NTIA document, except for the seven minimum data elements and a couple other minimum elements that I don’t think are important. However, the document also goes “Beyond the Minimum Elements” and discusses some things that are probably not needed now, but which will be important in the future – so the user needs to be thinking about them. Rather than load them into this already-full post, I’ll discuss them soon.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] The document says nothing about how government agencies (and by implication, other software users) can utilize SBOMs for cybersecurity risk management purposes. Use of SBOMs remains an under-addressed area, although that is changing.

[ii] SWID really isn’t an SBOM format; it’s a machine-readable software identifier. It does include the seven minimum fields but very little else, making it not very useful for most SBOM purposes, including software supply chain risk management. But, since properly identifying software components is a big problem that several groups are working on now, SWID may see expanded use in coming years. It may well be a component (no pun intended) of the solution to that problem.

[iii] I think intelligent (IoT) device suppliers have a special obligation to try to obtain SBOMs for the first-level components. I’ve written an article on SBOMs for IoT and IIoT devices with my customer Red Alert Labs, which explains why I say this. When it’s published in July, I’ll publicize it in my blog.