Sunday, September 25, 2022

I’ve figured out why software users aren’t requesting SBOMs


I realized about a year ago that the reason why SBOMs aren’t being used by non-developers for vulnerability management purposes – even though the developers themselves are using them very heavily for those purposes, as shown in this post – isn’t that most developers don’t want to supply them. Rather, it’s because the users aren’t asking for them.

I have until recently blamed that reticence on the fact that there are few low-cost or open source tools to “process” SBOMs (i.e. ingest them and look up component vulnerabilities in the NVD or another vulnerability database) available for end users; moreover, there’s only one tool (at any price point), Dependency-Track, that ingests both SBOMs and VEX documents, and an end-user tool has to do both. Of course, the whole point of creating a machine-readable SBOM (as opposed to just a listing of components in a PDF) is that (strange but true!) it needs to be read by a machine. I can attest that SBOMs in JSON or XML can be read easily by someone with a little knowledge of English, but there’s still no reason to create such an SBOM if a machine isn’t going to read it.
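To make “read by a machine” concrete, here is a minimal sketch in Python of the ingest step. It assumes a CycloneDX-style JSON SBOM; the two components and the `list_components` helper are made up for illustration and aren’t taken from Dependency-Track or any other tool.

```python
import json

# A tiny, made-up CycloneDX-style SBOM. The components are illustrative only.
SBOM_JSON = """
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.4",
  "components": [
    {"type": "library", "name": "example-lib", "version": "1.2.3",
     "purl": "pkg:pypi/example-lib@1.2.3"},
    {"type": "library", "name": "other-lib", "version": "0.9.0"}
  ]
}
"""

def list_components(sbom_text):
    """Return (name, version, identifier) for each top-level component."""
    sbom = json.loads(sbom_text)
    rows = []
    for comp in sbom.get("components", []):
        # Prefer a purl, fall back to a CPE name; None means the component
        # carries no machine-usable identifier at all.
        identifier = comp.get("purl") or comp.get("cpe")
        rows.append((comp.get("name"), comp.get("version"), identifier))
    return rows

for name, version, identifier in list_components(SBOM_JSON):
    print(name, version, identifier or "(no identifier)")
```

Listing the components is the easy part. The second component above has no identifier, which is where the lookup trouble starts.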

And when you get to SBOMs with hundreds or thousands of components (the average software product now has about 150 third-party components, but some have thousands or even tens of thousands), there’s no question that you need to process them using a tool. That is, you probably have better things to do with your time than print out an SBOM and look up 150 components in the National Vulnerability Database (NVD) using “manual” searches.

But just having a tool that can “process” SBOMs isn’t enough. The end user should look not just for component vulnerabilities, but for the small subset of those that are exploitable in the product itself; those are the ones that pose a problem. VEX documents are required to separate the exploitable wheat from the unexploitable chaff. Since at least 90% of component vulnerabilities aren’t exploitable, an SBOM consumption tool that doesn’t also read VEXes is probably less than useful.

When Dependency-Track recently started supporting VEXes as well as SBOMs (it’s supported SBOMs since 2012, long before almost anybody else got interested in them), I happily announced that fact. And since D-T is open source and therefore free, I wondered if this development might at least get a few more end users interested in getting SBOMs from their suppliers. The result was (drumroll, please)…

Thunderous silence. Of course, when Dependency-Track (which was developed in 2012 by Steve Springett, co-leader of the CycloneDX SBOM format project) is downloaded, the person who did that doesn’t fill out a questionnaire saying what they’re going to use it for. However, Steve developed D-T for use by developers, and Steve says that what he hears from users is that they’re overwhelmingly developers.

To be honest, just changing some wording and some screens would make D-T a perfectly usable consumer product, but Steve has never had requests for that. He knows of a small number of non-developers who are using D-T to track vulnerabilities in products they use, but he doesn’t believe that number is growing.

This isn’t to say that, if someone wants to make the investment to create a consumer-oriented product that “processes” both SBOMs and VEXes, and they make it available at low cost, there won’t be takers for it. Until recently, I thought that was all it would take to make people start using SBOMs.

However, just asking myself a question recently – and answering it with some simple math – made it clear to me that the lack of a low-cost, end-user-friendly tool isn’t at all what’s holding back use of SBOMs.

The (rather long) question I asked myself was, “Let’s assume there’s a low-cost, end-user-friendly SBOM/VEX processing tool, and that a user feeds an SBOM into it. The tool looks up every component listed in the SBOM in the NVD. For every successful lookup, the user receives a list of any open vulnerabilities in the component. They then use any available VEX from the supplier, for the product and version that’s the subject of the SBOM, to determine which of these vulnerabilities are not exploitable. They remove them from the list of open vulnerabilities, ultimately yielding a list of only exploitable vulnerabilities.

“Given this, what is the probability that the user will a) learn about every exploitable component vulnerability in the product, but also b) know exactly which of those vulnerabilities are exploitable or not exploitable?” We will call these the two target probabilities.
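The workflow in that question can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `VULN_DB` is a hypothetical in-memory stand-in for an NVD lookup, and the component identifiers and CVE IDs are placeholders rather than real records.

```python
# Hypothetical stand-in for an NVD lookup: component identifier -> CVE IDs.
# All identifiers and CVE IDs are placeholders, not real records.
VULN_DB = {
    "lib-a@1.0": ["CVE-0000-0001", "CVE-0000-0002"],
    "lib-b@2.1": ["CVE-0000-0003"],
}

def look_up(component_ids, vuln_db):
    """Split components into those found in the database and those not found."""
    found, not_found = {}, []
    for cid in component_ids:
        if cid in vuln_db:
            found[cid] = list(vuln_db[cid])
        else:
            not_found.append(cid)
    return found, not_found

def remove_non_exploitable(found, vex_not_exploitable):
    """Drop every CVE that a VEX statement declares not exploitable."""
    return {
        cid: [cve for cve in cves if cve not in vex_not_exploitable]
        for cid, cves in found.items()
    }

sbom_components = ["lib-a@1.0", "lib-b@2.1", "lib-c@3.3"]
found, not_found = look_up(sbom_components, VULN_DB)
remaining = remove_non_exploitable(found, {"CVE-0000-0002"})

print(remaining)   # CVEs the user must still treat as exploitable
print(not_found)   # components the lookup could say nothing about
```

Note that the sketch reports two things: the vulnerabilities still open after the VEX information is applied, and the components that couldn’t be found at all. A failed lookup isn’t a clean bill of health; it just means the user knows nothing about that component.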

There are three probabilities that need to be considered, to answer this question:

1.      The probability that the user will be able to find each component in the NVD when they look it up, and learn of any vulnerabilities (CVEs) that apply to the component;

2.      The probability that a vulnerability identified for a component will be exploitable; and

3.      The probability that there will be VEX documents available from the supplier, which will identify all non-exploitable component vulnerabilities. This will leave just the exploitable ones.

The first probability is based on the “naming problem”. This is the problem that the informal group I lead, the SBOM Forum, addressed in a recent paper; I discussed it in this post. Briefly, the aspect of the naming problem that we addressed can be stated as, “When a user looks up the components found in a newly-created SBOM in the NVD, they will find only a small fraction of them, unless they try other manual processes.” These processes include AI, fuzzy logic, references to other databases, etc. The point is that a purely automated process will only find a small fraction of the vulnerabilities in the NVD (six reasons for this are described in the paper); the person interpreting the SBOM will need to use these other processes, and mix in a good measure of dumb luck, in order to learn about any more than a small fraction of the vulnerabilities that apply to the product.

And what is that small fraction? Our paper quotes the Director of Product Security of a very large US-based software developer, who said that they can find fewer than 20% of the components that appear in one of their SBOMs (they’re creating SBOMs regularly for all their products, but they’re not distributing them now). However, in a recent discussion in one of our meetings, this person admitted that the real number, in their experience, is less than 10%, and maybe even less than 5%. Another participant in the meeting – who works for a very large US-based producer of both software and intelligent devices – said their experience shows that the number is under 5%.

What this means is that, even if an end user has a tool that ingests SBOMs and looks up all the components in the NVD, they will probably learn about vulnerabilities in fewer than 5% of the components. Of course, the SBOM Forum’s proposal aims to raise that percentage far higher – and it will ultimately do that. But I estimate that it will be one to two years before our proposal is fully implemented (I’m optimistic that it will be implemented, based on the reception it’s received so far).

Even after it’s implemented, it will mostly affect new vulnerability reports, not existing ones. So, it could be a number of years before we’ll be able to state that the percentage of successful NVD component searches is over, say, 70% (it will never be 100%, of course). The bottom line is that, for at least the next couple of years, the first probability above is no more than 10% and more likely below 5%.
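To put those percentages in terms of the “average” product with about 150 components, here is some back-of-the-envelope arithmetic in Python. The `components_matched` helper is hypothetical, and the numbers are simple expected values, not measurements.

```python
def components_matched(total_components, lookup_rate_percent):
    """Expected number of components a purely automated NVD lookup would find."""
    return total_components * lookup_rate_percent / 100

# 5% and 10% are the rates discussed above; 70% is the longer-term hope.
for rate in (5, 10, 70):
    matched = components_matched(150, rate)
    print(f"{rate}% lookup rate: about {matched} of 150 components matched")
```

At a 5% success rate, that works out to 7.5 of 150 components; at 10%, 15; and at 70%, 105. In other words, today the tool would be silent about 135 or more of the 150 components.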

I’ve already discussed the second probability, the percentage of component vulnerabilities that are exploitable, in previous blog posts. I normally say that 90% of component vulnerabilities are not exploitable (so at most 10% are), but most people I talk to believe the non-exploitable share is over 95%. In other words, once a user has looked up whatever components they can find in the NVD and listed all the vulnerabilities identified for those components, at least nine of every ten will be non-exploitable. This might seem like good news, until you consider that you don’t know which ones aren’t exploitable – so you have to treat them all as exploitable for the time being.

The third percentage, the probability that the supplier will produce enough VEX documents to identify every non-exploitable component vulnerability, is zero currently, since no supplier is producing VEXes, period (and BTW, don’t point to a vulnerability report and tell me that’s a VEX, even if it’s in the CSAF format). Will it become non-zero later?

I have recently come to believe there’s a good likelihood that VEX documents will never be produced, for various reasons I’ve discussed already and will in the future. This is why I’m now advocating for the idea of real-time VEX, although I’m also still working with the VEX committee as it makes slow progress toward producing a VEX playbook - which is needed no matter whether the VEX document or real-time VEX emerges as the winner. VEX information is required for SBOMs to be widely used, no matter how it ultimately gets distributed. And SBOMs will be widely used.

So, what’s my estimate for the third probability? It’s zero now, but in a couple years it will be nonzero. However, I want to remind you that the only way a user will be sure that the list of component vulnerabilities they have for a product only contains exploitable vulnerabilities is if they can be sure that the supplier has notified them of every non-exploitable component vulnerability in the product and version in question – that is, they have informed them which nine of every ten component vulnerabilities aren’t exploitable.

Now that we know these three probabilities, what about the two target probabilities we want to derive from them? These are the probabilities that the user will a) learn about every component vulnerability in the product, and b) learn exactly which of those vulnerabilities are exploitable or not exploitable.

Finding a) is easy, since it’s just the first probability listed above: the probability that the user will find every component in the NVD. We’ve already said that’s at most 10%. But finding b) is harder, since this requires receiving a VEX for each of the non-exploitable vulnerabilities in the product. So, if there are a total of ten vulnerabilities but only one is exploitable, the user will need to receive a VEX for each of the other nine vulnerabilities, saying it’s not exploitable (remember, the user initially needs to assume that every component vulnerability they identify is exploitable, until they receive a VEX stating otherwise).

Let’s say the user just receives seven VEX documents, each stating that one of the ten component vulnerabilities isn’t exploitable. Does the fact that only seven were received mean the remaining three vulnerabilities are exploitable? The supplier could always later end the uncertainty by sending another VEX, which states that the three are exploitable. However, would that be a good idea, especially if the supplier doesn’t have a patch for the vulnerabilities yet? It definitely wouldn’t be a good idea. Only if the supplier sends out a VEX stating that the vulnerabilities have all been patched, which includes the URL(s) of the patch(es), can the user be sure the remaining three vulnerabilities are exploitable in that product and version.[i]
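The seven-out-of-ten case can be written down as a short Python sketch. The `triage` function and the CVE IDs are placeholders I’ve made up for illustration; the only rule it encodes is the one above, namely that a vulnerability without a “not exploitable” statement has to be treated as exploitable.

```python
def triage(cves, vex_not_exploitable):
    """Label each CVE; anything without a VEX statement stays unresolved."""
    return {
        cve: ("not exploitable" if cve in vex_not_exploitable
              else "unresolved: treat as exploitable")
        for cve in cves
    }

# Ten placeholder CVE IDs; VEX statements have arrived for the first seven.
cves = [f"CVE-0000-{n:04d}" for n in range(1, 11)]
vex_received = set(cves[:7])

status = triage(cves, vex_received)
unresolved = [cve for cve, label in status.items() if label.startswith("unresolved")]
print(len(unresolved), "of", len(cves), "still have to be treated as exploitable")
```

Here three of the ten remain unresolved, even if only one of them (or none) is really exploitable. The user can’t tell the difference between “exploitable” and “the supplier hasn’t gotten around to sending a VEX yet”.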

Unfortunately, there’s no assurance that the supplier will always make sure to close the loop and inform their customers of the status of every vulnerability that the NVD lists for a component of a particular version of their product. Thus, it's very possible that a customer will waste time (theirs and the supplier’s) by contacting the supplier regarding non-exploitable vulnerabilities. This is one of the reasons why I now think real-time VEX is a better option than VEX documents.

So what is the value of target probability b)? As I said earlier, currently it’s zero, since no supplier is currently producing VEXes. However, I do think that the idea of a VEX API will ultimately be adopted. This will allow the customer to be certain whether or not each component vulnerability is exploitable. But, since we’re concerned with the next year or so at the moment, I can confidently say that b) is a very low percentage, and currently it’s zero percent.

Finally, we’re able to answer the big question: Why are so few users interested in receiving SBOMs for use in software vulnerability management – even though there’s one open source tool available that could help them in this job? There are at least two reasons:

1.      They have heard that it’s difficult to look up components from an SBOM in the NVD. An estimated maximum ten percent success rate doesn’t inspire them to try doing that, either.

2.      They know that, at least in the near term, they won’t be able to learn which component vulnerabilities are exploitable in the full product, without calling up the supplier’s help desk and running down their entire list of component vulnerabilities with the unfortunate person who answers their call, forcing them to find out whether each of those vulnerabilities is exploitable or not. Of course, this is one of the main reasons why suppliers aren’t keen on distributing SBOMs to their customers now; they suspect their help desks will be overwhelmed with calls about false positives.

You may notice a seeming contradiction in this post: I first talked about the fact that SBOMs have been hugely successful with developers, but now I’m implying they’re lying dead at the starting gate, as far as end users are concerned. Clearly, the developers know something that the users don’t. Can they share this?

As far as the first reason, the naming problem, goes, the suppliers definitely have knowledge. The naming problem has been known since the beginning of the NVD, but SBOMs have turned it from an annoyance into a crisis, because of the huge number of components that now need to be looked up. Developers and consultants have developed a whole array of tricks – including AI and fuzzy logic routines, databases that shed light on different aspects of software names, etc. – to get around the problem.

Since these tricks are very ad hoc, they can’t be shared as part of a tool. However, developers could apply them on behalf of their customers, so that the SBOMs they send out have valid identifiers (“CPE names”) that allow customers to look up components in the NVD (meaning the customers would probably still end up looking up the vulnerabilities in the NVD manually). But, more likely, the suppliers will just outsource the whole “SBOM processing” activity to third-party service providers.

The service providers will provide an “SBOM service” for end users. This service will take in SBOMs, as well as either VEX documents or real-time VEX information (whichever is available) and produce for their customers a list of exploitable component vulnerabilities in a product/version that is used by the customer. The customers won’t need even to look at an SBOM or VEX document, nor should they need to look up anything in the NVD on their own. They’ll receive what’s really the holy grail of SBOMs: continuously-updated lists of exploitable vulnerabilities in the products/versions in use by the user organization.

This is why I now think that, at least for the next few years, almost all “SBOM processing” for end users will be performed by third parties, not by standalone tools operated by the users themselves. My guess is the users will be quite happy with that arrangement.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] The fact that a patch has been released for a vulnerable version of a product doesn’t mean the status of the vulnerability changes in that version. Best practice dictates that application of the patch increments the version number of the instance of the product that’s patched. So the vulnerability will be fixed in the patched version but not in the unpatched version.
