Monday, July 15, 2024

Currently, automated vulnerability management is impossible. How can we fix that?


Currently, the OWASP SBOM Forum has two sets of meetings going on. On Fridays at 1PM ET, we have our weekly SBOM Forum full meetings. We intentionally don’t usually have a set topic for these meetings (and if we do, we don’t necessarily follow it). This is because I find it much more useful to let the topic – or often, the topics – emerge from the meeting.

Our other meetings are held every other Tuesday at 11AM ET. These do have a regular topic, which is vulnerability database problems (including, but not limited to, the NVD); of course, that’s a very broad topic, but we at least keep within its bounds. Anybody is welcome to attend either or both meetings; email me if you would like me to send you the invitations.

Both meetings took place this past week, and both proved quite productive. To understand what I’m going to say below, you need some background:

1.      The CVE Numbering Authorities, or CNAs (there are currently close to 400 of them; most are software developers, but others include ENISA, GitHub and MITRE), are responsible for preparing CVE reports on new vulnerabilities they have identified. For each vulnerability (CVE) identified in a report, the CNA must include information on at least one affected product, including at least the product’s name, version string, and vendor name (the CNA is often the vendor of the product named in the CVE report); those three data points are entered as text fields. All CVE reports are incorporated into the CVE.org database (I believe there are around 250,000 CVE reports in the database now).

2.      From CVE.org, the new CVE reports “flow down” to the NVD. Until February 12 of this year, NVD staff members (who are all NIST employees or contractors) quickly “enriched” the reports by adding important information, including the CVSS score, CWEs, and a machine-readable CPE name for each product/version listed in the report; all that information became available in the NVD. Note that the CPE name is essential if the information in a CVE report is to be discoverable by automated methods. If it’s not so discoverable, I regard the CVE report as useless for vulnerability management purposes, since vulnerability management needs to be automated (even though there will always be times when “manual” processes will be required. But they should always be the exception, not the rule).

3.      On February 12 something happened, which has yet to be adequately explained by the NVD. On that date, the volume of CVE reports that were “enriched” by the NVD staff fell from hundreds a day to literally zero. It remained at that level until May, when it kicked back up to a level of about 25% of the new CVE reports. In other words, after three months of producing no useful CVE reports at all and building up a big backlog of “non-enriched” reports, the NVD has now reduced the rate at which they’re building up that backlog by 25%. However, note they have made no progress at all in reducing the backlog itself – just growing it at a slightly slower rate; of course, they’re also not even giving a date by which the backlog will be eliminated. Excuse me for not being overwhelmed with joy at this most recent news.

4.      For this reason, I and the OWASP SBOM Forum have given up on the idea that the NVD will ever dig itself out of its CPE hole, although it will certainly help if they can. This is why last Friday’s meeting of the Forum was focused entirely on the question of what (or who) could replace the NVD as a source of CPE data for CVE records that don’t have CPEs now.

5.      We had a good discussion, which you can read in the meeting notes (BTW, anyone who was at the meeting should feel free to add your comments to what I’ve written for the notes. Since I’m always trying to follow the discussion closely, I can’t write down a lot of what’s said). However, when we came to the question of who was going to fill in the gap created by the fact that the NVD has largely given up on their commitment to provide CPE names for CVE reports, I was surprised by what I heard.

6.      In Tuesday’s meeting (and in earlier meetings), it had been suggested (and I’m not sure by whom) that the CNAs should create the CPE names. After all, a large percentage (probably the majority) of CVE reports are created by a CNA that is also the developer of the product that has the new vulnerability. It seems almost axiomatic that the CNA organization should create the CPE name as well.

7.      However, Bruce Lowenthal, who leads the PSIRT at Oracle, pushed back on the idea. Bruce doesn't want two different fields for the same data in the JSON 5.x format, because he feels it will lead to inconsistencies. He notes that many feel CPEs - at least, the CPEs created before February 12 of this year, when NIST/NVD staff members were creating the great majority of them - are deficient. He suggests that CNAs should fill in all the fields in the CVE JSON 5.x record that are needed to create a CPE. Then, CVE.org should provide automation that creates the CPE name based on these fields, with no additional human intervention.

      The idea behind this is the CNAs know what should be in the fields in the CVE report (especially the name and version string of the vulnerable software, as well as the name of the vendor). Since most CVE reports describe a vulnerability found in a product developed by the CNA itself, they should have the final say on the contents of these three fields, not somebody at NIST who's using other criteria (which are often opaque at best, and often are simply wrong) to determine these values.

8.      This idea has a lot of appeal. In fact, I and others have wondered for a long time why the NVD staff needs to create CPE names “manually”, when in theory an automated tool could do the same job with more accuracy – assuming that the CNA provides authoritative information in the product, version and vendor fields, which are all text fields. At this point in the meeting, it seemed obvious to me that the solution to the problem of creating CPEs (given that the NVD can no longer be trusted to do that regularly) was to require that somebody (presumably CVE.org, but perhaps the CNAs themselves) use an automated tool to create a CPE name for every affected product in each CVE report; the CPE name would need to be based on the exact product, version and vendor text fields in the report.

9.      However, by Monday (i.e., the day I’m writing this), I had changed my mind regarding this idea. Andrey Lukashenkov of Vulners, whose judgment I have great respect for, assured me on LinkedIn that it will be hard, if not impossible, to develop an automated tool to create CPE names (you can read our conversation here). Was this a dead end? In other words, since CPE seems to be on life support at best and there’s not currently any alternative to it, does this mean that automated vulnerability management is no longer possible, even if it was before?

No, it doesn’t. About 15 minutes before the meeting ended on Friday, the Director of Product Security for a different very large US-based software vendor joined the meeting. He seemed to agree with what I’ve just described, but he went on to make another suggestion; given that he’s a member of the CVE.org board, his suggestion carried a lot of weight with me. He suggested that, since the new CVE version 5.1 specification, which was adopted a few months ago by CVE.org, supports purl identifiers as an option, maybe the best course of action would be to train all the CNAs to support purl identifiers.

I was quite pleased to hear this person’s suggestion, since the whole reason the new CVE spec supports purl is that the SBOM Forum suggested it in our 2022 white paper on changes to fix the naming problem in the NVD (i.e., the problems with CPE, which are described on pages 4-6 of the paper). During that effort, Steve Springett and Tony Turner submitted a pull request to CVE.org to include purl in what was then known as the CVE JSON spec, but is known today (I believe) as simply the CVE spec. We were too late to get it into the 5.0 spec, but it was included in the 5.1 spec, which was just adopted a couple of months ago.

For reasons described at length in our paper, purl is by far superior to CPE, especially for open source software, where purl has literally conquered the world (to the extent that I have yet to hear of a single open source vulnerability database that is not based on purl). Purl excels at naming software that is available in package managers, but it currently doesn’t work for:

1.      Proprietary (“closed source”) software;

2.      Open source software written in C/C++, which is typically not available in package managers; and

3.      Any other open source software not available in package managers.

4.      Plus, there’s another type of product that is today identified by CPE names, but which can’t currently be identified by purls; I’m referring to intelligent devices. Our 2022 white paper suggests that the existing GS1 standards be used to identify devices in vulnerability databases (we were suggesting it for the NVD at the time, but they could be used in any vulnerability database). Those could work, but it would be nice if we could figure out a way to get purl to accommodate devices, since the GS1 standards come with a lot of baggage (and cost) that wouldn’t be needed in our application.

What’s essential in purl is that it’s a distributed naming system, in which individual software sources are responsible for the names of the software available on their source; in other words, the software source “controls its namespace”. For example, a package manager like Maven Central is responsible for the name of every software product available on Maven Central. This means that the combination of the name of the software source and the name of a product available within that source (as well as a version string applicable to that product) will always be unique; that is, Maven Central controls its namespace, so every product name in Maven Central will be unique in that namespace.
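As a rough illustration of the namespace idea, the sketch below assembles a purl from the pieces described above: the software source (the purl "type"), the name the source assigns, and a version string. The helper name make_purl is hypothetical, the log4j-core coordinates are only a familiar example, and the sketch ignores optional purl parts such as qualifiers and subpaths, so treat it as a simplification rather than a full implementation of the purl specification.

```python
from urllib.parse import quote

def make_purl(pkg_type, namespace, name, version):
    """Assemble a simplified purl: pkg:type/namespace/name@version.

    Hypothetical helper for illustration; qualifiers and subpaths are omitted.
    """
    # Percent-encode each '/'-separated namespace segment separately
    segments = [quote(seg, safe="") for seg in namespace.split("/") if seg]
    parts = [pkg_type.lower()] + segments + [quote(name, safe="")]
    return "pkg:" + "/".join(parts) + "@" + quote(version, safe="")

# Maven Central controls the (group id, artifact id) names, so this
# string can only refer to one product and version in that source.
purl = make_purl("maven", "org.apache.logging.log4j", "log4j-core", "2.17.1")
print(purl)  # pkg:maven/org.apache.logging.log4j/log4j-core@2.17.1
```

Note that nobody has to look the identifier up in a central dictionary: anyone who knows where the software came from and what it is called there can construct the same string.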

For each of the four above items that don’t currently work with purl, the way to make them work is to identify a way to have a controlled namespace for that product type. When the OWASP SBOM Forum wrote our 2022 paper, we suggested an idea for a controlled namespace for proprietary software; that is the short section titled “SWID” on pages 11 and 12 of the paper.

Our suggestion at the time was that software suppliers could create SWID tags and distribute them with the binaries they distribute to customers; with access to the SWID tag, the software customer could create a unique purl, which would allow the customer to identify that supplier’s product and version in a vulnerability database (at the same time as we wrote the paper, Steve Springett, a purl maintainer, submitted a pull request for a new purl type called SWID, which was accepted).

The big problem with this suggestion was that it doesn’t work for legacy software versions, since those binaries have already been distributed (we realized this at the time, but we didn’t want to spend the 2-4 months that would be required to figure out a good solution to it). How will users of those versions find a SWID tag?

In my opinion, this wouldn’t create a big technical problem; I think there are multiple good options for accomplishing this. For example, a supplier could create a well-known location on their website like “SWID.txt”. This would include the name and version number of previous versions of their products, along with a SWID tag for each. A vulnerability management tool could retrieve that tag and easily create the purl that corresponds to it; the user could then look up vulnerabilities for the legacy software they use in a vulnerability database based on purl, although that database will need to support the SWID purl type.
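The last step, turning a retrieved SWID tag into a purl, might look roughly like the following sketch. Everything specific here is an assumption: the tag content (vendor "Acme", product "Enterprise Server", the tagId) is made up, the function name swid_to_purl is hypothetical, and the purl layout (tag creator name and regid as the namespace, a tag_id qualifier) reflects one reading of the SWID purl type that should be checked against the purl specification before anyone relies on it. Retrieving the tag from the supplier's website is left out.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical SWID tag a supplier might publish for a legacy version.
# All values below are invented for illustration.
SWID_TAG = """<SoftwareIdentity
    xmlns="http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
    name="Enterprise Server" version="1.0.0"
    tagId="75b8c285-fa7b-485b-b199-4745e3004d0d">
  <Entity name="Acme" regid="example.com" role="tagCreator softwareCreator"/>
</SoftwareIdentity>"""

def swid_to_purl(tag_xml):
    """Build a pkg:swid purl from a SWID tag (assumed layout, see lead-in)."""
    root = ET.fromstring(tag_xml)
    segments = []
    for child in root:
        # Compare the local element name so the XML namespace does not matter
        is_entity = child.tag.rsplit("}", 1)[-1] == "Entity"
        if is_entity and "tagCreator" in child.get("role", "").split():
            segments = [child.get("name", ""), child.get("regid", "")]
            break
    parts = ["swid"] + [quote(s, safe="") for s in segments if s]
    parts.append(quote(root.get("name"), safe=""))
    return ("pkg:" + "/".join(parts)
            + "@" + quote(root.get("version"), safe="")
            + "?tag_id=" + quote(root.get("tagId"), safe=""))

print(swid_to_purl(SWID_TAG))
```

Because the purl is derived mechanically from fields the supplier controls, two tools reading the same tag should arrive at the same identifier, which is the property that matters for an automated lookup.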

Of course, I don’t deny for a moment that there will be a huge number of questions to be resolved regarding how to extend purl to each of the four product types listed above, as well as how to implement purl support in CVE.org and other databases that don’t currently support it. Even if one or a small number of individuals could answer all these questions on their own, their answers will never be as good as those that will be produced by a group of industry leaders formed to address exactly these questions. In fact, that's the main rationale for the OWASP SBOM Forum’s Vulnerability Database Working Group (let me know if you would like to join the group).

So, as of our next meeting on July 23, the OWASP Vulnerability Database Working Group will start working on questions suggested above, including:

·        What is needed for CNAs to be comfortable with including purls in CVE reports (at the moment, it will just be for open source software products available in a package manager)?

·        What is needed for the CVE.org database to be able to support purl lookups?

·        Is the SWID tag proposal the best way to incorporate commercial software lookup capabilities into purl?

·        If it is, how can commercial software users best be informed (in a machine readable format, of course) of the contents of the SWID tag(s) for software products they utilize?

·        Similar questions regarding open source software not in package managers, as well as intelligent devices.

Of course, none of these questions are simple, and it could easily be 1-2 years before they are all answered to at least some degree of satisfaction. But, given that it looks like vulnerability management is close to impossible today, don’t you think this is a good thing to do?

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

I lead the OWASP SBOM Forum and its Vulnerability Database Working Group, which works to understand and address issues like what’s discussed in this post; please email me to learn more about what we do or to join us. You can also support our work through easy directed donations to OWASP, a 501(c)(3) nonprofit. Please email me to discuss that.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post. 
