Tom Alrich's Blog: Has the NVD become an empty shell?

Brian Martin, a well-known vulnerability researcher with whom I’ve had previous online discussions, has been closely following what is happening with the National Vulnerability Database (NVD) for a long time. Recently, he has been trying to answer the question of whether the NVD is making progress on fixing the big problem that began on February 12, 2024.

Starting on that date, the NVD greatly slowed its performance of what I (and others) believe is its most important function: adding machine-readable product identifiers called CPE names to the CVE records created by CVE.org [i]. CVE records (created by CVE Numbering Authorities, or CNAs) include a textual description of a newly identified vulnerability, as well as of the one or more products affected by it.

However, an NVD user utilizing the search bar (or one of the APIs) to learn about CVEs that apply to a product their organization uses will not be shown any CVEs that don’t include a CPE name. While in previous years, the NVD staff – or more specifically, contractors to NIST, the NVD’s parent – had always created CPE names for products mentioned in a new CVE record within days of the NVD’s receiving it from CVE.org, in 2024 the NVD added CPE names to fewer than half of the new records. This means that, if you search for a particular product in the NVD’s search bar today, you will on average only be shown half of any CVEs that affect the product, if they were discovered in 2024.
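To make the effect concrete, here is a minimal sketch of why an unenriched record is invisible to a product search. It is not the NVD's actual search code; the record structure, the function names and the CVE IDs are made up for illustration. The only real convention it relies on is the CPE 2.3 formatted string, in which the vendor and product are the fourth and fifth colon-separated fields.

```python
from dataclasses import dataclass, field


@dataclass
class CveRecord:
    cve_id: str
    description: str
    # An empty list models an "unenriched" record (no CPE name added yet).
    cpe_names: list[str] = field(default_factory=list)


def cpe_vendor_product(cpe_name: str) -> tuple[str, str]:
    """Return (vendor, product) from a CPE 2.3 formatted string.

    Layout: cpe:2.3:<part>:<vendor>:<product>:<version>:...
    Simplification: escaped colons inside a field are not handled.
    """
    parts = cpe_name.split(":")
    return parts[3], parts[4]


def search_by_product(records, vendor, product):
    """Return CVE IDs whose CPE names match the vendor and product exactly."""
    hits = []
    for rec in records:
        for name in rec.cpe_names:
            if cpe_vendor_product(name) == (vendor, product):
                hits.append(rec.cve_id)
                break
    return hits


def text_search(records, term):
    """Fallback: case-insensitive substring search of the descriptions."""
    return [r.cve_id for r in records if term.lower() in r.description.lower()]


# Hypothetical records: the second one has no CPE name.
records = [
    CveRecord(
        "CVE-0000-0001",
        "Buffer overflow in Example Widget 1.0",
        ["cpe:2.3:a:example:widget:1.0:*:*:*:*:*:*:*"],
    ),
    CveRecord("CVE-0000-0002", "SQL injection in Example Widget 1.0"),
]

print(search_by_product(records, "example", "widget"))
print(text_search(records, "example widget"))
```

The product search returns only the first record, while the text search finds both. The second vulnerability is in the database, but the identifier-based search never sees it.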

All CVE records that don’t have CPE names (called “unenriched” records) constitute the NVD’s backlog. Until February 2024, the NVD ran on average a zero backlog, but that changed on February 12. For a while, the NVD promised they would have the backlog cleared up in 2024, but that simply didn’t happen. In fact, in late December the NVD initially reported adding CPE names to only about 1% of CVE records. This meant that during that period, the backlog essentially grew at the same rate as new CVE records.

By the end of 2024, the backlog was about 22,000 unenriched records, which is close to 55% of the approximately 40,000 new CVE records created in 2024. As we started off 2025, the question was whether the NVD would start working off that backlog or whether they would allow it to grow.

Unfortunately, we have the answer to that question now: Brian put up a post on LinkedIn this week that shows the NVD seems to have completely stopped adding CPE names (and CVSS scores) to new CVE records for the previous 12 days. Of course, there is no communication at all from the NVD about this problem – in fact, the last time they posted a communication about their backlog was last November 13, when they said they were now fully staffed up and were hoping to make real progress soon.

Now it seems the NVD may have stopped performing their most important function, other than maintaining the database itself. This could well be because of the turmoil that has been going on in the federal government this year, which has included cancellation (whether legal or otherwise) of many ongoing contracts. If this is the case, that’s bad news indeed, since it means the NVD might end up going the way of agencies like USAID, which was all but shut down more than a month ago and remains in that state (with close to zero employees) today, despite a court order to reopen the agency.

In other words, the NVD might already be gone. But even if it survives in some form, unless enrichment restarts, it will be a hollow shell of what it was. The good news is that, if you search the NVD for recently announced vulnerabilities that apply to a product you use now, you’re likely to receive the message that there are no vulnerabilities to display.

The bad news is that’s probably not the truth. However, the only way to find out about CVEs that were not displayed is to do text searches of the more than 40,000 new CVEs identified since February 12, 2024. Obviously, that’s not a solution to the problem. Several third parties, like Vulners and VulnCheck, are performing this work themselves, although none of them has done it for all 22,000 CVEs in the backlog. CISA was doing some of this work in its Vulnrichment program, but that program stopped adding CPE names in December, for some unexplained reason.

I don’t know what the backlog of unenriched CVEs is today, as a percentage of new CVEs identified since, say, the beginning of 2024. However, it’s without a doubt over the 55% level, where it stood in December. Where will this end? Clearly, if enrichment (and specifically, the addition of CPE names to CVE records) doesn’t resume, we’ll end up with a backlog that asymptotically approaches the number of CVEs that have been identified since February 12, 2024.

In other words, a search for a product in the NVD will become increasingly meaningless. In fact, many people would argue that any product vulnerability search that at best will yield you half of the vulnerabilities that have been discovered in the past year is already meaningless.[ii] Of course, the NVD will probably survive as an historical database, but it won’t be a trustworthy source of vulnerability data for CVEs identified after February 2024.

How can this problem be fixed? It depends on what you mean by ‘fixed’. If you’re talking about going back and adding CPEs to all the unenriched CVE records since last February, that’s probably never going to happen.

However, if you’re talking about putting in place a long-term solution, that is certainly possible. Like all long-term solutions, it will require a lot of work. The source of the problem is quite clear: For reasons that are not known, the people tasked with adding CPE names to CVE records aren’t able to do their job now.

At first, it might seem that the solution to this problem is to find some more money to throw at the problem. NIST did that last year – and now the problem is worse than ever. Plus, given the current climate in Washington, I strongly doubt there are any trees left to shake for money.

Most importantly, CPE is a flawed software identifier, as the SBOM Forum (now the OWASP SBOM Forum) described in our 2022 white paper on fixing the naming problem in the NVD. Some of CPE’s problems might be fixed (and may in fact be fixed due to CVE.org’s revised specification), but others simply can’t be fixed.

Perhaps the biggest of these problems is the fact that CPE requires a name for the software product and the vendor of the product. What’s so hard about that? The problem is most software products and software vendors go by many different names in different contexts. If someone wants to find a particular product or vendor name in the NVD, they will need to guess at the one that was used by the person who created the CPE. There’s no way to know beforehand what that name was.
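A small sketch shows what this guessing looks like in practice. The vendor name, the list of corporate suffixes and the guessing rules below are all hypothetical; nothing here comes from the CPE specification. The point is only that several plausible strings exist for one vendor, and an exact-match search succeeds with just one of them.

```python
def candidate_vendor_strings(display_name):
    """Generate plausible vendor strings a searcher might try.

    The rules are illustrative guesses, not part of any specification.
    """
    base = display_name.lower().replace(",", "").replace(".", "")
    words = base.split()
    suffixes = {"inc", "corp", "corporation", "llc", "ltd"}
    core = [w for w in words if w not in suffixes]
    guesses = {
        "_".join(words),  # full name, underscores for spaces
        "_".join(core),   # without the corporate suffix
        "".join(core),    # run together
        core[0],          # first word only
    }
    return sorted(guesses)


# Hypothetical vendor, and the (unknown to the user) string in the CPE name.
guesses = candidate_vendor_strings("Example Software, Inc.")
stored_vendor = "examplesoftware"

matches = [g for g in guesses if g == stored_vendor]
print(guesses)
print(matches)
```

Four reasonable guesses come out of one company name, and only the one that happens to equal the stored string finds anything. The user has no way to know in advance which one that is.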

At this point, someone will usually say something like, “If we had a global directory of software vendor names, all those names could be mapped to a single canonical name. Then we could require that the person who creates a CPE name only use the canonical name for each vendor.” This sounds very simple, but who makes the decision on the name for the vendor? Is it the name used by the current CEO, by the current CFO (in financial filings, etc.), in the initial articles of incorporation, or by the New York Times or the Tokyo Yomiuri Shimbun?

And what if the developer acquires another developer, but – as usually happens – leaves the existing product and vendor names in place for a year or two, or perhaps never changes them? A customer of the acquired developer might not learn of the acquisition for a year or two and will continue to search for vulnerabilities using the previous product and vendor names.

Of course, if the funds were available to create a global database of software suppliers and more importantly to maintain it, this wouldn’t be a problem. However, maintaining the supplier database would be hugely more expensive than maintaining the NVD itself. And when you talk about the required database of product names, that would be much more expensive than the supplier database. Clearly, neither of these databases is likely to be available soon or ever.

What’s needed is a software identifier that is based entirely on information that is available to the user at any time. Such an identifier doesn’t have to be “created” at all. That identifier is purl, which stands for Package URL. Purl went from literally nowhere ten years ago to conquering the open source world, to the extent that currently there is almost no open source vulnerability database that doesn’t support purl. Purl can be implemented in the “CVE ecosystem” fairly easily, once a revised CVE record format can be developed and tested. Moreover, purl doesn’t need to supplant CPE. Current CPE names won’t go away, and if someone starts creating new CPE names again and adding them to CVE records, they will certainly be supported.
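To show what "based entirely on information available to the user" means, here is a minimal sketch that assembles a purl from the package type, name and version a user already knows. The general layout, pkg:type/namespace/name@version, comes from the purl specification; the helper name `make_purl` is my own, and the sketch leaves out qualifiers, subpaths and the type-specific normalization rules the specification defines.

```python
from urllib.parse import quote


def make_purl(pkg_type, name, version, namespace=None):
    """Assemble a basic purl: pkg:type/namespace/name@version.

    Simplified sketch: no qualifiers, no subpath, and no type-specific
    normalization (for example, case rules for particular package types).
    """
    segments = [pkg_type.lower()]
    if namespace:
        # Each namespace segment is percent-encoded on its own.
        segments.extend(quote(part, safe="") for part in namespace.split("/"))
    segments.append(quote(name, safe=""))
    return "pkg:" + "/".join(segments) + "@" + quote(version, safe="")


print(make_purl("pypi", "django", "1.11.1"))
print(make_purl("npm", "animation", "12.3.1", namespace="@angular"))
```

Two people who know the same package manager, package name and version will produce the same string independently. Nobody has to look up a name that someone else chose.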

To implement purl in the CVE ecosystem, two steps are required:

1. Changes will be needed in the CVE Record format. Of course, that format is changed regularly, so this should not be too hard. The changes will not be implemented until step 2 is completed (and they may not be implemented for a while after that, since there is always a backlog of format changes that are needed).

2. There needs to be an end-to-end proof of concept. It will start with CVE Numbering Authorities creating test CVE records (based on the revised CVE Record format) and end with users searching vulnerability databases for vulnerabilities applicable to particular products. If the users are shown all the CVEs that affect each product, the PoC will be successful.

These two steps are required to implement purl as it’s currently configured: to support open source software found in package managers. Purl doesn’t currently support commercial products. While it will be a big step forward when purl is implemented in the CVE ecosystem just for open source software, it will be much better when purl supports commercial software as well. Since the NVD is currently the primary vulnerability database for commercial software, if purl is to provide a solution to the NVD’s problems, it will need to support commercial software as well as open source.

I described a possible solution to this problem – proposed by Steve Springett, leader of the OWASP Dependency Track and CycloneDX projects – in this blog post. It requires software suppliers to create a “SWID tag” for each of their products and make it available on their website to all interested parties, whether or not they are customers (the SWID tags will also be distributed via the Transparency Exchange API).

A user who wants to search for vulnerabilities in a product they use can download the product’s SWID tag and create a simple purl using it (usually, only 3-4 fields from the SWID tag will be required to create the purl). Since the CNA will use the same tag when they create the purl for the CVE record, this means a user searching for the product’s purl in a vulnerability database should always find any CVE records for the same product and version. Note that this doesn’t require anyone to “create” a purl; its contents will be dictated by the contents of the SWID tag.
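As a rough illustration of that last point, the sketch below reads four fields from a SWID tag and assembles a purl from them. It rests on several assumptions: the sample tag is invented, the XML namespace is the one I understand the 2015 SWID schema to use, and the mapping onto a `swid` purl type reflects my reading of the purl type list rather than a finished design. It also simply takes the first Entity element, where real code would select the one with the tagCreator role.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Assumed namespace of the ISO/IEC 19770-2:2015 SWID schema.
SWID_NS = "{http://standards.iso.org/iso/19770/-2/2015/schema.xsd}"

# Invented sample tag for a hypothetical commercial product.
SAMPLE_TAG = """<SoftwareIdentity
    xmlns="http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
    name="Enterprise Server" version="1.0.0"
    tagId="75b8c285-fa7b-485b-b199-4745e3004d0d">
  <Entity name="Acme" regid="example.com" role="tagCreator softwareCreator"/>
</SoftwareIdentity>"""


def swid_to_purl(tag_xml):
    """Derive a purl from the tag's creator, name, version and tagId."""
    root = ET.fromstring(tag_xml)
    entity = root.find(f"{SWID_NS}Entity")  # simplification: first Entity
    parts = ["swid"]
    if entity is not None:
        parts.append(quote(entity.get("name"), safe=""))
        if entity.get("regid"):
            parts.append(quote(entity.get("regid"), safe=""))
    parts.append(quote(root.get("name"), safe=""))
    purl = "pkg:" + "/".join(parts) + "@" + quote(root.get("version"), safe="")
    return purl + "?tag_id=" + quote(root.get("tagId"), safe="")


print(swid_to_purl(SAMPLE_TAG))
```

Because every field comes straight from the tag, a CNA and a user who start from the same tag end up with the same purl without coordinating with each other.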

Thus, the third and final step for implementing purl support in the CVE ecosystem is testing and implementing the use of SWID tags to create and validate purls for commercial software. Since this step can be accomplished at the same time as the first two steps, it would speed up the implementation of purl in the CVE ecosystem if the two tracks could be carried out simultaneously.

How long will it take to do all of this? I think the first two steps – which will implement purl in the CVE ecosystem with coverage of open source software in package managers – will take about a year. The third step, validating and implementing the use of SWID tags to create purls for commercial software, will require another year. However, if resources permitted both tracks to be carried out at the same time, we could have purl supported throughout the CVE ecosystem, for both open source and commercial software, in 1 ½ to 2 years.

This is a long time, of course, but given the distinct possibility that the NVD will become an empty shell soon – and given that it is already greatly diminished – what other choice is there? With 280,000 CVEs in the catalog now, it is long past the time when anything other than automated search for software vulnerabilities is practical. We’ve already effectively lost the capability for automated identification of all vulnerabilities that apply to a product in an NVD search. It’s time to start work on Plan B.

The OWASP SBOM Forum is willing to take the lead on all three steps of this project. However, we require funding for that. If your organization is able to support us with either donations or personnel or both, please email me at tom@tomalrich.com. OWASP is a 501(c)(3) nonprofit organization.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.

[i] For a summary of what CVE.org does and how it relates to the NVD, see this post.

[ii] CVE.org has made modifications to the CPE specification in an effort to get the CVE Numbering Authorities (CNAs) – who in the majority of cases are the developer of the product for which a vulnerability is being reported – to start adding CPE names to the CVE records they create. I hope this effort is successful, of course, but we certainly can’t count on it.

Wednesday, March 19, 2025