Brian Martin, a well-known vulnerability researcher with whom I’ve had previous online discussions, has been closely following what is happening with the National Vulnerability Database (NVD) for a long time. Recently, he has been trying to answer the question of whether the NVD is making progress on fixing the big problem that began on February 12, 2024.
Starting on that date, the NVD greatly
slowed performance of what I (and others) believe is their most important
function: adding machine-readable product identifiers called CPE names to the
CVE records created by CVE.org[i]. CVE records (created by CVE Numbering
Authorities, or CNAs) include a textual description of a newly identified
vulnerability and of one or more products that are affected by it.
However, an NVD user utilizing the
search bar (or one of the APIs) to learn about CVEs that apply to a product their
organization uses will not be shown any CVEs that don’t include a CPE name. While
in previous years, the NVD staff – or more specifically, contractors to NIST, the
NVD’s parent – had always created CPE names for products mentioned in a new CVE
record within days of the NVD’s receiving it from CVE.org, in 2024 the NVD added
CPE names to fewer than half of the new records. This means that, if you search
for a particular product in the NVD’s search bar today, you will on average be
shown fewer than half of the CVEs that affect the product, if they were
discovered in 2024.
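To make the effect concrete, here is a minimal sketch of how a CPE-based search silently skips unenriched records. The record structure, field names, CVE IDs and product are all hypothetical; this is not the NVD’s actual schema or API, just an illustration of the matching logic described above.

```python
from dataclasses import dataclass, field


@dataclass
class CveRecord:
    """Hypothetical, simplified CVE record (not the real NVD schema)."""
    cve_id: str
    description: str
    cpe_names: list[str] = field(default_factory=list)  # empty = unenriched


def search_by_product(records, vendor, product):
    """Return IDs of records with at least one CPE name for the product."""
    prefix = f"cpe:2.3:a:{vendor}:{product}:"
    return [
        r.cve_id
        for r in records
        if any(name.startswith(prefix) for name in r.cpe_names)
    ]


records = [
    # Enriched record: a CPE name was added, so a product search finds it.
    CveRecord(
        "CVE-0000-0001",
        "Buffer overflow in Example Server 1.0",
        ["cpe:2.3:a:example:server:1.0:*:*:*:*:*:*:*"],
    ),
    # Unenriched record: the product is only named in the description text.
    CveRecord("CVE-0000-0002", "Authentication bypass in Example Server 1.0"),
]

found = search_by_product(records, "example", "server")

# Only a text search of the descriptions reveals what the CPE search missed.
missed = [
    r.cve_id
    for r in records
    if "Example Server" in r.description and r.cve_id not in found
]

print("Found by CPE search:", found)
print("Missed (unenriched):", missed)
```

The second record affects the same product, but because it has no CPE name, the product search gives no hint that it exists.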
All CVE records that don’t have CPE
names (called “unenriched” records) constitute the NVD’s backlog. Until last
February, the NVD ran on average a zero backlog, but that changed on February
12. For a while, the NVD promised they would have the backlog cleared up in
2024, but that simply didn’t happen. In fact, in late December the NVD initially
reported adding CPE names to only about 1% of CVE records. This meant that
during that period, the backlog essentially grew at the same rate as new CVE
records.
By the end of 2024, the backlog
was about 22,000 unenriched records, which is close to 55% of the approximately
40,000 new CVE records created in 2024. As we started off 2025, the question
was whether the NVD would start working off that backlog or whether they would
allow it to grow.
Unfortunately, we have the answer
to that question now: Brian put up a post
on LinkedIn this week that shows the NVD seems to have completely stopped
adding CPE names (and CVSS scores) to new CVE records for the previous 12 days.
Of course, there is no communication at all from the NVD about this problem –
in fact, the last time they posted a communication about their backlog was last
November 13, when they said they were now fully staffed up and were hoping to
make real progress soon.
Now it seems the NVD may have stopped
performing their most important function, other than maintaining the database
itself. This could well be because of the turmoil that has been going on in the
federal government this year, which has included cancellation (whether legal or
otherwise) of many ongoing contracts. If this is the case, that’s bad news
indeed, since it means the NVD might end up going the way of agencies like
USAID, which was all but shut down more than a month ago and remains in that
state (with close to zero employees) today, despite a court order to reopen the
agency.
In other words, the NVD might already
be gone. But even if it survives in some form, unless enrichment restarts, it
will be a hollow shell of what it was. The good news is that, if you search the
NVD for recently announced vulnerabilities that apply to a product you use now,
you’re likely to receive the message that there are no vulnerabilities to
display.
The bad news is that’s probably
not the truth. However, the only way to find out about CVEs that were not
displayed is to do text searches of the over 40,000 new CVEs that were
identified since February 12, 2024. Obviously, that’s not a solution to the
problem. There are several third parties, like Vulners and VulnCheck, that are
performing this work themselves, although none of them has done it for all
22,000 CVEs in the backlog. CISA was doing some of this work in their
Vulnrichment program, but that program stopped adding CPE names in December,
for some unexplained reason.
I don’t know what the backlog of
unenriched CVEs is today, as a percentage of new CVEs identified since say the
beginning of 2024. However, it’s without a doubt over the 55% level, where it
stood in December. Where will this end? Clearly, if enrichment doesn’t resume
(and specifically, addition of CPE names to CVE records), we’ll end up with a
backlog that asymptotically approaches the number of CVEs that have been
identified since last February 12.
In other words, a search for a
product in the NVD will become increasingly meaningless. In fact, many people
would argue that any product vulnerability search that at best will yield you
half of the vulnerabilities that have been discovered in the past year is
already meaningless.[ii] Of
course, the NVD will probably survive as an historical database, but it won’t be
a trustworthy source of vulnerability data for CVEs identified after February
2024.
How can this problem be fixed? It
depends on what you mean by ‘fixed’. If you’re talking about going back and
adding CPEs to all the unenriched CVE records since last February, that’s
probably never going to happen.
However, if you’re talking about putting
in place a long-term solution, that is certainly possible. Like all long-term
solutions, it will require a lot of work. The source of the problem is quite
clear: For reasons that are not known, the people tasked with adding CPE names
to CVE records aren’t able to do their job now.
At first, it might seem that the
solution to this problem is to find some more money to throw at the problem.
NIST did that last year – and now the problem is worse than ever. Plus, given
the current climate in Washington, I strongly doubt there are any trees left to
shake for money.
Most importantly, CPE is a flawed software
identifier, as the SBOM Forum (now the OWASP SBOM Forum) described in our 2022
white paper on fixing the naming problem in the NVD. Some of CPE’s problems
might be fixed (and may in fact be fixed due to CVE.org’s revised specification),
but others simply can’t be fixed.
Perhaps the biggest of these
problems is the fact that CPE requires a name for the software product and the
vendor of the product. What’s so hard about that? The problem is most software
products and software vendors go by many different names in different contexts.
If someone wants to find a particular product or vendor name in the NVD, they
will need to guess at the one that was used by the person who created the CPE.
There’s no way to know beforehand what that name was.
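A small sketch shows why guessing is a problem. The `make_cpe` helper and its normalization rule are my own simplification (real CPE 2.3 formatting has more escaping rules), and the vendor and product names are hypothetical. The point is only that different, equally reasonable vendor names produce different CPE strings.

```python
def make_cpe(vendor: str, product: str, version: str) -> str:
    """Build a simplified CPE 2.3 formatted string for an application.

    Assumption: lowercasing and replacing spaces with underscores stands in
    for the real (more involved) CPE normalization rules.
    """
    def norm(value: str) -> str:
        return value.strip().lower().replace(" ", "_")

    return f"cpe:2.3:a:{norm(vendor)}:{norm(product)}:{version}:*:*:*:*:*:*:*"


# The name chosen by whoever created the CPE (unknown to the searcher).
created = make_cpe("Example Corp", "Widget Server", "2.1")

# Plausible names a user might try for the same vendor.
vendor_guesses = ["Example", "Example Corporation", "Example Corp"]
guesses = [make_cpe(v, "Widget Server", "2.1") for v in vendor_guesses]

# Only an exact guess matches; the near misses find nothing.
matches = [g == created for g in guesses]

for vendor, hit in zip(vendor_guesses, matches):
    print(f"{vendor!r}: {'match' if hit else 'no match'}")
```

Only the third guess matches, and nothing tells the user that the first two were near misses rather than evidence that the product has no vulnerabilities.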
At this point, someone will
usually say something like, “If we had a global directory of software vendor
names, all those names could be mapped to a single canonical name. Then we
could require that the person who creates a CPE name only use the canonical
name for each vendor.” This sounds very simple, but who makes the decision on the
name for the vendor? Is it the name used by the current CEO, by the current CFO
(in financial filings, etc.), in the initial articles of incorporation, or by
the New York Times or the Tokyo Yomiuri Shimbun?
And what if the developer acquires
another developer, but – as usually happens – leaves the existing product and
vendor names in place for a year or two, or perhaps never changes them? A
customer of the acquired developer might not learn of the acquisition for a
year or two and will continue to search for vulnerabilities using the previous
product and vendor names.
Of course, if the funds were
available to create a global database of software suppliers and more
importantly to maintain it, this wouldn’t be a problem. However, maintaining the
supplier database would be hugely more expensive than maintaining the NVD
itself. And when you talk about the required database of product names, that
would be much more expensive than the supplier database. Clearly, neither of
these databases is likely to be available soon or ever.
What’s needed is a software
identifier that is based entirely on information that is available to the user
at any time. Such an identifier doesn’t have to be “created” at all. That
identifier is purl,
which stands for Package URL. Purl has gone from literally nowhere ten years ago
to conquering the open source world – to the extent that currently there is
almost no open source vulnerability database that doesn’t support purl. Purl
can be implemented in the “CVE ecosystem” fairly easily, once a revised
CVE record format can be developed and tested. Moreover, purl doesn’t need to
supplant CPE. Current CPE names won’t go away, and if someone starts creating
new CPE names again and adding them to CVE records, they will certainly be
supported.
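As a rough illustration of why a purl doesn’t have to be “created”, the sketch below assembles one from information a user already has: the package manager, the package name and the version. The `make_purl` helper is my own simplification; the purl specification also defines type-specific normalization rules, qualifiers and subpaths that are omitted here.

```python
from urllib.parse import quote


def make_purl(pkg_type, name, version, namespace=None):
    """Assemble a basic purl: pkg:<type>/<namespace>/<name>@<version>.

    Simplified sketch: type-specific normalization, qualifiers and subpaths
    from the purl specification are not handled.
    """
    parts = [pkg_type.lower()]
    if namespace:
        parts.append(quote(namespace, safe=""))
    parts.append(quote(name, safe=""))
    return "pkg:" + "/".join(parts) + "@" + quote(version, safe="")


# The user reads these values straight from their package manager.
pypi_purl = make_purl("pypi", "requests", "2.31.0")
maven_purl = make_purl(
    "maven", "log4j-core", "2.17.1", namespace="org.apache.logging.log4j"
)

print(pypi_purl)
print(maven_purl)
```

Anyone who knows the same three or four facts about the package will arrive at the same string, so there is no name to guess and no central authority that has to assign it.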
To implement purl in the CVE
ecosystem, two steps are required:
1. Changes will be needed in the CVE Record format. Of course, that format is
changed regularly, so this should not be too hard. The changes will not be
implemented until step 2 is completed (and they may not be implemented for a
while after that, since there is always a backlog of format changes that are
needed).
2. There needs to be an end-to-end proof of concept. It will start with CVE
Numbering Authorities creating test CVE records (based on the revised CVE
Record format) and end with users searching vulnerability databases for
vulnerabilities applicable to particular products. If the users are shown all
the CVEs that affect each product, the PoC will be successful.
These two steps are required to implement
purl as it’s currently configured: to support open source software found in
package managers. Purl doesn’t currently support commercial products. While it
will be a big step forward when purl is implemented in the CVE ecosystem just for
open source software, it will be much better when purl supports commercial
software as well. Since the NVD is currently the primary vulnerability database
for commercial software, if purl is to provide a solution to the NVD’s
problems, it will need to support commercial software as well as open source.
I described a possible solution to
this problem – proposed by Steve Springett, leader of the OWASP Dependency Track
and CycloneDX projects – in this
blog post. It requires software suppliers to create a “SWID tag” for each of
their products and make it available on their website to all interested
parties, whether or not they are customers (the SWID tags will also be
distributed via the Transparency
Exchange API).
A user who wants to search for
vulnerabilities in a product they use can download the product’s SWID tag and
create a simple purl using it (usually, only 3-4 fields from the SWID tag will
be required to create the purl). Since the CNA will use the same tag when they
create the purl for the CVE record, this means a user searching for the product’s
purl in a vulnerability database should always find any CVE records for the
same product and version. Note that this doesn’t require anyone to “create” a
purl; its contents will be dictated by the contents of the SWID tag.
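The idea can be sketched as follows. The sample tag, its values and the `purl_from_swid` helper are hypothetical, and the exact purl layout shown (creator name and regid as the namespace, plus a `tag_id` qualifier) is my assumption about how the fields would be mapped, not a settled design. What matters is that the user and the CNA run the same derivation on the same tag.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# XML namespace used by ISO/IEC 19770-2:2015 SWID tags.
SWID_NS = "{http://standards.iso.org/iso/19770/-2/2015/schema.xsd}"

# Hypothetical tag a supplier might publish on its website.
SAMPLE_TAG = """<SoftwareIdentity
    xmlns="http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
    name="Widget Server" tagId="example-widget-server-2.1" version="2.1">
  <Entity name="Example Corp" regid="example.com"
          role="tagCreator softwareCreator"/>
</SoftwareIdentity>"""


def purl_from_swid(tag_xml: str) -> str:
    """Derive a purl from a handful of SWID tag fields (assumed mapping)."""
    root = ET.fromstring(tag_xml)
    entity = root.find(f"{SWID_NS}Entity")
    namespace = "/".join(
        quote(part, safe="")
        for part in (entity.get("name"), entity.get("regid"))
    )
    name = quote(root.get("name"), safe="")
    version = quote(root.get("version"), safe="")
    tag_id = quote(root.get("tagId"), safe="")
    return f"pkg:swid/{namespace}/{name}@{version}?tag_id={tag_id}"


# The CNA and the end user each derive the purl independently from the tag.
cna_purl = purl_from_swid(SAMPLE_TAG)
user_purl = purl_from_swid(SAMPLE_TAG)

print(user_purl)
print("Identifiers match:", cna_purl == user_purl)
```

Because both parties start from the same published tag, neither has to guess which vendor or product name the other used.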
Thus, the third
and final step for implementing purl support in the CVE ecosystem is testing
and implementing the use of SWID tags to create and validate purls for
commercial software. Since this step can be accomplished at the same time as
the first two steps, it would speed up the implementation of purl in the CVE
ecosystem if the two tracks could be carried out simultaneously.
How long will it take to do all of
this? I think the first two steps – which will implement purl in the CVE ecosystem
with coverage of open source software in package managers – will take about a
year. The third step, validating and implementing the use of SWID tags to
create purls for commercial software, will require another year. However, if
resources permitted both tracks to be carried out at the same time, we could
have purl supported throughout the CVE ecosystem, for both open source and
commercial software, in 1 ½ to 2 years.
This is a long time, of course, but
given the distinct possibility that the NVD will become an empty shell soon –
and given that it is already greatly diminished – what other choice is there?
With 280,000 CVEs in the catalog now, it is long past the time when anything
other than automated search for software vulnerabilities is practical. We’ve
already effectively lost the capability for automated identification of all vulnerabilities
that apply to a product in an NVD search. It’s time to start work on Plan B.
The OWASP SBOM Forum is willing to take the lead on all
three steps of this project. However, we require funding for that. If your
organization is able to support us with either donations or personnel or both,
please email me at tom@tomalrich.com. OWASP is a 501(c)(3) nonprofit organization.
My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.
[ii] CVE.org has made modifications to the CPE specification in an effort to get the
CVE Numbering Authorities (CNAs) – who in the majority of cases are the
developer of the product for which a vulnerability is being reported – to start
adding CPE names to the CVE records they create. I hope this effort is successful,
of course, but we certainly can’t count on it.