Before our biweekly OWASP SBOM Forum meeting on Friday, I asked Andrey Lukashenkov of Vulners for an update on the National Vulnerability Database's (NVD) backlog of “unenriched” CVE records[i]. Andrey said the backlog is now over 21,300. That is at least 2,000 more than it was just over a month ago, and it is a record high for this year. Since a total of 37,000 new CVE records have been added to the NVD this year, this means only about 43% contain a CPE name.
In other words, on average a simple search of the NVD using
a known CPE name will only discover 43% of vulnerabilities identified since
February 12, the day the NVD's problems started. Even though CVE records for
the other 57% of vulnerabilities are present in the NVD, they don’t contain CPE
names and therefore are invisible to searches. If you want to learn whether any
of those vulnerabilities apply to your product, you need to do a text search of
the 21,300 “unenriched” (i.e., CPE-less) CVE records. You would need
to do that for every product of concern to you, and you would have to repeat it as often
as you want to learn about newly reported vulnerabilities, which ideally is
daily. Of course, nobody is going to do this.
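To make the burden concrete, here is a minimal sketch of that manual workaround. It is an illustration under stated assumptions, not a real NVD client: the records are simplified dictionaries whose field names (`id`, `description`, `cpes`) are hypothetical and do not follow the NVD's actual JSON schema, and the CVE IDs and product names are invented.

```python
from typing import Dict, List

# Hypothetical, simplified CVE records. Real NVD records use a different and
# much richer JSON schema; these field names are assumptions for illustration.
SAMPLE_RECORDS: List[Dict] = [
    {"id": "CVE-0000-0001",
     "description": "Buffer overflow in ExampleServer 2.1",
     "cpes": []},
    {"id": "CVE-0000-0002",
     "description": "XSS in OtherApp before 4.0",
     "cpes": ["cpe:2.3:a:othervendor:otherapp:3.9:*:*:*:*:*:*:*"]},
    {"id": "CVE-0000-0003",
     "description": "Denial of service in exampleserver",
     "cpes": []},
]


def unenriched(records: List[Dict]) -> List[Dict]:
    """Return the records that carry no CPE name at all."""
    return [r for r in records if not r["cpes"]]


def text_search(records: List[Dict], product_name: str) -> List[str]:
    """Case-insensitive substring search of the free-text description."""
    needle = product_name.lower()
    return [r["id"] for r in records if needle in r["description"].lower()]


def daily_check(records: List[Dict], products: List[str]) -> Dict[str, List[str]]:
    """Repeat the text search over the backlog for every product of concern."""
    backlog = unenriched(records)
    return {p: text_search(backlog, p) for p in products}


if __name__ == "__main__":
    print(daily_check(SAMPLE_RECORDS, ["ExampleServer", "OtherApp"]))
```

Even in this toy form, the weakness is visible: substring matching depends on how the product happens to be spelled in the description, so it can both miss real matches and return false ones.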
Andrey also pointed out something even more startling: During the first four days of the week of December 2 (and presumably also on the day we were meeting, December 6), the NVD added CPE names to exactly 0% of new CVE records. Since their problems started on February 12th, the NVD has always enriched at least a few CVE records every day (other than a single day in May).
Of course, I assume the NVD will resume adding CPE names to
CVE records sooner or later. But the idea that the NVD can eliminate their
backlog in 2025 (or perhaps ever) looks more and more like fantasy. CISA has added
about 2,000 CPEs for exploited vulnerabilities, but the backlog figure of
21,300 presumably takes those into account. In addition, a few firms like Vulners (Andrey’s employer) and VulnCheck
have taken it upon themselves to add their own CPEs to some of the unenriched CVE
records; unfortunately, neither of these firms has official “Authorized Data
Publisher” (ADP) status, so it isn’t clear what will happen to the CPE names
they created if and when the NVD resumes normal enrichment.
In other words, today automated searches of the NVD, and
presumably vulnerability scanner output, will normally identify no more than
50% of vulnerabilities that have been reported since February 12. If you went
to a doctor to have an illness diagnosed and they told you up front that they
knew about fewer than 50% of the new diseases discovered this
year, would you keep going to them? That’s essentially the problem the software
security community faces now.
What’s the solution to this problem? Some people have
pointed to the CVE
Numbering Authorities (CNAs) as the solution. These are the organizations,
including a number of large software developers (e.g., Oracle, Microsoft, and Schneider
Electric) and organizations like GitHub, MITRE and Japan's JPCERT/CC, that create the CVE records
in the first place. They report vulnerabilities in products they have developed
themselves, as well as products from developers, including open source projects,
that are not themselves CNAs.
The question is why the CNAs aren’t adding CPE names to the
CVE records they create. Since the SBOM Forum includes several large CNAs, we
have discussed this question a lot. I have heard two main answers:
1. In the past, the NVD has usually rejected CPE
names that were created by anyone other than the NVD, presumably on the grounds
that only the NVD knows how to create them. Unfortunately, if the NVD has some
sort of secret process they follow to create CPE names, they have never revealed
it. Moreover, that process seems to include at least a few purely random
elements, since nobody has ever come up with a way to predict a CPE name with
certainty. For a discussion of some of the problems with CPE, as well as
how they might be addressed, see this
2022 paper by the SBOM Forum (the discussion of problems with CPE is found on
pages 4-6).
2. To be honest, there seems to be little if any
enthusiasm among the CNAs to start creating CPEs, precisely because so much of
the process seems to be arbitrary. Nobody can be expected to invest a lot of
time creating a CPE name when it has all the durability of a Jell-O sculpture.
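To see why a CPE name is hard to predict, it helps to look at the CPE 2.3 formatted string, which puts the vendor and product names in fixed positions. The sketch below builds several candidate names for one product. The vendor and product spellings are hypothetical. Every plausible spelling yields a different string, and at most one of them will match whatever the NVD actually chose.

```python
from itertools import product as cartesian
from typing import List


def cpe23(vendor: str, product: str, version: str, part: str = "a") -> str:
    """Build a CPE 2.3 formatted string.

    The seven attributes after the version (update, edition, language,
    sw_edition, target_sw, target_hw, other) are left as wildcards.
    """
    return f"cpe:2.3:{part}:{vendor}:{product}:{version}:" + ":".join(["*"] * 7)


# Hypothetical guesses a CNA or end user might make for the same product.
VENDOR_GUESSES = ["example", "example_inc", "example_software"]
PRODUCT_GUESSES = ["widget_server", "widgetserver"]


def candidates(version: str) -> List[str]:
    """Every combination of vendor and product guess for one version."""
    return [cpe23(v, p, version)
            for v, p in cartesian(VENDOR_GUESSES, PRODUCT_GUESSES)]


if __name__ == "__main__":
    for name in candidates("1.0"):
        print(name)
```

Three vendor guesses and two product guesses already give six distinct strings, and an exact-match search using the wrong one finds nothing.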
Fortunately, there is an alternative to CPE called purl,
which stands for “package URL”. In less than one decade, purl has gone from
nowhere to completely conquering the open source software world. It is used as the
software identifier in almost all vulnerability databases for open source
software worldwide. The notable exception to this rule is the NVD and databases
derived from it, which of course use CPE.
Why has purl been so successful in the open source world?
This post
discusses several reasons, but the most important is that a user who wants to
know the purl for an open source product, which they downloaded from a package manager,
will always create the same purl as any other user, as long as it is for the
same version of the same product, which is found in the same package manager.
Moreover, the CNA reporting a vulnerability in that product in a
CVE record will create the purl using the same information – meaning a purl
used to search for an open source project in a vulnerability database should
always (barring human error) match the purl in a CVE record. Unlike the case
with the NVD today, in which a search for CVEs applicable to a product will
probably not reveal half of the vulnerabilities that have been identified in
that product this year, a search in a purl-based open source vulnerability
database like OSS Index should always
yield every vulnerability that has ever been reported for the same product.
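The determinism described above follows from the shape of a purl, which in its general form is `pkg:<type>/<namespace>/<name>@<version>`. Here is a simplified sketch of building one from the package manager type, the package name and the version. The function name `make_purl` is my own. The sketch leaves out the per-type normalization rules, qualifiers and subpaths that the purl specification defines, so treat it as an illustration rather than a conforming implementation.

```python
from typing import Optional
from urllib.parse import quote


def make_purl(pkg_type: str, name: str, version: str,
              namespace: Optional[str] = None) -> str:
    """Build a simplified purl of the form pkg:type/namespace/name@version.

    Assumption: percent-encoding each segment is enough. The real purl
    specification adds type-specific normalization that is omitted here.
    """
    parts = [pkg_type.lower()]
    if namespace:
        parts.extend(quote(seg, safe="") for seg in namespace.split("/"))
    parts.append(quote(name, safe=""))
    return "pkg:" + "/".join(parts) + "@" + quote(version, safe="")


if __name__ == "__main__":
    # The CNA and the end user start from the same three facts...
    cna_purl = make_purl("pypi", "django", "4.2.1")
    user_purl = make_purl("pypi", "django", "4.2.1")
    # ...so they arrive at the same identifier without consulting anyone.
    print(cna_purl, cna_purl == user_purl)
    print(make_purl("maven", "log4j-core", "2.17.1",
                    namespace="org.apache.logging.log4j"))
```

Nothing in the function requires a lookup in a central dictionary of names, which is the key difference from CPE.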
However, there are two important tasks (each with sub-tasks)
that need to be accomplished, before purl can be placed on an equal footing with
CPE in CVE records.[ii]
They are:
First task: CVE Numbering Authorities need to start including
purls in CVE records, when the product being referenced is an open source
product in a package manager. While that is technically possible now due to the
CVE 5.1 specification coming into effect this past spring, it turns out that
virtually none of the CNAs are in fact doing this. The biggest reason is
undoubtedly that neither of the two major US government-run databases, the NVD
and CVE.org, currently accepts any software
identifier other than CPE. So, a CVE record with a purl identifier is all
dressed up with nowhere to go.
How can this situation be changed? Some group needs to conduct
extensive outreach to the CNAs and to CVE.org (which runs the CVE Program, including recruiting
and managing the CNAs). That outreach will include “evangelizing” about the
advantages of including purls in CVE records, as well as training on the
details of doing so. Just as importantly, the group needs to work with the CNAs
and CVE.org to identify the policies and procedures that must be in place for
purls to be successfully used in the CVE context.
One important part of this effort will be conducting an end-to-end
proof of concept, in which:
1. CNAs will include a purl whenever they create a CVE
record to report a new vulnerability in an open source product found in a package
manager. The purl will be based on the package manager name, as well as the
product name and version string in that package manager.
2. A purl-based vulnerability database will ingest
the CVE record, just as the NVD does for CVE records now.
3. A user who has downloaded an open source product
from a package manager will easily create a purl using the package manager name,
as well as the product name and version string as registered in the package
manager. Since the user’s purl should always match the purl that the CNA included
in the CVE record, the search should always return every CVE that has been reported
for that product.
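The three steps above can be mocked up in memory to show the flow. This is a toy model, not a description of any real database: the record layout, the `ToyPurlDatabase` class, the package and the CVE ID are all hypothetical. A real CVE record would usually cover a range of versions, but the sketch lists one exact version to keep matching trivial.

```python
from collections import defaultdict
from typing import Dict, List, Set


def purl_for(pkg_manager: str, name: str, version: str) -> str:
    """Simplified purl: pkg:<type>/<name>@<version> (no namespace or encoding)."""
    return f"pkg:{pkg_manager.lower()}/{name}@{version}"


class ToyPurlDatabase:
    """In-memory stand-in for a purl-based vulnerability database."""

    def __init__(self) -> None:
        self._by_purl: Dict[str, Set[str]] = defaultdict(set)

    def ingest(self, cve_record: Dict) -> None:
        """Step 2: index the CVE record under every purl it lists."""
        for purl in cve_record["affected_purls"]:
            self._by_purl[purl].add(cve_record["cve_id"])

    def search(self, purl: str) -> List[str]:
        """Step 3: exact-match lookup by purl."""
        return sorted(self._by_purl.get(purl, set()))


def run_poc() -> List[str]:
    # Step 1: the CNA derives the purl from the package manager's own data.
    record = {
        "cve_id": "CVE-0000-1234",  # invented ID
        "affected_purls": [purl_for("npm", "example-pkg", "1.3.0")],
    }
    db = ToyPurlDatabase()
    db.ingest(record)
    # The user derives the purl independently from the same data.
    return db.search(purl_for("npm", "example-pkg", "1.3.0"))


if __name__ == "__main__":
    print(run_poc())
```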
The results of this proof of concept should help convince CVE.org
and the CNAs that purl is a much better identifier for open source software
than CPE.
Second task: Purl needs to be able to identify
commercial software, not only open source software found in package managers. A
scheme for doing this was suggested in 2022 by Steve Springett, leader of the
OWASP Dependency-Track and CycloneDX projects and a founding member of the
OWASP SBOM Forum, in the above-referenced white paper on CPE naming in the NVD.
Steve’s idea is that commercial software suppliers will create standardized
short documents called “SWID tags”. These will provide authoritative metadata for
a software product, including the supplier name, product name and version
string.
Whenever the supplier wishes to report a new vulnerability
in their product, they will provide the SWID tag to the CNA who creates the new
CVE record. The CNA will create the product’s purl using the information in the
SWID tag; they will include the purl in the CVE record. Later, when an end user
wants to learn about new vulnerabilities that have been identified in a
commercial product they use, they will be able to locate and download[iii]
the same SWID tag as the CNA used when they created the purl in the CVE record.
The fact that both the CNA and the end user will base their purls on the same
SWID tag means the purls should be identical barring human error, just as in the
above case of purls for open source software distributed in package managers.
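Here is a rough sketch of how both parties could derive the same purl from one SWID tag. Everything specific in it is an assumption made for illustration: the sample tag is invented, the function name `purl_from_swid` is mine, and the layout `pkg:swid/<tag creator name>/<tag creator regid>/<product name>@<version>?tag_id=<tagId>` reflects my reading of the `swid` type in the purl specification, which should be consulted for the authoritative rules. The sketch also assumes the tag contains every field it reads.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# XML namespace used by ISO/IEC 19770-2:2015 SWID tags.
SWID_NS = "{http://standards.iso.org/iso/19770/-2/2015/schema.xsd}"

# Invented sample tag for a hypothetical commercial product.
SAMPLE_TAG = """<SoftwareIdentity
    xmlns="http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
    name="ExampleServer"
    tagId="example.com-exampleserver-1.0.0"
    version="1.0.0">
  <Entity name="Example Inc" regid="example.com"
          role="tagCreator softwareCreator"/>
</SoftwareIdentity>"""


def purl_from_swid(tag_xml: str) -> str:
    """Derive a swid-type purl from the metadata in a SWID tag (simplified)."""
    root = ET.fromstring(tag_xml)
    # Use the entity that declares the tagCreator role.
    creator = next(
        e for e in root.iter(SWID_NS + "Entity")
        if "tagCreator" in e.get("role", "").split()
    )
    segments = ["swid", creator.get("name"), creator.get("regid"),
                root.get("name")]
    path = "/".join(quote(s, safe="") for s in segments)
    version = quote(root.get("version"), safe="")
    tag_id = quote(root.get("tagId"), safe="")
    return f"pkg:{path}@{version}?tag_id={tag_id}"


if __name__ == "__main__":
    cna_purl = purl_from_swid(SAMPLE_TAG)    # created for the CVE record
    user_purl = purl_from_swid(SAMPLE_TAG)   # created later by an end user
    print(cna_purl)
    print(cna_purl == user_purl)
```

Because the only input is the tag itself, the CNA and the end user have no opportunity to spell the supplier or product name differently.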
The three primary goals of the project are:
1. To work with commercial software developers, vulnerability
management service providers, and end users to identify policies and procedures
for creation and use of purls based on SWID tags.
2. To evangelize and train CVE.org staff members
and CNAs on creation and use of the new SWID-based purls. Of course, this
effort will build on the evangelization and training in the first task.
3. To conduct an end-to-end proof of concept that essentially
mirrors the one described in the first task, except that the purl name will
always be based on the contents of the SWID tag prepared by a commercial
software supplier, not the name and version string for an open source product distributed
through a package manager.[iv]
Tom Alrich and Tony Turner of the OWASP SBOM Forum have developed
a white
paper that proposes a project to implement both of the above tasks, as well
as a project
plan[v]
for doing this. The project is called “Purl Expansion Design and Proof of
Concept”. Because this project will almost certainly take more than a year to
accomplish, and because neither of us is able to donate that amount of time, we
are requesting donations to fund at least part of this effort. While we believe
the whole project will require over $100,000 in funding, we are willing to start
the project with a much more modest donation or donations.
If you or your organization are able to donate any amount over
$1,000, you can donate to OWASP (a 501(c)(3) nonprofit organization) and have
your donation “directed” to the SBOM Forum; this can be done either online or
directly. Donations are often tax deductible.
If you would like to discuss this, please email Tom Alrich at tom@tomalrich.com.
Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.
[i] CVE
records – i.e., records of newly-discovered software vulnerabilities – are supposed
to include one or more machine-readable software identifiers called CPE names. The
CPE name identifies a software product that is affected by the vulnerability
identified by the CVE number. Before February 12, 2024, the NVD always created
a CPE name for every product named (in a text field) in the CVE record. However,
on that day the NVD’s production of CPE names dropped precipitously; it has not
recovered since that day.
[ii] There
should be no problem with having both a CPE name and a purl in a single CVE
record, since there is no intention of purl “replacing” CPE. As long as somebody
– perhaps the NVD staff, or perhaps some CNAs who prefer CPE – is willing
to keep creating new CPE names, they will continue to be used. Moreover, the huge
set of CPE names already created (at least 250,000, and probably more than
that) will not disappear, since there is no good way to replace them with purls
in existing CVE records.
[iii] End
users will be able to locate and download a SWID tag, as well as other types of
software supply chain artifacts like SBOMs and VEX documents, by utilizing the
upcoming Transparency
Exchange API. It will be fully available in 2025.
[iv] Package
managers almost never distribute commercial software.
[v] The
project plan primarily focuses on the second task, since the need for the first
task was not apparent until very recently.