In April, I announced the new OWASP Vulnerability
Database Working Group, part of the OWASP SBOM Forum. The group was formed
to try to make sense of the many options available for vulnerability databases.
The seeming collapse of the National Vulnerability Database (NVD) has made it
imperative for all members of the software security community to learn what
their options are. There have always been lots of options, but when the NVD was
working reasonably well, many organizations were happy to put all their eggs in
that basket. Unfortunately, in case you didn’t know, the NVD is no longer
working well.
That group, which meets biweekly, has had some very
interesting discussions (the meeting notes and chats are here).
In the meeting this week, we discussed the problems caused by the fact that the
NVD has stopped “enriching” CVE reports by adding CPE names. That discussion
revealed a problem that I didn’t know existed. Since understanding the problem
requires understanding how information gets into the NVD in the first place,
I’ll start there.
The NVD is a database of software vulnerabilities, which are
identified by a CVE number (e.g., CVE-2021-44228, the famous log4j – actually,
log4shell – vulnerability). CVE numbers are maintained in a database operated
by the MITRE Corporation under a contract with DHS. The database used to be
called just “MITRE”, but now it’s officially known by its URL: cve.org. While MITRE personnel run cve.org
day-to-day, they report to an independent board composed of representatives
from private industry and government (including CISA and the NVD).
Like most people, probably, I used to think that
vulnerabilities were reported directly to MITRE by independent researchers and
white hat hackers, and that the developer of the software was not usually
involved in this process. However, that’s the opposite of the truth. In fact,
almost all CVEs are reported, in a CVE report, by the supplier of the software
itself.
A CVE report needs to be created by a “CVE Numbering Authority”
(CNA), which assigns a CVE number to the vulnerability. In most cases, the CNA
is a large software developer – Oracle, Red Hat, Microsoft, HPE, Schneider
Electric, etc. Some CNAs just report vulnerabilities discovered in their own
software. Others, like Red Hat and GitHub (a division of Microsoft), advertise
that they will help other developers (within a certain scope, like “open source
projects” or a particular industry or country) create CVE reports for
vulnerabilities they’ve discovered in their products.
A developer that isn’t a CNA but wants to report a
vulnerability in one of their products can contact a CNA that has them within
their advertised scope. And if a developer can’t find a CNA that seems likely
to be able to help them, they can contact MITRE itself, which is the “CNA of
Last Resort” (CISA is the CNA of Last Resort for Industrial Control Systems and
medical devices).
Of course, the CVE Report doesn’t just describe a
vulnerability. It always needs to point to at least one product (software or an
intelligent device) that is subject to the vulnerability. In at least 80% of
cases, the product in the report was developed by the CNA that created the
report.
There are two ways in which the product subject to the CVE
can be referred to in the report. The default is always a textual description,
e.g. “Cisco Crosswork Network Controller version 3.0.0” – and it’s safe to say
that every CVE report includes such a textual description of the product.
However, a user searching for vulnerabilities in a product they utilize will
almost never be able to find the product in a vulnerability database like the
NVD simply by searching on a text description, because there are many ways in
which the product can be identified textually. To use the above example, that
product might be described as “Cisco Crosswork Network Controller v3.0.0”,
“Cisco, Inc. Crosswork Network Controller version 3.0.0”, “Cisco Crosswork
Network Controller version 3.0”, etc. None of these would find a match if
entered in the NVD.
This is why there should always be a machine-readable
software identifier on the CVE report; a user that knows the identifier for a
product can search for it in a vulnerability database like the NVD by entering
that identifier. Currently, the only identifier supported by the NVD is the CPE
name. If the user enters the correct CPE name for the product, the search
result will either describe any vulnerabilities to which the product is subject
or return a null result, which the user can trust to be an indication that no
vulnerabilities have been reported for that product.[i]
If they don’t know the correct CPE name for the product (and, unlike the purl
identifier, the CPE name can’t be definitively predicted from information
available to the user), they’re out of luck.
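To make this concrete, here is a minimal Python sketch of what a CPE name looks like and why a near miss finds nothing. It assumes the CPE 2.3 "formatted string" layout (eleven colon-separated attributes after the "cpe:2.3" prefix) and ignores escaped colons. The two CPE names, the placeholder CVE entry and the tiny in-memory "database" are all made up for illustration; they are not real NVD records, and the NVD's actual search logic is certainly more involved than a dictionary lookup.

```python
CPE_FIELDS = ("part", "vendor", "product", "version", "update", "edition",
              "language", "sw_edition", "target_sw", "target_hw", "other")

def parse_cpe23(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into its named attributes.

    Simplification: assumes no attribute contains an escaped colon.
    """
    parts = cpe.split(":")
    if len(parts) != 13 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE_FIELDS, parts[2:]))

# Hypothetical: the name that ended up in the database ...
RECORDED = "cpe:2.3:a:cisco:crosswork_network_controller:3.0.0:*:*:*:*:*:*:*"
# ... and a perfectly reasonable guess by a user.
GUESS = "cpe:2.3:a:cisco_systems:crosswork_network_controller:3.0.0:*:*:*:*:*:*:*"

# Toy stand-in for a vulnerability database keyed by exact CPE name.
DATABASE = {RECORDED: ["CVE-PLACEHOLDER"]}

def lookup(cpe: str) -> list:
    """Exact-match lookup; an unknown name returns an empty list."""
    return DATABASE.get(cpe, [])

if __name__ == "__main__":
    recorded, guess = parse_cpe23(RECORDED), parse_cpe23(GUESS)
    differing = [f for f in CPE_FIELDS if recorded[f] != guess[f]]
    print("fields that differ:", differing)
    print("lookup with recorded name:", lookup(RECORDED))
    print("lookup with guessed name:", lookup(GUESS))
```

The only difference between the two names is the vendor field, yet the guessed name returns an empty result that looks exactly like "no vulnerabilities reported".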
When the CNA creates the CVE report, they should include a
CPE name for the product or products affected by the vulnerability. However, in
the past the CNAs have often not done that. One reason for this is that the CNA
may not feel comfortable creating the CPE name, because the specification isn’t easy to
understand. Another reason is that the NVD, when they receive the CVE report
from CVE.org, is supposed to “enrich” it with information that they provide;
one of those pieces of information is the CPE name. In many cases, if a CNA
included a CPE name in a CVE report, it was overwritten when the NVD enriched
the report (this also has happened a lot with the CVSS score). The result was
that, even when the CNA included a CPE name in the report, the CPE name in the
NVD was the one that a NIST employee had created, not the one the CNA had provided.
Of course, as long as a user can learn the CPE name in the
NVD (perhaps through the vendor of the product), this isn’t a terrible
situation. However, on February 12, 2024, the NVD abruptly reduced the number
of CVE reports that they enriched to almost zero; while this has recovered to
some degree, it’s still far below where it should be[ii].
Even that wouldn’t be a terrible problem if the CNAs simply
created their own CPEs. CVE.org is pressing them to do that, and the five or
six largest CNAs (which account for the vast majority of CVE reports) are doing
this, at least for reports of vulnerabilities in their own products. The
problem is that most of the CNAs aren’t including CPE names in their CVE
reports. This makes the reports unusable in most widely-used applications,
since they all require the ability to automatically find a product in the NVD;
manual searches are close to useless.
We discussed this issue in the meeting of the OWASP
Vulnerability Database Working Group this week. The Directors of Product
Security of two of the largest software developers in the US (both large CNAs)
were in the meeting, and both pointed to a big reason why many CNAs aren’t
including CPE names in their reports: since the NIST people who enrich the CVE
reports almost always must choose one of many different vendor names
(Microsoft, Microsoft Inc., Microsoft Europe, etc.), product names (Microsoft
Word, Microsoft Office Word, Word, etc.), and more, there is no way up front to
know for certain what choices they’ll make. If the CNA enters the CPE name it believes
is appropriate, the NVD staff may override that with their own CPE name (and
this has happened a lot in the past).
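A rough sketch of how quickly the possibilities multiply: the Python below simply combines a few vendor and product spellings into candidate CPE names. The spellings are my own lowercase adaptations of the examples above and the version string is invented; none of them is taken from the NVD.

```python
from itertools import product

# Hypothetical spellings, adapted from the examples in the text.
VENDORS = ["microsoft", "microsoft_inc", "microsoft_europe"]
PRODUCTS = ["word", "microsoft_word", "microsoft_office_word"]

def candidate_cpes(vendors, products, version):
    """Return every vendor/product combination as a CPE 2.3 string."""
    return [
        f"cpe:2.3:a:{vendor}:{name}:{version}:*:*:*:*:*:*:*"
        for vendor, name in product(vendors, products)
    ]

if __name__ == "__main__":
    candidates = candidate_cpes(VENDORS, PRODUCTS, "1.0")  # invented version
    print(len(candidates), "plausible names for one product version")
    for cpe in candidates:
        print(cpe)
```

With just three spellings for each of two fields, there are already nine plausible names, and a CNA has no way to know up front which one will end up in the NVD.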
These two large CNAs (and many other people, of course)
would like to learn what rules the NVD staff members follow when they create
CPE names, so they can make sure their staff members follow those same rules
when they create CPE names for CVE reports. However, nobody has been able to
get that information from the NVD (my guess is this is because the NVD doesn’t
have rules to follow, but won’t admit that).
Unfortunately, there’s probably no near-term solution to
this problem, except for CVE.org to provide training to the CNAs on how they
should be creating CPE names, and to hope the NVD doesn’t suddenly start creating
their own CPE names again.
However, given that there’s no definitive way to identify
values for the fields included in a CPE name (vendor name, product name, etc.),
there will never be a real solution to this problem as long as CPE is the only
option for naming software in the NVD. The ultimate solution to this problem is
to take advantage of the fact that the new CVE version 5.1 specification
(formerly the “CVE JSON specification”) includes the capability to utilize purl identifiers.
If a CNA adds a purl identifier to the CVE report (they
probably have to include a CPE also), and if the vulnerability database
supports purl, a user will be able to find recent vulnerabilities for a product
by searching on its purl. The NVD doesn’t support purl now and won’t anytime
soon, for sure. Of course, CVE.org should be supporting it now, although there
probably aren’t any purls in that database yet. The purl should always be
predictable, based on information the user should have already: the location
from which they downloaded the package (e.g. Maven Central), the name of the
package in that location, and the version string for the package in that
location (the SBOM Forum white paper goes into a lot of depth in explaining why
this is the case).
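As a minimal illustration of that predictability, the Python sketch below assembles a purl from the three pieces of information just listed, using the general purl layout pkg:type/namespace/name@version. It leaves out qualifiers, subpaths and the type-specific normalization rules in the purl specification, and the function name is my own.

```python
from urllib.parse import quote

def build_purl(pkg_type: str, namespace: str, name: str, version: str) -> str:
    """Assemble pkg:type/namespace/name@version (no qualifiers or subpath).

    Simplification: percent-encodes each segment but applies none of the
    per-type normalization rules from the purl specification.
    """
    segments = [quote(seg, safe="") for seg in namespace.split("/") if seg]
    purl = f"pkg:{pkg_type.lower()}/"
    if segments:
        purl += "/".join(segments) + "/"
    return purl + f"{quote(name, safe='')}@{quote(version, safe='')}"

if __name__ == "__main__":
    # Download location: Maven Central; group ID, artifact ID and version
    # are what the user already sees when they download the package.
    print(build_purl("maven", "org.apache.logging.log4j", "log4j-core", "2.14.1"))
```

Two users who downloaded the same package from the same place will build the same string, without having to look anything up first.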
As I discussed in this post
last year, purl is already used to identify software in almost every
vulnerability database in the world that isn’t based on CPE (the CPE-based
databases being the NVD and the databases built on it). However, currently,
purl can only be used to find open source software packages in vulnerability
databases, not proprietary (“closed source”) products.
In the SBOM Forum’s paper, we described a scheme – based on
a suggestion from Steve Springett, the leader of the OWASP CycloneDX and
Dependency Track projects and also a purl maintainer – in which a software
developer will create a SWID tag
for each new product and version and make that tag available with the binaries.
As we were writing the paper, Steve got a new purl type added (each download
location, usually a package manager, has its own purl type), called SWID. If a
user has the SWID tag for a product and wants to find out about vulnerabilities in
it, they will be able to create a purl using just 3 or 4 fields from the SWID
tag (the SWID spec supports about 80 fields, but only a few of them are needed
to create the purl).
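Here is a rough Python sketch of that step. The SWID tag is invented, and the mapping of fields (tag creator name and regid as the namespace, the tag's name and version, and the tagId as a "tag_id" qualifier) reflects my reading of the SWID purl type definition; please check the purl specification before relying on it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# XML namespace used by ISO/IEC 19770-2:2015 SWID tags.
SWID_NS = "{http://standards.iso.org/iso/19770/-2/2015/schema.xsd}"

# Invented tag for a fictional product; not a real vendor's tag.
EXAMPLE_TAG = """<SoftwareIdentity
    xmlns="http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
    name="Enterprise Server" version="1.0.0"
    tagId="75b8c285-fa7b-485b-b199-4745e3004d0d">
  <Entity name="Acme" regid="example.com" role="tagCreator softwareCreator"/>
</SoftwareIdentity>"""

def swid_tag_to_purl(tag_xml: str) -> str:
    """Build a swid-type purl from a handful of SWID tag fields.

    Assumes the tag has exactly the attributes used below and one Entity
    with the tagCreator role; a real tool would need error handling.
    """
    root = ET.fromstring(tag_xml)
    creator = next(
        entity for entity in root.findall(f"{SWID_NS}Entity")
        if "tagCreator" in entity.get("role", "").split()
    )
    namespace = "/".join(
        quote(part, safe="")
        for part in (creator.get("name"), creator.get("regid"))
        if part
    )
    name = quote(root.get("name"), safe="")
    version = quote(root.get("version"), safe="")
    tag_id = quote(root.get("tagId"), safe="")
    return f"pkg:swid/{namespace}/{name}@{version}?tag_id={tag_id}"

if __name__ == "__main__":
    print(swid_tag_to_purl(EXAMPLE_TAG))
```

Only four values are read from the tag (the creator's name and regid count as one Entity), which is the point: the rest of the SWID fields are not needed for identification.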
If the CNA that created the CVE report (say it’s for
CVE-2024-12345) included a purl in the report for one of their proprietary
products, they presumably based it on the SWID tag (since they probably work
for the developer that created both the product and its tag). Thus, the purl
the user enters in their search (which was developed using the SWID tag they found
on the developer’s website) should always match the purl associated with the
CVE number. This means the user should always be able to find out that their
product is vulnerable to CVE-2024-12345. This sort of certainty is never
possible with CPE.
The big fly in the ointment currently is that what I’ve just
described only applies to new software products or versions, not to existing or
legacy ones. There needs to be some mechanism by which a user of a legacy
product version can find a SWID tag for their version as well. The good news is
that it shouldn’t be hard to create such a mechanism. For example, I’ve
suggested that a software supplier could have a known location on their website
called maybe “SWID.txt”. It would provide a list of products and versions,
along with a SWID tag for each. A tool could search on the product and version
and find the SWID tag; using that, the tool could create the purl for the
product/version.[iii]
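Since no format has been defined for such a file, the Python sketch below invents one purely for illustration: one line per product version, with the product name, the version and the location of the SWID tag separated by "|". The file contents, the URLs and the function name are all hypothetical.

```python
from typing import Optional

# Hypothetical "SWID.txt" contents; the format and every value are invented.
SAMPLE_SWID_TXT = """\
# product | version | location of SWID tag
Enterprise Server | 1.0.0 | https://example.com/swid/enterprise-server-1.0.0.swidtag
Enterprise Server | 1.1.0 | https://example.com/swid/enterprise-server-1.1.0.swidtag
"""

def find_swid_tag(listing: str, product: str, version: str) -> Optional[str]:
    """Return the SWID tag location for a product version, or None."""
    for line in listing.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = [field.strip() for field in line.split("|")]
        if len(fields) != 3:
            continue  # ignore lines that don't fit the assumed format
        name, listed_version, location = fields
        if name.casefold() == product.casefold() and listed_version == version:
            return location
    return None

if __name__ == "__main__":
    print(find_swid_tag(SAMPLE_SWID_TXT, "enterprise server", "1.1.0"))
    print(find_swid_tag(SAMPLE_SWID_TXT, "Enterprise Server", "2.0.0"))
```

A tool would then fetch the tag at the returned location and derive the purl from it; a missing entry returns None rather than a guess.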
Of course, there would be other ways to make the SWID tag
information available to users of legacy products and versions. In fact,
there’s no reason why SWID tags even need to be used for this purpose. There
just needs to be a way for the supplier to make information required to
identify their products available to users of both current and legacy products.
There are lots of ways this could be done.
I would love to see the OWASP Vulnerability Database working
group address this task, but currently it’s beyond our means. If your
organization might be interested in supporting this work through man (or woman)
power or a donation to OWASP (or both), please drop me an email.
Any opinions expressed in this
blog post are strictly mine and are not necessarily shared by any of the
clients of Tom Alrich LLC. If you would like to comment on what you have
read here, I would love to hear from you. Please email me at tom@tomalrich.com.
I lead the OWASP SBOM Forum, which works to understand and address issues like what’s
discussed in this post; please email me to learn more about what we do or to
join us. You can also support our work through easy directed donations to OWASP,
a 501(c)(3) nonprofit. Email me to discuss that.
My book "Introduction to SBOM and VEX" is now available in paperback and Kindle versions! For background on the book and the link to order it, see this post.
[i] Unfortunately, because of deficiencies in the NVD, a null result for a vulnerability search can mean many other things, such as that the user unknowingly fat-fingered the CPE name. This and other problems with CPE are described on pages 4-6 of the OWASP SBOM Forum’s 2022 white paper linked above.
[ii] CISA has tried to help out with their Vulnrichment program, but that only addresses a small fraction of the non-enriched records. In addition, some CNAs report that there are problems with CISA’s work.
[iii] In fact, this could be simplified if the supplier listed the purl along with the SWID tag, since there should always be a one-to-one correspondence between the two.