Wednesday, May 7, 2025

The importance of being negative

The future of the CVE Program has been discussed a lot lately due to funding problems; I have contributed to those discussions. However, most of my posts that have mentioned the program in the last six months or so have been about software identification – that is, the importance of being able to accurately identify, using a machine-readable identifier, the software product or products that are affected by a vulnerability described in a CVE Record.

Before February 2024, there wasn’t a lot of discussion about the merits of different software identifiers, since the two most widely used identifiers – CPE and purl – each had their own well-understood place in the cybersecurity world. CPE reigned as king of the NVD and the other databases that are built on the NVD; on the other hand, purl was the king (queen?) of the open source world. While the NVD – which is firmly in the CPE camp – tracks vulnerabilities in both open source and commercial software, it isn’t the preferred source for the former, while it’s almost the only source for the latter.

However, in February 2024 the king stumbled. NVD staff (really contractors) drastically cut back, and at times completely stopped, their most important task: adding CPE names to new CVE Records. The records always include a textual description of the affected products, but there’s no good way to search these descriptions. By contrast, if the record for CVE-2025-12345 includes a machine-readable CPE software identifier (the only identifier currently used in CVE Records), and an NVD user searches for vulnerabilities using the identical CPE name, CVE-2025-12345 should always show up in the results.

What happens if the CPE name that’s searched for is just a little bit different from the CPE name that’s included in the CVE Record? In that case, it’s likely that no vulnerabilities will be shown to the user. What will be shown? Every NVD user’s favorite message: “There are 0 matching records.”

Will the user be crestfallen when they see this message? Not necessarily. In fact, they might be pleased, since that is the same message they will receive if there are no vulnerabilities listed at all for the product they’re searching for. In other words, the same message might mean both “This product has lots of vulnerabilities that affect it, but you need to keep guessing the CPE name in order to learn about them” and “The product you’re searching for has no reported vulnerabilities.”

The main problem with CPE is that there are lots of ways that the CPE name a user searches for might not match the CPE name that is included in the CVE Record. This is because there is no way to know exactly how the NVD contractor who created the CPE name filled in the various fields. Examples of such mismatches include:

·        The vendor name in the CVE Record is “Microsoft Inc”, but the user searches for “Microsoft Inc.” (i.e., with a period) and finds nothing.

·        The product name in the CVE Record includes the text “mail integration”. The user searches for “mailintegration” and finds nothing.

·        The vendor name is “apache foundation”. The user searches for “apache_foundation” and finds nothing.
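To make these near misses concrete, here is a minimal Python sketch of an exact-match lookup. It is a toy, not the NVD’s actual search logic: the index contents, the CVE IDs and the function names search and report are all hypothetical placeholders I made up for illustration.

```python
# Toy index keyed by the exact strings a record author typed.
# Every vendor, product and CVE ID below is a hypothetical placeholder.
TOY_INDEX = {
    ("Microsoft Inc", "mail integration"): ["CVE-0000-0001", "CVE-0000-0002"],
    ("apache foundation", "example server"): ["CVE-0000-0003"],
}


def search(vendor, product):
    """Exact-match lookup: any difference in spelling returns an empty list."""
    return TOY_INDEX.get((vendor, product), [])


def report(vendor, product):
    """Format the result the way a user would see it."""
    return f"There are {len(search(vendor, product))} matching records."


# A correct guess, a near miss, and a product that truly has no records.
print(report("Microsoft Inc", "mail integration"))
print(report("Microsoft Inc.", "mail integration"))
print(report("Some Other Vendor", "some product"))
```

The last two calls print exactly the same message, even though only one of them describes a product with no reported vulnerabilities. Nothing in the output tells the user which case they are in.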

The big problem with these near misses isn’t just that the user won’t learn about a vulnerability that might apply to the product they’re searching for. More importantly, they won’t learn whether the search didn’t return results because there in fact aren’t any applicable vulnerabilities, or because they searched using the wrong character string. Humans being inherently optimistic, people are much more likely to apply the former interpretation.

To produce this blog, I rely on support from people like you. If you appreciate my posts, please make that known by donating here. Any amount is welcome. Thanks!

To use the first example, there are two CPE names in the NVD for which the vendor field is “Microsoft Inc”. Suppose that each of these CPEs appears in three CVE Records. A user who searches for “Microsoft Inc” will learn about those six CVEs. However, if a user enters “Microsoft Inc.”, they will see the message, “There are 0 matching records.” Rather than trying “Microsoft Inc” as well, the user may assume there are no vulnerabilities that apply to products sold by an entity with “Microsoft” and “Inc” in its name, no matter what other punctuation the CPE name might contain.

It’s annoying that the NVD makes it so easy for a user to be misled about whether a product is vulnerable. However, it’s much more serious that CPE’s quirks prevent a user from ever being able to make a clear statement that there are no vulnerabilities that apply to a particular product. This is because the user can never be sure whether the message “There are 0 matching records” means they guessed the wrong CPE name or whether it means the product truly has no reported vulnerabilities.

The purl identifier eliminates this ambiguity. Today, purl is mostly used to identify open source software packages distributed through package managers like Maven Central and npm. For example, the purl “pkg:pypi/django@1.11.1” refers to the package django version 1.11.1, which is found in the PyPI package manager.[i]
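For readers who want to see how the pieces of that purl fit together, here is a rough Python sketch that splits a simple purl into its parts. The names SimplePurl and parse_simple_purl are my own, and the sketch deliberately handles only the type, optional namespace, name and version; it rejects qualifiers and subpaths and does no percent-decoding, all of which the full purl specification covers.

```python
from typing import NamedTuple, Optional


class SimplePurl(NamedTuple):
    type: str                 # package ecosystem, e.g. "pypi"
    namespace: Optional[str]  # e.g. a Maven group ID; often absent
    name: str
    version: Optional[str]


def parse_simple_purl(purl: str) -> SimplePurl:
    """Parse the subset 'pkg:type/[namespace/]name[@version]' (sketch only)."""
    scheme, sep, rest = purl.partition(":")
    if scheme != "pkg" or not sep:
        raise ValueError("a purl must start with 'pkg:'")
    if "?" in rest or "#" in rest:
        raise ValueError("qualifiers and subpaths are outside this sketch")
    # The version, if present, follows the last '@'.
    if "@" in rest:
        path, version = rest.rsplit("@", 1)
    else:
        path, version = rest, ""
    segments = [s for s in path.strip("/").split("/") if s]
    if len(segments) < 2:
        raise ValueError("a purl needs at least a type and a name")
    namespace = "/".join(segments[1:-1]) or None
    return SimplePurl(segments[0].lower(), namespace, segments[-1], version or None)


parsed = parse_simple_purl("pkg:pypi/django@1.11.1")
print(parsed.type, parsed.name, parsed.version)
```

Every field comes straight from information the package registry already publishes, which is why nobody has to guess how it was filled in.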

If someone wishes to verify that this is the correct purl for that package, they can always do so using a simple search in pypi.org. They should never need to guess a purl name, nor to look it up in a central database (like the CPE “Dictionary” found on the NVD website).[ii]
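The reason no guessing is needed is that a purl can be built mechanically from what the registry itself shows. Here is a small Python sketch for PyPI. The helper name build_pypi_purl is hypothetical, and the normalization rule (lowercase the name and replace underscores with dashes) is my understanding of the purl type definition for PyPI, so check it against the specification before relying on it. The sketch also skips percent-encoding of the version.

```python
def build_pypi_purl(name: str, version: str) -> str:
    """Build a purl from a PyPI project name and version (illustrative helper)."""
    # Assumption: the pypi purl type lowercases the name and replaces '_' with '-'.
    normalized = name.strip().lower().replace("_", "-")
    return f"pkg:pypi/{normalized}@{version.strip()}"


# Different spellings of the same project converge on one identifier.
print(build_pypi_purl("Django", "1.11.1"))
print(build_pypi_purl("django", "1.11.1"))
```

Contrast this with the CPE examples above, where a period or an underscore was enough to make two people produce different names for the same product.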

This means that, if a user searches a vulnerability database like OSS Index using a verified purl and finds no vulnerabilities, they can be reasonably[iii] sure there have been no vulnerabilities reported to CVE.org for the package in question. In vulnerability management, the danger posed by false negative findings is much greater than that posed by false positives.

If you receive a false positive finding from a vulnerability database, the biggest problem is that you’re likely to perform work (patching) that was unnecessary. However, if you receive a false negative finding, you won’t learn about vulnerabilities that might come to bite you. Even worse, you won’t usually know about this problem.

If CPE is the only identifier available to CVE Numbering Authorities (CNAs) when they create new CVE records (as is the case today), a user is much more likely to receive a false negative finding from the NVD, and less likely to know they’re receiving it, than if the CNA can alternatively utilize purls in CVE records.

Fortunately, CVE.org is now moving toward adding purl as an alternative software identifier in CVE records, although other things need to be in place before purl can be an “equal partner” to CPE. For one thing, there need to be vulnerability databases that can properly ingest CVE records that include a purl. Fortunately, I’m sure there are at least one or two databases that should be able to do that soon after the new records become available.

This points to the need for an end-to-end proof of concept for purl in what I call the “CVE ecosystem”. The PoC would start with CNAs including purls in CVE records for open source software products and end with users searching for – and hopefully finding – those CVEs in vulnerability databases. If you would like to participate in or support an OWASP project to do this, please email me.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com. And while you’re at it, please donate as well!


[i] Technically, PyPI is a package registry, not a package manager. However, its function in purl would be the same if it were a package manager.

[ii] That being said, a central database of purls is also being developed. This is to make it easier for people who aren’t very familiar with the purl syntax, especially when more than a few fields are required. There are also free services that will create a purl based on the user’s inputs.

[iii] I say “reasonably”, since it’s possible that the “same” vulnerability has been reported to CVE.org, yet the CPE name that was created for it didn’t have enough information to create a purl. Unfortunately, this is a common occurrence, since CPE has no field for a package manager name (sometimes the person who creates the CPE name includes the package manager in the product name, but that is obviously not part of the CPE specification). In a case like this one, OSS Index (and perhaps other open source vulnerability databases) would probably search for the package in various package managers, but success would not be certain.

Of course, this is just one example of why it is important to include purl as an optional software identifier in the CVE Program.
