Thursday, February 13, 2025

Better living through purl!

 

The CVE Program is getting ready to consider adopting purl as an alternate software identifier to CPE in CVE Records. If this goes through, software users will be able to use purl to look up open source software products and components that are affected by CVEs. They will be able to do this in several major vulnerability databases, and perhaps later they will be able to do this in the NVD.

However, end users aren’t the only organizations that will benefit from purl being used in the “CVE ecosystem”; in fact, they’re not even the biggest beneficiaries. Here are what I believe are the most important groups who will benefit and why:

1. Software developers that utilize open source components in their products. Unlike CPE, purl currently focuses on just one type of software: open source software distributed by package managers. 90% of the code in most software products today, whether open source or proprietary, consists of open source components. Therefore, it’s important that developers be able to learn about vulnerabilities in those components.

To look up an open source component in a vulnerability database, the developer needs to know the identifier for the product. If the developer wants to use the National Vulnerability Database (NVD), they will first need to search for the CPE name using the CPE Search function in the NVD. However, finding the correct version of the component is challenging. For example, here is the search result for the popular Python product “django-anymail”. You will have to figure out which of the 34 CPE names is the one you need.

On the other hand, if the developer wants to learn the purl for django-anymail, they don’t need to look anything up in an external database. Instead, they just need to know three pieces of information, which they presumably already have:

1.      The purl type, which is based on the package index, PyPI;

2.      The package name, django-anymail; and

3.      The (optional) version string, e.g. 1.11.1.

Using these three pieces of information, the developer can easily create the purl: “pkg:pypi/django-anymail@1.11.1” (“pkg” is the prefix to every purl). Note that no database lookup was required!
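For readers who like to see this in code, here is a minimal Python sketch of how those three pieces of information combine into a purl. The function name make_purl is my own invention for illustration, and the sketch handles only the type, name and version; it ignores the optional namespace, qualifiers and subpath parts of a purl. In practice, a maintained purl library would be a safer choice.

```python
from urllib.parse import quote


def make_purl(purl_type, name, version=None):
    """Build a minimal purl from a type, a package name and an optional version.

    Hypothetical helper for illustration only: it does not handle
    namespaces, qualifiers or subpaths.
    """
    purl_type = purl_type.lower()  # the purl type is written in lowercase
    if purl_type == "pypi":
        # For PyPI, purl names are lowercased and underscores become dashes.
        name = name.lower().replace("_", "-")
    purl = f"pkg:{purl_type}/{quote(name, safe='')}"
    if version:
        purl += f"@{quote(version, safe='')}"
    return purl


print(make_purl("pypi", "django-anymail", "1.11.1"))
# prints: pkg:pypi/django-anymail@1.11.1
```

Anyone who knows the package index, the package name and the version will arrive at the same string, which is the whole point.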

2. CVE Numbering Authorities (CNAs). As described in this post, a CNA is an organization that reports software vulnerabilities to CVE.org in the form of CVE Records. Today, the only software identifier found in CVE Records is CPE. Unfortunately, more than half the CVE Records created last year – about 22,000 out of 39,000 – don’t contain a CPE and thus can’t be found with simple searches.

However, if purl were also supported in CVE Records[i], the CNA could create the purl for an open source product, exactly as the developer in the earlier example would create it. Meanwhile, a user searching a vulnerability database for the product could create the same purl. Barring a mistake, the user should always be able to locate the same product and thus learn of any vulnerabilities reported for it.

CNAs that report vulnerabilities for open source software will benefit from two other features that are unique to purl:

a.      Every module in an open source library can have its own purl, whereas a CPE can identify only the library itself. If a vulnerability is found in only one module of a library, it would be much better for the CNA to report the vulnerability as applicable just to that module. This is because the developer will often include in their product just the module(s) that are required, rather than the whole library. If the CNA reports just the module as vulnerable, a developer that didn’t include that module in their product can (perhaps proudly) announce to their customers that the vulnerability doesn’t affect their product and no customer needs to patch it.[ii]

b.      Similarly, if a product is found in multiple package managers but the CVE Record includes the purl for only one of them, only the package from that package manager is reported as affected, so there’s no need to patch the product obtained from the other package managers. However, because CPE doesn’t have a field for a package manager, a CVE Record that contains only a CPE can’t convey this, and most users won’t learn of it. Again, this means there will often be wasted patching effort.
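To make point (b) concrete, here is a rough Python sketch of how a tool might compare the components in a product against the purls listed in a CVE Record. Everything here is hypothetical: the package name example-lib, the helper names package_key and is_affected, and the idea that the record lists purls without versions. The sketch also ignores version ranges entirely.

```python
def package_key(purl):
    """Reduce a purl to its 'pkg:type/name' part.

    Hypothetical helper: it simply cuts off the subpath ('#'),
    the qualifiers ('?') and the version ('@'), in that order.
    """
    for separator in ("#", "?", "@"):
        purl = purl.split(separator, 1)[0]
    return purl


def is_affected(component_purl, affected_purls):
    """Return True if the component's type and name appear in the record."""
    affected_keys = {package_key(p) for p in affected_purls}
    return package_key(component_purl) in affected_keys


# Hypothetical CVE Record that names only the npm package.
affected = {"pkg:npm/example-lib"}

print(is_affected("pkg:npm/example-lib@1.0.0", affected))   # True
print(is_affected("pkg:pypi/example-lib@1.0.0", affected))  # False
```

Because the purl type is part of the identifier, the package with the same name in a different package manager is not flagged. A CPE has no equivalent field, so a tool matching on CPE alone could not make this distinction.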

Another advantage that CNAs cite for purl is the fact that a purl doesn’t have to be created by any central authority. Every product available in a package manager (or another repository that has a purl “type”) already has a purl, even if nobody has written it down yet. By contrast, CPEs must be created by a NIST contractor that works for the NVD. They often take days to create (if they are created at all). This delays the CNA in submitting a new CVE Record.

3. End users. Of course, this means the ultimate consumers of vulnerability information, even if they receive it through some sort of service provider. Their primary concern is completeness of the data. That is, they need to know about all the vulnerabilities that affect the software products they use. Of course, there’s hardly any end user today that doesn’t have about a year’s backlog of patches to apply, so it’s not like they are breathlessly anticipating more vulnerabilities.

Any organization with such a backlog owes it to themselves to learn about every newly released vulnerability. They (or perhaps a service provider acting on their behalf) need to feed all those vulnerabilities into some algorithm that prioritizes patches to apply, based on a combination of a) attributes of those vulnerabilities like CVSS or EPSS score and b) attributes of the assets on which the affected software product resides, such as proximity to the internet or criticality for the business.
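What might such an algorithm look like? The Python sketch below is only one possibility, and every detail in it is an assumption of mine: the Finding fields, the equal weighting of CVSS and EPSS, the 1.5 multipliers and the placeholder CVE IDs are all made up for illustration. A real prioritization scheme would need to be tuned to the organization.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cve_id: str
    cvss: float              # CVSS base score, 0.0 to 10.0
    epss: float              # EPSS probability, 0.0 to 1.0
    internet_facing: bool    # asset attribute: proximity to the internet
    business_critical: bool  # asset attribute: criticality for the business


def priority(finding):
    """Combine vulnerability and asset attributes into one score.

    The weights and multipliers are arbitrary, illustrative values.
    """
    score = 0.5 * (finding.cvss / 10.0) + 0.5 * finding.epss
    if finding.internet_facing:
        score *= 1.5
    if finding.business_critical:
        score *= 1.5
    return score


# Placeholder findings, not real CVEs.
findings = [
    Finding("CVE-0000-0001", cvss=9.8, epss=0.02,
            internet_facing=False, business_critical=False),
    Finding("CVE-0000-0002", cvss=7.5, epss=0.90,
            internet_facing=True, business_critical=True),
]

# Highest priority first.
ranked = sorted(findings, key=priority, reverse=True)
for finding in ranked:
    print(finding.cve_id, round(priority(finding), 3))
```

In this made-up example, the second finding ranks first despite its lower CVSS score, because it is more likely to be exploited and sits on an exposed, critical asset. Of course, a finding the organization never learns about can't be ranked at all.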

CVE is of course the most widely followed vulnerability identifier worldwide. However, just learning about a CVE doesn’t do an organization much good unless they also learn which product or products are affected by the vulnerability, and whether the organization utilizes any of those products. Because of this, CVE Records always identify the affected product(s). When the CNA creates a new CVE Record, they do this in textual form.

After the CNA creates a new CVE Record, they submit it to the CVE.org database. From there, the NVD downloads all new CVE Records. Up until last year, NVD staff members (or contractors working for the NVD, which is part of NIST) added one or more CPE names to almost every CVE Record. They do (or did) this because, with over 250,000 CVE Records today, it’s impossible to learn the products affected by each CVE simply by browsing through the text included in the record. There needs to be a machine-readable identifier like CPE or purl to search on.

But here’s the problem (already referred to above): Almost exactly a year ago, the NVD drastically reduced the number of CVE Records into which they inserted CPE names, with the result that as of the end of the year, fewer than half of new CVE Records contained a CPE name. The problem seems to have continued into this year. A CVE Record that doesn’t include a machine-readable software identifier will be invisible to an automated search using a CPE name. In other words, if someone searches using the CPE name for a product they utilize, the search will miss any CVE Record that describes the same product in a text field, if the record doesn’t include the product’s CPE. Moreover, the person searching won’t have a way to learn about the missing CPE names.

Because of this problem, end users can’t count on being able to learn about all CVEs that affect one of their products. If purl were implemented as an alternative identifier in CVE Records, and if the CNA had included a purl in a CVE Record for the product, then a search using that purl would point out that the product was affected by the CVE. Implementing purl is needed for ensuring completeness of the vulnerability data, at least for open source software (and open source components in proprietary software).
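The following Python sketch illustrates the completeness problem. The records are simplified dictionaries that I made up; they are not in the actual CVE Record Format, and the CVE IDs, the CPE name, the purl and the search function are all hypothetical.

```python
# Two made-up records for the same hypothetical product. The second one
# describes the product only in text and in a purl; no CPE was ever added.
records = [
    {
        "id": "CVE-0000-0001",
        "description": "Flaw in example-lib 1.0.0",
        "cpes": ["cpe:2.3:a:example_vendor:example-lib:1.0.0:*:*:*:*:*:*:*"],
        "purls": ["pkg:pypi/example-lib@1.0.0"],
    },
    {
        "id": "CVE-0000-0002",
        "description": "Another flaw in example-lib 1.0.0",
        "cpes": [],
        "purls": ["pkg:pypi/example-lib@1.0.0"],
    },
]


def search(records, field, identifier):
    """Return the IDs of records whose identifier list contains an exact match."""
    return [r["id"] for r in records if identifier in r.get(field, [])]


cpe = "cpe:2.3:a:example_vendor:example-lib:1.0.0:*:*:*:*:*:*:*"
purl = "pkg:pypi/example-lib@1.0.0"

print(search(records, "cpes", cpe))    # ['CVE-0000-0001']
print(search(records, "purls", purl))  # ['CVE-0000-0001', 'CVE-0000-0002']
```

The CPE search silently misses the second record, and nothing in its output tells the user that anything is missing. The purl search finds both, provided the CNA included the purl when creating the record.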

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com. Also email me if you would like to participate in the OWASP SBOM Forum or donate to it (through a directed donation to OWASP, a 501(c)(3) nonprofit organization).

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.


[i] Currently, the CVE Record Format includes a field for purl. However, it’s not being used at all today, mainly because there’s been no training or encouragement for the CNAs. That will hopefully change soon. 

[ii] In fact, this was the case with the log4j vulnerability CVE-2021-44228. It was reported as affecting the log4j library, but in fact it just affected the log4j-core module. Since a CPE for a library refers to the library itself, not one of the modules, this meant that in 2021-2022, a lot of time was wasted worldwide in patching every module in log4j.
