Monday, March 31, 2025

Here’s your chance to advise CVE.org!


I was pleased to see in my LinkedIn feed this morning a post from Alec Summers of MITRE containing a link to a “CVE Data Usage and Satisfaction Survey” which closes on April 4. I was even more pleased when I went to the survey and found it only asks non-wonky questions that should mostly be understandable by casual users of CVE information (which probably includes a large percentage of people in the worldwide cybersecurity community).

The survey is very well thought out, and I recommend you fill it out. I especially recommend that you indicate on questions 14, 16 and 19 that you wish to see purl implemented in the CVE Record Format. While purl is present in the format now, it seems that whoever implemented it treated purl as a format for expressing versions, like semver. Thus, even though someone can enter a purl in a CVE record today, it won’t be usable.

Speaking of purl, I’ve retitled and revised the post I put up a few days ago on the OWASP SBOM Forum’s proposal to enable implementation and use of purl in the CVE “ecosystem”. Please take a look at that.

 

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.

 

Friday, March 28, 2025

Is CVE.org on the chopping block?


I’ve been writing a lot about problems with the National Vulnerability Database (NVD) lately. It didn’t occur to me that there might be similar (although not directly related) problems at CVE.org, which many people still call MITRE. However, I’ve learned that may be the case – and the implications would be far more significant than those of the NVD’s problems. Since a lot of people don’t understand that these are two different organizations, I’ll provide a CliffsNotes™ summary of each.

CVE.org: Before 1999, software vulnerabilities were identified and classified differently in different communities, e.g., developers working with a particular language. This made it hard to have general discussions about vulnerabilities, since two people could never be sure they were discussing the same vulnerability.

To address this issue, two researchers from the nonprofit MITRE Corporation proposed a unified vulnerability identification scheme called CVE, which stands for “common vulnerabilities and exposures”. That led to the introduction of the first “CVE list” in 1999. CVE took off quickly, and soon vulnerability disclosures all over the world included CVE numbers.

CVE.org’s most important function is recruiting and managing the CVE Numbering Authorities (CNAs), of which there are 447 today (they include software developers like Microsoft, Oracle, Red Hat, Schneider Electric and HPE, as well as organizations like GitHub and ENISA). These are organizations that are authorized to assign new CVE numbers to vulnerabilities, then create a CVE Record for each vulnerability. The CVE Records are made available in the CVE.org database; they are also forwarded to the NVD and other vulnerability databases that are based on CVE.

CVE.org was formerly known just as MITRE, and many people continue to use that name today. This is not inaccurate, since I believe that 100% of the staff members of CVE.org are MITRE employees. However, those staff members now report to a nonprofit board composed of public and private sector representatives.

The NVD: The NVD also started operating in 1999, although not under that name – it was originally titled “ICAT”. In 2008, NVD initiated the CPE (Common Platform Enumeration) software identifier and started adding “CPE names” to CVE Records from MITRE. I’ve written many times about the shortcomings of CPE, starting with this OWASP SBOM Forum paper in 2022.

The most serious shortcoming recently is that, starting last February, the NVD drastically slowed the process of adding CPE names to new CVE Records, with the result that fewer than half of the CVE Records created in 2024 include a CPE name. Now, the NVD seems to have almost stopped adding them altogether. Since an automated search (e.g., using the search bar) for vulnerabilities that apply to a particular software product will not discover any CVE Record that lacks a CPE name, searches will miss more than half of the CVEs that might have been identified for that product since February 2024.
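The search gap described above can be sketched in a few lines. This is a minimal illustration, not the NVD's actual implementation; the records and CPE names are hypothetical.

```python
# Sketch: why a CVE record without a CPE name is invisible to a
# product search. All records and CPE names below are hypothetical.

def search_by_cpe(records, cpe_prefix):
    """Return CVE IDs whose record lists a CPE name starting with cpe_prefix."""
    hits = []
    for rec in records:
        if any(cpe.startswith(cpe_prefix) for cpe in rec.get("cpe_names", [])):
            hits.append(rec["cve_id"])
    return hits

records = [
    # Enriched record: it has a CPE name, so it is findable.
    {"cve_id": "CVE-2024-0001",
     "cpe_names": ["cpe:2.3:a:examplevendor:exampleproduct:1.0:*:*:*:*:*:*:*"]},
    # Unenriched record: the description text may name the product,
    # but with no CPE name an automated search can never match it.
    {"cve_id": "CVE-2024-0002", "cpe_names": []},
]

found = search_by_cpe(records, "cpe:2.3:a:examplevendor:exampleproduct")
# Only the enriched record is returned; CVE-2024-0002 is missed.
```

The second record is exactly the "invisible" case: nothing in the match logic ever sees the description text, so the product's users never learn about the CVE.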

To get back to the subject of this blog post, while I have written about the NVD’s problems extensively since they began more than a year ago, I have never even considered the possibility that we should be concerned about CVE.org. That is, until this week, when Steve Springett, leader of the OWASP Dependency-Track and CycloneDX projects, emailed me this link to a Reddit chat entitled “What is happening at MITRE?”

The chat can be summarized by saying that many people, both CNAs and bug researchers, have noticed that the time it takes MITRE (CVE.org) to respond to new vulnerability reports has increased substantially in recent months. The people in the chat identified various possible reasons why these slowdowns are occurring, but nobody pointed to some definitive event that might have caused them.

I’m not going to speculate on the reason for the slowdowns, either. However, I want to point out that if CVE.org goes into an extended slowdown like the one at the NVD, the consequences for the software security community, and especially for the vulnerability management community, will be much more serious.

In the NVD’s case, there are ways to get around the fact that a CVE Record doesn’t include a CPE name. One way is to start to implement an alternative software identifier in CVE Records. CPE will probably always be an option for CVE Records, but purl should be another option. In fact, I am proposing a project to accomplish exactly that: make it possible for purls to be included in CVE records and utilized for lookups in CVE-based vulnerability databases.

But what will happen if CVE.org stops performing its most essential function: facilitating the production of vulnerability reports (CVE Records) by CNAs? There is literally no other way that those vulnerabilities will be made public, at least in a standardized manner (i.e., using the CVE Record Format). Of course, software developers will continue to report vulnerabilities to the users of their products, but since those reports aren’t standardized, we’ll be back in the same position as in 1999: there will be many different vulnerability reporting formats, but no common one. Truly automated software vulnerability management will once again be impossible.

What can we do to prevent this from happening? Since nobody in the chat could point to a definite cause of the problem, there’s no way to identify a fix for it. However, I want to point to a general cause, which might show us what the long-term fix is.

The general cause can be summarized by saying, “This is what can happen when government agencies, subject to political control, are responsible for carrying out vital technical functions.” Governments change regularly. Since each government is sovereign and guards its privileges jealously, there’s simply no way to guarantee that any practice or rule that a government agency puts in place today will survive tomorrow, let alone when the next administration takes over and imposes its own practices and rules.

Of course, both the NVD and CVE.org, for all their problems, would never be in anything close to their current form were it not for their both being government agencies; we should be grateful for that. However, maybe the time has come to think bigger than what’s in place today. I would like to propose that:

1. A new supranational “software vulnerability authority” be created. It will be funded by assessments on governments that want to participate, as well as by donations from private sector organizations with a stake in software security (of course, it’s hard to think of any organization today that doesn’t have a stake in software security). The vulnerability authority will include representatives from private and public sector organizations worldwide; it will not have special ties to any one government[i].

2. The software vulnerability authority will create and operate a new database intended to replace both the NVD and CVE.org databases (referred to as “the replacement database” below). This new database will include vulnerabilities identified with CVE numbers and software products identified with CPE names and/or purl identifiers[ii][iii]. The database will include all data that currently exists in the NVD and CVE.org, as well as new CVE Records created after the cutover to the new database.

3. The vulnerability authority will set up an organization like CVE.org that manages organizations that report vulnerabilities (in products the organization developed, products others developed, or both). These will probably resemble today’s CNAs, although that program would need changes. Most importantly, the reporters need to be paid, since today it is hard to incentivize CNAs – all of whom have day jobs – to implement improvements that require a substantial investment of time.

4. Existing vulnerability databases (such as OSV, OSS Index, GitHub Security Advisories, the Python Packaging Advisory Database, VulnCheck, VulnDB, VulDB, etc.[iv]) will be encouraged to continue their current work. These databases will remain accessible individually, but there will also be a central “intelligent front end” through which a user can access datapoints in those databases, as well as in the replacement database.

5. The front end (which I call the “Global Vulnerability Database” or GVD) will manage all interaction between a user and the individual databases that make up the GVD.

6. The user will be able to enter queries that use one or more vulnerability identifiers (CVE, OSV, GHSA, etc.) and one or more software identifiers (CPE, purl, PyPI package, etc.). The front end will parse the user’s query into queries to one or more individual vulnerability databases and return all responses to the user.

7. If a user enters a query that contains multiple software identifiers, vulnerability identifiers, or both, the GVD will not attempt to “harmonize” the responses by translating one type of identifier into another. For example, if the user requests all vulnerabilities that affect a software product and identifies that product using a CPE name, the GVD will not display any CVE Record that contains just a purl. This is because, in many if not most cases, it is truly impossible to “translate” one identifier into another. If the user knows both the CPE name and the purl for a product, they will learn about all CVEs that affect that product only if they include both identifiers in their query.
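The routing behavior proposed in points 5 through 7 can be sketched as a simple fan-out: each identifier in a query is sent only to the databases that understand that identifier type, and results are never translated between identifier types. The routing table and database names below are hypothetical, purely to illustrate the idea.

```python
# Sketch of the "intelligent front end": route each identifier to the
# databases that support its type; no cross-identifier "harmonizing".
# The ROUTES table and database names are hypothetical.

ROUTES = {
    "cpe":  ["replacement_db"],
    "purl": ["replacement_db", "osv", "oss_index"],
    "ghsa": ["github_advisories"],
}

def identifier_type(ident):
    """Classify a query identifier by its syntactic prefix."""
    if ident.startswith("cpe:"):
        return "cpe"
    if ident.startswith("pkg:"):
        return "purl"
    if ident.startswith("GHSA-"):
        return "ghsa"
    raise ValueError(f"unknown identifier: {ident}")

def fan_out(query_identifiers):
    """Map each identifier to the list of databases that will be queried for it."""
    return {ident: ROUTES[identifier_type(ident)] for ident in query_identifiers}

cpe_q = "cpe:2.3:a:examplevendor:exampleproduct:1.0:*:*:*:*:*:*:*"
purl_q = "pkg:npm/exampleproduct@1.0.0"
plan = fan_out([cpe_q, purl_q])
# The CPE query and the purl query each reach their own databases;
# neither is ever converted into the other.
```

Note that the CPE identifier never reaches the purl-based databases and vice versa, which is exactly the "no harmonization" rule in point 7: the user who wants complete coverage must supply both identifiers.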

Of course, the Global Vulnerability Database is just one alternative for replacing the NVD with a much more robust and universal vulnerability database. When I first started talking about the GVD more than a year ago, it seemed there would be plenty of time to discuss all the nuances, before considering actual deployment.

However, the problems with the NVD, and now possibly with CVE.org, are sending a clear signal that it’s time to start talking about what’s going to replace them. It’s clear that we shouldn’t replace them with another US government-run effort. Been there, done that, got the T-shirt. The replacement needs to be truly global and truly universal.



[i] A model for this authority might be IANA, the Internet Assigned Numbers Authority. That organization manages the DNS program worldwide, as well as assignment of IP addresses.

[ii] If one or more other software identifiers become popular in the future, they could be accommodated as well.

[iii] Besides software and firmware, the NVD also displays vulnerabilities found in intelligent devices. That program needs to be substantially rethought, but it needs to be continued in some form - probably in a separate database for devices. The OWASP SBOM Forum’s 2022 white paper described (on pages 12 to 14) a different naming system for devices, based on the existing GTIN and GMN identifiers used in international trade. CPE is not a good identifier for the device itself (meaning the complete set of software and firmware products installed in the device, which in my opinion should be how device vulnerabilities are reported), although individual software products installed in the device can be identified with CPEs.

[iv] The NVD and CVE.org will be discontinued, since all their current data will be included in the replacement database.

Sunday, March 23, 2025

A longstanding CVE problem needs to be resolved - along with two others

 

Last December in the annual CNA Workshop run by CVE.org, an important issue was brought up during a presentation by Lisa Olsen of Microsoft. I wrote in my post:

Lisa pointed out that in some CVE reports, the “affected” version of the product refers to the fixed version of the product (i.e., the version to which the patch has been applied), while in other reports (usually from different CNAs), the “affected” version is the unpatched version.

This is a huge difference, since it means some organizations may be applying patches to versions of a product to which the patch has already been applied. Lisa said the new CVE schema will allow the CNA (which is in many cases the developer of the affected product) to indicate which case applies. However, it seems to me there should be a rule: The “affected” product is always the one in which the vulnerability has not been patched.

It's good that the new CVE schema will allow the CNA to indicate whether the vulnerability is found in the patched or the unpatched version of the product, but I was surprised there was any question about this: after all, only the unpatched version of the product is vulnerable to the CVE, so in my opinion – at least at the time – there should have been a rule that the CVE always applies to the unpatched version (frankly, I was surprised that Microsoft seems to take the opposite position). However, I now realize it’s more complicated than that, and there are two other serious questions that come into play here.

The first of those questions is how the supplier will distinguish between the patched and the unpatched versions of the product. I have always assumed that, when a patch is applied to a software product, the version number will automatically change to a new one; for example, if the product follows “semantic versioning”, when a patch is applied to version 5.2.0 of a product, the patched version might be numbered 5.2.1. In this example, when the supplier reports the vulnerability, they will list v5.2.0 in the CVE record, if they follow my rule.

Of course, this is done so that, when a software user learns that version 5.2.0 of a product they use is affected by a new CVE, they will check to see which version they are using. If it’s v5.2.0, they will download and apply the patch. But if it’s v5.2.1, they will know they’re already running the patched version.
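Under my original assumption, the user's check is trivial to automate. Here is a minimal sketch of that check, assuming the product follows semantic versioning and the patch bumps the patch number (5.2.0 to 5.2.1); the version numbers are the illustrative ones from above.

```python
# Minimal sketch of the version check described above, assuming
# semantic versioning where applying a patch bumps the patch number.

def parse_semver(v):
    """Split 'major.minor.patch' into a comparable integer tuple."""
    major, minor, patch = v.split(".")
    return (int(major), int(minor), int(patch))

def needs_patch(installed, affected, fixed):
    """True if the installed version is the affected (unpatched) one."""
    inst = parse_semver(installed)
    return inst == parse_semver(affected) and inst < parse_semver(fixed)

needs_patch("5.2.0", affected="5.2.0", fixed="5.2.1")  # True: apply the patch
needs_patch("5.2.1", affected="5.2.0", fixed="5.2.1")  # False: already patched
```

As the following paragraphs show, this tidy picture breaks down as soon as suppliers stop changing the version number on every patch.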

However, a lot of software suppliers (both commercial and open source) don’t change the version number when a patch is applied. Instead, the user needs to do something else to learn which patches are installed on their system. In Microsoft Windows, for example, Windows Update or PowerShell can provide this information, but it doesn’t automatically find its way into a CVE record for a product.

Of course, if all software suppliers were required by CVE.org (in the best practices sense, not the regulatory sense) to follow semantic versioning or a similar versioning scheme, it might always be possible to represent the patched version of a product with a different version string – derived by following a particular rule – than the one used by the unpatched version. Would that solve our problem?

No, it wouldn’t. To understand why, you should review this post in which Bruce (whom I just referred to as “the Director of Project Security of one of the largest software developers in the US”) pointed out that developers often release multiple patches in between two consecutive versions of the product; it is up to the user to decide which of those patches, if any, they wish to apply[i].

This means that, until a new version is published (in which all patches released since the last version are included), the supplier will never know which patches are on a particular user’s system unless they ask the user (perhaps during a help desk call). Moreover, since a version string that took account of patches would need to represent every patch applied since the last version (major or minor) was released, it would be very difficult to represent all those patches accurately.

To illustrate this problem, suppose there have been five patches released since the last version of a product, conveniently numbered 1, 2, 3, 4 and 5 (of course, real patch numbers are much more complicated). Suppose a user had applied patch numbers 1, 3 and 4, but not 2 and 5. The version string for their instance of the product might be 5.2.1_3_4, or something like that. A user who had applied just patches 3 and 5 would see the version string 5.2.3_5.

Now let’s say the supplier wants to report a new vulnerability in their product. Since they shouldn’t report the vulnerability until they have a patch available for it, let’s suppose this is the sixth patch since the 5.2.0 release of the product. What will the patched version be called? That depends on which of the previous five patches the user has applied. For example, if the user had applied patches 1 and 4, the new version would be called v5.2.1_4_6; if the user had applied patches 2, 3 and 4, the new version would be v5.2.2_3_4_6; etc.

This isn’t just a numbering problem. Since the code for a patch often varies depending on which patches have already been applied, the developer may need to develop a different new patch for each combination of previously applied patches. Bruce calculated that the number of patches[ii] that may be needed is equal to 2 raised to the number of independent patches that have been issued since the last update. In the above example with six patches, this is 2 to the sixth power, or 64. If there are 10 independent patches, the total number of new patches required is 1,024. Needless to say, it isn’t possible to develop this many patches whenever a single new patch is needed.
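Bruce's combinatorial argument is easy to verify. The sketch below enumerates every patch state a user's system could be in; the "5.2." version-string scheme is the illustrative one from the paragraphs above, and the post's figure of 2 to the sixth power (64) corresponds to counting all six patches, the new one included, in the exponent.

```python
from itertools import combinations

# Sketch of the combinatorial explosion: with n independent patches
# released, a user's system can be in any of 2**n patch states, and a
# new patch may need a distinct variant for each state.

def patch_states(n):
    """All subsets of patches 1..n that a user might have applied."""
    patches = range(1, n + 1)
    states = []
    for k in range(n + 1):
        states.extend(combinations(patches, k))
    return states

def version_string(applied):
    """Illustrative version string from the post: '5.2.' + applied patch numbers."""
    return "5.2." + "_".join(str(p) for p in applied) if applied else "5.2.0"

states = patch_states(5)
len(states)                # 32 states before patch 6 ships (2**5)
len(patch_states(6))       # 64 once patch 6 is counted too, as in the post
version_string((1, 3, 4))  # '5.2.1_3_4', as in the example above
```

Doubling the number of states with every independent patch is what makes "one variant per state" hopeless in practice, which is the point the paragraph above makes.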

In other words, my idea that a patched version of a software product could be distinguished from the unpatched version simply by changing a number will not work in practice. Our original question (the one Lisa Olsen discussed in the CNA Workshop in December) was whether the CVE record should refer to the patched or the unpatched version.

The answer is that, since there might be hundreds of “patched versions” applying to a single unpatched version, there is no good way to use versioning as a way to distinguish patched from unpatched products. Of course, this is why patch reports are usually very complicated documents, such as this one from Oracle (chosen at random).

Naturally, this is a disappointment. I originally thought the problem Lisa brought up could be easily solved, but that’s far from being the case. So, how can it be solved? Like almost every complicated problem, it requires people with a stake in solving the problem to get together (presumably virtually) to work out an acceptable solution. It won’t be a thing of beauty, of course, but hopefully it will at least be usable.

Who are the people with a stake in this problem and how will they get together? In my previous post, I called for a proof of concept for introducing purl identifiers into the CVE ecosystem. That PoC will gather stakeholders in the vulnerability management process, including software developers, CNAs, and vulnerability database operators. Since one of the objectives of the PoC will be to identify new or changed “rules of the road” for reporting new vulnerabilities in CVE records, and since this question very much involves one of those rules of the road, it would be appropriate to include a discussion (and hopefully resolution) of this problem in the proof of concept.

I will put this in the to-do list for the PoC. Please join us for this project. Email me at tom@tomalrich.com to discuss participating in the project (or at least the initial planning phase) and/or financially supporting the project by donating to OWASP.

 



[i] Sometimes, developers release “cumulative” patches, meaning one patch will include all previous patches. However, suppose a supplier releases a cumulative patch C, which includes patches A and B. A user doesn’t want to apply Patch A (because of a possible incompatibility with their instance of the product), but they’d like to apply C. Obviously, they can’t do that. This is one reason why cumulative patches aren’t the norm in the software industry.

[ii] The 2023 post was about SBOMs, but the argument applies equally to patches.

Saturday, March 22, 2025

Implementing purl in the CVE ecosystem


As discussed in this recent blog post, software vulnerability management is facing a serious problem: The National Vulnerability Database (NVD) seems to be seriously neglecting one of its two primary responsibilities: adding “CPE names” to new CVE records.

This leads to two problems that need to be addressed as soon as possible.

Part I: The first problem

· Software users need to be able to learn about vulnerabilities that have been reported in the software they use. They do this by searching a software vulnerability database.

· The National Vulnerability Database (NVD) is by far the most widely used vulnerability database in the world. However, just learning about a new software vulnerability does not help a user unless they know what product or products are affected by it.

· In the NVD, vulnerabilities are identified in CVE records using an identifier like “CVE-2024-12345”. Products affected by a CVE should be identified in the CVE record using a machine-readable “CPE name” like “cpe:2.3:a:microsoft:internet_explorer:8.0.6001:beta:*:*:*:*:*:*”.

· Currently, NVD contractors are responsible for adding CPE names to new CVE records. Before February 2024, this was almost always done within a few days of when the NVD received the CVE record.

· If a CVE record does not contain a CPE name for a software product affected by the vulnerability described in the text of the record, a user searching for vulnerabilities identified in that product will not learn it is affected by that CVE.

· The NVD’s problem is that, starting on February 12, 2024, the number of CPE names it created dropped drastically. As a result, about 55% of new CVE records in 2024 were never given a CPE name, meaning the product(s) affected by those CVEs are invisible to a search. Various observers have pointed out that the problem is getting even worse in 2025, since only about 25% of new CVE Records contain a CPE name. As of this writing (the end of March 2025), there is a serious question whether the NVD is creating new CPE names at all.

· This means that any search for a particular product in the NVD is more likely than not to miss vulnerabilities that have been reported for that product since February 2024, making NVD searches more and more useless – as well as misleading – as time goes on.

· What other vulnerability databases are there, besides the NVD? For open source software products, the answer is “a lot”. These include OSS Index, OSV, GitHub Security Advisories and others; in fact, a software user is more likely to learn about vulnerabilities that apply to open source software products (or open source components in an SBOM) in these databases than in the NVD.

· However, for commercial software products, there is currently only one vulnerability database: the NVD[i]. Because the NVD can no longer be called reliable, this means there is currently no reliable source of vulnerability data for commercial software products. Obviously, this isn’t a good thing, given how dependent business and government organizations are on commercial software.

· When will this problem be fixed? If the question means when CPE names will be added to the more than 30,000 “CPE-less” CVE records currently in the NVD’s backlog, the answer is “probably never”. Currently, the best that can be hoped for is to slow the rate of growth of the backlog.

· Since the CPE backlog may never go away, what measures can be taken in the longer term? There isn’t much question: CVE records can no longer be restricted to containing CPE identifiers. While CPE should continue to be one option for new CVE records, it should no longer be the only one. The best alternative is purl[ii].

· If purl were implemented in the CVE record format, it would immediately improve identification of open source products, since purl has – in only about eight years – become a major software identifier used in open source vulnerability databases.

· The most important feature of purl is that the user never has to look up the purl for a product before they search for it in a vulnerability database. This is because they should always be able to create the purl using information they already have: the package manager from which they downloaded the software, along with the package name and version string in that package manager.

· Since each purl is globally unique, a purl for an affected product in a CVE record should always match the purl created by a software user before they search for the product in a vulnerability database. This means that searches using purl will have a high success rate.
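The claim in the last two bullets is that a user can assemble the purl from information they already have, with no lookup step. A minimal sketch of that assembly follows; the package names and versions are illustrative, and a real implementation should use a purl library that handles namespaces, qualifiers, and percent-encoding per the purl specification.

```python
# Sketch: assembling a purl from information the user already has
# (purl "type", package name, version, optional namespace).
# Package names and versions below are illustrative only.

def make_purl(pkg_type, name, version, namespace=None):
    """Build a basic purl string: pkg:type/(namespace/)name@version."""
    middle = f"{namespace}/{name}" if namespace else name
    return f"pkg:{pkg_type}/{middle}@{version}"

make_purl("npm", "examplepackage", "1.0.0")
# -> 'pkg:npm/examplepackage@1.0.0'

make_purl("maven", "exampleartifact", "2.3.1", namespace="org.example")
# -> 'pkg:maven/org.example/exampleartifact@2.3.1'
```

Because both the CNA and the user derive the purl deterministically from the same facts (type, name, version), the two strings match exactly, which is what gives purl searches their high hit rate.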

How can we address the first problem in Part I of the project? 

Introducing purl into the CVE ecosystem requires making it possible for CVE Numbering Authorities (CNAs) to designate software products in CVE records using purls. CNAs are mostly larger software developers and organizations like GitHub; they are responsible for reporting vulnerabilities to CVE.org using the CVE Record Format.

Three tasks are required to address the first problem. In each of these tasks, we will coordinate with the CVE.org Quality Working Group.

A. Develop a new version of the CVE Record Format to accommodate use of purl and submit it as a pull request to CVE.org. The SBOM Forum will work with the CVE.org Quality Working Group (QWG) and the Python Foundation to accomplish this goal.

B. Develop plans for an end-to-end proof of concept for use of purl in the CVE ecosystem.

C. Conduct that proof of concept. The PoC will involve software suppliers, end user organizations, CVE Numbering Authorities (CNAs) and vulnerability database operators.


Part II: The second problem

· Today, purl can only be used to identify open source software in package managers, not commercial software. Since most private and government organizations utilize commercial software to run their businesses, it is important that purl be expanded to identify commercial, as well as open source, software products. In 2022, the OWASP SBOM Forum suggested[iii] a way to fix this problem by having a supplier create a “SWID tag” for each of their products. A new purl “type” called swid was developed and implemented in purl.

· A SWID tag is a small document containing 5-10 pieces of metadata about a software product. These pieces of information can be used to create the purl for the product, which will always be globally unique.

· The only three mandatory fields for a purl using the swid type are “name”, “version” and “tagId”. Note that “tagId” can be almost anything; for example, it could be the URL from which the product is downloaded.

· To illustrate this, the purl for Fedora version 29 is “pkg:swid/Fedora@29?tag_id=org.fedoraproject.Fedora-29”. Note that every purl starts with “pkg:” followed by the type. For open source software, the type usually indicates the package manager or other repository – for example, “npm” for Node npm packages and “maven” for Maven JARs and related artifacts. However, for commercial software, the type will normally be swid.

· The supplier will usually make both the SWID tags and the purls for their products available on their website or by other means. If a user wants to look up a product in a vulnerability database, they can download the purl for it, if that is available; otherwise, they can download the SWID tag and use it to create the purl (of course, various tools will automate this process). Neither the purl nor the SWID tag will need to change until the product is upgraded to a new version.

· As in the case of purls for open source software products, the purl included in the CVE record for a commercial product should always match the purl a user creates when they want to search for that product; this is because both purls will be based on the contents of the same SWID tag.

How can we address the second problem in Part II of the project?

We can address the second problem – purl’s current lack of support for commercial software products – with three tasks. To accomplish Part II, we will work with a group of industry participants, including commercial software developers, CNAs, and vulnerability database operators.

A. Develop “rules of the road” for production, distribution, and use of SWID tags to allow purl to identify commercial software.

B. Test those rules in a small-scale proof of concept. In that PoC,

i. A supplier will create SWID tags (perhaps using this tool) for certain products and make them available to their customers;

ii. CNAs will create test CVE records containing purls based on those tags to report test “vulnerabilities” in those products[v];

iii. One or more vulnerability databases (that support both CVE and purl) will ingest the test CVE records; and

iv. End users will utilize purls created from the SWID tags to search the vulnerability databases. If all the CVEs that were recorded for a product are revealed when the user searches using the product’s purl, the PoC is successful.

C. Develop educational webinars and videos on use of purl in the CVE Record Format for CNAs and other participants in the CVE ecosystem.
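The success criterion for the proof of concept in task B can be sketched end to end: because the CNA's purl and the user's purl are both derived from the same SWID tag, a purl search must return every test CVE filed for the product. All the data below is hypothetical.

```python
# Sketch of the PoC success criterion: same SWID tag in, same purl out,
# so the user's search matches the CNA's record. All data is hypothetical.

def purl_from_tag(tag):
    """Derive a swid-type purl from the three mandatory SWID tag fields."""
    return f"pkg:swid/{tag['name']}@{tag['version']}?tag_id={tag['tagId']}"

# Step i: the supplier publishes a SWID tag for the product.
tag = {"name": "ExampleProduct", "version": "4.0",
       "tagId": "com.example.ExampleProduct-4.0"}

# Step ii: the CNA files a test CVE record using the purl from the tag.
test_records = [{"cve_id": "CVE-2025-90001", "purl": purl_from_tag(tag)}]

# Step iv: the end user derives the same purl and searches the database.
user_purl = purl_from_tag(tag)
hits = [r["cve_id"] for r in test_records if r["purl"] == user_purl]
poc_successful = hits == ["CVE-2025-90001"]
```

If any record filed for the product failed to appear in `hits`, the PoC would have surfaced a gap in the "rules of the road" from task A, which is precisely what the small-scale test is for.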

Note: While Part 2 of the project follows Part 1 in this document, the two parts do not need to be executed in that order, because nothing in Part 1 is an absolute prerequisite for Part 2. In fact, the project might be significantly accelerated if Parts 1 and 2 were executed at the same time.

The goal of Part 1 is to conduct a proof of concept to demonstrate how purl, as it is used today, can be incorporated as an optional software identifier in the CVE ecosystem. Since purl currently is used mostly to identify open source software found in package managers, that will be its use when it becomes an option in CVE Records.

The goal of Part 2, by contrast, is to let purl identify commercial software products: commercial developers will create SWID tags carrying metadata for their products, and users searching for vulnerabilities in a commercial product they own can use the product’s SWID tag to create a purl for it. That purl should always exactly match the purl the CNA used when reporting the vulnerability in a new CVE record, because both purls are based on the same SWID tag.

Part 2 will develop “rules of the road” for creating and using SWID tags and the purls based on them, and will test those rules in a small-scale proof of concept. Because swid is just one of many purl types and the types can be used interchangeably, no changes should be required to the CVE Record Format or any other component of the CVE ecosystem.

Therefore, if funding permits, Part 2 of this project should be executed at the same time as Part 1. For example, while the Part 1 proof of concept (Step C) is being executed, either or both of Part 2 Steps D and E could be carried out. Since Part 1 might itself require nine months to a year, starting both parts simultaneously could save up to a year of total project time.

One reason why this should be considered is that as of late March 2025, the NVD is not creating new CPE names in anywhere near the volume required to reduce their backlog of “unenriched” CVE Records, let alone eliminate it. While there are alternate vulnerability databases for open source software (almost all based on purl), there are no vulnerability databases for commercial software that are not themselves based on the NVD.

As previously stated, this means there is no reliable vulnerability database for commercial software products today. Given that private enterprises and government agencies mainly utilize commercial software, this is a serious problem. The sooner that purls based on SWID tags can be used in new CVE Records, the sooner that users of commercial software products will be made fully aware of the risks they face due to recently identified vulnerabilities.

 

Conclusion

It is possible that the National Vulnerability Database may no longer exist at all soon. However, no matter what happens, it is clear there needs to be an alternate software identifier besides CPE available to CNAs and software end users. While there are one or two experimental alternatives (such as OmniBOR), purl is already in heavy use. For example, the open source software composition analysis (SCA) tool Dependency Track alone is used over 20 million times every day to look up a dependency from a software bill of materials (SBOM) in the OSS Index vulnerability database, which is based on purl.

Purl’s availability in the CVE record format will quickly make identification of open source software much easier and more accurate in the NVD and other vulnerability databases based on CVE. And, when the policies and procedures for use of the SWID purl type have been worked out and tested in a proof of concept, identification of commercial software products in the same databases will be much easier, as well as much more accurate.

Of course, it will probably be 1-2 years before purl is in widespread use in the CVE ecosystem. But there’s no excuse for waiting any longer; two years in the future will still be two years in the future six months from now. The six tasks (A – F) listed above are mostly non-technical; they mainly require getting agreement among a group of participants in the CVE ecosystem.

The OWASP SBOM Forum will be pleased to lead this effort; we will start out with an initial project to perform the first two Part 1 tasks: development of plans for the proof of concept and identification of changes to the CVE record format that are required to accommodate purl.

Anyone interested in contributing their time and/or resources to this project should contact Tom Alrich at tom@tomalrich.com. Donations to OWASP of $1,000 and up can be “restricted” to use by the SBOM Forum. OWASP is a 501(c)(3) nonprofit organization. We welcome all contributions. 


[i] There are several commercial vulnerability databases based on the NVD; these incorporate the data currently in the NVD (which can be downloaded in about ten minutes), augmented and “cleaned up”. These databases are all trying to remedy the NVD’s current shortfalls in various ways, but none of them has the resources to do more than fill the most serious gaps as best they can.

Wednesday, March 19, 2025

Has the NVD become an empty shell?

Brian Martin, a well-known vulnerability researcher with whom I’ve had previous online discussions, has been closely following what is happening with the National Vulnerability Database (NVD) for a long time. Recently, he’s been trying to answer the question of whether the NVD is making progress on fixing the big problem that emerged on February 12, 2024.

Starting on that date, the NVD greatly slowed performance of what I (and others) believe is their most important function: adding machine-readable product identifiers called CPE names to the CVE records created by CVE.org[i]. CVE records (created by CNAs) include a textual description of a newly identified vulnerability, as well as of one or more products that are affected by the vulnerability.

However, an NVD user utilizing the search bar (or one of the APIs) to learn about CVEs that apply to a product their organization uses will not be shown any CVE that doesn’t include a CPE name. In previous years, the NVD staff – or more specifically, contractors to NIST, the NVD’s parent agency – had always created CPE names for the products mentioned in a new CVE record within days of receiving it from CVE.org. In 2024, however, the NVD added CPE names to fewer than half of the new records. This means that, if you search for a particular product in the NVD’s search bar today, on average you will be shown only about half of the CVEs discovered in 2024 that affect that product.

All CVE records that lack CPE names (called “unenriched” records) constitute the NVD’s backlog. Until last February, the NVD maintained what was on average a zero backlog, but that changed on February 12. For a while, the NVD promised the backlog would be cleared up in 2024, but that simply didn’t happen. In fact, in late December the NVD was reported to be adding CPE names to only about 1% of new CVE records, meaning that during that period the backlog grew at essentially the same rate as new CVE records were created.

By the end of 2024, the backlog was about 22,000 unenriched records, which is close to 55% of the approximately 40,000 new CVE records created in 2024. As we started off 2025, the question was whether the NVD would start working off that backlog or whether they would allow it to grow.

Unfortunately, we now have the answer to that question: Brian put up a post on LinkedIn this week showing that the NVD appears to have completely stopped adding CPE names (and CVSS scores) to new CVE records during the previous 12 days. Of course, there has been no communication at all from the NVD about this problem; in fact, the last time they posted a communication about the backlog was last November 13, when they said they were fully staffed up and hoped to make real progress soon.

Now it seems the NVD may have stopped performing their most important function, other than maintaining the database itself. This could well be because of the turmoil that has been going on in the federal government this year, which has included cancellation (whether legal or otherwise) of many ongoing contracts. If this is the case, that’s bad news indeed, since it means the NVD might end up going the way of agencies like USAID, which was all but shut down more than a month ago and remains in that state (with close to zero employees) today, despite a court order to reopen the agency.

In other words, the NVD might already be gone. But even if it survives in some form, unless enrichment restarts, it will be a hollow shell of what it was. The good news is that, if you search the NVD for recently announced vulnerabilities that apply to a product you use now, you’re likely to receive the message that there are no vulnerabilities to display.

The bad news is that’s probably not the truth. However, the only way to find out about CVEs that were not displayed is to do text searches of the more than 40,000 new CVEs identified since February 12, 2024. Obviously, that’s not a solution to the problem. (Several third parties, such as Vulners and VulnCheck, are performing this work themselves, although none of them has done it for all 22,000 CVEs in the backlog. CISA was doing some of this work in its Vulnrichment program, but that program stopped adding CPE names in December, for some unexplained reason.)

I don’t know what the backlog of unenriched CVEs is today, as a percentage of new CVEs identified since say the beginning of 2024. However, it’s without a doubt over the 55% level, where it stood in December. Where will this end? Clearly, if enrichment doesn’t resume (and specifically, addition of CPE names to CVE records), we’ll end up with a backlog that asymptotically approaches the number of CVEs that have been identified since last February 12.

In other words, a search for a product in the NVD will become increasingly meaningless. In fact, many people would argue that a product vulnerability search that at best yields half of the vulnerabilities discovered in the past year is already meaningless.[ii] Of course, the NVD will probably survive as a historical database, but it won’t be a trustworthy source of vulnerability data for CVEs identified after February 2024.

How can this problem be fixed? It depends on what you mean by ‘fixed’. If you’re talking about going back and adding CPEs to all the unenriched CVE records since last February, that’s probably never going to happen.

However, if you’re talking about putting in place a long-term solution, that is certainly possible. Like all long-term solutions, it will require a lot of work. The source of the problem is quite clear: For reasons that are not known, the people tasked with adding CPE names to CVE records aren’t able to do their job now.

At first, it might seem that the solution to this problem is to find some more money to throw at the problem. NIST did that last year – and now the problem is worse than ever. Plus, given the current climate in Washington, I strongly doubt there are any trees left to shake for money.

Most importantly, CPE is a flawed software identifier, as the SBOM Forum (now the OWASP SBOM Forum) described in our 2022 white paper on fixing the naming problem in the NVD. Some of CPE’s problems might be fixed (and may in fact be fixed due to CVE.org’s revised specification), but others simply can’t be fixed.

Perhaps the biggest of these problems is that CPE requires a name for both the software product and its vendor. What’s so hard about that? The problem is that most software products and vendors go by different names in different contexts. Someone searching for a particular product or vendor name in the NVD has to guess which name was used by the person who created the CPE; there’s no way to know beforehand what that name was.
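A small sketch may make the guessing problem concrete. The vendor and product strings below are hypothetical, but the CPE 2.3 string layout is real, and CPE matching on these fields is effectively exact-match:

```python
# Hedged illustration: two plausible CPE 2.3 names for the same (hypothetical)
# product. Matching on the vendor and product fields is exact, so a user who
# guesses a different vendor string than the CPE creator used finds nothing.

def make_cpe(vendor, product, version):
    """Build a CPE 2.3 formatted string for an application ("a" part)."""
    return f"cpe:2.3:a:{vendor}:{product}:{version}:*:*:*:*:*:*:*"

# The name the CPE creator happened to use...
in_database = make_cpe("example_software_inc", "widgetpro", "2.1")
# ...and the name a user might reasonably guess instead.
user_guess = make_cpe("example", "widget_pro", "2.1")

print(in_database)
print(user_guess)
print("match:", in_database == user_guess)  # False: the search returns nothing
```

Both strings are perfectly valid CPE names; nothing in the format tells the user which one the NVD actually contains.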

At this point, someone will usually say something like, “If we had a global directory of software vendor names, all those names could be mapped to a single canonical name. Then we could require that the person who creates a CPE name use only the canonical name for each vendor.” This sounds very simple, but who decides what the canonical name is? Is it the name preferred by the current CEO, the name the CFO uses in financial filings, the name in the initial articles of incorporation, or the name used by the New York Times or Tokyo’s Yomiuri Shimbun?

And what if the developer acquires another developer, but – as usually happens – leaves the existing product and vendor names in place for a year or two, or perhaps never changes them? A customer of the acquired developer might not learn of the acquisition for a year or two and will continue to search for vulnerabilities using the previous product and vendor names.

Of course, if the funds were available to create a global database of software suppliers and more importantly to maintain it, this wouldn’t be a problem. However, maintaining the supplier database would be hugely more expensive than maintaining the NVD itself. And when you talk about the required database of product names, that would be much more expensive than the supplier database. Clearly, neither of these databases is likely to be available soon or ever.

What’s needed is a software identifier based entirely on information that is available to the user at any time. Such an identifier doesn’t have to be “created” at all. That identifier is purl, which stands for “package URL”. Purl came from literally nowhere ten years ago to conquer the open source world, to the extent that today there is almost no open source vulnerability database that doesn’t support purl. Purl can be implemented in the “CVE ecosystem" fairly easily, once a revised CVE Record format has been developed and tested. Moreover, purl doesn’t need to supplant CPE. Current CPE names won’t go away, and if someone starts creating new CPE names again and adding them to CVE records, they will certainly be supported.
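The reason no one has to “create” a purl is visible in its structure: pkg:&lt;type&gt;/&lt;namespace&gt;/&lt;name&gt;@&lt;version&gt;, all facts the user already knows. The toy builder below handles only simple cases (no percent-encoding or qualifiers); real tools use the packageurl libraries that implement the full specification:

```python
# Minimal sketch of the purl ("package URL") scheme. Every field comes from
# information the user already has, so two parties independently describing
# the same package arrive at the identical identifier.

def build_purl(pkg_type, name, version, namespace=None):
    """Assemble pkg:<type>/[<namespace>/]<name>@<version> for simple cases."""
    path = f"{namespace}/{name}" if namespace else name
    return f"pkg:{pkg_type}/{path}@{version}"

# The same open source packages, identified identically by everyone:
print(build_purl("pypi", "django", "4.2.1"))    # pkg:pypi/django@4.2.1
print(build_purl("npm", "lodash", "4.17.21"))   # pkg:npm/lodash@4.17.21
print(build_purl("maven", "log4j-core", "2.17.1",
                 namespace="org.apache.logging.log4j"))
```

Contrast this with CPE: there is no “vendor name” field to guess at, because the package-manager coordinates are the identifier.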

To implement purl in the CVE ecosystem, two steps are required:

1.      Changes will be needed in the CVE Record format. Of course, that format is changed regularly, so this should not be too hard. The changes will not be implemented until step 2 is completed (and they may not be implemented for a while after that, since there is always a backlog of format changes that are needed).

2.      There needs to be an end-to-end proof of concept. It will start with CVE Numbering Authorities creating test CVE records (based on the revised CVE Record format) and end with users searching vulnerability databases for vulnerabilities applicable to particular products. If the users are shown all the CVEs that affect each product, the PoC will be successful.

These two steps are required to implement purl as it’s currently configured: to support open source software found in package managers. Purl doesn’t currently support commercial products. While it will be a big step forward when purl is implemented in the CVE ecosystem just for open source software, it will be much better when purl supports commercial software as well. Since the NVD is currently the primary vulnerability database for commercial software, if purl is to provide a solution to the NVD’s problems, it will need to support commercial software as well as open source.

I described a possible solution to this problem – proposed by Steve Springett, leader of the OWASP Dependency Track and CycloneDX projects – in this blog post. It requires software suppliers to create a “SWID tag” for each of their products and make it available on their website to all interested parties, whether or not they are customers (the SWID tags will also be distributed via the Transparency Exchange API).

A user who wants to search for vulnerabilities in a product they use can download the product’s SWID tag and create a simple purl from it (usually, only 3-4 fields from the tag will be needed). Since the CNA will use the same tag when creating the purl for the CVE record, a user searching a vulnerability database for the product’s purl should always find any CVE records for that product and version. Note that this doesn’t require any central authority to assign or register the purl; its contents are dictated entirely by the contents of the SWID tag.
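The mechanics can be sketched as follows. The XML is a minimal ISO/IEC 19770-2 SWID tag for a hypothetical product, and the field-to-purl mapping shown is one plausible choice, not a settled rule; defining the authoritative mapping is exactly what the “rules of the road” work would standardize:

```python
# Hedged sketch: deriving a purl from a SWID tag. Because the supplier's tag
# is the single source of the fields, the CNA and the end user derive the
# same purl independently.
import xml.etree.ElementTree as ET

SWID_TAG = """<SoftwareIdentity xmlns="http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
    name="Enterprise Server" version="2.1.0"
    tagId="example.com-enterprise-server-2.1.0">
  <Entity name="Example Software Inc" regid="example.com"
          role="tagCreator softwareCreator"/>
</SoftwareIdentity>"""

def purl_from_swid(tag_xml):
    ns = {"swid": "http://standards.iso.org/iso/19770/-2/2015/schema.xsd"}
    root = ET.fromstring(tag_xml)
    entity = root.find("swid:Entity", ns)
    # Only 3-4 fields are needed, as noted above: creator, name, version, tagId.
    creator = entity.get("name").replace(" ", "%20")
    name = root.get("name").replace(" ", "%20")
    return f"pkg:swid/{creator}/{name}@{root.get('version')}?tag_id={root.get('tagId')}"

print(purl_from_swid(SWID_TAG))
```

Whoever runs this, supplier, CNA, or end user, gets the identical purl, which is the whole point of basing the identifier on the tag.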

Thus, the third and final step for implementing purl support in the CVE ecosystem is testing and implementing the use of SWID tags to create and validate purls for commercial software. Since this step can be accomplished at the same time as the first two steps, it would speed up the implementation of purl in the CVE ecosystem if the two tracks could be carried out simultaneously.

How long will it take to do all of this? I think the first two steps – which will implement purl in the CVE ecosystem with coverage of open source software in package managers – will take about a year. The third step, validating and implementing the use of SWID tags to create purls for commercial software, will require another year. However, if resources permitted both tracks to be carried out at the same time, we could have purl supported throughout the CVE ecosystem, for both open source and commercial software, in 1 ½ to 2 years.

This is a long time, of course, but given the distinct possibility that the NVD will become an empty shell soon – and given that it is already greatly diminished – what other choice is there? With 280,000 CVEs in the catalog now, it is long past the time when anything other than automated search for software vulnerabilities is practical. We’ve already effectively lost the capability for automated identification of all vulnerabilities that apply to a product in an NVD search. It’s time to start work on Plan B.

The OWASP SBOM Forum is willing to take the lead on all three steps of this project. However, we require funding for that. If your organization is able to support us with either donations or personnel or both, please email me at tom@tomalrich.com. OWASP is a 501(c)(3) nonprofit organization.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.


[i] For a summary of what CVE.org does and how it relates to the NVD, see this post.

[ii] CVE.org has made modifications to the CPE specification in an effort to get the CVE Numbering Authorities (CNAs) – who in the majority of cases are the developer of the product for which a vulnerability is being reported – to start adding CPE names to the CVE records they create. I hope this effort is successful, of course, but we certainly can’t count on it.

Sunday, March 16, 2025

The two uses of “exploitability”


My post about a week ago introduced (for those who weren’t familiar with it already) the concept of exploitability of a vulnerability. This term is used in discussions of the risk posed by a particular vulnerability (usually a CVE, although there are other types of vulnerabilities as well). Perhaps the most important reason why the risk posed by a vulnerability comes up is the need to prioritize patches for application. Most medium-to-large organizations today have more patches to apply than they have time in the day to apply them; they need to decide which patches to prioritize and which to ignore. The best way to prioritize patches is to determine how much risk each patch mitigates by assigning a risk score to each vulnerability addressed by the patch.

Risk is a combination of likelihood and impact. Thus, any attempt to measure the risk posed by a vulnerability must take account of both variables. For impact, there isn’t much dispute over the best way to measure that: the well-known Common Vulnerability Scoring System (CVSS) Base Score is an index of the impact (called “severity”) of a vulnerability being exploited in software products in general.

But how can we measure the likelihood that a vulnerability will be exploited? CISA’s Catalog of Known Exploited Vulnerabilities (KEV) is a list of vulnerabilities known to have been exploited “in the wild” – i.e., not by a researcher doing a proof of concept, but by a bad guy pursuing not-too-nice ends. It’s hard to argue against prioritizing any vulnerability in the KEV catalog for patching, since the likelihood that it will be exploited again is close to 100%.

However, the problem with KEV is that fewer than 1% of the 280,000 CVEs in the current CVE list are in the KEV catalog. How can we estimate the likelihood of the remaining 99% of CVEs being exploited in the wild? This is where the Exploit Prediction Scoring System (EPSS) comes in. EPSS is an index of the likelihood that a vulnerability will be exploited in the wild within the next 30 days. In my opinion, a vulnerability risk score should take into account both whether the vulnerability is in the KEV catalog and the EPSS score[i].
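One illustrative way to combine these three signals is sketched below. The weights and the KEV override are my assumptions for the sketch, not a standard formula; the argument above only requires that a risk score consider KEV membership and EPSS (likelihood) alongside the CVSS base score (impact):

```python
# Illustrative patch-priority score: risk ~ likelihood x impact.
# KEV membership overrides EPSS, on the reasoning that known in-the-wild
# exploitation makes re-exploitation close to certain.

def risk_score(cvss_base, epss, in_kev):
    """Combine CVSS impact (0-10), EPSS probability (0-1), and KEV status
    into a 0-100 priority score."""
    likelihood = 1.0 if in_kev else epss
    return round(likelihood * (cvss_base / 10.0) * 100, 1)

# A KEV-listed CVE outranks a higher-severity CVE with a modest EPSS score:
print(risk_score(cvss_base=7.5, epss=0.02, in_kev=True))   # 75.0
print(risk_score(cvss_base=9.8, epss=0.10, in_kev=False))  # 9.8
```

The exact arithmetic matters less than the structure: without a likelihood term, patching would be prioritized purely on severity, which wastes effort on vulnerabilities nobody is attacking.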

However, for a couple of years, I’ve been pointing out that we (meaning the software security community) now use the term “exploitability” in two senses. One is the EPSS sense, but the other is a sense I haven’t written about much lately (though I used to a lot): VEX, which stands for “vulnerability exploitability exchange”. It’s an outgrowth of the SBOM (software bill of materials) “movement”, which I’ve been involved with. In fact, my book, “An Introduction to SBOM and VEX”, is still the only book on Amazon that seriously discusses SBOMs or VEX. (There are a couple of the inevitable books that always appear when a new topic is introduced, offering “5 tips about [insert topic du jour]”. Plus, searching on “SBOM” yields – a few places below my book – one of my kids’ favorite books when they were young, “The Very Hungry Caterpillar”. I haven’t figured out what that book has to do with SBOM or VEX.)

The idea of VEX came up when a few large companies who were involved in the NTIA Software Component Transparency Initiative, including Cisco and Oracle, became alarmed at the large number of false positive vulnerabilities that appear when a software user utilizes a tool like Dependency Track to identify vulnerabilities in components contained in a product they use; in fact, it seems that more than 97% of the identified vulnerabilities are false positives. This isn’t because the vulnerable code isn’t in the component; it usually is. However, because of the many ways a component can be installed in a software product, in the majority of cases the installation itself mitigates the vulnerability[ii].

Those big companies, driven by visions of their support lines being swamped with calls from outraged users about the huge number of “vulnerabilities” in their products, decided the solution to that problem is a machine readable document that essentially says, “Vulnerability CVE-2025-12345 is not exploitable in our Product ABC version 2.1. You don’t have to worry about it and we won’t patch it, since that would be a waste of both of our time.”

Let’s compare the meaning of “exploitable” in the above sentence and the meaning of the same word in EPSS. In EPSS, a vulnerability is exploitable if there’s an exploit kit available, if it’s being discussed in hacker blog posts, etc. Even more importantly, a lot of different products might be under attack by hackers exploiting that vulnerability; the more attacks that are occurring on more products, the more “exploitable” the vulnerability is. The EPSS score is a probability and varies between 0 and 1.0.

On the other hand, VEX has nothing to do with what’s going on in the outside world. It has to do with a single product, which might be installed on your company’s network or another company’s. It gives a binary answer to the question, “If a hacker were able to reach Product ABC version 2.1 on my network, would they be able to exploit CVE-2025-12345, which is present in a component of ABC v2.1?”

Let’s compare the two uses of “exploitable”. The question in both cases is, “Is CVE-2025-12345 exploitable?”

1.      In VEX, the question refers to a single product. In EPSS, it refers to all products.

2.      In VEX, the answer can be determined by technical means: Ask a hacker of average skill level (not the uber-hacker who could penetrate a brick) to exploit the vulnerability in the product in question (meaning they can utilize the vulnerability for some purpose, like viewing restricted information or escalating privileges). If the hacker can do that, the vulnerability is exploitable in that product. If the hacker can’t do it, the vulnerability isn’t exploitable in the product[iii]. In EPSS, the answer is determined statistically, based on data like the number of times the CVE has been mentioned in a blog or website, and whether there is publicly available exploit code.

3.      In VEX, the answer is binary: the CVE is exploitable in a particular product or it isn’t. The answer doesn’t change unless the product itself changes. In EPSS, the answer is a probability, which changes over time based on changes in the variables that make up the EPSS score (in fact, EPSS scores are re-computed daily for every one of the about 280,000 vulnerabilities currently in the CVE catalog).

4.      In VEX, the skill level of the hacker can make a big difference in the answer to the question whether CVE-2025-12345 is exploitable in Product ABC v2.1. If the hacker is very skillful, they might be able to compromise ABC. This is why VEX assumes a certain “average” level of hacker skill, although that can never be specified with any rigor. On the other hand, EPSS just depends on activity. If enough hackers are trying to use CVE-2025-12345 to attack any product, it doesn’t matter whether they’re succeeding in their efforts. If hackers are targeting that vulnerability, its EPSS score will probably go up.
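The binary VEX answer described in the list above is ultimately just a field in a machine-readable document. The fragment below is modeled loosely on the CycloneDX VEX format; the field names reflect my reading of that spec and the product reference is hypothetical, so treat this as illustrative rather than normative:

```python
# Illustrative only: roughly what a machine-readable VEX statement conveys.
# The binary VEX answer lives in analysis.state ("not_affected" vs.
# "exploitable"); nothing here depends on outside-world activity, unlike EPSS.
import json

vex_statement = {
    "vulnerabilities": [{
        "id": "CVE-2025-12345",
        "analysis": {
            "state": "not_affected",             # the binary VEX answer
            "justification": "code_not_reachable",
            "detail": "The vulnerable component code is never invoked "
                      "by Product ABC v2.1.",
        },
        "affects": [{"ref": "product-abc-2.1"}],  # hypothetical product ref
    }]
}
print(json.dumps(vex_statement, indent=2))
```

Note what is absent: no probability, no exploit-activity data, no expiry. The statement stays true until the product itself changes, which is point 3 above.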

Given how differently the single term “exploitable” is used in the two cases, it seems clear they shouldn’t share the same term. I think the VEX use case is more deserving of it, since the question there is whether it’s technically possible for any hacker to exploit the vulnerability in that product. The answer in the EPSS use case doesn’t depend on technical data; instead, it depends on statistical analysis of social data, such as the existence of code to exploit the vulnerability, or even just public discussion about doing so.

However, even though I think the VEX use case is much closer to what most security professionals think of when they use the term “exploitability”, I admit it’s too late to try to take it away from the EPSS use case. The VEX concept is struggling to be accepted, while EPSS is already in wide use. What should we do? Call the EPSS case “Exploitability #1” and the VEX case “Exploitability #2”?

That won’t work, but there’s another way we can state the fact that an attacker of medium skill should be able to exploit the vulnerability in the product in question: We can say, “Product A version 2.1 is affected by CVE-2025-12345”, rather than “CVE-2025-12345 is exploitable in Product A v2.1.” In fact, this is the wording we use to talk about vulnerabilities in general today. I suggest we use this wording when discussing VEX use cases from now on. Problem solved.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.


[i] EPSS scores for all 280,000 CVEs in the current CVE catalog are updated daily.

[ii] There are many other reasons why a vulnerability in a component will not be exploitable in the product itself – e.g., if the developer patched the vulnerability when it installed the component in the product.

[iii] For us to make the positive statement that the vulnerability is exploitable in the product, Mr./Ms. Average Hacker would have to succeed in exploiting it. However, the same hacker failing to exploit the vulnerability once doesn’t demonstrate that it isn’t exploitable; the hacker would have to fail multiple times before it could be said that the vulnerability isn’t exploitable in the product.