Sunday, March 23, 2025

A longstanding CVE problem needs to be resolved - along with two others

 

Last December in the annual CNA Workshop run by CVE.org, an important issue was brought up during a presentation by Lisa Olsen of Microsoft. I wrote in my post:

Lisa pointed out that in some CVE reports, the “affected” version of the product refers to the fixed version of the product (i.e., the version to which the patch has been applied), while in other reports (usually from different CNAs), the “affected” version is the unpatched version.

This is a huge difference, since it means some organizations may be applying patches to versions of a product to which the patch has already been applied. Lisa said the new CVE schema will allow the CNA (which is in many cases the developer of the affected product) to indicate which case applies. However, it seems to me there should be a rule: The “affected” product is always the one in which the vulnerability has not been patched.

It's good that the new CVE schema will allow the CNA to indicate whether the vulnerability is found in the patched or the unpatched version of the product, but I was surprised there should be any question about this: after all, only the unpatched version of the product is vulnerable to the CVE. So, in my opinion – at least at the time – there should be a rule that the CVE always applies to the unpatched version (frankly, I was surprised that Microsoft seems to take the opposite position). However, I now realize it’s more complicated than that, and two other serious questions come into play here.

The first of those questions is how the supplier will distinguish between the patched and the unpatched versions of the product. I have always assumed that, when a patch is applied to a software product, the version number will automatically change to a new one; for example, if the product follows “semantic versioning”, when a patch is applied to version 5.2.0 of a product, the patched version might be numbered 5.2.1. In this example, when the supplier reports the vulnerability, they will list v5.2.0 in the CVE record, if they follow my rule.

Of course, this is done so that, when a software user learns that version 5.2.0 of a product they use is affected by a new CVE, they will check to see which version they are using. If it’s v5.2.0, they will download and apply the patch. But if it’s v5.2.1, they will know they’re already running the patched version.
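Under the rule I proposed, the user's check could in principle be simple string comparison. Here is a minimal sketch in Python (the version numbers come from the example above; real semantic-version comparison needs more care than this, e.g. for pre-release tags):

```python
# Sketch: decide whether to apply a patch, assuming strict semantic
# versioning and that the CVE record lists the unpatched version.
def needs_patch(installed: str, affected: str, fixed: str) -> bool:
    """Return True if the installed version is still the vulnerable one."""
    parse = lambda v: tuple(int(part) for part in v.split("."))
    # Vulnerable if we are at (or before) the affected version and
    # below the fixed version.
    return parse(installed) <= parse(affected) and parse(installed) < parse(fixed)

print(needs_patch("5.2.0", affected="5.2.0", fixed="5.2.1"))  # True: apply the patch
print(needs_patch("5.2.1", affected="5.2.0", fixed="5.2.1"))  # False: already patched
```

This is exactly the workflow the paragraph above describes: the comparison only works because applying the patch changed the version string.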

However, a lot of software suppliers (both commercial and open source suppliers) don’t change the version number when a patch is applied. Instead, the user needs to do something else to learn what patches are installed on their system. In Microsoft Windows, Windows Update or PowerShell provides this information, but it doesn’t automatically find its way into a CVE record for a product.

Of course, if all software suppliers were required by CVE.org (in the best practices sense, not the regulatory sense) to follow semantic versioning or a similar versioning scheme, it might always be possible to represent the patched version of a product with a different version string – derived by following a particular rule – than the version string used by the unpatched version. Would that solve our problem?

No, it wouldn’t. To understand why, you should review this post, in which Bruce (whom I referred to there as “the Director of Project Security of one of the largest software developers in the US”) pointed out that developers often release multiple patches in between two consecutive versions of the product; it is up to the user to decide which of those patches, if any, they wish to apply[i].

This means that, until a new version is published (in which all patches released since the last version are included), the supplier will never know which patches are on a particular user’s system, unless they ask the user to tell them (perhaps during a help desk call). Moreover, a version string that took account of patches would need to represent every patch applied since the last version (major or minor) was released, which would make it very difficult to represent all those patches accurately.

To illustrate this problem, suppose there have been five patches released since the last version of a product, conveniently numbered 1, 2, 3, 4, and 5 (of course, real patch numbers are much more complicated). Suppose a user had applied patch numbers 1, 3 and 4, but not 2 and 5. The version string for their instance of the product might be 5.2.1_3_4, or something like that. A user that had applied just patches 3 and 5 would see the version string 5.2.3_5.

Now let’s say the supplier wants to report a new vulnerability in their product. Since they shouldn’t report the vulnerability until they have a patch available for it, let’s suppose this is the sixth patch since the 5.2.0 release of the product. What will the patched version be called? It will have to depend on which of the previous five patches the user has applied. For example, if the user had applied patches 1 and 4, the new version would be called v5.2.1_4_6; if the user had applied patches 2, 3, and 4, the new version would be called v5.2.2_3_4_6; etc.

This isn’t just a numbering problem. Since the code for a patch often varies depending on which patches have already been applied, the developer may need to develop a different new patch for each combination of previously applied patches. Bruce calculated that the number of patches[ii] that may be needed is equal to 2 raised to the number of independent patches that have been issued since the last update. In the above example with six patches, this is 2 to the sixth power, or 64. If there are 10 independent patches, the total number of new patches required is 1,024. Needless to say, it isn’t possible to develop this many patches whenever a single new patch is needed.
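The arithmetic is easy to sketch. Assuming the hypothetical “5.2.&lt;applied patches&gt;” naming scheme from the example above (the scheme itself is invented for illustration):

```python
from itertools import combinations

# Hypothetical scheme from the example above: the version string lists
# the independent patches applied since the 5.2.0 release.
def version_string(applied: list[int]) -> str:
    return "5.2." + "_".join(str(p) for p in sorted(applied))

print(version_string([1, 3, 4]))  # 5.2.1_3_4
print(version_string([3, 5]))     # 5.2.3_5

# Each of the 5 previously issued patches may or may not be present,
# so patch #6 could land on 2**5 = 32 distinct prior configurations;
# counting each of the six patches as present-or-absent gives
# Bruce's 2**6 = 64 possible patched states.
configs = [c for n in range(6) for c in combinations(range(1, 6), n)]
print(len(configs))  # 32 possible prior patch combinations
```

The point is not the particular naming scheme; any scheme that enumerates applied patches blows up the same way.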

In other words, my idea that a patched version of a software product could be distinguished from the unpatched version simply by changing a number will not work in practice. Our original question (the one Lisa Olsen discussed in the CNA Workshop in December) was whether the CVE record should refer to the patched or the unpatched version.

The answer is that, since there might be hundreds of “patched versions” applying to a single unpatched version, there is no good way to use versioning as a way to distinguish patched from unpatched products. Of course, this is why patch reports are usually very complicated documents, such as this one from Oracle (chosen at random).

Naturally, this is a disappointment. I originally thought the problem Lisa brought up could be easily solved, but that’s far from being the case. So, how can it be solved? Like almost every complicated problem, it requires people with a stake in solving the problem to get together (presumably virtually) to work out an acceptable solution. It won’t be a thing of beauty, of course, but hopefully it will at least be usable.

Who are the people with a stake in this problem and how will they get together? In my previous post, I called for a proof of concept for introducing purl identifiers into the CVE ecosystem. That PoC will gather stakeholders in the vulnerability management process, including software developers, CNAs, and vulnerability database operators. Since one of the objectives of the PoC will be to identify new or changed “rules of the road” for reporting new vulnerabilities in CVE records, and since this question very much involves one of those rules of the road, it would be appropriate to include a discussion (and hopefully resolution) of this problem in the proof of concept.

I will put this in the to-do list for the PoC. Please join us for this project. Email me at tom@tomalrich.com to discuss participating in the project (or at least the initial planning phase) and/or financially supporting the project by donating to OWASP.

 

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.


[i] Sometimes, developers release “cumulative” patches, meaning one patch will include all previous patches. However, suppose a supplier releases a cumulative patch C, which includes patches A and B. A user doesn’t want to apply Patch A (because of a possible incompatibility with their instance of the product), but they’d like to apply C. Obviously, they can’t do that. This is one reason why cumulative patches aren’t the norm in the software industry.

[ii] The 2023 post was about SBOMs, but the argument applies equally to patches.

Saturday, March 22, 2025

The NVD seems to be on its way out. What do we do now?


As discussed in this recent blog post, software vulnerability management is facing a serious problem: The National Vulnerability Database (NVD) seems to be giving up one of its two primary responsibilities (the other being keeping the database itself operating): adding “CPE names” to new “CVE records”.

There are two parts to the problem:

Part 1

·       Software users need to be able to learn about vulnerabilities that have been reported in the software they use. They do this by searching a software vulnerability database.

·       The National Vulnerability Database (NVD) is by far the most widely used vulnerability database in the world. However, just learning about a new software vulnerability does not help a user, unless they know what product or products are affected by the vulnerability.

·       In the NVD, vulnerabilities are identified in CVE records using a format like “CVE-2024-12345”. Products that are affected by a CVE should be identified in the CVE record, using a machine readable “CPE name” like “cpe:2.3:a:microsoft:internet_explorer:8.0.6001:beta:*:*:*:*:*:* ”.  

·       Currently, NVD contractors are responsible for adding CPE names to new CVE records. Before February 2024, this was almost always done within a few days of when the NVD received the CVE record.

·       If a CVE record doesn’t contain a CPE name for a software product affected by the vulnerability described in the text of the record, a user searching for vulnerabilities that have been identified in the product will not learn it is affected by that CVE.

·       The problem is that, starting on February 12, 2024, the NVD significantly reduced the number of CPE names it created. As a result, about 55% of new CVE records in 2024 were never given a CPE name, meaning the product(s) affected by those CVEs are invisible to a search.

·       Brian Martin is a well-known figure in the vulnerability world; Brian wrote on LinkedIn recently that the NVD appeared to have added CPE names to no more than a very small number of CVE records over the previous 12 days – in other words, they may have completely given up on adding CPE names (Brian confirmed a week later that the NVD had not yet resumed performing this task).

·       The same week, Bruce Lowenthal, Senior Director of Product Security at Oracle, reported that so far in 2025, only about 25% of new CVE records added in January and February contained a CPE name.

·       This means that, more than half the time, a search in the NVD using a product name will yield no CVE records created since February 2024, even if at least one vulnerability has been reported for the product since then.

·       Even worse, the user will probably believe the product is vulnerability-free, since the only message they receive says that no records were found.

·       Of course, this means that searching for vulnerabilities in the NVD is increasingly becoming a waste of time; even worse, it will probably give the user a false sense of security.

·       What other vulnerability databases are there, besides the NVD? For open source software products, the answer is “a lot”, including OSS Index, OSV, GitHub Security Advisories and others; in fact, a software user is more likely to learn about vulnerabilities that apply to open source software products (or open source components in an SBOM) in these databases than in the NVD.

·       However, for commercial software products, there is currently only one vulnerability database: the NVD. Since the NVD can no longer be called reliable, this means there is currently no reliable source of vulnerability data for commercial software products. Obviously, this isn’t a good thing, given how dependent business and government organizations are on commercial software.

·       Bruce pointed out one serious effect of the missing CPE names – or at least the long delay in creating them: Oracle (and surely many other software suppliers) is often not able to provide patches as quickly as their customers want them (normally within 2-3 weeks).

·       When will this problem be fixed? If this question asks when CPE names will be added to all the over 30,000 “CPE-less” CVE records currently in the backlog, the answer is “probably never”. It is possible that the rate of growth in the backlog will be slowed, but it is unlikely that the backlog itself will be erased, in the sense that every CVE record will include a CPE name.

·       Since the CPE backlog may never go away, what measures can be taken in the longer term? There isn’t much question: CVE records need to be able to contain a different identifier than CPE, although CPE will always remain an option. That identifier needs to be purl.[i]

·       If purl were implemented in the CVE record format, it would immediately improve identification of open source products, since purl has – in about eight years – become the primary software identifier used in open source vulnerability databases.

·       The most important feature of purl is that the user never has to look up the purl for a product before they search for the product in a vulnerability database. This is because they should always be able to create the purl for a product by using information they already have, including the package manager from which they downloaded the software, as well as the package name and version string in that package manager.

·       Since each purl is globally unique, a purl for an affected product in a CVE record should always match a purl created by a software user before they search for the product in a vulnerability database, meaning searches using purl will have a high success rate.
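The mechanics behind those last two bullets can be sketched in a few lines. A purl is built from information the user already has, so it never needs to be “looked up” (the package and CVE record below are hypothetical, and a real implementation should follow the purl specification’s percent-encoding rules):

```python
# Sketch: assemble a purl from facts the user already knows -- the
# purl type (here npm, the package manager the software came from),
# the package name, and the version string.
def make_purl(ptype: str, name: str, version: str) -> str:
    return f"pkg:{ptype}/{name}@{version}"

user_purl = make_purl("npm", "left-pad", "1.3.0")
print(user_purl)  # pkg:npm/left-pad@1.3.0

# Because the purl is derived deterministically from the same facts,
# the user's purl should exactly match the purl in the CVE record.
cve_record_purl = "pkg:npm/left-pad@1.3.0"  # hypothetical record contents
print(user_purl == cve_record_purl)  # True
```

Exact string matching is what gives purl-based searches their high success rate: there is nothing to guess.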


How can we address the Part 1 problem?[ii]

Fixing this problem requires making it possible for CVE Numbering Authorities (CNAs) to designate software products in CVE records using purls. CNAs are mostly larger software developers and organizations like GitHub; they are responsible for reporting vulnerabilities to CVE.org using the CVE Record Format.

Three tasks are required to address the Part 1 problem. In each of these tasks, we will coordinate with the CVE.org Quality Working Group.

1.    High-level plans need to be developed for the two Part 1 tasks below, as well as the two Part 2 tasks.

2.     A revised version of the CVE record format needs to be developed in test form. That version will make it easier to specify a purl than does the current version 5.1.1. A very experienced CNA for open source software will be available to help with this task.

3.   There needs to be an end-to-end proof of concept for use of purl in CVE records. This will start with CNAs creating CVE records that contain purls (using the test format) and making them available to one or more vulnerability databases that can accept these records. If users can search a vulnerability database using a purl and be shown all the CVE records that apply to that purl, the PoC will be a success.

 

Part 2

·       Today, purl can only be used to identify open source software in package managers, not commercial software. Since most private and government organizations utilize commercial software to run their businesses, it is important that purl be expanded to identify commercial, as well as open source, software products. In 2022, the OWASP SBOM Forum suggested[iii] a way to fix this problem by having a supplier create a “SWID tag” for each of their products. A new purl “type” called SWID was developed and implemented in purl.

·       A SWID tag is a small document containing 5-10 pieces of metadata about a software product. These pieces of information can be used to create the purl for a product, which will always be globally unique.

·       The only three mandatory fields for a purl using the SWID type are “name”, “version” and “tagId”. Note that “tagId” can be almost anything. For example, it could be the URL from which the product can be downloaded.

·       To illustrate this, the purl for the product named Fedora, version 29, is pkg:swid/Fedora@29?tag_id=org.fedoraproject.Fedora-29. Note that every purl starts with “pkg:” followed by the type. For open source software, the type usually indicates the package manager or other repository – for example, “npm” for Node NPM packages and “maven” for Maven JARs and related artifacts. However, for commercial software, the type will normally be “swid”.

·       The supplier will make both the SWID tags and the purls for their products available on their website, or by other means. If a user wants to look up a product in a vulnerability database, they can download the purl for it, if that is available; otherwise, they can download the SWID tag and use that to create the purl (of course, various tools will automate this process). Neither the purl nor the SWID tag will need to change until the product is upgraded to a new version.

·       As in the case of purls for open source software products, the purl included in the CVE record for a commercial product should always match the purl a user creates when they want to search for that product; this is because both purls will be based on the contents of the same SWID tag.
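Using the Fedora example above, deriving a swid-type purl from the tag’s mandatory fields is mechanical (a sketch; a production tool should also percent-encode any special characters in the fields, per the purl specification):

```python
# Sketch: build a swid-type purl from the three mandatory fields of
# a SWID tag. The tag contents below match the Fedora example above.
def swid_purl(name: str, version: str, tag_id: str) -> str:
    return f"pkg:swid/{name}@{version}?tag_id={tag_id}"

tag = {"name": "Fedora", "version": "29",
       "tagId": "org.fedoraproject.Fedora-29"}  # fields read from the SWID tag

purl = swid_purl(tag["name"], tag["version"], tag["tagId"])
print(purl)  # pkg:swid/Fedora@29?tag_id=org.fedoraproject.Fedora-29
```

Since both the supplier (or CNA) and the end user derive the purl from the same SWID tag, the two purls match by construction.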

 

How can we address the Part 2 problem?[iv]

We can address the Part 2 problem with two tasks. We will work with the purl maintainers in accomplishing both tasks.

1.     Perform a small proof of concept, in which

a.       A supplier will create SWID tags (perhaps using this tool) for certain products and make them available to their customers;

b.       A CNA will create test CVE records containing those purls to report test “vulnerabilities” in their products[v];

c.       One or more vulnerability databases (that support both CVE and purl) will ingest the test CVE records; and

d.       End users will use purls created from the SWID tags to search the vulnerability databases. If all the CVEs that were recorded for a product are revealed when the user searches using the product’s purl, the PoC is successful.

Note on 3/23: When I wrote this new post today about a longstanding problem for CVE records, I suggested that the best way to resolve it would be to include it in this Proof of Concept task. It doesn’t have to be resolved in order for the PoC to be successful, but because this is a very important issue for the CVE community, the best time to deal with it would be when a representative group of members of the community was assembled – i.e., in this PoC. So this will be on the PoC’s agenda as well.

2.     Work with a group of developers, CNAs and vulnerability database operators to identify appropriate policies and procedures for development, exchange and use of SWID tags and the purls based on them.


Conclusion

Given that the NVD may have stopped performing their most important function – adding CPE names to CVE records – for at least two weeks, and because of what is going on in the federal government today, it is possible that the NVD may no longer exist at all soon. Fortunately, CVE.org (which is part of DHS, not the Department of Commerce like NIST/NVD) seems to be taking steps to replace the functions that the NVD was performing (many people have been saying for a while that the NVD should be consolidated into the CVE.org database, since the CVE records in the latter form the foundation of the NVD anyway).

No matter what happens in the near term, it is clear there needs to be an alternative software identifier besides CPE available to CNAs and software end users. While there are one or two experimental alternatives (such as OmniBOR), purl is already in heavy use. For example, the open source software composition analysis (SCA) tool Dependency Track alone is used over 20 million times every day to look up a dependency from an SBOM in the OSS Index vulnerability database, which is based on purl.

Purl’s availability in the CVE record format will quickly make identification of open source software much easier and more accurate in the NVD and other vulnerability databases based on CVE. And, when the policies and procedures for use of the SWID purl type have been worked out and tested in a proof of concept, identification of commercial software products in the same databases will be much easier (no CPE lookup required!), as well as much more accurate.

Of course, it will probably be 1-2 years before purl is in widespread use in the CVE ecosystem. But there’s no excuse for waiting any longer; two years in the future will still be two years in the future six months from now. The five tasks listed above are mostly non-technical; they mainly require getting agreement among a group of participants in the CVE ecosystem.

The OWASP SBOM Forum will be pleased to lead this effort; we will start out with an initial project to perform the first two Part 1 tasks: development of high level plans and identification of changes to the CVE record format that are required to accommodate purl. If you or your organization would like to participate in this effort, please drop me an email; we will try to have an organizational meeting soon. Plus, we will need modest donations to move forward. If you or your organization wish to donate, please let me know and I’ll give you the easy instructions for donating online.

Any donation of over $1,000 to OWASP can be “restricted” to the SBOM Forum; we will be pleased with a donation of any size. All initial donations will go toward the initial project (which will not require a huge budget). OWASP is a 501(c)(3) nonprofit organization, so in many cases your donation will be tax deductible.

I hope to hear from you soon!

 

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.


[i] Other identifiers could also be implemented, besides purl and CPE. The CNA should be able to choose the best identifier for their purposes, with the caveat that the identifier will need to be supported in vulnerability databases that identify vulnerabilities using CVE records.

[ii] For an in depth discussion of these tasks, see this document. Comments are welcome!

[iii] On pages 11 and 12 of this white paper.

[iv] For an in depth discussion of these tasks, see this document. Comments are welcome!

[v] Of course, the “vulnerabilities” do not need to be real ones.

Wednesday, March 19, 2025

Has the NVD become an empty shell?

Brian Martin, a well-known vulnerability researcher with whom I’ve had previous online discussions, has been closely following what is happening with the National Vulnerability Database (NVD) for a long time. Recently, he’s been trying to answer the question of whether the NVD is making progress on fixing the big problem that developed starting on February 12, 2024.

Starting on that date, the NVD greatly slowed performance of what I (and others) believe is their most important function: adding machine-readable product identifiers called CPE names to the CVE records created by CVE.org[i]. CVE records (created by CNAs) include a textual description of a newly identified vulnerability, as well as of one or more products that are affected by the vulnerability.

However, an NVD user utilizing the search bar (or one of the APIs) to learn about CVEs that apply to a product their organization uses will not be shown any CVEs that don’t include a CPE name. While in previous years the NVD staff – or more specifically, contractors to NIST, the NVD’s parent agency – had always created CPE names for the products mentioned in a new CVE record within days of the NVD’s receiving it from CVE.org, in 2024 the NVD added CPE names to fewer than half of the new records. This means that, if you search for a particular product in the NVD’s search bar today, on average you will be shown only half of the CVEs that affect the product, if they were discovered in 2024.

All CVE records that don’t have CPE names (called “unenriched” records) constitute the NVD’s backlog. Until last February, the NVD maintained essentially a zero backlog, but that changed on February 12. For a while, the NVD promised they would have the backlog cleared up in 2024, but that simply didn’t happen. In fact, in late December the NVD was adding CPE names to only about 1% of new CVE records. This meant that during that period, the backlog grew at essentially the same rate as new CVE records.

By the end of 2024, the backlog was about 22,000 unenriched records, which is close to 55% of the approximately 40,000 new CVE records created in 2024. As we started off 2025, the question was whether the NVD would start working off that backlog or whether they would allow it to grow.

Unfortunately, we have the answer to that question now: Brian put up a post on LinkedIn this week that shows the NVD seems to have completely stopped adding CPE names (and CVSS scores) to new CVE records for the previous 12 days. Of course, there is no communication at all from the NVD about this problem – in fact, the last time they posted a communication about their backlog was last November 13, when they said they were now fully staffed up and were hoping to make real progress soon.

Now it seems the NVD may have stopped performing their most important function, other than maintaining the database itself. This could well be because of the turmoil that has been going on in the federal government this year, which has included cancellation (whether legal or otherwise) of many ongoing contracts. If this is the case, that’s bad news indeed, since it means the NVD might end up going the way of agencies like USAID, which was all but shut down more than a month ago and remains in that state (with close to zero employees) today, despite a court order to reopen the agency.

In other words, the NVD might already be gone. But even if it survives in some form, unless enrichment restarts, it will be a hollow shell of what it was. The good news is that, if you search the NVD for recently announced vulnerabilities that apply to a product you use now, you’re likely to receive the message that there are no vulnerabilities to display.

The bad news is that’s probably not the truth. However, the only way to find out about CVEs that were not displayed is to do text searches of the over 40,000 new CVEs that have been identified since February 12, 2024. Obviously, that’s not a solution to the problem. (There are several third parties, like Vulners and VulnCheck, that are performing this work themselves, although none of them has done it for all 22,000 CVEs in the backlog. CISA was doing some of this work in their Vulnrichment program, but that program stopped adding CPE names in December, for some unexplained reason.)
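To see why text searching is such a poor substitute for a real identifier, here is a minimal sketch (all the records are invented): the search only succeeds if the description happens to name the product the way the user spells it.

```python
# Sketch: fall-back keyword search over CVE descriptions when no CPE
# name is present. All records below are invented for illustration.
records = [
    {"id": "CVE-2024-00001",
     "description": "Buffer overflow in ExampleSoft Widget 2.0"},
    {"id": "CVE-2024-00002",
     "description": "Injection flaw in the Widget product by Example Software"},
]

def text_search(product: str) -> list[str]:
    """Case-insensitive substring match over the description text."""
    return [r["id"] for r in records
            if product.lower() in r["description"].lower()]

print(text_search("ExampleSoft Widget"))  # misses the second record
print(text_search("Widget"))              # finds both, plus any false positives
```

A narrow search term misses records that describe the same product differently; a broad one drowns the user in unrelated hits. Neither failure mode occurs with an exact-match identifier.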

I don’t know what the backlog of unenriched CVEs is today, as a percentage of new CVEs identified since say the beginning of 2024. However, it’s without a doubt over the 55% level, where it stood in December. Where will this end? Clearly, if enrichment doesn’t resume (and specifically, addition of CPE names to CVE records), we’ll end up with a backlog that asymptotically approaches the number of CVEs that have been identified since last February 12.

In other words, a search for a product in the NVD will become increasingly meaningless. In fact, many people would argue that any product vulnerability search that at best will yield you half of the vulnerabilities that have been discovered in the past year is already meaningless.[ii] Of course, the NVD will probably survive as an historical database, but it won’t be a trustworthy source of vulnerability data for CVEs identified after February 2024.

How can this problem be fixed? It depends on what you mean by ‘fixed’. If you’re talking about going back and adding CPEs to all the unenriched CVE records since last February, that’s probably never going to happen.

However, if you’re talking about putting in place a long-term solution, that is certainly possible. Like all long-term solutions, it will require a lot of work. The source of the problem is quite clear: For reasons that are not known, the people tasked with adding CPE names to CVE records aren’t able to do their job now.

At first, it might seem that the solution to this problem is to find some more money to throw at the problem. NIST did that last year – and now the problem is worse than ever. Plus, given the current climate in Washington, I strongly doubt there are any trees left to shake for money.

Most importantly, CPE is a flawed software identifier, as the SBOM Forum (now the OWASP SBOM Forum) described in our 2022 white paper on fixing the naming problem in the NVD. Some of CPE’s problems might be fixed (and may in fact be fixed due to CVE.org’s revised specification), but others simply can’t be fixed.

Perhaps the biggest of these problems is the fact that CPE requires a name for the software product and the vendor of the product. What’s so hard about that? The problem is most software products and software vendors go by many different names in different contexts. If someone wants to find a particular product or vendor name in the NVD, they will need to guess at the one that was used by the person who created the CPE. There’s no way to know beforehand what that name was.

At this point, someone will usually say something like, “If we had a global directory of software vendor names, all those names could be mapped to a single canonical name. Then we could require that the person who creates a CPE name only use the canonical name for each vendor.” This sounds very simple, but who decides on the canonical name for each vendor? Is it the current CEO, the current CFO (in financial filings, etc.), the initial articles of incorporation, the name used by the New York Times, or the one used by the Tokyo Yomiuri Shimbun?

And what if the developer acquires another developer, but – as usually happens – leaves the existing product and vendor names in place for a year or two, or perhaps never changes them? A customer of the acquired developer might not learn of the acquisition for a year or two and will continue to search for vulnerabilities using the previous product and vendor names.

Of course, if the funds were available to create a global database of software suppliers and more importantly to maintain it, this wouldn’t be a problem. However, maintaining the supplier database would be hugely more expensive than maintaining the NVD itself. And when you talk about the required database of product names, that would be much more expensive than the supplier database. Clearly, neither of these databases is likely to be available soon or ever.

What’s needed is a software identifier that is based entirely on information that is available to the user at any time. Such an identifier doesn’t have to be “created” at all. That identifier is purl, which stands for Package URL. Purl came from literally nowhere ten years ago to conquer the open source world – to the extent that currently there is almost no open source vulnerability database that doesn’t support purl. Purl can be implemented in the “CVE ecosystem” fairly easily, once a revised CVE record format can be developed and tested. Moreover, purl doesn’t need to supplant CPE. Current CPE names won’t go away, and if someone starts creating new CPE names again and adding them to CVE records, they will certainly be supported.

To implement purl in the CVE ecosystem, two steps are required:

1.      Changes will be needed in the CVE Record format. Of course, that format is changed regularly, so this should not be too hard. The changes will not be implemented until step 2 is completed (and they may not be implemented for a while after that, since there is always a backlog of format changes that are needed).

2.      There needs to be an end-to-end proof of concept. It will start with CVE Numbering Authorities creating test CVE records (based on the revised CVE Record format) and end with users searching vulnerability databases for vulnerabilities applicable to particular products. If the users are shown all the CVEs that affect each product, the PoC will be successful.

These two steps are required to implement purl as it’s currently configured: to support open source software found in package managers. Purl doesn’t currently support commercial products. While it will be a big step forward when purl is implemented in the CVE ecosystem just for open source software, it will be much better when purl supports commercial software as well. Since the NVD is currently the primary vulnerability database for commercial software, if purl is to provide a solution to the NVD’s problems, it will need to support commercial software as well as open source.

I described a possible solution to this problem – proposed by Steve Springett, leader of the OWASP Dependency Track and CycloneDX projects – in this blog post. It requires software suppliers to create a “SWID tag” for each of their products and make it available on their website to all interested parties, whether or not they are customers (the SWID tags will also be distributed via the Transparency Exchange API).

A user who wants to search for vulnerabilities in a product they use can download the product’s SWID tag and create a simple purl using it (usually, only 3-4 fields from the SWID tag will be required to create the purl). Since the CNA will use the same tag when they create the purl for the CVE record, this means a user searching for the product’s purl in a vulnerability database should always find any CVE records for the same product and version. Note that this doesn’t require anyone to “create” a purl; its contents will be dictated by the contents of the SWID tag.
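The idea can be sketched in a few lines of Python. The field names and the example values below are hypothetical, and a real implementation would follow the purl spec’s exact rules for the swid purl type; the point is only that both the CNA and the user derive the same purl mechanically from the same SWID tag:

```python
from urllib.parse import quote

def purl_from_swid(tag_creator, product_name, version, tag_id):
    """Derive a purl of type "swid" from a few SWID tag fields.

    Since the CNA and the user both start from the supplier's published
    SWID tag, their purls match without any central registry.
    """
    ns = quote(tag_creator, safe="")
    name = quote(product_name, safe="")
    return f"pkg:swid/{ns}/{name}@{version}?tag_id={quote(tag_id, safe='')}"

# Hypothetical SWID tag fields for "Product ABC" from an imaginary supplier:
print(purl_from_swid("Acme Corp", "Enterprise Server", "2.1",
                     "example.acme.es-2.1"))
```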

Thus, the third and final step for implementing purl support in the CVE ecosystem is testing and implementing the use of SWID tags to create and validate purls for commercial software. Since this step doesn’t depend on the first two, carrying out the two tracks in parallel would speed up the implementation of purl in the CVE ecosystem.

How long will it take to do all of this? I think the first two steps – which will implement purl in the CVE ecosystem with coverage of open source software in package managers – will take about a year. The third step, validating and implementing the use of SWID tags to create purls for commercial software, will require another year. However, if resources permitted both tracks to be carried out at the same time, we could have purl supported throughout the CVE ecosystem, for both open source and commercial software, in 1 ½ to 2 years.

This is a long time, of course, but given the distinct possibility that the NVD will become an empty shell soon – and given that it is already greatly diminished – what other choice is there? With 280,000 CVEs in the catalog now, it is long past the time when anything other than automated search for software vulnerabilities is practical. We’ve already effectively lost the capability for automated identification of all vulnerabilities that apply to a product in an NVD search. It’s time to start work on Plan B.

The OWASP SBOM Forum is willing to take the lead on all three steps of this project. However, we require funding for that. If your organization is able to support us with either donations or personnel or both, please email me at tom@tomalrich.com. OWASP is a 501(c)(3) nonprofit organization.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.


[i] For a summary of what CVE.org does and how it relates to the NVD, see this post.

[ii] CVE.org has made modifications to the CPE specification in an effort to get the CVE Numbering Authorities (CNAs) – who in the majority of cases are the developer of the product for which a vulnerability is being reported – to start adding CPE names to the CVE records they create. I hope this effort is successful, of course, but we certainly can’t count on it.

Sunday, March 16, 2025

The two uses of “exploitability”


My post about a week ago introduced (for those who weren’t familiar with it already) the concept of exploitability of a vulnerability. This term comes up in discussions of the risk posed by a particular vulnerability (usually a CVE, although there are other types of vulnerabilities as well). Perhaps the most important reason the question of vulnerability risk arises is the need to prioritize patches for application. Most medium-to-large organizations today have more patches to apply than they have time in the day to apply them; they need to decide which patches to prioritize and which to ignore. The best way to prioritize patches is to determine how much risk each patch mitigates, by assigning a risk score to each vulnerability the patch addresses.

Risk is a combination of likelihood and impact. Thus, any attempt to measure the risk posed by a vulnerability must take account of both variables. For impact, there isn’t much dispute over the best way to measure that: the well-known Common Vulnerability Scoring System (CVSS) Base Score is an index of the impact (called “severity”) of a vulnerability being exploited in software products in general.

But how can we measure the likelihood that the vulnerability will be exploited? CISA’s Catalog of Known Exploited Vulnerabilities (KEV) is a list of vulnerabilities that are known to have been exploited “in the wild” – i.e., not by a researcher doing a proof of concept, but by a bad guy pursuing not-too-nice ends. It’s hard to argue that any vulnerability in the KEV catalog shouldn’t be prioritized for patching, since the likelihood that it will be exploited again is close to 100%.

However, the problem with KEV is that fewer than 1% of the 280,000 CVEs in the current CVE list are in the KEV catalog. How can we estimate the likelihood of the remaining 99% of CVEs being exploited in the wild? This is where the Exploit Prediction Scoring System (EPSS) comes in. EPSS is an index of the likelihood that a vulnerability will be exploited in the wild within the next 30 days. In my opinion, a vulnerability risk score should take into account both whether the vulnerability is in the KEV catalog and the EPSS score[i].

However, for a couple of years I’ve been pointing out that we (meaning the software security community) now use the term “exploitability” in two senses. One is the EPSS sense, but the other is one I haven’t written about much lately (though I used to a lot): VEX. That stands for “vulnerability exploitability exchange”. It’s an outgrowth of the SBOM (software bill of materials) “movement”, which I’ve been involved with. In fact, my book, “Introduction to SBOM and VEX”, is still the only book on Amazon that seriously discusses SBOMs or VEX (apart from a couple of the inevitable books that appear whenever a new topic is introduced, offering “5 tips about [insert topic du jour]”). Oddly, searching on “SBOM” also yields, a few places below my book, one of my kids’ favorite books when they were young, “The Very Hungry Caterpillar”; I haven’t figured out what that book has to do with SBOM or VEX.

The idea of VEX came up when a few large companies who were involved in the NTIA Software Component Transparency Initiative, including Cisco and Oracle, became alarmed at the large number of false positive vulnerabilities that appear when a software user utilizes a tool like Dependency Track to identify vulnerabilities in components contained in a product they use; in fact, it seems that more than 97% of the identified vulnerabilities are false positives. This isn’t because the vulnerable code isn’t in the component; it usually is. However, because of the many ways a component can be installed in a software product, in the majority of cases the installation itself mitigates the vulnerability[ii].

Those big companies, driven by visions of their support lines being swamped with calls from outraged users about the huge number of “vulnerabilities” in their products, decided the solution to that problem is a machine-readable document that essentially says, “Vulnerability CVE-2025-12345 is not exploitable in our Product ABC version 2.1. You don’t have to worry about it, and we won’t patch it, since that would be a waste of both of our time.”

Let’s compare the meaning of “exploitable” in the above sentence and the meaning of the same word in EPSS. In EPSS, a vulnerability is exploitable if there’s an exploit kit available, if it’s being discussed in hacker blog posts, etc. Even more importantly, a lot of different products might be under attack by hackers exploiting that vulnerability; the more attacks that are occurring on more products, the more “exploitable” the vulnerability is. The EPSS score is a probability and varies between 0 and 1.0.

On the other hand, VEX has nothing to do with what’s going on in the outside world. It has to do with a single product, which might be installed on your company’s network or another company’s. It gives a binary answer to the question, “If a hacker were able to reach Product ABC version 2.1 on my network, would they be able to exploit CVE-2025-12345, which is present in a component of ABC v2.1?”
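To make the VEX idea concrete, here is a purely illustrative sketch of such a statement, built in Python in the shape of a CycloneDX-style VEX document. The product reference and the justification value are hypothetical, chosen only to show the kind of binary, product-specific answer a VEX gives:

```python
import json

# Illustrative CycloneDX-style VEX: a component CVE is declared not
# exploitable in the product that contains the component.
vex = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "vulnerabilities": [{
        "id": "CVE-2025-12345",
        "analysis": {
            "state": "not_affected",
            "justification": "code_not_reachable",
            "detail": "The vulnerable code is never invoked by "
                      "Product ABC v2.1."
        },
        # Hypothetical reference to Product ABC v2.1 in the accompanying SBOM
        "affects": [{"ref": "urn:cdx:example/abc@2.1"}]
    }]
}
print(json.dumps(vex, indent=2))
```

Note how everything in the document is about one product and one vulnerability; nothing in it depends on what attackers are doing anywhere in the world.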

Let’s compare the two uses of “exploitable”. The question in both cases is, “Is CVE-2025-12345 exploitable?”

1.      In VEX, the question refers to a single product. In EPSS, it refers to all products.

2.      In VEX, the answer can be determined by technical means: Ask a hacker of average skill level (not the uber-hacker who could penetrate a brick) to exploit the vulnerability in the product in question (meaning they can utilize the vulnerability for some purpose, like viewing restricted information or escalating privileges). If the hacker can do that, the vulnerability is exploitable in that product. If the hacker can’t do it, the vulnerability isn’t exploitable in the product[iii]. In EPSS, the answer is determined statistically, based on data like the number of times the CVE has been mentioned in a blog or website, and whether there is publicly available exploit code.

3.      In VEX, the answer is binary: the CVE is exploitable in a particular product or it isn’t. The answer doesn’t change unless the product itself changes. In EPSS, the answer is a probability, which changes over time based on changes in the variables that make up the EPSS score (in fact, EPSS scores are re-computed daily for every one of the about 280,000 vulnerabilities currently in the CVE catalog).

4.      In VEX, the skill level of the hacker can make a big difference in the answer to the question whether CVE-2025-12345 is exploitable in Product ABC v2.1. If the hacker is very skillful, they might be able to compromise ABC. This is why VEX assumes a certain “average” level of hacker skill, although that can never be specified with any rigor. On the other hand, EPSS just depends on activity. If enough hackers are trying to use CVE-2025-12345 to attack any product, it doesn’t matter whether they’re succeeding in their efforts. If hackers are targeting that vulnerability, its EPSS score will probably go up.

Given how differently the single term “exploitable” is used in the two cases, it seems clear both cases shouldn’t use the same term. I think the VEX use case is more deserving of the term, since the question is whether it’s technically possible for any hacker to exploit the vulnerability in that product. The answer to the question in the EPSS use case doesn’t depend on technical data. Instead, it depends on statistical analysis of social data, such as the existence of code to exploit the vulnerability, or even just public discussion about doing so.

However, even though I think the VEX use case is much closer to what most security professionals think of when they use the term “exploitability”, I admit it’s too late to try to take it away from the EPSS use case. The VEX concept is struggling to be accepted, while EPSS is already in wide use. What should we do? Call the EPSS case “Exploitability #1” and the VEX case “Exploitability #2”?

That won’t work, but there’s another way we can state the fact that an attacker of medium skill should be able to exploit the vulnerability in the product in question: We can say, “Product A version 2.1 is affected by CVE-2025-12345”, rather than “CVE-2025-12345 is exploitable in Product A v2.1.” In fact, this is the wording we use to talk about vulnerabilities in general today. I suggest we use this wording when discussing VEX use cases from now on. Problem solved.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.


[i] EPSS scores for all 280,000 CVEs in the current CVE catalog are updated daily.

[ii] There are many other reasons why a vulnerability in a component will not be exploitable in the product itself – e.g., if the developer patched the vulnerability when it installed the component in the product.

[iii] For us to make the positive statement that the vulnerability is exploitable in the product, Mr./Ms. Average Hacker would have to succeed in exploiting it. However, if the same hacker fails to exploit the vulnerability once, that doesn’t demonstrate the vulnerability isn’t exploitable. The hacker would have to fail multiple times before it could be said that the vulnerability isn’t exploitable in the product.

Wednesday, March 12, 2025

This is why we need to fix NERC CIP in the cloud soon

 

I’ve been hearing for over a year about software and security service providers who are moving their platform to the cloud and either abandoning or deprecating their on-premises solution for NERC CIP customers. This week, I heard a story from a large NERC entity about their experience with a software provider whose product they wanted to purchase for on-premises use. However, before they purchased it, the provider told them they were moving to the cloud, although they would still be able to offer them a fully functional on-premises version.

The good news is that the provider is willing to support them on premises and will include full functionality in that version. What’s the bad news? The price tag for the software in the cloud is $80,000, but the price tag for the on-premises version is…drum roll, please…$800,000. In other words, the provider is willing to support an on-premises version just for this one customer, but the customer will have to pay the full cost of it.

Of course, the NERC entity declined the software provider’s offer; they found another on-premises product to purchase. It doesn’t have all the functionality of the other package, but they say it will meet their needs.

I’ve heard that, even though this problem was already bad a year ago, it’s now getting much worse. What’s even worse than having to pay $800,000 for an on-premises software product is not being able to find an on-premises version at any price. I’m sure that’s already happening now. For example, software for Internal Network Security Monitoring, which will be required for CIP-015 compliance, will probably never be available in an on-premises version. Of course, CIP-015 enforcement is more than three years away and FERC still hasn’t approved the standard, so this isn’t an immediate problem.

Is help on the way? Yes, it is. A Standards Drafting Team is hard at work considering what will be required, although they haven’t drafted a single word yet (I don’t blame them for this, because there are multiple conceptual problems that need to be worked through before they can even take their first baby step toward a new or revised standard). I predicted last year that the new or revised standards will be enforced starting in 2031, and I see no reason to change that prediction.

Of course, by 2031 there might not be any software left to run the BES – at least for NERC entities with medium and high impact CIP environments – other than MS Excel™. I don’t think we can wait that long for the problem to be fixed. Do you?

If you are involved with NERC CIP compliance and would like to discuss issues related to “cloud CIP”, please email me at tom@tomalrich.com.

 

Monday, March 10, 2025

What is “exploitability”?

It’s not news that the biggest problem in software vulnerability management today isn’t “zero days” – that is, scary vulnerabilities for which there isn’t a patch yet. In fact, it’s just the opposite: It’s the fact that most medium to large organizations today are overwhelmed with unapplied patches. Moreover, these organizations have come to realize they will probably never get through their current patch backlog, given the rate at which new vulnerabilities are being identified and patches issued for them almost every day.

Thus, the big problem today is patch prioritization – that is, deciding which patches need to be applied ASAP, which ones should be applied but aren’t super urgent, and which ones are not worthwhile applying at all.

Prioritizing patches needs to be based on risk. In other words, patches that mitigate the most risk should be applied first, patches that mitigate substantial risk should be applied next, and patches that mitigate little risk can be ignored (or at least put aside, in case the day comes when all current patches have been applied and the patching team is begging for more work 😊).

Of course, security patches mitigate the risk from vulnerabilities (usually CVEs) that are fixed by the patch. This means that patches need to be prioritized for application based on how risky the CVEs are that they fix. Rather than try to determine for themselves how much risk each new CVE poses, most organizations rely on one or more published scores for the CVE.

Risk is a combination of likelihood and impact, which mean respectively the probability that an attacker will utilize the CVE to attack the organization and the magnitude of damage that could result if the attacker succeeds. By far the most widely followed measure of impact (and probably the best) is the CVSS Base Score.

“Impact” is easy to understand, but what about “likelihood”? What does it mean to say that a vulnerability is “likely” to cause a negative impact on an organization? After all, the fact that a vulnerability is present in a software product I use doesn’t directly harm me. It’s only when an attacker “exploits” that vulnerability that I might suffer harm. This is why the likelihood of an organization being impacted by a software vulnerability is usually referred to as the “exploitability” of the vulnerability.

Today, there are two measures of exploitability of a vulnerability. The first and most widely followed is CISA’s Known Exploited Vulnerabilities (KEV) catalog. It is a list of (currently) around 1300 vulnerabilities[i] that are known to have been actively exploited “in the wild” (i.e., not as part of a controlled experiment).[ii] The fact that a vulnerability has been exploited in the past (even if it isn’t being exploited currently) is a good indication that it might continue to be exploited. In other words, the hackers already know how to find and utilize the vulnerability, so it’s likely they’ll continue to do so. After all, there are a lot of software users that never apply patches. They’re just waiting for the hackers to take them to the cleaners.
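Checking a backlog of CVEs against the KEV catalog is straightforward to automate, since CISA publishes the catalog as a machine-readable JSON feed. A minimal sketch (error handling omitted; the demonstration at the bottom uses a tiny made-up sample in the feed’s format so it runs without network access):

```python
import json
from urllib.request import urlopen

# CISA publishes the KEV catalog as a JSON feed at this URL.
KEV_URL = ("https://www.cisa.gov/sites/default/files/feeds/"
           "known_exploited_vulnerabilities.json")

def fetch_kev():
    """Download and parse the current KEV catalog (requires network access)."""
    with urlopen(KEV_URL) as resp:
        return json.load(resp)

def kev_cve_ids(catalog):
    """Extract the set of CVE IDs from a parsed KEV catalog."""
    return {v["cveID"] for v in catalog["vulnerabilities"]}

# Offline demonstration with a two-entry sample in the catalog's format:
sample = {"vulnerabilities": [{"cveID": "CVE-2021-44228"},
                              {"cveID": "CVE-2023-23397"}]}
print("CVE-2021-44228" in kev_cve_ids(sample))  # True
```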

KEV is a good measure of exploitability, but the fact that it only includes the roughly 1300 vulnerabilities that have been exploited, and says nothing about the approximately 280,000 CVEs that are not known to have been exploited, limits its usefulness in prioritizing patches for application. Of course, any patch that fixes a CVE on the KEV list should be applied as soon as possible. But since only a small fraction of the CVEs that are fixed by patches in your backlog are on the KEV list, what can you learn about the exploitability of all the other CVEs?

This is where the EPSS score comes in. EPSS stands for “Exploit Prediction Scoring System”. It was developed and is maintained by FIRST, the Forum of Incident Response and Security Teams (FIRST also maintains CVSS).

EPSS is quite different from KEV and even from CVSS, in that its primary goal is not to describe the present but to predict the future. It provides a score between 0 and 1.0, which estimates the probability that a given CVE will be exploited in the wild within the next 30 days. The EPSS score of every CVE ever reported (the total in March 2025 is about 280,000) is updated daily.

EPSS is 100% data driven. It is created (and constantly updated) by a) gathering data on many different indicators of exploitation, b) including those variables in a mathematical model, and c) updating the model’s weights so it “predicts” recent experience as closely as possible. EPSS scores change daily. Thus, it is important to check them regularly (the current scores can be automatically retrieved from the EPSS website at any time).

Due to this purely mathematical approach, there is no causality in the model; it simply captures correlations. Some of the variables that are tracked are:

1.      Vendor of the affected product

2.      Age of the vulnerability (i.e., days since the CVE was published)

3.      Number of times the CVE has been mentioned on a list or website

4.      Whether there is publicly available exploit code

The EPSS team regularly points out that the scores have no intrinsic meaning on their own; they only have relative meaning when compared with other scores. That is, if the EPSS score for a CVE is .2, the only firm lesson that can be drawn is that the CVE is more likely to be exploited than, for example, a CVE with a .1 score, and less likely than a CVE with a .3 score.

Remember, we’re trying to prioritize patches for application by our organization. To do that, we need first to compare the degree of risk posed by the CVEs that are fixed by those patches. We need to determine a risk score for each CVE, but to do that, we need to score the likelihood that the vulnerability will be exploited in our environment, as well as the impact if it is successfully exploited. We have already decided to use CVSS Base Score to measure impact, but now we’re trying to get a “likelihood score” – which is equivalent to an “exploitability score”.

We have two measures of exploitability of a CVE: whether the CVE is present in the KEV catalog, and its current EPSS score. Which should we use? While KEV is currently the gold standard of exploitability measures, the fact that EPSS is such a different measure from KEV, and that it scores so many more vulnerabilities, means it is a good idea to use both measures.

How can you utilize both KEV presence and EPSS score in patch prioritization? Since the fact that a vulnerability is being actively exploited outweighs all other measures of risk, I recommend that you move any patch that fixes a CVE that is found in the current KEV catalog to the top of the prioritization list; this means the patch should be applied as soon as possible. After doing that, you could prioritize patches based on both their CVSS Base Score and their current EPSS score.
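The rule just described can be sketched in a few lines of Python. This is an illustrative ranking, not a standard algorithm: KEV membership trumps everything, and the remaining patches are ranked by the product of CVSS Base Score (impact) and EPSS score (likelihood). All the CVE IDs and scores below are made up:

```python
# Illustrative patch-prioritization rule: KEV entries first, then rank
# the rest by estimated risk = impact (CVSS Base) x likelihood (EPSS).
def priority_key(cve_id, in_kev, cvss_base, epss):
    """Return a sort key; lower sorts first (higher priority)."""
    if in_kev:
        return (0, 0.0)              # actively exploited: straight to the top
    return (1, -(cvss_base * epss))  # then highest impact-times-likelihood

backlog = [
    # (cve_id, in_kev, cvss_base, epss) -- all values hypothetical
    ("CVE-2025-00001", False, 9.8, 0.02),
    ("CVE-2025-00002", True,  6.5, 0.01),
    ("CVE-2025-00003", False, 7.5, 0.40),
]
backlog.sort(key=lambda v: priority_key(*v))
print([v[0] for v in backlog])
# ['CVE-2025-00002', 'CVE-2025-00003', 'CVE-2025-00001']
```

Note that the KEV entry ranks first despite having the lowest CVSS and EPSS values, and the high-EPSS CVE outranks the higher-CVSS one; an organization could of course weight the two factors differently.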

I have another point to make about exploitability, but since what I’ve already written might take some digesting, I’ll stop here. This was the Exploitability 101 course. Look for Exploitability 102, coming soon to a blog near you.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.


[i] The vulnerability management services provider VulnCheck has their own KEV catalog, which contains 2-3 times the number of vulnerabilities in CISA’s catalog (VulnCheck’s catalog contains all of the entries in CISA’s catalog).

[ii] Even though a vulnerability has been actively exploited, this doesn’t mean the attacker was successful in causing some sort of harm to the organization (e.g., stealing some of their data). It just means the attacker was able to reach the vulnerable code and exercise it in some way.