The vulnerability database emergency

Last Monday, I published this post titled “Currently, automated vulnerability management is impossible. How can we fix that?” Two days later, I put up a follow-up post titled “The CVE/CPE backlog is currently 17,000. It’s growing by >100 each day.” The point of both posts is that the National Vulnerability Database (NVD) seems to have given up on its most important function: informing software users and software developers, by means of a machine-readable software identifier, of all reported vulnerabilities (CVEs in particular) that are found in their products. Today, Cybellum released a podcast I recorded with them last week, discussing this problem.[i]

Here is a summary of the two posts:

1.      CVE reports are prepared by CVE Numbering Authorities (CNAs) and submitted to the CVE.org database. From there, they are sent to the NVD. From the NVD’s inception about two decades ago until February 12 of this year, the NVD (which is staffed by NIST employees and contractors) took responsibility for “enriching” those reports by adding information the CNAs weren’t supposed to enter on their own.

2.      The most important piece of information that the NVD added to a CVE report was a CPE name (a machine-readable software identifier), which identifies the product/version that is affected by the CVE (i.e., the vulnerability that is the subject of the report). All CVE reports contain a textual description of the product(s) affected by the CVE. However, in order to be useful to an automated vulnerability management process, the report needs to contain a machine-readable product identifier. Until February 12, the NVD had almost always added CPE names to CVE reports, making those reports searchable by product or vendor name.

3.      Starting on February 12, the NVD drastically reduced its enrichment of CVE reports; in fact, from February 12 through mid-May, they added CPE names to virtually none of them. In mid-May, they resumed regular enrichment, but only at a pace of about 75 reports per day. This is less than half of the average number of new CVE reports the NVD receives each day, which is about 175.

4.      While it’s certainly better than no enrichment at all, there are two problems with that pace. First, it obviously won’t eliminate the “enrichment gap”, since they’re adding about 100 “unenriched” CVE reports to the backlog each day. The NVD has said they intend to close this gap by sometime in 2025. This means they intend to pick up their pace of enrichment (mainly by adding contractors), so that it closely matches or exceeds the rate at which new CVEs are received.

5.      However, the second problem is much more serious: Between February 12 and mid-May, CVE.org passed about 175 new CVE reports to the NVD on the average business day, and only a small fraction of these reports were enriched at all (in fact, at least a couple of weeks went by with literally zero enrichment). This means there’s now a huge backlog of CVE reports without CPE names, which built up during that three-month period. The NVD has said nothing about how, or even whether, they will reduce that backlog.

6.      When I wrote the first post last Monday, I was thinking the backlog for the three-month period must be around 2,000-3,000 CVE reports. However, a conversation with Andrey Lukashenkov of Vulners showed me that the backlog is about 17,000 now; as I just pointed out, this backlog is still growing by about 100 a day. Even if the NVD tripled their current rate of enrichment to 225 per day, they would only reduce the backlog by a net 50 reports per day (225 enriched minus 175 newly received), so it would take about 340 business days, well over a year, to eliminate the total backlog. Of course, this also assumes the NVD will try to eliminate the backlog. Since they have yet to even acknowledge this backlog in their announcements, that assumption is questionable.
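The backlog arithmetic in points 3 through 6 can be checked with a short calculation. This is only a rough sketch using the approximate figures quoted above (a backlog of 17,000, about 175 new CVE reports and 75 enrichments per business day). The function name and the 260-business-days-per-year constant are my own assumptions for illustration, not anything the NVD publishes.

```python
import math

# Approximate figures quoted in the summary above.
BACKLOG = 17_000          # CVE reports without CPE names
NEW_PER_DAY = 175         # new CVE reports received per business day
CURRENT_RATE = 75         # reports enriched per business day since mid-May
BUSINESS_DAYS_PER_YEAR = 260  # assumption: 52 weeks x 5 days, ignoring holidays


def business_days_to_clear(backlog: int, enriched_per_day: int, new_per_day: int) -> float:
    """Return the business days needed to clear the backlog.

    Returns infinity when enrichment does not outpace new reports,
    because in that case the backlog never shrinks.
    """
    net_reduction = enriched_per_day - new_per_day
    if net_reduction <= 0:
        return math.inf
    return backlog / net_reduction


# At the current pace the backlog grows instead of shrinking.
daily_growth = NEW_PER_DAY - CURRENT_RATE

# Scenario from point 6: the NVD triples its current enrichment rate.
tripled_rate = 3 * CURRENT_RATE
days_needed = business_days_to_clear(BACKLOG, tripled_rate, NEW_PER_DAY)
years_needed = days_needed / BUSINESS_DAYS_PER_YEAR

print(f"Backlog growth at the current pace: {daily_growth} reports/day")
print(f"Tripled rate: {tripled_rate}/day -> {days_needed:.0f} business days "
      f"(about {years_needed:.1f} years)")
```

At the current rate of 75 per day, the function returns infinity, which is simply another way of saying the backlog will never be cleared unless the pace of enrichment rises above 175 per day.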

What’s the problem with having this backlog? Think of what will happen if you go to the NVD to search for vulnerabilities currently found in a software product you use. Here, “you” includes both end-user organizations and software developers; the latter are without much doubt the biggest users of vulnerability database services.

Today, no matter what product you search for in the NVD, the search will almost always appear to go swimmingly: no vulnerabilities at all will be found! But does that mean the product is vulnerability-free? Only if you care solely about vulnerabilities that were identified before February 12. If you care about any that were identified later, you may be living in a fool’s paradise. Of course, my guess is that 99% of searches in the NVD are for current vulnerabilities; hopefully, any serious vulnerability that was identified in a product before February 12 has been mitigated in one way or another.

I wasn’t joking when I titled the first post, “…automated vulnerability management is impossible.” How can you manage vulnerabilities in the software you use, if you can’t learn about them in the first place?

Although they had invited me to do their podcast before I wrote those two posts, Cybellum was so alarmed by what I said in the posts that they decided to accelerate not just the recording, but also the publication of the podcast. So, instead of letting scheduling, recording and post-production take more than a month, they released the podcast today. I must admit that my record with podcasts is spotty, since I tend to insert a lot of um’s and ah’s. But this podcast came out very well. I’d like to hear what you think of it.

Of course, Cybellum has every reason to be worried about this issue: Almost every organization of any type in the world uses software, and many of those organizations develop software as well. If there’s no easy way for them to learn about vulnerabilities in software, their entire business model needs to be re-thought. Yet, that seems to be exactly where we’re left now.

Of course, the problem isn’t really that the NVD has now proven to be unreliable; it’s that so many organizations put their complete trust in the NVD in the first place. I was asked in the podcast if there is an easy-to-implement alternative to the NVD. The answer is no. Whatever solution your organization adopts for vulnerability management going forward, you can be sure that a) implementing it will probably not involve just using a single source of vulnerability information, and b) the solution will require more of your time, and perhaps money, than putting all your eggs in the NVD basket required[ii].

Without much doubt, this problem needs to be remediated. As I discussed in last Monday’s post, the real solution is to move to using purl as the primary product identifier for CVEs. However, I’ll be honest: This will take at least 2-3 years to be fully implemented. What can we do in the meantime? Can the software world wait 2-3 years before fully automated vulnerability management is possible? Of course, the answer to that question is no.
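To make the two identifiers concrete, here is a minimal sketch that builds a CPE 2.3 name and a purl for the same made-up package. The vendor, product and package names are hypothetical, the helper functions are mine, and the purl builder skips the percent-encoding and per-type normalization rules in the purl specification, so treat it as an illustration of the two formats rather than a usable library.

```python
from typing import Optional


def make_cpe23(vendor: str, product: str, version: str, part: str = "a") -> str:
    """Build a CPE 2.3 formatted string.

    A CPE 2.3 name has 11 attributes after the "cpe:2.3" prefix: part, vendor,
    product and version, followed by seven more (update, edition, language,
    sw_edition, target_sw, target_hw, other) that are wildcarded with "*" here.
    """
    return ":".join(["cpe", "2.3", part, vendor, product, version] + ["*"] * 7)


def make_purl(pkg_type: str, name: str, version: str,
              namespace: Optional[str] = None) -> str:
    """Build a simplified purl: pkg:type/namespace/name@version.

    Simplification (assumption): no percent-encoding, qualifiers or subpath.
    """
    path = [pkg_type]
    if namespace:
        path.append(namespace)
    path.append(name)
    return "pkg:" + "/".join(path) + "@" + version


# Hypothetical package: "examplepkg" version 1.2.3, distributed through PyPI.
cpe_name = make_cpe23("example_vendor", "example_product", "1.2.3")
purl = make_purl("pypi", "examplepkg", "1.2.3")

print(cpe_name)  # cpe:2.3:a:example_vendor:example_product:1.2.3:*:*:*:*:*:*:*
print(purl)      # pkg:pypi/examplepkg@1.2.3
```

The practical difference is where the values come from. The vendor and product strings in a CPE name have to match whatever was chosen when the name was created, which is why a report stays unsearchable until someone adds the CPE name. A purl is assembled from the package type, name and version that the package manager already knows.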

The good news is that the NVD is just one of many available vulnerability databases. However, the not-so-good news is that there’s no single database that can provide the one-stop solution that many people believed (incorrectly) the NVD provided. Each database has its strengths and weaknesses, including the things it does really well, the things it doesn’t do well and the things it doesn’t do at all. What’s required is an intelligent catalog that itemizes the strengths and weaknesses of each database and suggests how multiple database options can be combined to meet particular needs. I’ll have more to say about that soon.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

I lead the OWASP SBOM Forum and the OWASP Vulnerability Database Working Group. Both groups endeavor to understand and address issues like those discussed in this post; please email me to learn more about what we do or to join us. You can also support our work through easy directed donations to OWASP, a 501(c)(3) nonprofit. Please email me to discuss that.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.

[i] The LinkedIn post, which shows a well-chosen video excerpt from the podcast, is here.

[ii] If there’s a bright side to this debacle, it’s that the value of utilizing the NVD, as opposed to using other sources for some or all of your vulnerability data needs, has been greatly overstated for some time. It is very possible that your new solution(s) will fit your needs much better than exclusive reliance on the NVD did.

