Wednesday, September 18, 2024

The NVD's been down so long, it's beginning to look like up


Bruce Lowenthal, Senior Director of Product Security at Oracle, has been regularly updating the OWASP SBOM Forum members on what’s going on with the National Vulnerability Database (NVD); my last post on this topic was this one. His latest update was on Sunday. Here are the highlights (which include additional points he made in emails with me yesterday):

·        The total number of “unenriched” CVE reports is now 18,790. An unenriched report is one to which the NVD has not assigned a CPE name. That means a search on a CPE name will not reveal the vulnerability, even if the product that the CPE applies to is named in the text field of the report; only a “manual” text search would reveal the product. These are unenriched CVE reports incorporated into the NVD starting in February (when the NVD suddenly and drastically reduced the number of CVE reports they enriched) and ending in mid-September.

·        This number is similar to what Andrey Lukashenkov estimated in the August post linked above, which was “over 18,000”. You might call this good news, since the rate of increase in the backlog seems to be somewhat diminishing. But Bruce’s monthly numbers don’t show a consistent trend.

·        However, the NVD is consistent in one thing: Despite the fact that they built up a backlog of 10,557 unenriched CVE reports from February through May (yielding an enrichment rate of less than 3%), they no longer consider that to be part of their backlog, since they aren’t even trying to enrich any CVEs issued before June. Starting in June, they have been enriching an average of less than 50% of CVEs every month, but they haven’t enriched a single CVE for the February – May period since May.

·        The last announcement they made about their problems was on May 29, when they said “a contract for additional processing support” would allow them to return to their pre-February processing rate “within the next few months”. It’s now almost four months later, and they’re still very far from reaching their pre-February rate, which was close to 100%.

·        On May 29, they also announced, “In addition, a backlog of unprocessed CVEs has developed since February…We anticipate that that this backlog will be cleared by the end of the fiscal year.” Of course, they’re referring to the end of the federal fiscal year, which is 12 days from today. Somehow, I doubt they’ll clear up their entire 18,790 unenriched CVEs by the end of the calendar year, let alone the end of the fiscal year. Bruce’s numbers showed that the backlog continues to grow at over 1,000 per month. Given that growth rate, I calculate that the NVD will erase their backlog…envelope, please…never.
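For anyone who wants to check my envelope, here is a minimal Python sketch of the arithmetic. The backlog figures are the ones quoted above; the function name `months_to_clear` and the "what if" reduction rate of 2,000 per month are my own illustrative assumptions, not anything the NVD or Bruce has stated.

```python
import math
from typing import Optional


def months_to_clear(backlog: int, net_monthly_change: int) -> Optional[int]:
    """Return the months needed to reach a zero backlog, or None if it never clears.

    net_monthly_change is positive when the backlog grows and negative
    when more CVEs are enriched than arrive.
    """
    if backlog <= 0:
        return 0
    if net_monthly_change >= 0:
        # The backlog is flat or growing, so it is never cleared.
        return None
    return math.ceil(backlog / -net_monthly_change)


TOTAL_BACKLOG = 18_790      # unenriched CVEs, February through mid-September
FEB_MAY_BACKLOG = 10_557    # portion built up from February through May
GROWTH_PER_MONTH = 1_000    # lower bound: "over 1,000 per month"

# Portion of the backlog added since June.
since_june = TOTAL_BACKLOG - FEB_MAY_BACKLOG

print(since_june)
print(months_to_clear(TOTAL_BACKLOG, GROWTH_PER_MONTH))

# Hypothetical scenario (an assumption, not a reported figure): the NVD
# starts shrinking the backlog by a net 2,000 CVEs per month.
print(months_to_clear(TOTAL_BACKLOG, -2_000))
```

As long as the net monthly change is positive, the function returns `None`, which is the "never" in the envelope. Even under the hypothetical net reduction of 2,000 per month, clearing the backlog would take most of a year.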

The NVD has also been consistent in another area: They still have not given an honest explanation for why their processing capability fell off a cliff on February 12. However, this much now seems clear: Almost all their CVE processing activities were performed by contractors. Another federal agency was providing them about $4 million per year to pay for those contractors; they suddenly withdrew that funding in February.

I would love to learn why that other agency withdrew their funding, especially in the middle of the fiscal year. But the bigger question is what the fact that the NVD seems to be relying almost entirely on contractors to enrich CVEs means for the quality of that work. Fortunately or unfortunately, we already know the answer: the quality isn’t good.

While CPE was the first machine-readable software identifier to make the big time more than two decades ago, its weaknesses have become more apparent in recent years, especially because the purl identifier has been so successful in the open source software world. Even so, the fact that the NVD was putting a CPE on almost all CVE reports made a gradual solution to the problem – i.e., gradually switching to purl as the primary software identifier in CVE reports – seem quite acceptable.

But today, we’re living with about 19,000 CVEs in the NVD (and other vulnerability databases that are based on the NVD) that don’t have a CPE, and this number is growing by over 1,000 a month. Moreover, almost all of these are recent CVEs, which makes the fact that they’re invisible to searches even more galling. It’s like your doctor stopped learning about new diseases in 2019 and hasn’t informed you of that fact. When you go to him with symptoms of Covid, he has no idea what your problem might be.
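To make the "invisible to searches" point concrete, here is a small Python sketch. Everything in it is made up for illustration: the `CveRecord` class, the two records, the product "ExampleServer" and its CPE name are all hypothetical, and a real vulnerability database is of course far more elaborate. The only thing it is meant to show is that a lookup keyed on a CPE name cannot find a CVE report that has no CPE attached.

```python
from dataclasses import dataclass, field


@dataclass
class CveRecord:
    cve_id: str                     # hypothetical placeholder ID, not a real CVE
    description: str                # free-text description written by the CNA
    cpes: list[str] = field(default_factory=list)  # empty means "unenriched"


EXAMPLE_CPE = "cpe:2.3:a:examplecorp:exampleserver:2.1:*:*:*:*:*:*:*"

RECORDS = [
    # Enriched: a CPE name has been assigned.
    CveRecord("CVE-EXAMPLE-0001", "Buffer overflow in ExampleServer 2.1", [EXAMPLE_CPE]),
    # Unenriched: the product is named only in the text.
    CveRecord("CVE-EXAMPLE-0002", "Authentication bypass in ExampleServer 2.1"),
]


def search_by_cpe(records: list[CveRecord], cpe: str) -> list[str]:
    """Automated lookup: match only on the machine-readable CPE name."""
    return [r.cve_id for r in records if cpe in r.cpes]


def search_by_text(records: list[CveRecord], term: str) -> list[str]:
    """'Manual' lookup: case-insensitive search of the description text."""
    return [r.cve_id for r in records if term.lower() in r.description.lower()]


print(search_by_cpe(RECORDS, EXAMPLE_CPE))       # finds only the enriched record
print(search_by_text(RECORDS, "exampleserver"))  # finds both records
```

An automated tool only ever runs the first kind of search, so the second record never shows up in its results.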

Automated vulnerability management (the only kind that makes sense for any organization other than a small one) is no longer possible for any organization that is tied to a vulnerability database that relies exclusively on CPE. But, since by far the most widely used vulnerability identifier is CVE, and a CVE report can currently use only a CPE name as a software identifier, this means that most users of vulnerability databases are limited to using CPE[i]. Given the NVD’s huge backlog of unenriched CVEs, this also means that users of CVE-based databases are described by another 3-letter acronym: SOL.

Fortunately, there is another software identifier that has now almost literally taken over the open source software world: purl. In fact, there is hardly any open source vulnerability database that isn’t based on purl. But purl currently has an important shortcoming: it doesn’t support proprietary software products, just open source ones.

This means that, even when CNAs start including purl identifiers in CVE reports, they won’t be able to do so for proprietary software. Since the biggest CNAs are all proprietary software developers (Oracle, Microsoft, etc.), and since most of their CVE reports address vulnerabilities in their own products, this means that today most CVE reports don’t contain a machine-readable software identifier (because the NVD has usually discouraged the CNAs from creating their own CPE names and has often substituted their own for the ones the CNAs created).

To make a long story short (or shorter, at least), the big inhibitor to replacing CPE with purl in the NVD and other CVE-based vulnerability databases is the fact that there’s currently no clear way to create a usable purl for proprietary software, even though in principle there should be no problem with doing that (although it won't be easy). On page 12 of our white paper on software naming published more than two years ago, the OWASP SBOM Forum described, at a very high level, one method of creating purls for proprietary software; since then we have identified another method as well.

We will shortly come out with a new white paper on how the SBOM Forum proposes to make it possible for CNAs to create machine-readable purl identifiers for proprietary software products identified in CVE reports.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

I lead the OWASP SBOM Forum and its Vulnerability Database Working Group. These two groups work to understand and address issues like what’s discussed in this post; please email me to learn more about what we do or to join us. You can also support our work through easy directed donations to OWASP, a 501(c)(3) nonprofit, which are passed through to the SBOM Forum. Please email me to discuss that.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.


[i] Sonatype’s OSS Index vulnerability database uses CVEs, but links them to purl identifiers, not CPE names. Since the CVE reports don’t include purls now, this means those links have been developed by Sonatype’s own research, not the input of the CVE Numbering Authority (CNA). When they create the CVE report, the CNA describes the affected product in text form, but the CPE name for that product usually is – or was – created by the NVD. Since there can’t be a one-to-one match between a CPE and a purl, Sonatype utilizes an eclectic mix of methods to link CVEs with open source products identified with purls.

Of course, this is certainly better than not being able to link an open source product to a CVE at all, which is why Dependency Track relies primarily on OSS Index for its vulnerability lookups. By the way, the last time I checked (last December), Dependency Track was looking up vulnerabilities for a component in an SBOM 500 million times a month. Given the growth rate they were experiencing then, it’s not hard to believe they’re now at 700 to 800 million lookups per month. If you’re keeping score at home, that’s 23 to 27 million lookups every day.

Not that Steve Springett (leader of the OWASP CycloneDX and Dependency Track projects) goes around shouting this from the rooftops. He’s not that kind of guy.
