Thursday, August 15, 2024

We have our work cut out for us

Some other members of the OWASP SBOM Forum and I have come to realize - ever since we released this white paper almost two years ago - that in the long run there's no alternative to making purl the universal software identifier for vulnerability databases. However, it seems the long run has now become the short run, because the NVD (National Vulnerability Database) has dug itself into a deep hole in the creation of CPE names, which it uses to represent both open source and commercial software products. The NVD essentially stopped adding CPE names to CVE reports in early February of this year.

Why is this a problem? Because a CVE report without a machine-readable software identifier is like a car without a steering wheel: You know there’s a new vulnerability, but there’s no automated way to learn what software products are vulnerable to that CVE. Of course, you can always read the textual product descriptions in every backlogged CVE report, if you have time for that - but most of us want an automated way to do this. 


CPE is the only software identifier supported by the NVD. Before February, the NVD staff read every new CVE report released by CVE.org and assigned a CPE name to the vulnerable product(s) named in the report. A user could then enter that CPE name in the NVD search bar to look up all CVEs that had been reported for the product (and usually, for one version of the product).


However, since February the NVD has mostly stopped doing this. The NVD now has a backlog of more than 17,000 CVE reports that are missing CPE names; that backlog is growing every day. This means that searching for vulnerabilities that apply to a product today will miss most vulnerabilities that were identified from February on. Since it’s likely that the great majority of NVD searches are for new vulnerabilities, this means that automated vulnerability management based on the NVD is currently impossible for almost all practical purposes.


Besides CPE, the only other widely used software identifier today is purl. Purl is already by far the leading (and for all practical purposes the only) software identifier for open source software (OSS) packages in vulnerability databases that are dedicated to OSS. For a description of how purl came about and why it’s so special, see this article by Philippe Ombredanne, the creator of purl (and a member of the OWASP SBOM Forum). 


Purl has without a doubt become the leading identifier for open source software worldwide. However, today purl has no generally accepted way to represent proprietary software; this is the biggest problem preventing purl from being the “universal software identifier” today.


Before last Friday’s SBOM Forum meeting (August 9), I asked Philippe and Steve Springett to discuss options for extending purl to proprietary software. Steve is the leader of the OWASP Dependency Track and CycloneDX projects and also an SBOM Forum member; he worked with Philippe to specify purl in the early days and is still a maintainer of the purl project.


My request to Philippe and Steve to speak to the SBOM Forum last Friday was partly based on a request from Bruce Lowenthal, Senior Director of Product Security at Oracle (and - you guessed it! - a member of the SBOM Forum). Like me and many others, Bruce is concerned about how software users can easily learn about newly identified vulnerabilities that apply to software products they use to fulfill their organization’s mission. 


The SBOM Forum’s 2022 white paper detailed (on pages 4-6) some of the many problems caused by reliance on CPE names as the primary software identifier. That paper urged a gradual move away from relying completely on CPE as an identifier. However, we knew at the time that this would be a huge job. Given that before this February, the NVD was doing a decent job of producing CPE names for all products identified in the text of CVE reports, it didn’t seem like a big problem to live with CPE’s problems for a few more years, while the foundations could be laid for an eventual move to purl as the primary identifier for all software. However, the NVD’s actions (or lack thereof) since February 12 have lent new urgency to solving this problem.


This wasn’t the first time we’d had this discussion. In 2022, as the SBOM Forum (not yet part of OWASP) was discussing our white paper on fixing problems with software identification in the NVD, we realized we needed to describe some way that purl could be extended to cover proprietary software. Both in 2022 and last Friday, Steve pointed out a fairly easy way to accommodate a large number of proprietary software products, although nowhere near all of them. Steve’s idea was (and is) quite ingenious.


The central feature of purl is the fact that it doesn’t require any central database. As the 2022 paper explains at length, the purl for an open source package in a package manager is based on the name of the package manager (which determines the “Type”), as well as the name and version string of the software in the package manager - that’s all that’s required. Because the package manager is a controlled namespace, anyone who accesses the package manager will find the same product name and product version string. Therefore, everyone will be able to create the same purl for the product. There’s no need to look anything up in order to find its purl - at least for open source packages. If you have downloaded the product, you already have all the information you need to create the purl.
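To make this concrete, here is a minimal Python sketch of how anyone could assemble the same purl from nothing more than the type, name and version string. The function name make_purl is my own invention for illustration, and the sketch skips the type-specific normalization rules (such as lowercasing names for some package managers) that the purl spec defines; the purl project maintains real libraries that handle those.

```python
from urllib.parse import quote


def make_purl(pkg_type, name, version, namespace=None):
    """Assemble a purl from values anyone can read off the package manager.

    Hypothetical helper for illustration only: it percent-encodes each
    component but applies no type-specific normalization rules.
    """
    purl = "pkg:" + pkg_type.lower() + "/"
    if namespace:
        # A namespace may have several segments separated by "/".
        purl += "/".join(quote(seg, safe="") for seg in namespace.split("/")) + "/"
    purl += quote(name, safe="")
    purl += "@" + quote(version, safe="")
    return purl


print(make_purl("npm", "lodash", "4.17.21"))
# pkg:npm/lodash@4.17.21
print(make_purl("maven", "log4j-core", "2.17.1",
                namespace="org.apache.logging.log4j"))
# pkg:maven/org.apache.logging.log4j/log4j-core@2.17.1
```

Note that nothing in the function consults a database: two people who start from the same package manager entry will produce the same string.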


Steve’s idea was that a proprietary analog to a package manager is an online store like Google Play or the Apple Store. Like the package manager, the store is a controlled namespace; the name/version of the software won’t vary in the store. Therefore, a purl type could be created for the store; the actual purl would just consist of the type, as well as the name and version string of the product in that store. If the product/version pair doesn’t change, the name and version string won’t change, either. As long as someone obtained their software from the store, they should always be able to create the correct purl for it in the format scheme:type/namespace/name@version?qualifiers#subpath (the scheme is always “pkg”; the namespace, version, qualifiers and subpath are optional).
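The format is easier to see when a purl is split back into its parts. The sketch below is a simplified parser of my own, not the full parsing procedure from the purl spec, and the type "examplestore" is purely hypothetical: no purl type for any app store is implied to exist under that name.

```python
from urllib.parse import unquote


def parse_purl(purl):
    """Split a purl into its components (simplified sketch, not spec-complete)."""
    rest, _, subpath = purl.partition("#")
    rest, _, query = rest.partition("?")
    scheme, _, rest = rest.partition(":")
    if scheme != "pkg":
        raise ValueError("a purl must start with 'pkg:'")
    pkg_type, _, rest = rest.lstrip("/").partition("/")
    # The version, if present, follows the last "@".
    rest, _, version = rest.rpartition("@") if "@" in rest else (rest, "", "")
    # The name is the last path segment; anything before it is the namespace.
    namespace, _, name = rest.rpartition("/")
    qualifiers = {
        key: unquote(value)
        for key, value in (p.split("=", 1) for p in query.split("&") if "=" in p)
    }
    return {
        "type": pkg_type.lower(),
        "namespace": unquote(namespace) or None,
        "name": unquote(name),
        "version": unquote(version) or None,
        "qualifiers": qualifiers,
        "subpath": unquote(subpath) or None,
    }


# "examplestore" is a made-up type standing in for an online store.
print(parse_purl("pkg:examplestore/com.example.notes@3.2.1"))
```

For a product obtained from a store, only the type, name and version would normally be needed, which is why the optional components come back empty here.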


Using purl to identify software in online “stores” will without a doubt reduce the number of proprietary software products that aren’t currently covered by purl. However, there remain a lot of proprietary products (perhaps the majority) that are only available through other means, like salespeople or call centers. For these products, Steve proposed a solution based on SWID tags.


In the SBOM Forum’s 2022 white paper, we proposed a new purl type called “SWID” (which Steve then got added to the purl spec). Steve suggested that software suppliers could create a SWID tag for each version of each of their products. The required fields in the SWID tag (meaning the fields that must be in the SWID tag in order to create a purl with type SWID) are listed here. They form a small subset of the 80 or so available fields in SWID, but they are all that’s needed to create the purl.
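As a rough illustration of how little of the tag is needed, the Python sketch below derives a purl from a tiny SWID tag. It rests on my reading of the swid type definition in the purl spec, which should be checked against the spec itself: the name and version come from the SoftwareIdentity element, the tagId becomes a tag_id qualifier, and the software creator's name and regid (if present) form the namespace. The tag content, the Acme supplier and the function name purl_from_swid are all invented for the example, and this is not Steve's generator.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# XML namespace used by ISO/IEC 19770-2:2015 SWID tags.
SWID_NS = "{http://standards.iso.org/iso/19770/-2/2015/schema.xsd}"

# An invented, minimal tag; real tags carry many more fields.
EXAMPLE_TAG = """<SoftwareIdentity
    xmlns="http://standards.iso.org/iso/19770/-2/2015/schema.xsd"
    name="ExampleServer" version="1.0.0"
    tagId="75b8c285-fa7b-485b-b199-4745e3004d0d">
  <Entity name="Acme" regid="example.com" role="softwareCreator tagCreator"/>
</SoftwareIdentity>"""


def purl_from_swid(tag_xml):
    """Build a swid-type purl from a SWID tag (assumed field mapping)."""
    root = ET.fromstring(tag_xml)
    name = root.attrib["name"]        # required
    tag_id = root.attrib["tagId"]     # required, becomes the tag_id qualifier
    version = root.attrib.get("version")

    # Namespace: creator name, then regid if known (assumption from the spec).
    namespace = []
    for entity in root.iter(SWID_NS + "Entity"):
        if "softwareCreator" in entity.attrib.get("role", "").split():
            namespace.append(entity.attrib["name"])
            if entity.attrib.get("regid"):
                namespace.append(entity.attrib["regid"])
            break

    purl = "pkg:swid/"
    if namespace:
        purl += "/".join(quote(seg, safe="") for seg in namespace) + "/"
    purl += quote(name, safe="")
    if version:
        purl += "@" + quote(version, safe="")
    # GUID tag IDs are lowercased here so everyone derives the same string.
    purl += "?tag_id=" + quote(tag_id.lower(), safe="")
    return purl


print(purl_from_swid(EXAMPLE_TAG))
# pkg:swid/Acme/example.com/ExampleServer@1.0.0?tag_id=75b8c285-fa7b-485b-b199-4745e3004d0d
```

The point of the exercise is that the purl is computed entirely from the tag; as with package managers, no lookup in a central database is involved.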


The original idea of SWID was that the tags would be distributed with the software binaries (and some suppliers like Microsoft distributed SWID tags with all of their software for a couple of years). When the end user wants to learn about vulnerabilities in their software, they can retrieve the SWID tag from the binaries and create the purl; in fact, Steve has developed a purl generator that takes input from the SWID tag. The user can then use the purl to learn about vulnerabilities identified in the product.


However, the two biggest problems with this proposal are (a) it doesn’t address how SWID tags will be created for legacy software, and (b) it doesn’t address how SWID tags (for both legacy and current software) can be discovered online. There are various ways to solve both problems, but some group needs to take it upon itself to describe a workable solution for both. My suggestion is that the supplier that currently owns a legacy product should be responsible for creating a SWID tag for every version of that product, and that each supplier could publish a text file of SWID tags for its legacy and current products at a well-known location on its website - e.g. companyname.com/swid.txt. But there are certainly many other ways to create and distribute purls for proprietary software, whether or not they’re based on SWID tags.


The SBOM Forum and the purl working group could take on responsibility for this, but it will require funding to do it properly. The funding can be in the form of donations to OWASP that are restricted for the SBOM Forum. Any organization interested in donating for this purpose should email Tom at tom@tomalrich.com.


Even if these efforts can start soon, it will probably take at least 2-3 years before purl can become the universal software identifier that’s needed. However, there isn’t any alternative that I can see. There’s no longer a reliable source of CPE identifiers for CVE records, and there seems to be very little enthusiasm for trying to change that situation. This isn’t surprising, since CPE has lots of problems.


This means it will be at least 2-3 years before truly automated vulnerability management is possible again. This is discouraging, of course, but it would be far worse if automated vulnerability management were never possible again, other than for open source software. Automated vulnerability management for open source software is alive and well today, because almost all vulnerability databases for OSS use purl to identify software packages. This shows there’s really no alternative to starting work now on making purl the software identifier for proprietary software as well.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

I lead the OWASP SBOM Forum and its Vulnerability Database Working Group. These two groups work to understand and address issues like what’s discussed in this post; please email me to learn more about what we do or to join us. You can also support our work through easy directed donations to OWASP, a 501(c)(3) nonprofit, which are passed through to the SBOM Forum. Please email me to discuss that.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.

