Sunday, November 27, 2022

Did CISA do their homework?


On November 10, CISA issued a blog post called “TRANSFORMING THE VULNERABILITY MANAGEMENT LANDSCAPE”. It got a lot of attention and widespread approval. It describes three techniques whose implementation will, according to the post, lead to “more efficient, automated, prioritized vulnerability management.” While these should be of interest to both software (and intelligent device) suppliers and end users, the recommendations are aimed primarily at the suppliers – since they need to take the initiative in all three of these areas.

Since the third recommendation – use of the new SSVC vulnerability categorization framework (which, not at all coincidentally, is CVSS spelled backwards) – isn’t closely related to the first two, I’ll focus on just those two, although SSVC strikes me as a very good idea.

CISA’s first recommendation is that software suppliers should “Publish machine-readable security advisories based on the Common Security Advisory Framework (CSAF).” CSAF is currently on v2.0, which is also its first version. CSAF is the replacement for the Common Vulnerability Reporting Format (CVRF) version 1.2, which has been available since 2017. CVRF was developed and maintained by the CSAF technical committee of OASIS. I don’t know the full story, but at some point the committee presumably decided that the new version they were contemplating was going to be so different from CVRF that they should just name it after the committee. Sounds like a smart move to me. After all, it’s all about branding.

Unfortunately, branding alone isn’t enough when it comes to a machine-readable advisory format. You need two more things. First, you need at least one software tool to create the advisories, for use by suppliers who aren’t intimately familiar with the format. If you spend five minutes looking through the CSAF format, I think you’ll agree that it would take a lot of study for someone with no prior knowledge of, or experience with, CSAF to create an advisory without the help of a tool. Currently, the only available tool is Secvisogram, a CSAF editor (this was also the only available tool when the VEX working group approved CSAF as the VEX format in the spring of 2021). It will count brackets for you and perform similar tasks, but you need a full understanding of the CSAF format in order to use the tool to create a CSAF document.

There are a small number of mostly very large organizations – including Oracle, Cisco and Red Hat – that have announced support for CSAF 2.0. However, I could only find actual CSAF files published by Red Hat (and Red Hat labels the CSAF 2.0 format as “beta”, even though it was finalized and approved more than a year ago). Presumably, those organizations have been able to make the substantial investment of time required to create CSAF advisories (I know that Cisco and Red Hat have been part of the CSAF technical committee for years).

However, CISA’s blog post didn’t limit its recommendations just to large, well-resourced companies. They clearly want all software and device suppliers, large and small, to start issuing CSAF vulnerability advisories. And this is where I see a problem: Expecting every supplier, large or small, to start creating CSAF advisories without having to invest a huge amount of time learning the CSAF format requires a “CSAF for Dummies” tool. I define this as a tool that prompts the user for the information they want to represent in the advisory – the CVE or CVEs in question, the affected products and versions, remediation advice for each affected version (which might include a patch or upgrade, but might also include something else), etc. To use the tool, the user shouldn’t need to understand the details of the format; they should just be required to answer questions about what they want in the advisory.

Currently, no such tool is available for CSAF (one is under development, and evidently has been for a long time); if a supplier wants to create CSAF advisories now, they have to go the Full Monty route and learn CSAF well enough to be able to use Secvisogram intelligently. And there are clearly a number of required fields – like “product name” (and the related concept of “product tree”) and “branches” – whose meaning is anything but self-evident. Any supplier wishing to create CSAF documents will need to have a good understanding of the huge number of options available for creating these and other fields, as well as understand how versions are represented, which is anything but straightforward.
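To give a concrete sense of what suppliers are signing up for, here is a heavily pared-down sketch of the skeleton of a CSAF 2.0 advisory, built as a Python dictionary. The field names follow the CSAF 2.0 specification, but the publisher, product and CVE values are all invented for illustration, and a real, valid advisory requires many more fields than shown here:

```python
import json

# Heavily pared-down CSAF 2.0 advisory skeleton. Field names follow the
# CSAF 2.0 spec; all values (publisher, product, CVE) are invented, and a
# valid advisory requires many more fields than this sketch shows.
advisory = {
    "document": {
        "category": "csaf_security_advisory",
        "csaf_version": "2.0",
        "title": "Example advisory for ExampleProduct",
        "publisher": {
            "category": "vendor",
            "name": "Example Corp",
            "namespace": "https://example.com",
        },
        "tracking": {
            "id": "EXAMPLE-2022-001",
            "status": "final",
            "version": "1.0.0",
            "initial_release_date": "2022-11-10T00:00:00Z",
            "current_release_date": "2022-11-10T00:00:00Z",
        },
    },
    # The "product tree" and "branches" concepts mentioned above live here;
    # the product_id is the key that ties products to vulnerabilities below.
    "product_tree": {
        "branches": [{
            "category": "vendor",
            "name": "Example Corp",
            "branches": [{
                "category": "product_name",
                "name": "ExampleProduct",
                "product": {"product_id": "EP-2.1",
                            "name": "ExampleProduct 2.1"},
            }],
        }],
    },
    "vulnerabilities": [{
        "cve": "CVE-2022-12345",
        "product_status": {"known_affected": ["EP-2.1"]},
        "remediations": [{
            "category": "vendor_fix",
            "details": "Upgrade to ExampleProduct 2.2",
            "product_ids": ["EP-2.1"],
        }],
    }],
}

print(json.dumps(advisory, indent=2))
```

Even this toy skeleton hints at why a “CSAF for Dummies” tool matters: the nesting of vendor and product branches, and the product IDs that tie remediations back to products, have to be exactly right before any consuming tool can make sense of the advisory.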

Second, a machine-readable advisory format, in order to be usable, requires tools that read the format. And here the story is the same: there are no tools that read CSAF documents – not even a simple parser. There is a parser under development, but that’s all. And frankly, a parser tool isn’t going to do end users a lot of good. Since most medium-to-large organizations utilize some tool or tools to manage vulnerabilities on their network (the tool may go under the name “scanner”, “vulnerability management”, “configuration management”, “asset management”, and probably other names as well), just parsing a CSAF file so that it’s readable by ordinary humans doesn’t get the information into their vulnerability management tool, where it’s needed. At the moment, the user is going to have to key the information in by hand, unless they’ve created their own tool to do this.

You probably get the idea: CSAF is a machine-readable vulnerability reporting format, but currently no machine can create or read CSAF documents – which raises the question of what “machine readable” means in practice. If CSAF 2.0 had just been recently released, I wouldn’t be too bothered by this fact. Indeed, I wasn’t bothered a year ago that there weren’t any tools for it, since it was then only three months since CSAF 2.0 had been approved (although it had been under development for at least a couple of years).

But to be honest, the fact that it’s a year later and there still aren’t tools available for producing or utilizing CSAF documents makes me wonder why that’s the case and when this situation will change. The VEX working group has been repeatedly assured that such tools are under development (as we were 18 months ago, when we decided to base the VEX format on CSAF); and it’s certainly true that they’re under development. But I won’t recommend that any software supplier even start learning to create CSAF documents until someone can provide me a firm date by which a CSAF creation tool will be ready (and the date had better be before, say, July 1, 2023; anything beyond that isn’t a firm date – it’s a wish and a prayer).

Of course, it still won’t help when there’s a CSAF creation tool, if there aren’t also CSAF consumption tools. And those need to be something more than a parser, as described above. But I’ll cut CSAF some slack: If there’s a parser ready by next July 1 along with a CSAF production tool, I’m willing to stipulate that it might not take a huge additional effort to put together the required interfaces to vulnerability management tools (in fact, the tool vendors themselves might create them). So, by the end of 2023, end user organizations might be able to actually utilize CSAF advisories, if the production and parser tools are ready by July 1.

Getting back to the CISA blog post, it seems clear that CISA should never have advised that suppliers start putting out CSAF vulnerability advisories without also warning them that a) they need to be prepared to invest a lot of time learning the CSAF format, and b) they have to understand that currently no end user organization will be able to do anything with the advisories when they get them.

CISA’s second recommendation was “Use Vulnerability Exploitability eXchange (VEX) to communicate whether a product is affected by a vulnerability and enable prioritized vulnerability response”. So how does this recommendation fare? Are there tools to produce and utilize VEX documents? Fortunately, the picture is much brighter there. There is at least one open source tool now, Dependency-Track, that both creates and reads VEX documents. Plus, I know of at least one other tool today, Stack Aware, that reads VEX documents. Moreover, I’ve been told there are a number of other tools, for both creating and ingesting VEX documents, that will be available soon.

So, it seems that CISA’s second recommendation was a good one. However, there’s one small problem in what they recommended: They only mentioned the CSAF-based VEX format. Since that format is just a subset of the full CSAF format (and that subset still hasn’t been spelled out, beyond a short NTIA document that I prepared the first draft of in 2021, but which I now realize isn’t complete), any tools to create or read CSAF VEX documents will need to wait on tools that do the same for all CSAF documents – thus, what I just said about CSAF documents in general applies directly to CSAF VEX documents as well.

However, there is another VEX format. In January 2022, Steve Springett and Patrick Dwyer, co-leaders of the OWASP CycloneDX (CDX) project, surprised everyone (including me, who had by then become friends with Steve) by announcing that CycloneDX v1.4 includes VEX capability. I can confirm that the CDX VEX implementation is quite straightforward, so it doesn’t surprise me that there are already tools both to read and write CycloneDX VEX documents.
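To illustrate why I call the CDX VEX implementation straightforward, here is a minimal sketch of a CycloneDX 1.4 VEX statement, again as a Python dictionary. The structure follows the CycloneDX 1.4 schema, but the CVE, the BOM reference and the analysis detail are invented for illustration:

```python
import json

# Minimal CycloneDX 1.4 VEX statement saying a product is NOT affected by
# a CVE reported against one of its components. Structure follows the CDX
# 1.4 schema; the CVE, ref and detail text are invented examples.
vex = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.4",
    "version": 1,
    "vulnerabilities": [{
        "id": "CVE-2022-12345",
        "source": {
            "name": "NVD",
            "url": "https://nvd.nist.gov/vuln/detail/CVE-2022-12345",
        },
        "analysis": {
            "state": "not_affected",
            "justification": "code_not_reachable",
            "detail": "The vulnerable function is never called by this product.",
        },
        # "ref" points at a component in the corresponding CycloneDX SBOM.
        "affects": [{"ref": "urn:cdx:example-serial/1#pkg:npm/examplelib@1.2.3"}],
    }],
}

print(json.dumps(vex, indent=2))
```

Compare this flat vulnerabilities list with the CSAF product-tree machinery sketched earlier in the post: a tool that already emits CycloneDX SBOMs has a much shorter path to emitting VEX as well.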

So, CISA provided great advice in their second recommendation: Software suppliers should create VEX documents in CDX format, since there are currently tools both to create and read them. In fact, the same applies to the first recommendation: The CycloneDX format can be used to create Vulnerability Disclosure Reports (VDR) just as easily as it can be used to create VEX documents. As long as they’re not committed to CSAF and nothing else, a supplier will be able to “Publish machine-readable security advisories” per CISA’s recommendation.

Funny how these things work out… 

Reminder: I'll be doing a podcast that will discuss VEX in depth on Wednesday, Nov. 30 at 10AM ET. I'll be able to discuss these issues in a lot more detail. Hope you can join me!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

Monday, November 21, 2022

The purl in your future


In September, I announced that the SBOM Forum, the informal discussion group I started last March, had published a “proposal” to address the naming problem. This problem is one of the most important obstacles currently limiting use of software bills of materials (SBOMs) by end user organizations (although I pointed out, as I do frequently, that software developers are already using SBOMs heavily to learn about vulnerabilities in the software they’re building – which shows the benefits are real).

In this recent post, I described the problem, which centers on the CPE identifier that’s the basis for the National Vulnerability Database (NVD), using excerpts from our proposal. While there are six particular causes of the problem, their result is the same: Only a small percentage – probably under five percent – of component names found in an SBOM can be located through a search of the NVD, even using a properly-constructed CPE name. If a developer wishes to find more than this small percentage of names, they need to utilize various heuristic schemes like AI and fuzzy logic to increase the percentage.

The good news is that, by all reports I’ve heard, it’s possible to find a respectable percentage of component names in the NVD using these schemes. But the bad news is that having to do this means that, until at least a partial solution is implemented (and ours is the only proposal I’ve heard about, although I know CISA will be soliciting other proposals as well – something a government agency has to do), producing SBOMs is always going to require a substantial “manual” effort; this process can’t be fully automated until the component names produced by the automated process can in principle always be found in a vulnerability database like the NVD or OSS Index.

How does our proposal address this problem? It proposes that the NVD start accepting other identifiers for software and hardware besides CPEs; there is no need to replace CPEs, since existing CPEs can be retained and new ones can still be created. I’ll discuss our software naming proposal in this post, and our hardware naming proposal in a future post.

For software, we’re proposing that the NVD accept an additional identifier called purl, which stands for “package url”. While there is no way to know for sure, we believe that accommodating purl will address 70-80 percent of the software naming problem in the NVD. Purl is already in widespread use in the open source community; it is the identifier used in Sonatype’s OSS Index and most other open source databases.

Purl was developed primarily to address the problem that the “same” open source component can often be found in multiple repositories (package managers), where the code may be somewhat different in each one; the same component might also appear under different names in different package managers. Purl provides a name that is tied to the download location. If the user knows where they downloaded the component from, and its name in that repository, they can always construct a purl that matches the one the component supplier (which is often an open source community, of course) used when reporting a vulnerability for the component; thus, they should always be able to find the vulnerabilities reported for that component. That certainly can’t be said for CPE, or for any other identifier that relies on a centrally maintained database.
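To make this concrete: a purl is just a string of the form pkg:type/namespace/name@version, where the “type” names the ecosystem or package manager. The helper function below is a toy illustration of the idea only (real implementations, such as the packageurl-python library, also handle percent-encoding and type-specific rules), but the log4j-core purl it prints is a real one:

```python
# A purl has the form pkg:type/namespace/name@version (plus optional
# qualifiers and subpath). This tiny helper only illustrates the idea;
# real implementations (e.g. the packageurl-python library) also handle
# percent-encoding and type-specific naming rules.
def make_purl(pkg_type, name, version, namespace=None):
    purl = f"pkg:{pkg_type}/"
    if namespace:
        purl += f"{namespace}/"
    return purl + f"{name}@{version}"

# The Maven component at the center of the log4j incident:
print(make_purl("maven", "log4j-core", "2.14.1",
                namespace="org.apache.logging.log4j"))
# pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1

# The "same" component in a different ecosystem gets a different purl,
# which is exactly the point: the identifier encodes where it came from.
print(make_purl("npm", "lodash", "4.17.21"))
# pkg:npm/lodash@4.17.21
```

Anyone who knows the download location and the name in that repository can construct this string independently and arrive at exactly the same identifier, with no central registry involved.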

Since 90% of software components are open source, purl was the natural choice to be the basis for the SBOM Forum’s proposal. However, as you’ll see if you read our whole proposal, we have utilized the unique properties of purl to incorporate names for proprietary (or “closed source") software into the proposal as well, utilizing SWID tags (I plan to discuss this in a post in the near future).

Purl is unique among identifiers in that it doesn’t require a centralized database. How is that possible? Here’s how we answer that question in our proposal:

Our solution is based on…the distinction between intrinsic and extrinsic identifiers. As described in an excellent article[1], extrinsic identifiers “use a register to keep the correspondence between the identifier and the object”, meaning that what binds the identifier to what it identifies is an entry in a central register. The only way to learn what an extrinsic identifier refers to is to make a query to the register; the identifier itself carries no information about its object.

A paradigmatic example of an extrinsic identifier is the Social Security number. These numbers are maintained in a single registry (presumably duplicated for resiliency purposes) by the Social Security Administration. When a baby is born or an immigrant is given permission to work in the US, their number is assigned to them. The only way the person “behind” a Social Security number can be identified is by making a query to the central registry (which of course is not normally permitted) or by hacking into the registry.

By contrast, intrinsic identifiers “are intimately bound to the designated object”. They don’t need a register; the object itself provides all the information needed to create a unique identifier. What intrinsic identifiers need is an agreed-on standard for how that information will be represented. Once a standard is agreed on, anyone who has knowledge of the object can create an identifier that will be recognizable to anyone in the world, as long as both creator and user of the identifier follow the same standard.

An example of an intrinsic identifier is the name of a simple chemical compound. As the article states, “We learned in high school that we do not need a register that attributes different identifiers to all possible chemical compounds. It’s enough to learn once and for all the standard nomenclature[2], which ensures that a spoken or written chemical name leaves no ambiguity concerning which chemical compound the name refers to. For example, the formula for table salt is written NaCl, and read sodium chloride.”

In other words, simply knowing information about the makeup of a simple chemical compound is sufficient to create a name for the compound, which will be understandable by chemists that speak any language - as long as you follow the standard when you create the name. NaCl refers to table salt, no matter where you are or what language you speak[3].

An example of an extrinsic identifier that is very pertinent to our proposal is a CPE name. CPE names are assigned by a central authority (NIST) and stored in a register. Whenever a user searches for a CPE name in the National Vulnerability Database, the register is searched to determine which entry the name refers to.

Only CVE Numbering Authorities (CNAs) – which are authorized by the CVE Program, not NIST – may assign CVE numbers; currently, there are around 200 CNAs, mostly software suppliers or cybersecurity service providers. Organizations that wish to report a CVE (vulnerability), but are not themselves a CNA and are unable to identify an appropriate CNA to do this, may submit their requests to the MITRE Corporation, the “CNA of Last Resort”, which will process the request.

In most cases, a software supplier applies to NIST for a CPE name for a product. When the name is assigned, the supplier can report vulnerabilities (each of which has a CVE name, assigned by CVE.org[4]) that apply to the product. Even though software products are named in many different ways as they are being developed and distributed, the product name used in the CPE is based on a specification[5] that has no inherent connection to the product itself.

The solution we are proposing for the software naming problem is based on an intrinsic identifier called purl[6], a contraction of “package URL”. As in the case of a simple chemical compound, knowing certain publicly available attributes of a software product enables anyone to construct the correct purl for the product. Moreover, anyone else who knows the same attributes will be able to construct exactly the same purl, and therefore find the product in a vulnerability database without having to query any central name registry (which of course does not exist for purl, since it is not needed).

Purl was originally developed to solve a specific problem: A software product will have different names, depending on the programming language, package manager, packaging convention, tool, API or database in which it is found.[7] Before purl was developed, if someone familiar with, for example, a specific package manager (essentially, a distribution point for software) wanted to talk about a specific open source product with someone familiar with a different package manager, the first person would need to learn the name of that product in the second package manager, assuming it could be found there. This would always be hard, because – of course – there is no common name to refer to.

This situation is analogous to the case in which an English speaker wishes to discuss avocados with someone who speaks both Swahili and English. The English speaker doesn’t know the Swahili word for “avocado”. To find that word, the English speaker uses an English/Swahili dictionary to learn that the Swahili word for avocado is parachichi. Neither avocado nor parachichi has any connection to an actual avocado – they are simply names. They are extrinsic identifiers, and there is no way to know that they refer to the same thing without a central register, which in this case is the dictionary.

However, what if the two speakers didn’t have ready access to an English/Swahili dictionary, either in hard copy or online? Most likely, the English speaker would find a picture of an avocado and show it to the Swahili speaker. The latter would smile broadly and say they now understand exactly what an avocado is. In fact, even if a dictionary were available, it would probably be easier for the English speaker to show the picture to the Swahili speaker. The picture is an intrinsic identifier; it is based on an attribute of the thing identified – in this case, what the thing looks like. Even if there were no English/Swahili dictionaries in existence, the picture would be a perfectly acceptable (in fact, preferable) identifier.

If you want to know how we envision purl working in the NVD, go to the discussion starting on page 10 of our proposal. In future posts, I plan to discuss how our proposal addresses proprietary software and intelligent hardware devices.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[3] More complex compounds may require a CAS Registry Number, which is a centralized database.

[7] The reader does not need to understand what all these items are, in order to understand the principle behind purl. In principle, nothing would be different if each of these items were a language, so the reader might think of the items simply as different languages.

Thursday, November 17, 2022

VEX without the puns


To say there is confusion about what constitutes VEX is a big understatement. Conservatively, I’d say upwards of 95% of what’s been written about VEX, or discussed in webinars or live presentations, is wrong or misleading; this includes a lot of what I’ve written.

However, I’m not alleging this is due to some vast conspiracy to bamboozle the cybersecurity community. Nobody making these wrong or misleading statements is acting in anything other than good faith. But good faith doesn’t make those statements correct.

There are two primary reasons for this situation:

1.      The VEX working group, which started under the NTIA in 2020 and continued under CISA this year, has only published two documents on VEX. These can both be found at cisa.gov/sbom[i]. While these are both good, neither one attempts to provide an authoritative definition or description of VEX.

2.      More importantly, almost all of the false or misleading statements were based on expectations – really, hopes – for events we all thought would occur, but which in fact didn’t. Yet as often happens, the words didn’t change nearly as quickly as events on the ground. This is why statements that were neither wrong nor right when they were made are now wrong. It is also why people whose knowledge of VEX is based on those erroneous statements make more erroneous statements, in classic whisper-down-the-lane fashion.

Because of this situation, VEX documents – which many people believe to be in wide production and use at this time – are neither being produced nor used, to the best of my knowledge. How could this be otherwise, given that no parties interested in producing or using VEXes could possibly understand how to do either?

Since the human race has existed for approximately 200,000 years without VEX documents, the fact that this situation isn’t changing currently might not be considered high tragedy. However, the VEX idea came about because of an important realization: Software bills of materials (SBOMs) will never be widely produced or used until VEXes are also produced and used. No VEXes means (almost) no SBOMs. It’s literally that simple. 

Thus, if we want to have SBOMs, we have to figure out a way to have VEXes. Or at least we have to figure out how to have the information that we’ve been thinking (erroneously, I now believe) needs to be delivered in VEX documents.

How this situation came about and how it might be remediated are the subjects of a web event I’ll be participating in on November 30 at 10AM Eastern Time. It’s sponsored by Scribe Security. I hope you can join in!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] The NTIA published a one-page document in 2021, which I originally drafted. It is out of date and should be ignored.

Friday, November 11, 2022

Who will tame the CPE beast?

I confess that I and almost every other person who writes or speaks about software bills of materials have been misleading you for years. We have done this when we’ve spoken about SBOM as something that is well defined. The two main formats are well defined, but almost all the best practices regarding production, distribution and use of SBOMs are still far from being defined, or at least agreed upon.

We have also been misleading you when we’ve implied or stated that SBOMs were being widely (or even narrowly) used for the purpose that most people talk about (including in Executive Order 14028): to enable software end users to learn about the most important exploitable component vulnerabilities in the software they utilize, so they can coordinate with the suppliers to patch or otherwise mitigate them.

There are many issues standing in the way of widespread SBOM use, but there are a few that are show-stoppers, meaning I don’t see any way SBOMs will be used widely for this purpose until those problems are addressed. One of those is the naming problem, which can be summarized as, “When an SBOM is generated by an automatic process, only a small percentage of component names will be found through a search of the National Vulnerability Database (NVD).” In other words, in very few cases will a user who looks for a component name in the NVD find the component – meaning the user (or even a software tool acting on their behalf) will only be able to learn about a small percentage of component vulnerabilities through an NVD search, despite having an SBOM listing all the components.

How small a percentage is this? The Director of Product Security for a very large software supplier, who has participated for years in the SBOM and VEX discussions under the NTIA and now CISA, as well as the informal group I started called the SBOM Forum, had previously used the figure of 20%. Of course, that’s bad, since it means the user won’t be able to learn (through searching the NVD) about vulnerabilities applicable to 80% of the components in an SBOM they receive.

However, that same person was challenged regarding this figure by another very large supplier (of both software and intelligent devices) at one of our SBOM Forum meetings. Did the other supplier question how the 20% figure could be so low? No, they asked why it was so high, because their experience was that it was below 5%. The original supplier admitted that 20% was a very conservative number and agreed that 5% is closer to their actual experience.

So, the good news is that it isn’t true that 80% of component names can’t be found in the NVD; the bad news is the figure is really more like 95%. This means you’ll only be able to find about 5% of component names from an SBOM in the NVD.

Needless to say, SBOMs wouldn’t be used at all if users could only find 5% of component vulnerabilities. Yet, even though end users are barely using SBOMs at all today, suppliers are using them very heavily. How can they do that? It’s because every supplier who needs to use SBOMs – or in many cases, the consultants who help suppliers use SBOMs – has some method, based on AI, fuzzy logic, throwing bodies at the problem, prayer, etc., to get around this problem. The good news is that every supplier or consultant that I’ve talked to about this problem says they’ve been able to get the matches up to an acceptable level, although it’s nowhere near 100%.
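As a toy illustration of what those workarounds look like in principle (this difflib-based sketch is my own invention, not any particular supplier’s or consultant’s actual method, and the product list is made up), a tool might normalize the component name taken from the SBOM and fuzzy-match it against a locally cached list of known NVD product strings:

```python
import difflib

# Toy sketch of fuzzy component-name matching. The product list is
# illustrative; real tools match against the full CPE dictionary and use
# far more sophisticated normalization than this.
known_products = ["openssl", "openssl_framework", "apache_http_server"]

component_name = "OpenSSL 1.1.1"                # as it might appear in an SBOM
normalized = component_name.lower().split()[0]  # crude normalization

matches = difflib.get_close_matches(normalized, known_products,
                                    n=2, cutoff=0.5)
print(matches)  # ['openssl', 'openssl_framework']
```

Each such match still has to be reviewed, because a close string match is not proof that two names refer to the same product; that review is exactly the per-SBOM “care and feeding” that keeps the process from being fully automated.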

So while this is an acceptable workaround for those suppliers willing to invest the time, money or both that is required, the fact is that it makes it impossible to “operationalize” SBOM production; every SBOM produced will need its own care and feeding, rather than being a completely automated process. SBOMs will never be produced in the volume required, if every one of them needs to be massaged in this way.

This is why the SBOM Forum developed a workable “solution” for maybe 70-80% of the naming problem in the NVD, which we described in this paper that we released in September. It’s now being evaluated by CISA and MITRE, and I’m reasonably optimistic it will be implemented at least in part. In this post and (probably) my next one, I’ll lay out the basics of the argument that we made in the paper, since I’ll admit that it’s densely written and a lot of people may have given up on reading it.

The NVD’s problems lie mostly with CPE names, the only identifiers supported by the NVD. Here are the six main problems:

  1. Vulnerabilities are identified in the NVD with a CVE number, e.g. “CVE-2022-12345”. A CPE is typically not created for a software product until a CVE is determined to be applicable to the product. However, many software suppliers have never identified a CVE that applies to their products, so they have never created a CPE for them. This is almost certainly not because the products have never had vulnerabilities, but because the suppliers, for whatever reason, have not submitted any vulnerability reports for those products for inclusion in the National Vulnerability Database.

The worst part of this problem is that the result of an NVD search will be the same in both cases – the case where a vulnerability has never been identified in a product and the case where the supplier has never felt inclined to report a vulnerability, even if their product is loaded with them. The search will yield “There are 0 matching records” in both cases. Someone conducting a search won’t know which case applies, so they may believe the product has no vulnerabilities, when the truth is very different.

  2. There is no error checking when a new CPE name is entered in the NVD. Therefore, if the CPE name that was originally created for the product does not properly follow the specification, a user who later searches for the same product and enters a properly-specified CPE will receive an error message. Unfortunately, it is, once again, the same error message that they would receive if the original CPE name were properly specified but there are no CVEs reported against it: “There are 0 matching records”.

In other words, when a user receives this message, they might interpret this to mean that there is a valid CPE for the product they’re seeking, but a vulnerability (CVE) has never been identified for that product – i.e., it has a clean bill of health. However, in reality it would mean the CPE name was created improperly. In fact, there might be a large number of CVEs attached to the off-spec CPE, but without knowing that name, the user will not be able to learn about those CVEs.

Another explanation for the “There are 0 matching records” error message is that the user had misspelled the CPE name in the search bar. Again, the user would have no way of knowing whether this was the reason for the message, or whether the message means the product has no reported vulnerabilities.

It is to avoid problems like this that most organizations that use the NVD employ advanced search techniques based on AI or fuzzy logic[1]. While that can greatly reduce the number of unsuccessful searches, having to resort to this makes it impossible to conduct truly automated searches. Considering that an average-sized organization might easily need to conduct tens of thousands of NVD searches per day and a service provider doing this on behalf of hundreds of customers would need to conduct some large multiple of that number, the magnitude of this problem should be apparent.

  1. When a product or supplier name has changed since a proprietary product was originally developed (usually because of a merger or acquisition), the CPE name for the product may change as well. Thus, a user of the original product may not be able to learn about new vulnerabilities identified in it, unless they know both the current name of the supplier and the current name of the product. Instead, this user will receive the same “There are 0 matching records” message.
  2. A similar consideration holds true for supplier or product names that can be written in different ways, such as “Microsoft(™)” and “Microsoft(™) Inc.”, or “Microsoft(™) Word” and “Microsoft Office(™) Word”, etc. A user searching on one of the variants of a supplier or product name may learn about just the CVEs that are applicable to the variant they entered, rather than all of them.
  3. Sometimes, a single product will have many CPE names in the NVD because they have been entered by different people, each making a different mistake. For this reason, it will be hard to decide which name is correct. Even worse, there may be no “correct” name, since each of the names may have CVEs entered for it. This is the case with OpenSSL (e.g. “OpenSSL” vs “OpenSSL_Framework”) in the NVD now. Because there is no CPE name that contains all of the OpenSSL vulnerabilities, the user needs to find vulnerabilities associated with each variation of the product's name. But how could they ever be sure they had identified all the CPEs that have ever been entered for OpenSSL?
  4. Often, a vulnerability will appear in one module of a library. However, because CPE names are not assigned on the basis of an individual module, the user may not know which module is vulnerable, unless they read the full CVE report. Thus, if the vulnerable module is not installed in a software product they use but other modules of the library are installed (meaning the library itself is listed as a component in an SBOM), the user may unnecessarily patch the vulnerability or perform other unnecessary mitigations. In fact, it’s likely that at least some of the patching performed for the log4j vulnerabilities was unnecessary, for precisely this reason.

What is needed is an identifier for the software and hardware components in a BOM that, when entered in the NVD, will:

  1. Almost always match to the correct product, if the product is listed.
  2. Almost never match to an incorrect product.
  3. Not require that the identifier already exist in the NVD. This is almost always required today, in order for the user to get a correct response. If the user searches on a CPE name that doesn’t exist in the NVD, the error message they receive, “There are 0 matching records”, is the same one they would receive if the CPE does exist, yet it has no reported vulnerabilities.
  4. Never yield a result that might be interpreted to mean the product was found but there are no applicable vulnerabilities, when in fact one of the following is the case:
    1. The wrong identifier was entered in the search bar; or
    2. An off-spec CPE was initially created for the product, so the product cannot be found by searching on a CPE that was created according to the spec; or
    3. The name and/or supplier of the product has changed due to a merger or acquisition. Thus, the CPE entered by a user of the original product won’t match the current CPE name.
  5. Identify the vulnerable module in a library rather than just the entire library, so that, if that module isn’t installed in a product but other modules are installed (meaning the product will appear to be vulnerable when in fact it isn’t[i]), users will not patch or perform other mitigations that are not necessary.
  6. When a supplier and/or product name has changed for a product, allow there to be separate identifiers - and thus separate locations to report CVEs - for the different supplier or product names; thus the different supplier/product names will be treated as separate products.
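To see why exact string matching makes requirements 1 and 2 above so hard to meet with CPE, here is a minimal sketch of building a CPE 2.3 formatted string (the binding defined in NIST IR 7695). The vendor and product values are hypothetical, chosen to echo the OpenSSL naming problem described earlier.

```python
def cpe23(vendor: str, product: str, version: str = "*") -> str:
    """Build a CPE 2.3 formatted string per NIST IR 7695.

    The full binding has 13 colon-separated components ("cpe", "2.3", and
    11 attributes); unused attributes are wildcards ("*"). The values
    passed in here are hypothetical.
    """
    return f"cpe:2.3:a:{vendor}:{product}:{version}:*:*:*:*:*:*:*"

# Exact string matching means any variation in how a name was entered
# produces a different identifier, and a search on one variant will never
# find CVEs filed under another:
a = cpe23("openssl", "openssl", "1.0.2")
b = cpe23("openssl", "openssl_framework", "1.0.2")
assert a != b  # same product, two irreconcilable identifiers
```

Nothing in the format itself ties the two strings together, which is why every naming variant effectively becomes a separate "product" in the NVD.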

Who or what is the hero that will tame the CPE beast? Stay tuned to this blog for the exciting conclusion!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[1] Or, in the case of at least one third-party service provider, a “small army” of CPE-resolvers.


[i] A VEX from the supplier, saying that the vulnerability isn’t exploitable even though the component itself is present, would address this problem. However, an identifier that applied at the module level would prevent this problem from even occurring.

Tuesday, November 8, 2022

A star is born


For the last six months or so (maybe more), Brandon Lum of Google has occasionally been participating in two or three of the CISA SBOM workgroups, especially the VEX workgroup. His title has something to do with open source software, and since a lot of people in the workgroups are involved with OSS, I wasn’t surprised that he would be attending these meetings.

A little more than a month ago, he started making announcements in some of the meetings about a new Google project called GUAC and asking for people to participate in it. I didn’t pay a lot of attention to the details, but I knew it had something to do with software supply chain security, and with SBOMs in particular. Since open source software depends on volunteers to develop and maintain the projects, this wasn’t unusual, either.

Moreover, there was one reason why I deliberately didn’t pay a lot of attention to what Brandon was saying about GUAC: Last year, Google announced another project aimed at software supply chain security called SLSA, which has been very well received by the developer community. It’s essentially a framework that will allow developers to identify and take steps to prevent attacks on the software build process, which nobody (that I know of, anyway) had even thought about before SolarWinds.

(SolarWinds fell victim to an extremely sophisticated 15-month attack conducted by – according to Microsoft’s estimate – about 1,000 people working out of Russia. There’s a really fascinating article on CrowdStrike’s website about SUNSPOT, the malware that the Russians purpose-built for this attack. In fact, they tested it during a three-month proof of concept conducted inside the SolarWinds software build environment, then deployed the malware for 5 or 6 months, without ever being detected. This was, along with Stuxnet, easily the most sophisticated malware ever developed. If only the Russians would start putting all that great expertise toward a good use, for a change! BTW, don’t try to understand everything in the article. It’s just amazing to see what Sunspot was able to do, all without any direct Russian control).

But I digress. When I heard Google was following a project called SLSA with one called GUAC, I found this a little too cute for my taste (can Google CHIPS be far behind?). So, frankly, I tuned Brandon out when he brought this up.

However, two weeks ago I saw a good article in Dark Reading – which linked to a great Google blog post – about GUAC. I also found out that my good friend Jonathan Meadows of Citi in London – a real software supply chain guru, although very focused on how ordinary schlumps like you and me (OK, maybe not you) can secure our software supply chains without having all of his knowledge and experience – was involved with GUAC from the get-go. These two data points convinced me that I should be paying a lot more attention to GUAC.

So I did. And this is what I found:

  1. The project intends to present to users a “graph database”, which in principle links every software product or intelligent device with all of its components, both hardware and software components, at all “levels”. You might think of the database as being based on a gigantic SBOM dependency tree that goes in all directions – i.e., each product is linked with all its upstream dependencies, as well as the downstream products in which it is a component (or a component of a component).
  2. One of the important functions of this database is to provide a fixed location in “GUAC space” (my term) for software products and their components. Artifacts necessary for supply chain analysis, like SBOMs and VEX documents, can be attached to these locations, making it easy for the user of a software product to learn what new artifacts are available for the product (actually, the nodes of the database are versions of products, not the products themselves).
  3. While the database will incorporate any artifacts created in the software supply chain, the three types of artifacts incorporated initially will be SBOMs, Google SLSA attestations, and OpenSSF Scorecards. The idea is that, ultimately, all the documents required for an organization (either a developer or an end user organization) to assess their software supply chain security will be available at a single internet location (and I don’t think the location would change – just its attributes).
  4. The artifacts can be retrieved by GUAC itself – the supplier of the artifact won’t have to put it in place “manually”. Google says, “From its upstream data sources, GUAC imports data on artifacts, projects, resources, vulnerabilities, repositories, and even developers.”
  5. Artifacts like SBOMs can be contributed and made available for free, but they don’t have to be. The Google blog post says, “Some sources may be open and public (e.g., OSV); some may be first-party (e.g., an organization’s internal repositories); some may be proprietary third-party (e.g., from data vendors).” In other words, a vendor that has prepared documents or artifacts related to security of a product can attach a link to the product in GUAC space. Someone interested in one of those artifacts can follow the link and, if they agree to the price, purchase it from the vendor.
  6. Thus, one function of GUAC can be enabling a huge online marketplace. However, unlike most markets related to software, the user won’t have to search on the product name to find what’s available for it. Instead, the user will just “visit” the fixed location for the product and look through what’s available there.
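To make the graph idea in the list above concrete, here is a toy in-memory sketch. The product names and artifact labels are made up, and GUAC's actual data model is far richer; the point is just that nodes are product versions, edges link a version to its components, and artifacts hang off the version nodes.

```python
from collections import defaultdict

# Toy graph in the spirit of GUAC: each key is a product *version*.
components = defaultdict(list)  # version -> list of component versions
artifacts = defaultdict(list)   # version -> list of attached artifacts

def add_component(product: str, component: str) -> None:
    components[product].append(component)

def attach_artifact(product: str, artifact: str) -> None:
    artifacts[product].append(artifact)

def all_dependencies(product: str) -> set[str]:
    """Walk downstream through the graph: every component,
    and every component of a component."""
    seen: set[str] = set()
    stack = list(components[product])
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(components[node])
    return seen

# Hypothetical data: an application, its library, and that library's library.
add_component("acme_app 2.1", "libfoo 1.4")
add_component("libfoo 1.4", "libbar 0.9")
attach_artifact("acme_app 2.1", "SBOM (SPDX)")
attach_artifact("libfoo 1.4", "SLSA attestation")

print(all_dependencies("acme_app 2.1"))  # direct and transitive components
```

A user who "visits" the node for acme_app 2.1 finds both its full dependency tree and every artifact attached along the way, which is the one-stop-shopping function described above.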

I can imagine the inspiration for GUAC might have occurred when some Google employee involved in software supply chain security grew frustrated at the number of different internet locations they had to visit to get the artifacts they needed to analyze just a single product. Instead of a person running ragged while searching for the most up-to-date artifacts, how about having the computer do the legwork in advance? The user would just have to go to the right location, to find everything they need in one place and with one search.

In fact, Google has done this before. Those of you who were involved with computers in the late 1990s (once the internet had supposedly arrived, but was often proving to be slower than just doing some things by hand) may remember that the search engines – Yahoo, DEC’s AltaVista, etc. – just searched for character strings. If you were good (plus lucky) and entered a string that would return just the items you were interested in but no others, searching was a pleasant experience.

However, if you weren’t good and lucky, and just searched on a topic like “mountain vacation”, you would get hundreds of pages of results, with no assurance that what you were really looking for wasn’t on the very last of those pages. Google came out with a very intelligent search engine that used all sorts of tricks – like ranking results by their popularity with others – to make it more likely that what you were looking for would be on the first page or two. The rest, as they say, is history.

I certainly don’t think GUAC will have anywhere near the success that the search engine had. After all, just about everyone in the world can use a general search engine, but only a small fraction of the world’s inhabitants are involved with software supply chain security – although given how I spend my time nowadays and the people I meet online, I’m sometimes tempted to think it’s actually a very large percentage.

My guess is that in five years, anyone involved in software supply chain security will be spending a lot of their time navigating the highways and byways of the GUAC world. For some, the need to do this will become apparent sooner rather than later. For example, if you’re involved with one of the companies for which distribution of SBOMs is an important part of the business model, you should be figuring out how you can incorporate GUAC into that model – although I’m certainly not saying you should abandon whatever you’re doing, or planning to do, now.

Another idea: While I haven’t tried to look at tech specs on the project yet, I’m sure there must be some sort of fixed address for a software product (or a component, which of course is just a product that’s been incorporated into another product) within GUAC world. I can see that address becoming a kind of universal name repository for a software product, which of course can have multiple names over its lifecycle. Currently, if you have an old version of a product whose name has subsequently changed, and you want to learn about vulnerabilities that have recently been reported for the product, you’re out of luck, unless you happen to know the current name of the product. That (and a lot of other things) may change with GUAC.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

Tuesday, November 1, 2022

NIST’s new guidance for IoT and IIoT cybersecurity

In the past 3-4 years, NIST has produced a dizzying array of guidance documents for IoT security. This isn’t because they can’t make up their minds. It’s because a) the IoT marketplace has been going through rapid change, and b) NIST has been handed at least two “regulatory” mandates for IoT (in the IoT Cybersecurity Improvement Act of 2020 and the IoT device labeling program mandated by Executive Order 14028), even though NIST has never been a regulator. Both mandates changed multiple times after they were assigned, and in both cases they have finally been withdrawn from NIST’s purview.

Thankfully, now NIST’s duties regarding IoT seem to have returned to their traditional role, which they are quite good at: developing risk-based cybersecurity frameworks that are mandatory for the federal government and merely advisory for the private sector. The end product of the various NIST documents and initiatives regarding IoT security in the past few years is a new Interagency Report: NIST.IR 8425.

This is a very good document, and it was worth the many interim documents, workshops, etc. that led to it. Don’t get thrown off by the fact that it’s titled “Profile of the IoT Core Baseline for Consumer IoT Products”. “Consumer” is in the title because these were originally intended to be the guidelines behind the device labeling program in EO 14028, not because the NISTIR only addresses security of consumer (i.e. household) IoT devices.

The recent workshop at the White House showed that they now have a better idea for the device labeling program. They would like the FTC to run the program, although whatever criteria are used for assigning the label will almost certainly be NIST's - probably NIST.IR 8425. The FTC is a logical choice to run the program, since they're responsible for enforcing other regulations regarding the trustworthiness of products sold in the US, as well as the truthfulness of claims made about them by their manufacturers.

This move by the White House freed NIST to take the guidelines they’d published in February for the device labeling program and turn them into their final set of guidelines for both business and consumer (i.e. household) IoT products. NIST did this with only minor changes to the document, based in part on comments they’ve received since February.

The fact that NISTIR 8425 is intended to address both consumer and business (i.e. commercial and industrial) products is made clear at the beginning of the Abstract on the first page. The first sentence states that the publication “…identifies cybersecurity capabilities commonly needed for the consumer IoT sector (i.e., IoT products for home or personal use)”. However, the second sentence reads, “It can also be a starting point for businesses to consider in the purchase of IoT products.”

With the words “starting point”, NIST is clearly leaving the door open for more extensive business-oriented guidelines in the future. Any businessperson who yearns for more extensive guidelines today can always go back to NISTIRs 8259A and 8259B, both of which are good documents and go beyond what’s included in NISTIR 8425 (in fact, most of the guidelines in 8425 were selected from 8259A and B).

But NIST’s choice of words means that businesses wanting to base their IoT program on a recognized framework (at least, recognized in the US) would do well to follow NISTIR 8425 now, since any further NIST guidelines for business IoT are likely to build on that, not replace it. This applies both to manufacturers of IoT and IIoT devices that are designing new products or beefing up security on existing products, and to commercial and industrial firms that are looking for guidelines they can use to assess the security of the devices they buy.

There’s another use case for NISTIR 8425 in those commercial and industrial firms: as the basis for contract language. But please do me a favor: Before you write a contract term that reads something like, “Vendor must comply with NISTIR 8425”, keep in mind that “comply” is a meaningless term when it comes to this document (and to most other NIST documents).

For example, look at the first requirement in the “Data Protection” section on page 7: “Each IoT product component protects data it stores via secure means.” Of course, there are lots of ways to “protect data” by secure means. Which one should the manufacturer implement in the device?

That will very much depend on the destination of the device. If the device is for dry cleaning establishments, its security requirements can be much less strict than if it were going into, for example, a nuclear power plant. A device intended for a nuclear plant would probably need to employ the latest encryption technology to secure the data stored by the device, while a device intended for a dry cleaner might not need encryption at all. For them, just requiring the user to enter a password when they first log into the device might be all the protection required.

This is why simply asking a device manufacturer for “compliance” with 8425 will never work. Instead, the term might read, “Vendor shall be prepared to provide evidence that they have implemented a product risk management plan based at minimum on the guidelines provided in NIST.IR 8425” – or something like that. That way, the manufacturer will be assessed on whether it is taking steps in each area of capability listed in the NISTIR (there are ten “capabilities”, all but one of which have between one and three subordinate capabilities. The “Documentation” capability lists a large number of subordinate capabilities, which isn’t surprising).

However, there is one requirement that I think should be addressed by all device manufacturers, regardless of who their users are: They should report any vulnerability they learn of, in either the software or firmware installed in the device, to CVE.org. It seems very few IoT manufacturers are doing this now; yet it is the only realistic way for users of a device to learn about the vulnerabilities found in it.

My reason for requesting this is described in the last section of this article, which I wrote with Isaac Dangana of Red Alert Labs. RAL is a client of mine that works with IoT manufacturers to secure their devices, and with IoT consumers to verify that the devices they buy are and remain secure. While the article officially discusses SBOMs for IoT and IIoT devices, the need to report vulnerabilities for the device itself has nothing to do with SBOMs. Rather, it’s needed so that IoT and IIoT users can properly secure their devices.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.