A topic that seems to come up a
lot in the CISA SBOM meetings is “verification” of SBOMs. Of course, that could
mean a lot of things, but it usually seems to mean that the software customer
doesn’t believe their supplier can be trusted to accurately represent all the
components of the software in the SBOM. For example, the supplier might not
report a component that’s nine years old and is loaded with vulnerabilities, or
they might list a component as version 4.5, which was released three months
ago, when it’s in fact version 1.1, which was released in 2011.
Could this happen? Certainly it
could. However, you need to keep in mind that, once you start receiving SBOMs on
a regular basis for more than just one or two products, it’s inevitable there
will be a lot of empty spaces and “NOASSERTION” statements. This is because there are so
many problems with the naming of components (although help seems to be on the way on this issue,
I’m glad to report).
Might some of those empty spaces
be the result of deliberate obfuscation by the supplier? That’s certainly
possible. But some large suppliers estimate that over 90% of components
are either mis-identified or not identified at all in an SBOM that’s produced as part of their software build process (the most common scenario). Given that, how will you ever know whether a missing
identifier for a component is due to a deliberate act by the supplier, or
just due to the normal wear and tear of the naming problem? Answer: you won’t.
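To get a feel for how many gaps there are in an SBOM you’ve received, a rough count like the following can help. This is only a minimal sketch. It assumes an SBOM in the general shape of an SPDX 2.x JSON document that has already been loaded into a Python dict. The sample data and the helper names `is_identified` and `count_unidentified` are hypothetical, and my rule for what counts as “identified” (a real version plus a purl or CPE reference) is an assumption, not a standard.

```python
# Hypothetical sample in the general shape of an SPDX 2.x JSON document.
sample_sbom = {
    "packages": [
        {
            "name": "libfoo",
            "versionInfo": "4.5",
            "externalRefs": [
                {"referenceType": "purl",
                 "referenceLocator": "pkg:generic/libfoo@4.5"}
            ],
        },
        # Version not asserted and no external identifier.
        {"name": "libbar", "versionInfo": "NOASSERTION", "externalRefs": []},
        # Fields missing entirely (an "empty space").
        {"name": "mystery-blob"},
    ]
}

# Reference types treated as usable identifiers (an assumption of this sketch).
ID_TYPES = {"purl", "cpe22Type", "cpe23Type"}


def is_identified(pkg):
    """True if the package has a real version and at least one purl/CPE ref."""
    version = pkg.get("versionInfo", "")
    has_version = version not in ("", "NOASSERTION")
    has_id = any(ref.get("referenceType") in ID_TYPES
                 for ref in pkg.get("externalRefs", []))
    return has_version and has_id


def count_unidentified(sbom):
    """Return (number unidentified, total packages, names of unidentified)."""
    packages = sbom.get("packages", [])
    missing = [p.get("name", "<unnamed>") for p in packages
               if not is_identified(p)]
    return len(missing), len(packages), missing


missing_count, total, names = count_unidentified(sample_sbom)
print(f"{missing_count} of {total} components lack a usable identifier: {names}")
```

Note what this can and can’t tell you: it counts the gaps, but it says nothing about why they’re there.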
But there’s an even more important
reason why verifying an SBOM may never be possible: How could you ever do that,
even in principle? Here’s the problem: What you find in an SBOM will vary
widely depending on when in the software lifecycle the SBOM is produced. One of the
CISA workgroups recently finalized a document
on this issue, which is awaiting final approval for publication. The document
describes six SBOM Types. They’re all valid for specific use cases, but they’ll
always differ from each other, sometimes radically.
In many cases, the
SBOM you get will be created during the final Build stage, when the software code
(including components) will be “set in stone” – i.e., the code contained in the
binaries delivered to you, the customer, is exactly the code that went into the
final build. If you want to verify what the supplier did with the greatest
accuracy, you will need another SBOM created at the final Build stage. However,
a particular version of a product only goes through one final build. This means
that, unless you can roll back time and persuade the supplier to let you produce
your own SBOM during the final build of the version that you now utilize, you
won’t have a truly comparable SBOM to set against the one the supplier
provided you.
If you want to produce your own
SBOM without having to time-travel, you could create an Analyzed SBOM using a “binary
analysis” tool. Starting with the binary files distributed to customers, such a tool attempts to decompile the supplier’s code[i] and create the SBOM from
the result. Of course, this will never be a completely clean process and will usually
result in more serious naming problems than occur with just a Build SBOM.
In other words, probably the only
SBOM Type that will be within your power, as a customer, to produce will
inevitably be substantially different from the one the supplier provided to you.
The exception is when the supplier themselves used binary analysis to produce their SBOM. In
some cases, the supplier may have to do that, especially if they use older
languages like C or C++. But even then, the Analyzed SBOM produced by
the supplier will differ a lot from the one that you produce, since they bring
to it a lot of inside knowledge known only to the developer.
Of course, an SBOM produced at any
stage of the software lifecycle is interesting. In fact, some people who know a
lot more about this than I do (a low hurdle to clear, to be sure!) say the best SBOM is one that blends two or more
of the different types. For example, the Deployed SBOM is unique among the six
SBOM Types, in that it includes not just the software itself but everything
that is deployed with it: the installer, a container, runtime dependencies,
etc. Since every artifact that’s deployed in the user’s environment can be a
source of risk, knowing what’s inside all of these items is almost as important
as knowing what’s inside the software itself. On the other hand, since the
Deployed SBOM depends on binary analysis, it will never provide as good a
description of the software itself as the Build SBOM does. It might be best to
combine the two of them, although that in itself requires a lot of skill.
I hope you get the idea: To truly
verify an SBOM, you must have something comparable to measure it against. However,
it’s not likely that, without closely cooperating with the supplier, you’ll
ever be able to produce anything that’s comparable. And if verification
requires cooperating closely with the supplier,
that’s not exactly verification, is it?
This brings up an idea: Rather
than taking an adversarial position vs. the supplier and pretending it’s
possible for you to conduct an “independent” verification, you could utilize a
binary analysis tool to build your own SBOM, then discuss the differences
between the two SBOMs with the supplier. You both might learn something
interesting from doing that.
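As a starting point for that conversation, here is a minimal sketch of how the two SBOMs might be compared. It assumes each SBOM has already been reduced to a list of (name, version) pairs. The function names `normalize` and `diff_sboms` and all of the sample data are hypothetical. Because of the naming problem, a real comparison would need much fuzzier matching than the simple lowercase name match used here.

```python
def normalize(components):
    """Map each lowercased component name to the set of versions seen."""
    result = {}
    for name, version in components:
        result.setdefault(name.strip().lower(), set()).add(version)
    return result


def diff_sboms(supplier, analyzed):
    """Group the differences between two component lists into three buckets."""
    s, a = normalize(supplier), normalize(analyzed)
    shared = set(s) & set(a)
    return {
        "only_in_supplier": sorted(set(s) - set(a)),
        "only_in_analyzed": sorted(set(a) - set(s)),
        "version_mismatch": sorted(n for n in shared if s[n] != a[n]),
    }


# Hypothetical data: what the supplier reported vs. what binary analysis found.
supplier_sbom = [("LibAlpha", "2.0"), ("libbeta", "1.3"), ("libfoo", "4.5")]
analyzed_sbom = [("libalpha", "2.0"), ("libfoo", "1.1"), ("libgamma", "0.9")]

for category, names in diff_sboms(supplier_sbom, analyzed_sbom).items():
    print(category, names)
```

Each item in the result is a question to bring to the supplier, not evidence of wrongdoing: a difference might be their omission, your tool’s misidentification, or just two names for the same component.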
Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.
[i] Doing this may violate the supplier’s license agreement.