When I first started attending the
meetings of the NTIA Software Component Transparency Initiative (which of
course was the formal name for the “SBOM initiative”) in the summer of 2020, I
immediately started hearing about what was called the “naming problem”. I wrote
a post about the problem in November 2020 and called it the “one
problem that towers over the others”, as far as SBOMs are concerned.
In other words, at that time, I
considered the naming problem to be the most serious roadblock in the path to
widespread (or even narrowspread, to be honest) distribution and use of SBOMs
by the general community. However, the whole group seemed to have decided that
this was too hard a problem to address at the time, given that they were still
trying to a) figure out how SBOMs were going to work in general, and b)
interest the developer community in producing them. This was still the accepted
view when the NTIA Initiative ended at the end of 2021, even though a lot of
progress had been made in both of those areas.
I have good news and bad news. The
good news is that I no longer consider the naming problem to be the biggest
impediment to the spread of SBOMs. But the bad news is that I don’t say this
because I think the problem has diminished. Rather, I’ve come to realize there are
at least five or six just-as-serious impediments to SBOMs (and a host of
not-so-serious impediments, which are impediments nevertheless); in other
words, the naming problem now has company, instead of “towering” over the
others. There’s progress for you!
Early this year, it seemed to me
and a few others that it was time for the private sector to take the lead in
addressing head-on the most serious problems for SBOMs. The idea was that we
would meet weekly and discuss one problem until we at least understood how it
might be solved. Then, if there was something we could do to put in motion the
solution to the problem, we would do that. After that…on to the next problem.
Why were we doing this? I’ll admit
that we all had selfish reasons. Our livelihoods are all tied in one way or
another to the success of SBOMs. We will all benefit if SBOMs start to be
widely used outside of the development community (where they’re already widely
used for product risk management purposes).
The question came up, should the
participants worry about helping potential competitors? After all, when one of
these problems is solved, it will benefit their competitors, as well as themselves.
My answer to this question (which I didn’t hear often, to be sure) is that
ultimately the SBOM “market” will include every organization on the planet. It’s
difficult to imagine any organization of any size in the world today that doesn’t
use software (even if it’s just the software in one person’s smartphone); it
will soon be literally impossible to imagine that.
Given this, it follows that today’s
market for SBOM services and tools is an almost infinitesimal fraction of what
it can be. Doesn’t it make sense to focus on expanding the market so it’s at
least a significant fraction of what it ultimately will be, rather than worrying
about how you and your competitors are going to divide up the very small market
that’s there today?
Of course, it made sense to my
friends to take the former course. The only question was, which of the five or
six serious problems would we start with? I wasn’t sure which one, but I was
sure of one thing: it wouldn’t be the naming problem. I honestly thought that this
problem would take a year to even understand, another year to develop at least
a partial solution for it, and finally about eight long, grinding years – full
of political battles – to see the near-solution implemented.
My friends and I created an
informal group called the SBOM Forum and started meeting for an hour every
Friday. In one of our earliest meetings, Tom Pace of NetRise described an eye-opening experience he had with the NVD:
a device that appeared in the NVD to have never had a vulnerability in fact had at
least 1,237. Tom found these with a simple scan of just two of the firmware products
installed in the device. Moreover, Tom later came to realize that the same product probably has 40,000 unpatched
vulnerabilities, even though an NVD search will find nary a one of them.
The problem Tom came across was
just one of the many manifestations of the naming problem. The naming problem
in general refers to the fact that software products have many names in the
many locations in which they’re found. However, the branch of the naming
problem that causes the most consternation for the software community is centered
on the National Vulnerability Database (NVD), by far the most heavily used
vulnerability database in the world. The NVD uses a very problematic naming
system called CPE (Common Platform Enumeration), which is the source of most
aspects of the problem.
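One way to picture the "many names" difficulty is the sketch below. It assumes the CPE 2.3 formatted-string layout (part, vendor, product, version, then seven further attributes left as wildcards here). The vendor and product strings are made up for illustration, and `cpe23` is a hypothetical helper, not part of any NVD tooling.

```python
def cpe23(part, vendor, product, version):
    """Format a CPE 2.3 string, leaving the seven trailing attributes as wildcards."""
    return ":".join(["cpe", "2.3", part, vendor, product, version] + ["*"] * 7)


# Hypothetical spellings a user might reasonably try for one and the same product.
candidates = [
    cpe23("a", "example", "widget_server", "2.4.1"),
    cpe23("a", "example_corp", "widget_server", "2.4.1"),
    cpe23("a", "example", "widgetserver", "2.4.1"),
]

for name in candidates:
    print(name)
```

All three strings are plausible, yet no two are equal, so an exact-match lookup keyed on one spelling would miss entries recorded under another.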
To my horror, after Tom’s
discussions with our group, we stumbled into deciding to take on the naming
problem first. But I was in for a big surprise: Instead of taking two years to
identify the problem and document a solution to at least 70-80% of the problem,
we took a little more than four months. We published the document this week.
The six most serious aspects of
the problem with CPE are described on pages 5-7 of our document. I used to
think that all those aspects were going to require their own specialized measures,
which was why I was sure that any “solution” we proposed would be a godawful
mess. However, I didn’t realize that our group was lucky enough to have the
perfect Hero to slay the foul Naming Beast: Steve Springett, co-leader of the
OWASP CycloneDX project and without much doubt the most creative person in the
SBOM community.
As we started talking about the
problem, Steve quickly realized that a big component (no pun intended) of any
solution would have to be purl, a unique type of identifier that is already in
wide use – although it isn’t used so far in the NVD (Steve is a maintainer of
the purl project, which has helped our group a lot already). Purl has so far
been used mostly with open source software, and used very extensively: just one
tool, Dependency-Track, is currently used about 27
billion times every month to look up vulnerabilities for open source
products in Sonatype’s OSS Index vulnerability
database, which is based on purl.
Steve realized that purl’s unique
extensibility would allow us to “incorporate” two other identifiers into purl:
SWID for proprietary software, and Software Heritage ID (SWHID) for legacy open source
software (since our document explains the various naming schemes that our
solution will utilize, I won’t explain them here).
The best thing about purl: for
reasons described in the document (be sure to read the discussion of
“intrinsic” vs. “extrinsic” identifiers), integrating purl into the NVD will
not require any database linking; every open source product that’s in a package
manager already has a purl. When searching for vulnerabilities in a particular version of
the product, the user can easily create the correct purl – 100% of the time.
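To make the "user can create the purl" point concrete, here is a minimal sketch of assembling a purl from facts the user already has: the package manager type, the package name and the version. It assumes the basic `pkg:type/namespace/name@version` layout from the purl specification. The helper `make_purl` is hypothetical, and it deliberately skips qualifiers, subpaths and the per-type normalization rules (such as lowercasing certain names).

```python
from urllib.parse import quote


def make_purl(pkg_type, name, version, namespace=None):
    """Build a basic purl of the form pkg:type/namespace/name@version."""
    parts = ["pkg:" + pkg_type.lower() + "/"]
    if namespace:
        # Percent-encode each namespace segment; "/" remains the separator.
        segments = [quote(seg, safe="") for seg in namespace.split("/")]
        parts.append("/".join(segments) + "/")
    parts.append(quote(name, safe=""))
    parts.append("@" + quote(version, safe=""))
    return "".join(parts)


print(make_purl("pypi", "django", "1.11.1"))
print(make_purl("maven", "batik-anim", "1.9.1", namespace="org.apache.xmlgraphics"))
print(make_purl("npm", "animation", "12.3.1", namespace="@angular"))
```

Nothing here has to be looked up in a central registry: every input comes from the package manager the component was downloaded from.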
While purl, SWID and SWHID cover a
good portion of the software universe, they don’t cover hardware. However, the
NVD includes hardware, and devices also have CPE names (along with the problems
belonging thereto). I would have been fine with declaring victory with software
and moving on, but Steve insisted we could do hardware as well. He and Tony
Turner of Fortress Information Security recognized that the two most widely
used hardware identifiers, GTIN and GMN (both part of the “GS1” family of
hardware standards), could be integrated with the NVD as well.
However, in order to allow those
two identifiers to be used with the NVD, various databases need to be linked
with the NVD. This is the only part of our proposal that requires substantial
work. I think it would amount to just a few person-months, but since there will
be various groups involved, it might take longer than that.
But the work will be worth it. One
of the GTIN identifiers is the UPC code. Once our proposal is fully implemented,
you should be able to find out about vulnerabilities in a device by entering
its UPC code – either by scanning the code itself or entering the numbers by
hand – in the NVD. Whoever thought using the NVD could be this much fun?
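As a rough sketch of what the first step of such a lookup might involve, the code below validates a 12-digit UPC-A code using the standard GS1 check digit and pads it to the 14-digit GTIN form. The function names are hypothetical, and the NVD lookup itself is not shown, since it depends on the proposal being implemented.

```python
def gs1_check_digit(body):
    """Compute the GS1 check digit for the digits that precede it."""
    total = 0
    # Working from the rightmost digit, the weights alternate 3, 1, 3, 1, ...
    for i, ch in enumerate(reversed(body)):
        total += int(ch) * (3 if i % 2 == 0 else 1)
    return (10 - total % 10) % 10


def upc_to_gtin14(code):
    """Validate a UPC-A code and left-pad it with zeros to 14 digits."""
    digits = code.replace(" ", "").replace("-", "")
    if len(digits) != 12 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected 12 digits for UPC-A")
    if gs1_check_digit(digits[:-1]) != int(digits[-1]):
        raise ValueError("check digit mismatch")
    return digits.zfill(14)


print(upc_to_gtin14("0 36000 29145 2"))
```

The check digit catches most typing mistakes when the numbers are entered by hand, which matters if the code is to serve as a search key.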
I’m hoping that, two years from
today, our “proposal” will be fully implemented; Steve has already started discussing
this with the group that will be most involved in the implementation: MITRE
Corporation. CISA at some point (perhaps not until early next year) will hold a
meeting to discuss the question of software naming in the NVD.
What can you – and your
organization – do to advance this proposal? I’d say the first thing is to read
the document. I’ll admit it’s dense. I’m going to put up two or three blog
posts trying to shed light on some of the harder parts. And OWASP will have a
webinar soon to discuss the proposal.
Second, you can read the blog
post by Steve Springett. At the bottom, he has a “How to Help” section. To
what he has there, I would add that we would like to put together a set of short
testimonials (1-3 sentences) on why your organization supports this proposal,
including how this will enable the organization to do whatever you do better
and more efficiently.
For example, the document includes
a statement from a very large software maker (identified in the document) to
the effect that they regularly create SBOMs for all of their products, but when
they look at just the output of the tooling they use to create the SBOM, only
one in five components can be identified in the NVD. Of course, this means they
can’t “operationalize” their SBOM production process. While they can often find
a lot of those components, they – like many other companies and consultants –
need to employ AI, fuzzy logic, references to other databases, prayer…whatever
will help. Being able to fully automate SBOM production will greatly increase
the efficiency of their software development process.
But fully automating the SBOM
production (or interpretation) process is out of the question for any organization,
until the naming problem is solved.[i] Hopefully, that day is now
closer than it’s ever been.
Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.
[i] I
won’t pretend that the naming problem will ever be “solved”, since there are so
many “corner cases” that will never all be addressed in the structure outlined
in our proposal. However, it can be close to solved for two important areas:
open source components (which of course compose 90% of software components) and
hardware devices. The one area where our proposal is weakest is proprietary
software, since that will require commercial developers to create SWID tags for
all of their products, both new and existing (although they will only have to
enter 4 or 5 fields in the tag).
Sure, this will be a little work, and some developers will resist doing this. But the choice isn't necessarily theirs. If they don't put a SWID tag in their product, their users will be at the mercy of CPE names when they try to look up vulnerabilities for the product in the NVD. As with most consumer products, the consumers (whether organizations or individuals) "vote" for their favorite products in the marketplace, using dollars, euros, etc. A supplier that doesn't provide SWID tags may find itself "losing" a lot of elections!