Tuesday, November 5, 2024

Reply to Boris Polishuk, Cybellum – November 5, 2024

 

From Tom Alrich: Boris is Chief Architect of Cybellum, the Israeli company that is one of the leading service providers in the SBOM world; he is a member of the OWASP SBOM Forum. He wrote me in response to the white paper that Tony Turner and I, who are co-leaders (with Jeff Williams) of the SBOM Forum, recently developed, titled “purl needs to identify proprietary software”. The paper describes the project we are proposing to expand the scope of the “purl” software identifier to address proprietary software as well as open source software. We hope to be able to start that project soon.

To summarize why this project is important, the software security “industry” is plagued by the problem that software products all have different names in different contexts. In order to learn about vulnerabilities (usually designated with a “CVE number” like CVE-2021-44228) that affect a software product they rely on, an end user organization needs to locate the product in a vulnerability database. To do that, they need to learn (or be able to construct from information they already have) the identifier for the product in the database.

The most widely used vulnerability database in the world, the US National Vulnerability Database (NVD), utilizes exclusively the “CPE” identifier, which stands for “Common Platform Enumeration”. CPE has been in use, in multiple versions, for around twenty years. Unfortunately it is an unreliable identifier, as the SBOM Forum described on pages 4-6 of this 2022 white paper. Even more unfortunately, serious problems in the NVD since February 2024 have resulted in over two thirds of the vulnerabilities reported this year being totally invisible to a search using a CPE name. Clearly, an alternative software identifier is needed.

Below, I have posted each paragraph from Boris’ letter in bold roman type, with my response in italics.

Hi Tom,
Thanks for sharing your thoughts on using SWID to generate PURLs for proprietary software. We've considered this approach, but we have some reservations about its feasibility at this time. Here's our reasoning:

·        Decentralized Ecosystem: Proprietary software exists in a decentralized ecosystem with no central authority to enforce naming standards or manage a unified repository. This increases the risk of duplicate or conflicting PURLs, even when generated through SWID.

I agree that proprietary software exists in a decentralized ecosystem. However, I think you’ll agree that the same can be said for open source software. Despite that fact, not only is purl by far the leading software identifier used in open source vulnerability databases today, it is almost the only one. I am sure there have been duplicate purls for OSS, mainly due to somebody making a mistake. But I don’t know of any case where a purl was correctly generated, yet still exactly duplicated a purl for a different product. I also don’t know of any case where two different purls were created for the same product (this happens often with CPEs). Do you know of any such cases?

·        Limited SWID Adoption: While SWID offers a potential solution, its adoption has been limited. In our experience, many organizations are unfamiliar with SWID or its implementation. Relying on SWID for proprietary software identification might face similar obstacles as CPE, hindering widespread adoption and effectiveness.

I agree that SWID has had limited adoption. You probably know that NIST developed the SWID specification (and got it approved as ISO/IEC 19770-2) to be a replacement for the CPE identifier in the NVD. For a few years, it enjoyed modest success; for example, Microsoft included a SWID tag in the binaries for all their new products or versions for at least a couple of years. However, SWID never reached the point where NIST felt comfortable dropping CPE altogether and only using SWID in the NVD. They have more recently acknowledged they will never make that change.

However, you need to understand the reason why we’re proposing that SWID tags be used to create purls for proprietary software that isn’t distributed through app stores: The supplier needs to decide the name and version string for every product they release. This information needs to be made publicly available in a single document.

We could have defined our own format for the document, but since SWID includes the fields needed for the purl (as well as many more that aren’t needed and don’t need to be filled in) and since it is already an ISO standard, we decided to use SWID. In fact, Steve Springett has created a SWID tag generator, which a software supplier (or a third party, if the supplier has not done this already) can use to create a SWID tag for one of their current or legacy products (note that the majority of the fields in Steve’s tool are optional in the purl). The suppliers won’t need to know anything about SWID to create a SWID tag.
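To make the idea concrete, here is a minimal sketch of turning a few supplier-declared SWID fields into a purl. The dictionary keys, the supplier “Acme”, the product and the helper name purl_from_swid_fields are all hypothetical. The layout loosely follows the swid package type in the purl specification, but that mapping is an assumption to be checked against the spec and against whatever rules the project finally agrees on; it is not the project’s method.

```python
from urllib.parse import quote


def purl_from_swid_fields(tag: dict) -> str:
    """Build a purl string from a few supplier-declared SWID fields.

    `tag` is a hypothetical, simplified view of a SWID tag. Only the
    four fields checked below are used; any others are ignored.
    """
    required = ("tag_creator", "name", "version", "tag_id")
    missing = [key for key in required if not tag.get(key)]
    if missing:
        raise ValueError("SWID fields missing: " + ", ".join(missing))

    # Percent-encode every segment so spaces or slashes cannot break the purl.
    namespace = quote(tag["tag_creator"], safe="")
    name = quote(tag["name"], safe="")
    version = quote(tag["version"], safe="")
    tag_id = quote(tag["tag_id"], safe="")

    return f"pkg:swid/{namespace}/{name}@{version}?tag_id={tag_id}"


# Made-up tag contents, for illustration only.
example_tag = {
    "tag_creator": "Acme",
    "name": "Enterprise Server",
    "version": "1.0.0",
    "tag_id": "75b8c285-fa7b-485b-b199-4745e3004d0d",
    "description": "optional field, not needed for the purl",
}

print(purl_from_swid_fields(example_tag))
# pkg:swid/Acme/Enterprise%20Server@1.0.0?tag_id=75b8c285-fa7b-485b-b199-4745e3004d0d
```

The point of the sketch is that the supplier only has to publish the name, version and tag ID once; anyone holding the same tag can then derive the same purl without consulting a central list.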

·        PURL and CVE.org: The lack of CVE.org's full embrace of PURL as a primary identifier raises concerns about its long-term viability as the sole solution for vulnerability management.

The CVE 5.1 specification (which allows purls in a CVE report but doesn’t say anything about how they will be created or entered in the report) was only adopted by CVE.org this past spring; the SBOM Forum submitted the pull request to add purl to the CVE spec in 2022. Very few CNAs are using v5.1 yet, mostly because including a purl in a CVE report won’t do any good now: the NVD is at least several years away from being able to handle purl. However, Sonatype’s OSS Index vulnerability database supports both CVE and purl today, so it might be able to utilize purls included in CVE v5.1 records with little or no change required. We would like to test this in our proof of concept.

Given these challenges, we believe that a more comprehensive approach is needed to address the complexities of identifying proprietary software. This might involve:

  • Collaboration with CVE.org: Working closely with CVE.org to establish clear guidelines and standards for both PURL and CPE, ensuring they complement each other and address the limitations of each system.

I agree this is important. In fact, it is already one of the “deliverables” of our project (which are described on pages 6-9 of the preliminary project plan that I made available last week). We plan to work with CVE.org and the CNAs (which report to CVE.org) to develop whatever rules are required for them to correctly create a purl for a product described in a CVE report, and to include the purl in the report.

Regarding CPE, the reason we’re doing this project is that, while purl is clearly a much better identifier than CPE for open source software and components – and should be adopted by the NVD for OSS as soon as possible – it doesn’t identify proprietary software products, unless they are delivered through a package manager like PyPI or Maven Central (which rarely happens). Currently, CPE is the only identifier for proprietary software, but it’s subject to the problems listed on pages 4-6 of the OWASP SBOM Forum’s 2022 white paper linked earlier.

When our project is finished and after our recommendations are implemented, we believe that purl will prove to be superior to CPE as an identifier for proprietary software, as it is now for open source software (see our 2022 white paper for why purl is in general superior to CPE). However, the 240,000 to 250,000 CVE records that now include CPE can’t simply be thrown away. That means the NVD, and the other vulnerability databases based on the NVD, will need to support both identifiers, perhaps for a very long time.

However, given that the NVD currently has a backlog of more than 19,000 CVE records that don’t include a CPE or purl – and that backlog is growing all the time – integrating purl into the NVD isn’t likely to happen anytime soon. Fortunately, there are multiple open source vulnerability databases that support purl, although I don’t think there are any that also support CPE.

·        Hybrid Model: Exploring a hybrid approach that leverages PURL's strengths for open-source components while incorporating alternative mechanisms, potentially drawing inspiration from SWID or other identification schemes, for proprietary software.

Essentially, this is what we’re proposing. Purl is currently used in almost all vulnerability databases for open source software (except the NVD and other databases based on it), but it doesn’t address proprietary software currently. SWID by itself was intended originally to be an identifier for both OSS and proprietary software products, but I don’t know any vulnerability database that supports SWID as a software identifier now, nor do I know of any other identifiers for proprietary software, other than the very flawed CPE.

We’re not saying that anybody who is happy with using CPE to identify software products needs to give it up. But we are saying that, once our recommendations are implemented, we believe purl will prove to be a superior identifier for both open source and proprietary software.

·        Industry-Wide Standards: Promoting the development of industry-wide standards for software identification to ensure interoperability and consistency across different ecosystems.

That’s what we want to do! Once purl and CPE are both able to handle open source and proprietary software, the software community will “decide” (through their choices in the marketplace) whether they want to use one or the other identifier, or both. As mentioned earlier, both will continue to be in use for many years.

We appreciate your insights and the work you're doing in the OWASP SBOM Forum. While we don't see SWID -> PURL as a viable solution for proprietary software at this point, we're open to further discussions and exploring alternative approaches.

SWID -> PURL is certainly not a viable solution for proprietary software at this point, but we want to make it one as soon as possible! However, I want to point out that the idea of using SWID tags to generate purls for proprietary software isn’t set in stone; if anyone has a better idea for expanding purl to cover proprietary software not distributed through app stores (purls for software in app stores can be easily created without using SWID tags, as described in our new white paper), please bring it up to us. We hope to start the project soon, at which point the participants will be able to determine its direction in both synchronous and asynchronous online discussions.

We welcome anyone who wants to participate and/or to contribute to the effort through directed donations to the SBOM Forum through OWASP (a 501(c)(3) nonprofit organization).

Sunday, November 3, 2024

When will the “NERC CIP in the cloud” problem be solved for good? You won’t like the answer…

NERC’s 6-hour virtual Cloud Technical Conference on November 1 was quite successful. The conference included three panels of industry types (including me) discussing questions mostly posed to them in advance, followed by a discussion by members of the team that will draft changes to the CIP standards to address what I call the “CIP in the cloud” problem.

The discussions were very productive and produced some great insights. I took a lot of notes and will produce multiple posts on those insights in the coming month or so. However, I’m going to start off with a question that wasn’t discussed in the conference, but was very much hanging over it: When will new or revised CIP standards be in place, so that full, but secure, use of the cloud by NERC entities with BES assets will finally be possible?

This isn’t an academic question at all. As multiple panelists pointed out, previously on-premises software products and security services are moving to the cloud all the time. In some cases, they retain an on-premises option, with the caveat that the most innovative changes will only go into the cloud. In other cases, the vendor is making a clean break with on-premises systems, leaving NERC entities with high- or medium-impact BES environments with no other choice than to find a totally on-premises replacement. And as Peter Brown of Invenergy pointed out (in the conference as well as an earlier webinar sponsored by the informal NERC Cloud Technology Advisory Group and SANS, which I wrote about in this post), those replacements are inevitably more expensive and offer less functionality.

In January, I wrote a post that examined this question. I concluded by saying:

So, if we get lucky and there are no major glitches along this path, you can expect to be “allowed” to deploy medium and high impact BCS, EACMS and PACS in the cloud by the end of 2029. Mark your calendar!

Of course, the end of 2029 sounds like a long time to have to wait, especially with security services and software already abandoning their on-premises options. Do I still think the industry will have to wait five years for the cloud to be completely “legal”? I have good news, bad news, and then some more good news for you:

·        The first good news is I no longer think the end of 2029 is the likely date by which cloud-based systems and services for systems to which the CIP standards apply will be fully “legalized”.

·        The bad news is I think it will probably be later than 2029.

·        However, the second good news is that, given how this problem is affecting more and more NERC entities all the time, there will very likely be at least a partial solution to this problem before 2029 – although don’t ask me what form that solution will take. This is very much uncharted ground.

Here's a short summary of my timeline and the reason for my changes:

1.      I had thought the new Standards Drafting Team (SDT) would start their drafting work when they convened in July. However, it turns out they are now working on revising their Standards Authorization Request (SAR). They will finish that by the end of this year and will submit it to the NERC Standards Committee for approval. That approval is likely to be quickly granted, so the team will probably start drafting in January 2025, not last July as I had anticipated.

2.      There are some huge issues that will need to be discussed when the SDT starts drafting. I attended a lot of the meetings of the CSO706 SDT that drafted CIP version 5. V5 completely rewrote the CIP standards and definitions that had been put in place with CIP version 1. Even though there were a lot of fundamental questions discussed in those meetings, I also know the SDT had a good idea of what they needed to do when they started drafting v5 in early 2011. Even then, developing the first draft took a year and a half (see the January post linked above). The “cloud” SDT might take that long or even longer to develop their first draft.

3.      Once the SDT has their first draft, they will submit it to the NERC Ballot Body for approval. It’s 100% certain it won’t be fully approved on the first ballot. With each ballot, NERC entities can submit comments – which, of course, mainly discuss why the commenter didn’t vote for the standard in question (each new or revised standard will be voted on separately). The drafting team needs to respond to every comment, although in practice they group similar comments and respond to them at one time. For just one of the CIP v5 ballots, 2,000 pages of comments were submitted.

4.      It’s close to certain that the new or revised standards will go through at least four ballots before they’re approved, with three comment periods in between them. The balloting process alone took the CIP v5 SDT a year, and I assume the new SDT’s experience will be roughly the same. Adding that to the estimate of 18 months to draft the first version of the new standards, we’re at 2 ½ years, starting in January.

5.      When the new or revised standards have been approved by the ballot body, they will go to the NERC Board of Trustees for approval at their next quarterly meeting; it’s close to certain the BoT will approve them in one meeting. So, BoT approval requires 3 months, bringing us to two years and nine months for the process so far.

6.      At that point, the standards go to FERC for approval. Even though individual FERC staff members have been quite supportive of the need for changes to accommodate cloud use (and two staff members spoke in the technical conference), the staff might very well not be in line with some of the actual changes that are proposed. Of course, the five FERC Commissioners are the ones who must approve those changes; they always take a lot of time to come to general agreement. I’ll stick with my earlier estimate of one year for FERC to approve the new or revised standards, but it could well be longer than that. We’re now at three years and nine months from next January, which is the third quarter of 2029.

7.      However, FERC approval doesn’t mean that NERC entities can rush off and start using the cloud. There will without doubt be an implementation period of more than one year; I’ll say it will be 18 months[i], but even that may be a low estimate. This puts us at the first or second quarter of 2031, before the new or revised CIP standards are enforced.[ii]
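For anyone who wants to plug in their own guesses, the running total in the list above can be tallied with a few lines of Python. This is only a sketch: the step labels are my shorthand, the durations are the estimates stated in the list, and the tally stops at elapsed months. It does not map them to calendar quarters or allow for any Rules of Procedure changes.

```python
# (step, estimated duration in months), taken from the numbered list above.
steps = [
    ("Draft first version", 18),
    ("Ballots and comment periods", 12),
    ("NERC Board of Trustees approval", 3),
    ("FERC approval", 12),
    ("Implementation period", 18),
]


def cumulative(step_list):
    """Return (step, months elapsed once that step is complete) pairs."""
    elapsed = 0
    totals = []
    for name, months in step_list:
        elapsed += months
        totals.append((name, elapsed))
    return totals


def as_years_months(months):
    """Format a month count as 'Ny Mm'."""
    years, remainder = divmod(months, 12)
    return f"{years}y {remainder}m"


for name, elapsed in cumulative(steps):
    print(f"{name:<34}{as_years_months(elapsed)}")
# Draft first version               1y 6m
# Ballots and comment periods       2y 6m
# NERC Board of Trustees approval   2y 9m
# FERC approval                     3y 9m
# Implementation period             5y 3m
```

Changing any single duration (for example, two years of drafting instead of 18 months) shifts every later milestone by the same amount, which is why an early slip matters so much.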

Thus, instead of saying that the cloud will be completely “legal” for NERC entities by the end of 2029, I’m now saying this will happen by the second quarter of 2031, which is 6 1/2 years from now. But that isn’t all: In my January 2024 post, I pointed out that I thought it was possible that the changes required for the cloud will also require changes to the NERC Rules of Procedure; I now believe it’s likely this step will be needed.

The SDT has no power to make RoP changes, and my guess is there might need to be a separate drafting team for those changes. Of course, this alone could add a couple more years to the whole process. Since I don’t know what’s involved, I won’t change my estimate of Q2 2031 as the date when systems subject to NERC CIP compliance can be freely used in the cloud, subject to the controls in the CIP standards. But there’s now a big asterisk beside that date.

If you’re like some NERC entities, as well as some members of the NERC ERO, you’ll probably look at my Q2 2031 date and say something like, “This is unacceptable! The NERC community can’t wait this long.” You would be right; this is unacceptable. This is why I’m sure that some measures will be taken long before that date to allow at least some cloud use cases for BES Cyber Systems, EACMS and PACS. However, I have no clear idea of what those measures will be, beyond my own wishful thinking. 

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] The CIP v5 standards were approved by FERC in November 2013, but were enforced on July 1, 2016. That was 2 ½ years after approval.

[ii] Since many NERC entities are eager to start using the cloud for OT systems, there will probably be accommodations for entities that wish to follow the new standards before the implementation period is finished. However, only a small number of NERC entities will be allowed to take advantage of those accommodations, and they will be closely monitored. This was done when CIP v5 had been approved by FERC in 2013. At that time, NERC established the Version 5 Technical Advisory Group (V5TAG), a small group of NERC entities that implemented the v5 standards before the enforcement date. They were closely monitored by NERC and documented their experiences.

Wednesday, October 30, 2024

Sorry for the late notice…

I just learned that a NERC webinar that was originally scheduled for tomorrow, but which I thought had been postponed, will in fact take place. It is the fourth in a series sponsored by the NERC Cloud Technical Advisory Group (CTAG) and SANS. It will be at 1PM Eastern Time Thursday October 31 (yes, Halloween).

The webinar should be good. It will feature Maggy Powell, formerly with Exelon and now with AWS, and Mikhail Falkovich, formerly with ConEd and now with Microsoft. You can register for the webcast, as well as access the recordings of the previous three CTAG/SANS webinars, here. If you forget to register in advance, I believe you will still be let in, but it’s better to register!

Note this is just the first NERC webcast regarding the cloud this week. The all-day webcast I wrote about last week will still take place on Friday – in fact, both Thursday speakers will participate on Friday as well (as will I).



 

Tuesday, October 29, 2024

The NVD’s problems deepen

This morning, one of the members of the OWASP SBOM Forum sent around an update to the group on how the National Vulnerability Database (NVD) is doing in their quest to reduce their current huge backlog of “unenriched” vulnerability records – namely, new CVE Records that don’t have any CPE identifier attached to them. Not having an attached CPE means that searching the NVD for a particular product will never identify any of those CVEs, even though the product might be vulnerable to one or more of them.

The only way to know for sure whether any of those CVEs affect that product is to manually search through the text of every unenriched CVE report. How many are there? I pointed out at the beginning of October that currently there’s a backlog of over 18,000 unenriched CVE records, which is over 2/3 of the new CVEs identified this year. Moreover, that backlog continues to grow.
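A toy example shows why an unenriched record is invisible to an identifier-based search. Everything here is made up for illustration: the CVE IDs, the vendor and product in the CPE string, and the function names. It is an in-memory stand-in, not the NVD’s API or data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CveRecord:
    cve_id: str
    description: str
    cpes: List[str] = field(default_factory=list)  # empty list = "unenriched"


# Hypothetical records; these are not real CVEs.
RECORDS = [
    CveRecord(
        "CVE-0000-0001",
        "Buffer overflow in ExampleProduct 2.1",
        ["cpe:2.3:a:example_vendor:example_product:2.1:*:*:*:*:*:*:*"],
    ),
    CveRecord("CVE-0000-0002", "Authentication bypass in ExampleProduct 2.1"),
]


def search_by_cpe(cpe: str) -> List[str]:
    """Return the IDs of records that have exactly this CPE attached."""
    return [r.cve_id for r in RECORDS if cpe in r.cpes]


def search_by_text(term: str) -> List[str]:
    """Manual fallback: scan the free-text description of every record."""
    return [r.cve_id for r in RECORDS if term.lower() in r.description.lower()]


def unenriched_share() -> float:
    """Fraction of records with no CPE attached."""
    return sum(1 for r in RECORDS if not r.cpes) / len(RECORDS)


product = "cpe:2.3:a:example_vendor:example_product:2.1:*:*:*:*:*:*:*"
print(search_by_cpe(product))            # ['CVE-0000-0001']
print(search_by_text("exampleproduct"))  # ['CVE-0000-0001', 'CVE-0000-0002']
print(unenriched_share())                # 0.5
```

The second record affects the same product, but only the text scan finds it. That is the manual search described above, repeated across thousands of records.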

Did the SBOM Forum member have progress to report? That depends on what you call “progress”. He had been hoping that on October 1, the first day of the federal government’s fiscal year, the NVD would begin a concerted effort actually to reduce their backlog. Alas, that was not to be. He reported:

Starting October 1st…CPE assignments have fallen off significantly as a percentage of new CVE assignments. Essentially, the backlog has increased by around 1,000 since the week of September 23rd…I was hoping NVD’s CPE assignment was going to essentially catch up. I was optimistic in September, but that is no longer the case. 

Thus, the NVD now faces a backlog of over 19,000 unenriched CVE records (after promising last May to reduce the backlog to zero by September 30). Will they ever turn this situation around? I have no idea, but I do know that it’s foolish to sustain ourselves in the hope that they’ll be able to do this. They have disappointed us at every step of the way, since their problems (still never officially explained) started on February 12.

Given that no vulnerability search on the NVD will yield accurate results unless you only care about vulnerabilities that were identified before 2024, what alternative vulnerability databases are there? There are several databases that are based on the NVD, which have conducted their own enrichment of some of the unenriched CVEs – i.e., they have created their own CPEs and added them to the CVE records in their database. However, it’s important to keep in mind that the CPE identifier (which is only used by the NVD and its derivatives) has a lot of problems; there’s no such thing as a “definitive” CPE, so a CPE created for one database won’t necessarily match one created for another (of course, this in itself shouldn’t be a big problem if you confine all of your searches to one of those databases).

If your primary concern is vulnerabilities in open source software, you’re in luck, since there are multiple good vulnerability databases to choose from for open source (including OSV, OSS Index, GitHub Security Advisories, and others).

What makes the open source vulnerability databases so good? They are literally all based on the purl identifier. Purl is highly reliable and, most importantly, doesn’t require a lookup in a centralized list of identifiers, as CPE does. In other words, every open source product that is available in a package manager (the majority of those products) has an intrinsic purl that can be created by a user on their own, as long as they know the name of the package manager they downloaded the product from and its name and version number in the package manager. Since vulnerabilities are reported using the same purl, the user should always (barring error, of course) be able to learn of all vulnerabilities that apply to the product in a database search.

Just as importantly, purls don’t need to be created by a central authority, so the NVD’s current problems with not being able to create as many CPEs as are needed would never have happened if the NVD were based on purl.
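Here is a minimal sketch of what “intrinsic” means in practice. The helper names make_purl and lookup and the tiny in-memory index are my own illustrations, not any database’s API. The sketch follows the general pkg:type/namespace/name@version layout of the purl specification and leaves out the per-type normalization rules and the optional qualifiers and subpath that the spec also defines.

```python
from urllib.parse import quote


def make_purl(pkg_type: str, name: str, version: str, namespace: str = "") -> str:
    """Assemble pkg:type/namespace/name@version from facts the user already has."""
    segments = [quote(part, safe="") for part in namespace.split("/") if part]
    segments.append(quote(name, safe=""))
    path = "/".join(segments)
    encoded_version = quote(version, safe="")
    return f"pkg:{pkg_type.lower()}/{path}@{encoded_version}"


# Toy advisory index keyed by purl; a stand-in for a real vulnerability database.
ADVISORIES = {
    "pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1": ["CVE-2021-44228"],
}


def lookup(purl: str) -> list:
    """Return the CVE IDs recorded for this exact purl, or an empty list."""
    return ADVISORIES.get(purl, [])


mine = make_purl("maven", "log4j-core", "2.14.1",
                 namespace="org.apache.logging.log4j")
print(mine)          # pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1
print(lookup(mine))  # ['CVE-2021-44228']
print(make_purl("pypi", "django", "4.2.1"))  # pkg:pypi/django@4.2.1
```

Nobody had to assign that identifier. The user built it from the package manager, name and version, and it matches the key under which the vulnerability was reported.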

However, there is currently a big downside to purl: It can only be used to identify open source software, not proprietary software; CPE can be used to identify both. Moreover, currently there is no alternative identifier for proprietary software, other than CPE. Thus, given that the NVD has fallen on hard times, it can truthfully be said that today there is no trustworthy way to conduct an automated search for vulnerabilities in proprietary software. Of course, for most organizations, automated searches are essential to an effective vulnerability management program.

This raises the question, “Can purl be somehow extended to cover proprietary software, and will it be almost as dependable in that domain as it is for open source software?” The answers to those two questions are yes and yes. Then the question arises, “What needs to be done to make this happen?”

I’m glad you asked. The OWASP SBOM Forum has identified two likely paths by which purl can be expanded to cover proprietary software. We want to flesh out the details of both of those paths and test them in a proof of concept. We have scoped out a project to do that and are looking both for volunteers to contribute to the project and modest financial support to make it happen. Support can be in the form of donations to OWASP that are directed to the SBOM Forum; OWASP is a 501(c)(3) nonprofit organization.

You can read about the project here. Please email me if you would like to discuss this more.


 

Monday, October 28, 2024

What’s the connection between AI growth and coal-fired power plants?


GridSecCon is the premier power grid security conference and exhibition, which is sponsored annually by NERC and the E-ISAC. I have only missed two of the onsite events since the first one in 2011, and I was quite pleased to attend the 2024 event in Minneapolis last week. As usual, it was a very informative conference and a great opportunity to interact with a lot of people who are involved with ensuring the cyber and physical security of the North American power grid. I want to thank the corporate sponsors of the event, as well as NERC and the E-ISAC.

Without a doubt, the most memorable presentation during the week was the one by Sunny Wescott of CISA, where her title is ISD Chief Meteorologist. I doubt too many people who saw her presentation – which was to the entire conference on the morning of the first day – will disagree that it was one of the most powerful talks they have ever witnessed. She has given this presentation to multiple audiences (and will continue to, I’m sure). You can find several videos of her presentation from other venues on YouTube by searching on her name, but I also recommend you see her live if you ever get that opportunity.

Andy Bochman of Idaho National Laboratory wrote an excellent post on her presentation on LinkedIn, but my summary of what she said is, “We’re facing tremendous challenges due to climate change. They are coming at a faster pace, from a million different directions, than we ever imagined was possible. At this point, we can’t eliminate those challenges, but there’s a lot that we can do – especially on the local level – to prevent them from leading to unmitigated disaster.”

However, there was another very powerful talk during the conference; this one was by Andy. Since it was in a breakout session, it was only witnessed by a fraction of the number of people who saw Sunny’s presentation, but I know a lot of people considered his talk to be at least the second most powerful of the conference. I certainly did.

Andy summarized his talk (in the same LinkedIn post) as, “about the risk of suppliers putting generative AIs, prone to hallucinations and emergent behaviors in control centers, and I also extended the topic to address ultra-realistic AI-boosted disinformation including deepfakes that could spoof operators into taking harmful actions.” He didn’t say that AI should be banned from grid control systems altogether, but he did say we need to be very careful about deploying it on those systems.

A week ago, I would have said that a presentation on the impact of climate change and a talk on dangers posed by indiscriminate deployment of AI in grid control centers would both be interesting, but they wouldn’t have anything in common. However, I now realize the two topics are very closely linked.

The link between the topics became clear when someone mentioned to me something I hadn’t heard before: that the Coal Creek Station, a 1,000MW coal-burning generating plant (which I visited 7 or 8 years ago) in the middle of wheat fields in North Dakota, was purchased by a data center provider in 2022 to power a new data center to be built nearby. Thus, the plant will most likely continue operations for decades to come.

Like a lot of people, I had heard of a couple of deals in which output from a nuclear plant (or at least one unit of the plant) was committed to a data center provider – most notably, Microsoft’s signing of a 20-year power purchase agreement that will allow Constellation Energy to restart Unit 1 of the Three Mile Island nuclear plant in Pennsylvania (Unit 2 was shut down after the famous 1979 incident, but Unit 1 wasn’t affected by it). Note that Unit 1 has 837MW capacity, which is less than Coal Creek’s capacity.

However, I was startled when I searched for more information on the Coal Creek deal and I found this article from Power Magazine. It doesn’t even mention Coal Creek, but it makes clear that coal-fired plants all over the US are getting a new lease on life for one main reason: The huge power needs of AI can’t be satisfied just by the rapid increase in renewable energy production. Not only must renewable energy increase, but fossil fuel production – especially coal – can’t decrease for the foreseeable future.

In other words, if coal plants continue to close (or be scheduled for closure) at the rate they have over the past decade, the North American grid clearly won’t be able to satisfy both normal power demand (which wasn’t growing quickly before the AI boom) and AI demand. As Power pointed out, coal-fired generation has a new lease on life (and that will inevitably be the case worldwide, not just in North America, although the article doesn’t mention that). While this is good news for people whose jobs depend on coal plants (and who might have to move and take a pay cut, if they want to work in renewable energy), it isn’t good news for the fight against climate change.

Is somebody doing something wrong here? After all, workers in the coal plants, like most of us, would like to keep working in a job we understand and can perform well. The data center operators want to obtain the power they need to fulfill the orders that are flooding in from tech companies. The tech companies like Microsoft are trying to keep ahead of their competitors – and today, that means going all in on AI. The public is already benefiting from AI in many ways; they would be quite reluctant to see those benefits stop growing if AI’s power use is somehow disfavored by the public and private organizations that operate and regulate the grid.

Nobody is doing anything wrong, yet at the same time – as Sunny Wescott’s presentation cogently demonstrated – we need to do everything we can to keep the acceleration of climate change (i.e., the second derivative; we’re beyond being able to control the first derivative) from increasing any more than it already has. Are we all simply SOL, and will our grandkids ultimately end up needing to find another planet to live on?

I don’t think so, because I think there’s one link in this seeming circle of doom that can be broken: AI needs to figure out how to use much less energy than it currently uses, while not cutting back on the substantial benefits it is currently providing and will provide for society. This might be achievable by examining the fundamental assumptions on which what is currently called artificial intelligence is based.

I will do exactly that in a post that’s coming soon to a blog near you. 

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

 

Thursday, October 24, 2024

NERC Cloud Services Technical Conference Nov. 1

 


NERC has for a long time wanted to have a technical conference to address compliance and security issues with use of the cloud by NERC entities. I’m pleased to announce that it’s now scheduled for next Friday, November 1 (sorry for the late notice. NERC will put out an official announcement soon, hopefully today). It will run from 10 AM to 5 PM Eastern Time. Registration is available here.

The agenda is very well-thought-out. It consists of four panels; I will participate in the third panel. The fourth panel is an update from the new “Risk management for third-party cloud services” drafting team. I’ve seen the questions that will be asked of each panel, and I can assure you that all of the panels will be worth your time, if you’re available (of course, you can drop in and out of the meeting as your schedule permits).

I hope to see you there!


 

Sunday, October 20, 2024

How can we (really) automate software vulnerability identification?

I probably don’t need to tell you that vulnerability management is important for any organization, public or private, that uses software. If you’re not convinced of this, all you need to do is look at devastating ransomware attacks like WannaCry, NotPetya and Ryuk. All of these exploited known vulnerabilities for which patches were available.

I also probably don’t need to tell you it is impossible to manage vulnerabilities that affect software you use, if you can’t learn about them using frequent, fully automated searches – in which you enter an identifier for a software product and version and immediately discover all recently identified vulnerabilities that affect that product and version.

Yet, that is the situation today: The most widely used vulnerability database in the world is the US National Vulnerability Database (NVD). However, because of the NVD’s currently huge backlog of “unenriched” CVE (vulnerability) records dating from February of this year, any search for vulnerabilities that apply to a particular software product and version will yield on average fewer than one third of the vulnerabilities that have been identified this year for that product and version. Even worse, the NVD provides no warning about this situation.

This is analogous to a doctor who stopped studying new diseases eight months ago and can only diagnose diseases that were identified before then – yet never warns his patients that they might have contracted a disease he hasn’t yet learned about. In both cases, the end user or patient is more likely to be harmed by not knowing about a vulnerability or disease than to benefit from knowing about one they face. Ignorance is not bliss.

However, the NVD’s biggest problem isn’t their current backlog, but the fact that the CPE (“common platform enumeration”) software identifier that is required for all vulnerability lookups in the NVD has many problems - and there is no good solution for them. These problems cause many searches to fail, without any explanation for the failure. Even worse, the user will usually not even be informed that the search has failed.

In 2022, the OWASP SBOM Forum (which I co-lead) published a white paper on the CPE problem in the NVD. The central argument of that paper was that the purl (package URL) software identifier is far superior to CPE, and that CVE.org (the CVE Program, which is funded by the Department of Homeland Security and operated by MITRE) and the NVD should move as quickly as possible toward supporting both purl and CPE. After writing that paper, we submitted a “pull request” to CVE.org to add purl support to CVE records. That request came into effect when the CVE 5.1 specification was approved earlier this year.

However, the 5.1 specification alone didn’t solve the problem. The CVE Numbering Authorities (CNAs) create CVE records; that is, they report new vulnerabilities in software products, usually products developed by their own organization (for example, Microsoft, Oracle, Red Hat, Schneider Electric and HPE are all CNAs). The CNAs need to start adding purls to those records, yet few if any have done so thus far. One reason for this is that, even if the CNAs started doing that, the purls would be “all dressed up with nowhere to go”, since neither the NVD nor the CVE.org database currently allows a search using purl.

But there’s an even bigger problem: While purl has literally conquered the world of open source software, it can only be used to identify a tiny percentage of proprietary software products with vulnerabilities today. This means a user of a proprietary software product cannot look that product up in the NVD using purl; instead, they must use CPE. Purl can never be on an equal footing with CPE until it can be used to identify proprietary software products, not just open-source products.

The OWASP SBOM Forum has decided this is an unacceptable situation, especially since purl eliminates most of the problems that affect CPE. We are asking, “What will it take to give purl the capability to identify proprietary (closed-source) software, as well as open-source?”

Fortunately, two very smart individuals are members of the Forum. One is Steve Springett, creator and leader of two of OWASP’s major projects: Dependency-Track (which performs over 20 million automated vulnerability lookups every day - although few of these use the NVD. In fact, D-T mainly uses Sonatype’s OSS Index, an open source vulnerability database that is based on purl) and CycloneDX. The other is Tony Turner, the cybersecurity expert and SANS instructor who co-leads the SBOM Forum with me, along with Jeff Williams of Contrast Security.

Both Steve and Tony are quite familiar with purl, since they are both part of the project team. In fact, in the “early days” of purl (which were less than ten years ago, believe it or not), Steve worked closely on the design with Philippe Ombredanne, the creator of purl (who is also a member of the SBOM Forum). When the SBOM Forum developed our paper in 2022, Steve described two ideas for how to expand purl to identify proprietary software.

Before I explain Steve’s ideas (one of which Tony came up with separately), I need to point out the most important feature of purl: It isn’t based on a centralized “namespace” like CPE is. CPE names are created by contractors who work for the NVD (which is part of NIST). Unless one of those contractors creates the CPE name, it isn’t valid[i].

If a CNA or software user wants to learn the CPE name for a software product, they must use a variety of methods to find it – fuzzy logic, generative AI, prayer, etc. There is a centralized “CPE database”, but it is simply a list of all the CPEs that have ever been created, without any contextual information. As Bruce Lowenthal of Oracle has pointed out, this would be like listing all the words in the Bible in alphabetical order and calling that an English dictionary.

By contrast, purl creates a decentralized namespace. Purl consists of a series of one-word types, which currently mostly refer to package managers for open-source software (e.g., the “maven” type refers to the Maven Central package manager). All you need to know about package managers now is that they’re a single web location from which you can download software, if you know the name of the product and its version string. Since a single product/version pair can never be replicated within the package manager, each pair is unique. Therefore, each package manager has a controlled namespace.

What’s more important is that the combination of three pieces of information – package manager (type), product name within the package manager, and version string – is guaranteed to be unique within the entire purl namespace (i.e., across all purl types). What’s even more important is that the user of the product doesn’t have to query a central database to find out the purl for their product. The user can create the purl on their own, using information they already have.

To create the unique purl, the user just needs to know the type (package manager), and the name and version string in that package manager. For example, the purl for version 1.11.1 of the Python package named “django” in the PyPI package manager is “pkg:pypi/django@1.11.1”.[ii]
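To make this concrete, here is a minimal Python sketch of how a supplier and a user could each build the same purl independently. The function name make_purl is my own invention for illustration. The sketch only covers the three pieces of information discussed above; the full purl specification also allows other components and has type-specific normalization rules, which I have left out.

```python
from urllib.parse import quote


def make_purl(purl_type: str, name: str, version: str) -> str:
    """Build a minimal purl of the form pkg:<type>/<name>@<version>."""
    # The type is lowercased; name and version are percent-encoded so that
    # characters such as spaces or "@" cannot break the identifier.
    # Simplification: type-specific normalization rules are not applied.
    return f"pkg:{purl_type.lower()}/{quote(name, safe='')}@{quote(version, safe='')}"


# The supplier and the user each build the purl on their own,
# starting from the same three facts.
supplier_purl = make_purl("pypi", "django", "1.11.1")
user_purl = make_purl("pypi", "django", "1.11.1")

print(user_purl)                   # pkg:pypi/django@1.11.1
print(supplier_purl == user_purl)  # True
```

Neither party had to look anything up in a central database; the match follows from both of them starting with the same type, name and version string.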

Of course, even though the user can always re-create the correct purl for the product, that will only help them identify a vulnerability if the supplier reports vulnerabilities in that product/version to CVE.org[iii] using the same purl; that way, the purl the user enters in the vulnerability database will match the purl on the CVE record. This is how CPE is supposed to work, but since it’s impossible to know for certain what the NVD contractor actually created, there can never be any certainty regarding CPE.

For example, if the contractor used “Microsoft” as the vendor name, that CPE will be different than if they used “Microsoft, Inc.” If a user, who is trying to learn about vulnerabilities in a Microsoft product, creates a CPE according to the CPE specification, they will have to guess which of these is the vendor name used by the contractor, since they will be different CPEs.

What is worse is that if they guess wrong and search on the wrong CPE, they will simply be informed that “There are 0 matching records”. This is the same message they would receive if they had guessed correctly, but there are no vulnerabilities listed in the NVD that apply to that product/version (which might be interpreted to mean the product/version has a “perfect record”). There is no way for the user to learn which is the case.
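The ambiguity can be shown with a toy example in Python. Everything in it is invented for illustration: the vendor "examplecorp", the products, the CVE number and the in-memory "database" are all hypothetical, and the search function only imitates the behavior described above. It is not how the NVD is implemented.

```python
# Toy stand-in for a CPE-keyed vulnerability database. Every entry is invented.
TOY_DB = {
    "cpe:2.3:a:examplecorp:widget:2.0:*:*:*:*:*:*:*": ["CVE-0000-0001"],
    "cpe:2.3:a:examplecorp:gadget:1.0:*:*:*:*:*:*:*": [],  # real entry, no CVEs
}


def search(cpe: str) -> str:
    """Return the only feedback the user gets: a count of matching records."""
    matches = TOY_DB.get(cpe, [])
    return f"There are {len(matches)} matching records"


# The user guesses the vendor name incorrectly for a vulnerable product.
wrong_guess = search("cpe:2.3:a:examplecorp_inc:widget:2.0:*:*:*:*:*:*:*")

# The user guesses correctly for a product that has no reported vulnerabilities.
clean_product = search("cpe:2.3:a:examplecorp:gadget:1.0:*:*:*:*:*:*:*")

print(wrong_guess)                   # There are 0 matching records
print(clean_product)                 # There are 0 matching records
print(wrong_guess == clean_product)  # True: the two cases look identical
```

An automated tool that receives the first message has no way to tell that it missed a vulnerable product.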

With purl, as long as the user knows the package manager they downloaded the product from and the product’s name and version string in that package manager, they should always (barring a mistake) be able to create the same purl that the supplier used when they reported the vulnerability. This is why purl has literally conquered the open source software world. In that world, it would be difficult even to say there is a number two software identifier after purl.

Of course, the key to purl’s success is the existence of package managers in the open source world; it would be much more difficult to create a distributed namespace without them. That raised the question in a few creative people’s minds: Is there an analogue to package managers in the proprietary software world? At different times, both Steve and Tony realized that the answer to this question is yes: it’s app stores.

Like package managers, app stores (these include the Apple Store - which is in fact five stores - as well as Google Play and the Microsoft Store, although there are many smaller stores as well) do the following:

1.      Provide a single location from which to download software;

2.      Control the product namespace within the store, so that each product has a unique name; and

3.      Ensure that each version string is unique for the product to which it applies. For example, the product named Foo won’t have two versions that have the same version string, say “4.11.6”.

In other words, app stores can probably be treated in purl like package managers are treated today. Each app store will have its own purl type, just like package managers do now. Perhaps the most impressive aspect of adding app stores to the purl ecosystem is that, as soon as a purl type is created for a new store, all the products in that store (for example, Google Play currently contains about 3.5 million products) will instantly have a purl. No NVD employee or contractor (or anyone else) needs to do anything to enable this to happen.
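Here is a small Python sketch of why no one needs to do anything per product. The type name "examplestore" and the catalog are made up; I am not suggesting that any app store type exists in the purl specification today, or what such a type would be called.

```python
from urllib.parse import quote

# Hypothetical purl type for an app store (an assumption for illustration only).
STORE_TYPE = "examplestore"

# Invented catalog. The store guarantees each (name, version) pair is unique.
catalog = [("foo", "4.11.6"), ("foo", "4.12.0"), ("bar", "1.0")]

# Once the type is defined, every product/version pair maps to a purl mechanically.
purls = [
    f"pkg:{STORE_TYPE}/{quote(name, safe='')}@{quote(version, safe='')}"
    for name, version in catalog
]

# Unique pairs in the store yield unique purls, with no central registry involved.
print(len(set(purls)) == len(catalog))  # True
print(purls[0])                         # pkg:examplestore/foo@4.11.6
```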

What about proprietary products that aren’t in app stores?

The great majority of proprietary software products are not available in app stores, but from the website of either the developer or a distributor. How can purl be expanded to include them?

In the SBOM Forum’s 2022 paper, we provided a two-paragraph high level description of the purl solution we were suggesting for proprietary software, based on an idea of Steve Springett’s:

1.      When a developer releases a new software product or a new version of an existing software product, they will create a short document (called a tag) that provides important information on the product, especially the name, supplier and version string.

2.      When a user downloads that product from the developer’s website (presumably after paying for it), the user will also receive the tag; they can use the information in the tag to create the purl for the product (perhaps like the purl described above)[iv].

Since the supplier created the tag in the first place, when they report a vulnerability for the product to CVE.org, they should use a purl that includes the information from the tag. Thus, the purl created by the user will match the one created by the supplier, since they are both based on the same tag. When the user searches a vulnerability database using that purl, they are sure to learn about any vulnerabilities the supplier has reported for the product.

Rather than create our own format for the product information tag, Steve suggested that we use the existing SWID (“software identification”) format. SWID is a specification (codified in the ISO/IEC 19770-2 standard, first published in 2009 and revised in 2015) that was developed by NIST. It was originally intended to be the replacement for CPE in the NVD and to be distributed with the binaries for a software product. However, it never gained much traction for that purpose. NIST has dropped the idea of replacing CPE with SWID tags in recent years.

Steve realized that, since SWID is an existing standard and a lot of software products have SWID tags now (for example, for about two years, Microsoft distributed SWID tags with all their new products and product versions), it would be better to use that than to create a new format; this was especially important, since the SWID format includes all the information required to create a usable purl. Steve defined a new purl type called “swid” and got it added to the purl specification in 2022. He also developed a tool that creates a purl based on information in a SWID tag.[v]
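To give a feel for how a tag could be turned into a purl automatically, here is a rough Python sketch. It is not Steve’s tool. The sample tag describes a fictional product, the function name swid_to_purl is hypothetical, and the layout of the resulting purl (creator name and regid as the namespace, plus a tag_id qualifier) reflects my reading of the swid type definition. Check it against the purl specification before relying on it; the percent-encoding here is also simplified.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# XML namespace of the 2015 SWID schema (assumed to be the edition in use).
SWID_NS = "http://standards.iso.org/iso/19770/-2/2015/schema.xsd"

# Invented tag for a fictional product from a fictional supplier.
SAMPLE_TAG = f"""<SoftwareIdentity xmlns="{SWID_NS}"
    name="Enterprise Server" version="1.0.0"
    tagId="00000000-0000-4000-8000-000000000001">
  <Entity name="Acme" regid="example.com" role="tagCreator softwareCreator"/>
</SoftwareIdentity>"""


def swid_to_purl(tag_xml: str) -> str:
    """Derive a pkg:swid purl from the fields of a SWID tag (simplified)."""
    root = ET.fromstring(tag_xml)
    segments = []
    # Use the first entity whose role includes softwareCreator as the namespace.
    for entity in root.findall(f"{{{SWID_NS}}}Entity"):
        if "softwareCreator" in entity.get("role", "").split():
            segments = [entity.get("name", ""), entity.get("regid", "")]
            break
    segments.append(root.get("name", ""))
    path = "/".join(quote(s, safe="") for s in segments if s)
    version = quote(root.get("version", ""), safe="")
    tag_id = quote(root.get("tagId", ""), safe="")
    return f"pkg:swid/{path}@{version}?tag_id={tag_id}"


print(swid_to_purl(SAMPLE_TAG))
```

Because the supplier and the user both start from the same tag, running the same derivation on each side should produce the same purl.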

However, our 2022 document didn’t address two important questions:

1.      For legacy products, if the supplier didn’t create a SWID tag originally, who should create one now? Presumably, it will be the current supplier of the product, even if the product has been sold to a different supplier in the meantime.

2.      How will the user of a product, for which the supplier has created a SWID tag, locate and access the tag? While the supplier could develop a mechanism through which a customer can automatically locate and download the tag from their website, there will soon be a much more universal method for discovering and accessing software supply chain artifacts: the Transparency Exchange API. This is being developed by the CycloneDX project. It will be fully released by the end of 2025, when it will also be approved as an ECMA standard.

How will all of this happen?

The OWASP SBOM Forum believes that, once purl can represent proprietary software products (after the required new types are implemented in the purl specification), the following set of steps[vi] will be set in motion:

1.      A “purl expansion working group” – including members from many different types of organizations – will meet regularly to work out required details for expansion of purl to proprietary software products. The group will publish these details (most likely as OWASP documents). The group will also:

a.      Recruit operators of app stores to participate in the purl community, along with creating a new purl type for each store and submitting the pull request to add that type to the purl specification; and

b.      Conduct tabletop exercises with software suppliers to test the formats and procedures required to implement the purl SWID tag program. This will include testing the purl SWID type definition. This definition was created more than two years ago, but it has only been tested by a few software developers. It needs to be subjected to broader “tabletop” testing.

2.      Private and governmental security organizations (including CVE.org) conduct awareness and training programs on the activities described in this paper, especially regarding the development, distribution and use of SWID tags to create purls for proprietary software products. These programs will target CNAs, software suppliers, security tool vendors, vulnerability database operators and larger end user organizations, including government agencies.

3.      Suppliers create SWID tags for their products, starting with new products and product versions and continuing with legacy products that do not yet have SWID tags.

4.      Suppliers make their SWID tags available through one (or more) of three channels: a) directly to customers, b) in a machine-accessible format on their website, and c) using the Transparency Exchange API, when it is available.

5.      After being trained in purl and the new purl types for proprietary software, CNAs start including purls in CVE records. The purls are based on the suppliers’ SWID tags.

6.      Vulnerability databases based on CVE records (perhaps including the NVD) advertise the fact that users can now find vulnerabilities in proprietary software using purl. They offer training materials (webinars, videos, website content and hard-copy publications) for users.

7.      Users begin to see the advantage of using purl. The primary advantage is that they can deploy fully automated tools for vulnerability identification without having to intervene regularly in the identification process, as is the case with CPE.

8.      As suppliers realize their SWID tags are being accessed by their customers, they also see this is giving them a small but tangible marketing advantage over competitors.

9.      Purl-based open source vulnerability databases see increased traffic once they start accepting the new purl types, as users realize they now have a “one-stop-shop” for identifying vulnerabilities in both open source and proprietary software.

10.   Operators of CPE-based vulnerability databases (especially the NVD) notice that not having to create at least one CPE for every new CVE record saves their staff a lot of time. They also notice that users of those databases are expressing more satisfaction with their experience, since a much higher percentage of the purls they enter find a match in the CVE records than was the case for the CPE names they entered when CPE was the only software identifier available to them.

11.   As CNAs begin to realize that users are taking purl seriously, they add more purls, and fewer CPEs, to CVE records.

12.   The above set of steps cycles continually, until growth of the overall vulnerability database “market” results in continuous growth of both purl and CPE, with roughly constant “market shares”.

The OWASP SBOM Forum is under no illusion that the above set of steps will be accomplished very quickly, given the current rudimentary state of awareness regarding purl and its advantages. On the other hand, the fact that truly automated vulnerability management is currently almost impossible in the NVD makes it even more important that we start implementing a real solution to those problems, while still hoping that the NVD will eliminate their huge backlog of unenriched CVE records in the coming one or two years.

There is good reason to believe that if we start now, within 3-4 years purl will be widely accepted and used to identify vulnerable proprietary software products in most vulnerability databases. We say this because this will be the second time that purl has been quickly accepted. Here is the story of the first time:

Steve Springett[vii] states that in 2017 and 2018, purl had little traction in the open source world, because it was so new. Steve’s Dependency-Track and CycloneDX projects, along with Sonatype’s OSS Index vulnerability database, were among the few early adopters of purl in 2018. Yet, purl was in wide use in the open source community by 2022. Steve points out that today, purl has been adopted by “most SCA vendors, hundreds of open source and proprietary tools, and multiple sources of vulnerability intelligence.” I would add that purl is used today by literally every major vulnerability database worldwide, other than the NVD and databases based on NVD data. Indeed, purl has “won the war”, when it comes to identifiers for open source software.

Of course, the world of proprietary software is quite different from the open source world, since the participating organizations are sometimes true competitors; that is not often the case with open source software. However, once new purl types are developed to allow identification of proprietary software, it should not require a heavy lift for databases now based on purl to accommodate those new types. This means that, soon after the new purl types for proprietary software have been incorporated into the purl specification, big purl-based vulnerability databases like OSV and OSS Index, which today only support open source software, may quickly support vulnerabilities in proprietary software products as well.
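For readers who haven’t seen a purl-based lookup, here is a short Python sketch of what one looks like today for open source software, using OSV as the example. The endpoint and the request and response shapes are as I understand the public OSV API; please verify them against the OSV documentation. The function names are my own. The sketch prepares the request and prints it, but only contacts the network if you call lookup yourself.

```python
import json
import urllib.request

# Query endpoint of the public OSV API (as I understand it; verify before use).
OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_osv_request(purl: str) -> urllib.request.Request:
    """Prepare (but do not send) a query asking OSV about one purl."""
    body = json.dumps({"package": {"purl": purl}}).encode("utf-8")
    return urllib.request.Request(
        OSV_QUERY_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def lookup(purl: str) -> list:
    """Send the query and return the list of vulnerability records, if any."""
    with urllib.request.urlopen(build_osv_request(purl), timeout=30) as resp:
        return json.load(resp).get("vulns", [])


request = build_osv_request("pkg:pypi/django@1.11.1")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

If the new purl types for proprietary software are accepted, a database that already handles requests like this one would mainly need to recognize the additional types.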

Looking ahead

The OWASP SBOM Forum has recently published a white paper that discusses all the above topics in more detail. It is available for download here. We are actively discussing these topics in our meetings and welcome new participants. Our meetings are every other Tuesday at 11AM ET and every other Friday at 1PM ET. To receive the invitations for these meetings, email tom@tomalrich.com.

We currently expect this to be a two-phase project:

1.      Planning and Design. This will consist of just the first of the above steps. We believe this phase will require no more than 4-5 months of bi-weekly meetings (plus online asynchronous work between meetings, including soliciting participation by app stores and software suppliers and conducting the tabletop exercise to test adequacy of the SWID purl type). This phase will require a modest budget for coordination of those activities.

2.      Rollout. All steps listed above other than the first are included in this phase. This phase can be summed up as “training and awareness”. While training and awareness activities are not inherently difficult, they require large numbers of people to be involved, both on the “trainer” and “trainee” sides. We estimate that this phase will require five to ten times the amount of resources required for the first phase.

We estimate that the first phase will require approximately $50,000 to $100,000 in funding, although we are willing to start work on this phase with less than that amount committed. Since the resources required for the second phase will depend on the design developed in the first phase, we will wait until at least a high-level design is available during the first phase, before estimating the second phase and seeking funding.

We invite all interested parties, including software developers, software security service and tool providers, and end users of software of all types, both to participate and to donate to this effort. Donations (both online and directly) over $1,000 can be made to OWASP and “restricted” to the SBOM Forum[viii]. Any such donations are very welcome (OWASP is a 501(c)(3) nonprofit organization, meaning many donations will be tax-deductible. However, it is always important to confirm this with tax counsel). To discuss a donation of any size, please email tom@tomalrich.com and tony@locussecurity.com. 



[i] Due to the NVD’s current problems in creating CPEs, CISA has been designated an “Alternate Data Provider”, who can create authoritative CPEs that have the same status as those created by the NVD contractors. CISA’s “Vulnrichment” program has created many CPEs since their designation, but these are just a fraction of the number required to reduce the backlog.

[ii] Every purl begins with the prefix “pkg”. This prefix is not needed today, but will be in the future.

[iii] Many open source vulnerabilities are not reported to CVE.org, but instead to a vulnerability database like GitHub Security Advisories (GHSA). Many of these databases share their vulnerabilities with the OSV database (managed by Google), where they are displayed using the OSV (Open Source Vulnerability) format, whose schema is maintained under the OpenSSF. Most OSV records can be mapped to the CVE format.

[iv] Of course, the user should not have to create the purl manually; the process can be completely automated within a vulnerability management tool.

[v] Steve’s tool requires the user to manually input data from the SWID tag, but the code can of course be adapted for automated use by a vulnerability management tool.

[vi] These steps aren’t strictly a “chain”, since they will ideally happen simultaneously, at least after an initial “startup” period. During that startup period, though, each step listed generally depends on the previous step being accomplished.

[vii] In an email on October 19.

[viii] OWASP reserves ten percent of each “restricted” donation to fund administration. That is, OWASP doesn’t simply pass the donation through to the project team – in this case, the SBOM Forum. Instead, as the project team performs work or incurs other expenses on the project, they submit invoices to OWASP, which determines whether they are appropriate before paying them.