Tom Alrich's Blog: November 2024

Wednesday, November 27, 2024

The fundamental problem preventing CIP compliance in the cloud today

I now believe there are five main problems that make it hard, if not impossible, for NERC entities to maintain CIP compliance while deploying certain CIP-related workloads in the cloud. Each of these problems is unique and requires a unique solution. I will discuss each of them, as well as their possible solutions, in separate posts soon. They are (in approximate order of importance):

1. The “EACMS problem”.

2. The medium impact renewables Control Centers problem.

3. The SaaS/BCSI problem.

4. The high and medium impact utility Control Centers problem.

5. The low impact Control Centers problem.

Contrary to what many people think, it isn’t true that the NERC CIP requirements in any way “forbid” use of the cloud by assets that fall under the purview of CIP. The current requirements had their genesis in the years after 2008, when FERC approved CIP version 1. At that time, the cloud was very new. The idea that assets that control the power grid might at some point be deployed in the cloud was almost unthinkable. Therefore, the original CIP requirements said nothing about the cloud, because nobody thought it was likely this would ever become an issue.

To this day, there is no mention of the cloud in any CIP requirement or definition. This means that any NERC entity that wishes to outsource their entire OT environment to the cloud can do so without fear of being in direct violation of any CIP requirement – as long as they don’t mind receiving a boatload of Notices of Potential Violation anyway. How can this happen?

It can happen because, as any NERC compliance person well knows, remaining in compliance with any NERC requirement means being able to provide appropriate evidence of compliance. Non-binding suggestions for that evidence are usually found in the “Measures” column of the Requirement. The general rule of NERC compliance is: “If you didn’t document it, you didn’t do it.”

Of course, proving compliance with any cybersecurity standard always requires some sort of evidence. However, NERC CIP differs from other standards in that the NERC entity needs to be prepared to provide evidence that they were compliant with a CIP requirement in every instance in which compliance was required; it doesn’t matter whether the systems in question are deployed on premises, in the cloud, or both.

The problem with this is that the Measures were all developed for the on-premises use case only (except for the Measures shown in Requirement CIP-004-7 Part R6.1 and Requirement CIP-011-3 Part R1.2, which were developed with both cloud and on-premises systems in mind).

For many CIP Requirements and Requirement Parts, evidence for compliance in the cloud does not pose a problem, since the Requirement merely states the objective to be achieved; usually, the objective is implementing a policy or procedure. For example, CIP-005-7 Requirement R1 Part R1.5 requires the entity to “Have one or more methods for detecting known or suspected malicious communications for both inbound and outbound communications.”

The Measures section of that Requirement Part reads, “…documentation that malicious communications detection methods (e.g. intrusion detection system, application layer firewall, etc.) are implemented.” In other words, the NERC entity needs to document how they complied with this Requirement Part, but they are allowed to choose the technology or technologies they implement to achieve the objective. For on-premises systems, the evidence might be output produced by an IDS or an application layer firewall. For the cloud, it might be an audit report for ISO 27001 certification or FedRAMP authorization, which describes how the CSP has complied with a requirement to detect malicious inbound communications. A CIP auditor might consider both of these to be evidence of compliance with CIP-005-7 Requirement R1 Part R1.5.[i]

However, some NERC CIP Requirements are not objectives-based, but instead mandate that particular actions be performed without regard to achieving a particular objective. For example, CIP-007-6 Requirement R2 Part R2.2 requires the NERC entity to “At least once every 35 calendar days, evaluate security patches for applicability that have been released since the last evaluation from the source or sources identified in Part 2.1.”

Among other things, this tightly packed Requirement Part mandates that the NERC entity check with the vendor of every software product installed on any system within their high or medium impact Electronic Security Perimeter every 35 days, to determine whether a new security patch is available for their product. The entity then needs to determine whether the patch is applicable to their product or environment. If it is applicable, they need to either apply the patch or develop a mitigation plan.
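To make the 35-day cadence concrete, here is a minimal sketch of the kind of check an entity might run against its own records. The log of evaluation dates and the function name are hypothetical, and I am assuming that a gap of exactly 35 calendar days still counts as on time.

```python
from datetime import date

MAX_GAP_DAYS = 35  # interval stated in CIP-007-6 R2 Part 2.2


def find_late_evaluations(evaluation_dates):
    """Return (previous, current, gap_days) for every gap longer than 35 days."""
    ordered = sorted(evaluation_dates)
    late = []
    for previous, current in zip(ordered, ordered[1:]):
        gap = (current - previous).days
        # Assumption: a gap of exactly 35 days is still on time.
        if gap > MAX_GAP_DAYS:
            late.append((previous, current, gap))
    return late


# Hypothetical evaluation log for a single patch source.
log = [date(2024, 1, 2), date(2024, 2, 5), date(2024, 3, 15), date(2024, 4, 18)]
late_gaps = find_late_evaluations(log)
```

With this made-up log, only the interval between February 5 and March 15 is flagged. A real program would need one such log per patch source identified in Part 2.1.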

A security patch almost always fixes one or more software vulnerabilities (often identified using a CVE number). However, according to the well-respected vulnerability intelligence firm VulnCheck, only 1.1% of publicly known vulnerabilities are observed being exploited by attackers.

Does this mean that an organization not subject to CIP compliance could safely deploy just 1.1% of the security patches that are made available for their systems? Since active exploitation is always an ex post facto measure, waiting for your system to be exploited before applying the patch is probably not a great strategy. However, there are indicators of actual or likely exploitation available, such as the EPSS score (which estimates the probability that a vulnerability will be exploited) and CISA’s Known Exploited Vulnerabilities (KEV) catalog, that all organizations can use to prioritize their patching efforts.

Since virtually all large organizations today have a big backlog of patches to apply but nowhere near enough bandwidth to apply them all, they must triage them. They need to accept the fact that they won’t be able to apply all patches, and instead divide them into three groups: patches they definitely will apply, others they definitely won’t apply, and still others they will apply only if time permits.
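The three-way triage just described can be sketched in a few lines. This is only an illustration of the idea, not a recommended policy: the Patch record, the patch names, and both thresholds are assumptions I made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Patch:
    name: str
    epss: float    # EPSS probability (0.0 to 1.0) for the worst CVE the patch fixes
    in_kev: bool   # True if any fixed CVE appears in CISA's KEV catalog


def triage(patch, apply_threshold=0.10, skip_threshold=0.01):
    """Assign a patch to one of three groups.

    Both thresholds are illustrative assumptions, not recommended values.
    """
    if patch.in_kev or patch.epss >= apply_threshold:
        return "apply"
    if patch.epss < skip_threshold:
        return "skip"
    return "if_time_permits"


# Hypothetical backlog.
backlog = [
    Patch("vendor-a-hotfix", epss=0.92, in_kev=True),
    Patch("vendor-b-update", epss=0.03, in_kev=False),
    Patch("vendor-c-update", epss=0.002, in_kev=False),
]
groups = {p.name: triage(p) for p in backlog}
```

Here the first patch lands in the "apply" group, the second in "if_time_permits", and the third in "skip". Each organization would need to choose its own cutoffs based on its risk tolerance and available staff time.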

However, a NERC entity subject to compliance with CIP-007 is not allowed to consider information about active exploitation (or anything else) in deciding whether to apply a security patch. Requirement CIP-007-6 Part 2.2 does not allow NERC entities with high or medium impact BES Cyber Systems to ignore any patch because the vulnerability it fixes has a very low risk of exploitation. It doesn’t matter whether the patch mitigates any significant security risk or not; if it is available and it applies to the NERC entity’s configuration, it must be applied or covered by a mitigation plan.[ii]

I don’t honestly know whether CSPs follow the NERC CIP approach and try to apply every available security patch regardless of whether it mitigates any risk, or whether they triage patches based on the risk of exploitation of the vulnerability(ies) that are mitigated by the patch. However, if they take the latter approach, they are not complying with the letter of CIP-007 R2, even though I believe a risk-based approach is best for almost any cybersecurity problem.

But there is a much bigger problem that prevents platform CSPs from producing compliance evidence for prescriptive CIP requirements in the cloud like CIP-007-6 R2, CIP-010 R1, and CIP-005 R1: Evidence for these three requirements must be produced on an individual device basis, because the requirements can only be complied with on the device level. And since single cloud workloads migrate from system to system and data center to data center all the time, a single BCS might reside on hundreds or even thousands of individual devices during a three-year audit period. There is simply no way a platform CSP could ever produce the full set of evidence for any of the prescriptive CIP requirements, even if they were inclined to do so.

However, the platform CSPs could potentially comply with CIP requirements that just mandate policies or procedures. If they do comply, it will probably be with three non-negotiable positions[iii]:

1. They will not provide CIP compliance evidence to individual NERC entities, but only to NERC itself (or perhaps a third party designee). It will be up to NERC to share that evidence with any NERC entity that can demonstrate a need for it.

2. The evidence the platform CSP provides will include selections from audit reports for ISO 27001 certification, FedRAMP authorization, and SOC 2 Type 2 audits. It will be up to the individual NERC entities to decide what to make of that information; NERC will never “certify” CSPs for use by NERC entities (indeed, any attempt to do so might be considered an antitrust violation).

3. While platform CSPs are open to answering other questions besides the ones included in frameworks like the NIST CSF, ISO 27001 and FedRAMP, the questions will need to be agreed on beforehand by NERC entities. NERC will administer the questions and evaluate any evidence provided by the CSPs before distributing it to the NERC entities. However, this will not be a “compliance audit”, since neither NERC nor FERC has any jurisdiction over the CSPs.

While I think this NERC “audit” of platform CSPs needs to be part of whatever final set of solutions comes out of the current CIP standards drafting effort, I don’t see any way it can be part of the CIP standards themselves. What I’m describing here will require changes to the NERC Rules of Procedure. Drafting those changes will probably require a NERC team separate from the current Project 2023-09 Risk Management for Third-Party Cloud Services team, and possibly one or two additional years of effort, before all the pieces of the full solution are in place. That’s why the NERC CIP community needs to think now about partial solutions that will allow NERC entities that wish to do so to make as much use of the cloud as possible, without requiring a complete rewrite of the CIP standards.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

[i] Currently, a NERC CIP auditor is unlikely to accept an audit report as compliance evidence, since there is nothing in the Rules of Procedure that allows for acceptance of audit results - other than CIP audit results - as evidence of compliance with a CIP Requirement. A permanent fix to this problem will probably require changing the Rules of Procedure, although as a temporary measure a “CMEP Practice Guide”, which is created by the NERC auditors to address an area of ambiguity in the current requirements, would probably be sufficient.

[ii] Many tools now ease the burden of compliance with this Part, although there is always a large amount of care and feeding involved with CIP-007 R2 compliance, regardless of the degree of automation.

[iii] This section just applies to the major platform CSPs, not to SaaS providers. I think the latter should be subject to a NERC “audit” as well, but it should be very different from that of the platform CSPs, since their situation is very different.

Wednesday, November 20, 2024

NERC CIP: Is there a risk of over-concentration in the cloud?

Most of the discussion of cloud risks in the power industry has focused on cybersecurity: Is it safe to deploy critical infrastructure systems and data in the cloud? And if so, how can we fix the NERC CIP standards so that electric utilities and Independent Power Producers (IPPs) feel comfortable using the cloud for all types of workloads? Currently, NERC entities don’t feel comfortable using the cloud for systems subject to CIP compliance (even low impact ones). You could say quite truthfully that the electric power industry is seriously underutilizing the cloud because of fears of becoming non-compliant with CIP, even though those fears are often not justified.

But there has lately been discussion of the opposite risk: that too many power industry assets will be deployed in the cloud, so that a big outage at one of the major CSPs will cause a substantial adverse impact on the power grid – or something like that.

Before we go any further, let’s have a reality check on whether this is a current problem. How many assets subject to NERC CIP are deployed in the cloud today? At NERC’s GridSecCon conference last month, I heard a person who is in a position to know this answer the question:…(drumroll, please….) There are currently two (count them!) small low-impact Control Centers deployed in the cloud (he didn’t say whether they’re deployed with the same platform CSP or not, let alone whether they’re in the same data center, etc).

I don’t know about you, but this hardly strikes me as an imminent danger. Perhaps in five years or more we will need to think harder about this problem. But since the subject comes up repeatedly (as it did during NERC’s Cloud Technical Conference three weeks ago), it seems we can’t avoid it. What is the nature of this threat?

This threat is usually described as due to the over-concentration of generation assets in the cloud – that is, systems that control generation. Let’s say the two Control Centers just mentioned control generating assets (although they’re much more likely to control renewables like wind and solar. Given the low latency requirements of synchronous – i.e., mostly fossil – generation, I find it doubtful that any such generation will ever be deployed in the cloud. Given that renewables generation is interrupted all the time when the sun doesn’t shine and the wind doesn’t blow, it’s hard to see how interruption of any quantity of renewable generation would lead to a grid emergency. But let’s pretend that it would…).

Furthermore, let’s pretend (isn’t this fun?) that those two Control Centers are now twenty, and they’re all deployed with the same platform CSP. Additionally, let's say these each control 1,000MW (I’m keeping that number under 1,500MW, since reaching that level would make the Control Center medium impact. Currently, medium and high impact systems cannot be deployed in the cloud without putting their owners into non-compliance with multiple CIP requirements).

Now, let's say the CSP experiences a catastrophic outage that brings it mostly down (which of course has hardly ever happened). Will this bring all that generation to an immediate halt? Only if none of these 1,000MW Control Centers has a backup somewhere else, perhaps in another CSP’s cloud or on premises at the IPP – which I hope is not the case, although again the risk with renewables is much lower. So now we must imagine a much larger outage that affects multiple platform CSPs. And since each CSP should (and does) have complete redundancy for all their workloads, we also must imagine that none of those redundant workloads is operational, either.

Of course, if someone with a vivid imagination (and I don’t deny those people exist) wants to assume that all the unfortunate events just described can easily come to pass, how will we prevent this from happening? The answer seems quite simple: We reduce the concentration of Control Centers in the cloud.

One way to do this is to simply ban cloud use by the power industry. But let’s say that’s too drastic. How do we reduce the concentration of grid control in the cloud without totally banning it? To do this, we need to bring the CSP into the compliance picture by developing special NERC CIP standards that apply to them. They will need to figure out some way to limit their number of NERC CIP customers (perhaps by limiting them to entities whose names begin with A through L?) and be subject to penalties if they don’t do that.

Today, no CSP is subject to compliance with NERC standards, and it’s hard to identify any Functional Model designation they would fall into. One that has been mentioned is Generator Operator (GOP). After all, the CSP in the above example hosts assets that control at least 20,000MW. Even a single one of those 1,000MW Control Centers would bring the CSP over the threshold (75MW) to be low impact under CIP.

But just because the CSP hosts a bunch of servers that are controlling generation, does that make the CSP a GOP – as opposed to the entity that actually controls those servers? If so, then a CSP that hosts servers involved in pharmaceutical manufacturing should be declared a drugmaker and become subject to FDA regulations. And a CSP that hosted software for manufacturing aircraft would be declared an aircraft manufacturer and would perhaps be subject to FAA regulations.

The fact is that the CSP is just a platform used by the actual GOP, in the same way that HP or Dell servers are used on premises to control generation. Just like HP and Dell, CSPs will never consent to become NERC entities, nor will it ever make sense to try to force them to do that. If over-concentration in the cloud ever becomes a problem, the way to combat it will be to take other measures to prevent at least some NERC entities from using the cloud. Come to think of it, we’re doing a great job of that today. We just need to keep up what we’re doing!

Tuesday, November 19, 2024

Some good news regarding CPE

Today, Patrick Garrity of VulnCheck sent me this link to a blog post they just put up. Here’s a very high-level summary:

1. VulnCheck, which operates a vulnerability database that’s based on data from the National Vulnerability Database (NVD), is announcing that their NVD++ database, which they started offering for free after the NVD’s near-collapse early this year, has generated CPE names for 76% of new CVE records added to the NVD since February.[i] This compares with the NVD, which has only created CPEs for about 41% of new CVEs.

2. If that were all that VulnCheck did, I wouldn’t be very impressed. The biggest problem with CPE is that there’s no way to ensure that two people working with the same information about a software product will produce identical CPE names for the product (see this post for more information on that). Since creating CPEs is technically NIST/NVD’s sole responsibility, when and if the NVD finally gets back up to full speed, there’s no assurance they will accept CPE names created by any third party besides CISA, whose Vulnrichment program has a special status. This means that users of NVD++ who are using VulnCheck’s CPE names in their internal vulnerability management process face the real prospect of having to replace all of those with whatever CPE names the NVD comes up with.

3. However, VulnCheck has addressed this potential problem by basing all their new CPEs on CPEs that have already been created by the NVD (these are found in the mis-named “CPE Dictionary”, which is just a list of every CPE that the NVD has created). Thus, it is likely that VulnCheck’s CPEs will be accepted by the NVD when it is running properly again, and VulnCheck’s users will not have to replace them in the future.

Does this mean I have changed my longtime opinion that purl is a far superior identifier to CPE, and vulnerability identification using purl is much more accurate than it is using CPE? Not at all. Currently, for open source software, purl is much better than CPE.

However, the same cannot be said for commercial software, since purl currently does not have a way of identifying it. This is why the OWASP SBOM Forum is proposing to develop new purl “types” that will make it possible for purl to identify commercial software products[ii]. This, coupled with the fact that purls can now be included in CVE records (previously, only CPEs could be used), means that in a few years there will be vulnerability databases (perhaps including the NVD) in which it will be possible to search for vulnerabilities in both open source and commercial software products using purl – although CPEs will continue to be created and used for many years and may never be totally abandoned.
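For readers who have not looked closely at either identifier, the following sketch shows the difference in how the two are put together. It is a simplification: both helper functions are my own, the vendor and product names are invented, and neither function handles the escaping or encoding rules in the real CPE 2.3 and purl specifications.

```python
def cpe23(vendor, product, version, part="a"):
    """Build a CPE 2.3 formatted string; unused attributes are left as '*'."""
    # After 'cpe:2.3' come 11 attributes: part, vendor, product, version,
    # then update, edition, language, sw_edition, target_sw, target_hw, other.
    attributes = [part, vendor, product, version] + ["*"] * 7
    return "cpe:2.3:" + ":".join(attributes)


def purl(pkg_type, name, version, namespace=None):
    """Build a minimal purl of the form pkg:type/namespace/name@version."""
    path = f"{namespace}/{name}" if namespace else name
    return f"pkg:{pkg_type}/{path}@{version}"


# Two plausible CPE names for the same hypothetical product. Nothing in the
# format says which spelling of the vendor or product is the right one.
cpe_a = cpe23("example_corp", "widget_server", "2.1")
cpe_b = cpe23("examplecorp", "widget-server", "2.1")

# For open source, the package manager already fixes the name and version,
# so two people working independently arrive at the same purl.
django_purl = purl("pypi", "django", "4.2.1")
```

The two CPE strings differ even though they describe the same product and version, which is the naming problem described earlier. The purl needs no lookup at all, but only because a package registry such as PyPI already assigns the name; commercial software has no equivalent today.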

I’ll be honest: it will probably be 4-5 years before purl becomes widely used to identify commercial software, and it may never be as dominant there as it is now for open source software (where it’s very hard to say there’s even a “number two” identifier, despite the fact that less than ten years ago, purl was literally nowhere).

I have been thinking that, given the NVD’s current problems, there’s literally no fully automated way to identify vulnerabilities in commercial software; it’s quite sad to think that this situation could continue for the next 4-5 years. If VulnCheck can come close to taking the place of the NVD during that time (and maybe afterwards as well), the whole software world will be much better off.

[i] If you’re unfamiliar with CVE and CPE, reading this post may be helpful.

[ii] We are looking for both volunteers and donors to enable this project to get started. See this detailed proposal for the project.

Sunday, November 17, 2024

CIP in the cloud: Can we fix EACMS before BCS? Probably not.

The NERC Project 2023-09 Risk Management for Third-Party Cloud Services Standards Drafting Team (SDT) is currently finalizing a revised Standards Authorization Request (SAR) that will guide their effort to revise the NERC CIP Reliability Standards going forward. It will be submitted to the NERC Standards Committee soon, and it will very probably be approved by them in December. In January, the SDT will set about the business of revising the standards themselves. As I wrote recently, I expect this to be a long process, but it’s always better to take the first step, rather than never take it.

While the draft version of the SAR that was distributed to the “plus list” for the project last week lacks a lot of detail, it does emphasize twice an idea from the original SAR (approved by the Standards Committee last December) that I previously thought was exactly what the doctor ordered:

Determine a development plan to define whether revisions will be made to accommodate use of cloud for all CIP defined systems (such as EACMS, PACS, BCS, etc.) or if an incremental revisions approach will be taken to allow use of cloud for individual or groups of CIP-defined systems (such as first revising the standards to allow for EACMS use of cloud services).

And

Holistic or incremental - The DT will evaluate revision approaches and determine whether to develop requirements applicable to use of cloud for all CIP-defined systems (such as EACMS, PACS, BCS, etc.), or to develop incremental revisions to allow use of cloud for individual or groups of CIP-defined systems (for example, first revising the standards to allow for EACMS use of cloud services).

Notice that both excerpts contain the identical phrase, “first revising the standards to allow for EACMS use of cloud services”. Perhaps you perceive – as I do – that the SDT really, really wants to address what I call the “EACMS problem” first. If so, you’re correct! The SDT knows that, of the perhaps 5 or 6 separate problems that compose the “cloud CIP problem”, without much doubt the most serious is the problem of EACMS not being deployable in the cloud.

What is the problem? It’s simple. If a system meets the definition of EACMS, “Cyber Assets that perform electronic access control or electronic access monitoring of the Electronic Security Perimeter(s) or BES Cyber Systems. This includes Intermediate Devices”, it is subject to compliance with well over 100 NERC CIP Requirements and Requirement Parts.

The NERC entity must furnish evidence that they were in continuous compliance with each Requirement and Requirement Part during the entire audit period (usually three years). The same evidence is required, whether the system is deployed on-premises or in the cloud. The main difference between the two is that the cloud service provider (CSP) needs to gather evidence for cloud-based systems, since the NERC entity cannot do that on its own.

For the majority of CIP Requirements and Requirement Parts, the evidence will be relatively easy to gather, whether it is gathered from on-premises or cloud-based systems. For example, CIP-004-7 Requirement 3 Part 3.5 mandates that the NERC entity’s Personnel Risk Assessment program include a “Process to ensure that individuals with authorized electronic or authorized unescorted physical access have had a personnel risk assessment completed according to Parts 3.1 to 3.4 within the last seven years.”

For their on-premises systems, the NERC entity will show the documentation for their PRA program to the auditor. Since that program does not directly apply to the CSP, the CSP will need to provide some sort of “equivalent” evidence to the auditor, such as a certain section from their ISO 27001 certification audit, or perhaps their FedRAMP authorization audit.[i]

However, there are a small number of very prescriptive NERC CIP requirements for which it is simply impossible for the CSP to provide evidence. Consider CIP-010-4 R1.1. It requires the Responsible Entity to:

Develop a baseline configuration, individually or by group, which shall include the following items:

1.1.1. Operating system(s) (including version) or firmware where no independent operating system exists;

1.1.2. Any commercially available or open-source application software (including version) intentionally installed;

1.1.3. Any custom software installed;

1.1.4. Any logical network accessible ports; and

1.1.5. Any security patches applied.

This requires the entity to develop a baseline configuration for every device on which any part of the system in scope (a BES Cyber System, EACMS, Physical Access Control System (PACS), or Protected Cyber Asset (PCA)) resides. This applies to physical devices today, although it will apply to both physical and virtual devices in perhaps a couple of years. Since on-premises systems normally reside on a small number of devices and do not often switch from device to device, this is not a particularly onerous requirement.
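As a rough illustration of what one baseline record contains, the five items in R1.1 map naturally onto a small data structure. The class, the field names, and the sample values below are my own assumptions; the standard does not prescribe any particular format.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class BaselineConfiguration:
    """One baseline record holding the five items listed in CIP-010-4 R1.1."""
    operating_system: str                            # 1.1.1 OS or firmware, with version
    installed_software: frozenset = frozenset()      # 1.1.2 commercial or open-source
    custom_software: frozenset = frozenset()         # 1.1.3
    network_accessible_ports: frozenset = frozenset()  # 1.1.4
    security_patches: frozenset = frozenset()        # 1.1.5


def changed_items(baseline, observed):
    """Return the names of the baseline items that differ between two records."""
    return [
        f.name for f in fields(baseline)
        if getattr(baseline, f.name) != getattr(observed, f.name)
    ]


# Hypothetical device: the observed state has one extra open port.
baseline = BaselineConfiguration(
    operating_system="ExampleOS 10.2",
    network_accessible_ports=frozenset({"tcp/22", "tcp/443"}),
    security_patches=frozenset({"PATCH-001"}),
)
observed = BaselineConfiguration(
    operating_system="ExampleOS 10.2",
    network_accessible_ports=frozenset({"tcp/22", "tcp/443", "tcp/8080"}),
    security_patches=frozenset({"PATCH-001"}),
)
drift = changed_items(baseline, observed)
```

In this example the comparison reports a change in the network accessible ports only. Keeping one such record per device, or per group of identical devices, is manageable when the devices rarely change.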

However, what about a system that is deployed in the cloud? To quote Google Cloud, “To maintain availability and provide redundancy, cloud providers will often spread data to multiple virtual machines in data centers located across the world.” In other words, a system in the cloud will often be spread over multiple VMs, which might be located on multiple physical servers that are themselves located in multiple data centers worldwide (although a CSP will usually offer the option for a NERC entity to restrict their systems to being deployed on servers in the US or North America). But that isn’t all: These pieces of systems will regularly jump around from physical server to physical server, VM to VM, data center to data center, etc. They may do this multiple times in a day or even in an hour.

Given that, how will a CSP provide evidence to a NERC entity that all the component parts of their EACMS always resided on a physical or virtual device that had an up-to-date baseline configuration? After all, NERC CIP compliance requires the entity to be able to provide a copy of each baseline configuration, not simply attest that it exists.
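To see why this is so hard, suppose (purely hypothetically) that a CSP did hand over a log of every host a workload had run on. The log format and host names below are invented for illustration; I am not aware of any CSP that provides such a log. Even then, the entity would need a baseline on file for every host that appears in it.

```python
def hosts_missing_baselines(placement_log, baselines_on_file):
    """Return hosts that ran the workload but have no baseline record on file."""
    hosts_used = {host for _timestamp, host in placement_log}
    return sorted(hosts_used - set(baselines_on_file))


# Hypothetical log: (timestamp, host identifier) each time the workload moved.
placement_log = [
    ("2024-11-01T00:00", "host-a"),
    ("2024-11-01T06:30", "host-b"),
    ("2024-11-01T07:10", "host-c"),
    ("2024-11-01T19:45", "host-a"),
]
baselines_on_file = {"host-a"}
missing = hosts_missing_baselines(placement_log, baselines_on_file)
```

In this toy example, a single day of migrations already leaves two hosts without evidence. Stretch that over a three-year audit period and the size of the gap becomes clear.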

Of course, no CSP could ever provide that sort of evidence, without locking all of the servers holding the EACMS in a single enclosed space with an ESP and PSP, protected by physical access control devices controlled by the NERC entity. Could a CSP ever do that without breaking the cloud business model?

The answer is that no CSP could ever provide this evidence, meaning the NERC entity would possibly be found in violation of CIP-010-4 R1 during every day of the (usually three-year) audit period. That would also apply to CIP-007-6 R2, CIP-005-7 R1, and at least a few more CIP requirements that apply on an individual device basis.

This is why deployment in the cloud is not “permitted” for medium and high impact BCS, EACMS, PACS and PCA. The problem has nothing to do with concerns about the security of the cloud or any explicit prohibition of cloud use (the current CIP standards say nothing about the cloud). Rather, it has everything to do with the fact that the CSP could never provide the required compliance evidence to a NERC entity.

Given that this problem applies to BCS, EACMS, PACS and PCA, why is the SDT singling out EACMS as the most urgent problem, which might require the SDT to develop one new version of the CIP standards to fix the EACMS problem, then another to fix the BCS, PACS and PCA problems? They’re doing this because there isn’t much doubt that the inability to deploy EACMS in the cloud today is the most serious of the five or six problems that I call “CIP cloud problems”.

The EACMS problem is the most serious because it affects security services and software. Think of SIEM, dual factor authentication, etc. These are all EACMS because they “perform electronic access control or electronic access monitoring of the Electronic Security Perimeter(s) or BES Cyber Systems.” When one of these services moves from an on-premises version to the cloud, its NERC entity customers won’t be able to follow it. Even more importantly, there are many cloud-based security service providers (e.g., SecureWorks) that have been off limits to NERC entities with high and medium impact CIP environments all along. Wouldn’t it be nice if NERC entities could take advantage of every available cloud service, rather than have to content themselves with the dwindling number of on-premises security services?

Of course, it would. However, even though the EACMS problem is more serious than the problems with BCS, PACS and PCA, will it do any good for the SDT to go through the large extra time and effort required to develop one set of standards that fixes EACMS alone – especially since this will delay the delivery of the final version by at least 1-2 more years? If fixing EACMS were going to be very different from fixing the other asset types, this might be worthwhile.

But it won’t be. Whether the goal is just to make EACMS deployable in the cloud or to make all four asset types deployable, the same steps will need to be taken. That is, the SDT will need to rethink the CIP standards and develop two different sets of standards: one for on-premises systems - almost exactly what is in place today - and one for cloud-based systems. I described one way to do that in this post more than a year ago (there are some things I would like to change in that post, but in principle there’s nothing wrong with it).

Of course, given the importance of fixing the EACMS problem as soon as possible, it would be comforting to think there is a way it could be fixed separately from how the other CIP/cloud problems are fixed; but that simply is not going to happen. All the problems will be fixed at once or none of them will. Thus, if the NERC community wants to completely fix any of the CIP/cloud problems, we will have to completely fix all of them. And that optimistically looks like 2031.

Are there interim or partial solutions available? Yes, there are. Will the NERC community take advantage of them? Dunno.

[i] Neither the ISO certification nor the FedRAMP authorization would be accepted as evidence in a NERC CIP audit today. However, at some point in the not-too-distant future, a CMEP Practice Guide or some other document might indicate that this is acceptable evidence.

Sunday, November 10, 2024

How will NERC entities assess their CSPs?

I participated in the third of the four panels in NERC’s successful Cloud Technical Conference on November 1. Two of the three questions that all panel members were asked were:

· How does the shared responsibility model in cloud computing reshape the way utilities manage accountability for security and compliance, and what best practices can help clearly define these responsibilities between utilities and cloud service providers?

· How can utilities effectively manage and verify that cloud providers are fulfilling their security responsibilities, and what role do audits and third-party assessments play in this process?

Of course, these are both variations of the question, “How will NERC entities assess their CSPs, once they are able to fully utilize their services?” While I am satisfied with the separate answers I provided to these two questions during the conference, I now realize it is much better to answer the unified question.

First, I want to point out that in this post I’m treating the term “cloud service provider” (CSP) to mean two types of organizations: “Platform” CSPs like AWS, Azure and Google Cloud, and SaaS (“software-as-a-service”) providers, meaning software providers that offer subscriptions for access to their software in the cloud. Usually, I distinguish between the two, as I did in this post.

In the new or revised CIP standards that the NERC Project 2023-09 Risk Management for Third-Party Cloud Services Standards Drafting Team will start drafting in 2025, I think the CSPs should be assessed in two ways:

1. While the CSPs are not subject to the jurisdiction of either NERC or FERC directly, there needs to be an annual “audit” of the CSPs. It should be conducted by the NERC ERO; the CSPs will never agree to be audited by every NERC entity that is a customer. Kevin Perry, former Chief CIP Auditor for SPP Regional Entity, suggested the Regional auditors could conduct a joint audit (they perform these all the time).

a. The audit will have two parts. First, there should be an assessment of the audit report from either the CSP’s ISO 27001 certification or their FedRAMP authorization. This assessment does not need to cover the entire report, but only certain topics that the current “cloud” Standards Drafting Team (i.e., the team that is meeting now) has decided should be a focus of the assessment. These might include topics such as background checks for personnel, incident response plan, internal network security monitoring (INSM), etc. The NERC assessors will look for adverse findings in any of these areas and note them.

b. For the second part of the audit, the current SDT should identify cloud risks that are not addressed by the CSP’s authorizations or certifications. The NERC assessors will need to interview CSP personnel regarding the degree to which the CSP has mitigated each of these risks. They might include:

i. Multitenant databases in SaaS products. This isn’t itself a risk, since a SaaS provider can never provide each customer with their own instance of the product without completely breaking their business model. On the other hand, NERC entities shouldn’t be sharing a database with organizations from Russia and Iran. The SDT will need to debate this issue and come up with reasonable measures that mitigate risk without putting the SaaS provider out of business.[i]

ii. Whether the CSP is properly training their customers in how to manage the security controls for their own cloud environment.

iii. How well the platform CSP vets third parties that broker access to services in their cloud.

c. The ERO auditors will prepare a report on their assessment of each platform CSP and SaaS provider and make these available on request by NERC entities that are customers of those services, as well as to the CSP itself.

d. NERC will not “certify” the CSPs. The auditors’ job is only to assess particular risks to which the CSP is subject, whether these risks are addressed in a certification or whether they are subject to the separate risk review described in item b above.

I want to point out that there is currently no provision in the NERC Rules of Procedure for NERC to conduct assessments of third parties that are not subject to NERC’s jurisdiction – which is the case with CSPs, of course. If what I have just described is to come to pass, there will probably need to be RoP changes; however, no Standards Drafting Team is currently empowered to make those.

This is one of the many unknowns that will impact the likely implementation date for the revised CIP Reliability Standards. In a recent post, I stated that I think the most likely date is Q2 of 2031; I also pointed out that if a change to the Rules of Procedure is required, even that date might be too optimistic. Guess what? I now believe an RoP change (or at least some sort of change to NERC rules, which the SDT has no authority to change on their own) is required. Ergo, Q2 2031 is an optimistic estimate; it would be safer to use a later one, although I have no idea what that would be.

This gets me back to the conclusion of the post I just linked: Asking NERC entities to wait until new or revised CIP standards are in place to make full (and secure) use of the cloud isn’t workable. There are partial measures that can be taken on an interim basis to enable at least some cloud use by NERC entities with high or medium impact BES environments. I believe it’s time to make some decisions on what needs to be done in, say, the next two years, and how to do it.

[i] I can see this debate alone taking six months; I’m sure there are a few other topics that could be equally contentious. That is why I am now anticipating that new and/or revised CIP standards that address cloud issues won’t be in place until 2031.

Tuesday, November 5, 2024

Reply to Boris Polishuk, Cybellum – November 5, 2024

From Tom Alrich: Boris is Chief Architect of Cybellum, the Israeli company that is one of the leading service providers in the SBOM world; he is a member of the OWASP SBOM Forum. He wrote me in response to the white paper that Tony Turner and I, who are co-leaders (with Jeff Williams) of the SBOM Forum, recently developed, titled “purl needs to identify proprietary software”. The paper describes the project we are proposing to expand the scope of the “purl” software identifier to address proprietary software as well as open source software. We hope to be able to start that project soon.

To summarize why this project is important, the software security “industry” is plagued by the problem that software products all have different names in different contexts. In order to learn about vulnerabilities (usually designated with a “CVE number” like CVE-2021-44228) that affect a software product they rely on, an end user organization needs to locate the product in a vulnerability database. To do that, they need to learn (or be able to construct from information they already have) the identifier for the product in the database.

The most widely used vulnerability database in the world, the US National Vulnerability Database (NVD), utilizes exclusively the “CPE” identifier, which stands for “Common Platform Enumeration”. CPE has been in use, in multiple versions, for around twenty years. Unfortunately it is an unreliable identifier, as the SBOM Forum described on pages 4-6 of this 2022 white paper. Even more unfortunately, serious problems in the NVD since February 2024 have resulted in over two thirds of the vulnerabilities reported this year being totally invisible to a search using a CPE name. Clearly, an alternative software identifier is needed.
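To make the difference between the two identifiers concrete, here is a minimal sketch that splits a CPE 2.3 name and a purl into their parts. The two strings are illustrative, well-formed examples for Apache Log4j, not entries copied from any database, and both parsers are deliberately simplified (function names are hypothetical; the CPE parser ignores escaped colons, and the purl parser does not decode percent-encoding).

```python
def parse_cpe23(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into its leading fields.

    Simplification (assumption): fields contain no escaped colons.
    """
    parts = cpe.split(":")
    # Layout: cpe:2.3:part:vendor:product:version:update:edition:
    #         language:sw_edition:target_sw:target_hw:other
    if len(parts) != 13 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return {
        "part": parts[2],
        "vendor": parts[3],
        "product": parts[4],
        "version": parts[5],
    }


def parse_purl(purl: str) -> dict:
    """Split a purl of the form pkg:type/namespace/name@version."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a purl: {purl!r}")
    rest = purl[len("pkg:"):]
    rest, _, _subpath = rest.partition("#")      # drop optional subpath
    rest, _, _qualifiers = rest.partition("?")   # drop optional qualifiers
    if "@" in rest:
        path, _, version = rest.rpartition("@")
    else:
        path, version = rest, ""
    pkg_type, _, remainder = path.partition("/")
    namespace, _, name = remainder.rpartition("/")
    return {
        "type": pkg_type,
        "namespace": namespace,
        "name": name,
        "version": version,
    }


cpe_fields = parse_cpe23("cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*")
purl_fields = parse_purl("pkg:maven/org.apache.logging.log4j/log4j-core@2.14.1")
print(cpe_fields)
print(purl_fields)
```

The purl is built from the package type, namespace, name and version that a user of the package already has on hand, whereas the vendor and product strings in a CPE name have to be looked up, since they are assigned when the name is created.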

Below, I have posted each paragraph from Boris’ letter in bold roman type, with my response in italics.

Hi Tom,
Thanks for sharing your thoughts on using SWID to generate PURLs for proprietary software. We've considered this approach, but we have some reservations about its feasibility at this time. Here's our reasoning:

· Decentralized Ecosystem: Proprietary software exists in a decentralized ecosystem with no central authority to enforce naming standards or manage a unified repository. This increases the risk of duplicate or conflicting PURLs, even when generated through SWID.

I agree that proprietary software exists in a decentralized ecosystem. However, I think you’ll agree that the same can be said for open source software. Despite that fact, not only is purl by far the leading software identifier used in open source vulnerability databases today, it is almost the only one. I am sure there have been duplicate purls for OSS, mainly due to somebody making a mistake. But I don’t know of any case where a purl was correctly generated, yet still exactly duplicated a purl for a different product. I also don’t know of any case where two different purls were created for the same product (this happens often with CPEs). Do you know of any such cases?

· Limited SWID Adoption: While SWID offers a potential solution, its adoption has been limited. In our experience, many organizations are unfamiliar with SWID or its implementation. Relying on SWID for proprietary software identification might face similar obstacles as CPE, hindering widespread adoption and effectiveness.

I agree that SWID has had limited adoption. You probably know that NIST developed the SWID specification (and got it approved as ISO/IEC 19770-2) to be a replacement for the CPE identifier in the NVD. For a few years, it enjoyed modest success; for example, Microsoft included a SWID tag in the binaries for all their new products or versions for at least a couple of years. However, SWID never reached the point where NIST felt comfortable dropping CPE altogether and only using SWID in the NVD. They have more recently acknowledged they will never make that change.

However, you need to understand the reason why we’re proposing that SWID tags be used to create purls for proprietary software that isn’t distributed through app stores: The supplier needs to decide the name and version string for every product they release. This information needs to be made publicly available in a single document.

We could have defined our own format for the document, but since SWID includes the fields needed for the purl (as well as many more that aren’t needed and don’t need to be filled in) and since it is already an ISO standard, we decided to use SWID. In fact, Steve Springett has created a SWID tag generator, which a software supplier (or a third party, if the supplier has not done this already) can use to create a SWID tag for one of their current or legacy products (note that the majority of the fields in Steve’s tool are optional in the purl). The suppliers won’t need to know anything about SWID to create a SWID tag.
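As a rough illustration of the idea, the sketch below assembles a purl from a handful of SWID tag fields. It assumes the layout of the existing "swid" purl type as I understand it (pkg:swid/namespace/name@version?tag_id=...); the function name, the parameter names and the percent-encoding choices are hypothetical, and the project may well settle on different rules.

```python
from urllib.parse import quote


def swid_to_purl(name: str, version: str, tag_id: str, namespace: str = "") -> str:
    """Build a purl string from a few SWID tag fields.

    Assumption: name, version and tag_id are required; namespace
    (for example the tag creator) is optional and may contain "/".
    """
    if not (name and version and tag_id):
        raise ValueError("name, version and tag_id are all required")
    # Percent-encode each path segment separately so "/" stays a separator.
    segments = [quote(seg, safe="") for seg in namespace.split("/") if seg]
    segments.append(quote(name, safe=""))
    path = "/".join(segments)
    encoded_version = quote(version, safe="")
    encoded_tag_id = quote(tag_id, safe="")
    return f"pkg:swid/{path}@{encoded_version}?tag_id={encoded_tag_id}"


example = swid_to_purl(
    name="Enterprise Server",          # made-up product name
    version="1.0.0",
    tag_id="75b8c285-fa7b-485b-b199-4745e3004d0d",  # made-up tag identifier
    namespace="Acme/example.com",      # made-up supplier
)
print(example)
```

The point of the sketch is that everything in the purl comes from values the supplier has already published in the tag, so a user holding the same tag can reconstruct the same purl without consulting a central registry.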

· PURL and CVE.org: The lack of CVE.org's full embrace of PURL as a primary identifier raises concerns about its long-term viability as the sole solution for vulnerability management.

The CVE 5.1 specification (which allows purls in a CVE report but doesn’t say anything about how they will be created or entered in the report) was only adopted by CVE.org this past spring (the SBOM Forum submitted the pull request to add purl to the CVE spec in 2022); very few CNAs are using v5.1 yet. This is mostly because including a purl in a CVE report won’t do any good now, since the NVD is at least several years away from being able to handle purl. However, Sonatype’s OSS Index vulnerability database supports both CVE and purl today, so it might be able to utilize purls included in CVE v5.1 records with little or no change required. We would like to test this in our proof of concept.

Given these challenges, we believe that a more comprehensive approach is needed to address the complexities of identifying proprietary software. This might involve:

· Collaboration with CVE.org: Working closely with CVE.org to establish clear guidelines and standards for both PURL and CPE, ensuring they complement each other and address the limitations of each system.

I agree this is important. In fact, it is already one of the “deliverables” of our project (which are described on pages 6-9 of the preliminary project plan that I made available last week). We plan to work with CVE.org and the CNAs (which report to CVE.org) to develop whatever rules are required for them to correctly create a purl for a product described in a CVE report, and to include the purl in the report.

Regarding CPE, the reason we’re doing this project is that, while purl is clearly a much better identifier than CPE for open source software and components – and should be adopted by the NVD for OSS as soon as possible – it doesn’t identify proprietary software products, unless they are delivered through a package manager like PyPI or Maven Central (which rarely happens). Currently, CPE is the only identifier for proprietary software, but it’s subject to the problems listed on pages 4-6 of the OWASP SBOM Forum’s 2022 white paper linked earlier.

When our project is finished and after our recommendations are implemented, we believe that purl will prove to be superior to CPE as an identifier for proprietary software, as it is now for open source software (see our 2022 white paper for why purl is in general superior to CPE). However, the 240,000-250,000 CVE records that now include CPE can’t simply be thrown away. That means the NVD, and the other vulnerability databases based on the NVD, will need to support both identifiers, perhaps for a very long time.

However, given that the NVD currently has a backlog of more than 19,000 CVE records that don’t include a CPE or purl – and that backlog is growing all the time – integrating purl into the NVD isn’t likely to happen anytime soon. Fortunately, there are multiple open source vulnerability databases that support purl, although I don’t think there are any that also support CPE.

· Hybrid Model: Exploring a hybrid approach that leverages PURL's strengths for open-source components while incorporating alternative mechanisms, potentially drawing inspiration from SWID or other identification schemes, for proprietary software.

Essentially, this is what we’re proposing. Purl is currently used in almost all vulnerability databases for open source software (except the NVD and other databases based on it), but it doesn’t address proprietary software currently. SWID by itself was intended originally to be an identifier for both OSS and proprietary software products, but I don’t know any vulnerability database that supports SWID as a software identifier now, nor do I know of any other identifiers for proprietary software, other than the very flawed CPE.

We’re not saying that anybody who is happy with using CPE to identify software products needs to give it up. But we are saying that, once our recommendations are implemented, we believe purl will prove to be a superior identifier for both open source and proprietary software.

· Industry-Wide Standards: Promoting the development of industry-wide standards for software identification to ensure interoperability and consistency across different ecosystems.

That’s what we want to do! Once purl and CPE are both able to handle open source and proprietary software, the software community will “decide” (through their choices in the marketplace) whether they want to use one or the other identifier, or both. As mentioned earlier, both will continue to be in use for many years.

We appreciate your insights and the work you're doing in the OWASP SBOM Forum. While we don't see SWID -> PURL as a viable solution for proprietary software at this point, we're open to further discussions and exploring alternative approaches.

SWID -> PURL is certainly not a viable solution for proprietary software at this point, but we want to make it one as soon as possible! However, I want to point out that the idea of using SWID tags to generate purls for proprietary software isn’t set in stone; if anyone has a better idea for expanding purl to cover proprietary software not distributed through app stores (purls for software in app stores can be easily created without using SWID tags, as described in our new white paper), please bring it up to us. We hope to start the project soon, at which point the participants will be able to determine its direction in both synchronous and asynchronous online discussions.

We welcome anyone who wants to participate and/or to contribute to the effort through directed donations to the SBOM Forum through OWASP (a 501(c)(3) nonprofit organization).

Sunday, November 3, 2024

When will the “NERC CIP in the cloud” problem be solved for good? You won’t like the answer…

NERC’s 6-hour virtual Cloud Technical Conference on November 1 was quite successful. The conference included three panels of industry types (including me) discussing questions mostly posed to them in advance, followed by a discussion by members of the team that will draft changes to the CIP standards to address what I call the “CIP in the cloud” problem.

The discussions were very productive and produced some great insights. I took a lot of notes and will produce multiple posts on those insights in the coming month or so. However, I’m going to start off with a question that wasn’t discussed in the conference, but was very much hanging over it: When will new or revised CIP standards be in place, so that full, but secure, use of the cloud by NERC entities with BES assets will finally be possible?

This isn’t an academic question at all. As multiple panelists pointed out, previously on-premises software products and security services are moving to the cloud all the time. In some cases, they retain an on-premises option, with the caveat that the most innovative changes will only go into the cloud. In other cases, the vendor is making a clean break with on-premises systems, leaving NERC entities with high- or medium-impact BES environments with no choice but to find a totally on-premises replacement. And as Peter Brown of Invenergy pointed out (in the conference as well as an earlier webinar sponsored by the informal NERC Cloud Technology Advisory Group and SANS, which I wrote about in this post), those replacements are inevitably more expensive and offer less functionality.

In January, I wrote a post that examined this question. I concluded by saying:

So, if we get lucky and there are no major glitches along this path, you can expect to be “allowed” to deploy medium and high impact BCS, EACMS and PACS in the cloud by the end of 2029. Mark your calendar!

Of course, the end of 2029 sounds like a long time to have to wait, especially with security services and software already abandoning their on-premises options. Do I still think the industry will have to wait five years for the cloud to be completely “legal”? I have good news, bad news, and then some more good news for you:

· The first good news is I no longer think the end of 2029 is the likely date by which cloud-based systems and services for systems to which the CIP standards apply will be fully “legalized”.

· The bad news is I think it will probably be later than 2029.

· However, the second good news is that, given how this problem is affecting more and more NERC entities all the time, it’s unlikely there won’t be at least a partial solution to this problem before 2029 – although don’t ask me what form that solution will take. This is very much uncharted ground.

Here's a short summary of my timeline and the reason for my changes:

1. I had thought the new Standards Drafting Team (SDT) would start their drafting work when they convened in July. However, it turns out they are now working on revising their Standards Authorization Request (SAR). They will finish that by the end of this year and will submit it to the NERC Standards Committee for approval. That approval is likely to be quickly granted, so the team will probably start drafting in January 2025, not last July as I had anticipated.

2. There are some huge issues that will need to be discussed when the SDT starts drafting. I attended a lot of the meetings of the CSO706 SDT that drafted CIP version 5. V5 completely rewrote the CIP standards and definitions that had been put in place with CIP version 1. Even though there were a lot of fundamental questions discussed in those meetings, I also know the SDT had a good idea of what they needed to do when they started drafting v5 in early 2011. Even then, developing the first draft took a year and a half (see the January post linked above). The “cloud” SDT might take that long or even longer to develop their first draft.

3. Once the SDT has their first draft, they will submit it to the NERC Ballot Body for approval. It’s 100% certain it won’t be fully approved on the first ballot. With each ballot, NERC entities can submit comments – which, of course, mainly discuss why the commenter didn’t vote for the standard in question (each new or revised standard will be voted on separately). The drafting team needs to respond to every comment, although in practice they group similar comments and respond to them at one time. For just one of the CIP v5 ballots, 2,000 pages of comments were submitted.

4. It’s close to certain that the new or revised standards will go through at least four ballots before they’re approved, with three comment periods in between them. The balloting process alone took the CIP v5 SDT a year, and I assume the new SDT’s experience will be roughly the same. Adding that to the estimate of 18 months to draft the first version of the new standards, we’re at 2 ½ years, starting in January.

5. When the new or revised standards have been approved by the ballot body, they will go to the NERC Board of Trustees for approval at their next quarterly meeting; it’s close to certain the BoT will approve them in one meeting. So, BoT approval requires 3 months, bringing us to two years and nine months for the process so far.

6. At that point, the standards go to FERC for approval. Even though individual FERC staff members have been quite supportive of the need for changes to accommodate cloud use (and two staff members spoke in the technical conference), the staff might very well not be in line with some of the actual changes that are proposed. Of course, the five FERC Commissioners are the ones who must approve those changes; they always take a lot of time to come to general agreement. I’ll stick with my earlier estimate of one year for FERC to approve the new or revised standards, but it could well be longer than that. We’re now at three years and nine months from next January, which is the third quarter of 2029.

7. However, FERC approval doesn’t mean that NERC entities can rush off and start using the cloud. There will without doubt be an implementation period of more than one year; I’ll say it will be 18 months[i], but even that may be a low estimate. This puts us at the first or second quarter of 2031, before the new or revised CIP standards are enforced.[ii]

Thus, instead of saying that the cloud will be completely “legal” for NERC entities by the end of 2029, I’m now saying this will happen by the second quarter of 2031, which is 6 ½ years from now. But that isn’t all: In my January 2024 post, I pointed out that I thought it was possible that the changes required for the cloud will also require changes to the NERC Rules of Procedure; I now believe it’s likely this step will be needed.

The SDT has no power to make RoP changes, and my guess is there might need to be a separate drafting team for those changes. Of course, this alone could add a couple more years to the whole process. Since I don’t know what’s involved, I won’t change my estimate of Q2 2031 as the date when systems subject to NERC CIP compliance can be freely used in the cloud, subject to the controls in the CIP standards. But there’s now a big asterisk beside that date.

If you’re like some NERC entities, as well as some members of the NERC ERO, you’ll probably look at my Q2 2031 date and say something like, “This is unacceptable! The NERC community can’t wait this long.” You would be right; this is unacceptable. This is why I’m sure that some measures will be taken long before that date to allow at least some cloud use cases for BES Cyber Systems, EACMS and PACS. However, I have no clear idea of what those measures will be, beyond my own wishful thinking.

[i] The CIP v5 standards were approved by FERC in November 2013, but were enforced on July 1, 2016. That was 2 ½ years after approval.

[ii] Since many NERC entities are eager to start using the cloud for OT systems, there will probably be accommodations for entities that wish to follow the new standards before the implementation period is finished. However, only a small number of NERC entities will be allowed to take advantage of those accommodations, and they will be closely monitored. This was done when CIP v5 had been approved by FERC in 2013. At that time, NERC established the Version 5 Technical Advisory Group (V5TAG), a small group of NERC entities that implemented the v5 standards before the enforcement date. They were closely monitored by NERC and documented their experiences.