Sunday, August 17, 2025

What cloud risks does CIP need to address? (full version)


Note from Tom: I am now putting up all of my full posts in my Substack blog, but only select ones in Blogspot. Even though I put this up as a paywalled post on Blogspot two days ago, I'm now making the full post available because of its importance for the NERC CIP community. Enjoy this post, but please also subscribe to my Substack blog on a paid ($30 per year) basis. You will be able to read all of my new posts, as well as all 1200+ “legacy” posts that I originally put up in Blogspot - but which are now also available to paid Substack subscribers.

 

Two days ago, I participated in two lengthy web conversations regarding NERC CIP and the cloud. The first was the bi-weekly meeting of the informal Cloud Technical Advisory Group (CTAG), a group led by Lew Folkerth of RF and Chris Holmquest of SERC. The group discusses how NERC entities can make use of the cloud for systems that are subject to compliance with the CIP standards.

A lot of this group’s discussions are about why there is so little cloud use by NERC entities today. There are two main reasons:

1.      Most use of the cloud by high or medium impact BES Cyber Systems (BCS) is impossible today, because the cloud service provider (CSP) will not be able to provide the NERC entity with the required compliance evidence for the over 100 CIP Requirements and Requirement Parts that apply to high and medium impact BCS.

2.      At least two uses of the cloud for BCS are possible today, including low impact Control Centers and medium or high impact BES Cyber System Information (BCSI) used in SaaS applications. However, I know of only a few instances of these uses today. I believe this is because neither NERC nor the NERC ERO has provided guidance on how to do this, while maintaining CIP compliance.

The second meeting was a normal meeting of the NERC Risk Management for Third Party Cloud Services Standards Drafting Team (SDT). Of course, this is very much an official NERC group; they are charged with drafting new and/or revised CIP standards that should, when put in place, allow much more extensive use of the cloud by systems subject to CIP compliance than is possible today. This team has been meeting for more than a year and, a few months ago, started putting virtual pen to virtual paper to draft some real requirements.

How long will it take before what I call the “Cloud CIP” standards (i.e., the standards the SDT is drafting) come into effect? The new standards will be in effect once they have a) been drafted (which will take at least 1-2 more years), b) gone through at least four ballots by NERC members (including a two-month comment period after each ballot, plus another month for the SDT to respond to the comments and make changes to the draft standards), c) been approved by the NERC Board of Trustees, d) been submitted to FERC for their approval, e) been approved by FERC, most likely more than a year after they are submitted, and f) gone through a 2-3 year implementation period.

With some luck, all the above steps will be completed by…drumroll, please…2031. However, I think even this estimate may be over-optimistic. I say this because I haven’t allowed for changes to the NERC Rules of Procedure (RoP) – and I think they are likely to be needed. Since I don’t believe any new or revised CIP standard has ever required an RoP change, nobody I know can even tell me how this can be accomplished, except that it’s certain the SDT does not have the power to do it themselves. Therefore, I can’t quantify the time required for this step. The best case is that it might be possible to make the RoP changes during the multiyear implementation period for the “Cloud CIP” standards. If that happens, my estimated implementation year would remain 2031.
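For anyone who wants to check the arithmetic behind that estimate, here is a rough sketch in Python. It simply adds up the step durations listed above, assuming the steps run strictly one after another starting from the date of this post. The step labels and the `add_months` helper are just for illustration. Steps for which no duration is given above (Board approval and submission to FERC) are counted as zero, FERC approval is counted at its one-year minimum, and no time is allowed for RoP changes, so both ends of the range are on the optimistic side.

```python
# Rough timeline sketch. Assumption: the steps run one after another,
# starting from the date of this post (August 2025).
START_YEAR, START_MONTH = 2025, 8

# (step, minimum months, maximum months)
steps = [
    ("drafting", 12, 24),
    # four ballots, each with a 2-month comment period plus 1 month of SDT response
    ("four ballots", 4 * (2 + 1), 4 * (2 + 1)),
    # no duration is given for these steps, so they are counted as zero (assumption)
    ("Board approval and FERC submission", 0, 0),
    # "more than a year", counted here at its 12-month minimum (assumption)
    ("FERC approval", 12, 12),
    ("implementation period", 24, 36),
]


def add_months(year, month, months):
    """Return the (year, month) that falls `months` months after the given one."""
    total = year * 12 + (month - 1) + months
    return total // 12, total % 12 + 1


low = sum(minimum for _, minimum, _ in steps)
high = sum(maximum for _, _, maximum in steps)

earliest = add_months(START_YEAR, START_MONTH, low)
latest = add_months(START_YEAR, START_MONTH, high)

print(f"Total: {low} to {high} months")
print(f"Earliest: {earliest[0]}-{earliest[1]:02d}")
print(f"Latest:   {latest[0]}-{latest[1]:02d}")
```

Under these assumptions the total comes to between 60 and 84 months, i.e. somewhere between August 2030 and August 2032, which brackets the 2031 estimate.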

However, one thing I didn’t allow for in my 2031 estimate is that the SDT might spend a year or two in what turns out to be totally unproductive activity – for example, if the team spends 1-2 years polishing off a set of standards that are subsequently rejected by NERC or FERC for being unenforceable. Were this to happen, the SDT would need to start over from the beginning. This might sound like a joke, but it isn’t – it could very well happen. Of course, this would be a big disappointment, since it would push the implementation date for the new standards to 2033 or later.

This is why I’ve decided to create an online Cloud CIP Working Group whose goal will be to help move up the implementation date for the Cloud CIP standards. This group will be open to all members of the NERC community who are concerned about both the arrival time and the quality of the Cloud CIP standards; this includes NERC entities, NERC ERO staff members, software vendors, CSPs, consultants, etc.

Fortunately, there is a way to accelerate development of the new or revised standards. It is to start performing tasks that the SDT will otherwise need to perform themselves, if they are to produce a quality product. In other words, I propose to have this new group “break a path” for the SDT by working on steps that the SDT will inevitably have to take in the future, if they’re going to produce a worthwhile set of standards.

I want this group to start at the very beginning. The first question that needs to be answered is, “What is the problem to be addressed in the new standards?” Cybersecurity is about risk mitigation. A cybersecurity standard needs to mitigate a certain set of risks. The risks addressed by CIP in general are those that apply to systems used to maintain the reliability of the Bulk Electric System.

The problem that the SDT is addressing is the fact that the current CIP standards, except for three requirements having to do with BCSI that came into effect on January 1, 2024, were written on the assumption that all systems in scope will be on premises. At first glance, the solution to that problem appears to be to rewrite the existing requirements so they take account of the cloud (in fact, I believe this is the approach the SDT is currently taking).

However, that approach doesn’t take account of the fact that, while the wording of the current standards needs to be changed, the bigger issue is that the entity will need to provide, for each requirement, evidence that the CSP has complied with the requirement. The CSPs are more than willing to make available their audit reports for ISO 27001, SOC 2 Type 2, and FedRAMP, but they are not willing or able to provide evidence for any requirement that relates to specific devices or network configurations. This is because cloud data is always spread across devices and data centers. To comply with a requirement like CIP-007 R2 Patch Management, the CSP would need to identify every device in any data center that might hold even one part of a BCS during a 3-year audit period, and then apply all of the parts of CIP-007 R2 to every one of those devices.

Of course, this would literally be impossible, but even if it were possible, the CSPs would, rightly, never agree to do it, because the cost of doing so would be astronomical. Moreover, they certainly won’t be able to pass that cost on to their NERC entity customers, who expect to be on the same rate schedule as any other customer.

The cloud business model is based on providing the same services to huge numbers of customers. If 100 NERC entities start requiring compliance information for over 100 NERC CIP Requirements and Requirement Parts (which is the number of requirements in scope for each BES Cyber System or Electronic Access Control or Monitoring System – EACMS – deployed in the cloud), that will be 10,000 pieces of information per system in scope. If each of those entities has 100 BES Cyber Systems deployed in the cloud, that will be one million pieces of compliance information required for just those 100 NERC entities.
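The multiplication in the paragraph above can be checked in a few lines of Python. This is only a back-of-the-envelope sketch: the variable names are illustrative, and all three inputs are the round numbers used above, not exact counts.

```python
# Back-of-the-envelope check of the evidence volume described above.
# All three inputs are the round numbers from the text, not exact counts.
entities = 100                  # NERC entities using the cloud
requirements_per_system = 100   # CIP Requirements and Requirement Parts per BCS or EACMS
systems_per_entity = 100        # cloud-deployed BES Cyber Systems per entity

# One cloud system at each of the 100 entities
evidence_one_system_each = entities * requirements_per_system

# 100 cloud systems at each of the 100 entities
evidence_total = evidence_one_system_each * systems_per_entity

print(f"{evidence_one_system_each:,}")  # 10,000
print(f"{evidence_total:,}")            # 1,000,000
```

Since the real count is "over 100" Requirements and Requirement Parts, both figures are lower bounds.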

This is one reason why any CIP requirement that refers to particular systems or network configurations will need to be changed, if it is to apply to cloud-based systems. Moreover, even requirements that don’t apply to particular systems at all, such as the CIP-008 and CIP-009 requirements, have a similar problem. This is because they require specific deliverables from the CSP – e.g., an incident response plan and a backup and recovery plan, as well as testing of those plans. The CSPs are no more inclined to prepare those plans for individual NERC entities than they are to provide information on individual devices.

However, even if the CSPs were able to provide compliance evidence for the existing CIP requirements, I don’t understand why the drafting team would even waste their time making the existing CIP requirements “cloud-friendly”. This is because almost all of those requirements resemble provisions that are already covered by ISO 27001 and SOC 2 Type 2 certifications, as well as FedRAMP authorizations.

Of course, changing the existing CIP requirements won’t work, since those requirements still need to apply to on-premises systems. What’s needed is a separate set of Cloud CIP requirements that just apply to BES Cyber Systems deployed in the cloud (I’ll call them “Cloud BCS”). Each of these would be intended to address a similar risk to the “equivalent” on-premises CIP requirement, but it would be based on wording found in a roughly equivalent requirement in ISO 27001, SOC 2 or FedRAMP. If that is done, the only compliance evidence required would be evidence that the CSP’s audit for the standard in question didn’t produce any findings for the requirement in question.

For example, Requirement CIP-004-7 Part 5.1 reads, “A process to initiate removal of an individual’s ability for unescorted physical access and Interactive Remote Access upon a termination action, and complete the removals within 24 hours of the termination action (Removal of the ability for access may be different than deletion, disabling, revocation, or removal of all access rights).” FedRAMP AC-2.h reads:

The organization notifies account managers when accounts are no longer required; when users are terminated or transferred; and when individual information system usage or need-to-know changes.

Of course, this isn’t exactly the same as Part 5.1 – it is both more comprehensive and less so (there are other FedRAMP requirements that also correspond to Part 5.1). However, let’s say the drafting team decides that the wording is close enough that this can be considered a rough equivalent of Requirement CIP-004-7 Part 5.1. The language of the FedRAMP requirement would be the basis for a CIP requirement applicable to Cloud-based BCS. The “Measures” section of the requirement would require evidence of the absence of audit findings for FedRAMP Requirement AC-2.h.

By using this approach, the SDT will avoid the fate of developing a bunch of CIP requirements that can’t be audited, since the CSP will never provide evidence of compliance with a requirement that they consider to be already covered in ISO 27001, SOC 2 Type 2, or FedRAMP. Compliance evidence using the approach I’m suggesting will consist solely of pointing to a particular section in an audit report.
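To illustrate what “pointing to a particular section in an audit report” could look like in practice, here is a minimal sketch in Python. Everything in it is hypothetical: the `CloudCipRequirement` record, the `CLOUD-EXAMPLE-1` identifier, the `evidence_satisfied` function and the two sample audit results are made up for illustration, and nothing here reflects an actual NERC or FedRAMP data format. The only things taken from the discussion above are the pairing of CIP-004-7 Part 5.1 with FedRAMP AC-2.h and the idea that the evidence is the absence of an audit finding.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CloudCipRequirement:
    """Hypothetical Cloud CIP requirement tied to a control in another regime."""
    cloud_cip_id: str        # placeholder identifier, not a real NERC designation
    on_prem_equivalent: str  # the existing CIP requirement it roughly corresponds to
    regime: str              # e.g. "FedRAMP", "ISO 27001", "SOC 2 Type 2"
    control_id: str          # the control the wording is based on


def evidence_satisfied(req: CloudCipRequirement,
                       audit_findings: dict[str, set[str]]) -> bool:
    """True only if an audit report exists for the regime and it lists
    no finding against the mapped control."""
    findings = audit_findings.get(req.regime)
    if findings is None:
        return False  # no audit report for this regime, so no evidence at all
    return req.control_id not in findings


req = CloudCipRequirement(
    cloud_cip_id="CLOUD-EXAMPLE-1",
    on_prem_equivalent="CIP-004-7 Part 5.1",
    regime="FedRAMP",
    control_id="AC-2.h",
)

# Invented audit results: controls that drew findings, keyed by regime
clean_report = {"FedRAMP": set()}
report_with_finding = {"FedRAMP": {"AC-2.h"}}

print(evidence_satisfied(req, clean_report))         # True
print(evidence_satisfied(req, report_with_finding))  # False
print(evidence_satisfied(req, {}))                   # False (no report at all)
```

The point of the sketch is that the check never asks about individual devices or network configurations; it only asks whether the audit report shows a finding for one named control.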

Of course, if a CSP’s evidence is found to comply with a Cloud CIP requirement by one NERC entity, all other NERC entities should have the same finding. This means it would be silly to have each NERC entity require the CSP to provide the same evidence for each Cloud CIP requirement. Instead, there needs to be a mechanism in which the CSP provides the evidence to NERC (or some third party designated by them), which then makes it available to each NERC entity.[i]

Once the SDT drafts “cloud versions” of the existing CIP requirements, will they be finished? Hardly. Those cloud requirements are based on just one type of risk: risks addressed in existing CIP requirements. There are two other types of risk that the SDT (or the group I’m proposing) should examine, to determine which of those risks are important enough to merit their own requirements in Cloud CIP.

The second risk type consists of risks addressed by requirements in ISO 27001, SOC 2 Type 2 and FedRAMP that don’t have “near-equivalent” CIP requirements, but that are important enough to include in the Cloud CIP requirements. For example, FedRAMP requirement AC-2.k reads, “The organization establishes a process for reissuing shared/group account credentials (if deployed) when individuals are removed from the group.”

This requirement doesn’t match any existing CIP requirement, but our working group (or the SDT) might decide it addresses a source of risk that is important enough to warrant its own CIP requirement (applicable to Cloud BCS, not onsite BCS). Therefore, this requirement could be reworded to be applicable to Cloud BCS. The “Measures” section of the requirement would call for evidence of “no findings” in the FedRAMP audit report.[ii]

The third risk type is by far the most important of the three, since there is no “up front” evidence (such as a certification or authorization) that the CSP has already mitigated risks of this type. This third type consists of risks that only apply to cloud-based systems. Since both ISO 27001 and FedRAMP can apply to both on premises and cloud-based systems, I believe they don’t include cloud-only risks (although I’m open to correction if someone knows otherwise).

Three cloud-only risks that I’ve identified just by reading the news are:

1. The CSP doesn’t make sure their customers are adequately trained in the security measures required to protect their cloud environment. As I described in this post, Paige Thompson, the woman who single-handedly almost brought down Capital One, was a technical staff member who had recently been fired by a CSP. She got revenge by breaking into the cloud environments of at least 30 customers of that CSP, one of which was Capital One. She bragged online that all those customers had made the same mistake in configuring their security controls; moreover, she said there were many other customers that made that mistake.

Of course, the CSP shouldn’t be held responsible for every configuration mistake made by a customer. However, if 30+ customers have all made the same mistake, that’s clearly a problem that needs to be addressed by the CSP. And the answer can’t be, “We offered a class for $695 that included discussion of this issue, but they didn’t take it.” The CSP should provide the training for free, if reasonably security-proficient customers are prone to make a serious mistake like this one.[iii]

If this risk were to be addressed in a CIP requirement, the requirement might call for the CSP to explain what they are doing to make sure their customers understand how to securely configure their cloud environment.

2. The CSP doesn’t properly vet the security of their online access providers. The Russian attackers that perpetrated the SolarWinds attack were quite smart. Perhaps the smartest thing they did was to reach the SolarWinds development environment through a classic supply chain attack, by first targeting a popular SaaS application used by SolarWinds. They did that by first compromising a third party that sells access to that SaaS application. Through that third party, they gained access to the cloud environment on which the SaaS application was running (the SaaS app itself was not compromised). From that environment, they launched the entire attack on SolarWinds (which was without much doubt one of the most sophisticated cyberattacks of all time).

Of course, the solution to this problem is for the platform CSP to tighten security requirements on the third party access brokers. A CIP requirement to address this risk might ask the CSP to describe the security requirements they place on third party access brokers and how they enforce them, as well as whether they have experienced any breaches through these access brokers.

3. Cloud Hopper. This attack was revealed in a Wall Street Journal article[iv] by Rob Barry and Dustin Volz in 2018. What’s most scary about it is that the attackers were able to jump from customer to customer within the clouds of multiple CSPs, and that they used a variety of techniques to penetrate different customers. This shows that your security in the cloud is at least partially dependent on whether your fellow cloud customers also practice good security – i.e., it’s a kind of herd immunity.

Of course, a platform CSP can’t police the general cybersecurity practices of all companies that utilize their cloud. However, the CSP should have measures in place today to detect and counteract attempts to hop from one customer to another. A CIP requirement might ask the CSP to describe (at least in general terms) the measures they have in place, as well as whether those measures have been successful in preventing Cloud Hopper-type breaches.

Of course, the three risks listed above are not the only cloud risks faced by NERC entities! The primary task of the group I want to form will be to review cloud risks identified by various organizations – federal agencies, the military, CSPs themselves, etc. – and decide which of them should be included as the basis for Cloud CIP requirements.

You may have noticed that, when I reached cloud-only risks (the third type of cloud cybersecurity risks), I abandoned my earlier concern that the CSPs won’t be willing to provide answers to unique questions like these. This is because the first two types of risks are already addressed in ISO 27001 and FedRAMP, with which the CSP is presumably already compliant. To provide evidence of compliance for CIP requirements based on those two types of risks, the CSP can simply point the customer to the audit reports.

However, risks of the third type – cloud-only risks – aren’t addressed by those other standards. Therefore, the CSP should feel obligated to provide compliance evidence for CIP requirements based on those risks. Again, since the CSP’s response to any of these requirements will be the same for all NERC entities, NERC, or a third party designated by them, should be the single point of contact for the CSP.

NERC will “audit” each of the CIP cloud-only risk requirements by evaluating the CSP’s response to a question or small set of questions, e.g., “Describe the security requirements you place on third party access brokers and how you enforce them. Have you experienced any security breaches that came through one of these access brokers?” The response will be evaluated by asking whether the CSP has adequately mitigated the risk that is the basis for the CIP requirement. If the response has not convinced NERC (or the third party) that the CSP has mitigated that risk, they will need to ask the CSP for more evidence.

The problem with the process I’ve just described is that it’s not currently allowed by anything in the NERC Rules of Procedure. As I’ve already said, I doubt there is any way to implement this process without making RoP changes.

However, in my opinion these cloud-only risks should be the primary focus of evaluation of the CSPs. This is because the fact that the major CSPs have all passed ISO 27001 and SOC 2 Type 2 audits, and have all been authorized for use by federal agencies under FedRAMP, means they don’t present a big problem when it comes to “normal” risks, like lack of patch management or configuration management programs. It’s fine to have Cloud CIP requirements that apply to normal risks, but the only compliance evidence the CSPs should have to provide is what’s in their audit reports based on those three compliance regimes (and perhaps others as well).

On the other hand, I strongly doubt that the cloud-only risks I’ve listed (and many more identified by others) are found in any standard compliance regimes today. The working group I want to put together will have one primary responsibility: compile a list of cloud-only risks that are not currently addressed by standard compliance regimes, then decide which of these are important enough to be addressed in the Cloud CIP standards. You’re welcome to participate in this effort if you are with a NERC entity, a vendor of cloud or software services to NERC entities (including SaaS providers and platform CSPs), NERC or the NERC ERO, a consulting organization that provides services based on NERC CIP, or if you’re just a user of electricity.

If you don’t use electricity, you obviously have no stake in what this group will do, so you’re not welcome. On the other hand, if you don’t use electricity, I’d like to know how you’re reading this blog post.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com or comment on this blog’s Substack community chat.



[i] There’s no mechanism today by which NERC can receive audit evidence from a third party and distribute it to individual entities. This is one reason why I think there will need to be changes to the NERC Rules of Procedure. As you’ll see later in this post, this isn’t the only reason why I think that.

[ii] It’s possible that the SDT (or our Cloud CIP Working Group) might decide that, if the CSP has received the appropriate certifications (ISO 27001 and SOC 2 Type 2) and authorization (FedRAMP), there’s no need to consider the other requirements in the certification or authorization, besides those that map to CIP requirements. This is because the certification means the CSP has either received no audit finding on each requirement, or they received a finding but have already mitigated the risk satisfactorily.

Therefore, the SDT and/or the Cloud CIP Working Group might just skip this second risk type altogether and address the third risk type. The third type is much more important than the first two, since risks of the third type presumably are not already addressed by any of the three compliance regimes.

[iii] After my posts on this incident appeared in 2018, Dick Brooks of Reliable Energy Analytics and I talked with someone from that CSP, who had contacted me about those posts. They convinced me that they had already addressed the problem (the meeting was at least a year after the incident).

[iv] The linked article was originally made open access but may have slipped behind the paywall. Rob Barry gave me a link to a PDF of the article on his personal website; if you can’t access the article itself, email me and I’ll send you that link.

 

