Wednesday, September 16, 2020

Which way to the cloud?

This post is based on my keynote presentation for Tripwire’s Energy Working Group on August 18, the first time the event was held virtually. I want to thank Tripwire for inviting me to present; it was a great experience.

Introduction

In the last year or two, I’ve had a lot of conversations with people in the NERC community who wonder where things stand with the cloud and the NERC CIP standards. There’s a lot of interest in this topic, simply because just about every electric utility is using the cloud in some way on the IT side of the house, while at the same time most of them are holding back on the OT side because they aren’t sure what’s allowed and what’s not.

I’ll discuss what is and isn’t allowed today, but the bigger question is “When will this situation change? When will we be able to use the cloud as freely on the OT side of the house as we do on the IT side now?” I’d like to first go through the situation as I see it now, and then discuss where (if anywhere) we go from here.

BCSI in the cloud

The first thing to keep in mind is that the question of whether NERC entities are able to put BES Cyber System Information (BCSI) in the cloud is very different from the question of whether they can put BES Cyber Systems (BCS) themselves in the cloud. Both cases are “illegal” from the NERC CIP point of view now, but one is borderline legal while the other is very much on the wrong side of the law.

Of course, the borderline legal case is BCSI in the cloud. Many NERC entities now have BCSI for Medium or High impact BES Cyber Systems in the cloud, usually because it’s stored there by a cloud-based app like configuration management or vulnerability management. I know at least some Regions are explicitly allowing their entities to do this, and I would guess all of the others will allow it to happen – although if you have any doubts, you should raise this issue with your Region before you take the plunge. And of course, you need to take steps to mitigate risk, especially by encrypting the BCSI both at rest in the cloud and in transit to and from the cloud.

The primary reason that BCSI in the cloud isn’t “legal” now is that no cloud provider could provide you the evidence you need to comply with three requirement parts in CIP-004, as Kevin Perry pointed out to me in this post from 2017; his comments remain as correct today as they were then. But the difference today is that help is on the way: A drafting team is in the middle of balloting changes to CIP-004 and CIP-011 that seem to be a very creative (and risk-based!) way to allow BCSI to safely reside in the cloud, without requiring – for example – that AWS document that they’ve removed physical and logical access to any server housing your BCSI within a day, whenever any one of their 200,000 (or whatever the number is) data center employees is terminated. So even if you’re hesitant to have BCSI in the cloud now, it will probably be perfectly legal within two years, if not sooner.

BCS in the cloud

However, the situation is very different when we start to talk about putting BES Cyber Systems themselves (as opposed to just information about them) in the cloud, although doing this wouldn’t be hard, since for example there are a number of cloud-based SCADA offerings today. There are two major problems that arise when we talk about putting BCS themselves in the cloud.

The first problem is that there’s no way to designate a cloud-based BES Cyber System in CIP-002 R1, which is of course the requirement where the NERC entity has to identify and classify all of its BCS. It’s easy to see the reason for this if you follow the logical chain of steps for identifying and classifying BCS, using CIP-002-5.1a R1.1 and Attachment 1:

1.      The first step is to identify Cyber Assets. These are defined as “Programmable electronic devices”, and as of now a “device” is a physical device, not a virtual one (that will change whenever the changes to allow virtualization – currently being drafted by the CIP Modifications SDT – are enacted, but as I’ll explain later, those changes don’t affect the question of whether BCS can be based in the cloud at all).

2.      Next is to identify BES Cyber Assets, defined roughly as Cyber Assets that, if destroyed or compromised, could have an impact on the Bulk Electric System (BES) within 15 minutes.

3.      Finally, create BES Cyber Systems, which are groupings of one or more BES Cyber Assets.

So a BCS is ultimately composed of Cyber Assets, which are physical devices (my definition of a physical device is one that will hurt if you drop it on your foot. A cloud-based Cyber Asset won’t hurt if you drop it on your foot, if you could even figure out how to hold and drop it in the first place). Ergo, a BES Cyber System can’t be based in the cloud.
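The chain of steps above can be sketched as a small data model. This is only an illustration of the reasoning, not anything taken from the standard: the `Device` class, its field names, and the two example devices are hypothetical, and the 15-minute test is reduced to a single yes/no flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Device:
    """Hypothetical record for something an entity might want to classify."""
    name: str
    is_programmable: bool
    is_physical: bool                 # as of now, a "device" means a physical device
    impacts_bes_within_15_min: bool   # rough stand-in for the BES Cyber Asset test


def is_cyber_asset(d: Device) -> bool:
    # Step 1: a Cyber Asset is a programmable electronic device (physical, for now).
    return d.is_programmable and d.is_physical


def is_bes_cyber_asset(d: Device) -> bool:
    # Step 2: a BES Cyber Asset is a Cyber Asset with a 15-minute BES impact.
    return is_cyber_asset(d) and d.impacts_bes_within_15_min


def build_bes_cyber_system(devices: List[Device]) -> List[Device]:
    # Step 3: a BES Cyber System is a grouping of one or more BES Cyber Assets.
    return [d for d in devices if is_bes_cyber_asset(d)]


# Two made-up examples: one on-premises device, one cloud-based system.
relay = Device("substation relay", True, True, True)
cloud_scada = Device("cloud-based SCADA instance", True, False, True)

bcs = build_bes_cyber_system([relay, cloud_scada])
print([d.name for d in bcs])
```

The cloud-based system drops out at step 1, before the 15-minute question is ever asked, which is the point of the argument above.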

Second, even if you could somehow identify BCS in the cloud, there would be no good way to document compliance with a number of CIP requirements. This includes all requirements in CIP-004, CIP-005, CIP-006, CIP-007 (except for R3), CIP-010 (except perhaps for R4) and CIP-011 R2. All of these requirements mandate actions on particular physical devices, and those actions need to be documented in every instance. A cloud provider couldn’t document which physical servers your BCS is located on at a particular point in time, let alone over a 3-year audit period. It would break their business model if they were required to do this.

The bottom line is that Medium and High impact BCS can’t be located in the cloud if an entity wants to be compliant with CIP-002-5.1a through CIP-011-2, or with CIP-013 for that matter. So what’s the solution? Just as there are two CIP problems that currently prevent BCS from being located in the cloud, there are two components to the solution. They are closely related.

The solution – first part

The first problem is that currently a BES Cyber System can only be a collection of physical devices. The current CIP Modifications standards drafting team saw the solution to this problem in June 2018, when they proposed the idea that the terms Cyber Asset and BES Cyber Asset be dropped. BCS would now be the fundamental term for determining scope of the NERC CIP standards (and the SDT proposed changing the BCS definition so it included what’s in the BCA definition now: 15-minute impact, etc).

This change would have allowed virtual machines to be subject to CIP. With this approach, hardware devices would no longer be the focus of the CIP standards. A BCS could be based anywhere, including at cloud data centers. The individual hardware would never have to be identified, since Cyber Asset and BES Cyber Asset would no longer have any meaning in the CIP standards. BES Cyber System would be the fundamental unit of compliance.

Of course, the SDT developed this solution to allow virtual devices to be BES Cyber Systems. But this would have the same effect for systems based in the cloud: If their loss, misuse, etc. could cause a BES impact within 15 minutes, they could legitimately be BCS. This means that in 2018, the SDT unintentionally solved the first problem preventing BCS in the cloud.

The solution – second part

However, the SDT knew another step was needed. They realized there were some CIP requirements, like CIP-007 R1 and R2 and CIP-010 R1, that were too prescriptive and hardware-focused to work in this new system. They decided to rewrite those in a risk-based format.

Of course, these requirements would still focus on hardware. But they would require a risk management plan like CIP-013 R1 does, not a specific set of actions on specific hardware. Most importantly, the entity wouldn’t have to document they had performed particular actions on every Medium or High impact BCS component (BCA or PCA). They would just show they developed a risk management plan and implemented it.

For example, take CIP-007 R2 (please!). I don’t remember what exactly the drafting team did with this requirement, but here is how I would rewrite it in a risk-based format:

As you probably know, CIP-007 R2 (patch management) requires a set of mitigations that address the risks posed by unpatched software vulnerabilities. R2 requires the entity to 1) check every 35 days for new security patches for every piece of software in its ESPs; and 2) determine whether the patch is applicable to one of their systems. If it is, then 3) the entity needs to apply the patch within another 35 days. If the patch can’t be applied in 35 days, then 4) the entity needs to develop a mitigation plan, which itself needs to be reviewed every 35 days…etc. A cloud provider can’t do this for even one of their servers, let alone hundreds or thousands. It would break their business model.
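To see how much per-system bookkeeping this implies, here is a minimal sketch of the 35-day cycle as described in the paragraph above. It is a simplification for illustration, not the requirement language: the function name, its parameters, and the status strings are all hypothetical, and it tracks only one piece of software with at most one outstanding patch.

```python
from datetime import date, timedelta
from typing import Optional

CYCLE = timedelta(days=35)  # the 35-day interval described above


def patch_status(last_check: date,
                 today: date,
                 applicable_found: Optional[date] = None,
                 patch_applied: Optional[date] = None,
                 mitigation_plan: Optional[date] = None) -> str:
    """Return a rough status for one piece of software (illustrative only)."""
    # 1) Has the entity checked for new patches within the last 35 days?
    if today - last_check > CYCLE:
        return "overdue: check for new security patches"

    # 2) No applicable patch outstanding means nothing further to do.
    if applicable_found is None:
        return "ok: no applicable patch outstanding"

    deadline = applicable_found + CYCLE

    # 3) Patch applied within another 35 days.
    if patch_applied is not None and patch_applied <= deadline:
        return "ok: patch applied in time"

    # 4) Otherwise a mitigation plan is needed (and must itself be reviewed).
    if mitigation_plan is not None and mitigation_plan <= deadline:
        return "ok: mitigation plan in place, review every 35 days"

    if today <= deadline:
        return f"pending: apply patch or create mitigation plan by {deadline.isoformat()}"
    return "overdue: patch not applied and no mitigation plan"


print(patch_status(date(2020, 8, 1), date(2020, 9, 10)))
print(patch_status(date(2020, 9, 1), date(2020, 9, 10),
                   applicable_found=date(2020, 9, 1)))
```

Every one of these dates has to be recorded and evidenced for every piece of software on every system in scope, which is why this kind of requirement doesn't translate to a cloud provider's environment.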

How would I write this as a risk-based requirement? First I need to identify the risk the requirement addresses. Patch management isn’t a risk – it’s one mitigation for the risk of unpatched software vulnerabilities. If we were to replace CIP-007 R2 with a requirement for managing risks due to unpatched software vulnerabilities, it might read something like CIP-013 R1 does: “Develop and implement a risk management plan to identify, assess and mitigate risks arising from unpatched vulnerabilities in software or firmware installed on BES Cyber Systems.”

However, there are actually four types of risk due to unpatched software vulnerabilities, and the current CIP-007 R2 addresses only one of them: the risk of vulnerabilities for which a patch has been developed but not applied. The obvious mitigation for this risk is to apply the patch!

The three other types of risk due to unpatched software vulnerabilities are:

1.      Risks due to vulnerabilities that have been identified in software or firmware, for which no patch will be released in the near term (e.g. the supplier has informed you they will not have a patch soon, for whatever reason. Or perhaps the supplier has gone out of business, or they have discontinued support for the product).

2.      Risks due to vulnerabilities that have been identified in open source or third-party software components included in a software package you have purchased, but for which no patch will be available soon.

3.      Risks due to vulnerabilities in custom software developed by or for your organization.

My new CIP-007 R2 would address all these risks, not just the risk of vulnerabilities for which a patch has been issued. For each of these types of risk, the entity would need to:

a)      Decide whether this is a risk that could ever have more than a low likelihood of being realized. In the case of number 1, the risk has to be assessed for each software supplier. The NERC entity might decide that Microsoft™ will never stop patching products without providing plenty of notice to customers; therefore they have low likelihood for this risk. In the case of number 3, if the organization doesn’t develop any software on its own, the likelihood of this risk being realized is (extremely) low.

b)     If the likelihood will always be low – for all suppliers – the entity would be justified to simply state that this is the case and move on to the next risk. In my methodology, a risk that always has low likelihood is one that has already been mitigated or else simply doesn’t apply to your environment (e.g. risks due to vendor remote access, if your organization doesn’t allow it under any circumstances).

c)      If the likelihood could ever be higher than low, the entity does need to take steps to mitigate the risk. But in the case of risks due to suppliers, like numbers 1 and 2 above, no mitigation is needed if the supplier already has a low likelihood for this risk, as in the case of Microsoft for risk number 1.
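Steps a) through c) amount to a simple triage per risk. The sketch below is one hypothetical way to express it; the `Likelihood` enum, the `triage` function, and the "Small vendor" supplier are made up for illustration, and a real assessment would of course involve judgment rather than a two-value scale.

```python
from enum import Enum
from typing import Dict


class Likelihood(Enum):
    LOW = 1
    HIGHER = 2  # anything that could ever be more than low


def triage(risk: str, likelihood_by_source: Dict[str, Likelihood]) -> Dict[str, str]:
    """Return an action per supplier (or internal source) for one risk."""
    # Steps a) and b): if the likelihood is always low, for all sources,
    # state that this is the case and move on to the next risk.
    if all(lk is Likelihood.LOW for lk in likelihood_by_source.values()):
        return {"*": f"document that '{risk}' always has low likelihood"}

    # Step c): mitigate only where the likelihood could be higher than low.
    return {
        source: "no mitigation needed" if lk is Likelihood.LOW else "mitigate"
        for source, lk in likelihood_by_source.items()
    }


# Risk number 1: a vulnerability for which no patch will be released soon.
actions = triage(
    "no patch forthcoming",
    {"Microsoft": Likelihood.LOW, "Small vendor": Likelihood.HIGHER},
)
print(actions)

# Risk number 3, for an organization that develops no software of its own.
print(triage("custom software vulnerabilities", {"in-house": Likelihood.LOW}))
```

Note that the evidence here is the plan and the reasoning behind each decision, not a record of actions on individual devices.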

Making the prescriptive CIP requirements risk based was the second part of the SDT’s solution for allowing virtual devices to be covered by CIP. But, again unintentionally, the SDT also pointed to what’s needed for cloud-based systems to be covered. If all the CIP requirements were risk-based, then cloud providers would just have to show that they have good programs for supply chain risk management, software vulnerability risk management, user access risk management, etc. There would be no need to document actions performed on individual devices or with respect to individual employees, to show they’re compliant with CIP. Certifications like FedRAMP might in some cases be accepted as sufficient evidence of a program.

To summarize this section, in 2018 the CIP Modifications Standards Drafting Team came up (at least in principle) with the entire solution required to allow BES Cyber Systems to be put in the cloud. If they had followed through with these changes and they had been approved by the NERC ballot body, the NERC Board and FERC, NERC entities would soon be able to do just that.

What happened?

However, the SDT’s 2018 proposal was abandoned when a lot of NERC entities said they didn’t want to have to implement the big changes that would be required in their current compliance programs. I was quite disappointed when I heard this had happened.

The SDT has now moved on to a much more conventional approach. Instead of getting rid of the hardware device concept altogether, they are expanding the meaning of “device” to include virtual devices. This may address virtualization, but it does nothing for BCS in the cloud. Cloud providers can no more comply with prescriptive requirements for virtual devices than they can comply with prescriptive requirements for physical devices – there is no way they can demonstrate compliance on any kind of “device”, since their business model is built on being able to continually move data and software code between devices and data centers. We’re back at square one, as far as BCS in the cloud goes.

So how do we move forward?

Of course, the CIP Modifications SDT was never given the mandate to address BCS in the cloud. This isn’t their responsibility, and it never will be. To finally address this problem, there needs to be a new Standards Authorization Request (SAR) and a new SDT to address the changes that are required to allow BCS to be put in the cloud. But I don’t want to see a SAR that just requires a new SDT to “solve the problem of BCS in the cloud” or something like that. If there isn’t buy-in from the NERC community up front about the right approach to take, the new SDT will just run into the same problems the CIP Mods SDT did.

I think there needs to be a series of national NERC meetings (virtual, of course) where there’s a discussion of a) what needs to be changed for BCS to be permitted in the cloud, and b) how to make those changes. Only when there’s general agreement on the way forward (perhaps confirmed by a ballot) should a SAR be drafted and a new SDT constituted.

Hopefully, the national meetings will lead those who are reluctant to change their existing CIP programs to understand the cost of this reluctance: If they continue to insist they can’t change, they (or any other NERC entity with Medium or High impact BCS) will never be able to use cloud-based BES Cyber Systems. Period.

Of course, this will require a much more fundamental revision of the CIP standards than even CIP version 5 was. Doing what I’m suggesting will require widespread support among NERC entities, and I see no sign of that now. Does that mean BCS will never be allowed in the cloud?

I actually believe it will happen, although I won’t say when because I don’t know. I think the advantages the cloud can provide for NERC entities are so great that they will ultimately outweigh the general resistance to change.

But I’ve been wrong before…

 

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

Are you wondering if you’ve forgotten something for the 10/1 deadline for CIP-013 compliance? This post describes three important tasks you need to make sure you address. I’ll be glad to discuss this with you as well – just email me and we’ll set up a time to talk.
