Monday, September 28, 2020

What could possibly go wrong?

 If you're looking for my pandemic posts, go here.

Two weeks ago, I put up a post expressing pessimism that Medium and High impact BES Cyber Systems will be allowed in the cloud anytime soon, since allowing them would require a complete rewrite of the NERC CIP standards.

Last week, I wrote about the often-repeated hope that somehow, someway a cloud provider’s FedRAMP compliance would be taken as evidence of the provider’s compliance with some or all of the standards, thus saving the NERC community the inconvenience of redrafting them from scratch. I agreed that FedRAMP might indeed end up being taken as evidence of compliance with the CIP standards – just not the ones that are in place now.

None of the Measures mention anything about FedRAMP or the cloud, so at a minimum the Measures section for most requirements would have to be rewritten to say that FedRAMP certification is evidence of compliance. But if that were done – changing the Measures without touching the requirements themselves – every NERC entity with Medium or High impact BCS would immediately do their best to move their entire OT infrastructure into the cloud, since their required CIP compliance evidence would effectively be reduced to one sentence: “We have transferred all of our BES Cyber Systems to XYZ Cloud Provider, who is FedRAMP compliant.” A little too good to be true, no?

The CIP requirements are going to have to be rewritten so they’re all risk-based. Then all of the standards would essentially be like CIP-013: you need to identify and assess the risks in a particular area of risk (unpatched software vulnerabilities, malicious insiders, vendor remote access, etc.), then mitigate them. Even if your BCS were all in the cloud, you would still have to show how you identified and assessed the risks, like all entities would. However, your evidence that most risks had been mitigated would be the fact that the cloud provider was FedRAMP compliant, and specifically that they had “passed” certification for each of those areas of risk. And if it turns out that FedRAMP doesn’t ask questions that specifically address the risks you’ve identified, you would need to provide other evidence, from the cloud provider or from your own network, showing that the risk had been mitigated.
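To make that a little more concrete, here is a minimal sketch (in Python, and purely illustrative – the risk areas, field names and evidence strings are my own assumptions, not anything taken from the CIP standards or from FedRAMP) of how an entity might keep a simple risk register that maps each area of risk to its assessment and to the source of its mitigation evidence, whether that’s the provider’s FedRAMP certification or the entity’s own evidence:

```python
# Hypothetical sketch only: one way a NERC entity might organize risk-based
# compliance evidence, mapping each area of risk to its assessment and to the
# source of mitigation evidence (the provider's FedRAMP certification where it
# applies, the entity's own evidence where it doesn't). Field names and risk
# areas are illustrative, not taken from any standard.
from dataclasses import dataclass

@dataclass
class RiskArea:
    name: str                 # e.g. "unpatched software vulnerabilities"
    assessment: str           # how the entity assessed the risk
    covered_by_fedramp: bool  # does a FedRAMP control address it?
    evidence: str             # FedRAMP reference or entity-supplied evidence

risk_register = [
    RiskArea(
        name="unpatched software vulnerabilities",
        assessment="high likelihood / high impact for cloud-hosted BCS",
        covered_by_fedramp=True,
        evidence="provider FedRAMP authorization (flaw remediation controls)",
    ),
    RiskArea(
        name="former provider employee misusing infrastructure knowledge",
        assessment="demonstrated by the Capital One breach",
        covered_by_fedramp=False,
        evidence="entity-supplied evidence, e.g. hardened configs and monitoring",
    ),
]

# Anything FedRAMP doesn't cover still needs the entity's own mitigation evidence.
for risk in risk_register:
    source = "FedRAMP" if risk.covered_by_fedramp else "entity"
    print(f"{risk.name}: evidence from {source} -> {risk.evidence}")
```

The structure is the whole point: FedRAMP covers some of the rows, and any row it doesn’t cover still needs evidence from the entity or from the provider.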

So if this idea were to move forward, the drafting team would first need to redraft all of the CIP standards along something like the lines I’ve just described. Will that be enough? In other words, if we were to turn all of the current CIP requirements into risk-based ones, and if FedRAMP were to be accepted as evidence of compliance with most or all of them, would that be all it takes for NERC to say that BCS can safely be placed in the cloud?

I contend the answer to that is no. Why do I say this (other than the fact that I obviously enjoy being contrarian)? Because there are a number of risks that arise only from the cloud and that we’re just beginning to learn about – sometimes the hard way. These risks of course aren’t addressed in CIP, and they just as certainly aren’t addressed in FedRAMP either. The drafting team is going to have to take a long, hard look at what these risks are and how they can be mitigated. After they have done that, the team – and the NERC ballot body, the NERC Board of Trustees, and FERC – will also need to be satisfied that the risks they’ve identified, the ones that are unique to the cloud, can be mitigated in some way by the cloud providers, by the NERC entities, or both. If they can’t identify a way to mitigate some of these new risks, and if they think these are significant risks, then they shouldn’t allow BCS to be implemented in the cloud.

What are examples of these risks that apply only to cloud providers? I’ve written about two of them, and I’m sure there are others. Below is my discussion of the first of these. I’ll discuss the second in my (hopefully) next post.

The Paige Thompson Memorial Risk

First and foremost, there’s the risk that was exposed by Paige Thompson, the former AWS employee who engineered the Capital One breach. Since I think there’s a lot of misunderstanding about what the real risk is in this case, here are the most important points I made in my two main posts on this topic last year (here and here):

1.      Paige Thompson was a technical person who had been fired by AWS three years before this breach was discovered.

2.      She didn’t just breach Capital One’s systems in the AWS cloud; she bragged online that she had breached 30 other AWS customers’ systems. It doesn’t seem she took much, if anything, from those customers – she seems to have been motivated mainly by a desire for revenge on AWS for firing her (and even though she stole a lot of data from Capital One, she didn’t seem to try to monetize it by selling it on the dark web).

3.      Of course, AWS and other cloud providers deliberately leave security up to each customer – although I’m sure they’ll take responsibility for it if the customer is willing to pay something extra. So technically, the Capital One breach is Capital One’s fault, and the breaches of the other 30 customers are those customers’ fault. In fact, AWS initially blamed it on Capital One (and Capital One accepted that blame, which might have given one or two of their lawyers some apoplexy).

4.      However, Ms. Thompson had also bragged online that her success in penetrating at least 31 AWS customers was due to one specific reason: the customers don’t understand that there’s a big difference between configuring a firewall to protect an on-premises network and configuring a firewall in the AWS cloud. Specifically, the organization needs to understand AWS’ “metadata” service, and she bragged that “everybody gets this wrong”; she said this was how she was able to penetrate so many organizations’ AWS environments (there’s a short technical sketch of this metadata service issue right after this list).

5.      But as I said in one of the above-linked posts, “If any system is so difficult to figure out that 30 companies don’t get it right (plus God knows how many other Amazon customers who weren’t lucky enough to be on Ms. Thompson’s target list), it seems to me (rude unlettered blogger that I am) that Amazon might look for a common cause for all of these errors, beyond pure stupidity on the customers’ part.” Either that or no longer accept customers who don’t reach a certain IQ level.
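A quick technical aside on that metadata service, for readers who want specifics. Public reporting on the Capital One breach described requests being relayed through a misconfigured firewall to the EC2 instance metadata service, which hands out temporary credentials; AWS has since introduced IMDSv2, which requires a session token before the metadata service will answer. Below is a minimal sketch (using boto3, with a placeholder instance ID) of how a customer might enforce IMDSv2 on its instances. It’s one hardening step among several, not a complete fix, and it certainly isn’t something CIP or FedRAMP requires today:

```python
# Minimal, illustrative sketch: require IMDSv2 (session-token-based access to
# the instance metadata service) on existing EC2 instances, so a plain SSRF
# request to the metadata endpoint no longer returns role credentials.
# The instance ID below is a placeholder; error handling is omitted.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

def require_imdsv2(instance_ids):
    """Turn off token-less (IMDSv1) access to the metadata service."""
    for instance_id in instance_ids:
        resp = ec2.modify_instance_metadata_options(
            InstanceId=instance_id,
            HttpTokens="required",   # reject requests without a session token
            HttpEndpoint="enabled",  # keep the metadata service itself on
        )
        print(instance_id, resp["InstanceMetadataOptions"]["State"])

require_imdsv2(["i-0123456789abcdef0"])  # placeholder instance ID
```

Of course, a setting that each customer has to remember to turn on is exactly the kind of thing that gets missed, which is the point of the list above.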

So the risk that a disgruntled former cloud provider employee will use their knowledge to break into customer environments in the cloud is a real one (and remember, this is very different from an insider threat; three years after she was fired, Paige Thompson certainly wasn’t an insider anymore). But the threat that customers will be too stupid to configure their firewalls correctly isn’t the real risk here.

The Paige Thompson risk obviously isn’t addressed in FedRAMP, since AWS was FedRAMP certified at the time of the breach (although Ms. Thompson shouldn’t have been able to gain access to the AWS cloud – especially as an administrator – at all; the fact that she did perhaps points to some more conventional vulnerability that I presume AWS has identified and fixed by now).

How can this risk be mitigated? I don’t see any good way for a cloud provider to prevent a former employee from using the knowledge they gained on the job to accomplish this goal, other than somehow sucking that knowledge out of their brain when they leave (or giving the term “employee termination” a whole new meaning, as I very helpfully suggested in one of my posts last summer; however, I don’t believe that terminating your employees in this sense is an HR best practice). And obviously, no NDA is going to prevent a former employee from breaking into their former employer’s cloud and using the knowledge they gained while working there to hack into their customers’ environments. If they get caught as Ms. Thompson did, an NDA violation is going to be the least of their problems.

The burden here is really on the cloud provider. If some service that’s essential for security is so complicated that nobody configures it properly, then a) the service needs to be redesigned so that it’s understandable; b) the provider needs to offer intensive, mandatory training on that service to all customers; or c) for customers who still don’t understand the service after training, the provider should take responsibility for making sure their environment is properly secured.

I think the new “cloud CIP” should, on top of requiring FedRAMP compliance, require NERC entities to mitigate the risk that a former cloud provider employee will use their technical knowledge of the provider’s infrastructure to attack BCS that the entity has implemented in the cloud. I’ll discuss the other serious cloud risk that was discovered last year (which I’m sure is also not addressed by FedRAMP) in my next post.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

Are you wondering if you’ve forgotten something for the 10/1 deadline for CIP-013 compliance? This post describes three important tasks you need to make sure you address. I’ll be glad to discuss this with you as well – just email me and we’ll set up a time to talk.
