Tom Alrich's Blog: A Typical CIP SNAFU

If you don’t know what CIP stands for, it’s Critical Infrastructure Protection. And if you don’t know what SNAFU stands for, well…this is a family blog, so I can’t tell you. But I’m sure you know what a SNAFU is – a very complicated situation that can’t be resolved in any easy way, which results in inhibiting or squelching some otherwise worthwhile activity or in wasting money and time on something that has no positive impact on either security or reliability (at least, that’s my definition). And by definition, in a SNAFU there is no malicious actor you can point to who is responsible for the situation; it is the inherent contradictions in the system itself that are the cause of the problems. A CIP SNAFU is without a doubt the most serious type of SNAFU there is, as the following story illustrates.

I talked recently with a Control Center manager from a NERC entity about his experiences with CIP compliance. We discussed a number of topics, and he told me a story that I think perfectly illustrates the consequences of having ambiguous or incomplete standards in a compliance regime with huge penalties for non-compliance.

The problem in this case is patch management servers. This entity has two of these servers residing outside of the ESP. They manage and install patches only for devices within the ESP. Of course, these servers fulfill an important role in both security and compliance; there is no dispute about that. Without automated patch management, many large entities simply couldn’t be secure or CIP compliant.

As you might guess, the question was how to classify these servers under CIP v5/v6. It seems someone had interpreted a document from the applicable Regional Entity as saying that patch management servers need to be classified as Electronic Access Control or Monitoring Systems (EACMS). The RE’s reasoning (as I heard it third-hand) was that, since the PM servers had direct connections to cyber assets within the ESP, they could in theory be commandeered by a malicious attacker to alter or disable BES Cyber Systems, and thus to affect the BES within 15 minutes.

Since I haven’t seen the document in question, I can’t be sure whether or not this was exactly the argument used by the Regional Entity. But I certainly hope it wasn’t, since it suffers from three flaws. The first is that the PM servers provide neither access control nor monitoring, so how can they be EACMS? The second is that, whether or not an attacker could use a server as an attack vector, this has no bearing on the question of whether it is an EACMS – or a BCA, PCA, etc. Almost any computer in the world could be used to mount an attack inside an ESP. And the third is that, even if the servers themselves had a 15-minute BES impact, that would make them BES Cyber Assets, not EACMS.
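To make those three flaws concrete, here is a minimal sketch in Python of how I read the classification logic. It is my own simplified paraphrase, not the NERC Glossary definitions and certainly not an official decision procedure; the field names, the function and the server name are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class CyberAsset:
    # All field names are hypothetical; they paraphrase the criteria discussed above.
    name: str
    inside_esp: bool                  # resides within the Electronic Security Perimeter
    performs_access_control: bool     # controls electronic access to the ESP or to BCS
    performs_access_monitoring: bool  # monitors electronic access to the ESP or to BCS
    bes_impact_within_15_min: bool    # its own loss or misuse would affect the BES within 15 minutes


def classify(asset: CyberAsset) -> str:
    """Simplified reading of the categories; an assumption, not the official text."""
    if asset.bes_impact_within_15_min:
        return "BCA"            # third flaw: a 15-minute impact points to BCA, not EACMS
    if asset.performs_access_control or asset.performs_access_monitoring:
        return "EACMS"          # first flaw: EACMS status turns on function
    if asset.inside_esp:
        return "PCA"
    return "out of scope"


# A patch management server as described in this post: outside the ESP,
# doing neither access control nor access monitoring.
pm_server = CyberAsset(
    name="pm-server-1",
    inside_esp=False,
    performs_access_control=False,
    performs_access_monitoring=False,
    bes_impact_within_15_min=False,
)
print(classify(pm_server))
```

Notice that nothing in the function asks "could an attacker use this machine to get at the ESP?" That is the second flaw: the question may matter a great deal for security, but in this reading it isn't an input to the classification.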

However, in saying that the patch management servers are no different from any other computer in the world (in being able to attack the ESP), I’m being a little too cute. There is a big difference between these two PM servers and all of the other computers in the world that are outside of the ESP in question: The two PM servers have direct access into the ESP. True, there is a firewall in place (in fact, more than one firewall) to make sure that only traffic coming from these two servers can get in through the particular port or ports required to perform patching; all other traffic will be blocked. But once inside the ESP, the two servers will pretty much have full access to at least some of the devices within the ESP.[i] If they have been commandeered by a nefarious actor, the game is over.
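The firewall arrangement just described boils down to a source-and-port allowlist. Here is a rough sketch of that idea in Python; the addresses (taken from the ranges reserved for documentation), the port number and the function name are all assumptions of mine, not details of the entity’s actual configuration.

```python
from ipaddress import ip_address

# Hypothetical values for illustration only.
PM_SERVERS = {ip_address("192.0.2.10"), ip_address("192.0.2.11")}
PATCH_PORTS = {8443}


def esp_firewall_allows(src: str, dst_port: int) -> bool:
    """Permit traffic into the ESP only from the two PM servers on the patching port(s)."""
    return ip_address(src) in PM_SERVERS and dst_port in PATCH_PORTS


print(esp_firewall_allows("192.0.2.10", 8443))    # a PM server on the patching port
print(esp_firewall_allows("198.51.100.7", 8443))  # any other machine
print(esp_firewall_allows("192.0.2.10", 22))      # a PM server on some other port
```

The rule looks only at where the traffic comes from and which port it is headed to. It has no way of knowing whether the PM server on the other end is still under the entity’s control, which is exactly why a commandeered server is such a concern.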

So I can’t be angry at the Regional Entity, or at the legal department at my friend’s company, for being concerned about the possibility of misuse of the patch management servers. This is definitely a security concern, but it isn’t a concern that is addressed by a NERC CIP requirement.

Of course, the underlying problem here is the subject of my last post: machine-to-machine communications, when one machine is within the ESP and one is outside of it. As discussed in that post, CIP currently requires no controls on the remote machines, or on the communications between them and ESP devices. The new CIP standard for security of the supply chain will address one of the largest, and almost certainly the most problematic, of the sources of such communications – vendor systems that have been granted access to ESP devices for troubleshooting or other purposes. However, it wouldn’t cover the entity’s own patch management servers.

I need to point out that the entity in question is applying all appropriate security controls to the patch management servers; that isn’t an issue. But declaring the servers to be EACMS – or any other cyber asset under CIP jurisdiction, in a Medium or High impact asset – imposes a much larger burden on the entity, due to the reporting and other tasks that are required for compliance. The CIP team would have appreciated not having to do this. I say “would have” because in fact the legal team prevailed, and the patch management servers were identified as EACMS. And sure enough, the CIP team did have to put in a lot of extra work because of this misinterpretation.

So what went wrong here? As with all SNAFUs, the problem wasn’t that someone had evil intent and deliberately misinterpreted the standards in order to make the CIP team lose some evenings or weekends. The legal department was just doing what they see as their job: making sure that all appropriate laws and regulations are followed to the letter. If a regulation is incomplete or ambiguous, they need to follow whatever guidance they have from the regulator.

In this case, the “regulator” was the Regional Entity.[ii] The legal department (which was relying on the advice of a trusted consultant) assumed that the meaning of the document it received was clear – and that it required that patch management servers be identified as EACMS. As lawyers, what else could they do but order the CIP team to follow what this document said?

But there’s another “culprit” here, one whose fingerprints I have been finding at a lot of NERC CIP crime scenes recently: the fact that the CIP requirements are prescriptive, not risk-based.[iii] If they were risk-based, the fact that there is only a very small chance that the patch management servers could be used to attack BES Cyber Systems would mean something, even if they were mistakenly classified as EACMS. Rather than having to argue with the legal team about whether the servers were EACMS, the CIP team would simply point out the various protections already in place to keep them from harming BCS, and note that since these servers don’t pose much risk, they don’t need the most stringent controls. Or more likely, the lawyers would simply take their word for it as security professionals, and move on to cyber assets that posed a more significant risk to the BES.

The day when the CIP standards are non-prescriptive is at least a few years off. But it will come!

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] The exception to this statement is if the entity has deployed an application-layer firewall and configured appropriate signatures or rules so that the PM servers will be limited to doing patch management, nothing else. This is in theory a preferable control, but it is not required by CIP.

[ii] Of course, the Regional Entity isn’t technically the regulator, nor is NERC – which is a private non-profit corporation with no power to levy fines or compel compliance. The regulator, for all NERC standards, is FERC.

[iii] As I’ve pointed out previously, there are already two non-prescriptive CIP requirements: CIP-007-6 R3 and CIP-010-2 R4. Another one is in the works, as pointed out in the post I just referenced. So it’s three (requirements) down, 30 to go!

Wednesday, October 12, 2016
