Tom Alrich's Blog: “We Need Something to Measure”

I have been trying to keep up with the new Standards Drafting Team that is tasked with developing the CIP Supply Chain Security standard that FERC ordered in July. Last Friday, they had a short phone meeting to set the stage for their first face-to-face meeting in Atlanta this week (which I unfortunately had to miss because it conflicts with NERC’s GridSecCon). I won’t summarize the phone meeting, but I do recommend that anyone with any interest in the new standard should get on the SDT’s email list – called the “Plus List” – in order to see documents, learn about upcoming meetings, and in general learn what’s going on (I do want to point out something that some people don’t seem to understand about NERC meetings in general: almost all of them are open to anybody who is a user of electricity. And if you’re not an electricity user, how on earth are you reading this blog post?).

What inspired this post was a single comment made by someone on last Friday’s SDT call, as the discussion was focusing on the preliminary version of the new standard (which is currently called CIP-013-1, proving that the SDT members are not superstitious). This person said something to the effect of, “We need something to measure.” The meaning seemed to me to be, “You can’t have an enforceable requirement unless it can be rigorously audited. You can’t audit compliance with a requirement without having some black-or-white criterion for whether the entity has complied with it.”

I want to say now that I don’t mean in any way to pick on the person who said this (and I honestly don’t know who it was, nor do I care). I will also say that I have heard this assertion made many times. Finally, I want to mention that I and one or two other people on the call quickly pointed out to this person that this will be the wrong approach to use for CIP-013, if for no other reason than because FERC said explicitly in their Order 829 (pages 30 and 31) that the new standard should not be prescriptive. So my guess is this issue has already been laid to rest as far as the Supply Chain Security SDT is concerned.

That being said, there’s something more I want to point out: The majority of the existing CIP requirements are written with precisely this thought in mind, namely that the only good standard is a prescriptive one. And folks, in my humble opinion this is the fundamental problem with the current NERC CIP standards. Because they are so prescriptive, they are unsustainable and will collapse of their own weight if not modified.

I have been making this point in various posts, starting with this one early in the year. And the more I talk to people in the industry and follow developments in CIP, the more I am convinced it is true. In fact, I am now working on a book with two co-authors that will not only discuss the problems with NERC CIP but lay out what we see as a solution to those problems (which could be adapted to other critical infrastructure domains, not just electric power). However, it will be late 2017 at the earliest before the book is available, so I’m not going to tell you to wait for it to learn why I made the above assertion; I am going to explain my reasoning here. On the other hand, I’m not going to attempt to write the book (or even a chapter of it) in this post. Here you will see a high-level view of the argument we will make, but not a lot of details to support our case. For those details, you will have to wait for the book.

Here is roughly the argument we will make:

Costs for NERC CIP compliance have ballooned with CIP versions 5 and 6. They are only going to continue to balloon due to CIP v7, CIP-013 (the supply chain standard), and a host of future versions that will be required to address security in many areas so far left unaddressed by the current CIP standards (virtualization, the cloud, phishing attacks, and Distribution systems, just to name a few). I know there are at least a few large NERC entities that easily spent 25-50 times as much on implementing compliance with CIP v5 as they did with v1. And guess what? They’ll have to spend some multiple of that amount to address the new and revised CIP standards that will be coming down the road.
This situation might be justifiable if the bulk of this money (say 90%) were going to improve security. However, I have been conducting a totally informal and unscientific poll of NERC entities, asking them how much of every dollar they spent on implementing CIP v5 compliance actually went to increasing security, as opposed to all of the paperwork, etc. that is required to prove compliance with those highly prescriptive standards. The highest estimate I received was 70%. This itself is appalling, since it says that “only” 30% of what this entity spent on v5 was “wasted” on pure compliance paperwork! The low estimate I heard was about 25%, and I’d say the median was 50% (a couple of other knowledgeable observers also say that 50% is probably the industry average, although I admit there is no way this could be objectively verified. Different individuals will have different opinions on whether a particular dollar spent on CIP went to security or purely to compliance activities with no security benefit).
When you combine these two assertions, that the cost of CIP compliance is ballooning and will continue to do so for a long time, and that about half of those costs are not going to securing the Bulk Electric System, you come out with what I call an Intolerable Situation. Either one by itself might be tolerable, but the two together are intolerable. We can’t let things go on like this, until a significant portion of the US GNP is going to NERC CIP compliance, without any proportionate increase in security. Here are some numbers: I believe that at least $2 billion was spent by the industry on implementing compliance with CIP versions 5 and 6, including security products purchased, staff time, consultant time, etc. Let’s be charitable and say that between 30 and 50% of that went only to compliance costs, not security. This works out to $600 million to $1 billion. This is more than I earn in a year, and is a lot of money to waste.
This isn’t to say that the CIP standards should be “frozen” as they are, to keep costs from ballooning even more than they have already; there is just about universal agreement in the cyber security community that there is a lot more that needs to be addressed in CIP. But it is to say that we need to find out why so many in the CIP community feel that one of every two dollars they spend on compliance is going down the rat hole, and fix that problem, hopefully before new increases in the scope of CIP lead to another doubling or tripling of the amount that utilities have to spend on complying (30-50% of that larger amount will also not go to security).
In other words, what we need to do is find a way to fix the whole CIP compliance regime (which is a lot more than just rewriting the standards) so that a much greater percentage of every dollar spent on implementing and maintaining compliance actually goes to security. To go back to the previous illustration, if 80% of CIP v5-v6 spending went to security rather than 50-70%, this would be an effective increase of $200 to $600 million in industry spending on cyber security – without NERC entities being required to spend an additional dollar. If we could get to 90% (which I believe is possible), that would be an increase of $400 to $800 million.
But as I’ve just said, the scope of CIP is going to keep increasing, and entities will still end up spending more money, even if CIP is rewritten. However, if CIP is rewritten it will make the increased spending much more palatable. If entities feel that most of what they spend on CIP will actually increase their security and that of the BES, they are much more likely to support such scope changes than they are now, when they know that close to half of the increased cost will simply be wasted, from the point of view of promoting cyber security.
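
The dollar figures in the points above are simple arithmetic, and a short sketch makes them easy to check. The $2 billion total and the percentage ranges are the rough estimates given in this post, not measured data; the function and variable names are purely illustrative.

```python
# Rough estimate from the post: industry spend on CIP v5/v6 compliance.
TOTAL_SPEND = 2_000_000_000


def compliance_only_cost(total, security_pct):
    """Dollars that went to pure compliance work, given the % that went to security."""
    return total * (100 - security_pct) // 100


def security_gain(total, current_pct, target_pct):
    """Extra dollars reaching security if the security share rises to target_pct."""
    return total * (target_pct - current_pct) // 100


# "Charitable" range from the post: 50% to 70% of spending went to security.
for current in (70, 50):
    wasted = compliance_only_cost(TOTAL_SPEND, current)
    gain_80 = security_gain(TOTAL_SPEND, current, 80)
    gain_90 = security_gain(TOTAL_SPEND, current, 90)
    print(f"security share {current}%: compliance-only ${wasted:,}, "
          f"gain at 80% ${gain_80:,}, gain at 90% ${gain_90:,}")
```

The point of the sketch is only that the gains come from shifting the share of existing spending, with the total held constant.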

Why do I think we have this problem? It is because the CIP standards are prescriptive to a fault (I do want to emphasize the disclaimer found on all of my posts: that any opinions expressed in this blog – and certainly in this post – are mine alone, not those of my employer or for that matter any other entity or person). And why are they prescriptive? Because of the idea – which is still prevalent in NERC circles – that the only way to write enforceable mandatory requirements is to make them prescriptive; prescriptive standards lead to measurable outcomes, so that in theory it should be completely clear whether or not the entity has complied (no auditor discretion allowed or required). But this prescriptivism leads to NERC entities spending large amounts of money on activities that don’t increase security but do help them avoid getting cited for non-compliance with the prescriptive requirements. Even more importantly, prescriptive standards greatly inhibit NERC’s ability to expand CIP’s scope to address new domains like the cloud, without engaging in a painful, expensive, years-long standards development process.

I will first illustrate this point with the Tale of Two Requirements: CIP-007-6 R2 and R3. R2 (Security Patch Management) is the bad guy here. For example, take the requirement part (R2.2) that says the entity needs to, every 35 days, determine whether there are new security patches available from the vendor of every piece of software installed on even one BES Cyber System or Protected Cyber Asset, and evaluate whether the patch applies to their systems or not. It doesn’t matter if the software is installed on one or 1,000 systems. It doesn’t matter how important the software is to the entity's operations. It doesn’t matter whether the vendor has never released a security patch and probably never will. This has to be done every 35 days, for every piece of software on a BCS or PCA.

From the point of view of measurability, this requirement is a champ. It is in theory very easy for an auditor to review the entity’s documentation to determine whether 35 days or less elapsed between the patch release date and the date the evaluation was done. However, NERC entities – especially larger ones – are finding this a tremendously burdensome requirement to comply with, due in part to the difficulty of contacting loads of smaller software vendors once a month to see if they have made any security patches available. But they have to do this, and they have to do it within 35 days (that is of course just the first step in complying with CIP-007 R2. There are a couple of other specific deadlines after that). And most important of all, they have to have good documentation that they have done this, for every software product for every month – and have it readily retrievable when the auditor asks to see the evidence that all security patches were identified and evaluated for these particular systems in this particular month.
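
To make “measurable” concrete, here is a minimal sketch of the kind of black-or-white check that a 35-day deadline reduces to. It assumes, purely for illustration, that the evidence for one software product is a list of dates on which patch evaluations were performed, and it flags any gap of more than 35 days between consecutive evaluations. The data layout, the dates, and the function name are hypothetical; this is not how any auditor or compliance tool actually works.

```python
from datetime import date

MAX_GAP_DAYS = 35  # the deadline discussed above


def gaps_over_limit(evaluation_dates, max_gap_days=MAX_GAP_DAYS):
    """Return (earlier, later) pairs of consecutive evaluations more than max_gap_days apart."""
    ordered = sorted(evaluation_dates)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if (later - earlier).days > max_gap_days
    ]


# Hypothetical evidence for a single software product.
evidence = [date(2016, 7, 1), date(2016, 8, 3), date(2016, 9, 12)]
violations = gaps_over_limit(evidence)
for earlier, later in violations:
    print(f"gap of {(later - earlier).days} days between {earlier} and {later}")
```

Notice what the check does not look at: how important the software is, how many systems run it, or whether the vendor has ever released a patch. That is the sense in which the requirement is easy to measure but blind to risk.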

But where does the 35 day deadline come from? Is there some principle of cyber security – or computer science or electrical engineering – that dictates that security patches need to be identified and evaluated within 35 days or all hell will break loose (as conceivably could be the case with other NERC standards, where failure to meet a particular parameter or deadline could perhaps lead to a cascading outage of the BES)? Of course not. This is an arbitrary deadline, chosen simply because the CIP v5 SDT felt they had to choose some deadline in order to – of course! – have something measurable. But entities then have to spend a lot of effort and money designing processes and installing systems to make sure this deadline is never exceeded, for all of the perhaps hundreds of software packages installed within their ESP. Measurability has a high cost, and shouldn’t be invoked just to make audits easier.[i]

Now let’s look at CIP-007-6 R3, Malicious Code Prevention. This is one of two non-prescriptive requirements currently found in the CIP standards (the other is CIP-010-2 R4. Plus, one requirement is in the process of being switched from prescriptive to non-prescriptive by the CIP v7 SDT: CIP-003-7 “R3.1”[ii]). The heart of this requirement is part 3.1, which reads “Deploy method(s) to deter, detect or prevent malicious code.” That’s it. There is no requirement to deploy anti-virus software with no other options (as in the previous CIP versions), no requirement to check for new signatures daily, etc. By the same token, the entity doesn’t have to put in place a lot of procedures and systems to comply with arbitrary deadlines, as well as train and monitor a host of staff members who have lots of other important tasks on their plates. And most importantly, the entity doesn’t have to generate reams of documentation showing that they complied with arbitrary deadlines for say A/V signature deployment every day for every system within the ESP.

I could go on and on regarding this topic (and we will discuss it thoroughly in the book!), but you hopefully get the point. I hold prescriptive requirements responsible for the Intolerable Situation mentioned above. The obligation to meet arbitrary deadlines and other targets – without consideration of the risk posed by a particular system, software package, vendor, etc. – drives a lot of the huge and increasing cost of NERC CIP compliance.

And this situation will just get worse as CIP is expanded to cover more domains, like supply chain and the cloud. I think FERC realizes this, which is why, for the last two major expansions of CIP that they have ordered – substation physical security in CIP-014 and supply chain security in CIP-013 – they have gone out of their way to say they don’t want NERC to take a prescriptive approach in writing the standards. At some point, I predict they will have to order NERC to rewrite all of CIP in a non-prescriptive fashion.

I do want to say now that I don’t blame NERC, FERC, or any individuals employed by or associated with those organizations, for this situation. Nobody set out to achieve this end. In my opinion (again, this will be elaborated and justified in the book), the situation is due totally to the fact that, until CIP-014, which is entirely non-prescriptive, NERC was in the business of writing prescriptive standards. And for their original mission – protecting against single acts of commission or omission that can have catastrophic physical effects on the BES – that is exactly what they should be doing.

But cyber security is different. There is no way an entity can mitigate (or even identify) every cyber vulnerability that affects even a limited group of systems like BCS. But the CIP standards (with the exception of CIP-014 and the two other CIP requirements mentioned above) implicitly assume that this is a goal that can be achieved. And, given the size of the potential fines that NERC entities face for non-compliance with even one requirement in any NERC standard, they have lots of incentive to devote their entire security budgets to CIP compliance, and leave nothing for anything that isn’t required. That they don’t in fact do this – but definitely do spend a lot of money on cyber security beyond what CIP requires – is testament to the fact that these entities truly are concerned with security, not just with compliance.

To sum up my argument, in my personal opinion the perceived need to have “measurable” requirements has led to the situation where a huge – and growing – amount of money is being spent on NERC CIP compliance without anywhere near a proportional increase in security of the electric grid. This situation will only get worse if the entire CIP compliance regime isn’t re-thought and re-written.

What’s the solution? The easy answer is that I and my co-authors will go into that in great detail in the book. The not-so-easy answer is that we are currently still having a lot of discussions, both among ourselves and with others, about this question. The general direction is clear, but that is all that’s clear at the moment. It won’t suffice simply to rewrite all of the current CIP requirements in a non-prescriptive format, as is currently being done with CIP-003-7 “R3.1” (which I described above). The CIP standards need to non-prescriptively address all threats to the cyber security of the cyber assets that run the BES, whether they come in through the IT network (as in the Ukraine attacks), through the cloud, through “machine-to-machine” communications from outside the ESP, etc.

On the other hand, it’s not possible to simply have a CIP standard that says “Make sure you’re cyber secure”. The entities will need to be told the different areas they need to address (patch management, securing virtual systems, etc.), and provided with guidance on best practices for addressing those areas. They will then be audited based on how well they addressed each area; and yes, the auditors will need to be cyber security professionals who can determine how well the entity addressed each area, without having to resort to a checklist with arbitrary boxes labelled “35 days”, etc.

Even more importantly, the CIP framework needs to be able to rapidly incorporate new threats that will pop up in the future that are currently unheard of, without having to go through a multi-year standards development process. So there will need to be some sort of governing body that will regularly meet to review the primary threats to the cyber security of the BES, and add or subtract areas from the list of what the entities need to address (as well as write guidance for new areas). None of these goals can be achieved by simply rewriting the current standards; there has to be a new compliance regime.

But I’m not going to force you to wait for the book to hear about my ideas for what should replace the current CIP standards. In this blog, I will keep you up to date with what I am thinking in this regard. And I’d be very interested in hearing your ideas as well.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] I am not trying to pick on the CIP v5 SDT here! The next paragraph describes a very non-prescriptive requirement that they also wrote. Plus times have changed. I attended a number of the SDT’s meetings (in 2010-12), and I freely admit I never once even thought about raising the issue of prescriptive vs. non-prescriptive requirements. It just wasn’t something I even considered important until about a year ago. Now it’s all I think about.

[ii] I use the quotes because it is technically Section 3.1 of Attachment 1 of CIP-003-7, not requirement R3.1.

Wednesday, October 19, 2016
