Tom Alrich's Blog: June 2016

Tuesday, June 21, 2016

Is NERC CIP Hindering Innovation?

Stating the Problem

The CIP standards are always just a small part of the NERC CIPC (CIP Committee) meetings, which focus on many different initiatives and issues regarding cyber security in the electric sector. Usually, the CIP discussion (always led in the past couple years by Scott Mix or Tobias Whitney) is just a summary of what has been going on lately with the standards, and there is nothing too surprising about what is said. However, Tobias Whitney’s presentation at the most recent CIPC meeting, in St. Louis on June 7, was quite surprising – to me and a number of others. And it was quite welcome as well.

The title of Tobias’ presentation (which hasn’t been posted as of June 17; I will post the link in a comment below when it is available) was something to the effect of “Is CIP Hindering Innovation?” Tobias explained that what he meant by this question was whether there are new technologies that utilities would like to deploy in their OT environments, but which they aren’t deploying for fear they will run afoul of some CIP requirements by doing so.[i] One example of this negative effect is virtualization, as I discussed in this post.

What was probably most surprising about the presentation[ii] was that Tobias wasn’t at all disputing the answer to the question in the title. He started out admitting that CIP can restrain innovation, and the example he used of this was the well-known fact that, under CIP versions 1 through 3, some utilities had held off on deploying routable connectivity to their substations – or in at least a few cases literally ripped it out. They did this because CIP v1-3 provided a “get out of jail free” card in cases where there was no external routable connectivity to an asset; the asset automatically was deemed not to have any Critical Cyber Assets. Since the presence of Critical Cyber Assets, like BES Cyber Systems in CIP v5 and v6, was the sole determinant of whether there was a substantial CIP compliance responsibility at an asset, these utilities decided the benefits of deploying routable connectivity to a substation (such as not having to send a technician to the substation to deploy every new patch) were greatly outweighed by the compliance costs of having CCAs and therefore being subject to all the CIP requirements.

While this is certainly an example of CIP inhibiting innovation, I don’t think it was the right one for what Tobias was trying to say. The rest of his presentation discussed areas where deployment of particular technologies may be hindered because those technologies aren’t addressed at all in CIP currently; this may be leading NERC entities not to want to deploy those technologies, for fear of running afoul of some CIP requirement or other. But the example he used was one where there was no uncertainty at all: If there was no external routable connectivity, there were no CCAs, period. In other words, the CIP v1 – v3 standards as written provided a direct incentive to utilities not to deploy ERC. I’m sure this wasn’t the intent of that provision when it was inserted in CIP v1, but it can’t be said that v1-3 ignored the issue of external routable connectivity. Intentionally or not, it directly discouraged its deployment. But that’s not the problem with the other technologies Tobias was discussing.

Tobias produced a list of six technologies whose deployment may well be hindered because of uncertainty about how CIP would apply to them; and he made it clear there are almost certainly other technologies in the same boat as well. He listed:

The cloud. Of course, the main problem here is that, if CIP is applied literally to data stored in the cloud, every person who has access to any server that might store some of the utility’s CIP-protected data will need to be vetted by that utility. This is almost impossible for cloud providers to guarantee.
Renewables. Tobias said the main issue here is new GOs (Generator Owners) who manage a lot of “behind the meter” renewables (mainly solar panels, of course). He added that some of these GOs have aggregate capacity approaching 1500 MW, which surprised me. Of course, the bright-line criteria, and especially criterion 2.1, were never written with the idea that there could be anything as recondite as behind-the-meter generation.
IEC 61850. Since 61850 uses IP, there isn’t much question that it is a routable protocol; thus, uncertainty isn’t really the problem here - in theory, 61850 communications should be treated just like any other communications. The problem is that 61850 is inherently a real-time protocol and having to apply controls to it, like encryption, might well make it not worth deploying. The definition of Low impact External Routable Connectivity (LERC) in CIP v6 specifically excludes external communications like 61850 and GOOSE from being considered LERC.[iii] However, there are many areas of application of 61850 (external 61850 communications for Medium impact assets, as well as internal 61850 communications for Medium or High impact assets) that are simply left unaddressed in the current CIP standards. This is where the uncertainty comes in.
Virtualization. I wrote about this issue in the post referenced above. However, the v7 SDT is hard at work drafting revisions to the v5/v6 standards and definitions that will take account of virtualization, rather than simply ignoring the technology. So in this one case, it can be said that in a few years CIP will take account of virtualization. That can’t be said for any of the other technologies on this list, of course.
Wireless. I believe what this means is WiFi within the ESP, where the issue is that it is fairly hard to restrain the signal from crossing the PSP, even if the PSP is well outside of the ESP. What CIP requirements will be violated if that happens? Probably a lot.
End of life systems. To be honest, I don’t know what Tobias meant by this, and I don’t think he talked about it in his presentation (it was listed on a slide).

I agree with Tobias that there are other technologies you could add to this list. One that I mentioned, in a question after his presentation, was managed security services (managed firewall, managed authentication, etc). I have been told by one auditor that he knows of no NERC entity in the US, which has Critical Cyber Assets or Medium or High impact BES Cyber Systems, that is using a third-party service to monitor or manage their ESP. In fact, the auditor told me that his region had to tell one entity to terminate their use of such a service and that he did this with a heavy heart – since he believed the entity was more secure with the service than without it.

The reason that managed security services don’t work under the current CIP regime is the same one that I alluded to for cloud services above. If a NERC entity with High and/or Medium impact assets contracts a third party to monitor and/or manage security information from their ESP, then every employee of that third party, who has any physical or logical access to any server that stores the NERC entity’s security information, will have to be vetted by the NERC entity just like any of their own employees who has such access. In practice, this probably means the third party firm will have to segregate servers that house any of the entity’s data in a separate room with its own badge reader; only the employees who have been vetted will have access to that room. And that, of course, is impossible for most third party service providers to implement.

NERC’s Proposed Solution

The above was Tobias’ statement of the problem; the 120 or so people in the room seemed to agree that he had done a good job so far. So what was Tobias’ proposed solution? I regret to say that the agreement in the room seemed to vanish when he discussed that.

At first glance, you might wonder why there was so much disagreement with Tobias’ solution. He proposed something to the effect that the best minds in the NERC community get together and draw up guidelines for – say – using cloud services or WiFi in an ESP. The implication was that, if NERC entities followed those guidelines, they likely would not be judged to have run afoul of the current CIP v5 and v6 requirements. So entities could implement those technologies without fear of suffering the swift hand of judgment by NERC or their region, for violating requirements that make no mention at all of the technology in question in the first place. Many of the people in the room commented, and most of those comments (including mine) were – respectfully – negative.

If you are very new to the NERC CIP world, you might wonder what all the complaints are about. After all, it seemed that Tobias was saying that NERC was willing to be flexible in enforcing the CIP standards, so as not to prevent adoption of any of these new technologies. In other words, the rules would be bent in some way, so that entities could implement these technologies. What’s so bad about that?

What’s so bad about this – in my opinion - is that we’ve heard this song before. That is, NERC was previously faced with serious questions regarding how the CIP standards should be interpreted, and they promised to resolve them in creative ways. Yet after almost two years of trying different approaches to do this, they had to admit there was no way this could be done, other than the two methods that are currently “allowed” by the NERC Rules of Procedure: Requests for Interpretation (RFIs) and Standards Authorization Requests (SARs). Both of these methods literally take years to produce results. Meanwhile, NERC entities are left in exactly the same uncertainty that they were always in, before NERC tried to improve the situation.

To be specific, in early 2014 many NERC entities started spending some quality time trying to understand the CIP v5 standards, since FERC had just approved them in November 2013. As they did this, they started having lots of questions about what the requirements meant.[iv] The question came up of how NERC could provide guidance on these questions – given that the only two “official” ways to do that were RFIs and SARs. NERC’s answer to this question was very clear: “Don’t worry, we’ll take care of it!”

NERC talked about various ways in which they could provide guidance. One of the first was to embed guidance in the RSAWs; after one try at this it became clear this approach wouldn’t work. Another approach was the CIP v5 Implementation Study. But when the study came out, it was clear that – while it did provide a lot of good ideas for how to implement CIP v5 compliance – it didn’t (indeed, it couldn’t) address any interpretation questions.

Probably NERC’s most promising idea for providing guidance was the Lessons Learned. They drafted a large number of these, several of which addressed difficult interpretation issues, including “Programmable” and virtualization. But in the end, only ten LLs were finalized. As with the Implementation Study, these ten documents provided very good tips on how to implement compliance, but none of them provided guidance on interpretation questions. All of the draft LLs that did address such questions (along with the late, unlamented Memoranda) were withdrawn.

These interpretation questions were then turned over to the new Standards Drafting Team (as described in this white paper from the NERC CIP v5 Transition Advisory Group). Some of these questions[v] will be resolved in the next CIP version, but that is unlikely to be in effect before 2018 at the very earliest. So it turns out that there was no other way for NERC to provide CIP v5 guidance, except through RFIs and SARs. Meanwhile, NERC entities are still on their own as far as interpretation of these v5 issues goes – just as they were in early 2014.

So I hope that I – as well as others in the room at the CIPC meeting – can be excused for our skepticism when Tobias once again told us that the problem of CIP hindering innovation could be addressed by some sort of creative thinking on NERC’s part. While Tobias’ (and NERC’s) intention is certainly good, there is simply no way that NERC entities can ever feel completely safe in implementing these new technologies until they are actually directly incorporated into the standards (i.e. the SAR process), even though that will take years. Trying to once again figure out a way to circumvent this long process is bound to fail.

The Real Solution

So what is the real solution to this problem of CIP hindering deployment of new technologies? I can see two options. The first is to simply go through the SAR process for each of these areas. That is, NERC should start working today (well, tomorrow is OK) on SARs for every technology discussed above (except virtualization, since that has already been included in a SAR). For each of these technologies, they will have to appoint a Standards Drafting Team (they certainly don’t want to dump all this on the current SDT![vi]). Some of these SARs might be combined, so that each one wouldn’t require its own SDT.

Of course, it will probably take six months for each new SDT to be appointed and have their first meeting. Then it will take at least 6-9 months for them to come up with their first draft. That will then be submitted to a NERC ballot, and – if the past is any guide – will almost certainly fail to pass. So a new draft will be developed in say three or four months, submitted for a new ballot – and probably also voted down. After at least 3-4 ballots, there will be a final version of the new (or revised) standards. This will be submitted to FERC and approved by them in probably no less than six months. So each of these new technologies will take at a bare minimum three years to be actually “built in” to the CIP standards. At that point, NERC entities can safely implement them.
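
To make the arithmetic above concrete, here is a rough sketch that adds up only the durations quoted in this paragraph. Two things are assumptions of the sketch rather than anything NERC has stated: that N ballots imply N - 1 redrafts, and that each redraft takes the "three or four months" mentioned above. The ballot and comment periods themselves, and approval by the NERC Board of Trustees, are not itemized, so the real total will run longer than what this prints.

```python
# Durations in months, taken from the estimates in the paragraph above.
# Each entry is (low, high); steps the text does not quantify are left out.
APPOINT_SDT = (6, 6)
FIRST_DRAFT = (6, 9)
REDRAFT = (3, 4)          # each redraft after a failed ballot
BALLOTS = (3, 4)          # "at least 3-4 ballots"
FERC_APPROVAL = (6, 6)    # "no less than six months"


def timeline_months():
    """Return (low, high) total months for the itemized steps only."""
    # Assumption: the first ballot follows the first draft,
    # so N ballots imply N - 1 redrafts.
    redrafts_low = (BALLOTS[0] - 1) * REDRAFT[0]
    redrafts_high = (BALLOTS[1] - 1) * REDRAFT[1]
    low = APPOINT_SDT[0] + FIRST_DRAFT[0] + redrafts_low + FERC_APPROVAL[0]
    high = APPOINT_SDT[1] + FIRST_DRAFT[1] + redrafts_high + FERC_APPROVAL[1]
    return low, high


low, high = timeline_months()
print(f"Itemized steps alone: {low} to {high} months")
```

The itemized steps by themselves come to roughly two to three years; the steps left out of the sum are what push the figure to three years and beyond.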

Incorporating each of these new technologies will require an expansion of the CIP standards, so that they may be - say - twice as large once all of the above technologies are accounted for.

And of course, as new technologies come around that NERC entities would like to take advantage of, the same process will need to be repeated for each one. So there will be an ongoing need for CIP drafting team members! In fact, I can see this becoming a career option in itself, with courses being taught in colleges on NERC CIP SDT Membership. Is this great or what? Meanwhile, the NERC CIP standards will approach the length of the Bible, and some utilities will start dedicating a majority of their staff to CIP compliance!

I don’t know about you, but I find Option 1 to be profoundly depressing. It will provide ironclad assurance that the electric utility industry will always be behind all other industries in taking advantage of new technologies (how long ago did virtualization first appear? And how long will it be before CIP v7 finally makes it safe to implement in your ESP? My guess is there is at least a 15 year gap between the two dates). And it will also assure that an ever-increasing share of utilities' budgets will go to CIP compliance. What’s Option 2?

Option 2 is something I have been talking about for a while, and I can promise you there’s a lot more to come. In fact, I and two co-authors are starting work on a book on this topic. The fact is, I believe NERC CIP is now on a completely unsustainable course. In the post just referenced, I focus on the idea that it is economically unsustainable, and what I’ve just discussed simply reinforces that. CIP compliance costs mushroomed under CIP v5, and they will continue to expand as CIP is expanded to include new areas (virtualization, supply chain, the cloud, etc). Further expanding CIP to accommodate each of the technologies I’ve been discussing will simply assure that expansion will continue, until total NERC CIP compliance costs approach the level of the national GDP.

And what I’ve just been writing about adds a whole new level of unsustainability to CIP. Not only are the direct costs of CIP compliance going to continue to expand rapidly as time goes on, but the indirect costs are as well. These indirect costs include the “costs” that utilities incur by not being able to take advantage of new technologies (in their ESPs) like virtualization and the cloud. I heard today of a major utility whose control center server room used to be chock full of boxes in every nook and cranny. Now all those boxes have been compressed into less than three racks, saving lots of money as well as greatly simplifying server administration. Any utility that isn’t doing virtualization in their control centers is probably forgoing (proportionately) similar cost savings. And just think of the savings a utility might reap if there were some way to use cloud services within ESPs (currently I believe that is impossible under CIP, although I’d love to hear if you are doing it and are comfortable with the compliance aspect).

Why are the CIP standards so unsustainable and so unfriendly to innovation? It is because they are prescriptive. Prescriptive security standards address a certain pre-determined set of threats and are based on a pre-determined set of technologies (and if you think that CIP is becoming less prescriptive, please read this post). That would be fine if both the threats and the technologies didn’t change, but that simply isn’t the case; in fact, one could make the case that these are both changing faster than ever.

What is needed are non-prescriptive standards – of which a good example is CIP-014. With non-prescriptive standards (I used to call these “risk-based”, but my co-authors and I have decided that isn’t the right term; we’re currently trying to figure out what the right term is, perhaps "threat-based"), regular assessments identify the threats the utility faces, as well as the technologies in place that need protection; the outcome of these assessments yields the set of steps the utility must take to address cyber security. The threats and technologies in scope will be as up-to-date as the last assessment; if a utility decides it needs to deploy cloud technologies, it will get a new assessment that addresses the threats appropriate to the cloud. Most importantly, the security measures the utility takes will be to address the threats it faces, as well as the technologies it employs – not a one-size-fits-all set of measures that applies to all utilities in North America.

This is why, while I support the current SDT’s efforts to draft CIP v7 and I am enjoying participating in some of their meetings, I don’t want to see any further prescriptive versions of CIP, whether to address the cloud, wireless, renewables, or whatever. The next CIP version needs to be non-prescriptive. Otherwise, CIP will become the Monster that ate the North American Electric Utility Industry.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] “Innovation” is probably not the correct word here. I think of innovation as what led to these technologies being developed in the first place. A more appropriate title might have been “Is CIP Hindering Deployment of New Technologies?”

[ii] I believe it will be posted by NERC, but I don’t think it is up yet. When it is posted, I will publish a link to it in the comment section below.

[iii] FERC ordered this definition be rewritten in Order 822, although their objections had nothing to do with the wording about 61850. The v7 SDT is now feverishly working on drafting the new definition, and I am finding the online discussions to be very interesting.

[iv] In addition, there was at least one blogger who was inflaming the situation by himself asking a lot of questions about what the standards meant, for example in this post.

[v] But certainly not all of them! See this post.

[vi] NERC has said that, once FERC issues their long-awaited Order to develop supply chain security standards, a new SDT will be appointed for that task. This is partly because the current team already has enough on their plate, and also because there will be somewhat different skills needed for this new task.

Thursday, June 16, 2016

Lew Folkerth’s Last-Minute Checklist

In case any of you aren’t sure, the initial compliance date for CIP v5/v6 is 15 days from today, July 1. Entities with High, Medium and Low impact assets all have compliance obligations on that day. And the ever-vigilant Lew Folkerth of RF has just put out a last-minute “checklist” listing some areas where he thinks entities may be deficient. You can find it in RF’s most recent newsletter in Lew’s usual column, “The Lighthouse”.

I won’t try to summarize the article, but here are a few high points:

Since he works for a Regional Entity, he emphasizes the evidence you will need to have starting July 1.
There is a sidebar listing five things that Lows have to do by that date.
There is only one point on which I disagree with him. It is found in the second column of page 11, starting with “Ensure your reliability…” He says “In implementing a compliance program, it is important that we not only obey the letter of the Standard, but that we achieve the intent of the Standard as well.”

Regarding this last point, I have previously written about the fallacy of believing that it would ever be possible to determine the intent of one of the CIP standards. Thinking that you will be able to justify a particular compliance action you have taken (or not taken) based on some “knowledge” you may have of what the Standards Drafting Team “intended” a requirement to mean is a fool’s errand.

On the other hand, I know well that Lew doesn’t mean that. I believe what he is saying here is that entities always need to be looking beyond the CIP requirements to what is really required from a pure cyber security point of view. And that the auditors won’t be just looking strictly at whether you have complied with the letter of the requirement, but whether you have taken the steps you should reasonably take for cyber security, even if they aren’t mandated by the CIP standards.

However, it is important to keep in mind that, when you’re audited, you won’t get a PV for not going beyond the CIP standards. You may get a Recommendation that you do that, but it can never become a PV, even if you choose not to follow the Recommendation. For a more in depth discussion of this point, see Lew’s discussion in this post.


Sunday, June 12, 2016

LERCing in the Shadows

At the SPP and WECC meetings I attended recently, one of the biggest concerns was the meaning of LERC (Low impact External Routable Connectivity). I of course don’t have the answer to that question – and neither did the SPP or WECC auditors – but I’d like to at least get all the facts (that I know of) about this question on the table:

FERC ordered NERC to rewrite the definition of LERC in Order 822; they gave NERC until March of 2017 to do this. Rewriting this definition is now the number one concern of the new Order 822 Standards Drafting Team, since this is the only item on their agenda that has a deadline (although they have an aggressive schedule for all of their work, aiming to finish a first draft of the new standards, that addresses everything on their plate, by the end of this year. I’m somewhat skeptical they can do that, but I wish them well!). They want to have the draft of the LERC definition finished in July of this year, since it will of course have to go through (probably) multiple ballots before it can be approved by the NERC Board of Trustees and sent to FERC.
The question of what constitutes LERC is almost identical to the question of what constitutes ERC (External Routable Connectivity). When there is an answer to the question of what LERC is, the question of what constitutes ERC will (with some small modifications) also be answered. Since the meaning of ERC is probably an even bigger issue than that of LERC, I was at first concerned that the SDT would address LERC (because they have to) but not ERC. However, I asked this question at the NERC CIPC meeting in St. Louis last week and Scott Mix replied that, while the SDT has to address LERC first because of the FERC deadline, they will address ERC as well, as part of their first draft of the revised CIP standards.[i]
Of course, since both LERC and ERC are on the SDT’s agenda, there are now no “right” or “wrong” answers to the question of what these terms mean. However, when the first draft of the LERC definition is posted (and balloted) this summer, NERC entities will at least have something substantive to look over – and, as I pointed out in my post on the WECC meeting linked above, WECC (and perhaps other regions) is recommending that entities look to the SDT’s work as providing at least a good clue about what may be coming.[ii]
The ERC and LERC questions come down to this: When there is routable communication from a control center to an asset such as a transmission substation or generating station, and some sort of intermediate device, located between the external communication source and one or more BES Cyber Assets located at the asset, does something to the communication stream - such as proxying it and/or converting it to a serial protocol - is there still LERC or ERC, or not? More to the point, in what cases do the routable communications get “broken” by the intermediate device, and in what cases can it be said that LERC or ERC still exists, despite whatever the intermediate device does? I have never heard of any other case in which there is a serious question whether there is LERC or ERC, other than when there is such an intermediate device.[iii] Of course, in the case where no device intervenes in the communication stream, there should be no question about whether there is or isn’t ERC or LERC. If the stream is entirely routable, there is LERC. If it isn't routable, there isn't LERC.
The ERC question first came to my attention as being a serious one in late 2014. This post was the first one I wrote on the topic, and it was quickly followed by at least four more. I then returned to the topic the next year, concluding with these three posts: here, here and here.[iv] After the last post, I concluded that ERC (and by implication LERC) is a black hole. With each of my previous posts, I had breathlessly reported something someone said that seemed to me to be the defining word on ERC; inevitably, I would be back a week or two later with the news that the question was more subtle than I’d realized, but a new pronouncement I’d just received was surely the final word. I finally realized that this process would go on forever; there seems to be no end to the subtleties involved in the concept of ERC. So I stopped writing these posts.
In the last of these posts, I concluded that the best way to “define” ERC (and LERC) would be as a series of use cases. Purely as an example, a use case could be: When the intermediate device does A to the data stream (for example, the device requires authentication), ERC/LERC is broken; when it only does B (for example, merely converts routable communications to a serial protocol), ERC/LERC is not broken. That remains my advice to the SDT today: You will never come up with a pure dictionary-style definition. And if you do, it will be technical enough that it will require someone to have an EE and a PhD in data communications to understand it. Of course, this doesn’t bode well for implementing the definition in the real world, since neither the auditors nor the entities would have either the education or inclination to devote the time required to understand the definition so that they could easily apply it. It is much better to have use cases that can be easily applied to particular situations.
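
As a minimal sketch of what a use-case approach could look like in practice, the snippet below encodes the cases discussed in the list above as a lookup table. Everything in it is hypothetical: the action names, the `CommunicationPath` type, and the rule that a single "breaking" action is enough are assumptions made for illustration, and the two table entries simply mirror the "purely as an example" cases A and B. None of this comes from an approved or draft definition.

```python
from dataclasses import dataclass, field

# Hypothetical use-case table: maps what an intermediate device does to the
# stream onto whether that action "breaks" routable connectivity. The two
# entries mirror the purely illustrative examples in the text; they are not
# taken from any approved or draft definition.
USE_CASES = {
    "requires_authentication": True,   # example A: connectivity is broken
    "converts_to_serial": False,       # example B: connectivity is not broken
}


@dataclass
class CommunicationPath:
    routable: bool                     # is the external stream routable?
    device_actions: list = field(default_factory=list)


def connectivity_exists(path):
    """Return True/False for ERC/LERC, or None when no use case covers the path."""
    if not path.routable:
        return False                   # non-routable stream: no ERC/LERC
    if not path.device_actions:
        return True                    # routable, no intermediate device
    unknown = [a for a in path.device_actions if a not in USE_CASES]
    if unknown:
        return None                    # undecided: a new use case is needed
    # Assumption: any single "breaking" action is enough to break connectivity.
    return not any(USE_CASES[a] for a in path.device_actions)
```

The `None` result is the point of the design: when a device does something the table doesn't cover, the answer is "write a new use case", not a guess derived from a dictionary-style definition.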


[i] At the CIPC meeting, I engaged in a little hyperbole when I suggested in my question to Scott that it would take only 15 minutes to convert the LERC definition to a definition of ERC. Dave Revill of Georgia Transmission – who is on the CIP v7 SDT, as he was on the v5 and v6 ones – pointed out that it won’t be quite that simple, since the ERC definition will need to take account of an ESP, which is not a factor for LERC. Of course, that’s correct. It still should be a much easier job to convert the definition than just about anything else on this SDT’s agenda!

[ii] On the other hand, a draft standard or definition is certainly not mandatory in any way. If you have a good reason for disagreeing with the draft definition of LERC, and you can’t wait until the definition is finalized early next year because you need to start your Low impact work now, you should go ahead and document your definition, as well as the reasoning that led to it. Even if the definition that is ultimately approved by FERC differs from yours, there is no way, in my opinion, that you could ever be held in violation – even four or five years from now – because the final definition wasn’t available when you needed it. You can’t simply put your compliance effort on hold for this.

[iii] And please don’t think that, when I say “intermediate device”, I’m in any way referring to the NERC defined term “Intermediate System” (which is a term that only applies in High or Medium impact environments, of course). A device that functioned like an IS could in some cases be what I’m calling an intermediate device, but they aren’t equivalent concepts. Mine is much more general: it just means some device that sits in the communication path and makes some sort of change to the communications. That’s all.

[iv] I believe there were at least one or two posts in between these two groups of posts, but who has time to go through all those posts, anyway? The author of this blog obviously loves to talk!

Monday, June 6, 2016

A Conversation with Lew Folkerth on CIP-002-5 R1

I have been writing about problems with CIP-002-5.1 R1 and Attachment 1 for more than three years, starting with this post in April 2013; I’m sure I have written more than 50 posts on various aspects of this topic. I recently came indirectly to have a discussion on this problem with Lew Folkerth of RF. I had just written a post discussing the two main ways I’m using the term “enforceable” regarding CIP v5 and v6 and how I suspect that, in what I call the strict sense of the term, many of the requirements are in fact unenforceable. The day after the post appeared, Lew emailed me with a question on what I had said.

My answer led to a back-and-forth discussion that I think will be interesting to the three or four of you who also have nothing better to do than to ponder the subtleties of the wording of CIP-002-5 R1 - the most important requirement in CIP v5 and v6, and IMHO the most poorly written. In this post, I will reproduce my conversation with Lew almost in full (removing only some extraneous comments and adding a few clarifying phrases to my statements; I have left Lew’s as he wrote them). I will add comments, which will all be in italics.

Lew’s Original Email:

This is where you lose me (referring to my post):

“For an example of this, consider the fact that CIP-002-5.1 R1 never requires the entity to document a methodology for identifying BES Cyber Systems. In fact, it never requires the entity to identify BCS in the first place, although that requirement is certainly implied by the fact that all the other requirements apply to BCS.”

CIP-002-5.1 R1 Part 1.1 states, “Identify each of the high impact BES Cyber Systems according to Attachment 1, Section 1, if any, at each asset.”

What am I missing?

My Reply:

Good point, Lew! The short answer is the word “Identify” in R1.1-1.3 really means “classify”. You’re sent to Attachment 1 to do your BCS identification, but Attachment 1 just assumes you’ve already identified your BCS and you now need to classify them. Of course, R1 (or Attachment 1) itself gives zero guidance on identification (you have the discussion of the BROS in the Guidance, but there is nothing in R1 itself that would lead you to believe this has any applicability to the BCS identification process); this is why entities need to develop their own methodology for identification and classification of BCS.

(Tom’s later note) I believe there are two basic approaches to BCS identification. One is the “bottom up” approach, where the entity identifies Cyber Assets using that definition, then BES Cyber Assets according to that definition; finally, they aggregate BCAs into BCS (or just call each BCA a BCS, as I know some entities have done, perhaps out of an over-abundance of caution). The other is the “top-down” approach, where the entity uses the list of BES Reliability Operating Services performed by the asset in question to identify the BCS that fulfill those BROS. For a description of these two approaches (and how they can – and should – be combined), see this post from early 2015.
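To make the contrast between the two approaches concrete, here is a minimal Python sketch. Everything in it is a hypothetical illustration rather than anything taken from the standard or the Guidance: the `Device` fields, the device names, the placeholder service tags "bros-A" and "bros-B", and the simplification that each BCA becomes its own BCS in the bottom-up case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Device:
    name: str
    programmable: bool             # stand-in for the Cyber Asset definition
    meets_bca_definition: bool     # stand-in for the BES Cyber Asset definition
    bros: frozenset = frozenset()  # placeholder tags for the BROS it helps fulfill


def bottom_up(devices):
    """Cyber Assets -> BES Cyber Assets -> BCS (here, one BCS per BCA)."""
    cyber_assets = [d for d in devices if d.programmable]
    bcas = [d for d in cyber_assets if d.meets_bca_definition]
    return {d.name: [d.name] for d in bcas}


def top_down(devices, bros_at_asset):
    """Start from the BROS the asset performs and group the devices behind each."""
    systems = {}
    for service in bros_at_asset:
        members = [d.name for d in devices if service in d.bros]
        if members:
            systems[service] = members
    return systems


def combined(devices, bros_at_asset):
    """Top-down grouping, with a bottom-up cross-check for anything it missed."""
    systems = top_down(devices, bros_at_asset)
    covered = {name for members in systems.values() for name in members}
    missed = sorted(set(bottom_up(devices)) - covered)
    return systems, missed


# Invented inventory for a single asset.
inventory = [
    Device("relay-1", True, True, frozenset({"bros-A"})),
    Device("rtu-1", True, True, frozenset({"bros-A", "bros-B"})),
    Device("hmi-1", True, True),       # a BCA that no service list pointed to
    Device("printer-1", True, False),  # programmable, but not a BCA
    Device("breaker-1", False, False), # not programmable, so not a Cyber Asset
]

systems, missed = combined(inventory, ["bros-A", "bros-B"])
print(systems)
print(missed)
```

In this toy inventory the top-down pass groups the relay and RTU by service, and the bottom-up cross-check flags the one BCA that the service list never reached. That is the sense in which the two approaches complement each other.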

Lew’s Response:

CIP-002-5.1 R1 Part 1.1 requires an entity to identify each of the high impact BES Cyber Systems, if any, at each asset. The phrase “according to Attachment 1, Section 1” modifies the imperative verb “identify.” Think of it like this:

a) Identify:

i. What: each of the high impact BES Cyber Systems, if any;

ii. How: according to Attachment 1 Section 1;

iii. Where: at each asset.

• You are correct in saying that by the time you classify a BES Cyber System, you should have already identified it. You should do this based on the Glossary definition. You need to show that each BES Cyber System is one or more BES Cyber Assets, that the BES Cyber Assets are logically grouped, and that the logical grouping of BES Cyber Assets performs one or more reliability tasks.

• While the Requirement does not discuss identifying BES Cyber Assets explicitly, this identification is strongly implied by the cascading Glossary definitions, and by the Purpose section of CIP-002-5.1: “To identify and categorize BES Cyber Systems and their associated BES Cyber Assets.”

My Response

Thanks, Lew. I’ll agree with you that the SDT combined “identify” and “classify” into “identify” in R1.1-1.2. But let’s think through the actual steps that are literally implied by this. As far as I can see, the only interpretation of R1 and Attachment 1 that complies with the wording is the following – and I don’t think any NERC entity would be happy following these steps, nor would any auditors be comfortable auditing them.

1. The list of six asset types in R1 is like a FORTRAN DO loop[i]. “For each of the following asset types, do 1.1-1.3.” Alternatively, you can interpret this as saying “For each of R1.1 to 1.3, iterate through each of the six asset types.”

2. You start on 1.1. To comply with this, you have to first go to every control center you own, then every transmission substation, every plant, etc. For each of these, you first have to identify every BCS (using the nested definitions, as you said. Of course the Guidance talks about the BROS, but there is nothing in R1 to lead you to believe that you could actually use the BROS to identify BCS. So you have to assume you need to use what I call the bottom-up approach, starting with the definitions).

3. At every one of the six asset types (regardless of classification, because as you know assets officially aren’t classified in CIP v5!), you will have to identify every Cyber Asset (assuming you know what “programmable” means), determine whether it’s a BCA, then group those into BCS. Then you have to run each of those BCS through the criteria in Section 1 of Attachment 1, to see if it’s a High.

4. I emphasize you have to do this for every BES asset you own – identify all Cyber Assets, then BES Cyber Assets, and finally BES Cyber Systems. It doesn’t matter whether the asset will eventually turn out to be High, Medium or Low impact. There is nothing in R1 or Attachment 1 that tells you that R1.1 applies just to High impact assets (and, of course, there is no such thing as High, Medium or Low impact assets, again according to the literal wording of R1 and Attachment 1). So an entity with 500 Transmission substations and 100 generating plants has to do this for all 600 of those assets, before it can determine what its High impact BCS are. Obviously, this is a huge task, but strictly following the wording leaves you no other choice.

5. You have to do the same process for 1.2, except that you don’t have to redo the BCS identification step. You do have to run each BCS, that wasn’t already classified as High impact, through the Medium criteria, to identify Medium impact BCS.

6. So far, you’ve had to identify BCS at all BES assets, then identify the ones that are High impact under R1.1 and Medium impact under R1.2. You are now left with a list of Low BCS; Low BCS are simply all the BCS that aren’t High or Medium. Of course, this conflicts with the statement that a list of Low BCS isn’t required. But you will inevitably develop a complete list of your Low BCS by going through the above process. That is, if you’re inclined to comply with the literal wording of R1 and Attachment 1.

7. You now go to R1.3, which sends you to Section 3 of Attachment 1. This says you’re to identify “BES Cyber Systems not included in Sections 1 or 2 above…” The clear implication of this phrase is that you started out Attachment 1 with your complete list of BCS at all assets, then subtracted out the Highs and Mediums, leaving your list of Low BCS. How could you possibly literally comply with the wording without doing that?

8. Of course, R1.3 itself says you’re supposed to identify each “asset containing a Low BCS”, and as we all know, this doesn’t require literally identifying all BCS at Low assets. In practice, what everyone I know is doing to comply with this requirement part is starting with their total list of BES assets (ones that correspond to one of the six asset types in R1), removing the High and Medium impact ones, and considering the rest as Lows.[ii] Then, they identify BCS at the High assets and consider them High BCS. Finally, they identify BCS at the Medium assets and consider them Medium BCS. At the Low assets, they don't do anything more than list them, complying with R1.3. But this ignores the fact that the phrases “High impact asset” and “Medium impact asset” have no explicit meaning in R1, since assets themselves are never officially classified. The criteria in Attachment 1 are for BES Cyber Systems only, although they do refer to assets and thus are widely believed to be classifying the assets themselves.
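For readers who think in loops, the literal reading in steps 1 through 7 can be sketched in Python. This only illustrates the control flow. The callables `identify_bcs`, `meets_section_1` and `meets_section_2` are hypothetical stand-ins for the bottom-up identification and for the Attachment 1 criteria, and the asset and system names are invented.

```python
def classify_literally(assets, identify_bcs, meets_section_1, meets_section_2):
    """Literal reading: inventory BCS at every BES asset, then sort them."""
    # Steps 2-4: identify BCS at every asset, whatever its eventual impact level.
    all_bcs = [(asset, bcs) for asset in assets for bcs in identify_bcs(asset)]

    # R1.1: run every BCS through the Section 1 (High) criteria.
    high = [pair for pair in all_bcs if meets_section_1(*pair)]

    # Step 5 / R1.2: run the remainder through the Section 2 (Medium) criteria.
    medium = [p for p in all_bcs if p not in high and meets_section_2(*p)]

    # Steps 6-7: whatever is left is Low, so a full Low BCS list falls out anyway.
    low = [p for p in all_bcs if p not in high and p not in medium]

    # R1.3 only asks for the assets that contain a Low impact BCS.
    low_assets = sorted({asset for asset, _ in low})
    return high, medium, low, low_assets


# Invented example data.
inventory = {
    "control-center": ["ems"],
    "substation-1": ["relay-group", "rtu"],
    "plant-1": ["dcs"],
}
section_1 = {("control-center", "ems")}
section_2 = {("substation-1", "relay-group")}

high, medium, low, low_assets = classify_literally(
    assets=list(inventory),
    identify_bcs=lambda asset: inventory[asset],
    meets_section_1=lambda asset, bcs: (asset, bcs) in section_1,
    meets_section_2=lambda asset, bcs: (asset, bcs) in section_2,
)
print(high, medium, low, low_assets)
```

Note that `identify_bcs` is called once for every asset in the list before anything is classified. With 600 assets, that call is the expensive site visit described in step 4.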

So strictly speaking, you’re right about “identification” being required by R1.1 to 1.3 (although the implied identification process, based on the Cyber Asset and BCA definitions only, constrains the entity to completely ignore what I call the top-down, BROS-based approach), since that is the word used. But those requirement parts also require classification of BCS, since there is no other place in R1 that you are called on to do that.[iii]

Of course, as I’ve said repeatedly in my blog, most of this doesn't cause problems in real life, since entities (and almost all auditors) are treating the v5 process as just a variation of CIP-002-3 R1-3. In other words, you first classify the assets as High, Medium or Low impact (in v3, there were just two asset classifications: Critical and non-Critical; but it’s the same idea). Then you identify your BCS at those assets (and BCS are somewhat equivalent to Critical Cyber Assets in v3). This process works fairly well, and it is how - consciously or unconsciously - almost all entities identified and classified their BCS; it is also, I believe, how most regional auditors would describe the process of BCS identification and classification. So it's great that there is widespread agreement on this; but it's not so great that this process doesn't correspond at all with the wording of R1 and Attachment 1!
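The asset-first process that entities actually follow looks quite different when written out. The Python sketch below is an illustration under stated assumptions: the asset-level impact ratings are taken as given (even though, as noted, R1 never officially classifies assets), `identify_bcs` is a hypothetical stand-in for whatever identification method the entity uses, and all names and data are invented.

```python
def classify_asset_first(asset_impact, identify_bcs):
    """Classify assets first, then identify BCS only at Highs and Mediums."""
    bcs_by_impact = {"High": [], "Medium": []}
    low_assets = []
    for asset, impact in asset_impact.items():
        if impact == "Low":
            # Low assets are simply listed; no BCS inventory is taken there.
            low_assets.append(asset)
        else:
            # Every BCS found at the asset inherits the asset's impact level.
            bcs_by_impact[impact].extend((asset, b) for b in identify_bcs(asset))
    return bcs_by_impact, low_assets


# Invented example data.
inventory = {
    "control-center": ["ems"],
    "substation-1": ["relay-group", "rtu"],
    "plant-1": ["dcs"],
}
asset_impact = {
    "control-center": "High",
    "substation-1": "Medium",
    "plant-1": "Low",
}

visited = []  # records which assets actually needed a BCS inventory


def identify(asset):
    visited.append(asset)
    return inventory[asset]


bcs_by_impact, low_assets = classify_asset_first(asset_impact, identify)
print(bcs_by_impact, low_assets, visited)
```

Two things stand out in this toy run. Only the High and Medium assets are ever inventoried, and every BCS at the Medium substation is treated as Medium, because the rating is attached to the asset rather than to each BCS.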

I do want to point out that one good innovation in v5 is that it does allow classification based on Facilities (lines, transformers, etc), especially in substations (in fact, that is the correct way to classify BCS at substations, since criteria 2.4 to 2.8 all have “Facilities” as their subject, not “Substations”). On the other hand, I have yet to find an entity that is actually classifying based on Facilities, except in the case of mixed-ownership substations, where it’s almost imperative that you do this.

So the big problem with CIP-002-5.1 R1 is that it is impossible to comply with the requirement as written – without devoting a substantial fraction of the entity’s total revenues to visiting all BES assets and identifying all BCS there. Fortunately, both NERC entities and auditors have come up with a workable “interpretation” of this requirement that allows them to have an agreed-upon methodology for BCS identification and classification. But that interpretation doesn’t correspond to the wording of R1 and Attachment 1. IMHO, this is why CIP-002-5.1 R1 is completely unenforceable in the strict sense; and it may possibly make all of v5 and v6 unenforceable as well. If NERC ever wants to have v5 and v6 be strictly enforceable, it will have to completely rewrite R1 and Attachment 1 (although I doubt even that would make v5 and v6 completely enforceable. There’s a separate issue that I won’t get into here, but that I’ve discussed before); but I don’t see that ever happening.

Lew’s Final Reply

I agree with everything except the last statement, about the strict enforceability of CIP-002-5.1. I do agree with your post from last Wednesday where you said we are unlikely to ever find out. And if we do find out that CIP-002-5.1 isn’t strictly enforceable, I expect governmental authority will step in and require changes that are enforceable.

Tom: Some people have suggested that the idea that CIP v5 may be “unenforceable in the strict sense” means that an entity would actually have to file a lawsuit contesting a CIP fine, and win the suit, before there would be any practical impact of that unenforceability. Of course, if that were to happen, it would be many years in the future.

However, I believe CIP v5’s unenforceability will be evident long before that happens. If auditors come to believe that a requirement isn’t strictly enforceable, they’re not likely to write PVs for it in the first place, unless the entity has simply not bothered to comply with the requirement at all. For example, I see no way an entity will ever receive a PV for not correctly applying the phrase “associated with” in Section 2 of Attachment 1, since there is no definition of the phrase. More broadly, I don't see any PVs being issued for not following proper procedure in identifying and classifying BCS under R1. Since the only procedure that can be said to be "proper" is the one described above - which virtually no NERC entity would be willing to go through - there is simply no way the courts would uphold a fine based on such a "violation".

(Lew) I see it as the job of the Regions to ensure the reliability of the BES; RF’s official mission statement is “ReliabilityFirst preserves and enhances bulk power system reliability and security across 13 states and the District of Columbia.” To support this mission we have a wide range of tools, including various forms of outreach, risk assessments, and controls evaluations. These non-CMEP tools are, of course, backed up by the full power of the CMEP if that is needed.

As for your analysis of the language that says the “bottom-up” method is strongly implied by the Standard, I agree. But we’re training the audit teams to accept the “top-down” as well as the “hybrid” approach (top-down with a cross check of systems bottom-up style). Since, for a large entity, the “bottom-up” approach can take orders of magnitude more resources to complete than “top-down,” this just makes good business sense.

(Tom) I totally agree it makes good business sense. But it is unfortunate that R1 (which should really be broken into at a minimum three or four requirements) requires so much interpretation and “gap-filling” in order to comply with it.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] The fact that I had to refer to FORTRAN to explain my point, and the fact that Lew understood the reference, are a rather sad commentary on both of our ages.

[ii] If an asset that isn’t High or Medium doesn’t contain any control systems, it won’t be an “asset containing a Low impact BCS”, and it won’t be High, Medium or Low. It will just be out of scope.

[iii] I have pointed out several times that one of the biggest problems with the wording of CIP-002-5 is that the entire process of identifying cyber assets in scope for the standards is compressed into one requirement; in contrast, in CIP v3 the three main tasks – developing a risk-based assessment methodology, applying that methodology to identify Critical Assets, and identifying Critical Cyber Assets, which are defined as Cyber Assets “essential to the operation of” the Critical Assets – each had their own requirement. That separation made both compliance and auditing much easier.

Last fall, I identified 15 separate steps that are required to identify and classify BES Cyber Systems, which is of course the main goal of CIP-002-5. A huge amount of the current confusion about asset identification in CIP v5 could have been eliminated had each of these been addressed in its own requirement. Instead, they were all collapsed into a single requirement, and at least seven or eight (probably more) of the tasks are never actually called out in that requirement; instead, they are simply implied by definitions. It’s as if the Standards Drafting Team was temporarily replaced by a group of haiku masters, for whom compression of as much meaning into as few words as possible was the highest goal. Of course, compression of meaning should be the absolutely last goal of enforceable standards, not the first. Clarity should be first, not somewhere near the end.

Wednesday, June 1, 2016

The News from WECC

In my post yesterday, I listed some good takeaways I gathered at SPP’s 2016 CIP Workshop in Little Rock last week. This post describes takeaways from the WECC Low Impact Workshop in Salt Lake City, the day after the SPP event. The presentations are here.

As the title of the workshop states, the subject was entirely Low impact assets – this is actually the fourth such workshop WECC has done, I believe. Attendees included both large entities with High, Medium and Low impact assets and entities that only have Lows, for many of whom this is their first experience with CIP compliance. The fact that the “big guys” and “little guys” were both in attendance can be taken as evidence that, no matter how simple the Low requirements might seem, there is nothing simple about them!

While I thought all of the presentations were good, there were two that provided the most takeaways for me. The first (and the leadoff presentation of the workshop) was by the inimitable Dr. Joe Baugh, entitled “Identifying and Auditing Low Impact BES Assets”.[i] This was focused entirely on an important question regarding Low impact assets that I haven’t seen any other region (or NERC) address so far: making sure you’ve identified your Low assets properly in the first place, before you start worrying about how to comply for those assets.

This isn’t too surprising, since Joe has all along been focused on one standard: CIP-002-5.1, and especially on the bright-line criteria in Attachment 1. My biggest takeaway from his presentation was that NERC entities who have High and/or Medium impact assets – and have been scrambling to come into compliance by July 1 – may be making a mistake if they think they have already done all they need to do to identify their Low assets, simply by subtracting their Highs and Mediums from their total BES assets. I think every entity should go back and make sure they haven’t either over- or under-identified their Lows.

Those entities are certainly making a mistake if they think that they don’t have to really worry about asset identification for Lows. Joe made the point – repeatedly – that each entity with Low assets needs to develop and document a methodology for identifying those assets, just like entities with High and Medium BCS need to develop a methodology for identifying those as well.[ii] If your entity explicitly called out the procedures for identifying Low assets in your High/Medium methodology, then you should be fine. But if you kind of glossed it over (on the idea that Lows are just the “leftover” BES assets once you subtract the Highs and Mediums), I recommend you go back and develop an explicit Low impact asset identification methodology – plus document it.

Here are some of the specific points Joe made:

• If you haven’t already done this, it’s a good idea to go back and run all of your assets through the BES definition to make sure you haven’t over-identified BES assets. Of course, since Low assets are simply those BES assets that aren’t Highs or Mediums[iii], the only way you can decrease the number of your Lows is to decrease the number of BES assets altogether.[iv] You can do this either through an Exception Request (which doesn’t go through BESnet) or a Definition Request (which does – see the bottom slide on page 6 of Joe’s presentation).
• On pages 16-19 of his presentation, Joe provides a very good discussion of segmenting generation in plants that meet criterion 2.1.
• On pages 19-21, Joe makes the point that substations can have both Medium and Low impact BCS, at least under criteria 2.4 and 2.5.[v] I have been making this point for a while, although using a slightly different rationale. But I’m glad to hear Joe making it, since it can conceivably save some entities from over-classifying relays and other cyber assets at substations (and it is especially helpful with shared substations, where one owner has higher-voltage lines that are Medium impact, while the other has distribution-level voltage lines that are Low impact).
• Joe pointed out that, for any Critical Assets an entity may have had under CIP v3, they should approach their Transmission Planner, Balancing Authority, or Reliability Coordinator to make sure they won’t be Medium impact under criteria 2.3, 2.6 or 2.7.
• Joe also stated that, for entities that chose to move to v5 during the transition period between CIP v3 and v5, any Areas of Concern that were identified during an audit in the transition period should be mitigated before the next audit – otherwise, they might receive a PV.
• Joe made the important point that there should be no direct routable connections between BCS at substations that don’t go through an EACMS (in the case of Medium substations) or a LEAP (in the case of Lows).
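If you are documenting a Low impact asset identification methodology, it may help to write the "leftover" logic down explicitly, including the case of BES assets that have no control systems at all. The short Python sketch below is one hypothetical way to express it; the function name, the `has_control_systems` check, and the asset names are all invented for illustration and are not drawn from Joe's presentation.

```python
def identify_low_assets(bes_assets, high_assets, medium_assets, has_control_systems):
    """Lows are the BES assets left after removing Highs and Mediums,
    minus any leftover asset that has no control systems at all."""
    leftovers = set(bes_assets) - set(high_assets) - set(medium_assets)
    lows = sorted(a for a in leftovers if has_control_systems(a))
    out_of_scope = sorted(a for a in leftovers if not has_control_systems(a))
    return lows, out_of_scope


# Invented example data.
bes_assets = ["sub-1", "sub-2", "plant-1", "switchyard-9"]
control_systems = {"sub-1": True, "sub-2": True, "plant-1": True, "switchyard-9": False}

lows, out_of_scope = identify_low_assets(
    bes_assets,
    high_assets=[],
    medium_assets=["sub-1"],
    has_control_systems=lambda asset: control_systems[asset],
)
print(lows)
print(out_of_scope)
```

Keeping the out-of-scope list alongside the Low list gives you a record of why each leftover asset was or wasn't treated as an asset containing a Low impact BCS.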

The second “presentation” (really consisting of two presentations on two different days) was by Lisa Wood and Eric Weston of WECC; it was entitled “Assets containing Low impact BCS”. It discussed in detail what WECC will be looking for in audits of entities with Low assets. Here are some of the main points from this presentation:

• Logging of physical access is a good thing but isn’t required in v5/v6.
• Managed firewalls are permitted at Lows, as long as you can get the configuration files (evidently, one entity was having trouble getting those files from their vendor. Of course, not being able to track and control the firewall configuration would create a real problem for compliance).[vi]
• Since the definition of LERC is now before the CIP v7 Standards Drafting Team, WECC will follow the SDT’s discussion of that topic, and potentially use it to inform their interpretation of LERC (I hope to have a post specifically discussing this topic very soon). In other words, it’s not a bad idea for entities to keep an eye on the SDT as well.[vii]
• If the entity has a device that functions like an Intermediate System in CIP-005 R2 (although Intermediate Systems aren’t required for Interactive Remote Access to Low BCS, as they are for High and Medium BCS), this can be considered to constitute a “protocol break” for LERC.[viii]

[i] You may notice that Joe uses the politically incorrect term “Low impact BES asset”, rather than “asset containing a Low impact BES Cyber System”. He underlined his heresy by also referring to “High impact BES assets” and “Medium impact BES assets”. He’s lucky the church gave up burning heretics at the stake years ago. If not, both he and I might be crispy critters by now!

[ii] Unfortunately, CIP-002 R1 doesn’t say anything about developing such a methodology, as it did for the RBAM in CIP v1-3. But this is implicitly required nevertheless, both for Highs and Mediums as well as for Lows. Of course, for Highs and Mediums, the real point of the methodology is identification of High and Medium BCS. For Lows, it is simply identification of the Low assets, or if you will “assets containing a Low impact BCS”.

[iii] With the exception of BES assets that have no control systems at all. Since a Low asset is defined as an asset “containing a Low impact BCS”, it obviously can’t have any BCS if it doesn’t have any control systems. So these assets aren’t High, Medium or Low impact – they fall outside the CIP standards altogether.

[iv] Joe points out that you can remove assets from the BES list through BESnet.

[v] I would say that there’s no reason that 2.6 – 2.8 shouldn’t also be included in this statement.

[vi] I was quite interested to hear that WECC was condoning managed firewalls for Low impact assets. This makes a lot of sense to me, given the large number of Low assets and the potential savings from having a managed solution. But I also know that managed firewalls (or any sort of managed security services) are pretty much verboten for High or Medium impact assets. This is because of the difficulty of getting a managed services provider to agree to take the fairly onerous steps required to restrict access to the entity’s BCS information on their servers. I’ve been told by an auditor that he knows of no entities at all that are using managed security services for Critical Assets under v3, or Medium impact assets under v5. But, since the Low assets aren’t subject to the same information protection requirements, this shouldn’t apply to them. I’m quite glad to see that WECC is explicitly making this statement.

[vii] The best way to follow the SDT is to get on their mailing list, so that you get announcements of all meetings (both onsite and phone only, although the onsite meetings can be followed by webinar as well), as well as draft documents and comments. To do that, email cip_mod_sdt_plus@nerc.com

[viii] Of course, there is currently no official definition of LERC, since FERC wasn’t happy with what NERC came up with in v6 and has ordered a new definition be developed by March 2017. So this can’t be considered the same as an actual Interpretation of a requirement, but it should be encouraging to WECC entities that they can have at least this one island of certainty in the vast sea of uncertainty that is NERC CIP version 5.