Tom Alrich's Blog

Saturday, November 12, 2016

Reminder: Webinar on Supply Chain Security on Nov. 18

On Friday, November 18 from noon to 1:30 PM EST, please join me and other presenters from Deloitte and Cisco for a webinar on Supply Chain Security and Regulation for the electric power industry. The webinar is presented by Utilities Technology Council (UTC). For a full description and the registration link, please go here.

If you aren’t sure you can make it, please sign up anyway; you’ll be able to view the recording when it is available. See you then!

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

Thursday, November 3, 2016

This Post has Absolutely Nothing to do with NERC CIP

Cubs Win! Cubs Win!

This blog isn’t primarily about sports, of course, but I do want to call your attention to the extremely prescient call I made in a post in December 2014:

“What are the chances the date actually will be pushed back? I’d say they’re slightly better than those of the Cubs winning the World Series next year. But you never know. It’s been ‘Wait ‘til next year’ for 106 years here in Chicago; one of these centuries, next year will come.”

Of course, I was off by one year (and I really only predicted that they would win the Series in say the next 3 or 400 years), but hey…I’ll take it. Now, you may point out that I wasn’t actually predicting the Cubs would win the Series, only that there was a non-zero probability that they would (perhaps .00001 percent). This is true, but I want to defend myself by underlining one of the realities of living in Chicago as long as I have (which is 44 years, although 26 of those have been in Evanston, Illinois): You could never survive if you got your hopes up for the Cubs every time they won a few games or engaged a promising pitcher – you would be doomed to repeated disappointment and likely suicide. So even admitting there was a non-zero possibility of their winning the Series took an act of extraordinary courage, if I say so myself :). From now on, such courage won't be required.

Wednesday, November 2, 2016

Your Vote Counts!

As I write this, the whole country is anxiously awaiting the upcoming vote. It can be said without exaggeration that the fate of the country’s infrastructure – indeed, its entire economy - rests on the outcome of this crucial ballot…..Of course, I’m talking about the upcoming NERC ballot on the second draft of CIP-003-7, which incorporates changes made to address FERC’s directive that NERC clarify the meaning of the word “direct” in the LERC definition.[i]

Briefly, the current draft of CIP-003-7, which will be balloted soon, is the second draft. The first was soundly voted down, and I am concerned the second one may be as well. This bothers me because I believe the opposition to both drafts is based on widespread misunderstanding of what they say, and because this opposition could lead to one of several serious outcomes, none of which is positive.

First, if this draft is voted down, the SDT – feeling the cold wind of FERC’s March 2017 deadline on their faces - may feel obliged to accommodate this misunderstanding and produce a new requirement that will be a big step backwards from the current draft, but which is likely to pass. The new requirement will probably end up being prescriptive, whereas the one in the current draft is non-prescriptive. It is also very possible that the new requirement will effectively require an inventory of all BES Cyber Systems at Low impact sites, something that most in the industry have so far opposed as too big a burden on them – but which may end up being necessary in order to produce a requirement that people understand.

However, it is also very possible that this will be the last ballot on this issue. Remember, NERC faces a deadline of next March to present to FERC a revision to the LERC definition (and if required, to the requirement itself) that clarifies the meaning of the word “direct”. NERC doesn’t have the option of saying next March, “Sorry, FERC. We just couldn’t come up with something that the membership liked. Better luck next time.”

Section 321 of NERC’s Rules of Procedure states in effect that, if the NERC Board of Trustees decides that the balloting process has failed to produce a standard or definition that is required to meet a FERC directive, they can order the Standards Committee to develop one on their own (with input from stakeholders, but not requiring a ballot). Given that the ballot results won’t be known until December, if it doesn't pass this time it is likely that the BoT will take this step right away, since going through the normal process of producing a third draft and balloting it would very likely mean NERC would miss the deadline. And if this happens, who knows what will be produced? Perhaps we’ll go back to what is in CIP-003-6, along with simply tweaking the LERC definition to clarify the word “direct”.[ii] That would presumably satisfy FERC’s directive, but it would result in the requirement losing the big advantage it now has over the v6 version, as I’ll discuss below.

This post – one of the longest I’ve written (which is saying a lot!), if not the longest – will make the case that what an entity needs to do to comply with Section 3.1 of Attachment 1 of CIP-003 has not changed at all from version 6 through the current second draft of version 7. In fact, there have been two big improvements: making the requirement non-prescriptive and clarifying the meaning of “direct” per FERC’s directive. There is no downside that I can see. This is why I think the current opposition to the new draft is based simply on misunderstanding of what it says. If you are willing to take what I have just said on faith, then you don’t have to read the rest of this post. If for some reason you obstinately refuse to take my word as gold and require proof of what I say, read on.

I will discuss how the Low impact access control requirement works, first in CIP-003-6, then in the first and second drafts of CIP-003-7. I need to go into some detail to explain these things, so you might want to get a sandwich before you read the rest of this post.

LERC in CIP-003-6

LERC (Low impact External Routable Connectivity) first appeared in CIP v6. The main part of the definition of LERC in CIP v6 reads “Direct user-initiated interactive access or a direct device-to-device connection to a low impact BES Cyber System(s) from a Cyber Asset outside the asset containing those low impact BES Cyber System(s) via a bi-directional routable protocol connection.” This definition was illustrated by a set of “reference models” in the Guidance and Technical Basis for CIP-003-6. The reference models illustrate particular cases where there would or wouldn’t be LERC.
This definition was coupled with the “requirement” found in section 3.1 of Attachment 1, which reads “For LERC, if any, implement a LEAP to permit only necessary inbound and outbound bi-directional routable protocol access.” Of course, LEAP means Low impact Electronic Access Point.
The definition explicitly provides two ways that LERC can be “broken”. First, the fact that the connection must be bi-directional means that use of a data diode (or “unidirectional gateway”) will break LERC, since that leads to unidirectional communications. Second, the word “direct” means that any indirect communications with a Low-impact BCS will break LERC.
Combining this with the LERC definition, the requirement effectively reads “In cases where LERC is not ‘broken’ by one of the conditions contained (even implicitly) in the definition, implement a LEAP.” The reference models help the entity to decide whether LERC is or isn’t broken in a particular case, as well as discuss different ways to deploy LEAPs.
To go back to the two conditions that break LERC, “bi-directional” is straightforward, but the word “direct” is not. How does an entity distinguish direct from indirect communications? The reference models in the CIP-003-6 Guidance and Technical Basis provide the v6 SDT’s opinions on this question. Reference Model 4 (page 34) provides one example of what does not break LERC: a device that converts the routable protocol to serial, within the boundary of the asset. The caption says this device merely “(extends) the communication between the business network and the Low impact BES Cyber System”. Moreover, the LIBCS is “directly accessible from outside the asset”. So the diagram shows there is LERC, and a LEAP is installed.
Reference Model 5 provides an example of a case where LERC is in fact broken. In this model, there is a “non-BES Cyber Asset” which features “non-routed, application layer separation of routable protocol sessions”; in other words, this is a device like what is called an Intermediate System in CIP-005-5 R2 (e.g. a proxy server or terminal server). Because it breaks the communications session with the external device and establishes a completely separate session with a device within the ESP, there is no “direct” communication.
Reference Model 6 (page 36) is most likely what caused FERC (in Order 822) to order that NERC clarify the meaning of “direct”. In the caption for the diagram, the SDT said “There is a layer 7 application layer break or the Cyber Asset requires authentication and then establishes a new connection to the low impact BES Cyber System.” There are two concepts here, both of which the SDT indicates can break LERC. The first is if re-authentication is required before the external device can access a BES Cyber System, and if a new session is established to the LIBCS. The second is the “layer 7 application layer break”.[iii] I believe the first is straightforward, but I admit to being as confused as FERC was about the second.
In addition to the above conditions that break LERC, which are “explicitly” called out in the LERC definition, there are two other conditions that aren’t explicitly called out, but which simple logic indicates will also “break” LERC. One is separation of the IT and OT networks within the asset, with the external routable connectivity only coming into the IT network (this is a common arrangement in some generating plants and substations). In this case, there can clearly never be any direct communication between an outside system and a LIBCS.
The other implicit condition that breaks LERC is if the external routable connection is not in effect at the boundary of the asset. From the v7 SDT meetings where this was discussed, I know that the team members were concerned about the case where there is a short routable communications segment leaving a control center, with the remainder of the communications being serial – including the communications that crosses into the Low impact asset, on its way to one or more LIBCS within the asset. The LERC definition in v6 simply says that LERC exists if there is routable connectivity between a Cyber Asset outside of the Low asset and a LIBCS within it; it doesn’t say anything about whether that connection is serial or routable when it comes into the Low asset. The v7 SDT decided to fix that omission by stating that a connection that is LERC must be routable when it crosses into the asset.
So let’s summarize what happens with respect to LERC in CIP-003-6: a) LERC exists if there is a “direct” bi-directional routable connection between an external device and a BCS within a low asset; if LERC exists, the entity must implement a LEAP – there is no other option. b) The LERC definition implicitly assumes that the external connection is routable when it crosses into the asset. c) The following conditions can “break” LERC, meaning no LEAP is required: a data diode; an “intermediate system” that establishes a new communications session with the LIBCS; a device that requires re-authentication of the remote user or system and also establishes a new communication session with the LIBCS; a separation of the IT and OT networks within the Low impact asset; and finally a “layer 7 protocol break”. It is the last of these conditions that FERC seems to have had in mind when they asked for clarification of what the term “direct” means in the LERC definition.
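The summary above can be read as a simple decision procedure. Here is a minimal sketch of it in Python, offered as one reading of this post rather than anything found in the standard: the `Connection` class, its field names, and the function `leap_required_v6` are all hypothetical names made up for illustration. The “layer 7 protocol break” is deliberately left out, since its meaning is exactly what FERC asked NERC to clarify.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    """Hypothetical description of one path from an outside Cyber Asset to a Low impact BCS."""
    routable_at_asset_boundary: bool  # False if the link is serial when it crosses into the asset
    bidirectional: bool               # False behind a data diode (unidirectional gateway)
    reaches_ot_network: bool          # False if external connectivity only enters a separated IT network
    new_session_to_bcs: bool          # True if an intermediate device terminates the session and opens a new one


def leap_required_v6(conn: Connection) -> bool:
    """Return True if LERC exists (so a LEAP is required) under this post's reading of CIP-003-6."""
    if not conn.routable_at_asset_boundary:
        return False  # implicit condition: the connection is not routable at the boundary
    if not conn.bidirectional:
        return False  # explicit condition: a data diode breaks LERC
    if not conn.reaches_ot_network:
        return False  # implicit condition: IT and OT networks are separated
    if conn.new_session_to_bcs:
        return False  # the communication is "indirect", not "direct"
    return True       # LERC exists; v6 offers no option other than a LEAP


# A plain routed link straight to a Low impact BCS: LERC exists.
plain = Connection(True, True, True, False)
# The same link behind a data diode: LERC is broken.
diode = Connection(True, False, True, False)
```

Note that the only output is yes or no on the LEAP; that rigidity is what the v7 drafts set out to relax.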

First Draft of CIP-003-7

The SDT published their first draft of the revised CIP-003-7 in July; I attended the meeting in Chicago where they finalized this draft in principle and wrote about it in this post. Here are the major changes they made in this draft:

They revised the definition of LERC to read “Routable protocol communication that crosses the boundary of an asset containing one or more low impact BES Cyber System(s)…”[iv] Note that the three conditions that “break” LERC – data diodes, network segmentation, and “indirect” communications – were removed from the definition; but have no fear, they were still addressed, as discussed below.
Another condition that results in LERC not being present – which was implicit in the wording of the v6 definition – is that there is no LERC if the connection from the outside device to a LIBCS is non-routable when it comes into the asset. The SDT made this explicit in the new LERC definition. In doing this, the SDT used the new term “asset boundary”, in order that the entity would be able to determine unambiguously whether or not this condition was met. However, they didn’t define the term, which led to a lot of suspicion that auditors might cite entities for not properly declaring their asset boundary, even though the Guidance and Technical Basis (pp 31-32) tried to make clear that there was no “correct” definition of an asset boundary – it will vary depending on the circumstances.[v]
The SDT decided to remove the three conditions that break LERC from the definition because they realized that these are actually controls that mitigate the risk posed by LERC. So instead of saying that the existence of LERC, if not broken, requires a LEAP (and no other controls), they decided to say that, if there is LERC as per the new “control-free” definition, the entity will have a choice of controls to apply. The control could be a LEAP[vi], but it could also be one of the three controls that were removed from the definition. So the SDT drew up a new set of Reference Models that illustrated the different types of controls that could be applied.
At the meeting I attended in June, the SDT first tried to put all of the possible controls for LERC in the “requirement” itself (I use the quotation marks because the actual requirement CIP-003-7 R2 simply refers the reader to the details in Attachment 1. The section that deals with electronic access control is 3.1, so it is in effect the electronic access control “requirement”); they were in a bulleted list – which means they were separate options that could be applied. In very short order, the team realized this wouldn’t work – there were all sorts of shadings and variations that couldn’t be captured in a simple bulleted list. So they decided to make the requirement a non-prescriptive one. They changed it to read “Implement electronic access control(s) for LERC, if any, to permit only necessary electronic access to low impact BES Cyber System(s).”
I was quite pleased to see the SDT do this, since I have come to believe that all of the CIP requirements should be made non-prescriptive; assuming this is the form of the requirement finally approved, it will be the third non-prescriptive requirement in CIP (along with CIP-007-6 R3 and CIP-010-2 R4). With the new requirement language, the different controls shown in the reference models (starting on page 32) now became simply suggestions on how to mitigate the risk posed by LERC. If the entity has an equivalent control (perhaps encryption) they want to apply, and if they can convince their auditor that it is as effective as the ones listed in the Guidance, they will be deemed in compliance with this requirement.
In the first draft of CIP-003-7, the conditions that “broke” LERC in CIP-003-6 all became controls that can be used to mitigate the risk posed by the presence of LERC. These and other possible controls were described in the reference models found in the Guidance and Technical Basis, starting on page 33. I will discuss each of these reference models, in order to show that the end result of this requirement is the same as the end result of the v6 requirement, with the exception that the v7 requirement is non-prescriptive.
The first reference model shows a Low asset in which the IT and OT networks are physically separated, and the external routable connection comes into the IT network. This is the condition that was only implicitly assumed in the v6 definition. It is now explicitly listed as a control: If your Low impact asset is set up this way, you won’t have to do anything else to comply with “requirement” 3.1. So the result is exactly the same in v7 as in v6: if you have separation like what is shown in the concept diagram, you don’t have to do anything else to be in compliance.
I do want to dwell on this point further, because one of the big misunderstandings that arose about the first draft of CIP-003-7 (and seems to have continued in the reaction to the second draft) is that non-OT (i.e. IT) cyber assets could somehow be brought “into scope” for CIP. This is simply not the case. While it’s true that LERC would exist even if only IT assets participated in external routable connectivity, the fact that their network is separated from that of the OT assets would be an ironclad assurance that the entity would be in compliance, without having to implement any other controls. As you'll see below, the v7 SDT tried to eliminate this problem by dropping the LERC definition and incorporating a new one into the requirement. The new "definition" goes back to only addressing routable communications that goes to a LIBCS; so any other communications can be totally ignored.
Reference Model 2 illustrates logical network separation, but the result is the same as in the first model: If the IT and OT networks are separated, this mitigates the risk posed by LERC, and the entity has complied with “requirement” 3.1.
Reference Model 3 illustrates use of a host-based firewall to mitigate risk posed by LERC. In CIP-003-6, this was one example of a LEAP, which no longer exists in v7. But the result is the same: The entity doesn’t have to do anything more to comply with the requirement.
Reference Model 4 illustrates a security device that enforces inbound and outbound electronic access permissions – in other words, a firewall. Again, this would be called a LEAP in v6, but the result is the same in this draft of the v7 requirement – the entity doesn’t have to do anything more to comply with the requirement.
Reference Model 5 shows a centralized security device that controls electronic access to multiple Low impact assets. Once again, this would be called a LEAP in v6, but the result is the same.
Reference Model 6 shows a unidirectional gateway (data diode) that mitigates the risk posed by LERC. In v6, this was one of the two conditions that would break LERC that were explicitly included in the LERC definition. So the result is the same here (perhaps you’re noticing that, in each of the reference models, I’m demonstrating that the result of applying that model is exactly the same in v7 as in v6. As I’ve said, the whole point of this post is to show that nothing that you could do to comply with “requirement” 3.1 in v6 is taken away in v7; plus you can do a lot more, since the requirement is now a non-prescriptive one).
Reference Model 7 shows a Cyber Asset (not a BCA) that performs authentication on inbound traffic. The caption points out that simply requiring new authentication won’t in itself always be sufficient to mitigate the risk posed by the presence of LERC. This corresponds to one of the two conditions shown as “breaking” LERC in Reference Model 6 in CIP-003-6 (the other is the Layer 7 protocol break). In that reference model, the caption states explicitly that simply requiring re-authentication is not enough to break LERC; there needs to be a new connection to the BCS as well. Once again, the two cases are the same, except that in v7 the entity isn’t necessarily constrained to combining a new session with re-authentication. In some cases, re-authentication may be enough, and in other cases something else might be combined with re-authentication to satisfy the requirement. So in this case, not only can the entity do exactly what they could do to comply in v6, but they have much more flexibility (of course, this is because the v7 requirement is non-prescriptive).
Reference Model 8 illustrates the case where a Cyber Asset (not a BCA/BCS) terminates the incoming routable communications session and establishes a new one with the Low impact BCS. This is equivalent to Reference Model 5 in CIP-003-6, although in that case the Cyber Asset “breaks” LERC, while in this case it is one of the many controls that can be applied to meet the requirement. Once again, in the v7 requirement the entity can do exactly the same thing to comply as they would have in the v6 requirement; but they have other options as well.
Reference Model 9 simply shows that one security device can provide both an EAP to “break” External Routable Connectivity (for Medium or High impact BCS) and a LEAP to “break” LERC. It corresponds exactly to Reference Model 7 in CIP-003-6.
Of course, this whole exercise was necessary because FERC wanted NERC to clarify what the word “direct” means in the LERC definition. How exactly did the SDT do this? Reference Models 7 and 8 (both discussed above) constitute the SDT’s new “definition” of “direct”. But note that the words “Layer 7 application layer break” don’t appear anywhere in CIP-003-7, as they do in Reference Model 6 in CIP-003-6. I believe these words are what gave FERC all the heartburn in v6, and are why they ordered the clarification of "direct". I predict FERC will find this satisfies their directive.[vii]

The point of this long discussion of reference models is to show that everything an entity could do to comply with “requirement” 3.1 in CIP-003-6 can still be done to comply with the same requirement in the first draft of CIP-003-7; however, the wording is different. There is no longer the idea of “breaking” LERC, but simply different controls that can be applied to mitigate the risk posed by the presence of LERC. In addition, there is a big change from v6, in that the requirement is now non-prescriptive. This means the entity isn’t limited to the controls shown in the v7 reference models. If they want to apply a different control, and if they can convince their auditor that it does a good job of mitigating the risk posed by LERC, they are in compliance with the requirement.
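The claim that the first draft accepts everything v6 accepted, plus more, can be laid out as a small lookup. This is only a sketch of the argument in this post: the measure names are paraphrases of the reference models, and the dictionary `V6_TREATMENT` and both functions are hypothetical illustrations, not text from either version of the standard.

```python
# How each measure was treated in CIP-003-6, as described in this post.
V6_TREATMENT = {
    "physical network separation": "breaks LERC (implicit)",
    "logical network separation": "breaks LERC (implicit)",
    "host-based firewall": "LEAP",
    "network firewall": "LEAP",
    "centralized security device": "LEAP",
    "unidirectional gateway": "breaks LERC (explicit)",
    "re-authentication plus new session": "breaks LERC",
    "session termination by intermediate device": "breaks LERC",
}


def acceptable_in_v6(measure: str) -> bool:
    """v6 is prescriptive: only the listed measures count."""
    return measure in V6_TREATMENT


def acceptable_in_first_draft(measure: str, auditor_accepts: bool = False) -> bool:
    """First draft of v7: every v6 measure is now a control, and an
    equivalent control also counts if the auditor is convinced."""
    return measure in V6_TREATMENT or auditor_accepts
```

With these definitions, every key of `V6_TREATMENT` is acceptable under both functions, while something like encryption is acceptable only in the first draft, and only if the auditor agrees it is as effective as the listed controls.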

Second Draft of CIP-003-7

This is the draft that is up for balloting now. Fortunately for you, my discussion of this draft will be much shorter than for the first draft, since there is much less to discuss! The primary change from the first draft is that the SDT decided to jettison the separate definition of LERC altogether, and incorporate the important parts of the definition into the “requirement” itself. Accordingly, CIP-003-7 Attachment 1 Section 3.1 now reads:

Electronic Access Controls: For each asset containing low impact BES Cyber System(s) identified pursuant to CIP‐002, the Responsible Entity shall implement electronic access controls to:

3.1 Permit only necessary inbound and outbound electronic access as determined by the Responsible Entity for any communications that are:

i. between a low impact BES Cyber System(s) and a Cyber Asset(s) outside the asset containing low impact BES Cyber System(s);

ii. using a routable protocol when entering or leaving the asset containing the low impact BES Cyber System(s); and,

iii. not used for time‐sensitive protection or control functions between intelligent electronic devices (e.g. communications using protocol IEC TR‐61850‐90‐5 R‐GOOSE).

The most important result of this is that the big change that was made in the first draft – making LERC a property of the asset rather than specifically of the Low impact BCS at the asset – has now been reversed. Now, there is no obligation for the entity to do anything unless there is routable communication coming into a LIBCS from outside the asset.

However, the practical effect of this change is the same: If routable communications comes into the asset but only goes into a network that doesn’t contain LIBCS (i.e. an “IT network”), and which is physically or logically isolated from any network that does contain LIBCS, then the entity has no further compliance obligation for this requirement; the only difference is that network separation is called a control in the first draft, whereas in the second draft it is a condition of the entity’s having external routable connectivity to low BCS in the first place. As I discussed above in connection with the first reference model of the first draft, the fact that in the first draft any routable connectivity into an asset was called LERC was a big issue – even though it didn’t place any more compliance burden on the entity than in v6. Hopefully, the people who were worried about this will sleep easier now.
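The three conditions in the second draft's Section 3.1 work as a scoping test: a communication needs electronic access controls only if all three hold. A minimal sketch follows, assuming that reading; the `Communication` class, its field names, and the function `in_scope_section_3_1` are hypothetical and appear nowhere in the standard.

```python
from dataclasses import dataclass


@dataclass
class Communication:
    """Hypothetical description of one communication path at a Low impact asset."""
    between_libcs_and_outside: bool     # condition i: a LIBCS talks to a Cyber Asset outside the asset
    routable_when_crossing_asset: bool  # condition ii: routable protocol when entering or leaving the asset
    time_sensitive_protection: bool     # condition iii excludes these (e.g. R-GOOSE between IEDs)


def in_scope_section_3_1(c: Communication) -> bool:
    """Return True if the second draft's Section 3.1 applies to this communication."""
    return (
        c.between_libcs_and_outside
        and c.routable_when_crossing_asset
        and not c.time_sensitive_protection
    )


# Routable traffic that only reaches an isolated IT network never touches a LIBCS.
it_only = Communication(False, True, False)
# Serial communication into a LIBCS is not routable at the boundary.
serial = Communication(True, False, False)
# An ordinary routed connection to a LIBCS is in scope.
routed = Communication(True, True, False)
```

Only `routed` is in scope here. The IT-only case drops out at condition i, which is the same outcome the first draft reached by calling network separation a control.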

This change is now reflected in the reference models. Since physical and logical network separation are no longer controls, the first two RMs have been removed. RM 1 is now host-based firewall, which was RM 3 in the first draft. The remaining reference models in the first draft are all in the second draft, unchanged except for perhaps one or two small changes; however, they are all numbered “n-2”, if n is their number in the first draft. The physical and logical network separation reference models are resurrected as RMs 8 and 9. However, they are no longer described as controls but rather conditions which would prevent “routable communication between a low impact BES Cyber System and a Cyber Asset outside the asset” (the phrase which now takes the place of LERC). A tenth Reference Model has been added which illustrates how, if the only communications coming in to LIBCS is serial, there is no compliance obligation, either.[viii]
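For anyone cross-referencing the two sets of Guidance, the renumbering can be written as a small function. This is a sketch with one labeled assumption: that the physical separation model (first-draft RM 1) became RM 8 and the logical separation model (first-draft RM 2) became RM 9, in that order. The function name is hypothetical.

```python
def second_draft_number(first_draft_number: int) -> int:
    """Map a first-draft reference model number (1-9) to its second-draft number."""
    n = first_draft_number
    if 3 <= n <= 9:
        return n - 2  # the former RMs 3-9 are now RMs 1-7
    if n in (1, 2):
        return n + 7  # assumption: physical separation -> 8, logical separation -> 9
    raise ValueError("the first draft has reference models 1 through 9 only")


mapping = {n: second_draft_number(n) for n in range(1, 10)}
```

Second-draft RM 10 (serial-only communication) has no first-draft counterpart, so it does not appear in the mapping.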

Summing it up

To repeat what I said at the beginning of this post, there has been no change in what an entity has to do to comply with the requirement for Low impact electronic access controls in CIP-003, from version 6 through both drafts of version 7; in addition, the v7 requirement is non-prescriptive (some would say “risk-based”), affording the entity much more flexibility in how to comply than the v6 version did.[ix] I see no way that NERC entities can lose if this draft is approved.

I hope the above discussion has satisfied any concerns you have about the second draft, or at least that it has bludgeoned you over the head enough that you’re numb and will agree to anything outrageous that I say (at this point at night, either outcome is fine with me).[x] It’s not my job to tell you how to vote in the upcoming ballot, but I hope you will keep what I have said in mind.

[i] I have also heard some talk about another upcoming vote that occurs on November 8. I think the best commentary on that contest was written – very presciently - almost 100 years ago in a poem by William Butler Yeats. I quoted from that poem at the beginning of this post in early 2015.

[ii] It is also possible that the Standards Committee will decide not to change the LERC definition or the requirement, but simply add wording to the Guidance to clarify what “direct” means.

[iii] I believe this phrase was developed by the Department of Redundancy Department. Layer 7 is the application layer in the OSI model.

[iv] In quoting both this definition of LERC and the original one in CIP v6, I have omitted the language exempting time-sensitive communications, since that didn’t change substantially (if at all) between the two definitions.

[v] From having attended the SDT meeting where “asset boundary” was discussed, I know that the SDT had in mind the idea that all BCS within the Low impact asset need to be included within the boundary – this is a clear criterion for deciding whether or not an asset boundary is correct. In retrospect, it was a mistake for the SDT not to write a definition of asset boundary that stated this criterion. It would have provided relief to the many entities who were worried about having no ability to counter an auditor who might try to nail them on not determining an asset boundary correctly; they could simply point out that all of their Low impact BCS were within the boundary they had declared, and that met the definition. I notice that the Guidance for the second draft of CIP-003-7 does mention this idea.

[vi] In the first draft, the SDT decided the term LEAP could be retired, since the firewall (or whatever would otherwise be used as a LEAP) no longer had a special status as the only control that could be applied to LERC.

[vii] In early 2015, there was a lot of discussion about the meaning of ERC; a lot of it referred to Reference Model 6 in CIP-003-6, especially the “layer 7 application layer break”. This was despite the fact that ERC and LERC are in theory two different things. I know there was a lot of controversy over whether a protocol converter alone could break ERC. I think FERC wanted to kill two birds with one stone when they ordered NERC to clarify the meaning of “direct”. They of course didn’t want the same controversy to come up with respect to LERC; but they also knew that whatever solution the v7 SDT came up with for LERC would end up influencing how the ERC definition was interpreted (even though the v7 SDT probably won’t decide to change ERC. It isn’t in their mandate, but more importantly they have a couple of real tigers by the tail and the last thing they want is to have a couple more of them. I hope to have a post out on their predicament soon).

[viii] The undefined term “asset boundary” has now been dropped from the requirement altogether, to quell fears that entities would be dinged for not “properly” identifying their asset boundaries.

[ix] I am beginning to realize that some CIP compliance professionals are actually worried about non-prescriptive requirements, since they will no longer have a rigid requirement to comply with but will instead have to exercise judgment (which judgment the auditor may not agree with!). I realize this is a concern for entities that have had bad experience with auditors that exercise poor judgment (since a lot of judgment is required for complying with and auditing the current CIP requirements, even though they are almost all prescriptive). I hope to have a post on this syndrome soon, but I liken it to the case of some longtime prisoners who reach the end of their sentences and want to return to prison, because they know no other way of life.

[x] Now that I think of it, each of the two main Presidential candidates embodies one of these two alternatives. I will leave it as an exercise for the reader to decide which one tries to convince rationally and which one bludgeons.

Tuesday, October 25, 2016

UTC Webinar: Supply Chain Risk and Regulation

If you aren’t sure you can make it, please sign up anyway; you’ll be able to view the recording when it is available. See you then!

Friday, October 21, 2016

Registration Open for NERC Supply Chain Security Conference on Nov. 10

NERC has just announced the sign-in sites for their one-day Cyber Security Supply Chain Management workshop in Atlanta (at the Grand Hyatt Atlanta, near NERC’s headquarters in Buckhead) on Thursday, November 10. Registration for the on-site meeting is here. Registration for the webinar is here. There is no charge for the workshop.

This will be your only chance for face-to-face (or “browser-to-browser”) interaction with the Standards Drafting Team developing the supply chain standard. My guess is the workshop will fill up quickly, so you should probably sign up soon if you want to attend, either in-person or virtually (note: I signed up before I uploaded this post! My mother didn’t raise a stupid child).

Here is an excerpt from the email the SDT sent to the Plus List today: “The purpose of the technical conference is to facilitate early stakeholder engagement in the development of standards requirements addressing the Commission’s directives. The Standards Drafting Team (SDT) has developed the Standards Authorization Request (SAR) describing the project scope, which is posted on the project page for stakeholder comment. Additionally the SDT has begun developing proposed requirements and is seeking stakeholder feedback during technical conference discussions. The timing of the technical conference supports providing the SDT with valuable input for their use in developing a draft standard that is ready for subsequent formal commenting and balloting in early 2017.”

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

Wednesday, October 19, 2016

“We Need Something to Measure”

I have been trying to keep up with the new Standards Drafting Team that is tasked with developing the CIP Supply Chain Security standard that FERC ordered in July. Last Friday, they had a short phone meeting to set the stage for their first face-to-face meeting in Atlanta this week (which I unfortunately had to miss because it conflicts with NERC’s GridSecCon). I won’t summarize the phone meeting, but I do recommend that anyone with any interest in the new standard should get on the SDT’s email list – called the “Plus List” – in order to see documents, learn about upcoming meetings, and in general learn what’s going on (I do want to point out something that some people don’t seem to understand about NERC meetings in general: almost all of them are open to anybody who is a user of electricity. And if you’re not an electricity user, how on earth are you reading this blog post?).

What inspired this post was a single comment made by someone on last Friday’s SDT call, as the discussion was focusing on the preliminary version of the new standard (which is currently called CIP-013-1, proving that the SDT members are not superstitious). This person said something to the effect of, “We need something to measure.” The meaning seemed to me to be, “You can’t have an enforceable requirement unless it can be rigorously audited. You can’t audit compliance with a requirement without having some black-or-white criterion for whether the entity has complied with it.”

I want to say now that I don’t mean in any way to pick on the person who said this (and I honestly don’t know who it was, nor do I care). I will also say that I have heard this assertion made many times. Finally, I want to mention that I and one or two other people on the call quickly pointed out to this person that this will be the wrong approach to use for CIP-013, if for no other reason than because FERC said explicitly in their Order 829 (pages 30 and 31) that the new standard should not be prescriptive. So my guess is this issue has already been laid to rest as far as the Supply Chain Security SDT is concerned.

That being said, there’s something more I want to point out: The majority of the existing CIP requirements are written with precisely this thought in mind, namely that the only good standard is a prescriptive one. And folks, in my humble opinion this is the fundamental problem with the current NERC CIP standards. Because they are so prescriptive, they are unsustainable and will collapse of their own weight if not modified.

I have been making this point in various posts, starting with this one early in the year. And the more I talk to people in the industry and follow developments in CIP, the more I am convinced it is true. In fact, I am now working on a book with two co-authors that will not only discuss the problems with NERC CIP but lay out what we see as a solution to that problem (which could be adapted to other critical infrastructure domains, not just electric power). However, it will be late 2017 at the earliest that the book will be available, so I’m not going to tell you to wait for it to learn why I made the above assertion; I am going to explain why I am saying that. On the other hand, I’m not going to attempt to write the book (or even a chapter of it) in this post. In this post, you will see a high-level view of the argument we will make to support this point, but not a lot of details to support our case. For those details, you will have to wait for the book.

Here is roughly the argument we will make:

Costs for NERC CIP compliance have ballooned with CIP versions 5 and 6. They are only going to continue to balloon due to CIP v7, CIP-013 (the supply chain standard), and a host of future versions that will be required to address security of a lot of areas so far left unaddressed by the current CIP standards (virtualization, the cloud, phishing attacks, and Distribution systems, just to name a few). I know there are at least a few large NERC entities that easily spent 25-50 times as much on implementing compliance with CIP v5 as they did with v1. And guess what? They’ll have to spend some multiple of that amount to address the new and revised CIP standards that will be coming down the road.
This situation might be justifiable if the bulk of this money (say 90%) were going to improve security. However, I have been conducting a totally informal and unscientific poll of NERC entities, asking them how much of every dollar they spent on implementing CIP v5 compliance actually went to increasing security, as opposed to all of the paperwork, etc. that is required to prove compliance with those highly prescriptive standards. The highest estimate I received was 70%. This itself is appalling, since it says that “only” 30% of what this entity spent on v5 was “wasted” on pure compliance paperwork! The low estimate I heard was about 25%, and I’d say the median was 50% (a couple other knowledgeable observers also say that 50% is probably the industry average, although I admit there is no way this could be objectively verified. Different individuals will have different opinions on whether a particular dollar spent on CIP went to security or purely to compliance activities with no security benefit).
When you combine these two assertions, that the cost of CIP compliance is ballooning and will continue to do so for a long time, and that about half of those costs are not going to securing the Bulk Electric System, you come out with what I call an Intolerable Situation. Either one by itself might be tolerable, but the two together are intolerable. We can’t let things go on like this, until a significant portion of the US GNP is going to NERC CIP compliance, without any proportionate increase in security. Here are some numbers: I believe that at least $2 billion was spent by the industry on implementing compliance with CIP versions 5 and 6, including security products purchased, staff time, consultant time, etc. Let’s be charitable and say that between 30 and 50% of that went only to compliance costs, not security. This works out to $600 million to $1 billion. This is more than I earn in a year, and is a lot of money to waste.
This isn’t to say that the CIP standards should be “frozen” as they are, to keep costs from ballooning even more than they have already; there is just about universal agreement in the cyber security community that there is a lot more that needs to be addressed in CIP. But it is to say that we need to find out why so many in the CIP community feel that one of every two dollars they spend on compliance is going down the rat hole and fix that problem, hopefully before new increases in the scope of CIP lead to another doubling or tripling of the amount that utilities have to spend on complying (30-50% of which larger number will also not go to security).
In other words, what we need to do is find a way to fix the whole CIP compliance regime (which is a lot more than just rewriting the standards) so that a much greater percentage of every dollar spent on implementing and maintaining compliance actually goes to security. To go back to the previous illustration, if 80% of CIP v5-v6 spending went to security rather than 50-70%, this would be an effective increase of $200 to $600 million in industry spending on cyber security – without NERC entities being required to spend an additional dollar. If we could get to 90% (which I believe is possible), that would be an increase of $400 to $800 million.
But as I’ve just said, the scope of CIP is going to keep increasing, and entities will still end up spending more money, even if CIP is rewritten. However, if CIP is rewritten it will make the increased spending much more palatable. If entities feel that most of what they spend on CIP will actually increase their security and that of the BES, they are much more likely to support such scope changes than they are now, when they know that close to half of the increased cost will simply be wasted, from the point of view of promoting cyber security.
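For anyone who wants to check my arithmetic, here is a minimal Python sketch of the numbers above. It assumes the same rough figures I quoted ($2 billion total, 50-70% of it currently reaching security); the function names are mine and purely illustrative, and nothing here is audited data.

```python
# Illustrative arithmetic only. The $2 billion total and the percentage
# ranges are the rough estimates quoted in this post, not audited figures.
TOTAL_SPEND = 2_000_000_000  # estimated industry spend on CIP v5/v6, in dollars


def security_dollars(total, security_pct):
    """Dollars that reach security when security_pct percent of spend does."""
    return total * security_pct // 100


def gain_range(total, target_pct, current_low_pct, current_high_pct):
    """Range of extra security dollars if target_pct of spend reached security.

    Returns (smallest gain, largest gain): the smallest gain is measured
    against the most optimistic current share, the largest against the
    least optimistic one.
    """
    target = security_dollars(total, target_pct)
    smallest = target - security_dollars(total, current_high_pct)
    largest = target - security_dollars(total, current_low_pct)
    return smallest, largest


# Compliance-only spend if 30% to 50% of the total never reaches security.
wasted_low = TOTAL_SPEND - security_dollars(TOTAL_SPEND, 70)
wasted_high = TOTAL_SPEND - security_dollars(TOTAL_SPEND, 50)

print(wasted_low, wasted_high)               # 600000000 1000000000
print(gain_range(TOTAL_SPEND, 80, 50, 70))   # (200000000, 600000000)
print(gain_range(TOTAL_SPEND, 90, 50, 70))   # (400000000, 800000000)
```

Change TOTAL_SPEND or the percentages to whatever estimates you find more believable; the point of the argument doesn’t depend on the exact figures.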

Why do I think we have this problem? It is because the CIP standards are prescriptive to a fault (I do want to emphasize the disclaimer found on all of my posts: that any opinions expressed in this blog – and certainly in this post – are mine alone, not those of my employer or for that matter any other entity or person). And why are they prescriptive? Because of the idea – which is still prevalent in NERC circles – that the only way to write enforceable mandatory requirements is to make them prescriptive; prescriptive standards lead to measurable outcomes, so that in theory it should be completely clear whether or not the entity has complied (no auditor discretion allowed or required). But this prescriptivism leads to NERC entities spending large amounts of money on activities that don’t increase security but do help them avoid getting cited for non-compliance with the prescriptive requirements. Even more importantly, prescriptive standards greatly inhibit NERC’s ability to expand CIP’s scope to address new domains like the cloud, without engaging in a painful, expensive, years-long standards development process.

I will first illustrate this point with the Tale of Two Requirements: CIP-007-6 R2 and R3. R2 (Security Patch Management) is the bad guy here. For example, take the requirement part (R2.2) that says the entity needs to, every 35 days, determine whether there are new security patches available from the vendor of every piece of software installed on even one BES Cyber System or Protected Cyber Asset, and evaluate whether the patch applies to their systems or not. It doesn’t matter if the software is installed on one or 1,000 systems. It doesn’t matter how important the software is to the entity's operations. It doesn’t matter whether the vendor has never released a security patch and probably never will. This has to be done every 35 days, for every piece of software on a BCS or PCA.

From the point of view of measurability, this requirement is a champ. It is in theory very easy for an auditor to review the entity’s documentation to determine whether 35 days or less elapsed between the patch release date and the date the evaluation was done. However, NERC entities – especially larger ones – are finding this a tremendously burdensome requirement to comply with, due in part to the difficulty of contacting loads of smaller software vendors once a month to see if they have made any security patches available. But they have to do this, and they have to do it within 35 days (that is of course just the first step in complying with CIP-007 R2. There are a couple other specific deadlines after that). And most important of all, they have to have good documentation that they have done this, for every software product for every month – and have it readily retrievable when the auditor asks to see the evidence that all security patches were identified and evaluated for these particular systems in this particular month.
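To show just how “measurable” this kind of requirement is, here is a rough Python sketch of the black-or-white test involved. It is my own simplification: it assumes the check reduces to the gap between successive evaluation dates for a single software product, and the function name and the sample log are hypothetical. A real audit works from the language of the standard and the entity’s evidence, not from a script like this.

```python
from datetime import date

MAX_GAP_DAYS = 35  # the interval cited for CIP-007-6 R2.2 in this post


def late_evaluations(evaluation_dates, max_gap_days=MAX_GAP_DAYS):
    """Return (previous, current) date pairs that are more than max_gap_days apart."""
    ordered = sorted(evaluation_dates)
    return [
        (prev, cur)
        for prev, cur in zip(ordered, ordered[1:])
        if (cur - prev).days > max_gap_days
    ]


# Hypothetical evaluation log for one software product.
log = [date(2016, 7, 1), date(2016, 8, 3), date(2016, 9, 12), date(2016, 10, 14)]
gaps = late_evaluations(log)
print(gaps)  # the Aug 3 -> Sep 12 gap is 40 days, so it is the only pair flagged
```

Notice what the check does not ask: how important the software is, how many systems run it, or whether the vendor has ever released a security patch. Multiply this by every software product on every BCS or PCA, every month, and you can see where the documentation burden comes from.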

But where does the 35-day deadline come from? Is there some principle of cyber security – or computer science or electrical engineering – that dictates that security patches need to be identified and evaluated within 35 days or all hell will break loose (as conceivably could be the case with other NERC standards, where failure to meet a particular parameter or deadline could perhaps lead to a cascading outage of the BES)? Of course not. This is an arbitrary deadline, chosen simply because the CIP v5 SDT felt they had to choose some deadline in order to – of course! – have something measurable. But entities then have to spend a lot of effort and money designing processes and installing systems to make sure this deadline is never exceeded, for all of the perhaps hundreds of software packages installed within their ESP. Measurability has a high cost, and shouldn’t be invoked just to make audits easier.[i]

Now let’s look at CIP-007-6 R3, Malicious Code Prevention. This is one of two non-prescriptive requirements currently found in the CIP standards (the other is CIP-010-2 R4. Plus, one requirement is in the process of being switched from prescriptive to non-prescriptive by the CIP v7 SDT: CIP-003-7 “R3.1”[ii]). The heart of this requirement is part 3.1, which reads “Deploy method(s) to deter, detect or prevent malicious code.” That’s it. There is no requirement to deploy anti-virus software with no other options (as in the previous CIP versions), no requirement to check for new signatures daily, etc. By the same token, the entity doesn’t have to put in place a lot of procedures and systems to comply with arbitrary deadlines, as well as train and monitor a host of staff members who have lots of other important tasks on their plates. And most importantly, the entity doesn’t have to generate reams of documentation showing that they complied with arbitrary deadlines for say A/V signature deployment every day for every system within the ESP.

I could go on and on regarding this topic (and we will discuss it thoroughly in the book!), but you hopefully get the point. I hold prescriptive requirements responsible for the Intolerable Situation mentioned above. The obligation to meet arbitrary deadlines and other targets – without consideration of the risk posed by a particular system, software package, vendor, etc. – drives a lot of the huge and increasing cost of NERC CIP compliance.

And this situation will just get worse as CIP is expanded to cover more domains, like supply chain and the cloud. I think FERC realizes this, which is why, for the last two major expansions of CIP that they have ordered – substation physical security in CIP-014 and supply chain security in CIP-013 – they have gone out of their way to say they don’t want NERC to take a prescriptive approach in writing the standards. At some point, I predict they will have to order NERC to rewrite all of CIP in a non-prescriptive fashion.

I do want to say now that I don’t blame NERC, FERC, or any individuals employed by or associated with those organizations, for this situation. Nobody set out to achieve this end. In my opinion (again, this will be elaborated and justified in the book), the situation is due totally to the fact that, until CIP-014, which is entirely non-prescriptive, NERC was in the business of writing prescriptive standards. And for their original mission – protecting against single acts of commission or omission that can have catastrophic physical effects on the BES – that is exactly what they should be doing.

But cyber security is different. There is no way an entity can mitigate (or even identify) every cyber vulnerability that affects even a limited group of systems like BCS. But the CIP standards (with the exception of CIP-014 and the two other CIP requirements mentioned above) implicitly assume that this is a goal that can be achieved. And, given the size of the potential fines that NERC entities face for non-compliance with even one requirement in any NERC standard, they have lots of incentive to devote their entire security budgets to CIP compliance, and leave nothing for anything that isn’t required. That they don’t in fact do this – but definitely do spend a lot of money on cyber security beyond what CIP requires – is testament to the fact that these entities truly are concerned with security, not just with compliance.

To sum up my argument, in my personal opinion the perceived need to have “measurable” requirements has led to the situation where a huge – and growing – amount of money is being spent on NERC CIP compliance without anywhere near a proportional increase in security of the electric grid. This situation will only get worse if the entire CIP compliance regime isn’t re-thought and re-written.

What’s the solution? The easy answer is that I and my co-authors will go into that in great detail in the book. The not-so-easy answer is that we are currently still having a lot of discussions, both among ourselves and with others, about this question. The general direction is clear, but that is all that’s clear at the moment. It won’t suffice simply to rewrite all of the current CIP requirements in a non-prescriptive format, as is currently being done with CIP-003-7 “R3.1” (which I described above). The CIP standards need to non-prescriptively address all threats to the cyber security of the cyber assets that run the BES, whether they come in through the IT network (as in the Ukraine attacks), through the cloud, through “machine-to-machine” communications from outside the ESP, etc.

On the other hand, it’s not possible to simply have a CIP standard that says “Make sure you’re cyber secure”. The entities will need to be told the different areas they need to address (patch management, securing virtual systems, etc.), and provided with guidance on best practices for addressing those areas. They will then be audited based on how well they addressed each area; and yes, the auditors will need to be cyber security professionals who can determine how well the entity did address each area, without having to resort to a checklist with arbitrary boxes labelled “35 days”, etc.

Even more importantly, the CIP framework needs to be able to rapidly incorporate new threats that will pop up in the future that are currently unheard of, without having to go through a multi-year standards development process. So there will need to be some sort of governing body that will regularly meet to review the primary threats to the cyber security of the BES, and add or subtract areas from the list of what the entities need to address (as well as write guidance for new areas). None of these goals can be achieved by simply rewriting the current standards; there has to be a new compliance regime.

But I’m not going to force you to wait for the book to hear about my ideas for what should replace the current CIP standards. In this blog, I will keep you up to date with what I am thinking in this regard. And I’d be very interested in hearing your ideas as well.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] I am not trying to pick on the CIP v5 SDT here! The next paragraph describes a very non-prescriptive requirement that they also wrote. Plus times have changed. I attended a number of the SDT’s meetings (in 2010-12), and I freely admit I never once even thought about raising the issue of prescriptive vs. non-prescriptive requirements. It just wasn’t something I even considered important until about a year ago. Now it’s all I think about.

[ii] I use the quotes because it is technically Section 3.1 of Attachment 1 of CIP-003-7, not requirement R3.1.

Tuesday, October 18, 2016

This is your Chance!

The NERC Supply Chain Security SDT has just announced a one-day industry workshop in Atlanta (at NERC’s headquarters, most likely) on November 10. According to an email sent out yesterday, the workshop will cover “proposed Reliability Standards requirements, guidance, and industry practices for mitigating the risk of cyber security incidents that may be introduced in the supply chain of industrial control system hardware, software and services associated with Bulk Electric System operations. The workshop is being held as part of the standards development process to address FERC Order No. 829.”

This will probably be the only chance that NERC entities get for live interaction with the SDT on this standard. You may want to save the date; while the email doesn’t mention it, I would guess the workshop will be “broadcast” via webinar. The sign-up details and agenda will be emailed out soon; I’ll publish them when I receive them. Or you can receive them if you sign up for the SDT’s Plus list. Send me an email at talrich@deloitte.com if you’d like me to send you the address to request that.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.