Saturday, May 24, 2014

What’s Wrong with CIP-002-5 R1?

Warning: Exceedingly long post ahead.  You are advised to take frequent rest stops and be sure to maintain your hydration level.

I had hoped to be finally writing the long-delayed Part II of my blockbuster Identifying BES Cyber Systems at Substations post by now.  However, a funny thing happened on the way to the post office.  I have been engaging in a lot of conversations with various people about the whole CIP-002-5 R1 question lately[i], and it just gets more complex and confusing each time I look at it.

In Part II, I was going to conclude what I’d started in Part I and lay out a consistent methodology for BCS identification at substations (it actually applied to all assets, but substations made it much more complex, which is why I focused the post on them).  Unfortunately, I no longer think there can be any consistent methodology, at least the way I was doing it - which was trying to follow roughly the way the requirement flows.

I do now believe there could be a consistent methodology (and one that could be understood, which is most important), but it wouldn’t follow R1.  Rather, it would follow the way R1 should have been written in the first place.  This includes breaking it up into four requirements, each addressing a single part of the process.  I distinguish four parts: asset/Facility identification, asset/Facility classification, BES Cyber System identification at Medium and High assets/Facilities, and BES Cyber System classification.  Before I’m done, I may combine a couple of these, and I may add another one.  I will do this in a future post, but it may not be for a little while.  My record for completing Part II’s of my Part I posts isn’t stellar.

You might ask what good it does to break R1 into four requirements, since it obviously is only one now, and there is no longer a chance that it can be changed.  The only officially sanctioned way to fix the problem would be to draft a new Standards Authorization Request, choose a new SDT, have them develop a new standard (which would be v7, since the current SDT is working on v6), ballot this a few times, submit it to FERC, and have FERC approve it 6-12 months later.  That’s easily a four-year process. 

But something has to be done soon about the CIP-002-5 R1 problem - by NERC, FERC, the regional entities, Godzilla, Vladimir Putin…somebody.  And whatever is done (perhaps special guidance for the auditors from NERC) will have to break somebody’s rules – NERC’s, FERC’s, the Regional Entities’.  This is too serious a problem to try to fix it using the Muddle Through approach anymore.

Before I write a post about how CIP-002-5 R1 should read, I’d like to tell you what’s wrong with the requirement.  You may point out that I’ve done that before.  In fact, I just went back and counted 16 posts I’ve written on what’s wrong with this requirement, starting with one soon after FERC’s NOPR last April (I started out to write a series of posts on the whole of CIP v5 starting with CIP-002-5, but I immediately realized there were some serious problems with the wording of R1.  I started writing about those problems, and you could say I haven’t ever gotten beyond 002).

As I’ve said before, I don’t think I’m wasting my (and your) time by writing about these problems, since CIP-002-5 is the foundation for the rest of CIP v5 (and of v6, since CIP-002 is unchanged in v6).  That’s why I’m taking the liberty now to try to summarize everything I see wrong with it (I’m sure I’ll miss something, though.  There’s such a wealth of material to write about)[ii], before I plunge into quasi-rewriting it again.[iii]

I wish I could give a single pithy statement (or a paragraph) that summarizes all of the problems with CIP-002-5 R1 in one fell swoop.  But I can’t.  I’ll just deal with these problems in something like a logical order.

Conciseness is not a Virtue
When I started really digging into CIP-002-5 R1 last April, one of the things that most struck me was its conciseness.  In CIP Version 3, there were three steps required to identify Critical Cyber Assets, each with its own requirement:

  1. R1: The entity was required to develop a risk-based methodology for identifying Critical Assets (i.e. the “big iron”).
  2. R2: The entity needed to apply this methodology (the RBAM) to its set of assets to determine which were in fact Critical Assets.
  3. R3: The entity needed to “develop a list of associated Critical Cyber Assets essential to the operation of the Critical Asset.”  This is the “little iron”.
And that was it.  It almost makes me want to cry to think how straightforward it was (yes, yes, I know there were a lot of issues and disagreements about the actual meaning of these words.  But the words didn’t contradict themselves, and every step was laid out explicitly).

Now we have CIP-002-5 R1.  That one requirement is supposed to do everything in the three requirements above (and more, since the process in v5 has at least one more step).  I’m not saying it would be absolutely impossible to do that, but unfortunately it hasn’t been done here.  The result is a requirement that is remarkably concise but also remarkably vague and contradictory.  Conciseness might be a real virtue when you’re writing haiku poetry, but it isn’t when you’re writing a requirement with huge penalties for non-compliance.

As an example, there is one entire step – identification of BES Cyber Assets and BES Cyber Systems – in CIP-002-5 R1 that is never stated directly, but simply implied through the definitions of the words used (this step had its own separate requirement in CIP versions 1-3).  Let’s look at how R1 has you identify your BES Cyber Assets and BES Cyber Systems.  We first find these two directives:

1.1. Identify each of the high impact BES Cyber Systems according to
Attachment 1, Section 1, if any, at each asset;
1.2. Identify each of the medium impact BES Cyber Systems according to
Attachment 1, Section 2, if any, at each asset

This is promising, since both of these requirement parts speak of identifying BCS.  Let’s go to Attachment 1 to find out more.  Surely that will tell us how we identify High and Medium impact BCS.

We arrive at Section 1 of Attachment 1, and find it reads

1. High Impact Rating (H)
Each BES Cyber System used by and located at any of the following:
(followed by the four High impact criteria.  Section 2 of Attachment 1 reads essentially the same way, although this time for Mediums)

Does this tell us how to identify BCS?  No, it starts with the assumption we’ve already identified our BCS (in fact, all of our BCS, not just Highs).  When we get to Section 1, we are to classify them by comparing the pre-existing BCS list with the four High criteria.  But how do we identify BES Cyber Systems in the first place?

All we can do is look to the Glossary for the definition of BES Cyber System:

One or more BES Cyber Assets logically grouped by a responsible entity to perform one or more reliability tasks for a functional entity.

OK, so now we have to go back to the definition of BES Cyber Asset (which I won’t reproduce here – you should know it by heart by now).   We clearly need to start with that definition, then group the BES Cyber Assets we’ve identified into BES Cyber Systems per the definition.[iv] But why can’t the standard tell us to do this?  Why is BCS identification made completely implicit in CIP-002-5?  Why don’t we have a requirement (like R3 in v1-3) that reads something like “Identify your BCA per the definition, then identify your BCS per the definition”?
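To make the implied step concrete, here is a minimal Python sketch of the two-stage process the definitions seem to call for: filter the cyber asset inventory down to BES Cyber Assets, then group those into BES Cyber Systems. The `CyberAsset` class, the `is_bca` flag and the `reliability_task` label are all hypothetical names invented for illustration. Whether a device meets the BES Cyber Asset definition is an engineering judgment that the code simply takes as an input, and grouping by a single task label is only one of many ways an entity might choose to group.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CyberAsset:
    name: str
    is_bca: bool           # assumed outcome of applying the BES Cyber Asset definition
    reliability_task: str  # hypothetical label used here for logical grouping


def identify_bcs(cyber_assets):
    """Keep only BES Cyber Assets, then group them into BES Cyber Systems."""
    systems = defaultdict(list)
    for asset in cyber_assets:
        if asset.is_bca:
            systems[asset.reliability_task].append(asset.name)
    return dict(systems)


inventory = [
    CyberAsset("relay-1", True, "protection"),
    CyberAsset("relay-2", True, "protection"),
    CyberAsset("hmi-1", True, "control"),
    CyberAsset("printer-1", False, "office"),
]
bcs = identify_bcs(inventory)
print(bcs)
```

Nothing in this sketch is difficult; the point is that R1 never states either stage, so an entity has to reconstruct both from the Glossary.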

These are rhetorical questions, of course.  There was never a conscious decision made by the Standards Drafting Team not to have an explicit step for BCS identification.  Rather, the fact that this step is implicit is a result of a fundamental inconsistency at the heart of CIP-002-5 R1, which I will discuss in the very next section.

I could bring up other examples of the over-conciseness of R1 (and will bring up one more later on), but I want to move on to what I consider the most important problem of CIP-002-5 R1.

Have an Apple, Adam?
There is one huge problem that I refer to as the Original Sin of CIP-002-5 R1.  That is the fact that it is torn between two distinct approaches to what it wants to be when it grows up and ends up trying to address both of them, although one more than the other.  Unfortunately, the approach it favors is the one it shouldn’t favor, but in any case trying to straddle the fence like this causes big trouble.  However, I speak darkly; let me explain.

Let’s go back to CIP versions 1-4.  In each of these versions, there were two types of things (for want of a better word) that were identified in CIP-002.  The first was Critical Assets; these were identified using the RBAM in v1-3 and the bright-line criteria in v4.  The second was Critical Cyber Assets, which were identified using the language in CIP-002-3 R3 quoted above.  You had the “big iron” and the “little iron”.  CCAs (the little iron) were defined as “essential to the operation of” a Critical Asset (the big iron), which meant you had to first identify the latter, then the former.  The approach was to first identify the Critical Assets, then identify the Critical Cyber Assets, using the definition provided in the CIP-002 standard.

The first of the two approaches fighting for the soul of CIP-002-5 R1 is basically the v1-4 approach, which requires starting with the “big iron” – the assets themselves.  Where does this approach appear in the requirement?  Well, I used to think it appeared in the list of six assets in R1, but I no longer think so (see more on this below).  The only place it definitely appears is R1.3:

1.3. Identify each asset that contains a low impact BES Cyber System
according to Attachment 1, Section 3, if any (a discrete list of low impact
BES Cyber Systems is not required).

In other words, when it comes to Low impact assets and cyber assets, the only thing that matters from the CIP v5 point of view is the asset itself.  That is the only thing that has to be identified for Lows, and the requirement that applies to Lows (both the current CIP-003-5 R2 and the new, improved CIP-003-6 R2 in CIP version 6) only requires controls on the level of the asset itself, not the individual cyber assets associated with it.[v]

However, despite just one clear use of the concept of assets to determine scope in CIP-002-5 R1, I not only believe this is one of the two main approaches to the requirement, but I believe it is the one that should be the basis for any effort to “fix” CIP-002-5 R1.  Why do I say this?  It’s because I have talked to many NERC entities about their R1 compliance methodology, and without exception they say they’re using this approach: identify and classify the “big iron” using the bright-line criteria, and then identify the “little iron” or BCS at/associated with High and Medium impact assets/Facilities.  So we can’t just ignore this approach as being “wrong”.  If close to 100% of entities are saying and doing one thing and the requirement seems to be implying the opposite, what needs to give way is the interpretation of the requirement, not the entities.[vi]

Let’s look at this first approach.  What are the equivalents of Critical Assets and CCAs in CIP-002-5?  Clearly, CCAs are equivalent (more or less) to BES Cyber Systems.  But what are the equivalents of Critical Assets?  You might point to the list of six asset types in R1; however, as I’ll show in the next section, these don’t perform the same function at all as Critical Assets did in v1-4.  They are merely the six types of locations you can go to hunt for BCS; the BCS don’t have a direct relationship to them, as CCAs do to Critical Assets in v1-4 (“essential to the operation of…”). 

Then what is the equivalent of Critical Assets in v5?  The only place left to look is the bright-line criteria in Attachment 1.  But what do these refer to?  There is no single word or phrase you can use to categorize the subjects of the criteria.  Some criteria refer to Facilities, others to Control Centers.  These are both defined terms – that’s good.  But the rest of the criteria use a hodge-podge of nice, creative terms (“Commissioned generation”, “BES reactive resource”, “system or group of Elements that performs automatic Load shedding”, etc) that really can’t be summarized in any pithy word or phrase like “Critical Asset” (and don’t correspond to the six asset types in R1, either).  Of course, this is not inherently a deficiency in CIP version 5, but it needs to be understood all the same.

In other words, all you can really say about what the bright-line criteria in Attachment 1 refer to is that they refer to their subjects[vii].  They certainly do not refer to the six asset types in R1, which is what virtually everybody who I’ve heard describe a methodology for CIP-002-5 R1 compliance says (and it’s what I said, too, until a week or two ago.  It’s almost scary how a whole bunch of intelligent people, including me, can believe something to be true, which a 10-minute review of the wording would have immediately revealed to be wrong.  And I wouldn’t have realized it myself had I not been led to the discovery through an in-depth email discussion I was having with Joe Garmon - see footnote i for more on him).

Let’s go to the second of the two approaches vying with each other in R1.  This approach – which underlies most of the wording of R1 and Attachment 1 – can be summarized as

  1. Identify BCS
  2. Run these BCS through the Attachment 1 criteria to classify High and Medium impact BCS, as well as “assets that contain a low impact BES Cyber System” (in other words, Low impact assets).
This is the approach that many knowledgeable people – and I’m sure most if not all former SDT members - will say is the correct interpretation of the CIP version 5 asset identification and classification process.  At first, it appears wonderfully simple – just two steps.  But let’s see how they’d work in practice.

The real killer is the first step.  I want to point out again that neither R1 nor Attachment 1 ever explicitly requires the entity to identify BES Cyber Systems.  However, it does tell you to classify[viii] those BCS, and you clearly can’t classify them if you haven’t identified them in the first place.  But do we have to identify all of our BCS?  R1 says explicitly that no inventory of Low impact BCS is required; clearly there shouldn’t be a requirement (even implicit) to identify Low BCS, since that would require an inventory.

However, I see no way to correctly interpret Attachment 1, other than as requiring that before the entity starts doing any classification of BCS, it needs to have identified all of its BCS – High, Medium and Low (since before classification it is impossible to know which BCS will be High or Medium, and which will be Low).  This means the entity needs to not only inventory all of the cyber assets in its control environment, but determine which are BES Cyber Assets and then group them into BCS. 

Before you say “Oh, that can’t be!” let me walk you through Attachment 1.  The first operational parts of Attachment 1 are Sections 1 and 2, in which it clearly assumes you already have a comprehensive BCS list, and only have to run that list through the High and Medium (respectively) impact criteria to classify High and Medium impact BCS.   For example, Section 1 reads:

Each BES Cyber System used by and located at any of the following:

As has already been pointed out, this phrase implies that all BCS have already been identified, and the only remaining task is classification.

Furthermore, Section 3 of Attachment 1 has you identify

BES Cyber Systems not included in Sections 1 or 2 above that are associated with any of the following assets and that meet the applicability qualifications in Section 4 ‐ Applicability, part 4.2 – Facilities, of this standard:

These of course would be Low impact BCS.  And how are they defined?  As BCS that haven’t already been classified as High or Medium impact.  So how else could an entity strictly comply with Attachment 1, other than by having started out with a list of every single one of their BCS, then run them through Attachment 1 to classify them?[ix]
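A minimal Python sketch may make the dependency clearer. It assumes a much-simplified reading of Attachment 1 in which each BCS is tied to exactly one subject of the criteria, and it leaves out the Section 3 and Section 4.2 applicability qualifications. The function name, the dictionary layout and the sample names are all hypothetical.

```python
def classify_bcs(all_bcs, high_subjects, medium_subjects):
    """Classify every BCS in a pre-existing list, Attachment 1 style.

    all_bcs maps a BCS name to the subject of the criteria it is tied to.
    Low is purely residual: whatever was not rated High or Medium.
    """
    ratings = {}
    for name, subject in all_bcs.items():
        if subject in high_subjects:
            ratings[name] = "High"
        elif subject in medium_subjects:
            ratings[name] = "Medium"
        else:
            ratings[name] = "Low"  # "not included in Sections 1 or 2 above"
    return ratings


all_bcs = {
    "ems": "control-center-1",
    "relay-group": "substation-7",
    "plant-dcs": "small-plant-3",
}
ratings = classify_bcs(all_bcs, {"control-center-1"}, {"substation-7"})
print(ratings)
```

Because the Low rating falls out of the `else` branch, a BCS can only be recognized as Low if it was in `all_bcs` to begin with. That is the full inventory the strict wording seems to presuppose.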

Of course, I can say with confidence that no CIP auditor (at least one who values his or her job and perhaps his or her life) is going to give you a PV for not having created a comprehensive list of BCS before you do any classification at all.  Because if you did, then the provision in R1.3 – “a discrete list of low impact BES Cyber Systems is not required” – wouldn’t apply.  But there is clearly a disconnect here between Attachment 1 and this provision in R1.3.

How do the auditors plan to resolve this disconnect, so they can keep their jobs?  I can’t speak for all of the auditors in all the regions, but I can speak for two auditors who have made their views – which are their own opinions, not necessarily those of their regions – known in presentations.

The first is Kevin Perry, the chief CIP auditor of SPP.  He gave a very good webinar in February in which he went over the entire R1 BCS identification process.  I recommend you read the narrative from the webinar, as well as the slides.  As you’ll see below, I don’t agree with a lot of what he says. However, he has come up with a fairly consistent interpretation of CIP-002-5 R1 that is without doubt closer to the wording than mine is; I just think his interpretation is hard to understand and use, without providing additional insight over my interpretation.

Kevin definitely believes that the purpose of CIP-002-5 R1 is to identify and classify BES Cyber Systems, and that classifying assets isn’t an integral part of that process – so he basically agrees with most of the wording of R1 and Attachment 1.  On the other hand, he doesn’t require entities to identify BCS at Low assets, as would seem to be required by the strict wording of Attachment 1 (which as I just said, seems to require identification of all BCS – High, Medium and Low – before any classification is done).  How does he navigate this tricky issue?

He does this by stating that the entity should, before even looking at BES Cyber Systems, identify “assets likely to include High or Medium impact BCS”.  It is only after having done this that the entity identifies and classifies BCS at or associated with those assets.  How does he handle Low assets?  Even though R1.3 “defines” a Low asset as one that “contains a Low impact BCS”, he still doesn’t require the entity to identify any BCS at Low assets.  I believe he will simply take the entity’s word that an asset contains a Low BCS and is therefore a Low (I’m willing to guess that he will challenge an entity that asserts that a BES asset that hasn’t been classified Medium or High, but still contains some cyber assets that control the asset in some way, isn’t even a Low.  So the entity will have to prove there aren’t BCS in an asset in order to assert that it isn’t even a Low, but they won’t have to show there are BCS in an asset in order to say it is a Low).

Even though Kevin is making sure not to imply that entities need to identify Low BCS, in my opinion this is making the whole situation much more confusing than it should be.  Calling an asset “likely to include…” a High or Medium BCS is tantamount to saying the asset is High or Medium impact.  Why not just eliminate the middle man and say the asset is High or Medium impact?

The other auditor who has weighed in on this is Joe Baugh of WECC.  He recently gave a presentation on CIP-002-5 that presents a different approach.  He doesn’t even pretend that assets don’t have ratings.  He describes a “top-down” approach to asset identification[x], in which the entity first classifies its assets as High, Medium and Low; the entity then identifies BCS at or associated with the High and Medium assets, and simply lists the Low assets themselves.

This was how I characterized the CIP-002-5 R1 process until very recently, and I still prefer it to Kevin’s approach.  However, I’m willing to stipulate that the two processes, when properly implemented, will produce the same results: i.e. the same lists of High and Medium impact BCS and of Low impact assets.  Kevin’s approach will be more confusing and harder to understand, but it is not fundamentally different from Joe’s.
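As a rough illustration of the asset-first flow, here is a minimal Python sketch. It assumes the assets have already been rated against the bright-line criteria, which is the hard part and is not modeled here. The function name, both dictionaries and the sample names are hypothetical and are not taken from either auditor's presentation.

```python
def top_down(asset_ratings, bcs_at_asset):
    """Rate the assets first, then look for BCS only at High and Medium assets."""
    bcs_ratings = {}
    low_assets = []
    for asset, rating in asset_ratings.items():
        if rating in ("High", "Medium"):
            # BCS inherit the rating of the asset they are at or associated with.
            for bcs in bcs_at_asset.get(asset, []):
                bcs_ratings[bcs] = rating
        elif rating == "Low":
            # Low assets are simply listed; their cyber assets are never consulted.
            low_assets.append(asset)
    return bcs_ratings, low_assets


asset_ratings = {
    "control-center-1": "High",
    "substation-7": "Medium",
    "small-plant-3": "Low",
}
bcs_at_asset = {
    "control-center-1": ["ems"],
    "substation-7": ["relay-group"],
    "small-plant-3": ["plant-dcs"],
}
bcs_ratings, low_assets = top_down(asset_ratings, bcs_at_asset)
print(bcs_ratings, low_assets)
```

Note that the entry for `small-plant-3` in `bcs_at_asset` is never read, which is how this flow stays consistent with the "discrete list ... is not required" language in R1.3.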

So are we – gulp – making progress here?  After all, if I agree that both Joe and Kevin will get you to the same place, then all it takes to tame the CIP-002-5 R1 beast is to get the different regions to agree on one of these two methodologies.  And indeed, that was my hope when I wrote this post in March.

However, I’m afraid we’re not making progress, but going backwards.  Because I’ve come to realize that, while Kevin and Joe may both lead you to the same place, it’s the wrong place.  They’re both missing some key points about the scope of assets in CIP-002-5.  And that brings me to my last big problem with v5 (at this point, I’ll excuse you if you want to get popcorn or have a deep massage.  I still have a while to go yet).

Questions of Scope
As I’ve just implied, I think the biggest problem with the wording of CIP-002-5 R1 is the huge ambiguity about scope.  What is the “big iron” that’s in scope for CIP-002-5, that needs to be considered in Attachment 1?  Section 4.2 of CIP-002-5 is supposed to give you the scope of assets that are “eligible” for compliance with CIP v5.  The first paragraph reads

Facilities: For the purpose of the requirements contained herein, the following
Facilities, systems, and equipment owned by each Responsible Entity in 4.1 above
are those to which these requirements are applicable. For requirements in this
standard where a specific type of Facilities, system, or equipment or subset of
Facilities, systems, and equipment are applicable, these are specified explicitly.

Following that is a discussion of the Facilities, systems and equipment that are in scope for Distribution Providers, but I’ll skip that here.[xi]  For the rest of the NERC entities that have to comply (listed in Section 4.1), here is what’s in scope:

Responsible Entities listed in 4.1 other than Distribution Providers:
All BES Facilities.

OK, so you have to include all of your Facilities.  Let’s look up the definition in the NERC Glossary:

A set of electrical equipment that operates as a single Bulk Electric System Element (e.g., a line, a generator, a shunt compensator, transformer, etc.)

You’ll notice the capitalized E in Element – this means there’s a definition for that, too.  It’s

Any electrical device with terminals that may be connected to other electrical devices such as a generator, transformer, circuit breaker, bus section, or transmission line. An element may be comprised of one or more components.

So it seems that BES Facilities are the “big iron” that’s in scope for v5.  But, given the definition of Facility, how could a control center ever be one?  I don’t know too many control centers that have terminals on them.[xii]  Yet does this mean that control centers are out of scope for v5?  If you take Section 4.2 at its word as setting the scope for CIP v5, then I’d say they are.  However, as we get into this discussion, you’ll see that – surprise, surprise – control centers are actually included in v5.  But that requires taking other wording in R1 and Attachment 1 more seriously than the wording in Section 4.2.  This is unfortunately how things work in CIP-002-5 R1: You have to choose which wording you’re going to follow and which you’re going to ignore.  Does that make you feel empowered?  I didn’t think so.

Let’s move on to R1.  There we find the following:

Each Responsible Entity shall implement a process that considers each of the
following assets for purposes of parts 1.1 through 1.3: [Violation Risk Factor:
High][Time Horizon: Operations Planning]
i.Control Centers and backup Control Centers;
ii.Transmission stations[xiii] and substations;
iii.Generation resources;
iv.Systems and facilities critical to system restoration, including Blackstart
Resources and Cranking Paths and initial switching requirements;
v.Special Protection Systems that support the reliable operation of the Bulk
Electric System; and
vi.For Distribution Providers, Protection Systems specified in Applicability
section 4.2.1 above.

Well, it’s nice to see that control centers made this list, along with the other usual suspects, Transmission substations and Generating stations.  Since this section says we’re to “consider” each of these assets for the purposes of parts 1.1 to 1.3, that must mean this list is really what’s in scope for consideration in Attachment 1, right?  But if so, why did we go to all the trouble of discussing “Facilities, systems and equipment” in Section 4.2?  Why not just give this list in 4.2 and say these are what’s in scope?

Let’s go to Attachment 1 to see what it does with this list of six assets.  What do we find?  As we go through the High impact criteria in Section 1, we find they all apply to “control centers”.  Well, that’s the first item on the list of six asset types, so this makes sense.  But things get hairy in Section 2.  Some of the criteria do vaguely resemble items on the asset list, but how about criteria 2.2 (reactive resources), 2.9 (“Each Special Protection System (SPS), Remedial Action Scheme (RAS), or automated switching System that operates BES Elements..”), and 2.10 (automatic load shedding systems)?  These aren’t on the list at all.  Why did the SDT carefully provide us this list of six asset types and tell us to consider them in Attachment 1, then ignore some of them and add some new ones when we actually get to Attachment 1?

I won’t leave you in suspense.  It’s because the list of six assets in R1 isn’t for comparing with the criteria in Attachment 1; rather, it’s for determining where BES Cyber Systems can be found.  It’s saying that, when you get to Attachment 1 Section 2 and you’re looking for BCS “associated with” the things (I won’t call them assets) listed in those criteria[xiv], you should only look for them at one of those six asset types.[xv]  As an example, an HMI associated with a Medium generating station might be located at a Low generating station, and if it were a BCS it would be a Medium.  But if that same HMI were located in my bedroom, it wouldn’t be a Medium BCS since it’s not located at one of those six asset types.
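The HMI example can be written as a few lines of Python. This is only a sketch of the reading described above, not of anything the standard states outright: the short labels in `R1_ASSET_TYPES` are my own abbreviations of the six R1 list items, and `bcs_rating` is a hypothetical helper.

```python
# Abbreviated labels for the six asset types listed in R1 (i through vi).
R1_ASSET_TYPES = {
    "control center",
    "transmission station or substation",
    "generation resource",
    "restoration system or facility",
    "special protection system",
    "distribution provider protection system",
}


def bcs_rating(associated_subject_rating, location_type):
    """Rating comes from the associated subject; the location only acts as a filter."""
    if location_type not in R1_ASSET_TYPES:
        return None  # not located at one of the six asset types
    return associated_subject_rating


# An HMI associated with a Medium generating station, in two different locations.
hmi_at_low_plant = bcs_rating("Medium", "generation resource")
hmi_in_bedroom = bcs_rating("Medium", "bedroom")
print(hmi_at_low_plant, hmi_in_bedroom)
```

The rating of the site where the HMI sits never enters the function. Under this reading, the six asset types answer "where may I look?" and the criteria answer "what rating does it get?".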

To be honest, in this case I’m not saying that CIP-002-5 R1 is contradictory (the Lord knows it is contradictory in plenty of other areas).  But it’s only after a year of working heavily on R1 that I’ve realized this, and that’s the whole point.  This part of the requirement is another example where R1 is simply way too subtle for its own good.  If the SDT meant to have these six asset types be the possible repositories of BCS but not the fodder you feed directly into the Attachment 1 criteria, why didn’t they say that explicitly?  Why should it take me a year to figure out (and only then do so because someone clued me in on it – see footnote xv)?  Meanwhile, both Kevin Perry and Joe Baugh (and probably a lot of other auditors) are still saying that these six asset types are listed in R1 because they’re what you feed into Attachment 1.  Not a good situation, when even the auditors are missing the point of the wording.

If the above were the only real problem of scope in R1, it wouldn’t be such a big problem.  It’s very confusing to be told that you need to first consider “Facilities, systems and equipment” as what’s in scope, then that what’s really in scope are six asset types, and finally find out in Attachment 1 that what’s really really in scope are a hodgepodge of assets as well as something called “Facilities”.  Confusion can be fixed with education, but that assumes that the subject matter is clearly enough defined that it can be addressed in a fixed course or document; that is simply not the case with CIP-002-5 R1 as currently worded.[xvi]

However, there is a bigger issue of scope in CIP-002-5 R1.  Criteria 2.3 – 2.8 of Attachment 1 refer to Facilities, as different from assets.  In 2.3, I believe that Facilities refers to units of a generating station that have been designated as important to reliability.  The implication of this criterion is that the unit or units so designated will be Medium impact, and the other units in the plant will be Low impact.

Criteria 2.4 – 2.8 refer to substations[xvii].  In these criteria, I believe that Facilities – the NERC definition of which has already been stated above – refers to lines, transformers, etc. located at a substation.  In these criteria, the Facilities are classified, not the assets (substations) where they are found.  Therefore, in each of these criteria there can be some lines at a substation that are Medium impact, whereas others are Low impact.  Their associated BES Cyber Systems (whether or not located at the substation subject to the criterion) will be Medium impact.

This fact has not been publicized at all.  Other than some large Transmission entities that have closely studied this matter (and NATF), every entity I have discussed this with – and literally every presentation I have seen or read by NERC or Regional Entity personnel – has assumed that Criteria 2.4 to 2.8 refer to the substation (the asset).  Given this interpretation, every BES Cyber System associated with that substation will be Medium impact, not just those BCS that are associated with a Medium impact Facility (usually a line) at the substation.  I do not see how this interpretation is supportable, and given that a number of large transmission entities are basing their interpretation on the viewpoint I have just stated as my own, there will be many problems when audits of CIP version 5 compliance begin.  Unless something is done, of course.
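The gap between the two readings shows up even in a toy case. The following Python sketch assumes a substation with two lines, one meeting a Medium criterion and one not, and one BCS tied to each line. All names are hypothetical, and the rule that a single Medium Facility makes the whole substation Medium is my paraphrase of the asset-level reading.

```python
def rate_by_facility(bcs_to_facility, facility_rating):
    """Facility-level reading: each BCS takes the rating of its own Facility."""
    return {bcs: facility_rating[fac] for bcs, fac in bcs_to_facility.items()}


def rate_by_substation(bcs_to_facility, facility_rating):
    """Asset-level reading: one Medium Facility makes every BCS at the substation Medium."""
    substation_rating = "Medium" if "Medium" in facility_rating.values() else "Low"
    return {bcs: substation_rating for bcs in bcs_to_facility}


facility_rating = {"line-A": "Medium", "line-B": "Low"}
bcs_to_facility = {"relay-A": "line-A", "relay-B": "line-B"}

by_facility = rate_by_facility(bcs_to_facility, facility_rating)
by_substation = rate_by_substation(bcs_to_facility, facility_rating)
print(by_facility)
print(by_substation)
```

The two functions disagree only about `relay-B`, but at a real substation that disagreement could cover most of the BCS on site, which is why an entity and its auditor using different readings would be a problem.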

What’s to be Done?
To conclude this post (I’ll bet you thought you’d never see those words!), I believe CIP-002-5 R1 and Attachment 1, as currently worded, are too deeply flawed to be the basis for asset identification and classification in CIP version 5.  This is because of the many instances of missing or unclear wording, as well as outright contradictions.

However, I agree there is no longer any opportunity to change the actual wording short of FERC remanding Order 791, which isn’t going to happen.  I did hope for a while that the regions could put their heads together and come up with a common interpretation, but there is no sign of this happening, either.

The other solution may be for NERC to issue some sort of audit guidance on this requirement (and perhaps on others as well); if this were uniformly adopted by the regions, it would at least put a Band-Aid™ on the wound[xviii].  But we’re now just 22 months away from April 1, 2016.  It is inexcusable that there is still so much uncertainty[xix] at this point.[xx]

Note: I won’t be putting notices of new posts on LinkedIn anymore.  If you’ve been relying on those to learn of new posts (or even if you haven’t), I suggest you subscribe to the FeedBurner feed by entering your email address in the box at the top of this post.  And in case you’re wondering, I can’t see any of the email addresses that have been entered, although I can see their total (now about 130), as well as useful information such as how many are from Uzbekistan vs. from Botswana.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Honeywell.

[i] A large number of them have been with fellow blogger Joe Garmon, who is also stupid enough to want to take on CIP-002-5 in his posts (I told him to just stick to writing about the 2016 Presidential race, but no, he wouldn’t listen to me).  They have been quite good, although I don’t agree with him on everything he puts up (and he doesn’t agree with everything I put up – but then I don’t agree with a lot of it, either).  In any case they have been very good discussions, and we’ve each learned a lot from the other.

[ii] There is another reason why I’m writing this post.  It is based on the document I submitted to FERC on May 21 on this subject – problems with CIP-002-5 R1.

[iii] I did rewrite it last year in this post.  I recently reread that post and thanked my lucky stars that FERC hadn’t ordered NERC to implement it.  It wasn’t terrible, but I’ve learned so much more since then about how v5 should be written, as opposed to how it actually is written.

[iv] Even this simplifies the process, since the Guidelines and Technical Basis for CIP-002-5 describes the BES Reliability Operating Services, which provide a second way to identify BCS.  I call this approach the “top-down” one, vs. the “bottom-up” approach of identifying BCAs and grouping them into BCS.  I personally believe that it’s important to use both approaches, since in some cases they will yield different results, and you may end up over- or under-identifying BCS if you don’t use both approaches.

[v] The phrase “assets that contain a low impact BES Cyber System” doesn’t actually require the entity to identify Low BCS.  It is merely a circumlocution that allows R1 to maintain the fig leaf that it is really for identifying and classifying BES Cyber Systems, not assets (although it does allow an entity to exclude BES assets that don’t contain any cyber assets at all from even being Lows, since they obviously couldn’t contain a Low BCS).  More on this in a moment.

[vi] Note that I wouldn’t say this in the hypothetical case that the requirement were written clearly, but for some reason 100% of entities didn’t comply with it.  That would be a different issue.   You don’t just change a requirement because people don’t understand it.  But it has to be understandable in the first place, and that isn’t the case in CIP-002-5 R1.

[vii] In Joe Garmon’s most recent post, he tries to summarize the subjects of the BLC as “Facilities, locations, Systems and Control Centers”.  This is a valiant try, but even this isn’t enough to encompass all of the different subjects of the BLC.  I think with two or three more terms you might summarize those subjects, but why bother?  Entities just have to look at each criterion to figure out what it applies to.  This isn’t elegant at all, but it would be simply wonderful if lack of elegance were the only real problem with CIP-002-5 R1.

[viii] If you’re part of one of the numerous NERC entities (what used to be called the “Silent Majority”, to bring back a term from the Nixon era) that subscribe to the first interpretation of CIP-002-5 R1 (i.e. first identify your big iron, then the little iron associated with it – this is of course the correct interpretation in my view), you might find it a little strange that someone would think the bright-line criteria are for classifying BES Cyber Systems.  After all, without exception, the criteria all refer to types of assets (big iron), not cyber assets (little iron).

But, strange to say, I’m not sure you’ll find anybody at NERC or the regional entities who will say anything other than that the criteria refer to cyber assets (BCS in particular).  These people can point to the fact that the overall structure of Attachment 1 does state clearly it’s for BCS classification, not asset classification.  I won’t argue with that – in fact, that’s why I’m saying there can’t be a consistent interpretation of CIP-002-5 R1 and Attachment 1 that doesn’t ignore a lot of the wording of Attachment 1.

To support this assertion, I point you to an obvious fact: The CIP v5 bright-line criteria are based on the CIP v4 criteria (some are worded exactly the same, others differ, and a few were added or removed).  Those v4 criteria referred to assets, not cyber assets; they were there to give entities “bright lines” for identifying Critical Assets, vs. the alleged ambiguity of the RBAM process in CIP versions 1-3.   The v5 SDT essentially wrote the little preamble phrases to each section (“Each BES Cyber System used by and located at any of the following:” and “Each BES Cyber System, not included in Section 1 above, associated with any of the following:”) under the mistaken idea that inserting these would magically transform the criteria into criteria for BCS, not assets.

Unfortunately, that didn’t work, and it couldn’t have.  If the SDT had wanted real criteria for BCS, they would have had to create completely different criteria such as operating system (for example, Windows BCS might be considered High or Medium risk, while non-Windows BCS might be considered Low risk).   Of course, this would also require changing the overall schema for High-Medium-Low in v5 from one based on supposed impact to the BES to one based on inherent risk of the cyber asset.  That is another point on which I feel there has been a great deal of self-delusion.  But I’ll save that discussion for another day.

[ix] Of course, the entity comes to Section 3 through R1.3, which does say that an inventory is not required.  This is roughly like ordering someone to steal food, but prefacing that order with a statement that no law shall be broken in the process of complying with that order.

[x] Not to be confused with what I call the top-down approach to BES Cyber System identification, described in footnote iv above.  Joe’s use of the terms “top-down” and “bottom up” refers to differing approaches to the overall methodology for compliance with R1, not to the specific question of identifying BES Cyber Systems – which is how I use those terms.

[xi] Not that I’m trying to slight DP’s, but this post will be complicated enough as it is.  Some of my best friends are DP’s.

[xii] I wish to thank a person from a Canadian entity who pointed this anomaly out to me last summer.

[xiii] I’m told by people who know a lot more about this that trying to clearly identify Transmission vs. Distribution substations is very hard in practice.  And it’s even harder when they share the same fence.  This isn’t a problem with CIP-002-5 R1 itself, but rather with the very concept of using bright-line criteria.  As I wrote in a post in 2012, which is reproduced in this post from last year, the electric power industry is so fragmented and individualized that the idea of having bright lines is very hard to realize in practice.  I don’t know what can be done about this problem, other than lots of guidance by NERC and the regions.

[xiv] Note I’m deliberately referring just to Section 2 (Mediums) of Attachment 1, not Section 1 (Highs).  This is because High BCS need to be “used by and located at” the asset – so there’s no question where you’ll find those.  Since Medium BCS just have to be “associated with” the asset/Facility, they can be elsewhere – but only those associated BCS that are located at one of the six asset types are Medium impact; the rest are Low impact.
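For readers who think better in code, here is a minimal sketch of the location rule as I read it in this footnote.  It is an illustration only, not anything found in the standard: the `CyberSystem` class, its field names and the `classify` function are all hypothetical, and the fall-through to Low for any case the footnote doesn’t address is my own assumption.

```python
from dataclasses import dataclass


@dataclass
class CyberSystem:
    """Hypothetical record for one BES Cyber System (names are illustrative)."""
    name: str
    asset_impact: str        # impact level of the related asset/Facility: "high", "medium" or "low"
    located_at_asset: bool   # True if the system sits at one of the six R1 asset types
    used_by_asset: bool = True


def classify(bcs: CyberSystem) -> str:
    """Apply the footnote's reading of the location rule (a sketch, not the standard)."""
    # Section 1: High BCS must be both used by and located at the asset.
    if bcs.asset_impact == "high" and bcs.used_by_asset and bcs.located_at_asset:
        return "high"
    # Section 2: Medium BCS need only be associated with the asset/Facility,
    # but only those located at one of the six asset types count as Medium.
    if bcs.asset_impact == "medium" and bcs.located_at_asset:
        return "medium"
    # Assumption: everything else, including associated systems located elsewhere, is Low.
    return "low"


relay = CyberSystem("substation relay", "medium", located_at_asset=True)
remote = CyberSystem("remote engineering workstation", "medium", located_at_asset=False)
print(classify(relay), classify(remote))
```

The point of the sketch is simply that two systems associated with the same Medium Facility can land in different impact levels depending on where they sit.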

[xv] I want to thank an Interested Party who explained this to me a couple months ago.  Before then, I wandered in the darkness of thinking that the Attachment 1 criteria actually had something to do with those six asset types.

[xvi] I have seen attempts to flow chart the CIP-002-5 R1 process, but unfortunately there is always something that is never taken into account.  For anyone trying to flow chart the requirement, I have this advice: If you want to try, chart the way the requirement should have been written, not how it is actually written.  In my opinion, as currently written the requirement simply cannot be flow charted.  Of course, that in itself is a good sign there is something seriously wrong in R1, since a requirement that can’t be flow charted can’t be complied with.

[xvii] Of course, there is an exception to this statement.  Criterion 2.6 refers to “Generation at a single plant location or Transmission Facilities at a single station or substation…”  The phrase about Generation seems to refer to a type of asset, while “Transmission Facilities…” refers to Facilities.

[xviii] It wouldn’t be a legal solution to the problem, since if an entity were to sue NERC and FERC over a penalty for CIP-002-5 R1 violation that they didn’t feel they deserved, the only thing the court would look at would be the wording of R1.  And I’ve often said that any judge who takes more than a cursory look at R1 will simply throw it out – which would invalidate the rest of CIP v5 as well.  That could happen even if NERC does issue audit guidance.

[xix] I was told by a lady from one of the largest NERC entities that she was being bugged by upper management to give estimates for annual costs of maintaining CIP v5 compliance after April 1, 2016.  The best she can do now is give estimates with a 50% margin of error either side.  She drily noted that her management wasn’t used to seeing ranges like that.  She said the problem was the big uncertainty over what assets – big and little iron – would be in scope.

[xx] There is another solution as well.  I hear that the McDonald’s in Williston, ND (the heart of the fracking boom) is offering $20/hr. with a $500 signing bonus.  Think about it – no more worrying about BCA/BCS/BLC, etc.  Tempting, no?

Wednesday, May 7, 2014

Bashing Utilities for Fun and Profit

I have been impressed (if that’s the right word) for a long time by the fact that many commentators in the popular and trade press seem to agree on one key point: the electric utility industry is rife with security vulnerabilities and is ready to fall over tomorrow, with just one small push by some 11-year-old in Estonia.  This in spite of the fact that I don’t know of a single directed cyber attack on a power facility (or facilities) that has led to an outage for even one household for one minute.  And while there definitely was a very serious physical attack on a substation in California last year (which didn't cause an outage, but well could have had the stars been aligned a little differently), the industry very quickly stepped up on their own to address problems, even before FERC ordered a mandatory standard.

This isn’t to say that the electric power industry doesn’t need to do more for cyber security, or that it doesn’t occupy a very special position that requires a lot of scrutiny; even a one-minute outage for a whole city would be devastating.  But it is to say that it’s really unproductive for people who should know better (and if they don’t know better, they shouldn’t be standing up as experts) to be making these statements with no facts to back them up.

I also think I know why these people don’t get called out more often.  It’s because the electric power industry is mostly regulated, and the entities that comprise it don’t want to stir up any controversy, even if this means sitting back and taking some pretty unfair hits.   Here are three examples of what I’m talking about:

Exhibit A is this article on the Smart Grid News website.  I need to first say that I have a lot of respect for what Jesse Berst has done for the smart grid; it’s no exaggeration to say he’s a huge reason the smart grid is as successful as it is today.  But this doesn’t excuse sloppy reporting and commentary, which is the case here[i].

The article is about NERC’s approval of CIP-014, the new physical security standard for substations ordered by FERC in early March.  The article says nothing good about the standard and makes three arguments why it is virtually worthless:

  1. “On the plus side, the proposed rules gained 82% approval. On the downside, most of the votes were from the very utilities who (sic) would be subject to these new regulations.”  This is a frightening observation – it seems the very rules that are needed to protect our way of life are being written by the thoughtless (evil?) utilities that have to follow them!  What could be worse? 
This ignores the small point that all NERC standards are drafted by the utilities subject to them, and have been since NERC was founded in 1968.  But this is even more shocking!  We should tell Congress!  However, Congress already knows this, since they wrote the Energy Policy Act of 2005.  This was the law that put in place the current structure in which FERC is the regulator while NERC drafts standards, submits them for FERC approval (or rejection), and audits the member entities for compliance.

Congress didn’t pass this law on a whim late one night.  It was very well vetted with lots of hearings, testimony, etc.  And there are very good reasons why this structure was put in place, one of the most important being that neither FERC nor any other organization has the ability to understand all the nuances (many extremely local and particular) of the electric power industry.  It is best that the people most involved day-to-day draft the standards, while FERC makes its own judgment regarding their adequacy, and it often pushes back.  In fact, it seems that is happening now with the new CIP requirements for Low impact assets (more on that in a later post).

  2. “Proponents of much stiffer security safeguards wanted things like blast barriers to protect transformers and other critical equipment. NERC's proposed rules, however, allow utilities to determine on their own whether or not their substations are critical...”  This is of course also very shocking – it seems the fox is not only guarding the henhouse, but is also judge and jury for any case that might be brought against him for violating the new standard. Because all the standard says is that utilities need to decide on their own what it applies to!  Of course, we all know that these sinister utilities will choose to apply it to few if any substations. 
Again, I hate to introduce these messy things called facts into the discussion, but I do wish to point out that this is exactly what FERC ordered NERC to do.  FERC realized there was so much variability in substations and their environments that it would be impossible to draw up hard and fast prescriptive rules for deciding which were critical and which weren’t.  Not only that, but there was no way NERC could even develop such rules in the 90 days FERC was giving it.  Instead, FERC recommended that the standard require NERC entities to conduct a risk assessment to identify critical substations, then have that assessment reviewed by a neutral third party; the entities will also be audited on how well they did this.  Smart Grid News could have easily found this out by reading FERC’s Order from March, or my post that followed hard upon the Order.

Note from Tom on 2/20/18: It is quite interesting to point out that, not only did electric utilities not under-identify the substations subject to CIP-014, they over-identified them beyond NERC's wildest dreams. I believe there have been over 1,000 substations identified in North America as critical facilities under CIP-014. In my opinion, this is far more than should have been identified. It's simply a testament to the fact that the power industry wants to do the right thing, even if it costs them a lot of money (which this will, of course).

  3. “...and then to decide on their own what measures they should take to protect the facilities, if any.”  This is also shocking on the face of it.  It seems that, even for substations (and control centers) that the utility deems critical, they can decide all on their own what needs to be done to protect them.  And knowing utilities, they’ll decide it’s sufficient to just hang a sign that says “Do Not Attack” on every critical substation and call it a day. 
Again, this was what FERC ordered, and for the same reasons as above: there is way too much variability in substations for any sort of prescriptive rules to work.  More importantly, though, the utilities don’t “decide on their own” what measures they should take.  FERC ordered, and NERC included in the standard, that each NERC entity should have a third-party review of their physical security plan, as well as of the risk assessment they used to designate their critical substations in the first place.  And NERC has set criteria for what third parties can conduct this review.

Now let’s go to Exhibit B.  This isn’t any one particular article or post, but a number of comments I’ve seen on LinkedIn and other forums over a number of years about why the PCI standards for credit card security are so superior to NERC CIP.  These are usually posted by consultants from the IT security industry who have decided to make the switch to the Dark Side of control system security, and are anxious to prove their manhood (I haven’t seen any woman do this) by – what else? – bashing the utilities for writing wimpy standards that just don’t cut it.

I don’t know whether the PCI standards are technically more rigorous or better written than CIP.  However, I do know there’s a big difference in the cyber security records of the two industries.  We could start with Target, then go back through a whole host of massive thefts of credit card data, all the way back to TJX (TJ Maxx).  I’m sure the total losses to American business from these breaches have been in the billions (and it seems most costs are borne by the banks that have to reissue cards, not the retailers themselves).  If the electric power industry had the same record, we’d all be sitting at home in the dark (and I obviously wouldn’t be writing this post, which some might say would be a good thing).

I will point out a couple really egregious things about PCI.  First, it may be very good at protecting what it protects, but it seems all of the big breaches have been through another channel that was left unprotected by PCI.  The best case in point is Target, where it seems there was no real separation between the corporate systems and those that processed credit card data.  Someone broke into the account of an HVAC contractor (who clearly didn’t need access to credit card data and should have been excluded from it if some sort of least-privilege analysis had been applied), and was able to traverse the network to get to the real-time transaction network – which allowed them to plant their malware on the point-of-sale systems. 

Of course, this would be like having a Balancing Authority’s control center on the same network as the corporate systems, so that the same HVAC contractor’s account could be used to bring down the entire control area.  Were this to be possible, the entity would have been in gross violation of just about every CIP requirement for a long time; and at a million dollars a day per requirement violated, that would be one hefty fine.  But Target was and is completely PCI compliant!  Obviously, whatever technical merits PCI may have don’t make up for the fact that the standard is fairly narrowly focused on the systems that store and process credit card data.  CIP is similarly focused on control networks, but at least it is written from the standpoint that the control network needs to be protected from the corporate network – and the fact that this is the case is probably a big reason why there haven’t been any outages caused by cyber attack.

Another thing about PCI that’s always amazed me is that the entity being audited pays the auditor!  These auditors, called QSAs, are security firms that have passed a rigorous certification; I’m sure they’re quite qualified to do the audits.  But there is such an inherent conflict of interest in this process, since every auditor must have it in the back of their head that they’d love to be given the job of fixing all the problems they find.  You might think this would incent them to find lots of problems, but that’s not the way psychology works.  If someone tells me – in a report that has to be provided to the proper authorities – that I have all sorts of problems, do you think I’ll love them for doing that?  Or am I more likely to love someone who gives me a pass on some of the worst problems – yet still reserves a few meaty ones that he can help me fix?

Now we come to Exhibit C, which is my favorite.  It’s a speech that Mike McConnell, former NSA Director and now Vice Chairman of Booz Allen Hamilton, gave in early March of this year.  In it, he said “In my mind, there is 100% certainty that cyber attacks will occur.”  And of course, he pointed to the power industry[ii] as the likely venue for this attack.  As he said, "Just imagine being in New York City in the middle of the summer with no power." 

Sounds pretty scary, huh?  100% certain!  Of course, that’s an exaggeration, since the only events that are 100% certain are those that occurred in the past.  And I’d say one of the most costly and destructive cyber attacks in the past was that perpetrated by contractor Edward Snowden against Mr. McConnell’s former employer, the NSA.  And who was the firm that placed Snowden at the NSA?  Why, Booz Allen!  Hmmm....

July 1: I just posted an update, discussing a new case in which Smart Grid News attacked (by implication) PG&E for not reporting the Metcalf attack until a year later, without bothering to ascertain whether the allegation was true or not.  I certainly hope I don't have to post more of these, but I will if necessary.

All opinions expressed herein are mine, not necessarily those of Honeywell International, Inc.

[i] I did submit a comment on the article, but it was never posted.

[ii] Also the banking industry.