Tom Alrich's Blog: June 2014

Sunday, June 29, 2014

Chasing our Own Tails

This post is actually an extended footnote to the previous post on the CIP-002-5 RSAW. But it’s lengthy enough – and relevant in its own right – that I decided to make it a separate post. It has to do with the discussion at the end of what I call “Issue 4” – i.e. the big issue with this RSAW that I think could and will lead to the final collapse of CIP v5, if nothing is done to change this.[i]

In the section on Issue 4, I brought up the first note in the “Notes to Auditor” section at the end of the RSAW:

Results-based Requirement: The auditor should note that this is a results-based Requirement. As such, the entity has great latitude in determining how the result is achieved. The auditor should focus on verifying that the result is complete and correct.

I pointed out in the previous post that this sounds wonderful, but it only works if the “result” aimed for in the requirement is clear and unambiguous; this is certainly not the case here.

However, on further reflection I realized there is a basic logical flaw in this statement. You can see that by asking how the auditor will determine if the intended result of CIP-002-5 R1 – the correct identification and classification of BES Cyber Systems – was achieved.

Let’s imagine an easy case: Suppose the auditors were given special glasses that made each actual BES Cyber System appear blue. All they would have to do is put these glasses on, look around the facilities being audited, note the blue systems, and compare these to the list already provided by the entity being audited. Of course, any discrepancies would be instantly identified, and appropriate PVs could be issued.

But there are no special glasses in this case. So how does the auditor determine if a system in front of him/her is a BCS? You would probably say, “Of course, it’s a BCS if it meets the definition of a BCS.” So what is the definition of a BCS? It’s linked to the BES Cyber Asset definition[ii]:

A Cyber Asset that if rendered unavailable, degraded, or misused would, within 15 minutes of its required operation, misoperation, or non-operation, adversely impact one or more Facilities, systems, or equipment, which, if destroyed, degraded, or otherwise rendered unavailable when needed, would affect the reliable operation of the Bulk Electric System.

Is an auditor going to be able to tell if an alleged BCS meets this definition simply by looking at it? Certainly not. He/she is going to ask the entity to show how they arrived at the determination that this is a BES Cyber System. In other words, they will look at the documented process for this determination, meaning they will look for a discussion of how this result was achieved.

Now let’s go back to the RSAW quote above, but substitute in what we have just learned:

Do you see the problem? Even though this is supposedly a “results-based” requirement, the only way to determine whether the result is correct is to look at how it was achieved. And as I said repeatedly in the previous post, the RSAW says nothing about what the process of BCS identification / classification should be – except that it should be one which achieves this “result”. So we’re just chasing our own tail here.

What do I recommend to fix the problem? The same thing I’ve been saying for a year: CIP-002-5 R1 needs to be “reworded”. It’s no longer possible to change the actual wording, but there needs to be some interpretation, probably from NERC, of what the requirement means. Unfortunately, the RSAW isn’t that interpretation.

But no matter what happens, R1 isn’t ever going to be a “results-based” requirement; there will never be a way an auditor can determine if an entity has correctly identified BES Cyber Systems other than by looking at the process they used to identify them. But the process itself needs to be made clear, and at the moment it is anything but clear.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Honeywell.

[i] And this isn’t saying that the RSAW is the cause of this problem, but rather the symptom of it. The problem is the inconsistent and ambiguous wording of CIP-002-5 R1. Some people thought the RSAW might fix the problem, but instead it makes it worse by not even attempting to clarify the ambiguity. Instead, it pretends that this is in some way a virtue and that the auditors can and want to step into the role of interpreter/judge/executioner to make all this better. They can’t do that, and they certainly don’t want to.

[ii] This ignores a complication in that I believe almost all auditors will say that an alternative “definition” for BCS is a system that assists with the performance of a BES Reliability Operating Service – discussed in the Guidance section of CIP-002-5 R1 but not in the requirement itself. This just reinforces my argument, so I won’t pursue it further now, for fear of violating what Little League teams call the “Slaughter Rule”.

Thursday, June 26, 2014

The CIP-002-5 RSAW: Waiting for Godot

I have a confession to make: I often start writing a post without knowing clearly what I’m going to say. This is because I don’t start really analyzing a topic until I write a post on it – like most people, I’m fine with using my initial impression as a guide going forward. You can’t analyze everything in depth, or you’d never be able to get out of bed in the morning. This has led many times to a post that started out by saying one thing but ended up saying something quite different.

And so it is with this post. Last week, NERC released drafts of the RSAWs (Reliability Standard Audit Worksheets) for CIP Version 5. I of course immediately downloaded the CIP-002-5 RSAW and quickly reviewed it. I was eager to see whether and how it might shed some light on the ambiguities and inconsistencies in CIP-002-5 R1 that I’ve written so much about. My initial impression was this: I didn’t expect much from the RSAW, and sure enough it didn’t disappoint me. If you’re looking for guidance with the problems in the language of CIP-002-5 R1, you won’t find it here. Meanwhile, it doesn’t even do a good job of accomplishing its limited purpose: translating the words of R1 and Attachment 1 into a format that auditors can use for audits.[i]

The good news is that, as I started working on this post in earnest, I moved away from this initial impression of mild disappointment. The bad news is that I moved to something much more like deep despair. Not only does the RSAW not do a good job of accomplishing its very limited purpose, but it creates huge holes which – if NERC does nothing to fill them – will result in chaos when it comes time to audit compliance with CIP-002-5 R1. And as I’ve said many times before, R1 is the foundation for all of CIP Version 5 (and v6, for that matter). If the foundation is rotten, the house will fall. I believe the whole v5 house will ultimately fall unless something is done about this.

Before I continue, I need to repeat the warning I’ve used a couple times before. It was never more needed than for this post:

Warning: Reading this post may cause unintended side effects in NERC compliance professionals, including loss of sleep, depression, excessive consumption of alcoholic beverages, and thoughts of suicide.

Now to the argument. In a post in March, I tried to look at the different possibilities for help with the problems with CIP-002-5 R1, and concluded that the RSAWs weren’t likely to provide much assistance (in spite of the fact that NERC staff members had hinted at the March CIPC meeting that they might). I said this because I know the RSAWs are intended to provide guidance to auditors based on the literal wording of the standard, and the NERC lawyers will never let them do anything more than that. Anyone who believes (or implies) that the RSAWs can provide interpretation of the requirements is deluding themselves, others, or both.

But there’s a bigger problem, even if you accept the idea that the best an RSAW can do is simply tell an auditor how to audit based on the literal wording of the standard. The wording of CIP-002-5 R1 has many problems and – as I’ve said multiple times before – there can be no consistent interpretation of R1 that doesn’t contradict at least some of that wording. So just telling auditors how to audit based on the literal wording (if this is even possible) doesn’t in fact give them a consistent, clear methodology for conducting the audit. And it also doesn’t give the NERC entities a consistent, clear methodology for complying with R1 in a manner that will keep them free of violations. The inevitable result of this, unhappily, will be a HUGE increase in the use of auditor discretion in audits – and supposedly one of the goals of v5 was to decrease or even eliminate this problem. The auditors will have to use some methodology, and this RSAW doesn’t provide it. More on this below.

Also, I’m not at all sure this RSAW even does a good job of reproducing the wording of the standard. As I said, there are lots of ambiguities and contradictions in that wording, and you have to do some sort of interpretation even in order to just translate the wording in a consistent manner (since it isn’t consistent in the first place). I’m sure the people writing the RSAW have tried their best, but some of what they’ve come up with is as contradictory as the requirement itself.

I will first discuss three specific issues, in the order in which I encountered them in the RSAW. Note that these were all discussed in mind-numbing detail in my recent 7,000-word post on the problems in CIP-002-5 R1. Finally I will conclude with a discussion of a huge overall issue that pretty much trivializes everything else.

Issue 1: Assets/Facilities

First, there’s the issue of what the criteria in Attachment 1 apply to. I had originally thought that the six asset types listed in R1 were what you applied the criteria to, in order to identify High, Medium and Low impacts; in fact, I know there are to this day many in NERC and the Regional Entities that believe this. However, an Interested Party convinced me earlier this year that the six asset types are simply the locations at which BES Cyber Assets/Systems can be found, and the Attachment 1 criteria really refer to other things. Indeed, when you go through the criteria, you will see that they apply to a lot of things that aren’t included in the list of six asset types.

Most importantly, criteria 2.3 – 2.8 apply to something called “Facilities”, which in a substation means each line, each transformer, each bus, etc. If a Transmission entity assumes that Facilities is just another word for the six types of assets in R1, they may well end up over-identifying BES Cyber Systems. This is because there could be both Medium and Low impact Facilities at a substation, but if you assume that “Facility” means the substation itself, this would indicate that all of the BCS are Medium as well[ii] – and this could well not be the case.
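To make the over-identification concrete, here is a minimal sketch in Python. The substation, its lines, the relays and their ratings are all invented for illustration, as are the function names; none of this comes from the standard or the RSAW. It simply compares rating each BCS by the Facility it is tied to against rating every BCS by the substation as a whole.

```python
# Hypothetical data: one substation with two lines (Facilities) and one relay BCS per line.
# The ratings are made up for illustration; they are not derived from Attachment 1.
RANK = {"Low": 0, "Medium": 1, "High": 2}

facility_rating = {"Line A": "Medium", "Line B": "Low"}
bcs_facility = {"Relay A": "Line A", "Relay B": "Line B"}


def rate_by_facility(bcs_facility, facility_rating):
    """Each BCS takes the rating of the Facility (line) it is associated with."""
    return {bcs: facility_rating[fac] for bcs, fac in bcs_facility.items()}


def rate_by_asset(bcs_facility, facility_rating):
    """Read 'Facility' as the whole substation: every BCS takes the highest rating found there."""
    top = max(facility_rating.values(), key=RANK.get)
    return {bcs: top for bcs in bcs_facility}


per_facility = rate_by_facility(bcs_facility, facility_rating)
per_asset = rate_by_asset(bcs_facility, facility_rating)
print(per_facility)
print(per_asset)
```

In this made-up case the second reading rates Relay B as Medium even though its line is Low, which is the over-identification described above.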

However, I know at least some of the Regional Entities are saying that “Facility” means the substation itself in criteria 2.4 – 2.8. And since the substation is clearly Medium impact in those criteria, all of the BCS at the substation have to be classified Medium as well.[iii]

How does the RSAW deal with this issue? It doesn’t. Consider this wording under Evidence Set 1 on page 5:

2. A list of all high impact BES Cyber Systems identified, and the asset(s) with which the BES Cyber System is associated.

3. A list of all medium impact BES Cyber Systems identified, and the asset(s) with which the BES Cyber System is associated.

These clearly imply that BES Cyber Systems are only classified by their association with assets. Of course, “assets” isn’t a NERC defined term, but in this context I believe it means a member of one of the six types of assets listed in R1. And there is definitely no mention of Facilities. The upshot of this is that a Transmission entity won’t be able to have both Medium and Low BCS at a Medium substation; all BCS will be Medium – unless the RSAW is changed, of course.[iv]

Issue 2: “At” / ”Associated with” / Whatever

I think I could write a small book by now on the question of the use of “at” vs. “associated with” in Attachment 1. I did write a whole post on this in February. Unfortunately, there is a lot of confusion on this issue in the RSAW.

The first example is item 2 under Evidence Set 1, which I quoted above. Here, “associated” is used to refer to High BCS, when the wording in Attachment 1 Section 1 is “used by and located at”. A BCS is classified as High because it is used by and located at a High control center, not “associated with” it. If the SDT had used “associated”, this would have meant that every RTU in a substation controlled by the control center could be a High impact BCS – and entities would have to spend a fortune protecting substations as High impact assets.

Fortunately, the writer(s) of the RSAW weren’t consistent in this regard, since line 1a under Evidence Set 2 (page 5) says:

a. A list of high impact BES Cyber Systems, for which this entity has full or partial compliance responsibility, used by and located at the asset

Here, they got it right. But don’t celebrate yet, since near the top of page 7 we find this line:

Verify that the high and medium impact BES Cyber Assets associated with the sampled asset have been correctly identified.

Now they’re back to using “associated with” for both High and Medium BES Cyber Assets (and this should read “BES Cyber Systems”, of course, not “BES Cyber Assets”. The BCAs themselves aren’t rated, which I would hope the NERC staff understands).

There’s now a new issue related to the “at / associated with” question. That is what I am now officially dubbing the “Whitney Doctrine”. This doctrine states that “Physical location IS a determinant factor for impact classification.” It has the effect of solving the “transfer-trip” relay question in a way that made a lot of Transmission entities quite happy, as described in my previous post. It has no basis in the wording of CIP-002-5 that I can see, but as I said in that post, I’m perfectly fine with this – somebody has to push the words of R1 around so that there is a consistent interpretation.[v]

However, it does mean that “associated with” now has no meaning since, under the Whitney Doctrine, both High and Medium BES Cyber Systems need to be located at the asset (vs. just High BCS in the dark days before the Doctrine was promulgated, as described in my last post). And it also probably means that you won’t be able to rate a BCS at a criterion 2.5 substation according to the line (i.e. Facility) it’s associated with. You can’t say that a relay is “at” a line, but you can say it’s "at" a substation. So the relay will have to take the classification of the substation (Medium in this case).

However, maybe Tobias will figure out a way to make the Whitney Doctrine compatible with the Facilities principle, thus saving his “transfer-trip” ruling (and possibly his life, since the TO/TOPs wouldn’t be pleased to hear he was going back on what he said at the CIPC meeting, just for the sake of consistency). He’s a smart guy.[vi] But this does show the whack-a-mole effort that is required when trying to fix problems in the Attachment 1 criteria; you solve one problem (to actual applause, in the case of Tobias) and another pops up somewhere else because of your “solution”.

Issue 3: “The Phantom Low Impact BES Cyber System” and other Tales of Mystery and Horror

I have several times discussed the fact that there is an inherent contradiction in CIP-002-5 between Attachment 1 and Requirement 1. The former assumes you start out with an inventory of all your BES Cyber Systems; you subtract out the Highs and Mediums, and voila the rest are Lows. Meanwhile, R1 makes clear you don’t have to inventory cyber assets at Lows. So how would you ever get the comprehensive list that Attachment 1 clearly implies you need?
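The contradiction can be sketched in a few lines of Python. The inventory and the names below are hypothetical; the only point is that "the rest are Lows" is a set subtraction, and a subtraction needs the full set to subtract from.

```python
from typing import Optional, Set


def low_impact_bcs(all_bcs: Optional[Set[str]], high: Set[str], medium: Set[str]) -> Set[str]:
    """Attachment 1 as described above: Lows are whatever is left after removing Highs and Mediums."""
    if all_bcs is None:
        # R1 does not require an inventory at Lows, so this input may simply not exist.
        raise ValueError("no complete BCS inventory to subtract from")
    return all_bcs - high - medium


# Hypothetical inventory, invented for illustration.
inventory = {"EMS", "RTU-1", "Relay-7", "Relay-8"}
lows = low_impact_bcs(inventory, high={"EMS"}, medium={"RTU-1"})
print(sorted(lows))
```

With a full inventory the subtraction is trivial. Pass `None` for the inventory, which is the position R1 puts an entity in, and there is nothing to compute.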

One result of this contradiction is that there is a big asymmetry between R1.1 and 1.2 on the one hand, and R1.3 on the other:

1.1. Identify each of the high impact BES Cyber Systems according to Attachment 1, Section 1, if any, at each asset;

1.2. Identify each of the medium impact BES Cyber Systems according to Attachment 1, Section 2, if any, at each asset; and

1.3. Identify each asset that contains a low impact BES Cyber System according to Attachment 1, Section 3, if any (a discrete list of low impact BES Cyber Systems is not required).

1.1 and 1.2 require you to identify BCS, but 1.3 just requires you to identify an asset that “contains” a low BCS. It immediately goes on to tell you that you don’t need a list of those low impact BCS. So how on earth can you know whether an asset contains one of these mythical beasts? Of course, what would make much more sense would be if 1.3 read something like “Identify each Low impact asset”; in fact, every NERC entity I’ve talked to about CIP v5 asset identification says this is exactly how they’re interpreting 1.3.

So why did the SDT twist their words into knots in R1.3? It’s because the entire premise of R1 and Attachment 1 is that assets (i.e. the “big iron”) are never classified; just the BES Cyber Systems (the “little iron”) are. The only way they could say something that means “Low impact asset”, without using those dreaded words, was to use the circumlocution in R1.3. This in spite of the fact that nobody is interpreting the words this way – in fact, nobody could interpret them that way, given the contradiction between the first part of R1.3 and the parenthetical expression at the end of it.

All of this might just be amusing, but the fact that this wording is used in the RSAW (which it is) could potentially cause trouble. I refer now to two lines on page 7:

Verify that the asset has been correctly identified as containing a low impact BES Cyber System.

and

Verify that the sampled asset does not contain a BES Cyber System.

In the first line, the auditor is being asked in effect to make sure the entity has properly identified Low impact assets; in the second line, he/she is being asked to make sure the entity isn’t lying when they say an asset isn’t even a Low[vii]. But because “Low asset” is a verboten term in CIP v5, the RSAW uses the language of R1.3.

But how is the auditor going to verify this? If I were an auditor new to CIP v5 and I saw these two lines, the first thing I would ask the entity is to show me their list of all Low BCS; of course, that doesn’t exist. So how does the auditor verify that the asset contains or doesn’t contain a Low BCS? The answer is that the auditor needs to know that saying an asset contains a Low BCS is the same thing as saying it is a Low asset – wink wink, nudge nudge. So he verifies whether or not these are Low assets by doing what every entity is now doing anyway: assuming the criteria in Attachment 1 refer to assets (or Facilities) and identifying Low assets as those that aren’t High or Medium.[viii] But shh…don’t tell anybody they did this.[ix]
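Here is a rough Python sketch of that unwritten shortcut. The assets, the yes/no criteria results and the field names are all invented for illustration; this is my reading of what entities are doing, not anything stated in R1 or the RSAW.

```python
# Hypothetical assets and criteria results, invented for illustration.
assets = {
    "Control Center X": {"meets_high": True, "meets_medium": False, "has_control_cyber_assets": True},
    "Substation Y": {"meets_high": False, "meets_medium": True, "has_control_cyber_assets": True},
    "Substation Z": {"meets_high": False, "meets_medium": False, "has_control_cyber_assets": True},
    "Plant W": {"meets_high": False, "meets_medium": False, "has_control_cyber_assets": False},
}


def rate_asset(info):
    """The shortcut: anything that is not High or Medium is called Low."""
    if info["meets_high"]:
        return "High"
    if info["meets_medium"]:
        return "Medium"
    return "Low"


ratings = {name: rate_asset(info) for name, info in assets.items()}
print(ratings)
```

Note that the shortcut never looks at `has_control_cyber_assets`, so the made-up Plant W comes out Low even though it has no control cyber assets at all.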

I hope I don’t have to tell you that requiring knowledge that is nowhere stated in the requirements isn’t a wonderful way to have auditors audit a standard with million-dollar-a-day penalties.

Issue 4: The BIG One

I really shouldn’t call this just one of four issues. That’s like saying there were a number of issues on the Titanic the night of April 14, 1912, including a shortage of party hats and also an iceberg they’d just hit. And like that iceberg, this issue could literally sink CIP Version 5, and probably will if nothing is done to address it.

Unfortunately, this isn’t just a problem found in the wording in one or two places in the RSAW; it’s found just about everywhere. Let’s start with Evidence Set 1 on pages 5 and 6. There are three phrases that contain the words “the process required by R1”, for example

Evidence that the process required by R1 was implemented to determine the list of high and medium impact BES Cyber Systems.

What’s wrong with these words? Tell me, what is “the process required by R1”? You might say, “That’s simple, it’s the methodology you use to comply with Requirement 1”. I say, “Great, now tell me what that methodology is.” Then what do you say? I know that I certainly couldn’t tell you that methodology. I recently wrote a post where I started out thinking I was going to finally nail down this methodology. I ended up breaking it off unfinished, and finally came back to admit that I couldn’t finish it. This is because I have come to the conclusion that there is no consistent, complete methodology that can be stated for compliance with R1, which will at the same time not do some violence to the wording of either R1 or Attachment 1.[x]

So why does the RSAW blithely imply that “the process required by R1” is something that any child could recite? The whole point of the RSAW is to tell the auditors how to audit the requirements, not to leave the hardest part as an exercise for the reader. It’s as if you read a cake recipe for the first time. It first carefully lists all the ingredients – that’s good. Then it says, “Now bake the cake according to the required process.” Wouldn’t you expect the recipe to tell you what that process was?

Let’s go on to page 6. There we read several phrases with the words “can be reasonably expected”, as in

Verify that the process can be reasonably expected to identify all high and medium impact BES Cyber Systems at each asset.

This is really great. The most important parts of R1 (classifying BCS in this case) are being judged not on the basis of whether they were properly followed by the entity, but on whether they’re “reasonable”. And whose reason is going to decide this? Why, the auditor’s! In the end, the way you will be judged on how well you classified BCS (as well as Low impact assets) is by whether your auditor thinks what you did was reasonable.

Does anyone else think this just might be a tiny little problem? Like perhaps the sound of that iceberg hitting in 1912? I have nothing against the auditors – they’re probably much more “reasonable” than the rest of us. But I thought one of the cardinal principles for CIP v5 was that it was going to eliminate the need for auditors to use their discretion – and now we’re not only not eliminating that, we’re requiring them to do so! I know a number of auditors, and the last thing they want to do is have to use their own reason to bridge a wording gap in the requirements. They sometimes do have to do this, but for a fairly well-defined issue like what “routable” means. Here, they’re being asked to each read the great works of Western philosophy and regulatory compliance, so that they can each come to their own understanding of what CIP-002-5 R1 means.

Let’s go on. In this same section, we now find multiple uses of the words “correct” or “correctly”, for example

1. Verify that the high and medium impact BES Cyber Assets associated with the sampled asset have been correctly identified.

2. Verify that the impact rating of each identified BES Cyber System is correct.

And what exactly is the “correct” way to identify BCAs, or the “correct” way to assign impact ratings to BCS? Of course, there’s no clue provided. The auditors are clearly required to use their “reason”. Wonderful.

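Finally, we come to the “Notes to Auditor” section at the end and read this blockbuster:

Results-based Requirement: The auditor should note that this is a results-based Requirement. As such, the entity has great latitude in determining how the result is achieved. The auditor should focus on verifying that the result is complete and correct.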

This makes sense on the surface. After all, wasn’t that one of the big problems with the previous CIP versions? The requirements dictated how you should do something rather than just saying what you should do and letting you figure out how to do it?

This would make sense here, too, if there were some clear definition of the results that need to be achieved. But what are those results? They're things like the “correct” classification of BCS, a “reasonable” methodology for complying with R1, etc. – and these are left totally undefined.

Here’s an analogy. Let’s say you were being audited on your navigation abilities. The auditor might say, “Drive to City Hall.” It would make sense that she wouldn’t care how you got to City Hall, just that you found the place. However, now suppose she is told to audit your navigation abilities, but not tell you where to go. She is just to tell you to go “somewhere reasonable”, and make sure you did a reasonably good job of getting “somewhere reasonable”. So you drive wherever you want, and as long as she’s sure you got somewhere, she has to pass you. Obviously, this would make the entire auditing process meaningless, since there would be no objective standard for deciding you had gotten “somewhere reasonable”.

(June 29: I have written a long footnote to this discussion that I decided to make a new post. You can find it here).

And friends, that is exactly the problem here. The RSAW spends most of its time addressing the simple wording problems of CIP-002-5 R1 (and doesn’t even do a good job of that, as shown in Issues 1-3 above), and hides the really thorny problems behind words like “reasonable” and “correct”. The result is a requirement that can’t be objectively audited at all. The only two ways an auditor can audit R1 are a) Simply give everybody a pass, or b) Decide on their own what R1 means, and apply that meaning mercilessly in their audits.

To be honest, I don’t think there’s too much danger of b); I doubt there’s a single auditor who wouldn’t commit suicide or change careers if he had to be continually telling people that they were getting PVs because his “reason” told him they should get them. Instead, what will happen is that nobody will ever be judged wrong on how they identify and classify their BES Cyber Systems. This is exactly like the RBAM situation in CIP v1-3, where there was virtually no way an entity could have been found to have identified its Critical Assets incorrectly – except perhaps by turning in a game of Hangman and saying that was the RBAM.[xi]

And as I’ve said previously, if CIP-002-5 R1 is rendered unauditable by a lack of objective criteria to guide an audit, then the rest of the v5 requirements become unauditable as well. I’m more convinced than ever that, if nothing is done about this and an entity receives a PV for R1 and challenges it in court (which they can do – CIP is regulatory law), the judge will take 15 minutes to read the requirement, exclaim “What is this ___ (stuff)?”, and throw CIP-002-5 out. That will of course also invalidate CIP-003-5 through CIP-011-1. This might actually be a good thing were it to happen say in the next six months. Unfortunately, there’s no way it can happen before about five years from now, at which point there will be a huge investment in complying with CIP v5. For that to all be put into question…well, it won’t be pretty, that’s for sure.

And now, Our Conclusion

I have been writing about this fundamental problem with CIP Version 5 for over a year; others have written and spoken about this as well. But since human beings are averse to contemplating great upheavals, people have been waiting for Godot to come and fix everything. I myself hoped FERC would do that in Order 791, and later hoped (against hope) that the new SDT would decide to address the problem. More recently, people have been looking to the RSAWs as their hope; and I’m sure there are many who think the upcoming Transition Study Lessons Learned cases will be the answer.

Folks, I’m telling you: Godot ain’t coming. This problem has to be solved by the NERC community. The best solution is for NERC to step in and do something. The second best is for FERC to demand NERC do something (or do it themselves, although that would be a very radical step that I don’t think anyone wants to see, since it would set a very bad precedent). I also used to think that the regions could get together and fix this, but I no longer see that as possible or likely. Who knows, maybe even an industry group like the EEI or the Transmission Forum could pull this off.

But I do know this is a disaster waiting to happen. We’re heading full speed toward the cliff, smiling all the way over.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Honeywell.

[i] I will point out that this failure to even accomplish the limited purpose of the RSAW isn’t ultimately the fault of the people who wrote it, but of the wording of R1 and Attachment 1. The many inconsistencies and contradictions in their wording make it literally impossible to craft an RSAW that accomplishes even its minimal purpose.

[ii] An exception to this would be BCS associated with an SPS that might be located in the substation. If the SPS is Low and the BCS is only a BCS because of its association with the SPS (not with the substation itself), then the BCS would also be Low impact.

[iii] This post discusses this point in more detail.

[iv] I should point out that this is actually an interpretation of CIP-002-5 R1. It seems the decision has been made by NERC that, no matter how the Attachment 1 criteria read, the only things they can apply to are the six asset types. So “Facilities” means “assets”, and this means the different lines in a substation (and their BCS) meeting criterion 2.5 are all Medium impact. And it means that criterion 2.3 now applies to an entire plant, not just to a single generating unit (a “Facility”). So if a single unit at a plant has been designated “Reliability Must Run”, the entire plant will have to be Medium impact, not just the unit. There are other consequences of this as well. The upshot: I can’t fault NERC for making an interpretation in the RSAW, since I’ve been urging them to do this. But I think a number of entities will be unhappy with what this interpretation means for their compliance costs.

And I really don’t think this was a deliberate interpretation, either. It is interesting that Tobias Whitney’s presentation at the NERC CIPC meeting two weeks ago quite clearly states that BCS at substations meeting criterion 2.5 are classified according to the Facility (i.e. line) they’re associated with (see slides 29 and 30). I think this is the correct interpretation, but somehow this insight wasn’t passed on to the person writing the RSAW, even though they probably work for Tobias.

[v] One of my favorite quotes from Lewis Carroll’s Through the Looking Glass is this one (I know, I’ve already used it in a previous post. Hey, I’m all about sustainability):

"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean- neither more nor less."

"The question is," said Alice, "whether you can make words mean so many different things."

"The question is," said Humpty Dumpty, "which is to be master, you or the words. That's all."

[vi] I wish to point out another anomaly in the use of “associated with” in the RSAW. This seems to be just the result of someone’s laziness, but it needs to be corrected. In the discussion of Evidence Set 2 on page 5, item 1c says

c. A list of all BES Cyber Assets associated with each high or medium impact BES Cyber System identified in a. or b. above.

Can you see what’s wrong with this? BES Cyber Assets are what make up a BES Cyber System; they certainly aren’t “associated with” the BCS in the sense that Medium BCS are associated with assets or Facilities that meet one of the criteria in Section 2 of Attachment 1. Fortunately, the writer found the right term on page 7, where we find this:

Verify that the BES Cyber Assets (and Cyber Assets, if any) comprising the BES Cyber System are identified.

So “associated with” needs to be replaced with “comprising” in the previous quote, and I’ll be happy – about this one point, anyway.

[vii] An example of such an asset is a generating station or Transmission substation that doesn’t contain any cyber assets (in a control capacity) at all. It clearly won’t contain any BCS, either.

[viii] Unfortunately, even this procedure doesn’t work to identify assets that aren’t High, Medium or Low, such as those that contain no control cyber assets at all.

[ix] I was told that at least two auditors say they will require a list of Low BCS, precisely so they can make sure the entity correctly identified assets “that contain a Low impact BES Cyber System.” At this point, the problem has moved beyond being simply amusing.

[x] Actually, I knew that when I started to write the post in question. But I did think I could write a consistent methodology that would conform to what the wording of R1 would be, if the SDT had taken more time to get it right. I don’t even believe that anymore.

[xi] I realize there were some entities that did get assessed for having an incorrect RBAM in CIP v1-3 (I believe these mostly happened in the later years). And I also realize that most NERC entities are going to do the right thing for BCS identification, even if they are given a free pass on CIP-002-5. But the big problem is that, if everybody gets a pass on CIP-002-5 R1, it undermines the legitimacy of the rest of CIP v5. How can you really be penalized for violations of let's say CIP-007-5, when everybody knows that you could have simply said you didn't have BES Cyber Systems? If CIP-002-5 R1 is unauditable, then that spreads to the rest of the standards as well. As I said, R1 is the foundation of the CIP v5 "house". If the foundation is rotten, the house falls.

Thursday, June 19, 2014

Tobias Scores!

I missed the NERC CIPC meeting in Orlando last week, but I’ve heard from a couple of parties about the hearty ovation Tobias Whitney of NERC received when he announced NERC’s “ruling” (or something like a ruling) on the “transfer-trip” relay question, which I have been referring to as the “far-end” relay question. Essentially, this is the question of whether a relay on a line connected to a Medium impact substation itself becomes a Medium BES Cyber System due to that fact.

What did Tobias say about this?....Envelope please…..It is NOT a Medium BCS[i]! I’m told that great joy broke out in the room, stock values of TO/TOP’s immediately doubled, and Tobias was crowned with laurels and given the key to the city.

I certainly supported this decision (which was anticipated), but my concern was not so much whether he would say this but what he would point to as the reason for the decision. This is because, while the relay question (really, an interpretation of Criterion 2.5 in Attachment 1 of CIP-002-5.1) is a weighty one for Transmission entities, it is actually fairly small and well-defined compared to the much more weighty issues with CIP-002-5 R1 that I have been complaining about for over a year, and recently tried to summarize in this brief 7,000-word post.

I have said for a while that there are no longer any “legitimate” means for addressing these interpretation problems (at least not before the compliance date for v5), so somebody needs to bend the rules and basically give an interpretation, no matter that there is no solid legal basis for doing so. I suggested a number of parties that could do this, although I didn’t mention my favorite – Judge Judy. However, if she can’t do it I’m quite glad to have NERC do so.

But what NERC can’t do is pretend that any interpretations they give on CIP-002-5 R1 (and Attachment 1) are based strictly on the wording – because, as I’ve said many times, there can be no consistent interpretation of that requirement. Something has to give in the wording, if you’re going to resolve the many problems with the requirement.

I first considered the transfer-trip relay question as one more example of these insoluble wording problems. But in the middle of the night last week, an Interested Party came to me in a dream and pointed out that there was in fact a consistent interpretation of Criterion 2.5 that would remove the idea of the far-end relay automatically becoming a BES Cyber System. Did this fill me with great joy? Far from it! I was now filled with dread that NERC would announce their decision on the relay question was based on this interpretation, and it would set a completely wrong precedent - by implicitly stating that any more interpretations would have to be based on the strict wording of R1. And that is a fool’s errand if there ever was one.

As I had said in the previous post, “I will feel a lot better about this upcoming ruling from NERC if it doesn’t try to justify itself as being a valid interpretation of the wording, but instead is simply imposed as an act of Divine fiat.” And I’m very happy to say that Tobias did exactly that: he didn’t even try to justify his ruling based on the wording. Rather, he said something to the effect of, “Physical location IS a determinant factor for impact classification.”

Now, this may seem to you to be some sort of strict interpretation of the wording, but I defy you to tell me where it says this in R1 or Attachment 1. On the contrary, it seems very clear to me that Attachment 1 tells us that any BES Cyber System that is associated with an asset/Facility in one of the Section 2 criteria is a Medium; it doesn’t matter whether it’s located at the asset in question or not (the Interested Party’s argument overrode this interpretation, but it only applies narrowly to criterion 2.5).

Essentially, by stating that physical location is the determining factor in this case, Tobias is saying that all Medium BCS must be located at the Medium impact asset with which they’re associated, period - no matter what the wording may seem to say.[ii] This is exactly the kind of attitude NERC needs to take as they address the much deeper issues in CIP-002-5 R1.[iii] Keep up the good work!

Note: There was a new development on this issue in August, as described in this post.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Honeywell.

[i] Unless it’s also routably connected to the Medium substation.

[ii] Of course, High BCS already have to be located at the High asset (control center), since the wording for them is “used by and located at” not “associated with”.

[iii] NERC released drafts of the v5 RSAWs this week. I will discuss the CIP-002-5 RSAW in a post in the near future. I will say that it seems pretty decent to me, and it even provides some interpretation that I agree with. It certainly doesn’t address all of the CIP-002-5 problems in one fell swoop, but I never expected it to.

Wednesday, June 11, 2014

Medium Impact Control Centers

My Interested Party friend has weighed in with some helpful information on a point I’d noted in my recent post on field assets controlled by High control centers. In that post, I stated that I believed that field devices (like RTU’s) under the control of High impact control centers wouldn't themselves become Highs, due to the words “used by and located at” in Section 1 of Attachment 1.

However, I also pointed out in an end note that this is probably different for Medium impact control centers, since the corresponding wording in Section 2 of Attachment 1 is “associated with”, meaning a BES Cyber System doesn't have to be physically located at the control center in order for it to be a Medium BCS.

However, the IP sent me an email pointing out that the Guidelines and Technical Basis discussion for CIP-002-5 does include language showing it was the SDT’s intent that Medium control centers be treated the same as Highs in this regard: BCS do have to be located at the control center in order for them to become Medium BCS. Here is what he says:

From the Interested Party

Even Medium impacting control center BCS do not extend beyond the confines of the control center. Yes, it is not as crystal clear in the language of Criteria 2.11, 2.12, and 2.13 because of the “associated with” language at the beginning of Section 2. However, the reader can rely upon Guidelines and Technical Basis to provide sufficient guidance that makes the expectations clear. The guidance explicitly states (emphasis is mine):

Criterion 2.11 categorizes as medium impact BES Cyber Systems used by and at Control Centers that perform the functional obligations of the Generator Operator for an aggregate generation of 1500 MW or higher in a single interconnection, and that have not already been included in Part 1.

Criterion 2.12 categorizes as medium impact those BES Cyber Systems used by and at Control Centers and associated data centers performing the functional obligations of a Transmission Operator and that have not already been categorized as high impact.

Criterion 2.13 categorizes as medium impact those BA Control Centers that “control” 1500 MW of generation or more in a single interconnection and that have not already been included in Part 1. The 1500 MW threshold is consistent with the impact level and rationale specified for Criterion 2.1.

This is consistent with the explicit expectation of “used by and at” for High impact BES Cyber Systems, which only apply to control center BCS. It is worth noting that the guidance for Criterion 2.13 fails to include the language “those BES Cyber Systems used by and at”. I believe this is an oversight by the drafting team that was not caught in review. I believe it is safe to assert the missing language because the guidance otherwise asserts the control center itself to be Medium impacting and that is inconsistent with the rest of the Criteria and the direction of the Standard overall.

There is no expectation that Medium impacting BCS at a control center will automatically convey Medium impact to the BCS at every substation the control center systems communicate with. Only if the entity has defined a super ESP that encompasses the control centers and the substations in one perimeter will the issue of Protected Cyber Asset come up that would result in treatment of the substation BCS as medium impacting.

Tuesday, June 10, 2014

An Interested Party Weighs In

It seems the Interested Party I referred to in the previous post is still interested, because he wrote me a lengthy email outlining his position on the “far-end” relay question. And guess what? He offers an interpretation of criterion 2.5 which has convinced me that 2.5 really does intentionally confine BCS “associated with” the Facilities referred to in that criterion to the substation itself – meaning that relays at the “far end” of the lines in question aren’t automatically Medium impact, unless they otherwise meet one of criteria 2.4 – 2.8.

Here is my summary of his argument, although I’ll reproduce his words below (with slight editing to protect the guilty):

He agrees generally with how I interpret 2.5, but he points out something I missed: the words “at a single station or substation”. This phrase is a modifier for “Transmission Facilities” (and he very helpfully points out that it is good to dust off your sentence-diagramming skills from eighth grade, although I’m pretty sure I learned that in fourth grade. Perhaps I was in an – ahem! – higher performing class than he was), so it means the Facility has to reside within the substation.
I have all along been assuming that Facility mainly refers to entire lines in this criterion, but since an entire line isn't "within a substation", the "Facility" in question must be something else (of course, since “at a single station or substation” isn’t in the other substation criteria 2.4, 2.6, 2.7 and 2.8, those criteria can apply to entire lines, as well as to other Facilities). In other words, the relay within the Medium substation is Medium because it supports a Medium Facility, while the relay within the Low substation is Low because the Facility that it supports is Low impact.
So what are the Facilities that the two relays support? The answer is the circuit breaker. The relay is (usually) a BES Cyber System affecting the breaker, not the line. And since the breaker is only in one substation, not two (as is a line), there can be no question of relays at neighboring substations becoming Medium impact.

The nice thing about this interpretation (as opposed to the one from a regional auditor that I discussed in the previous post) is that it doesn’t do violence to the term “Facilities” in the Medium bright-line criteria; it doesn’t re-interpret this term to mean something like “assets”. So I take back what I said in the previous post about preferring that NERC arbitrarily rule that “far-end” relays aren’t Mediums; it seems there can be a direct interpretation of this wording that accomplishes that, without debilitating side effects.[i]

And another good point about the discussion below is that this person really knows about substations and provides some very good advice on classifying transformers and other topics. I recommend that any transmission entity read this (and also anyone else who doesn’t have a life outside of NERC CIP).

The Interested Party’s Tale (apologies to Chaucer)

First, here is the exact language of the Criterion, straight from the NERC website:

Transmission Facilities that are operating between 200 kV and 499 kV at a single station or substation, where the station or substation is connected at 200 kV or higher voltages to three or more other Transmission stations or substations and has an "aggregate weighted value" exceeding 3000...

Now, let’s all climb into Mr. Peabody’s Way Back Machine and revisit our sentence diagramming lessons from 8th grade. Please pay attention to the strategically placed comma which, incidentally, was not in the earlier versions of the standard. The comma separates two cogent thoughts and in this case provides crystal clear meaning. OK, so now let’s break down the statement into three segments for clarity:

(1) Transmission Facilities that are operating between 200 kV and 499 kV

(2) at a single station or substation,

(3) where the station or substation is connected at 200 kV or higher voltages to three or more other Transmission stations or substations and has an "aggregate weighted value" exceeding 3000...

As I have consistently asserted in the past, the asset itself is not categorized, only the BES Cyber Systems located at that asset.[ii] The section (segment 3) that states “where the station or substation is connected at 200 kV or higher voltages to three or more other Transmission stations or substations and has an ‘aggregate weighted value’ exceeding 3000” defines the characteristics of the asset that this Criterion is applicable to. In other words, the Criterion is only applied to BES Cyber Systems located at a Transmission station or substation connected at 200 kV or higher to three or more other Transmission stations or substations. Unless that qualification is satisfied, the Criterion is not applicable at all. And, if the Transmission station or substation is connected at 200 kV or higher to three or more other stations or substations, the Criterion still does not apply unless the aggregate weighted value exceeds 3000 per the referenced table. The table awards 700 points per Transmission Line operated between 200 kV and 299 kV, and 1300 points for lines operated between 300 kV and 499 kV. Interestingly, you get no points for lines connected at 500 kV or above, but that really makes sense because that condition is covered unconditionally by Criterion 2.4. And you get nothing for Transmission Lines operated below 200 kV. But, I digress. The other thing you must understand is that if you have two or more parallel lines connecting substation A to substation B, that only counts as one connection for the “connected at 200 kV or higher voltages to three or more other Transmission stations or substations” provision of the Criterion, but each line contributes its own value when calculating the “aggregate weighted value.”[iii]
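To make the arithmetic concrete, here is a minimal Python sketch of the two station-level tests just described: counting connections and summing the weighted value. The class and function names are hypothetical, and treating the voltage bands as 200 <= kV < 300 and 300 <= kV < 500 is an assumption about how the boundaries between the table's rows are handled. It illustrates the reasoning above and is not an official calculation method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """One Transmission Line leaving the substation (hypothetical model)."""
    far_end: str  # name of the station or substation at the other end
    kv: float     # operating voltage in kV


def line_weight(kv: float) -> int:
    """Points per line, using the values quoted in the text above."""
    if 200 <= kv < 300:   # assumed reading of "between 200 kV and 299 kV"
        return 700
    if 300 <= kv < 500:   # assumed reading of "between 300 kV and 499 kV"
        return 1300
    return 0              # below 200 kV, or 500 kV and above


def station_meets_segment_3(lines: list[Line]) -> bool:
    """Apply the two station-level tests from segment 3 of Criterion 2.5."""
    # Parallel lines to the same far-end station count as ONE connection.
    connections = {ln.far_end for ln in lines if ln.kv >= 200}
    # Every line still contributes its own weight to the aggregate.
    aggregate = sum(line_weight(ln.kv) for ln in lines)
    return len(connections) >= 3 and aggregate > 3000


# Two parallel 230 kV lines to B, one 345 kV line to C, one 230 kV line to D.
example = [Line("B", 230), Line("B", 230), Line("C", 345), Line("D", 230)]
print(station_meets_segment_3(example))
```

In this example the two parallel lines to B give a single connection, so there are three connections in total, while all four lines add to the aggregate (700 + 700 + 1300 + 700). Three heavy parallel lines to a single neighbor would exceed 3000 points but still fail the connection test.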

OK, so now we know which stations or substations the candidate BES Cyber Systems must be located at. Now, let’s go look at the first phrase (segment 1) in the Criterion statement. The candidate BES Cyber Systems must be associated with “Transmission Facilities that are operating between 200 kV and 499 kV.” That is how we can correctly state that any BES Cyber System associated with a Transmission Facility operating outside of the voltage range is not a Medium impacting BES Cyber System per this criterion. That does not mean the BES Cyber System will not be categorized as Medium impacting by another Criterion, but it is not Medium impacting by the application of Criterion 2.5. Transformers are a special case because they operate at two voltages; more on that later.

OK, but we are not done yet. And this is what cements the correct reading of the Criterion as opposed to pulling a view “out of the blue.” Look at the (segment 2) portion of the Criterion, above. The Transmission Facility operating between 200 kV and 499 kV as referenced in the Criterion must be operated at “a single station or substation.” You can also read this as “located at” if you prefer since the Transmission Facility is clearly operated at the place where it is physically located. Why is this important? It is important because this statement limits the application of Criterion 2.5 to only a subset of all possible Transmission Facilities. It includes the transformer and shunt compensator declared in the NERC Glossary of Terms definition of Facility because they are physically located at and operated at a single station or substation. It also includes the circuit breaker that connects one end of a Transmission Line to the Transmission System. The circuit breaker is a Transmission Facility; the list in the definition is an example and is not all inclusive. But, the Transmission Line, while a Transmission Facility per the Glossary definition, is not a Transmission Facility that Criterion 2.5 applies to. The Transmission Line, by its very nature, is operated at more than one station or substation. It has to be connected to at least two stations or substations or it cannot be a line. And Criterion 2.5 clearly says operated at “a single station or substation.”

The relay in the substation control house operates the circuit breaker and is clearly “associated with” the circuit breaker for the purposes of applying the Criterion. The protection schemes running in the relay (and coordinated between the near and far-end relays for certain types of schemes such as pilot relay and transfer-trip) are to protect the Transmission Line, but the relay does that by operating the circuit breaker. The relay is technically not directly associated with the line and that issue is moot anyhow because the line is not operated at a single station or substation.

So, if the relay (the BES Cyber System) association is not applicable to a Transmission Line, then the categorization of the BES Cyber System must be based solely upon the Transmission Facility from the subset of applicable Transmission Facilities (the circuit breaker in this case) that it is associated with. The far-end relay is associated with the circuit breaker located and operated at the single station or substation at the other end of the Transmission Line. And, when you apply Criterion 2.5 to that candidate BES Cyber System, the station/substation qualifications (segment 3) are applied.

If there is an association at all between the two, it is a relay-to-relay association, not a relay-to-Transmission Facility association. And, a relay-to-relay association is not an association that would make the far-end relay Medium impacting by default.
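The chain of reasoning above can be sketched in a few lines of Python. Everything here is a hypothetical model for illustration: the `Facility` class, the `medium_under_2_5` function, and the station names are invented, and the voltage test assumes the range is read as 200 <= kV < 500. The sketch covers Criterion 2.5 only; a BES Cyber System that fails it could still be Medium under another criterion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Facility:
    """A Transmission Facility (hypothetical model)."""
    name: str
    kv: float
    stations: frozenset  # every station or substation where it is operated


def medium_under_2_5(facility: Facility, qualifying_stations: set) -> bool:
    """Would a BCS associated with this Facility be Medium under 2.5?"""
    # Segment 2: the Facility must be operated "at a single station or
    # substation". A Transmission Line spans two, so it never passes.
    if len(facility.stations) != 1:
        return False
    # Segment 1: operating voltage must be in range (boundary is assumed).
    if not (200 <= facility.kv < 500):
        return False
    # Segment 3: that one station must itself meet the connection and
    # aggregate weighted value tests (decided elsewhere, passed in here).
    (station,) = facility.stations
    return station in qualifying_stations


qualifying = {"A"}  # assume only substation A meets segment 3
near_breaker = Facility("breaker at A", 230, frozenset({"A"}))
far_breaker = Facility("breaker at B", 230, frozenset({"B"}))
line = Facility("line A-B", 230, frozenset({"A", "B"}))

for f in (near_breaker, far_breaker, line):
    print(f.name, medium_under_2_5(f, qualifying))
```

The relay at each end takes its result from the breaker it operates, so the far-end relay is judged against substation B's own qualifications, not substation A's.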

Now a word about transformers. The transformer is unique in that it is operated at two voltages. It is, however, operated at a “single station or substation.” If either side of the transformer is operated between 200 kV and 499 kV, then it is a Transmission Facility that meets the qualifications of Criterion 2.5 and any BES Cyber System associated with the transformer, even those operating the side whose voltage is outside of the 200 kV to 499 kV range, is Medium impacting.
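The transformer rule reduces to an "either side" test, sketched here in Python. The function name is hypothetical and the range boundary (200 <= kV < 500) is an assumed reading of "between 200 kV and 499 kV". This checks only the voltage condition; the station-level qualifications of the Criterion still have to be met separately.

```python
def transformer_in_2_5_voltage_range(high_side_kv: float, low_side_kv: float) -> bool:
    """True if either winding operates in the 200-499 kV range."""
    return any(200 <= kv < 500 for kv in (high_side_kv, low_side_kv))


print(transformer_in_2_5_voltage_range(230, 69))   # one side in range
print(transformer_in_2_5_voltage_range(138, 69))   # neither side in range
```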

I wish to thank the Interested Party. A very helpful discussion.

If you would like to know what happened with this controversy, you can find out in this post.

[i] Of course, this solution just applies to criterion 2.5. I don’t think there are such easy solutions to all of the other wording problems in CIP-002-5 R1 and Attachment 1.

[ii] He’s right that he’s said this consistently in the past. And I’ve consistently said that, while his take on this issue is probably closer to the wording of CIP-002-5 R1 and Attachment 1 than mine is, it’s a moot point – since literally every entity I’ve talked to so far has said they first classify assets/Facilities, not BES Cyber Systems; the latter get their rating through the former. In practice, I know he advocates an intermediate step where the entity does in fact look at the asset/Facility and develops a “preliminary” classification for it; this then guides how the BCS at or associated with that asset/Facility will be classified. So there isn’t much difference between what we both say in practice; I just feel his approach adds more verbiage and potential confusion.

[iii] He makes a good point here that I hadn’t realized. However, this discussion also shows you how incredibly complicated the supposed “bright-line” criteria really are.