Monday, February 9, 2015

Making it up as they Go along, Part I: the ISO New England Affair

I have had a number of epiphanies over the course of writing this blog and examining/reporting on CIP version 5.  I had another one last weekend after I had written my post on the WECC CIP User Group (CIPUG) meeting in Anaheim.  It concerned a couple of private discussions of an issue that was never brought up at the meeting itself.

Here’s the issue: In January, I had three entities with generation assets in New England independently approach me with the same question (two earlier in the month, one at the CIPUG itself).  They had all received an email from the ISO New England that said:

“In accordance with Criterion 2.6 of NERC Standard CIP-002-5, ISO New England has determined that Generation Facilities represented by your company have an AVR and/or PSS (if equipped) that is critical to the derivation of IROLs and their associated contingencies, as specified by FAC‐014‐2, Establish and Communicate System Operating Limits, R5.1.1 and R5.1.3.” 

For those of you who (like me) aren’t generation gurus, AVR refers to “automatic voltage regulator” and PSS is “power system stabilizer”.  These are systems that can regulate voltage when it gets out of a certain range required for stability of the grid.  IROLs - “Interconnection Reliability Operating Limits” - are defined in the NERC glossary as

“The value (such as MW, MVar, Amperes, Frequency or Volts) derived from, or a subset of the System Operating Limits, which if exceeded, could expose a widespread area of the Bulk Electric System to instability, uncontrolled separation(s) or cascading outages.”

Evidently ISO NE requires most (or even all) generators to have an AVR system, as well as an RTU to report its status back to ISO NE.  To make a long story short, it seems that knowing the status of these systems is required for ISO NE to derive IROLs.  The email is saying that “Generation Facilities” with AVR fall under Criterion 2.6 of Attachment 1 of CIP-002-5.1 as Medium impact.  The criterion reads:

“Generation at a single plant location or Transmission Facilities at a single station or substation location that are identified by its Reliability Coordinator, Planning Coordinator, or Transmission Planner as critical to the derivation of Interconnection Reliability Operating Limits (IROLs) and their associated contingencies.”

All three entities reached out to me because they wanted to know exactly what ISO NE meant to bring into scope as Medium impact.  Was it the whole plant?  The unit?  Or just the AVR system?  Two employees of one of these entities brought this up to me at the CIPUG, which they were attending because they also have assets in WECC.  After discussing it with them separately at dinner, we all (by chance) ended up discussing it the next day at a break, with a high-ranking NERC staff member who was attending the meeting.

Let me be clear before I go any further that my interest in this discussion isn’t primarily how it turned out, although I will tell you now that all three parties – the entity, myself and the NERC staff member – ended up in agreement on how this email should be interpreted (and a later email from ISO NE reinforced that conclusion).  My interest in this case (perhaps “fear and dread” is a better term) has much more to do with the process by which this “decision” was made.  The epiphany I experienced was the realization that, at least as far as the Attachment 1 criteria go, this process will probably not be the exception but the rule – in other words, between now and 4/1/16 the only way of addressing questions about application of the bright-line criteria (and perhaps in other areas of CIP v5 as well, like CIP-002-5.1 R1 and CIP-005-5 R1) will be through ad hoc “rulings” – not Lessons Learned, FAQs, etc.  This deals a further blow to any small hope I may still have had that CIP-002-5.1 (and perhaps one or more other v5 standards) can ever be a clear-cut, enforceable standard.  It also shows that NERC and the regions are simply making up their approach to CIP v5 implementation as they go along – it is now way too late to deal with these issues in the “proper” way.

Now that I’ve told you the punch line of this post, you’re free to go.  However, if you want to know why I came to this conclusion, you can read on.

I’m sure a lot of entities had questions about the ISO NE email, since the ISO subsequently sent a follow-on email.  Because these two emails were obviously sent to the same large number of entities, I have no ethical qualms about discussing and quoting their contents.  The email described a conversation among representatives of ISO NE, NPCC (which is, of course, the Regional Entity covering New England) and a large integrated utility.  The heart of the email was this sentence:

“NPCC indicated that its expectation is that because AVR/PSS status is the specific component of a generator that is critical to the derivation of IROLs, Generator Operators must protect the generator’s primary means of transmitting AVR/PSS status to ISO-NE under CIP-02-5.1 as a Medium Impact BES Cyber Asset.”

In other words, neither the entire plant nor even one unit is Medium impact under Criterion 2.6; just the AVR/PSS system (and its RTU) is.  At the WECC meeting, I, the entity (not the one referred to in the above email) and the NERC staff member all agreed this was the right approach from the standpoint of protecting the grid, as well as not requiring a lot of unnecessary compliance expense by a large number of generation entities.[i]

So is everybody happy?  It seems so – everybody but me.  Because I insist on asking a silly question: Is this “ruling” actually in compliance with the wording of CIP-002-5.1 R1, Attachment 1, and Criterion 2.6?  The answer to that is “no”.  I do agree with the outcome of our discussion (and the second email from ISO NE), but it is clear to me that this outcome was achieved in a completely ad hoc manner, without any close attention to how the standard is written.  And as I mentioned above, I fear this method of determining questions about the bright-line criteria will become the rule, not the exception.

Let’s go back to the discussion between the entity, me and the NERC staff member.  The entity started out by saying they thought the original ISO NE email (neither I nor the NERC staff member had heard of the second email at this point) meant that only the AVR system and the RTU were Medium impact – and that they were Medium impact BES Cyber Systems.

The NERC staff member’s initial reaction was that either the whole plant or a single unit was Medium impact.  I believe he said that because he subscribes to the mistaken (but widely held) belief that the Attachment 1 criteria refer to assets of the six types listed in R1 (control centers, Transmission substations, etc).  Needless to say, “AVR systems” isn’t one of those six asset types.  Moreover, the email said “ISO New England has determined that Generation Facilities represented by your company have an AVR…”  Whatever “Generation Facilities” means, it can’t be the same as the AVR system; otherwise, the email wouldn’t talk about Generation Facilities having an AVR.  Therefore, the NERC staffer said what he did: the plant (or the unit) is what is Medium impact.

This was met by a horrified reaction from the entity, since if that “ruling” had stood, it might have easily added $1MM or more to their CIP compliance costs (and to those of every other generator that received the email).  At this point, I helpfully piped up to say that, since Criteria 2.3 – 2.8 all refer to “Facilities”, and since AVR might meet the NERC definition of Facility[ii], then the email made sense.  In essence, the email was saying “You have a generation Facility – AVR – that is subject to Criterion 2.6 since it is essential to our derivation of IROLs.”

But how would this lead ISO NE to conclude that the AVR system is a Medium impact BES Cyber Asset?  Presumably because Criterion 2.6 would make it so.  But how does it do that?

Let’s look at the context of all of the Medium impact criteria (i.e. all the 2.X ones).  They are all prefaced with the somewhat mysterious phrase “Each BES Cyber System, not included in Section 1 above, associated with any of the following:”  This phrase is logically preceded by requirement part 1.2, which reads “Identify each of the medium impact BES Cyber Systems according to Attachment 1, Section 2, if any, at each asset.”  In other words, this whole chain means “The entity has to identify and classify BES Cyber Systems that are associated with…” (whatever Criterion 2.6 talks about).  What does 2.6 talk about, anyway?

As I mentioned above, a lot of people – including, unfortunately, many in the NERC regions and in NERC itself[iii] – will tell you that the Attachment 1 criteria apply to BES “assets”[iv] that correspond to one of the six types listed in CIP-002-5.1 R1: control centers, Transmission substations, etc.  This is absolutely not true (I have discussed why this is so in a few posts, including this one under the heading “Have an Apple, Adam?”  I hope to devote an entire post to this fallacy in the future).  The only definitive statement that can be made about what the bright-line criteria refer to is that they refer to their subjects.[v]  A less definitive - but more instructive - statement would be that the criteria refer to a) “assets” (including but not necessarily limited to the six types in R1); b) “Facilities” (a NERC defined term); or c) subjects of some of the Medium criteria that meet the definition of Facilities (like SPS, RAS or load-shedding systems) without using that word (I’m referring most specifically to Criteria 2.9 and 2.10).  Even better, just don’t worry about what the criteria refer to.  They refer to what they refer to, period.  But they definitely refer to a lot more than the six asset types; if you don’t believe me, read them and see for yourself.

To return to our discussion, the “subject” of 2.6 is “Generation at a single plant location or Transmission Facilities...”  Since we’re not talking about Transmission Facilities here, ISO NE must have been considering “Generation at a single plant location” to be the operable part of 2.6 for the purpose of their email notification.  What does this phrase mean?

The word “generation” is used in two other criteria: 2.3, where it appears as “Each generation Facility” and 2.1, where it appears as “Commissioned generation, by each group of generating units at a single plant location…”  Given this, it seems pretty clear that the above phrase in 2.6 means either the whole plant or at least a single unit - i.e. the interpretation the NERC staffer originally gave.  If he was right that the email referred to the whole plant or at least one unit, this would mean that every BES Cyber System associated with the plant or unit would be Medium impact; this of course might be quite a lot.
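To make the stakes of that reading concrete, here is a minimal Python sketch of how the answer to "what is the subject of 2.6?" drives the list of Medium impact BES Cyber Systems.  Everything in it is an assumption made up for illustration: the two-unit plant, the system names, the `medium_bcs` helper, and the simplification that a system is "associated with" a unit only if it is tagged with that unit.  None of it comes from the standard or from the ISO NE emails.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BesCyberSystem:
    name: str
    unit: str      # hypothetical tag: which unit the system serves ("common" = plant-wide)
    is_avr: bool   # hypothetical flag: True for the AVR/PSS system and its RTU


# Made-up inventory for an imaginary two-unit plant.
INVENTORY = [
    BesCyberSystem("Unit 1 DCS", "unit1", False),
    BesCyberSystem("Unit 1 AVR/PSS and RTU", "unit1", True),
    BesCyberSystem("Unit 2 DCS", "unit2", False),
    BesCyberSystem("Unit 2 AVR/PSS and RTU", "unit2", True),
    BesCyberSystem("Plant-wide fuel handling", "common", False),
]


def medium_bcs(inventory, subject):
    """Return the names of the systems that would be Medium impact.

    subject is "plant", "avr", or a unit tag such as "unit1". It stands in
    for whatever one decides the subject of the criterion is.
    """
    if subject == "plant":
        selected = inventory                      # everything at the plant
    elif subject == "avr":
        selected = [s for s in inventory if s.is_avr]
    else:
        selected = [s for s in inventory if s.unit == subject]
    return [s.name for s in selected]


for reading in ("plant", "unit1", "avr"):
    print(reading, "->", medium_bcs(INVENTORY, reading))
```

Under the "plant" reading all five hypothetical systems come into scope; under the "avr" reading only the two AVR/PSS systems do.  The classification step itself is trivial; the whole argument is over which subject gets passed in.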

But since I was convinced (not having the criterion in front of me at the time, but having read it a few times before) that  Criterion 2.6 referred only to “Facilities”, and since I knew that a single system could be a Facility (as in the case of some SPS and RAS systems), I thought that the AVR “system” (meaning the hardware that actually implements its purpose, as well as the cyber assets that control and/or monitor the hardware) was what was Medium impact – i.e. the system was a “generation Facility”.  Of course, my position was supported by the fact that the email referred to “Generation Facilities”[vi].  All three parties agreed this was the right conclusion, and we returned to the NERC meeting, happy[vii] that we had made our own small contribution to the jurisprudence of NERC CIP Version 5.

However, when I actually reread Criterion 2.6 afterwards, I realized I had been wrong.  I had been thinking it read something like “generation and Transmission Facilities” – in other words, “generation Facilities and Transmission Facilities”.  However, it reads “Generation at a single plant location or Transmission Facilities…(my emphasis)”  If the SDT had wanted the first part to mean “Generation Facilities” they would have used those words.[viii]  On the other hand, “Generation at a single plant location” almost certainly means the whole plant or at least a unit (and of course, the AVR system can’t be considered “generation”.  By itself, it doesn’t generate anything, except perhaps compliance uncertainty). 

So the NERC staffer had been right in the first place, and shouldn’t have changed his mind, at least as far as the wording of the criterion is concerned.  However, given that the two employees of the registered entity that had received the email were quite passionate that this was the wrong answer – and given that they were both fairly big guys – it is understandable that he changed his mind (especially when I chimed in with my opinion supporting the two guys.  NERC staff members always defer to what I say without question :) ).

I was going to deliver this bad news to the entity the week after the CIPUG.  That is, I was going to say I had been wrong and the NERC staffer should have stuck with his first interpretation: either the whole plant or at least one unit would be Medium impact.  However, one of the entity employees forwarded me the second email from ISO NE – which said unequivocally that the AVR system (with its RTU) was a Medium impact BCS[ix], and this was the only system that would have to be so designated.  That settled the matter, as far as I was concerned.  This also saved me the embarrassment of having to tell this entity I was wrong (which rarely happens, of course).

But the fact is that I was wrong, and the NERC staff member was right in his original “ruling”: the whole unit is Medium impact if you go by the actual wording of Criterion 2.6.  Yet is this staff member going to raise a stink and insist to ISO NE and to the generation entities that this is the case?  I highly doubt it, not because he’s a coward but because a) ISO NE has issued their “ruling” in the second email; and b) the ISO clearly made the right decision from a purely practical point of view:  What needs to be protected to solve this IROL issue is the AVR system, nothing more.[x] 

At Long Last, the Conclusion
So what is the point of this long and seemingly pointless narrative?  After all, I’ve said I agree that the final “decision” – made in our informal meeting at WECC as well as in the ISO NE email – was a good one.  What’s my beef?

My beef is that this was such an arbitrary process, and one that is totally unsupported by the wording of CIP-002-5.1.  It is clear that none of the Attachment 1 criteria apply to BES Cyber Systems directly; rather they apply to Facilities and assets, as I said above.  Once you’ve identified the asset/Facility as Medium impact (using the criteria), then you find the BCS “associated” with it and classify those as Medium BCS.  For ISO NE to say that the generation entity is required to identify particular Cyber Assets (the AVR system and its RTU) as Medium BES Cyber Assets betrays – frankly – a lack of understanding about perhaps the fundamental component[xi] of the asset identification and classification process in CIP Version 5.[xii]  As for the fact that NPCC – who presumably approved of the second email, since they were one of the three parties at this meeting – also showed this lack of understanding…well, I can only say I don’t ultimately blame them either.  I do blame CIP-002-5.1, because its many inconsistencies and contradictions are at the heart of this problem. And I blame NERC for continuing to forge ahead with implementing this standard as these problems become more and more evident (see "programmable", "adversely impact", "reliability purposes", etc).

Moreover, it was obviously very arbitrary for the NERC staff member (a very prominent member of the CIP team) to first state the correct answer in our conversation in Anaheim, then back off when he got opposition.[xiii]  To be politically correct, I should reprimand him for even trying to give an answer; instead, he should have said he’d take this issue back to the NERC Transition Advisory Group, so they can put it on their list of Lessons Learned they intend to write. 

But I certainly understand why he wouldn’t want to do that, either.  It would be many months – if not longer – before the TAG could get to this issue, and then it would take months to draft the Lessons Learned document, post it for comment, and finalize it.  Given all of this, there will probably only be a few months between when it’s finalized and April 1, 2016; what good does it do anyone to release it then?[xiv]  In fact, except for a few questions for which Lessons Learned will soon be finalized, it’s probably now better for NERC to put off any new Lessons Learned (or other general “rulings”) on CIP-002-5.1 R1 and Attachment 1, and just issue individual rulings on particular issues (even applying to single entities, if that’s needed).

In other words, given that NERC has this deeply flawed standard, and that it is now too late to change it without a huge upheaval[xv], probably the only way it can be implemented – given the many problems with the Attachment 1 criteria – is to do exactly what NERC and the regions seem to be doing, and will do more and more in the near future: make individual ad hoc “rulings” to resolve particular issues.  These rulings may sometimes be for the entities in a particular ISO footprint; they may sometimes be for entities in a certain region; and I’m sure they’ll sometimes be for particular entities all by themselves.  This is going to happen more and more, and is already starting to happen.  Of course, I’ll document those instances when I find them; I’m sure I won’t have to look too hard (I already have one other example that will appear in one of the next posts in this series).

Of course, it’s kind of hard to call CIP a “standard” when its interpretation is done in an ad hoc manner.  Maybe NERC can figure out another name.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Honeywell.

[i] I have no idea how applicable this “ruling” will be in other NERC regions or other ISO footprints.  Such a discussion is above my pay grade.

[ii] Whether it does or not would be an interesting question, since I think a Facility would have to be operated at high voltage and have terminals on it.  I’ve never seen an AVR system, so I don’t know whether or not that is the case.  Fortunately, the answer doesn’t matter for this post.

[iii] And myself, up until around this time last year.

[iv] With “assets” being an undefined term, commonly used in the industry to refer to “big iron”: substations, generating stations, control centers, etc.

[v] Although someone pointed out to me that the Attachment 1 criteria don’t have subjects, since they aren’t sentences in the first place.  However, even I have to draw the line somewhere.  I say they have subjects, and I won’t attempt to justify this.  If you want to fight about this, just name the time and place and I’ll be sure (not) to be there.  Of course, the ultimate arbiter of intellectual disputes used to be the dueling pistol.  Fortunately, I don’t believe they make those anymore.

[vi] Of course, ISO NE shouldn’t have capitalized “Generation”, since it isn’t a NERC defined term.

[vii] As I said in my previous post, there was a lot of fantasy going on at that meeting, including by yours truly.  These fantasies allowed people – NERC entities, NERC and WECC staff members, and consultants like me - to go away feeling they’d made a lot of progress at the meeting.  Whatever gets you through the day is OK with me (with some exceptions)…

[viii] This is supported by the fact that Criterion 2.3 refers to “Each generation Facility”.  The SDT could have easily used exactly the same phrase in 2.6, had they wanted to mean that.

It was pointed out to me that the Guidance and Technical Basis for CIP-002-5.1 does pretty clearly imply that the SDT meant to say “generation Facilities”, when it says “Criterion 2.6 includes BES Cyber Systems for those Generation Facilities…”  Of course, the SDT shouldn’t have capitalized “Generation” here.  Leaving that quibble aside, it is quite distressing that they would state so clearly in the Guidance that they meant “generation Facilities” in 2.6, and then quite deliberately use a different phrase in the Criterion itself.  

And since I’m in a conspiratorial mindset at the moment, and I’m in a footnote that nobody who has anything better to do will read (this means you, by the way), I’ll expand on my theory about why the SDT might have made this “error”.  The NERC definition of Facility reads “A set of electrical equipment that operates as a single Bulk Electric System Element (e.g., a line, a generator, a shunt compensator, transformer, etc.)”  Since “generator” is the only one of those terms that applies to a generating plant, this could easily be taken to indicate that, in the context of a generating station, “Facility” would refer to at least a unit of the plant (since a unit is the smallest part of the plant that could be called a “generator”).  Thus, had the SDT used the words “generation Facility” in Criterion 2.6, this might have been taken to mean simply “generation unit”.  This means that notices like the one from ISO NE might have been taken to mean a whole unit was in scope, thus substantially increasing compliance costs for generation entities; ergo the SDT didn’t want to use the term “generation Facilities” in 2.6.

While I’m on a roll, I’ll continue.  I think it’s quite unfortunate that the SDT used the non-defined term “generation” three times in Attachment 1, each time to mean something different.  In Criterion 2.1, the term definitely means an entire generating plant.  In 2.3, it is part of the term “generation Facility”, and clearly means one or more individual units in a plant.  Yet in 2.6, the SDT seems to have wanted to say “generation Facility” (and did say that in the Guidance), but, perhaps because of the consideration just mentioned, they didn’t.  I think this is because they wanted to make the phrase “Generation at a single plant location” in 2.6 apply to other generation Facilities (like AVR), not just the units themselves.  I don’t know whether the SDT thought it was being pretty cool by using the same term in three different ways, but it certainly introduced even more confusion into the bright-line criteria (I’ve already complained several times about the fact that the SDT tried to be far too parsimonious with words, and ended up making CIP-002-5.1 R1 essentially unenforceable).

[ix] Both the first and second notes actually said that the AVR “system” was a BES Cyber Asset.  They should really have said BES Cyber System, since the AVR system includes the AVR computer itself and the RTU.

[x] Of course, some argue that the bright-line criteria are way too lenient on generating plants, since the only plants that are clearly Mediums are those over 1500MW.  Whether or not that’s true, it’s a different issue.

[xi] The ISO NE email provides an even more egregious demonstration of lack of understanding of CIP v5.  The last paragraph contains this sentence: “If an entity receives a notification that a generator is critical to the derivation of IROLs and that notification does not specify any particular component that is critical, then entities would have to consider CIP protection for the entire station as a Medium Impact BES Cyber System.”  So they’re saying that an entire generating station could be a BES Cyber System!  Let’s see…a BCS is made up of BCAs, which are themselves Cyber Assets.  Where do the turbines or the boiler fit into this?  I never thought of them as cyber assets before.  Again, I don’t particularly blame ISO NE; it simply shows the magnitude of misunderstanding that’s out there about the v5 asset identification/classification process, including at some entities that play key roles in the process of implementing v5.

[xii] I will admit there is a chance that the assertion in the email that the AVR should be a Medium BCS under 2.6 doesn’t mean that ISO NE misunderstood how CIP-002-5.1 R1 is supposed to work.  It may mean they have realized that the general understanding of the word “generation” in 2.6 meant that the AVR system couldn’t fall under that criterion – and the whole plant or unit would have to be Medium impact; so designating the AVR a Medium BCS, however “illegal” given the wording of the standard, was the only way to achieve the result they wanted (and as I’ve said, even I agree that this is the best result).  I would give this possibility more credence if ISO NE hadn’t also made the error described in the previous footnote.

[xiii] It may seem odd that I’m reprimanding this NERC staffer for changing his opinion to the one I was advocating in Anaheim – and which I still say is the best decision from a reliability and resource efficiency point of view.  But hey...I don’t want NERC to change its “rulings” based on something I or anybody else says (and because of this staffer’s position in NERC, what he said will definitely be taken by the entity as definitive and will probably be proudly recounted by the entity in the future, should how they handled the AVR issue be questioned in an audit).  I want NERC to rule based on what the standard says.  And if the standard is too poorly written for them to be able to properly address the issue at stake, I want them to admit that and write a SAR for a new standard.

[xiv] In fact, I would argue that NERC needs to stop producing all Lessons Learned (at least on CIP-002-5.1) at some point in time before the compliance date.  This is because it can be literally counterproductive for these documents to come out too late.  For example, I’ve written about the need for clarification on the meaning of “affect the BES” in the definition of BES Cyber Asset.  NERC doesn’t even have this on their list of planned Lessons Learned, and at this point I’d tell them not to bother putting it on.  An entity with High or Medium impact assets needs to be working now on identifying their BES Cyber Assets; since NERC hasn’t come out with any guidance on this issue, entities need to “roll their own” now – as I’ve been saying in a number of posts.  Were NERC to come out with a Lessons Learned in say November, what good would that do – except to call into question the whole asset identification process at the entity, long after it is too late for the entity to make any changes to the list of BES Cyber Assets/Systems?  It’s now better to just wait until after 4/1/16 to start more Lessons Learned on CIP-002, unless they can be finalized by this summer.  Even better, NERC should take up my suggestion to push the compliance date back by a year, so that more LL’s can be developed in time for them to actually do some good.

[xv] I continue to assert that CIP-002-5.1 needs to be rewritten.  This will require an admission of a huge failure on NERC’s part.  Organizations in general – and NERC is certainly no exception – are always very reluctant to take such a step.  But I’m willing to bet that in six to nine months’ time, the situation may have changed to the point that this doesn’t look like such a crazy suggestion.
