Tom Alrich's Blog: 2016

Thursday, December 29, 2016

An Auditor Addresses Auditing on Ambiguous Authorities

My most recent post discussed what kind of information the NERC regions are willing to provide regarding how they interpret (little “I”, of course!) areas of ambiguity in the CIP standards, as well as the format they’re willing to provide it in (verbal or written). As almost a sidebar in that post, I made the assertion – which I used to repeat about once a week during the big debates in 2014 about how to handle ambiguity in the runup to CIP v5 compliance – that, in cases where the entity has to make a decision on how to comply with a truly ambiguous requirement, they need to look at all available guidance from NERC, the regions, Tom Alrich’s blog, The Tibetan Book of the Dead, the I Ching, etc. But in the end, it is up to the entity to make its decision – and especially to document how it made that decision.

If the entity gets audited three years later and the auditor doesn’t agree with the decision they made, there’s no way he or she can issue them a PV (and have it upheld). You gotta do what seems best given the available information at the time. I pointed this out because an auditor I was quoting in the post had endorsed this position by email (quoted in the post).

However, in seeing how I had interpreted what he said, the auditor emailed me back. Here is the full text of his email:

“A clarification...

“When I said we would audit based on what the entity actually did and using the best information at the time, I was not saying we would forgive the entity if they relied on advice or guidance provided at the time of the inquiry, only to be found non-compliant at the time of audit. At the time (I) was referring to the time of audit. And what the entity actually did is very important because we have seen entities misapply our guidance.

“We provide guidance using the very best information available at the time of the inquiry. But, as we all know, things change. Standards are revised. Interpretations, although infrequent, do get approved by FERC. NERC issues guidance through its Section 11 process. And, while we try very hard to thoroughly understand not only the nuances of the Standards, but also the question being asked, we are not infallible. We reserve the right to get smarter.

“We expect the entity to take our guidance into consideration as they would any other. Moreover, we expect the entity to keep up with changes and additional guidance as they evolve. We expect the entity to determine their course of action after due consideration; not just to do something because the Region ‘told them to’ unless the direction is in response to a non-compliance issue, such as a violation mitigation or a RAD.

“That said, the Region is still the best resource. We are closer to NERC, NERC guidance, and the collective wisdom of all eight Regions. Auditors across all eight Regions and NERC have the ability to collaborate and seek consensus on issues as they arise, whether submitted as a question or encountered at audit. We collectively communicate and discuss issues on a frequent basis. We have also had the benefit of seeing numerous and sometimes widely varying approaches to compliance, and know what works and what is problematic.”

The three main lessons I draw from this email are:

1) Suppose your region provided compliance guidance for a requirement or you read about the issue in some official guidance, and you based your compliance approach for that requirement on what you had been told or read. You did this because this was the most recent guidance you could find. This doesn’t preclude you from still being found in violation at audit. NERC or your region may change its mind or decide it made a mistake, the CIP Standards Drafting Team may issue a draft requirement that clarifies the issue, FERC could issue an Order or a NOPR that affects the issue, etc. In other words, contrary to what I wrote in the previous post, just being able to show that your action was in accordance with the best guidance available at the time doesn’t give you a Get Out of Jail Free card for a future violation.[i]
2) Even when your region provides guidance on a requirement, they aren’t expecting you to follow it blindly. If you have documentation of guidance issued by some “official” entity that contradicts the region’s guidance and you want to follow that guidance, not the region’s, you should feel free to do so.
3) And it seems the regions aren’t infallible and can change their minds! I was shocked…shocked! to hear this, of course.

At this point, I realized that the idea of the entity having to decide for itself how to comply, based on all the guidance available at the time, made lots of sense in the context where I described it in 2014 (and it wasn’t my idea, but that of a longtime control system/CIP professional at a large generating organization), when entities were staring at a seemingly fast-approaching v5 compliance date and had to make decisions right away if they were going to be in compliance with v5 by the mandated date. But what does the idea mean now, when the compliance date has long passed?

I then replied to the auditor and laid out a set of scenarios where an entity found or received guidance either a long or short time before an audit, the guidance they received was either very quick or very time-consuming to implement, the entity either could or couldn’t implement the changes before the audit, etc. I asked what his region would do if it found potential non-compliance in the audit, in each of the scenarios I laid out.

Fortunately for me (since I now realize my request was fairly foolish), the auditor didn’t take the bait. Here is the entire text of his response:

“The answer to all your questions is... It depends. When was guidance originally issued, if any? When was it revised? What did the entity do before the current guidance? What did they do with the guidance once it came out? Essentially, the entity needs to tell its story and explain why it thinks it should not be found non-compliant. The auditor will listen and evaluate what the entity presents.

“The auditor starts out reading the plain language of the Requirement. Where there is vagueness and uncertainty, we will be conservative in any response we give during outreach or in response to questions. Our recommendations with respect to virtualization bear that out. We have no idea where the SDT will ultimately go. But our intent with conservative guidance is to give the entity a direction that will likely be compliant with whatever the SDT produces and FERC approves. If the entity wants to bet against our advice, they are free to do so. Maybe they will get lucky, maybe they won't. But the guidance we have been giving today is firmly rooted in the language of the Standards today. And, my Region, at least, will explain our position couched in the language of the Standards.

“We will be, hopefully, reasonable in both our guidance and also our ultimate finding. But there is absolutely no way an auditor will declare today how it will find an entity in 2019. We have to see the facts and circumstances at the time of the audit. In the end, we take industry guidance under advisement and give it weight. But, to the extent the guidance includes errors or contradicts the language of the Requirement, we have no choice but to audit to the language of the requirement. The entity can appeal the auditor's finding through the enforcement process.

“Here is an example. By when does an entity have to first test its Incident Response Plan for Low Impact BCS? Many entities think they have until 4/1/2020 and base that on the fact that there was a delayed effective date (by 12 months) for the equivalent Requirement applicable to High and Medium Impact BCS. But, show me where in any Implementation Plan a deferral is specified for Section 4 of Attachment 1 to CIP-003-6. There is none. The Implementation Plan says 4/1/2017. Maybe the SDT overlooked this detail and intended to give a delayed start date. Maybe not. Regardless, all we have to work with is what FERC approved; the specifics in the Implementation Plan. That and the Excel spreadsheet that NERC published that also shows 4/1/2017.

“Now, if 4/1/2017 comes along and we start writing violations, and then the Implementation Plan is changed to delay the first test, then Enforcement can dismiss the violation. But the auditor determination is a finding of fact at the time of the audit.

“Here is another example. The CIP-002-5.1 guidance recently published by NERC was produced by the entities and not the Regions. It contains errors that the authors declined to address following a Regional Entity review. So, what happens if an entity follows that guidance and gets the wrong answer? Possibly a violation for failing to properly identify and categorize their BCS….the entity will likely not receive a violation if they over-categorize their BCS. But declare something Low when it is Medium, they will likely be found non-compliant, regardless (of) what the guidance says. Guidance is not approved by either the NERC Board of Trustees nor FERC. It is given some deference, but it is not…(binding from a regulatory point of view).

“So be very careful trying to characterize what an auditor will do in the future….(T)hat does not invalidate the appropriateness of an entity asking for advice and guidance. They are still better off than asking some of the consultants out there that have not been as closely involved with the CIP Standards.”

I won’t try to summarize this statement; it seems pretty straightforward to me.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] Of course, it is still pretty unlikely that you would receive a large penalty – or perhaps any penalty – when you can show that you were acting on the best information available at the time you had to make the decision.

Wednesday, December 28, 2016

What will the Regions Say?

This is the second post inspired by comments I received after my post entitled “A Lesson Still Unlearned”.

In this recent post, I made the statement “Since the Rules of Procedure say nothing about the regions having any authority to interpret the standards, no region will ever commit an interpretation to writing, even in an email. I have heard from a lot of entities that you have to call up an auditor and ask his or her opinion, if you have an interpretation question. They might not tell you, of course, but if they do they will only do it on the phone. Of course, this means that if, three years from now, a different auditor issues a PV because their interpretation was different from that of the auditor you talked to, there won’t be any documentation of what the original auditor told you.” (note I slightly revised this quotation to clarify it)

I got two sets of comments from two CIP auditors on this statement. One set was from an auditor who has contributed to many of my posts over the years. The other set was from Lew Folkerth, who was formerly a CIP auditor but is now head of CIP outreach for the ReliabilityFirst region; he has been the subject of a number of my posts. For both of their regions (although I don’t know whether this is true of any of the other regions), my statement above wasn’t completely accurate. Interestingly enough, I seem to have been wrong in different ways for their two regions.

Let’s start with the current auditor. He wrote “My Region receives and responds to far more email requests than…phone calls. And we do so by email…That said, entities like to get on a conference call because it is more efficient and comprehensive to have a two-way discussion than to tag back and forth via email. Entities need to understand that we are giving our best professional opinion, (that) we are not directing an approach or implementation, and that we will audit what the entity actually did in the light of the best understanding of the requirements available at that time.”

In a subsequent email, he elaborated on this: “As far as email responses, it is often the collective opinion of the team and not just one person. We don't usually preface the response. Our entities generally know us well enough that an explicit statement each time is not necessary. We have been doing outreach and responding to questions for 7 years now.”

The auditor is saying that, not only do he and the other auditors in his region respond to “interpretation” questions by email, they do this much more often than by just a phone call. At the same time, he says that in their comments they’re not directing a particular approach to compliance, and they will audit entities on ambiguous requirements based on whatever was the best information available at the time the entity had to make the decision.

For example, suppose you have to make a decision on a particular issue like the cloud or virtualization. You investigate the available guidance and implement your decision; yet NERC subsequently comes out with new guidance that calls into question the judgment you made. This region (and I suspect most if not all the other regions) won’t ding you for not following guidance that wasn’t available when you had to decide.[i]

Moving to the auditor’s second paragraph, it is clear that not only do the auditors in his region respond by email, they also don’t insist that anything they say is merely a personal opinion; they discuss many issues as a team, and are willing to stand behind their team’s collective decision. Of course, this doesn’t mean a) they won’t as a team change their opinion later, nor b) that the individual auditor you talk to won’t actually be giving his or her personal opinion, not the collective one (note the auditor says that they don’t usually preface a statement by saying that it is either a collective opinion or an individual one); so this means you still can’t rely on these emails as being the “official” position of that region. But this does seem to be a step further than what most other regions will do.

Let’s move to Lew Folkerth of RF. Lew writes excellent articles on CIP in the bi-monthly RF newsletter; these articles are always called “The Lighthouse”.[ii] They provide compliance guidance on different aspects of CIP; some have even dared to suggest that what he does constitutes “interpretation”! (of course, I would never use that forbidden word in describing Lew). I have written more than one post on these articles[iii]; you can find all of the newsletters on RF’s website. In an email, Lew wrote “at RF we do a lot of ‘Assist Visits’ which an entity can request through the RF web site. Most Assist Visits are phone calls with multiple RF SMEs and entity SMEs. We seldom, if ever, provide a written response to questions as a group. Individually we may respond to emails, but always with the caveat that this is one person’s opinion and is not an official RF response.” Lew goes on to point out that, in his Lighthouse articles, any “interpretation” he does of the Standards is his own opinion, nothing more.

So Lew is saying that RF’s auditors and outreach people will sometimes respond to questions by email, but they will always preface the email by saying this is their personal opinion. And the same goes for his Lighthouse articles. Any collective opinions will only be expressed verbally, not in writing (and Lew doesn’t even say that RF even formulates any collective opinions of the auditors, as the other region just discussed does). RF clearly doesn’t go as far as the other region goes, and I suspect the other regions fall more in RF’s camp – although I will point out that Lew’s “Lighthouse” articles are literally unique among the regions, in actually providing compliance guidance in an article format.

I’m not making any judgments on any of the NERC regions in this post. There are no official NERC guidelines to the regions for providing unofficial guidance! If anything, the moral is that if you plan to rely heavily on something that a CIP auditor or outreach person in your region tells you, you should find out under what conditions the opinion is provided. Is it an individual opinion? More than that? And you also need to remember that you will never receive an “official” position from any region, even if it is more than the individual auditor’s opinion. There will always be some risk that when you get audited three years from now, the auditor won’t even have heard about what was originally said to you, and in any case will discount it as simply another auditor’s opinion.

[i] This idea is very similar to what I often said during the time in 2014 and 2015 when entities had to make decisions on ambiguous areas like the meaning of “Programmable”: You can only be held responsible for compliance with the meaning of a requirement as it was understood at the time. This means you have to look at all available guidance, but in the end it is up to the entity to make its decision – although a consultation with its region is definitely also advised.

When I asked the auditor to clarify this point, he wrote back an email that added a further dimension to this particular issue that I hadn’t anticipated. Rather than try to shoehorn that into this post, I will do a new post dedicated to it soon.

[ii] I hereby reveal the meaning of this title. I have been harboring this dark secret for so long, I can no longer continue to do so in good conscience. Lew is a great fan of the lighthouses on the Great Lakes, and always adorns his column with a picture of one of them. I am also a fan of those lighthouses, but I’m sure he has seen far more than I have.

[iii] And my next post will call people’s attention to two recent articles, which I think are really excellent.

Monday, December 26, 2016

“Implementation Guidance”

I received some very good comments on my most recent post. I will discuss them in three new posts. Here’s the first one.

A longtime NERC practitioner – whom I have known for a number of years – emailed me about my most recent post, in which I reiterated the sorry fate of all of NERC’s attempts in recent years to provide official “interpretations” of the wording of CIP requirements and definitions, which didn’t go through the only two processes allowed by NERC’s Rules of Procedure: a Request for Interpretation, or a SAR to rewrite all or part of a standard or standards. NERC’s motivation for trying to unofficially “interpret” the CIP standards has always been admirable: a desire to have a uniform auditing method that all regions and auditors will follow. But in every case, NERC has run up against the same wall they’ve hit before: there is no way to do this short of an RFI or SAR – and both of those take years to yield results (and may come up empty-handed, as in the case of two Interpretations of CIP requirements that were approved by NERC but remanded by FERC four years ago).

In my post, I provided three examples from the most recent NERC CIPC meeting that seem to indicate that NERC (and others in the NERC community) has still not learned this lesson. Regarding the third of these examples, my friend said:

“On your 12/17/16 posting you state:

3) At one point, Tobias brought up something I’d forgotten about: that somehow a number of industry organizations, including the trade associations, have become empowered to write up “guidance” on CIP compliance questions. I had heard this before, but couldn’t understand what it meant – and I still can’t. Any organization has always been empowered to write guidance for its members (and any others who wish to follow it) on how to comply with any standard – whether a NERC standard or not. But there is no way that this can be considered some sort of “official” guidance, which NERC will endorse as something the regional auditors should follow. And if that’s the case, why even imply that allowing the organizations to issue guidance is in some way a mitigation of the wording problems with CIP v5/v6? It isn’t.

“Actually, the ERO Enterprise does endorse Implementation Guidance documents and auditors are directed to show “deference” to the guidance. The Implementation Guidance (documents) are intended to be examples of ways to be compliant with requirements, but not prescriptive as the only way to comply. More info on the process, the ERO’s processes and existing guidance is at http://www.nerc.com/pa/comp/guidance/Pages/default.aspx.”

My friend also attached a short document titled “ERO Enterprise CMEP Practice Guide: Deference for Implementation Guidance” (I’m having trouble reaching NERC’s web site today, so I can’t provide the link; but you can Google it. I don’t think that’s NERC’s fault. I’m in an Asian country that sometimes seems to restrict access to certain sites for reasons unknown to me. I couldn’t reach RF’s site, either). And I now want to clarify the paragraph from my post that my friend quoted. I’m not at all opposed to NERC’s endorsing guidance prepared by other organizations, as long as there is no implication that it will provide some unique perspective on the meaning of a requirement that would elevate it over guidance provided by other less privileged sources – say, this blog.

The NERC document my friend referenced says “ERO Enterprise CMEP staff (essentially, NERC and regional auditors) will provide deference to ERO Enterprise endorsed Implementation Guidance.” And what do they mean by “deference”? The last sentence of the document reads “If CMEP staff determines the registered entity was found in non-compliance with a NERC Reliability Standard or Requirement, but in good faith, relied on Implementation Guidance, CMEP ERO Enterprise CMEP staff will provide deference to ERO Enterprise endorsed Implementation Guidance.”

I will take NERC at its word that Implementation Guidance doesn’t constitute an Interpretation of a requirement or definition, so I’ll stipulate this is perfectly legal (i.e. compliant with the Rules of Procedure). But my problem is that NERC seems to think that the possible future development of Implementation Guidance on sticky issues like VOIP and the cloud (as well as others) constitutes in some way at least partial compensation for the fact that the current CIP standards require interpretation regarding these issues. Implementation Guidance documents on these and other issues will certainly be welcome, but at the end of the day NERC entities will still not understand what the standards say about these interpretation issues (because they don’t say anything about them, or what they say isn’t clear); in other words, there won’t be any certainty on these issues. Once again, the only legal way to provide definitive guidance is an RFI or a SAR.

If you haven’t been reading my posts religiously the past few months, you may think I’m now pushing a hardline position that NERC has to immediately write a bunch of RFIs and SARs and set 10 or 20 new Standards Drafting Teams to work on these. That’s the last thing I want. What I do want is a single SAR to rewrite all of CIP in a non-prescriptive, objectives-based format, which will change arguments like these from ones with grave compliance implications to simply issues requiring guidance. This non-prescriptive format isn’t something completely new, but can currently be found in CIP-013, CIP-014, CIP-007-6 R3 and CIP-010-2 R4[i], as well as at least two other current CIP requirements.

I say this because I am now convinced that CIP is at an impasse: It will be impossible to address significant interpretation questions like VOIP, and especially to accommodate more recent technologies like virtualization and the cloud, any other way. More on this coming soon to a blog near you.

[i] In listing these examples of current (or future, in the case of CIP-013) non-prescriptive NERC standards or requirements, I’m not saying that any one of the differing formats of these standards and requirements is exactly what should be followed for the “new CIP” standards. I and two co-authors are currently working on a book that will provide (hopefully by the end of 2017) what we think would be the best format, and I will sometimes discuss working ideas for this in my blog.

Saturday, December 17, 2016

A Lesson Still Unlearned

I attended the NERC CIPC meeting in Atlanta this week. There wasn’t a lot of discussion of the CIP standards, since they are just one of the topics discussed at those meetings, and we aren’t currently in the throes of any momentous developments in CIP. But I was struck by one common theme I heard from a couple of the speakers, and realized that a lesson I thought the NERC community had learned over the past four or five years still hasn’t been learned.

The lesson is simple: If there are gaps or inconsistencies in a NERC standard that can’t be fixed with a straightforward reading of the text, the only permanent remedy is to write a SAR (Standards Authorization Request) and draft a new or revised standard that fixes the problem. There is no other remedy allowed by NERC’s Rules of Procedure.

NERC has run into this immutable fact several times in the past (at least regarding CIP), and has spent a lot of time and effort trying to get around it. Does anybody remember the 2012 CANs (Compliance Application Notices) and CARs (Compliance Analysis Reports) that were intended to fix inconsistencies in interpretation of the CIP v3 requirements? They caused a lot of weeping, wailing and gnashing of teeth in the NERC community, and they were ultimately withdrawn.

Now let’s go to CIP v5. After FERC approved v5 in late 2013 and entities took a serious look at the requirements in 2014, they began to realize there were a lot of problems with the wording – ambiguities, inconsistencies and just plain holes. NERC acknowledged there were some problems and vowed to address them in time for entities to be fully compliant by the faraway date of April 1, 2016.

They promised a large and shifting array of documents to fix the problems. There may have been others, but I remember NERC pointing at various times to the V5 Implementation Study, the RSAWs, Section 11 Supporting Documents, the Lessons Learned and finally the infamous Memoranda (which created a huge firestorm, as a result of which all of them were revoked and removed from the NERC web site).

All of these documents fell into one of two types: serious attempts to address gaps or inconsistencies; or “helpful hints” regarding CIP compliance that didn’t address wording problems. Documents of the first type (for example, the Lesson Learned on “Programmable Electronic Devices”, which seemed to many of us to be a sensible way to fix the problem caused by the fact that the word “Programmable” in the Cyber Asset definition wasn’t itself defined) were without exception ultimately withdrawn and removed from NERC’s website – when it became clear they weren’t just providing implementation guidance but actually going beyond the strict wording. Documents of the second type were left in place, and remain there to this day (note that I’m not complaining about the second type of documents. These certainly did fulfill an important need. But they didn’t clear up gaps or inconsistencies in the wording of the standards, as they were initially touted as doing).

Late in 2015, NERC seemed to finally acknowledge the hard truth that I’ve already stated: The only way to change a NERC standard or definition is to go through the standards drafting process, which of course usually takes years. They bit the bullet and called for a new CIP drafting team to fix some problems in CIP v5 (which were identified by the CIP v5 Transition Advisory Group). When FERC approved CIP v6 in January of this year, they ordered NERC to make three changes: clarify the LERC definition, develop requirements for Transient Electronic Devices at Low impact assets, and protect all Control Center-to-Control Center communications. These items were added to the new SDT’s SAR. I supported the new SDT, but pointed out that the really fundamental problems with CIP v5/v6 weren’t even mentioned in the SAR. In fact, I later published a list of important problems that aren’t included in the SAR.

To be honest, at that point (around April of this year) I assumed that it had finally become apparent to the NERC community, and especially to NERC itself, that there was no point in talking any more about ingenious “solutions” to CIP wording problems that didn’t involve a new SAR[i]. That is why I was surprised by three statements that were made at this week’s CIPC meeting:

1) Tobias Whitney of NERC always leads a discussion of current developments in the CIP standards at CIPC meetings. He discussed the recent Technical Conference that included a day of discussion on the problem of how entities can utilize the cloud while staying compliant with CIP[ii]. He said NERC was working on this issue and would put out a paper soon. He implied that this would settle the matter of how the cloud could be utilized in a CIP environment.

I pointed out in the meeting that this paper will certainly be valuable, but it isn’t suddenly going to change the fact that a strict reading of the CIP standards wouldn’t allow an entity to utilize the cloud. Tobias didn’t seem to disagree with this, but he also didn’t state its implication: Unless NERC plans to write a SAR for a new CIP version to incorporate the cloud (and NERC has made no moves to do that), the only way that a NERC entity can safely utilize the cloud is to reach an accommodation with their Regional Entity to allow this. I happen to think that most if not all of the REs will be open to entities that want to utilize the cloud (as they have been to virtualization); but I’m also sure a lot of entities will hesitate to move in this direction, until the cloud is officially recognized in the CIP standards. And I won’t even venture a guess as to how many years from now that will be!

2) The second statement was made during a discussion of the VOIP issue. There, it was again implied that a rigorous, thoughtful analysis of the wording of the requirements and the BES Cyber Asset definition would resolve the problem. I’ve got news for you, guys: This has been a serious concern for at least a couple of years, and has been debated endlessly. At least 7 of the 8 NERC regions have decided they do not consider VOIP systems to be automatically BES Cyber Assets; and I believe the eighth region now takes this position as well. But the problem can’t be fixed permanently until the BCA definition is changed, which – ta da! – requires a SAR (and while the current CIP SDT is charged with changing the BCA definition to address two other problems, they are not charged with addressing this one. Moreover, it is highly unlikely they will decide to add this to their already-overfull agenda).

3) At one point, Tobias brought up something I’d forgotten about: that somehow a number of industry organizations, including the trade associations, have become empowered to write up “guidance” on CIP compliance questions. I had heard this before, but couldn’t understand what it meant – and I still can’t. Any organization has always been empowered to write guidance for its members (and any others who wish to follow it) on how to comply with any standard – whether a NERC standard or not. But there is no way that this can be considered some sort of “official” guidance, which NERC will endorse as something the regional auditors should follow. And if that’s the case, why even imply that allowing the organizations to issue guidance is in some way a mitigation of the wording problems with CIP v5/v6? It isn’t.

You may have noticed one common theme to all three of the points above: In the end, the only “interpreters” of the CIP standards that matter are the regions. If the auditors in a region think a requirement should be interpreted in a particular way, the entities in that region would be well served to make sure they understand their thinking, since they are likely to be audited based on that. And guess what? I doubt there’s any NERC entity in the US who won’t assign the utmost importance to their own region’s interpretation of a CIP requirement.

But there is a problem with this: Since the Rules of Procedure say nothing about the regions having any authority to interpret the standards, no region will ever commit an interpretation to writing, even in an email. I have heard from a lot of entities that you have to call up an auditor and ask his or her opinion on an interpretation question. They might not tell you, of course, but if they do they will only do it on the phone. Of course, this means that if, three years from now, a different auditor issues a PV because their interpretation was different from that of the auditor you talked to, you won’t have any documentation of what the original auditor told you. So this is obviously not any sort of permanent solution.

The irony of this is that, after the CAN/CAR debacle with CIP v3, NERC pointed to CIP v5 as the version that would finally fix the problem of regional variability in interpretation of requirements. Not only has v5 (and v6) not fixed that problem, it has made it worse. Because of the much larger number of gaps and inconsistencies in the wording of the v5/v6 requirements and definitions, the regions have now become essential to understanding how to comply. I have said a number of times – but never in this blog – that the CIP v5/v6 requirements only “mean” what your region says they mean. Nothing more, nothing less. So keep your eye on your region.[iii]

Before I let you go, repeat after me: There is no way a problem with the wording of a NERC standard can be permanently fixed, other than by writing a SAR and drafting a new standard. Ignore anyone who tells you something different.

But before you say, “Well, then we just need to draft a bunch of SARs and get some new SDTs to work on fixing the wording problems in CIP v5/v6”, I recommend you read my next post. This will prove (using advanced mathematics) that we have reached the limit of changes that can be made in the current prescriptive CIP standards. If we want to incorporate new areas like virtualization and the cloud, we will have to move to a non-prescriptive, outcomes-based approach. And that will be the only way to permanently “fix” the problems with the current wording (most of which problems will become moot in a non-prescriptive format).

This will require a complete rewrite of CIP, as well as changes in how the standards are audited and managed. But the alternative is to have what we have now: a set of standards whose interpretation is ultimately dependent on the auditors. If this is your idea of how mandatory standards with million-dollar-a-day penalties should be interpreted, then you should be happy as a clam now. If you aren’t comfortable with this, then I suggest you start asking what can be done to change the situation.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] When I say that a SAR is the only way to change a standard, you may be asking “How about RFIs (Requests for Interpretation)? They are also permitted in the NERC Rules of Procedure. An Interpretation is voted on by the NERC ballot body and approved by FERC, just like a new standard is.” But Interpretations aren’t meant to fix problems with wording; they can only explain wording that is already in place. So they can’t help with the problems I’m talking about. Those problems can only be addressed by changes in the wording of the standards or definitions.

[ii] There are a number of CIP problems that come up with the cloud. One of the most important has to do with CIP-004 R3 – R5. For example, if a NERC entity puts data from an ESP in the cloud, it will most likely reside in one or more huge server rooms, to which hundreds of technicians may have access. Since any one of those technicians might in theory be able to walk up to a server with that data and look at it, a strict reading of the standard would require that each one of them be vetted by the entity for access to their data before being allowed in that server room, and that the cloud provider notify the entity when any of those people leaves their employ. No cloud provider in the world will ever agree to do this. The provider would need to make sure that all servers that hold the entity’s data be housed in a locked room, with access granted only to technicians that have been vetted by the entity. This would effectively destroy any cloud provider’s business model.

[iii] I just noticed that I had predicted this situation in a post a month after FERC approved v5 in 2013. In a footnote, I said “I’m guessing that, if CIP-002-5 isn’t changed, the way this problem will be finally dealt with (not solved) is through the Regional Entities taking it upon themselves to develop their own interpretations. These won’t have any more force than a NERC interpretation, but since the RE’s are the ones who do the auditing, it is far more likely the registered entities will follow the lead of their region.” Of course, I was specifically discussing problems with CIP-002 here, but the same can be said for any of the v5/v6 standards: the Regional Entities are the only real arbiters of interpretation questions. This situation will remain until the standards are revised or rewritten entirely (which is my preference, in case you haven’t noticed that yet).

Sunday, November 27, 2016

Not Dead Yet

A number of my posts have started with a mention of something I read in EnergySec’s weekly newsletter. This isn’t because I’m being paid to promote the organization (although I think very highly of them), but because the newsletter often sees things that nobody else does (including me).

One feature of the newsletter is the Blog Roll, where Brandon Workentin, the very knowledgeable EnergySec staff member who edits the newsletter, discusses recent blog posts of interest to the ICS security audience. Two weeks ago, Brandon wrote about my post on the issue of whether VOIP phones should be considered BES Cyber Systems. He pointed out that, while nominally about VOIP, the post “is really about the need, in some cases, for entities and auditors to interpret a requirement ‘as if it had been written differently.’” As usual when Brandon writes about my posts, he hit the nail on the head; in fact, I don’t think I could have summarized the post any better than he did. This is why I always look forward to reading what he says about my posts: so I can learn what I meant when I wrote them!

In the penultimate paragraph of that post, I concluded (in reference to how VOIP systems should be treated under CIP-002 R1) that “…both the entity and the auditor need to insert two words into the (BES Cyber Asset) definition (Note: I could have gone on to say “in order to have a BCA definition that properly addresses VOIP phones and other ‘support systems’”).” I continued, “In other cases (and I will illustrate one case in another post that is coming soon), both the entity and the auditor need to ignore some of the wording in the requirements.” Well, the post that is “coming soon” is here! I will now discuss how one particular requirement in CIP v5/v6 – specifically, CIP-002-5.1 R1 with Attachment 1 – can only be properly followed by an entity or an auditor if part of the wording is ignored. But first some background.

In late April 2013, FERC surprised most in the industry (and certainly me), when they issued their NOPR saying they intended to approve CIP version 5 and have it supersede CIP v4, which was due to come into effect on April 1, 2014. Since they had approved v4 in April 2012 (which was also a surprise to most of us in the industry), I had been focusing on v4 in my blog posts, because after all v4 was now the law of the land, and v5 was still struggling to get the votes it needed to get approved (and there was a serious question whether it would ever be approved).

After the NOPR came out, I decided to write a series of posts on the CIP v5 standards, starting at the beginning with CIP-002-5.1. In this first post, I sat down with CIP-002 R1 and Attachment 1 and tried to piece together exactly how a NERC entity was to identify its BES Cyber Systems and then classify them as High, Medium or Low impact, along with identifying its assets that contain Low impact BCS.

But a funny thing happened as I worked on this post; no matter how much I scrutinized the wording of R1 and Attachment 1, I simply couldn’t put together a logical chain of steps that would produce the required result. I came to a point where no amount of logic could bridge the wide chasm caused by the contradictory nature of the wording I found in R1 and Attachment 1. I concluded that this chasm needed to be closed, in order for CIP v5 to be a set of standards that NERC entities could comply with and that NERC auditors could audit.

The primary problem[i] I identified in this post was in the wording of Attachment 1. This seems to be written under the implicit assumption that the entity has already identified BES Cyber Systems at all of its BES Assets – High, Medium and Low impact – before it even starts to classify them. In Sections 1 and 2 of Attachment 1, the entity is told to identify those BCS that meet the High and Medium impact criteria, respectively. And Section 3, where Lows are identified, seems to follow suit when it tells the entity to identify “BES Cyber Systems not included in Sections 1 or 2 above…” This clearly implies that the entity had a comprehensive BCS list at the beginning of this process. But there’s only one problem with this: The entity isn’t required to identify BCS at Low assets, so it never had a comprehensive list to start from (and therefore it can’t identify BCS “not included in Sections 1 or 2”). In fact, Requirement Part 1.3 of CIP-002 R1 (which “sends” the entity to Section 3 of Attachment 1 to figure out how to identify Low impact assets) says “A discrete list of low impact BES Cyber Systems is not required.”
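To make the gap concrete, here is a minimal sketch of what a literal reading of Section 3 asks for. It is an illustration only: the function name and its inputs are hypothetical and appear nowhere in the standard. The point is simply that “BES Cyber Systems not included in Sections 1 or 2” is a subtraction, and a subtraction needs a complete list to subtract from.

```python
def section_3_literal(all_bcs, high_bcs, medium_bcs):
    """Literal reading of Section 3: Low BCS = every BCS not caught by Sections 1 or 2.

    all_bcs is the comprehensive BCS list that the wording silently assumes.
    Pass None to model the real situation, where no such list exists because
    BCS at Low assets never have to be identified.
    """
    if all_bcs is None:
        raise ValueError("no comprehensive BCS list exists to subtract from")
    return set(all_bcs) - set(high_bcs) - set(medium_bcs)
```

Called with a full inventory, the function works as the wording implies; called with None, which is the position Requirement Part 1.3 leaves the entity in, it has nothing to work with.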

This discovery led to my writing a series of posts over the summer of 2013, advocating that CIP-002-5.1 R1 and Attachment 1 be rewritten to bridge this chasm. Since I knew that NERC wouldn’t undertake this task on their own, I put my hopes in FERC – that they would order NERC to do this when they approved CIP v5. In fact, I even wrote out my proposed revised wording, and submitted that to FERC during the comment period on v5.

Of course, FERC didn’t take me up on my suggestion when they approved v5 in Order 791 on November 22, 2013 (50 years to the day – in fact, almost to the hour – after the assassination of President Kennedy in 1963. Just coincidence, you say? Well….). Instead, they ordered the four changes that led to CIP v6, none of which addressed this issue. I then pinned my hopes on the idea that NERC would find some way to “interpret” CIP-002 so that it would make sense, even though the actual wording wouldn’t change.

NERC did issue some “Lessons Learned” in 2014 and 2015 that tried to resolve ambiguities (although not the big one I was concerned about), but these were all withdrawn. NERC came up against the reality that the Rules of Procedure only allow one way to change a standard: write a Standards Authorization Request, seat a Standards Drafting Team, and get to work drafting the new standard (which takes easily 3-4 years to produce results). And there is only one official way to interpret a standard: go through the Request for Interpretation process (which takes easily 2-3 years, with no guarantee of ultimate success. FERC remanded the last two CIP RFIs that were presented to it in 2012, although there may be a test soon when EnergySec’s RFI on Criterion 2.1 in Attachment 1 of CIP-002-5.1 is presented to them).

In the end, NERC admitted there needed to be a new version of CIP (although the fact that it is a new version seems never to be stated publicly. Instead, the drafting team to this day is called the “CIP Revisions” SDT, presumably to demonstrate that NERC has a great sense of humor), and indeed the new SDT started work last spring. The results of their work will be implemented in pieces over probably the next 2-6 years, although the fundamental problem I’m concerned with will most likely never be addressed at all.[ii]

You may wonder why, if there is such a basic contradiction in the wording of CIP-002-5.1 R1, the whole edifice of CIP v5 hasn’t come crashing down. After all, R1 can very easily be said to be the fundamental requirement in CIP v5 and v6 (and it will be in v7 as well). This is the requirement where the entity figures out what all the other requirements apply to. If the entity is lucky, their list of applicable assets and cyber assets is very small. If they’re not lucky, there is a huge list, and the entity had better figure on spending some significant portion of their total annual budget on CIP compliance for at least the next 5-10 years (a number of entities are doing exactly this. In fact, I know of one medium-sized entity that has allocated over ten percent of their entire annual revenues to cyber security and NERC CIP compliance, at least for this year).

So why didn’t CIP v5 collapse? There was certainly a lot of confusion over this requirement, and there continues to be confusion today; in my opinion, this confusion alone probably delayed many entities’ implementation of v5 for at least a year. But NERC entities and NERC auditors seem to be in fairly broad agreement now on how to comply with R1, as well as on what constitutes non-compliance with the requirement.

How did this agreement come about, given that applying strict logic doesn’t allow a single consistent interpretation of the wording of CIP-002 R1 and Attachment 1? It’s actually very simple: people reverted to the overall framework for the asset identification process in CIP v1 – v4, even though that wasn’t directly supported by the language in CIP-002-5.1. To briefly (briefly for me, anyway!) summarize what happened:

The bright-line criteria in Attachment 1 made their first appearance in CIP-002 version 4. If you read the v4 criteria, you will recognize the predecessors to most of the High and Medium impact criteria in v5. In v4, the criteria applied directly to assets, not cyber assets or BCS. BES assets that met one of the v4 criteria were Critical Assets and thus in scope for CIP v4; those assets that didn’t were out of scope for CIP v4 altogether (of course, there was no distinction between High and Medium impact assets. An asset was either critical or it wasn’t in scope for CIP v4 at all). Naturally, there were no criteria applying to Low impact assets, although some assets that would have been Critical Assets under v4 ended up as Lows under v5 (notably blackstart assets and Facilities).
When the SDT started work on CIP v5 after completing v4 (the same SDT developed v2, v3, v4 and v5, although there were a number of personnel changes over the four years the team was in existence), they more or less imported the criteria directly from v4, even though in theory the whole purpose of Attachment 1 had changed. In CIP-002 version 5, Attachment 1 in theory provides criteria for identifying BES Cyber Systems, not Critical Assets as in v4. But instead of finding new criteria that applied to cyber systems, the SDT decided essentially to reuse the asset-based criteria from v4 for the High and Medium criteria in v5. However, they “front-ended” the criteria with language that tried to make them conform with the new purpose of Attachment 1 – to classify BES Cyber Systems.
In retrospect, doing this was a big mistake. The SDT was using asset-based criteria to classify BCS, but the wording of Attachment 1 (and some of the wording of R1) sounded like the BCS were really being classified on their intrinsic BES impact. Question: How could the SDT have developed meaningful, measurable criteria that actually applied to BCS, not assets/Facilities? Answer: They would have had to bring in some measure of how important each BCS was to the BES.

In CIP v1-4, a Critical Cyber Asset was defined as one that is “essential to the operation of” a Critical Asset – so it was the impact on the asset that was important, not on the BES itself. In order to achieve the stated goal of CIP-002-5.1 R1, the v5 criteria (in Attachment 1) should have ranked the intrinsic “essentialness” of each BCS to the BES itself, not to an asset. I admit this would have been very hard to do. But of course, the fact that it would have been hard simply reflects a point that I argued with SDT members in 2011 and 2012: It doesn’t make much sense to talk about the intrinsic impact of a BCS on the BES.[iii] Rather, the impact of a BCS is almost always only through the asset or Facility it controls or supports. It is those assets or Facilities, not the BCS themselves, that should be classified, as was the case in v1-4. As it is, I will demonstrate below that literally the entire NERC community has reverted to the v1-4 method of classifying assets, not BCS, even though this involves ignoring a lot of the wording in CIP-002-5.1 R1 and Attachment 1.

I think the SDT made this mistake because they knew it was now verboten[iv] to require entities to identify BCS at Low impact assets[v]; so they had to let the entity simply identify Low assets, not Low BCS. This meant they had to make all the Attachment 1 criteria (High, Medium and Low) apply to assets, not BCS. If they didn’t do this, how could an entity possibly identify its Low assets, since it hasn’t previously identified its High and Medium assets? Unfortunately, the SDT at the same time tried to maintain the fiction that Attachment 1 was really about classifying BCS, not assets, which is why the literal wording of Attachment 1 is in direct contradiction to the way that 99% of NERC entities and auditors are interpreting it (and the way that IMHO the SDT members intended it to be interpreted).
What the SDT should have done was go back to the CIP v1-4 model: First the entity identifies its most important assets in scope. These were Critical Assets in v1-4, but High or Medium assets/Facilities in v5 (and in v5, the entity uses the criteria in Attachment 1 to classify High vs. Medium assets). Next, the entity identifies the Critical Cyber Assets (in v1-4) or Medium or High BES Cyber Systems (in v5) associated with those assets (with High or Medium BCS defined as those associated with/located at High or Medium assets/Facilities respectively). Finally, the entity subtracts its Medium and High assets from its total list of BES assets; the remaining assets are Lows (of course, there is no requirement to identify BCS at Lows). This wouldn’t have been terribly elegant, but it would have been logically consistent.
However, even though the sequence I just described in the previous paragraph isn’t in the actual wording of CIP-002-5.1, it is in fact how literally every entity (that I know of) is complying with CIP v5/v6, and it is how literally every auditor (again, that I know of) is auditing CIP-002-5.1 R1. Many, if not most, of the entities, and probably more than a few auditors, are quite unaware that the compliance methodology they are following (or auditing against) isn’t in the actual wording of the standard. But that doesn’t really matter. There isn’t much confusion over this issue now, since this consensus has developed in spite of the wording of R1 and Attachment 1 (although there was a lot of confusion over these and other parts of CIP v5 in 2014, when many entities lost part or all of a year in their v5 compliance efforts, as they tried to figure out what the requirements actually meant, and waited for definitive guidance from NERC, which never came).
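For readers who think better in code, the three-step sequence described above can be sketched roughly as follows. This is only an illustration of the logic, not anything found in CIP-002-5.1: the Asset class, its fields and the classify function are hypothetical names, and the High or Medium rating is taken as an input rather than derived from the Attachment 1 criteria.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Asset:
    name: str
    # "High" or "Medium" if the asset meets an Attachment 1 criterion, else None.
    impact: Optional[str] = None
    # Cyber systems associated with / located at the asset.
    cyber_systems: List[str] = field(default_factory=list)


def classify(assets: List[Asset]) -> Tuple[Dict[str, str], List[str]]:
    """Asset-first sequence: rate the assets, then let the BCS inherit the rating."""
    # Step 1: pick out the assets that meet a High or Medium criterion.
    rated = [a for a in assets if a.impact in ("High", "Medium")]
    # Step 2: each BCS at a rated asset takes that asset's rating.
    bcs = {system: a.impact for a in rated for system in a.cyber_systems}
    # Step 3: every remaining asset is Low; no BCS list is built for Lows.
    low_assets = [a.name for a in assets if a.impact not in ("High", "Medium")]
    return bcs, low_assets
```

Note that the cyber systems at a Low asset never appear in the output at all, which matches the statement that a discrete list of Low impact BCS is not required.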

To finish my long sermon, this is another case where both entities and auditors cannot simply follow the strict wording of the CIP v5/v6 standards in order to comply with and audit them. In the VOIP case in my earlier post, I stated that what almost all entities and auditors are doing is implicitly inserting two words into the BES Cyber Asset definition – when they do that, confusion over how to classify VOIP and other “support systems” melts away. In the case of CIP-002 R1 and Attachment 1, I have just tried to show that the only way to overcome the logical inconsistency of the wording is for entities and auditors to ignore a lot of the actual wording and substitute alternate wording – and they have done that with a remarkable degree of unanimity.

So what’s the big deal with all of this? Since neither of these issues is causing entities to be out of CIP compliance (or to devote a lot of resources to something which they have mistakenly identified as required for compliance), what’s the problem? Aren’t these both simply dead issues?

One problem is that – in my opinion – the need to either supplement or ignore some of the wording of CIP v5 (and especially in CIP-002 R1, the fundamental requirement of CIP v5 and v6) makes the standards unenforceable in what I call the strict sense: If an entity is fined for not properly identifying BCS and challenges that fine on the grounds that the wording of CIP-002-5.1 R1 is contradictory, I think the fine will be thrown out by any judge who spends ten minutes reading the requirement. But since no NERC entity has actually ever challenged a CIP fine in court, I admit this is a fairly theoretical issue.[vi]

The real problem is that sometimes entities take the words of CIP-002 R1 and Attachment 1 literally, and end up mis-identifying BCS and assets for that reason. In my next post, I will provide an example that I recently came across, where this is exactly what happened.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] I actually pointed out two fundamental problems with CIP-002-5.1 R1 in the post. The first (which I said was of lesser importance) had to do with what I thought at the time was inconsistent use of the terms “asset” and “Facility”. I later learned that this wording is actually not inconsistent, although it is terribly confusing. In practice, very few entities – and perhaps very few auditors – understand that these two terms actually refer to different things. This without doubt has led many entities to over-identify Medium BCS at Transmission substations, although a few large Transmission entities have told me that the greatly increased effort that would be required, to properly take the word “Facilities” into account to reduce the number of BCS, outweighs the benefit that would be derived from having fewer Medium BCS in the first place.

[ii] I haven’t even advocated that the SDT take up the issue discussed in this post. It would require a huge effort and multiple ballots, and this effort alone would probably push the final implementation of v7 back another year or so. I have the greatest admiration for this SDT, but they have a huge amount on their plate as it is, and I am becoming convinced that they will never succeed in addressing one of the biggest items on that plate (as long as the CIP standards remain basically prescriptive). More on that topic coming soon to a blog near you.

[iii] In fact, I remember a discussion on an SDT call in 2012, when I asked the chairman of the SDT to name a cyber asset that directly impacted the BES - not indirectly through an asset or Facility. He came up with “leak detectors”, a device I hadn’t heard of. Evidently, these sit directly on a line and are not located within a substation or other asset. However, to require that all cyber assets be evaluated for their BES impact, not their impact on an asset/Facility, solely because there are one or two BCS that actually do directly impact the BES, is the textbook example of the tail wagging the dog. Yet this is in fact the stated purpose of CIP-002 R1: identify and classify BES Cyber Systems based on their impact on the BES. Ironically, those leak detectors wouldn’t be in scope for CIP v5/v6 anyway, since the only BCS that are in scope are those located at one of the six asset types listed in R1.

[iv] Here is why it was forbidden: In 2009, the SDT put out a “concept paper” that first introduced the ideas of BES Cyber System and the BES Reliability Operating Services (although the latter were called “BES Reliability Functions” in the paper). While I think both of these are good concepts, I think that pretending that the impact of a BCS on an asset has nothing to do with whether a Cyber Asset impacts the BES – while at the same time including criteria in Attachment 1 that are entirely asset (or Facility)-based – is the Fundamental Sin of CIP v5, and is the reason it will never be enforceable in the strong sense.

[v] To comply with the first draft of CIP-002 v5, which was posted for comment in November 2011, the entity would literally have had to start their BCS classification effort by identifying BCS at all BES assets – High, Medium and Low impact. Then they would go through Attachment 1 and classify each BCS as High, Medium or Low impact. Of course, NERC entities weren’t too excited about this since it meant identifying all Low BCS. The first draft went down to resounding defeat, with none of the standards receiving much more than a 20% positive vote (I wrote a post pointing this out in December 2011. It was posted on a Honeywell blog that is no longer available online. If you would like to see that post, send me an email at talrich@deloitte.com). At this point, the SDT realized they had to make it explicit that BCS didn’t have to be identified at Lows. Unfortunately, rather than completely rewriting CIP-002 R1 and Attachment 1 so that they took account of that new reality, the SDT kind of tinkered around the edges but left the basic structure in place. The result was the logically inconsistent R1 and Attachment 1 that we know and love so dearly today.

[vi] Although I do believe that if this ever happens, the result will be disastrous. Think of what would happen if a judge suddenly invalidated CIP-002-5.1 R1, and there was no longer a legally-approved way to identify and classify BES Cyber Systems. In my opinion, this would effectively invalidate all of CIP v5 and v6 (and, by the time that happens, probably v7). What would NERC do then? Go back to v3? Throw in the towel on regulating cyber security? Beats me.

Tuesday, November 22, 2016

Supply Chain Security Webinar Recording Posted!

Last Friday, Larry Kivett and I from Deloitte, and Edna Conway of Cisco, participated in a webinar under UTC's auspices on supply chain security; my topic was CIP-013, the supply chain security standard currently under development. The webinar was very well received with lots of questions, and I think you'll find it interesting.

The recording can be found here. Note that you can also find the recording of the previous Deloitte-Cisco webinar on Virtualization and NERC CIP at the same link.

If you have any comments or questions on the webinar, please email them to me at talrich@deloitte.com.

Sunday, November 13, 2016

Hold the (VOIP) Phones!

In April of 2015, I was trying to write some “Tom’s Lessons Learned” to address CIP v5 “interpretation” concerns that NERC was clearly not going to address itself – mostly ambiguities in the standards. There are many of these. As of today, I count 4,357, but there are still a few hours left in the day and that number might increase.

My third LL had to do with VOIP phone systems. This became a big issue in one of the NERC regions, where the region was saying that VOIP systems that served Control Centers need to be evaluated as BES Cyber Assets. This in itself wouldn’t have caused a lot of consternation if the region hadn’t also said – or at least a number of entities in the region believed they said it – that the burden of proof would be on the entity to show why the VOIP system should not be a BCA.

There is no dispute that phones play a crucial role at Control Centers. Take the example of a Control Center that controls a crucial peaker plant. On a hot day in the summer, if the RTO or ISO called to order that the plant be brought up and couldn’t get through to anybody because the phones were down, there could conceivably be a very serious BES impact.

Let me say at the outset that, if the VOIP system is networked with BES Cyber Systems within the ESP (which I would call a deplorable security practice, if that word hadn’t been overused lately), this whole discussion is moot. In that case the VOIP system is a PCA and therefore subject to almost all of the requirements that BCS are. The rest of this post assumes this is not the case.

Of course, the question whether a Cyber Asset is a BCA depends on whether it meets the BCA definition, the heart of which reads “A Cyber Asset that if rendered unavailable, degraded, or misused would, within 15 minutes of its required operation, misoperation, or non-operation, adversely impact one or more Facilities, systems, or equipment, which, if destroyed, degraded, or otherwise rendered unavailable when needed, would affect the reliable operation of the Bulk Electric System. Redundancy of affected Facilities, systems, and equipment shall not be considered when determining adverse impact.”

Let’s concede right away that the above example shows that a VOIP system failure in a Control Center could indeed have a very serious BES impact within 15 minutes. So if this is indeed what the BCA definition says, there is no question the VOIP system would have to be a BCA. But note the word “could” here. Does that appear in the definition? No, it doesn’t; instead the word “would” appears, which I translate loosely as meaning “inevitably”. In other words, I interpret the BCA definition to require that there would inevitably be a BES impact if the Cyber Asset were to fail, be misused, etc.

Now, I freely admit that the word “inevitably” is not in the BCA definition. But neither is the word “could” or “possibly”, so it can’t be said that the meaning is that there only has to be a possibility of BES damage for the Cyber Asset to meet the definition. I interpret “would” to mean inevitably, but others may not. This is just one of the 4,357 ambiguities in CIP v5. And, like all the other ambiguities, it can’t be removed by a closer reading of the words. It is just there, period.

If you read “inevitably” into the BCA definition, IMHO, there should be no question that VOIP systems are not BCAs. Sure, the system may go down on a hot day and the initial call from the RTO won’t go through. But will the RTO just give up and say, “Oh well, I guess we’ll just have to accept a cascading outage”? Of course not. They will be prepared for this eventuality, and there will be a number of other ways they can get through. They can call another phone at the same organization (perhaps the Finance department) and ask that they bring the message to the Control Center immediately. Or they will have stored some cell phone numbers of Control Center personnel that they can call in just this situation. Or they can get in via a satellite phone. Or they can use smoke signals or carrier pigeons. The message will get through, one way or the other.

So it seemed clear to me that there would not inevitably be a BES impact if the VOIP system in our hypothetical Control Center went down; given that I interpreted “would” to imply inevitability, it seemed reasonable to say that VOIP systems could never be BCAs. But in response to assertions about alternative communications pathways, I am told (and I heard this once with my own ears) that the region would say something to the effect of “Aha, but as the BCA definition says, the fact that there is some sort of redundant method of getting through can’t be used to ‘excuse’ the VOIP system from being a BCA.”

In other words, the region seemed to be arguing that the fact that there were so many alternative communications pathways couldn’t be used to excuse the VOIP system from being a BCA. This assumed that the fact that the cellular communications systems, satellite systems, etc. effectively back up the VOIP system in the Control Center fell under what was meant by “Redundancy” in the BCA definition.

If this were true, then my insertion of the word “inevitably” in the BCA definition would have no effect since, if all of those alternative systems can’t be considered, then the VOIP system being down in the Control Center will inevitably have a huge impact on the BES; therefore, the VOIP system will be a BCA. But the fact is that there are all of these alternative methods of communication, and there assuredly won’t be a BES impact.

This always struck me as not at all what the CIP v5 drafting team had in mind when they talked about redundancy.[i] It seemed that they meant identical systems, configured identically, deployed in such a manner that if one failed, the other would immediately pick up where the first left off – think redundant servers. But the local cellular system and the Control Center’s VOIP system are hardly identical. If the VOIP goes down, the cellular system (or another like the satellite phone system) will indeed provide an alternative communications vehicle, but the converse isn’t true: If the cellular system goes down in, say, a city, the utility’s VOIP won’t instantly back it up. So there is no symmetry here.
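The competing readings discussed so far can be laid side by side in a toy model. Everything in it is my own shorthand and an assumption for illustration: the Reading names, the CyberAsset fields and the is_bca function are hypothetical, and the real determination is a judgment call, not a boolean.

```python
from dataclasses import dataclass
from enum import Enum


class Reading(Enum):
    COULD = "a possible BES impact is enough"
    WOULD_IGNORE_ALL_FALLBACKS = "every fallback counts as Redundancy"
    WOULD_IGNORE_TWINS_ONLY = "only identical standby systems count as Redundancy"


@dataclass
class CyberAsset:
    name: str
    impact_conceivable: bool     # a BES impact within 15 minutes can be imagined
    has_identical_standby: bool  # e.g. a mirrored server; disregarded by every reading
    alternative_pathways: int    # dissimilar fallbacks: cell phones, satellite phone, ...


def is_bca(asset: CyberAsset, reading: Reading) -> bool:
    if not asset.impact_conceivable:
        return False
    if reading is Reading.WOULD_IGNORE_TWINS_ONLY:
        # Dissimilar fallbacks still count, so impact is inevitable only without them.
        return asset.alternative_pathways == 0
    # Under the other two readings nothing can rescue the asset.
    return True
```

With a Control Center VOIP system that has, say, three alternative pathways, the first two readings return True and the third returns False, which is the whole disagreement in one line. A system with no dissimilar fallback at all comes out as a BCA under every reading, whether or not it has an identical standby.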

While my original post on VOIP didn’t mention it, there is another good argument why the word “inevitably” should be inserted in the BCA definition: If a Cyber Asset should be declared a BCA even if its loss, misuse, etc. wouldn’t inevitably impact the BES, then other systems at BES assets – especially including HVAC and lighting – need to be treated the same. If the heat goes off in a power plant in northern Ontario in January, there might be a need to shut it down – but it certainly isn’t inevitable, given that people might be able to just put on their coats, hats and gloves and keep working. So if the magic word “inevitable” isn’t inserted in the BCA definition, all of these systems would have to be declared BCAs as well.

At this point, you might bring up NERC’s FAQ document from early 2015, which I wrote about in this post. That document stated that “support systems” should be excluded from consideration as BES Cyber Assets; the two examples provided were HVAC and lighting systems, although I know some of the NERC regions consider VOIP systems to fall in the same category. Because of this, I believe almost all of the regions haven’t required their entities even to consider whether VOIP systems are BCAs.

I’m certainly not objecting to this practice, since it produces the same result as what I’ve been advocating. However, what is disturbing about this FAQ statement is that there is absolutely no basis for it in CIP v5. Where is the definition of “support system”, and where in the BCA definition does it say that support systems are excluded from consideration? Obviously, the answer is “nowhere”.

But what about the region that had (according to some entities in the region) originally stated that VOIP systems[ii] had to be declared as BCAs? Did they change their position? Since I stopped hearing cries of anguish from that region, I assumed they had. However, at a recent compliance meeting for that region, one of their auditors clearly implied they had not.

He did this when he replied to a question about whether, if an entity had in place the Alternative Interpersonal Communications system (AIC) mandated by the NERC COM-001-3 standard (essentially, a backup phone system for control centers), this would negate the need to declare a VOIP system as a BCA. The auditor placed some conditions on his statement, but agreed that this was the case.

Of course, since the result of this statement conforms to what I’ve been advocating, I’m not going to scream and rant about this (there are plenty of other opportunities to do that!). However, I want to point out that the implication of the auditor’s answer is that no, the region has not changed its basic position. Effectively, all they are now saying is that there is one particular type of alternative communications system that does actually lead to the VOIP system not having an inevitable impact on the BES. But the implication of this is that all of the other alternatives I mentioned earlier – cell phones, satellite phones, etc. – do not make the VOIP system’s impact on the BES any less inevitable, and do not make the system any less of a BCA. This was the region’s original position (again, judging by what I was told by some entities, although I myself heard this stated at a compliance meeting for that region earlier this year).

However, the auditor introduced an interesting new word in his reply: “dissimilar”. He used it in referring to the AIC system, meaning it was dissimilar to the VOIP system (which indeed it is). I believe the reason he used this word is he is implicitly advocating that it be inserted in the second sentence of the BCA definition, so that this sentence would now read something like “Redundancy of affected Facilities, systems, and equipment shall not be considered when determining adverse impact, in the case where such Facilities, systems, and equipment[iii] are similar[iv] to the Cyber Asset under consideration.”

This is terrible language from a legal point of view, but it illustrates the idea. If the entity is considering whether a Cyber Asset is a BES Cyber Asset, and if there is a redundant system (e.g. a backup server) that is in a similar configuration, the mere presence of that redundant system does not remove the inevitability of the Cyber Asset’s impact on the BES if lost, misused, etc. This is partly because the same attack that disabled the original Cyber Asset may very well attack the similarly-configured redundant system. On the other hand, if there are multiple alternatives that remove the inevitability from the impact (in the case of VOIP these are the cell systems, smoke signals, etc.), then that redundancy can be considered as removing the inevitability.[v]
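For readers who think better in code than in legalese, here is a rough sketch of the decision rule just described. It is only an illustration of my proposed reading, not anything found in the CIP standards: the `Backup` class, the `impact_is_inevitable` function and the `min_dissimilar` parameter are all names I made up for this purpose.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Backup:
    """A hypothetical record of one alternative to the Cyber Asset under review."""
    name: str
    similar: bool  # True if it mirrors the asset (e.g. an identically configured standby server)


def impact_is_inevitable(backups: Iterable[Backup], min_dissimilar: int = 1) -> bool:
    """Return True if loss of the asset would still inevitably impact the BES.

    Similar (mirrored) backups are ignored, as the proposed wording requires.
    Only dissimilar alternatives count toward removing the inevitability.
    """
    dissimilar = [b for b in backups if not b.similar]
    return len(dissimilar) < min_dissimilar


voip_alternatives = [Backup("cellular", similar=False), Backup("satellite phone", similar=False)]
mirrored_server = [Backup("standby server", similar=True)]

print(impact_is_inevitable(voip_alternatives))  # False
print(impact_is_inevitable(mirrored_server))    # True
```

Setting `min_dissimilar=2` would model the stricter view discussed in footnote iv, under which a single dissimilar backup is not enough to remove the inevitability.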

And if this is what the auditor is implying, then I am in total agreement with him. In fact, I believe the ambiguity regarding whether VOIP, HVAC and lighting systems are BCAs could be completely eliminated if two words (actually, one word and one phrase) were added to the BCA definition: “inevitably” and “similar”. I would advocate that the current CIP v7 SDT add this to their agenda, were that agenda not already overwhelming (more on this point in a post coming soon).

So why did I write this post? After all, there clearly is no danger that entities will be forced to declare their VOIP systems as BCAs. And even though I’ve just stated how I think the ambiguity regarding VOIP can be cleared up, I’m not even forcefully advocating that this needs to happen.

Well, I’ll tell you why I’m writing this post. I’m writing it because it illustrates an unfortunate fact of life regarding the CIP v5 and v6 standards (and almost certainly the v7 ones as well): Sometimes, the only way to effectively comply with and audit a requirement is to interpret a requirement or definition as if it had been written differently. In this case, both the entity and the auditor need to insert two words into a definition. In other cases (and I will illustrate one in another post that is coming soon), both the entity and the auditor need to ignore some of the wording in the requirements.

I used to get very worked up over the fact that, in my personal opinion, the need to do this makes the CIP v5 and v6 standards unenforceable in the strict sense that a fine that is appealed to the courts by an entity would almost surely be thrown out, due to the ambiguity in key areas (almost all having to do with CIP-002 R1 and asset identification). However, I’ve more lately come to the conclusion that this is not worth worrying about. I say this not because my advancing age makes me more mellow (although it does in my case), but because I’ve found so many other examples where requirements or definitions need to be reinterpreted in order to be followed or audited, and where there is about zero probability that these will be fixed in any of our lifetimes. What’s another one or two ambiguities?

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] I had to think long and hard before I used the words “what (the SDT) had in mind”. This is because I have written at least one post pointing out that there is no possible way to verify what the SDT had in mind – or more specifically what they “intended” – when they used or didn’t use any particular words. I excuse myself here because I’m really referring more to common English usage (and specifically IT and engineering usage), rather than conducting a hypothetical inquiry into the state of mind of the SDT members when the BCA definition was approved.

[ii] Actually, this whole discussion applies to a lot more than VOIP systems. Any PBX that relies on Cyber Assets to operate would have to be considered as well, not just those that rely on VOIP. However, this issue has always been discussed as relating to VOIP, so I’ll continue to call it that.

[iii] I would much prefer that “Facilities” and “equipment” be removed from this phrase (as well as from Section 4.2 of all of the CIP standards), so that it just referred to “systems”. I have always considered the use of this phrase to be a huge mistake.

[iv] In an email discussion on this issue with an auditor from another region, he pointed out to me that, if there is only one dissimilar system that backs up a Cyber Asset, then that should not be considered as removing the inevitability of the impact of the loss of the Cyber Asset on the BES. I agree with this, and thank the auditor for pointing it out to me. (I’m not going to add this to my proposed new definition, though. If the v7 SDT asks me to draft the complete language of my proposed changes to the BCA definition, I’d be happy to oblige. I would like to see a number of other changes as well, that have nothing to do with the subject of this post.) Of course, VOIP, HVAC etc. all have multiple “systems” backing them up, as I have illustrated above. In other words, if there were really only one other way – say a particular vendor’s cell phone system in that area – to get through to the Control Center if their VOIP system were down, then the message really would not get through, assuming both the VOIP and the cell system were down. But it is very hard to imagine a case where there would be no other way at all to get the message to the Control Center in 15 minutes.

[v] Of course, the entity should not be able to blithely state that there are a lot of alternatives, without documenting that there are any. In the case of VOIP, this shouldn’t be too hard an exercise.