Wednesday, April 15, 2015

Tom’s Lesson Learned No. 2: “Adversely Impact” and “Support Systems”

I posted the first of Tom’s Lessons Learned last week, and received a few serious comments the next day.  And guess what – the comments, and the email discussions I had with the people who sent them, revealed that a new Lesson Learned is needed!  As I have said in the past, interpretation questions on v5 – especially on BCS identification and classification – are like the Greek myth of Hydra, the multi-headed serpent that grows two new heads for every one that is cut off.  They’re never-ending.

My first Lesson Learned was about the meaning of the words “adversely impact” in the definition of BES Cyber Asset.  In that post, I stated that the best way to think about the impact of a particular Cyber Asset on the BES is as a two-step process: the Cyber Asset adversely impacts the asset or Facility it’s associated with, then the asset/Facility adversely impacts the BES itself.  

Why did I say this?  I must admit I didn’t make my reason clear in the previous post, but I’ll say it now: For the vast majority of Cyber Assets, it is meaningless to talk about their having a direct impact on the BES.  If they have an impact, it is only through the asset/Facility they’re associated with.  For example, the DCS in a generating station doesn’t directly impact the BES, other than through the station itself.  Another example: the relay controlling a circuit breaker for a 500kV line doesn’t in itself impact the BES.  If it opens the breaker, it’s the loss of that line (the Facility) that has an impact, not the relay itself.

So the meaning of “adversely impact” in the BCA definition comes down to two questions:

  1. Does the loss or misuse of the Cyber Asset necessarily adversely impact the asset/Facility?
  2. Does this adverse impact on the asset/Facility translate into an adverse impact on the BES?

If the answer to either of these questions is No, the Cyber Asset isn’t a BCA.
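
For readers who like to see logic written down, the two questions can be sketched as a small Python check. This is only an illustration of the reasoning above, not anything from NERC or the standard: the class name, the field names and the sample relay record are all hypothetical, and the sketch covers only the “adversely impact” wording, not the rest of the BCA definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdverseImpactAnswers:
    """Hypothetical record of an entity's answers to the two questions."""
    cyber_asset: str
    impacts_facility: bool      # Question 1: does loss/misuse necessarily hurt the asset/Facility?
    facility_impacts_bes: bool  # Question 2: does that harm translate into a BES impact?


def can_adversely_impact_bes(answers: AdverseImpactAnswers) -> bool:
    """Both answers must be Yes; a single No means no adverse impact on the BES."""
    return answers.impacts_facility and answers.facility_impacts_bes


# Illustrative only: a relay whose misoperation opens a breaker and drops a line.
relay = AdverseImpactAnswers("line relay", impacts_facility=True, facility_impacts_bes=True)
print(can_adversely_impact_bes(relay))
```

Keeping the two answers as separate fields, rather than one combined verdict, makes it easier to show an auditor which of the two questions produced a “No”.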

However, just stating these two questions doesn’t lead us much further down the path of understanding what “adversely impact” means.  What really matters is what the auditor will think it means when he/she pays you a visit, say, three years from now.  Let’s say you didn’t identify a particular Cyber Asset as a BES Cyber Asset because you didn’t think its loss or misuse would lead to an adverse impact on the BES, and the auditor disagrees.  How do you justify your decision?

The short answer to this question is that you pull out the extensive documentation you created in April 2015, which justifies why you did this.  The document will say something like “We decided that Cyber Asset X wasn’t a BES Cyber Asset because we asked these two questions (here you list the two questions above, perhaps not with my exact wording).”  If the auditor asks why you used that approach, you can answer, “NERC said in their April 1, 2015 FAQ release that ‘adversely impact’ meant ‘negatively impact’.  Since we already assumed that was the case, this didn’t help us understand what the phrase actually meant.  Therefore, we rolled our own interpretation.”

Of course, your April 2015 document described this approach in detail.  It might include elements of both this post and the earlier one, although I don’t recommend stating that you were following Tom Alrich’s advice; if you do, the region may triple the VSL for whatever violation you’re assessed.

But how will you defend your answers to the two questions?  For both questions, it’s safe to assume you won’t need to defend a “Yes” answer.  It’s only if you are saying there is no impact, either of the Cyber Asset on the asset/Facility or of the asset/Facility on the BES, that you may get challenged by your auditor.

Let’s deal with the first question first.  How would you defend a decision that the loss or misuse of a particular Cyber Asset won’t adversely impact the asset/Facility it’s associated with?  For a control system, I really don’t see a way to do that.  A control system has to have an impact; otherwise it isn’t a control system.  (Remember, the “within 15 minutes” part of the BCA definition is separate from what we’re discussing here.  Even if a system impacts the asset/Facility and the latter impacts the BES, the system still won’t be a BCA if that impact doesn’t occur within 15 minutes.  But that is a later step in the BCA identification process.)

Moving to the second question, how would you defend a decision that the adverse impact on the asset/Facility (caused by the loss or misuse of the Cyber Asset in question) wouldn’t adversely impact the BES?  The problem is that there are a whole host of ways that an asset or Facility could impact the BES.  It seems to me that you would have to show you had considered all of those ways, in coming to the conclusion that there wouldn’t be an adverse impact. 

Where does this list of “ways” (i.e. modes of impact) come from?  Fortunately, the SDT has already addressed that question, although not in the CIP-002 R1 standard itself.  The BES Reliability Operating Services (BROS) – discussed in the Guidance for CIP-002-5.1 – constitute a list of ways that an asset/Facility can impact the BES.  I think it should be sufficient to show the auditor that the loss or misuse of the Cyber Asset either

a) won’t adversely impact the asset/Facility (i.e. question 1 above), or
b) will impact it, but not in a way that would impede the ability of the asset/Facility to fulfill one or more of the BROS that it normally fulfills (question 2).
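
One way to picture the documentation behind a “No” on question 2 is a review that walks through every BROS the asset/Facility normally fulfills. The sketch below is a rough illustration under stated assumptions: the function name is made up, and the BROS labels are hypothetical placeholders rather than the official names in the Guidance.

```python
def impeded_bros(facility_bros, review):
    """Return the BROS that the Cyber Asset's loss or misuse would impede.

    facility_bros: names of the BROS the asset/Facility normally fulfills.
    review: dict mapping each BROS name to True (impeded) or False (not impeded).
    Raises ValueError if any BROS was left out of the review, since an
    unreviewed mode of impact is exactly what an auditor could challenge.
    """
    unreviewed = sorted(set(facility_bros) - set(review))
    if unreviewed:
        raise ValueError(f"BROS not yet considered: {unreviewed}")
    return sorted(name for name in facility_bros if review[name])


# Hypothetical labels, not the official BROS names from the Guidance.
plant_bros = ["voltage support", "frequency control"]
review = {"voltage support": False, "frequency control": False}

# An empty list means every BROS was considered and none is impeded.
print(impeded_bros(plant_bros, review))
```

Raising an error for an unreviewed BROS, instead of quietly treating it as “not impeded”, mirrors the point that you would have to show you considered all of the ways.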

So does this solve all of our problems?  Do we now know for sure what “adversely impact” means in the BCA definition?  I’d like to say we do, but a problem remains: What I’m advocating contradicts the actual wording of the BCA definition and CIP-002-5.1 R1.  This isn’t necessarily a problem, since I have pointed out in many posts – starting with this one – that the NERC entity needs to “roll their own” interpretation or definition, absent any fairly authoritative guidance from NERC on the matter.[i]  As I mentioned above and in the previous post, NERC’s only attempt to address this issue, in the April 1 FAQ release, didn’t provide any useful new information.

I say that my interpretation of “adversely impact” contradicts the wording of the requirement.  By that, I’m referring to the fact that the requirement is nominally for identification and classification of BES Cyber Systems, with the assets/Facilities only entering into the process by being what the bright-line criteria refer to (for example, see Section 3 of CIP-002-5.1, “Purpose”). 

However, I contend that the only way an entity can really comply with the spirit of R1 is to think in terms of CIP v1-3, where you first identified Critical Assets and then Critical Cyber Assets that are “essential to the operation of” those Critical Assets.  In fact, I literally know of no entity – and only one region – that adheres strictly to the wording of R1 in this regard.  They are all first identifying assets or Facilities that meet the High or Medium criteria, then identifying BES Cyber Systems associated with them.  The “interpretation” of “adversely impact” that I’m describing in this post reflects that fact.  In other words, this is yet another area where the entity needs to roll their own interpretation - and in fact, all entities have already done so, but most didn't realize it.

I’m going to illustrate this with an example I’ve discussed before.  I have stated on several occasions that there are cyber assets whose loss or misuse can affect the BES, but which still don’t fulfill a BROS.  One example is from an SPP workshop on BCS identification in 2014.  In it, there was a fictional 1500+MW plant with a Stack Emissions Monitoring System (SEMS), which provides information on what chemicals are being emitted in real time (some say that the proper acronym is CEMS, referring to Continuous Emissions Monitoring System. Since I don't address religious questions in this blog, I won't weigh in on this issue).

Let’s suppose the plant has a very stringent EPA permit that requires it to shut down within ten minutes of an environmental excursion (if the problem can’t be fixed in that amount of time). Therefore, the plant manager has made it clear to the operators that, if the SEMS shows an environmental excursion for ten minutes, they must shut the plant down.  This means a hacker could take over the SEMS and make it provide false data showing an excursion, resulting in a shutdown.

Does this mean the SEMS can impact the BES?  Absolutely.  But does it also mean the SEMS fulfills a BROS?  No, it doesn’t.  Environmental monitoring isn’t a reliability function.  If there were a huge excursion and everyone outside the plant got sick, this would be a big problem but it wouldn’t affect reliability.  The lights would stay on.

After my post on this topic last week, I engaged in an email discussion with an auditor with whom I often exchange ideas.  I pointed out to him that SEMS doesn’t perform a reliability function; he disagreed, and said that, if its misuse can result in the plant being shut down (as I’ve just described), this means it does affect reliability.  Obviously, if the plant is shut down it can’t perform the BROS it normally performs, such as supporting voltage. 

At the time, I didn’t know how to answer this argument, but I knew there was something wrong with it.  What was wrong was that I wasn’t applying to this question the two-step process for determining whether there is adverse impact – which I’d just described the day before!  Once you apply that process, the mystery clears up: The SEMS can have a severe adverse impact on the plant (question 1 above), and the plant’s being down will have an adverse impact on one or more BROS and therefore the BES (question 2).  In this way, a system that doesn’t directly fulfill a BROS can still be said to “adversely impact” the BES.

Even though I recommend that entities roll their own v5 interpretations, in cases like this where there is nothing more official from NERC, I can’t say that I think NERC is off the hook.  I would very much like them to acknowledge that the best way to determine whether a Cyber Asset can “adversely impact” the BES is to use this two-stage process.  Of course, I’d also very much like there to be world peace and for the Cubs to win the World Series this year….enuf said (Note on Dec. 4: While the Cubs didn't win the World Series this year, they got a lot farther than anyone expected. Just goes to show that anything can happen).

You probably noticed that, besides “adverse impact”, the phrase “support systems” was in the title of this post.  How does this come in?  I didn’t set out to write a post – let alone a Tom’s Lesson Learned – on this topic, but Brandon Workentin from EnergySec emailed me the day after the first post to ask what seemed to be an unrelated question on support systems; I now realize this topic is very much related to adverse impact, and is in fact addressed by what I have just said above.

Brandon expressed confusion (as have many others) about systems like HVAC and UPS.  These are systems that could in some cases impact the BES within 15 minutes (let’s say the heat fails in a power plant in northern Ontario in January, and it literally becomes impossible for the staff to stay at their posts; or a UPS doesn’t kick in in the event of a power failure and a control center goes dark) – should they be considered as possible BES Cyber Assets/Systems?

NERC addressed this question in the November 25 FAQ document.  Unlike their response on adverse impact in the April 1 FAQ document, they did actually answer the question - they said these systems should not be considered as possible BCAs.  They said that “support systems” (like HVAC and UPS) aren’t in scope for v5 (unless they’re within an ESP, in which case they’re Protected Cyber Assets).  I don't disagree with this answer, but I do disagree with NERC's reasoning behind it.

The problem with this answer is, what is the definition of “support system”?  A definition would allow entities not to waste time and money treating support systems as BCS.  On the other hand, if there is no definition, what is to prevent entities from declaring systems like DCS and EMS as “support systems” and therefore exempting them from being BCS?  I'm not saying that we now need a definition of "support systems".  What NERC needs to do is stop bringing in ad hoc arguments to justify their opinions; when they do this, it opens up a potential can of worms that they clearly hadn't anticipated.  This is something like the folk religions that dream up a deity for every natural phenomenon; it yields wonderful explanations, since you can always say that it's raining because the rain god was in a good mood.  But what do you then do with all these deities you've invented?

But you know what, NERC?  I’m going to help you out on this one, just because I’m that kind of guy.  What I’ve just discussed in this Tom’s Lesson Learned explains why HVAC and UPS shouldn’t be considered as BCAs; you don’t need to introduce mythical beasts like support systems or Bigfoot.  There is no denying that HVAC and UPS will have an adverse impact on the asset/Facility that they support if they are lost or misused (i.e. my first question above).  However, unlike with the SEMS described above, it isn’t certain that a loss of HVAC or UPS will result in the asset/Facility not being able to fulfill one or more BROS (my second question).  Even if the heat or A/C is lost, there might be some sort of mitigating actions that could be performed – like putting on overcoats or bathing suits, respectively – that would prevent a BES impact.  With the SEMS situation, if the plant shuts down it’s down – the impact is immediate and can't be mitigated.

There’s one more system I want to discuss, since I’ve used it as an example several times previously, and since I now need to change what I’ve said about it; that is the fire suppression system in a substation.  I’ve been saying all along that, even though the system doesn’t directly fulfill a BROS, it needs to be protected as a BCS since its non-availability when needed (i.e. in the event of a fire) could result in the loss of an asset/Facility with which it’s associated (in the case of a substation, it will usually be one or more high-voltage lines that are Facilities meeting one of the criteria 2.4 – 2.8).  

The auditor with whom I discussed SEMS last week also pointed out that he didn’t think the fire suppression system should be a BCS, simply because there is no assurance that the loss of the system when needed will result in an impact on the BES (that is, even though there may be an impact on the asset/Facility, there’s no assurance that will translate into a BES impact).  Maybe someone is working in the substation and grabs a fire extinguisher to put out the fire.  Or maybe the wind is blowing in a different direction, such that the line in question is never endangered.

So I have to agree that the schema I’ve described above would remove the fire suppression system from having to be considered as a BCS.
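
To pull the three examples from this post together, here is a rough Python sketch that applies the two questions to each of them. The Yes/No values are simply a reading of the arguments above (for the fire suppression system, question 1 is treated as Yes even though the post only says there may be an impact); the names `EXAMPLES` and `adverse_impact` are hypothetical.

```python
# (cyber system, answer to question 1, answer to question 2), as argued in this post.
EXAMPLES = [
    ("SEMS", True, True),               # a forced shutdown can't be mitigated
    ("HVAC / UPS", True, False),        # BES impact isn't certain; mitigation is possible
    ("Fire suppression", True, False),  # no assurance that a BES impact follows
]


def adverse_impact(q1: bool, q2: bool) -> bool:
    """Adverse impact on the BES requires a Yes to both questions."""
    return q1 and q2


results = {name: adverse_impact(q1, q2) for name, q1, q2 in EXAMPLES}
for name, outcome in results.items():
    print(f"{name}: {'adverse impact' if outcome else 'no adverse impact'}")
```

Only the SEMS comes out as having an adverse impact; as noted earlier, the “within 15 minutes” part of the BCA definition is a separate, later step that this sketch doesn’t model.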

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Honeywell.

[i] Of course, this isn’t the first time I’ve pointed out that coming up with a coherent interpretation of something in R1 requires doing some violence to the wording.  All I can say is “ya gotta do what ya gotta do”, and refer to the noted compliance expert Lewis Carroll, from his Through the Looking Glass:

"When I use a word," Humpty Dumpty said, in rather a scornful tone, "it means just what I choose it to mean- neither more nor less."
"The question is," said Alice, "whether you can make words mean so many different things."
"The question is," said Humpty Dumpty, "which is to be master, you or the words.  That's all."

Folks, with CIP-002-5.1 R1 and Attachment 1, the question isn’t what the best interpretation of the existing wording is.  Rather, it’s what wording will yield a consistent and logical requirement in place of the – in many places – inconsistent and illogical wording currently in place.  It’s just a question of who will be master, you or the current R1 wording.  You have to make this requirement work for you, even though that requires ignoring some parts of the wording and reinterpreting other parts.
