I posted the first of Tom’s Lessons Learned last week, and received a few serious
comments the next day. And guess what: the comments, and the email discussions
I had with the people who sent them, revealed that a new Lesson Learned is needed!
As I have said in the past, interpretation questions on v5 – especially
on BCS identification and classification – are like the Greek myth of Hydra,
the multi-headed serpent that grows two new heads for every one that is cut
off. They’re never-ending.
My first
Lesson Learned was about the meaning of the words “adversely impact” in the
definition of BES Cyber Asset. In that
post, I stated that the best way to think about the impact of a particular
Cyber Asset on the BES is as a two-step process: the Cyber Asset adversely
impacts the asset or Facility it’s associated with, then the asset/Facility
adversely impacts the BES itself.
Why did I
say this? I must admit I didn’t make my
reason clear in the previous post, but I’ll say it now: For the vast majority
of Cyber Assets, it is meaningless to talk about their having a direct impact
on the BES. If they have an impact, it
is only through the asset/Facility they’re associated with. For example, the DCS in a generating station
doesn’t impact the BES other than through the station itself. Another example: the relay controlling a
circuit breaker for a 500kV line doesn’t in itself impact the BES. If it opens the breaker, it’s the loss of
that line (the Facility) that has an impact, not the relay itself.
So the
meaning of “adversely impact” in the BCA definition comes down to two questions:
- Does the loss or misuse of the Cyber Asset necessarily adversely
impact the asset/Facility?
- Does this adverse impact on the asset/Facility translate
into an adverse impact on the BES?
If the
answer to either of these questions is No, the Cyber Asset isn’t a BCA.
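As a rough illustration, the two questions can be treated as a pair of yes/no inputs that must both be Yes. The Python sketch below is only my shorthand for that logic; the function and parameter names are invented for this post and don’t come from the standard or from NERC.

```python
def passes_adverse_impact_test(impacts_facility: bool,
                               facility_impact_reaches_bes: bool) -> bool:
    """Return True only if both questions are answered Yes.

    impacts_facility: question 1, does loss or misuse of the Cyber Asset
        necessarily adversely impact the asset/Facility?
    facility_impact_reaches_bes: question 2, does that impact translate
        into an adverse impact on the BES?
    """
    return impacts_facility and facility_impact_reaches_bes


# The relay on the 500kV line: it opens the breaker (question 1: Yes) and
# the loss of the line impacts the BES (question 2: Yes).
relay_result = passes_adverse_impact_test(True, True)

# A hypothetical Cyber Asset whose loss degrades nothing at the Facility
# fails at question 1, so the answer to question 2 never matters.
other_result = passes_adverse_impact_test(False, True)

print(relay_result, other_result)  # True False
```

A Yes on both questions only gets the Cyber Asset past the “adversely impact” wording; the rest of the BCA definition still has to be applied.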
However,
just stating these two questions doesn’t lead us much further down the path of
understanding what “adversely impact” means.
What really matters is what the auditor will think it means when he/she
pays you a visit, say, three years from now.
Let’s say you didn’t identify a particular Cyber Asset as a BES Cyber
Asset because you didn’t think its loss or misuse would lead to an adverse impact on the BES, and the auditor disagrees.
How do you justify your decision?
The short
answer to this question is that you pull out the extensive documentation you
created in April 2015, which explains why you made this decision. The document will say something like “We
decided that Cyber Asset X wasn’t a BES Cyber Asset because we asked these two
questions (here you list the two
questions above, perhaps not with my exact wording).” If the auditor asks why you used that
approach, you can answer, “NERC said in their April 1, 2015 FAQ release that ‘adversely
impact’ meant ‘negatively impact’. Since
we already assumed that was the case, this didn’t help us understand what the
phrase actually meant. Therefore, we rolled our
own interpretation.”
Of course, your April 2015 document described this approach in detail (it might include elements of both this
current post and the earlier one, although I don’t recommend you state that you
were following Tom Alrich’s advice. If
you do, the region may triple the VSL for whatever violation you’re assessed).
But how will
you defend your answers to the two questions?
For both questions, it’s safe to assume you won’t need to defend a “Yes”
answer. It’s only if you are saying
there is no impact, either of the Cyber Asset on the asset/Facility or of the
asset/Facility on the BES, that you may get challenged by your auditor.
Let’s deal
with the first question first. How would
you defend a decision that the loss or misuse of a particular Cyber Asset won’t
adversely impact the asset/Facility it’s associated with? For a control system, I really don’t see a
way to do that. A control system has to
have an impact; otherwise it isn’t a control system (remember, the “within 15
minutes” part of the BCA definition is separate from what we’re discussing
here. Even though a system impacts the
asset/Facility and the latter impacts the BES, if it doesn’t do that within 15
minutes, it still won’t be a BCA. But
that is a later step in the BCA identification process).
Moving to
the second question, how would you defend a decision that the adverse impact on
the asset/Facility (caused by the loss or misuse of the Cyber Asset in
question) wouldn’t adversely impact the BES? The problem is that there is a whole host of
ways that an asset or Facility could impact the BES. It seems to me that you would have to show
you had considered all of those ways
in coming to the conclusion that there wouldn’t be an adverse impact.
Where does
this list of “ways” (i.e. modes of impact) come from? Fortunately, the SDT has already addressed
that question, although not in CIP-002-5.1 R1 itself. The BES Reliability Operating Services (BROS),
discussed in the Guidance for CIP-002-5.1, constitute a list of ways that an
asset/Facility can impact the BES. I
think it should be sufficient if you simply showed the auditor that the loss or
misuse of the Cyber Asset either
a) won’t impact the asset/Facility (i.e. question 1 above), or
b) won’t impact it in a way that would impede the ability of the asset/Facility to fulfill one or more of the BROS that it normally
fulfills (question 2).
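One way to make that showing repeatable is to record, for each Cyber Asset, which BROS the asset/Facility normally fulfills and which of them the loss or misuse of the Cyber Asset would impede. The sketch below is a hypothetical record format of my own, not anything NERC or the regions have published. The class name, field names and the example asset are all assumptions, and the BROS name is written as I understand it from the CIP-002-5.1 Guidance, so check it against the standard.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ImpactRecord:
    """Hypothetical worksheet row for one Cyber Asset."""
    cyber_asset: str
    facility: str
    impacts_facility: bool        # answer to question 1
    bros_of_facility: Set[str]    # BROS the asset/Facility normally fulfills
    bros_impeded: Set[str] = field(default_factory=set)  # BROS the impact would impede

    def adversely_impacts_bes(self) -> bool:
        # Question 1 must be Yes before question 2 is even considered.
        if not self.impacts_facility:
            return False
        # Question 2: is at least one normally fulfilled BROS impeded?
        return bool(self.bros_of_facility & self.bros_impeded)

    def unaffected_bros(self) -> Set[str]:
        # The BROS you can show the auditor you considered and ruled out.
        if not self.impacts_facility:
            return set(self.bros_of_facility)
        return self.bros_of_facility - self.bros_impeded


# Illustrative entry only: a made-up Cyber Asset at a generating station.
record = ImpactRecord(
    cyber_asset="Example Cyber Asset X",
    facility="Example generating station",
    impacts_facility=True,
    bros_of_facility={"Controlling Voltage (Reactive Power)"},
    bros_impeded=set(),
)

print(record.adversely_impacts_bes())  # False
print(sorted(record.unaffected_bros()))
```

Keeping the ruled-out BROS alongside the conclusion gives you something concrete to hand the auditor when a “No” answer is challenged.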
So does this
solve all of our problems? Do we now
know for sure what “adversely impact” means in the BCA definition? I’d like to say we do, but a problem remains:
What I’m advocating contradicts the actual wording of the BCA definition and
CIP-002-5.1 R1. This isn’t necessarily a problem, since I
have pointed out in many posts, starting with this one, that the NERC entity needs to “roll their own” interpretation or
definition, absent any fairly authoritative guidance from NERC on the
matter.[i] As I mentioned above and in the
previous post, NERC’s only attempt to address this issue, in the April 1 FAQ
release, didn’t provide any useful new information.
I say that
my interpretation of “adversely impact” contradicts the wording of the
requirement. By that, I’m referring to
the fact that the requirement is nominally for identification and
classification of BES Cyber Systems, with the assets/Facilities only entering
into the process by being what the bright-line criteria refer to (for example,
see Section 3 of CIP-002-5.1, “Purpose”).
However, I
contend that the only way an entity can really comply with the spirit of
R1 is to think in terms of CIP v1-3, where you first identified Critical Assets
and then Critical Cyber Assets that are “essential to the operation of” those
Critical Assets. In fact, I literally
know of no entity – and only one region – that adheres strictly to the wording
of R1 in this regard. They are all first
identifying assets or Facilities that meet the High or Medium criteria, then
identifying BES Cyber Systems associated with them. The “interpretation” of “adversely impact”
that I’m describing in this post reflects that fact. In other words, this is yet another area
where the entity needs to roll their own interpretation - and in fact, all entities have already done so, but most didn't realize it.
I’m going to
illustrate this with an example I’ve discussed before. I have stated on several occasions that there
are cyber assets whose loss or misuse can affect the BES, but which still don’t
fulfill a BROS. One example is from an
SPP workshop on BCS identification in 2014.
In it, there was a fictional 1500+MW plant with a Stack Emissions
Monitoring System (SEMS), which provides real-time information on what chemicals are
being emitted (some say that the proper acronym is CEMS, referring to Continuous Emissions Monitoring System. Since I don’t address religious questions in this blog, I won’t weigh in on this issue).
Let’s
suppose the plant has a very stringent EPA permit that requires it to shut down
within ten minutes of an environmental excursion (if the problem can’t be fixed
in that amount of time). Therefore, the plant manager has made it clear to the
operators that, if the SEMS shows an environmental excursion for ten minutes,
they must shut the plant down. This
means a hacker could take over the SEMS and make it provide false data showing
an excursion, resulting in a shutdown.
Does this
mean the SEMS can impact the BES?
Absolutely. But does it also mean
the SEMS fulfills a BROS? No, it
doesn’t. Environmental monitoring isn’t
a reliability function. If there were a
huge excursion and everyone outside the plant got sick, this would be a big
problem but it wouldn’t affect reliability.
The lights would stay on.
After my
post on this topic last week, I engaged in an email discussion with an auditor
with whom I often exchange ideas. I
pointed out to him that SEMS doesn’t perform a reliability function; he
disagreed, and said that, if its misuse can result in the plant being shut down
(as I’ve just described), this means it does affect reliability. Obviously, if the plant is shut down it can’t
perform the BROS it normally performs, such as supporting voltage.
At the time,
I didn’t know how to answer this argument, but I knew there was something wrong
with it. What was wrong was that I wasn’t
applying to this question the two-step process for determining whether there is
adverse impact – which I’d just described the day before! Once you apply that process, the mystery
clears up: The SEMS can have a severe adverse impact on the plant (question 1
above), and the plant’s being down will have an adverse impact on one or more
BROS and therefore the BES (question 2).
In this way, a system that doesn’t directly fulfill a BROS can still be
said to “adversely impact” the BES.
Even though
I recommend that entities roll their own v5 interpretations, in cases like this
where there is nothing more official from NERC, I can’t say that I think NERC
is off the hook. I would very much like
them to acknowledge that the best way to determine whether a Cyber Asset can “adversely
impact” the BES is to use this two-stage process. Of course, I’d also very much like there to
be world peace and for the Cubs to win the World Series this year….enuf said (Note on Dec. 4: While the Cubs didn't win the World Series this year, they got a lot farther than anyone expected. Just goes to show that anything can happen).
You probably
noticed that, besides “adverse impact”, the phrase “support systems” was in the
title of this post. How does this come
in? I didn’t set out to write a post –
let alone a Tom’s Lesson Learned – on this topic, but Brandon Workentin from
EnergySec emailed me the day after the first post to ask what seemed to be an
unrelated question on support systems; I now realize this topic is very much
related to adverse impact, and is in fact addressed by what I have just said
above.
Brandon
expressed confusion (as have many others) about systems like HVAC and UPS. These are systems that could in some cases
impact the BES within 15 minutes (let’s say the heat fails in a power plant in
northern Ontario in January, and it literally becomes impossible for the staff
to stay at their posts; or a UPS doesn’t kick in during a power
failure and a control center goes dark). Should they be considered as possible
BES Cyber Assets/Systems?
NERC addressed
this question in the November 25 FAQ
document. Unlike their response on
adverse impact in the April 1 FAQ document, they did actually answer the
question - they said these systems should not be considered as possible
BCAs. They said that “support
systems” (like HVAC and UPS) aren’t in scope for v5 (unless they’re within an
ESP, in which case they’re Protected Cyber Assets). I don't disagree with this answer, but I do disagree with NERC's reasoning behind it.
The problem
with this answer is, what is the definition of “support system”? A definition would allow entities not to waste time and money treating support systems as BCS. On the other hand, if there is no definition, what is to prevent
entities from declaring systems like DCS and EMS as “support systems” and
therefore exempting them from being BCS? I'm not saying that we now need a definition of "support systems". What NERC needs to do is stop bringing in ad hoc arguments to justify their opinions; when they do this, it opens up a potential can of worms that they clearly hadn't anticipated. This is something like the folk religions that dream up a deity for every natural phenomenon; it yields wonderful explanations, since you can always say that it's raining because the rain god was in a good mood. But what do you then do with all these deities you've invented?
But you know
what, NERC? I’m going to help you out on
this one, just because I’m that kind of guy.
What I’ve just discussed in this Tom’s Lesson Learned explains why HVAC
and UPS shouldn’t be considered as BCAs; you don’t need to introduce mythical
beasts like support systems or Bigfoot.
There is no denying that HVAC and UPS will have an adverse impact on the
asset/Facility that they support, if they are lost or misused (i.e. my first
question above). However, and unlike
with the SEMS described above, it isn’t certain that a loss of HVAC or UPS will
result in the asset/Facility not being able to fulfill one or more BROS (my
second question). Even if the heat or
A/C is lost, there might be some sort of mitigating actions that could be
performed – like putting on overcoats or bathing suits, respectively – that would
prevent a BES impact. With the SEMS
situation, if the plant shuts down it’s down – the impact is immediate and can't be mitigated.
There’s one
more system I want to discuss, since I’ve used it as an example several times
previously, and since I now need to change what I’ve said about it; that is the
fire suppression system in a substation.
I’ve been saying all along that, even though the system doesn’t directly
fulfill a BROS, it needs to be protected as a BCS since its non-availability
when needed (i.e. in the event of a fire) could result in the loss of an
asset/Facility with which it’s associated (in the case of a substation, it will
usually be one or more high-voltage lines that are Facilities meeting one of
the criteria 2.4 – 2.8).
The auditor
with whom I discussed SEMS last week also pointed out that he didn’t think the
fire suppression system should be a BCS, simply because there is no assurance
that the loss of the system when needed will result in an impact on the BES (that
is, even though there may be an impact on the asset/Facility, there’s no
assurance that will translate into a BES impact). Maybe someone is working in the substation
and grabs a fire extinguisher to put out the fire. Or maybe the wind is blowing in a different
direction, such that the line in question is never endangered.
So I have to
agree that, under the schema I’ve described above, the fire
suppression system no longer has to be considered a BCS.
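To pull the examples together, here is how the four systems discussed in this post come out under the two questions. This is a Python sketch with names I made up for the purpose. The Yes/No answers are simply my reading of each example as described above, with question 2 answered Yes only when the BES impact is assured rather than merely possible.

```python
from typing import NamedTuple


class SystemCase(NamedTuple):
    name: str
    impacts_facility: bool    # question 1
    bes_impact_assured: bool  # question 2: Yes only if no mitigation could prevent it


# Answers reflect the examples in this post, not any official NERC position.
CASES = [
    SystemCase("SEMS (plant with the ten-minute shutdown rule)", True, True),
    SystemCase("HVAC", True, False),                         # overcoats or bathing suits
    SystemCase("UPS", True, False),
    SystemCase("Substation fire suppression", True, False),  # extinguisher, wind direction
]


def adversely_impacts_bes(case: SystemCase) -> bool:
    # Both questions must be answered Yes.
    return case.impacts_facility and case.bes_impact_assured


results = {case.name: adversely_impacts_bes(case) for case in CASES}

for name, impacted in results.items():
    label = "candidate BCA" if impacted else "not a BCA"
    print(f"{name}: {label}")
```

A “candidate” still has to meet the 15-minute part of the BCA definition, which is a later step in the identification process.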
The views and opinions expressed here are my
own and don’t necessarily represent the views or opinions of Honeywell.
[i] Of course, this isn’t the first time I’ve pointed out that coming up with a
coherent interpretation of something in R1 requires doing some violence to the
wording. All I can say is “ya gotta do
what ya gotta do”, and refer to the noted compliance expert Lewis Carroll, from
his Through the Looking Glass:
"When I use
a word," Humpty Dumpty said, in rather a scornful tone, "it means
just what I choose it to mean- neither more nor less."
"The question is,"
said Alice, "whether you can make words mean so many
different things."
"The question
is," said Humpty Dumpty, "which is to be master, you or the words.
That's all."
Folks, with
CIP-002-5.1 R1 and Attachment 1, the question isn’t what the best interpretation
of the existing wording is. Rather, it’s
what wording will yield a consistent and logical requirement in place of the –
in many places – inconsistent and illogical wording currently in place. It’s just a question of who will be master,
you or the current R1 wording. You have
to make this requirement work for you, even though that requires ignoring some parts
of the wording and reinterpreting other parts.