In my last post
(which you need to read before this one), I pointed out how impressed I am by
the CIP Modifications Standard Drafting Team’s new approach to incorporating
virtualization into the CIP standards – especially since I regarded their
previous approach as being impossible to get approved, even if it could ever be
fully drafted. They outlined this new approach in a draft white paper a couple
weeks ago (which isn’t posted but which I can send to you if you email me at
the address at the bottom of this post), and in a webinar on June 29. The
slides from that webinar are now available here,
while the recording is available here.
At the end
of that post, I described the two great ideas that are driving this new
approach: 1) Do away with the definitions of Cyber Asset and BES Cyber Asset
and make BES Cyber System the fundamental concept for applicability of the CIP
standards; and 2) Rewrite the prescriptive technical requirements (found in
CIP-005, CIP-007 and CIP-010) in a non-prescriptive fashion, so that there is
no need to draft – and ballot once, then re-draft, ballot twice, re-re-draft, ballot
a third time, etc. – the huge number of detailed changes to these requirements
that would be needed to accommodate virtualization.
I concluded
the post by saying that, for the first idea, I saw a few potential problems but
no show-stoppers, while for the second I did see some potential show-stoppers,
“although none of them wouldn’t be surmountable by the SDT.” I now wish to modify
those statements, since a) I see potential show-stoppers for both ideas; and b)
I’m skeptical about whether the SDT can solve them, since the only real
solution will be rewriting NERC’s Compliance Monitoring and Enforcement Program
(CMEP) so that there are different rules for CIP than for the other NERC
standards (however, there is a half-solution that doesn’t require CMEP changes.
I will discuss that as well, although not in this post).
That’s the
bad news. The good news is i) The show-stoppers mostly come down to one
problem: How will compliance with these requirements be audited? and ii) The
changes to CMEP are already needed anyway. Furthermore, the need for these
changes will only continue to grow as new and modified CIP requirements and
standards are added. Maybe virtualization will be the catalyst for making these
changes (which, of course, will require a big effort by NERC, FERC, the
Regions, and the NERC entities – no hiding that fact).
Let’s focus
on Great Idea Number 2 (and I’ll hope to discuss Great Idea Number 1 in a future
post). In the webinar and white paper, the SDT pointed to CIP-007 R3 as an
example of a non-prescriptive requirement that could be a model for rewriting
prescriptive technical requirements like CIP-005 R1 and CIP-007 R2. Part 3.1 of
CIP-007 R3 reads, in its entirety, “Deploy method(s) to deter, detect, or
prevent malicious code.” Doesn’t this sound simple? You couldn’t get more
non-prescriptive than this!
Yes it is
simple, but the question is: How will it be audited without requiring auditor
judgment – and a lot of it at that? For example, you need to “deploy methods”
to do one of three things: “deter, detect or prevent” malicious code. But how
good do these methods have to be? As Lew Folkerth pointed out in a presentation
that I wrote about in a post
last year, it should always be assumed that the methods need to be “effective”.
Saying a certain chant every morning to protect against malware isn’t an
effective method. If you tell the auditor that’s your method, you will probably
get a PNC. And I won’t have a huge amount of sympathy for you.
But beyond
that, how can this requirement be audited? Let’s suppose an entity deploys
solely detective methods, meaning they aren’t likely to ever find out about the
presence of a virus on their network until one or more devices have already
been infected. Do you think that, in the case where your network consists
mostly of Windows or Linux machines for which antivirus software is a perfectly
workable option, an auditor isn’t going to ask why you’re satisfied with just detecting malware? And if you tell him
to go take a hike, since you’re clearly “deterring, detecting or preventing”
malware as the requirement says, do you think that will be the end of the
story? Don’t you think he’ll say something like “Well, the risks posed by malware
are such that merely detecting it, when there’s a tried and true (and
inexpensive) method for preventing it as well, isn’t enough”?
I’m not
saying you wouldn’t win this fight – in fact, even if the auditor gave you a
PNC for this, I doubt you would ever actually be found in violation, simply
because the auditor is going beyond what the language strictly says in this
case. If the requirement had been more prescriptive, you could avoid this
problem. For example, the requirement could say that prevention should always
be the preferred option, and that deterrence or detection should only be relied
on by themselves if there is no good prevention option. But of course, this
would then leave an ambiguity about what criteria should be used to establish
that there is no “good” prevention option, so that would need to be put into
the requirement as well – making it even more prescriptive. And so on.
In other
words, what’s the best way that NERC entities can be spared the pain of falling
into auditing problems like this, when a requirement is written non-prescriptively,
like CIP-007-5 R3? The best way is to make the requirement extremely
prescriptive! In fact, this requirement’s upstairs neighbor, CIP-007-5[i] R2
(patch management), is prescriptive precisely for this reason. Here’s why…
In CIP v3, the
patch management requirement was CIP-007-3 R3, which read “The Responsible
Entity…shall establish, document and implement a security patch management
program for tracking, evaluating, testing, and installing applicable cyber
security software patches for all Cyber Assets within the Electronic Security
Perimeter(s).” The two sub-requirements under it (which is what requirement
parts were called in those benighted days) read:
R3.1.
The Responsible Entity shall document the assessment of security patches and
security upgrades for applicability within thirty calendar days of availability
of the patches or upgrades.
R3.2.
The Responsible Entity shall document the implementation of security patches.
In any case where the patch is not installed, the Responsible Entity shall
document compensating measure(s) applied to mitigate risk exposure.
While I’m
sure this language sounded pretty reasonable when CIP v3 was drafted, in
practice there were lots of disputes between NERC entities and auditors – and
many violations handed out, I believe – regarding this requirement. The problem
wasn’t that it was prescriptive (in v3, pretty much all of the requirements
were prescriptive); the problem was that it wasn’t prescriptive enough. If you
compare CIP-007-3 R3 to CIP-007-5 R2, you’ll see what I mean by this.
Here’s an
example: In CIP-007-3 R3, while the entity was required to assess patches for
applicability within 30 days of their availability, they weren’t required to
look for them in the first place. This, of course, led to arguments between
entities and auditors, who often felt it was a reasonable expectation that the
entity should look for patches, not just wait for an email to show up. Of
course, since the requirement clearly didn’t mandate looking for new patches, I
doubt that any potential violation findings for not doing so were ultimately
upheld.
But this
experience was very much on the CIP v5 drafting team’s mind as they worked on
the CIP v5 patch management requirement (which is of course identical to the v6
one). And how did they fix this problem? They made the requirement more prescriptive by requiring the
entity to look for available patches for every piece of software installed on a
device within their ESP. Thus, they established once and for all what was
required and what wasn’t – and they seem to have greatly reduced or even
eliminated interpretation problems for this requirement[ii].
There were
other problems with CIP-007-3 R3, which were resolved in a similar fashion:
- Since there was no requirement that every applicable patch
had to be either installed or the vulnerability mitigated, CIP-007-5 R2.3 added
a requirement for this, as well as a 35-day time limit for doing so.
- Since there was no requirement that mitigation activities
needed to remain in place until they weren’t needed anymore, this was made
explicit in R2.4.
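To make concrete why a prescriptive time limit is easy to audit, here is a minimal sketch of how the 35-day limit mentioned above reduces to a yes/no check. The record fields and function name are hypothetical, invented for illustration; this is not how any auditor or tool actually evaluates evidence, and it assumes the clock starts when the applicability evaluation is completed.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumption: the 35-calendar-day limit mentioned in the post.
WINDOW = timedelta(days=35)


@dataclass
class PatchRecord:
    """Hypothetical evidence record for one applicable patch."""
    evaluated_on: date            # date the applicability evaluation was completed
    resolved_on: Optional[date]   # date installed or mitigation documented; None if neither yet


def within_window(record: PatchRecord, as_of: date) -> bool:
    """Return True if the patch was installed or mitigated within the window.

    An unresolved patch passes only while the window is still open on `as_of`.
    """
    deadline = record.evaluated_on + WINDOW
    if record.resolved_on is None:
        return as_of <= deadline
    return record.resolved_on <= deadline


audit_day = date(2017, 3, 1)
on_time = PatchRecord(evaluated_on=date(2017, 1, 2), resolved_on=date(2017, 2, 6))
one_day_late = PatchRecord(evaluated_on=date(2017, 1, 2), resolved_on=date(2017, 2, 7))
print(within_window(on_time, audit_day), within_window(one_day_late, audit_day))
```

The point of the sketch is that nothing in it requires judgment: the answer is either True or False. A requirement like “Deploy method(s) to deter, detect, or prevent malicious code” offers no comparable threshold to write a check against.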
So the CIP patch
management requirement, which was already prescriptive in CIP v3, was made much
more prescriptive in v5. Why did the drafting team do this? It certainly wasn’t
because they didn’t know there was any alternative to prescriptive
requirements, since several of the other v5 requirements were deliberately made
non-prescriptive (of course, CIP-007-5 R3 was the best example of this. But
CIP-011-1 R1 and CIP-003-5 R3 are also examples, as well as others).
No, making
the patch management requirement much more prescriptive was a deliberate defensive measure. The drafting team
thought that the only way to put auditing problems to rest, like those listed
above, was to establish once and for all what was required and what wasn’t. They
obviously decided that the auditing problems that came with having a somewhat
prescriptive requirement like CIP-007-3 R3 were such a huge time drain for people
involved with CIP compliance that it would be better for them to have to comply
with a much more prescriptive requirement that at least wasn’t ambiguous.[iii]
And how well
has this turned out? I haven’t taken a scientific survey, but I have yet to
talk with a NERC entity – with High or Medium impact BES Cyber Systems – that
doesn’t put CIP-007-6 R2 at or very near the top of the list of current CIP
requirements that cause them headaches – and that require huge amounts of
resources.[iv] One CIP
manager at a medium-sized utility told me that, of all the documentation they
generate for all the NERC requirements
(not just the CIP requirements) in their control centers every year, at least
half of that documentation is due to this one requirement. This is quite
telling, especially when you consider there are probably around 150 total NERC
requirements.
So I hope I’ve
convinced you that the only way to come close to eliminating auditing problems
caused by differing interpretations of a requirement between the auditor and
the auditee, given the current prescriptive NERC
compliance enforcement regime, is to make the requirements
as prescriptive as possible. If you’d like to see all of the CIP standards go “back
to the future” and be made as prescriptive as possible, raise your hand…I didn’t
think I’d see any hands. So now it seems like we’re between a rock (auditing
problems when requirements aren’t prescriptive) and a hard place
(super-prescriptive requirements being the best way to reduce auditing
problems!).
The CIP
Modifications SDT would like to make CIP-007 R2 a non-prescriptive requirement,
because – as they rightly point out – this and other prescriptive requirements
(the example they used in the webinar was CIP-005 R1, although CIP-010 R1 will
also be high on most people’s list of prescriptive CIP requirements) make it almost
impossible to incorporate virtualization into CIP. To go back to the two forks
in the road that they talked about in the webinar (see my previous post for
these), the left fork – which is essentially the path they were pursuing until
earlier this year and is the one I criticized as unworkable - requires drawing
up and balloting some very complicated and certainly controversial definitions
like Electronic Security Zone and Centralized Management System. It also
requires making a slew of modifications to CIP requirements to accommodate
these things, which will be very hard – if not impossible - to accomplish with
prescriptive technical requirements still in place (of course, drafting all
these changes will be hard, but that will be the easy part. Each change would
have to be balloted multiple times and endlessly debated).
Of course, it
was crystal clear in the white paper and the webinar that the SDT has already
rejected the left fork. But what about the right fork? That’s the one based on
the two Great Ideas I listed above. The key to that fork is making BES Cyber
System the fundamental building block of CIP compliance, which eliminates a lot
of very difficult questions that would be involved with defining something like
“virtual Cyber Assets” or “virtual BES Cyber Assets” (as I used to think had to
be done before anything else could be accomplished on virtualization). A system
can be composed of physical or virtual components, or a mixture of them. It
will be up to the entity to decide what their systems are and identify those
that meet the new BCS definition (which will incorporate the 15-minute impact
criterion from the BES Cyber Asset definition).[v] The SDT
believes that with this change, there will probably be no need to change any of
the “non-technical” requirements (which includes all CIP-003 through CIP-011
requirements other than those in CIP-005, CIP-007 and CIP-010). But as for the
technical requirements, they will all need to be made non-prescriptive, if they
aren’t already so.[vi]
What if we
tried to make CIP-007-6 R2 non-prescriptive? What would it look like? If we use
CIP-007-6 R3 as a model, and also use the fact that the threat that patch
management mitigates is that of software vulnerabilities, it might read
something like “Deploy methods to protect against software vulnerabilities[vii].”
Let’s do a
thought experiment here: Think of an auditor from your NERC Region. What would
they say if they came to audit you two years after the above non-prescriptive
version of CIP-007 R2 came into effect, and you told them you had decided that
patch management was the best way to mitigate the threat of software
vulnerabilities? I think they would be glad to hear that, wouldn’t you? Indeed,
it’s hard to think of any program to mitigate this threat that wouldn’t include patch management in
some way.
But what if
you then told them that your patch management program includes checking for new
patches for very important software every month but for other software only
once a year? And that, once a patch has been deemed applicable, you’re
deploying or mitigating it within one month for very important software, but
within one year for other software? My guess is the auditor will say “Well, I
like the fact that you’re checking for and applying new patches for important
systems every month, but you’re only taking those steps for other systems once
a year. I want you to cut the interval for other software down to about three
months.”
And what
will be your reply to him? Will you simply point out that what he’s requesting
goes way beyond the language of the requirement? In fact, you might point out
to him that there’s nothing in the language to prevent your patching even
important software only once a year (or every ten years, for that matter). What
is going to give here? You’re right that the requirement language doesn’t
require that you do what he is saying; and he’s right when he says it would be
hard to find any cybersecurity expert who would assert that patching software
once a year was OK. You and the auditor will be at an impasse.
Assuming this were all to come to pass, who’s
at fault here?
- You, since you’re clearly not following good cybersecurity
practices?
- The auditor, since he’s trying to interpret the patch
management requirement to say more than it does?
- The CIP Modifications drafting team, for
getting rid of the prescriptivity of the patch management requirement,
causing this sort of impasse to be possible?
Let’s state
the problem here. It isn’t that ambiguity or loose wording is causing auditors
and auditees to take different “interpretations” of what the words of a
requirement mean, which is how most people would state the problem. If that were
the problem, it could only be solved by doing one of the following:
- Doing a Request for Interpretation, which only works in a
case where the wording of the requirement or definition isn’t ambiguous,
but still needs to be “teased out” to deal with a particular situation.
But this requires going through a process with multiple drafts and
ballots, followed by FERC approval (which isn’t always forthcoming, as
the then-Interpretations Drafting Team found out in 2013, when FERC remanded two
Interpretations they’d worked on for a couple of years. That’s why there is
no longer a NERC CIP IDT).
- Providing some sort of mandatory “guidance” that will be more
or less binding on auditors and NERC entities. This was first tried with
the CANs and CARs under CIP v3, which were almost all retracted. And when
CIP v5 was approved by FERC and entities started realizing there were a
lot of ambiguities, NERC tried
to do this using a whole slew of vehicles – FAQs, RSAWs, the CIP v5
transition study, Lessons Learned, Memoranda. Almost all of these were
retracted as well, simply because NERC isn’t allowed to provide mandatory
interpretations.
- Revising the ambiguous standards or definitions, which of
course requires a new drafting team, ballots, etc. What NERC finally did
in 2015 – when all the guidance attempts were clearly not working - was
refer a few of the thorniest
issues with CIP v5 to a new SDT (although they left out many
other issues). However, the SDT has so far not made any progress on these
issues, since they had to work on more pressing FERC mandates such as
CIP-003-7. But their new virtualization proposal will very neatly
“address” four of the five items that NERC added to their mandate. Those
four items all deal with the definitions of Cyber Asset and BES Cyber
Asset, and since the SDT is proposing to just eliminate those two
definitions altogether, they are killing four birds with one stone. Pretty
neat! However, there’s no reason to suspect that all CIP ambiguities can
be dealt with by simply eliminating the requirements or definitions in
question. Would it were so!
Which is all
to say that if ambiguous wording is really the problem, it’s an insoluble one;
but it isn’t the problem. I used to believe that the problems with CIP v5 were
due to ambiguity, but now I’ve decided they’re much more fundamental. The
problem is that cybersecurity isn’t electrical engineering. NERC was founded
by engineers trying to solve problems caused by lack of standardization among the
different power market participants (NERC was founded in the wake of the Northeast
Blackout of 1965, the proximate cause of which was an improper relay
setting).
The solution
to these problems was standards requiring very specific actions that could be
very accurately audited: You either set your relays according to the standard
or you didn’t.[viii]
In other words, the only kind of requirements that make sense in the 693 world
are measurable ones.[ix]
NERC's auditing program was developed to govern audits of measurable requirements, so it
countenances nothing but rigid “either you did it or you didn’t do it”
judgments.
However,
cybersecurity is a statistical process. If one utility doesn’t patch one system
in its control center for two months, it isn’t likely that this alone will lead
to a cascading BES outage. If ten utilities – all neighbors – don’t patch any
of the servers in their control centers for ten years, this could very well
lead to a cascading outage, but even that isn’t certain. So where do you draw
the line when you’re drafting a patch management requirement?
The answer
is you can’t draw a line. What’s needed is for the entity to have in place a
good program for patch management, period. The only way to judge whether a
program is good or not is for a) the requirement to be written non-prescriptively;
and b) the auditor to be able to exercise good judgment, and to be able to provide
advice to the entity so that they can improve their cybersecurity program. But
NERC doesn’t allow auditors to do this now.
What the CIP
Modifications SDT wants to do, in the second Great Idea that is part of their
proposal to deal with virtualization in CIP, is to implement a) but not b). And
since b) requires developing a CIP-specific version of NERC's auditing methodology, I can see why they
wouldn’t want to tackle this. But this means that just moving ahead with their
Great Ideas simply won’t work, and the result will be to greatly increase the
kinds of conflicts between auditors and auditees that we’ve seen too much of
since CIP v5 became enforceable.
However, as
I hinted above, there is a partial solution, that wouldn’t require rewriting
NERC's auditing program, but would probably allow the SDT to move forward with their virtualization
proposal. And this partial solution has already been implemented at least three
times in CIP in the past two years – this is what I call “plan-based”
requirements.
Since this
is already a very long post and since I’m tired, I’m going to break here. I’ll
be back in one week (probably not before) with the third post in this series,
which will (I think) bring this discussion to its exciting conclusion. But I also want to point out that I could never adequately lay
out these issues – of what the fundamental problems are with CIP and how to
address them – in my blog, no matter how many posts I devoted to the subject. However,
I’m now working on a book, with a co-author, where I am doing just that. It
will be a big slog, but I believe I’m about halfway there, if not a little
more. That will be my “final” answer to these very important questions.
Any opinions expressed in this blog post are strictly mine
and are not necessarily shared by any of the clients of Tom Alrich LLC.
If you would like to comment on what you have read here, I
would love to hear from you. Please email me at tom@tomalrich.com. Please keep in mind that
if you’re a NERC entity, Tom Alrich LLC can help you with NERC CIP issues or
challenges like what is discussed in this post – especially on compliance with
CIP-013. And if you’re a security vendor to the power industry, TALLC can help
you by developing marketing materials, delivering webinars, etc. To discuss any
of this, you can email me at the same address.
[i] Of course,
all of the CIP-007 requirements are the same under v6, the version currently in
effect. But since I’m talking about the CIP v5 drafting process here, I want to
refer to the v5 requirements.
[ii] While I’ve
heard many complaints about the big burden of CIP-007-6 R2 compliance, I can’t
remember a single complaint about an auditor interpreting that requirement
differently from how the entity does. As I point out below in this post, this “victory”
for NERC entities has come at a huge price, which is that the cost (in money
and staff time) required to comply with this requirement is much greater than
that of any other current (or past) CIP requirement.
[iii] I want
to thank a longtime observer of the CIP drafting teams for pointing this out to
me.
[iv] See this
post for Lew Folkerth’s observation that – in his opinion (and I don’t know
whether this has changed in the last year and a half or not) – any NERC entity
that isn’t self-reporting violations of CIP-007 R2 doesn’t understand the
requirement; in other words, the requirement is impossible to fully comply
with, at least for entities with a lot of cyber assets in scope for CIP.
[v] I have a
feeling that some auditors might object to leaving it completely up to the
entity to determine what is and isn’t part of a BCS, with no underlying BCA
definition. I hope to discuss this further in the fourth post in this series.
[vi] The SDT
used the term “objectives-based” a few times in the webinar and the white
paper. But a true objectives-based standard has to be measurable, since
otherwise there is no way to determine whether or not the entity has achieved
the objective. But there are no measurable objectives in the field of
cybersecurity. And please don’t tell me that a 35-day deadline for assessing
patches for applicability, or a 24-hour deadline for removing access to BCS, is
a cybersecurity objective! The objectives of cybersecurity are mitigating the
various cyber threats, like malicious insiders or someone unauthorized taking
remote control of an important system. There is no direct way to measure how
well an entity has mitigated a threat, since the lack of realization of a threat
may simply be due to luck, which could change at any moment.
[vii] The
CIP Modifications SDT hinted in their white paper that they’re considering
eliminating patch management as a requirement altogether, and replacing it with
a requirement for vulnerability management. However, it seems to me that all
the problems we’ve just been discussing with the patch management requirement
would simply reappear (and perhaps in spades) in a vulnerability management
requirement. Vulnerability management is certainly worth further consideration,
but simply implementing it doesn’t mitigate auditing problems.
[viii] I
realize I’m probably vastly oversimplifying the PRC standards here – I know they
aren’t always clear-cut, either. But in the case of the 693 standards,
questions about what they mean are real ambiguities, and can be resolved by one
of the three methods just discussed in this post. That isn’t the case with CIP
questions like what we’re discussing now.
[ix]
Prescriptive standards are definitely measurable, but objectives-based ones are
as well, as long as the objective itself is measurable. However, as I’ve
already said, cybersecurity objectives aren’t measurable.
7/12: In rereading this post just now, I realized that at one point I referred to the CSO706 drafting team (which drafted CIP v2-v5), when I should have said the current CIP Modifications team. I apologize for this error.