A longtime principle
of risk management – for financial risks, general business risks, security
risks, etc. – is acceptance of risk. To accept a risk is to acknowledge that it
exists, but at the same time to decide that the costs of mitigating the risk
outweigh the expected loss if the risk is realized. This is simply an
acknowledgement that there are too many risks to make it worthwhile even to
attempt to mitigate them all. Of course, this is especially true in cyber
security, where the number of risks is very large, if not infinite.
The team that
drafted NERC CIP version 1 was composed of cyber security professionals who
understood this principle very well. So they made sure that the draft of CIP v1
that was submitted to FERC made liberal use of acceptance of risk. Many requirements
were written so that the entity could either perform what was required or “accept
the risk” and not do anything. This seemed to the team to be an eminently
reasonable approach.
However, it
didn’t seem at all reasonable to FERC. They ordered that
the “acceptance of risk” language be removed from the standards; this was done
in CIP v2. Their reasoning was very simple: There is no way a single entity can
accept risk on behalf of all the entities that are part of the Bulk Electric
System.
FERC’s
reasoning made sense in the context of all of the standards that had so far
been developed by NERC (the so-called “693” or “Operations and Planning”
standards). These standards are almost all based on the laws of physics: You do
or don’t do something (like trimming trees regularly under transmission lines),
and the next thing you know, there’s a cascading outage that blacks out a large
swath of the US and Canada. Or you make one wrong move in a key substation, and
the next moment there’s a disturbance that takes out most of south Florida and
is felt in less than a second in Canada. By not doing what they should have
been doing, the utilities involved were essentially accepting risk on behalf of
the entire North American power grid – but the rest of the grid had no chance
to weigh in on whether they should do this or not. They just felt the
consequences.
So FERC
applied that reasoning to CIP v1, probably the first mandatory cyber security
standard for the power grid of any country or region of the world. And why
shouldn’t they? As everyone involved with developing CIP said at the time (and
as many in the NERC community still say), the CIP standards are just basic best
practices. Any NERC entity that doesn’t follow them leaves a hole that’s just
waiting for some adversary to walk through.
But the big
problem with taking a “best practices” approach to cyber security is that it
assumes the set of cyber threats that need to be mitigated is the same for all
entities and for all times. Specifically, it assumes either a) that no
significant new risks will appear over time and that all entities face the same
threat landscape, or b) that, if assumption a) proves false, the standards will
be flexible enough to incorporate these new or variable threats.
Well, guess
what? Assumption a) isn’t true for cyber threats. New threats appear all the
time, and different NERC entities face different threats and to different
degrees. As for assumption b), we all know too well that the NERC standards development
framework makes it close to impossible to incorporate new cyber threats into
the standards in anything less than a few years. For example, the phishing and
ransomware threats have existed for many years, yet there isn’t even
discussion of developing new requirements to deal with them. And the threat
posed by malware-infected laptops has been well known since the late 1990s, yet
when did a requirement that addresses this threat come into effect? 2017.
This
situation has strained the current NERC CIP standards to close to the breaking
point, with entities spending huge amounts of time and money complying with
certain very prescriptive CIP requirements, but then being starved of resources
to deal with other cyber threats that aren’t addressed in CIP at all.[i] FERC
knew this when they wrote Order
829 in 2016, which ordered NERC to develop a supply chain cyber security
standard that was risk-based and not “one size fits all”. While they didn’t say
it this way, this seems to me to be the reasoning they followed:
- It was clear to FERC, as it’s clear now, that supply chain
is by far the biggest area of cyber risk faced by electric utilities, or
almost any other industry, today. Think Target, Stuxnet, NotPetya, and the
current Russian attacks on the US power industry – all came, or are coming,
through the supply chain. FERC felt it was necessary for CIP to address
supply chain risks as soon as possible and as thoroughly as possible.
- At the same time, FERC knew that the US power industry was
struggling mightily just to comply with the existing CIP standards, in
large part because they aren’t risk-based and only allow for one type of
variation among individual entities: the impact level of particular assets
on the BES (and even then, only in three very broad categories). Asking the
industry to take on a huge new burden for supply chain security was
impossible.
- The only way that the burden of the new standard would be manageable
would be if it were risk-based, with the entity itself determining the
most important risks for it to mitigate, as well as how to mitigate them.
- If the entity does this, it will be able to allocate its
limited risk mitigation budget (and who has an unlimited budget?) in a way
that every dollar or hour spent on supply chain cyber risk mitigation
reduces the maximum possible amount of cyber risk. The alternative is a
set of mostly prescriptive CIP requirements (as in CIP-002 through -011)
that decide for the entity which risks it needs to address, as well
as exactly what it needs to do to mitigate each of them. There is no room
for variation among entities or over time. This will inevitably result in
less total risk being mitigated than if the entity is in
charge of deciding for itself which risks it will mitigate, and how it will
mitigate them.
This is why
CIP-013 requires the entity to do just three things: a) Develop a supply chain
cyber security risk management plan (R1[ii]); b)
Implement the plan (R2); and c) Review the plan at least once every 15 months (R3). If the
entity follows the wording of R1 (and it isn’t that easy!), it will be sure to
get the most “bang for the buck” in allocating its limited budget to address
supply chain cyber risk. And if the entity follows the wording of R3, it will
be sure to make adjustments to its plan over time, so that it continues to
address the most important supply chain cyber risks, using the most current
mitigations for those risks.
Where does
acceptance of risk come in here? It’s very simple: There’s no way that risk
management can work unless the entity can accept some risks without mitigating
them. The entity has to identify the biggest threats that it faces,
determine the risk posed by each of those threats, and line the threats up in a
big spreadsheet, ranked by their degree of risk. Then the entity needs to start
at the top and decide which threats (risks) it can mitigate, then draw a line
under the last one of these. The entity will mitigate all of the risks above
the line, and it will accept all of the risks below the line.[iii]
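To make the spreadsheet exercise concrete, here is a minimal sketch in Python. Everything specific in it is an assumption made up for illustration: the four example risks, the 1-to-5 likelihood and impact scales, the "likelihood times impact" scoring rule, the cost figures and the budget. CIP-013 doesn’t prescribe any of these; the entity chooses its own method. The sketch simply ranks the risks, walks down from the top while the budget holds out, and draws the line at the first risk it can’t afford to mitigate.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: int  # hypothetical 1-5 scale
    impact: int      # hypothetical 1-5 scale
    cost: float      # estimated cost to mitigate, in dollars (made up)

    @property
    def score(self) -> int:
        # Assumed scoring rule: likelihood times impact.
        return self.likelihood * self.impact


def draw_the_line(risks, budget):
    """Rank risks by score, highest first, and split them into
    (mitigate, accept) at the first risk the budget cannot cover."""
    ranked = sorted(risks, key=lambda r: r.score, reverse=True)
    spent = 0.0
    for i, risk in enumerate(ranked):
        if spent + risk.cost > budget:
            # The line goes here: everything from this row down is accepted.
            return ranked[:i], ranked[i:]
        spent += risk.cost
    return ranked, []


# Illustrative entries only; not taken from CIP-013 or any real risk register.
risks = [
    Risk("Compromised vendor software update", 4, 5, 120_000),
    Risk("Vendor remote access misuse", 3, 5, 80_000),
    Risk("Counterfeit hardware component", 2, 4, 150_000),
    Risk("Vendor staff departure not reported", 3, 2, 20_000),
]

mitigate, accept = draw_the_line(risks, budget=250_000)
print("Mitigate:", [r.name for r in mitigate])
print("Accept:  ", [r.name for r in accept])
```

With these made-up numbers, the first two risks land above the line and the last two are accepted. Note that the sketch follows the strict "draw a line" rule: the cheapest risk at the bottom is accepted even though the leftover budget would cover it, which is one reason this should be read as a simplification.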
Of course,
FERC never said that the supply chain standard should allow acceptance of risk,
and CIP-013 doesn’t use those words at all. But I contend that it doesn’t have
to. It makes no sense to talk about risk management if you aren’t accepting
some risks – since, in the cyber security domain and even more so in the supply
chain cyber security domain, there are close to an infinite number of risks. In
complying with CIP-013, the entity is not only allowed to accept some risks, it
is required to do so; otherwise, it’s literally impossible to comply with the
standard without bankrupting the utility. And that doesn’t exactly do a whole
lot for grid reliability.
If you’re
wondering whether I’m the only person saying this, I’m not. Lew Folkerth of RF,
the only person within the wider NERC organization who has written about how to
comply with CIP-013, has written two great articles for RF’s bi-monthly
newsletter (you can get these in PDF form if you email me; otherwise,
you have to download two 13 MB newsletter files). I’ve written about these in
four posts, starting with this
one. The series isn’t finished, since Lew is promising a third article, to
appear in the newsletter that will be released this month. And you thought the
Mueller report was going to cause the most excitement when it’s released this
month? Not in the Alrich household, I can tell you that!
In the first
of Lew’s two articles, he says, regarding auditing of CIP-013 R1.1: “You will
need to be able to show an audit team that you have identified possible supply
chain risks to your high and medium impact BES Cyber Systems, assessed those
risks, and put processes and controls in
place to address those risks that pose the highest risk to the BES.” (my emphasis)
If you think about it you’ll realize that, in saying that the auditors will
look to see that you’ve addressed the highest supply chain cyber risks, Lew’s
also saying that the auditors aren’t going to expect you to mitigate risks that
aren’t among the highest.
In his
second article, Lew makes this more explicit when he says: “You can’t address
all risks, so you will need to prioritize the risks you will address.” He goes
on to describe a process almost exactly like the one I described above,
including identifying threats (risks), assigning risk scores to each one, then
planning to mitigate the threats with the highest scores. Amen.
Most sermons
open with a reading of Scripture. But it seems this one has ended with it.
Any opinions expressed in this blog post are strictly mine
and are not necessarily shared by any of the clients of Tom Alrich LLC.
If you would like to comment on what you have read here, I
would love to hear from you. Please email me at tom@tomalrich.com. Please keep in mind that
if you’re a NERC entity, Tom Alrich LLC can help you with NERC CIP issues or
challenges like what is discussed in this post – especially on compliance with
CIP-013. To discuss this, you can email me at the same address.
[i]
I would love to refer to a post for more information on what I’ve just said,
but it’s scattered around lots of posts. However, I did write an article on
this topic for a UK security journal (print only), that I’m allowed to
distribute in a PDF file. If you’d like me to send that to you, send me an
email at the address above.
[ii]
R1.2.1 through R1.2.6 list six specific risks that must be mitigated in the
plan. They are there because FERC specifically ordered each of those to be
included, when they wrote Order 829. You can think of these as the most
important risks to be mitigated, but certainly not the only ones. The entity is
in charge of deciding what other risks it will mitigate, consistent with its
budget.
[iii]
This is a simplified description, since
it may be possible for the entity to mitigate the same amount of risk, or even
more, by only partially mitigating each of the risks at the top of the
spreadsheet – yet mitigating more of them than if it required either total or no
mitigation. But this is too complicated an idea to discuss in this post.