My post on the billing system at Colonial Pipeline brought out
great comments from two wise men of the power industry cybersecurity world:
Kevin Perry and Tim Roxey. As you’ll see, they didn’t say the same thing at all,
but they didn’t contradict each other, either. Rather, Tim’s comments built on
Kevin’s.
Here’s a quick summary of my
previous post, although I hope you’ll read it if you haven’t yet:
· Even though the ransomware attack never reached Colonial’s OT network, it
did bring down their billing system.
· And even though it might seem odd that the loss of the billing system could
bring down pipeline operations, there were actually good reasons for why that
happened (which I’ll let you read).
· I concluded by pointing out that “Tom’s First Law of OT Networks says that
an ‘operations-focused’ company – as opposed to an information-focused company
like an insurance company or a consulting firm – will be forced to bring their
OT network down if their IT network falls victim to a ransomware attack.”
I stand by what I said, but
Kevin’s and Tim’s email comments made me realize that I hadn’t asked the more interesting
questions:
1. How can we identify systems that don’t directly control operations, yet
can have a huge impact on operations just the same (i.e., IT systems that
perform functions required for operations)? And when we’ve identified them,
what measures can we take to protect them better than other systems on the IT
network that clearly have no direct operational impact, like, say, the systems
that run the utility’s charitable operations?
2. Should those systems be regulated by OT-focused cybersecurity compliance
regimes, such as the dreaded…(and here I cross myself, despite not being
Catholic)…NERC CIP?
3. Or maybe we need to go beyond all this talk about regulation and
protecting systems, and think about what the real problem might be?
Briefly, Kevin addressed questions
1 and 2; Tim took question 3 (not that I even thought of these questions until
now, of course). I’ll start with what Kevin said, and cover what Tim said in my
next post.
On Thursday, Kevin wrote this to
me:
I would argue that any “IT” system,
or system component that is essential to keeping the OT operational needs to be
considered OT and kept isolated from the rest of the IT world. As you
noted, electric metering, whether at the customer point of delivery or in a tie
substation, is OT. The data from the meters are fed into the IT billing
systems. If the billing systems are down, bills will be delayed, but the
meter data collection will continue until it can be transferred to the billing
systems. It is inexcusable that the OT must be shut down because an
essential IT system is down.
Here are the points that I infer
Kevin is making:
1. This problem wouldn’t have happened in the electric power industry, since
an electric utility's operations (including metering) can continue, even when
the bills can’t be generated (no pun intended).
2. The billing system is “essential to operations” in the pipeline industry
(or at least in Colonial’s case), although not in the electric power industry
(meaning it isn’t a BES Cyber System, or BCS).
3. If there were a cyber regulatory regime like NERC CIP in place in the
pipeline industry, the billing system would need to be considered the
equivalent of a BCS.
4. Regulation or no, the pipeline industry should protect their billing
systems using at least some of the same measures (including isolation) used to
protect OT systems.
I responded to Kevin’s email with
the question, “If you think certain IT systems should be isolated, would you
favor an expansion of the CIP standards to require network isolation, as well
as perhaps some (although not necessarily all) of the other CIP requirements?”
I want to make one point here: CIP
already covers a large group of systems that many electric utilities consider
to be part of IT, not OT. Those are systems located in Control Centers. While
these systems certainly perform an OT (and in many cases BES) function, they
aren’t Industrial Control Systems, since they’re implemented on standard
Intel-based hardware and run standard IT operating systems: Windows™ and Linux.
A lot of the management that needs to be done on them is the same as what needs
to be done for, say, financial systems.
And interestingly enough, Control
Centers aren’t included in NERC’s 80-page “definition of the BES”. That
definition requires an asset to be connected to the grid at 100kV or higher.
The only reason systems in Control Centers are even included in CIP is that Control
Centers are specifically called out in CIP-002 R1.1. So it wouldn’t be unprecedented
if other “IT systems” were in scope for CIP, although CIP-002 would have to be
amended for that to happen.
Kevin (a member of the NERC teams
that drafted Urgent Action 1200, the CIP predecessor, as well as CIP versions 1
and 2, and who was then Chief CIP Auditor for the SPP Regional Entity for about
ten years, until his retirement in 2018) replied to my email by saying:
A proper CIP-002 assessment of all
Cyber Assets linked to the proper functioning of the readily identifiable OT
should be sufficient. In the early days, some entities tried to move
systems out of scope simply by moving them out of the ESP (Electronic
Security Perimeter). My team always took a hard look at the
historians that were outside the ESP and also their map board display systems.
Most entities simply used their historians for temporal data storage and
non-real time engineering analysis, and keeping them out of scope was OK.
But I am also aware of at least one
entity that used their historian to drive their map board displays and also
used the historian data for real-time decision making. Their historians
were Critical Cyber Assets (now BCS) because they were used for real-time
operations. At least one entity had map board displays that were not
readily available on the dispatcher console, thus the map board also became a
CCA/BCS. And my team did not stop with systems used for the entity’s
real-time operations. An entity who declared their ICCP servers out of
scope because they were not using the outbound data (destined for their RC or
another BA or TOP) themselves found their decision frowned upon. Even
though they might not be receiving real-time data from a remote association,
they were supplying real-time data essential to the recipient(s). When
they argued to the contrary, my team referred them to the TOP and IRO standards
that compelled them to send what was initially known as “Appendix 4B” data.
So, apply the same logic to the
billing system and you will see the meter data collection subsystem is
absolutely a BCS if its failure causes you to shut down your OT (SCADA/EMS)
systems. The part of the billing system that sends the invoices and
payments is not. Processing invoices and payments can wait until you get
that system back up.
Here is what I take away from what Kevin says: he doesn’t favor expanding the
CIP requirements to include systems located on the IT network. His reason is
that if a system on the IT network meets the definition of BES Cyber System
(which the different examples he used all do, even though the entities that
operated them hadn’t classified them as such), it must already be treated as a
BCS, including being located within the ESP (i.e., the OT network). Of course,
this only applies at Medium and High impact BES assets. Low impact assets
aren’t required to have ESPs.
So a system like the pipeline
billing system – if it existed in the electric power world – would need to be
treated as a BES Cyber System, subject to all the privileges (?) attendant on
that august designation.
I then asked Kevin whether he thinks
utilities should designate their meter data collection systems as BCS. His
answer was nuanced, yet at the same time quite clear:
Inconsistent. The meter data
loss does not impact reliability within 15 minutes (Tom’s note: The
definition of BES Cyber Asset/BES Cyber System requires that the loss or misuse
of the system would have an impact on the Bulk Electric System within 15
minutes. If it has an impact but it will usually take longer than that to
happen, it’s not a BCS). But it also does not cause the utility to
shut down the grid. Loss of telemetry does not stop the revenue-quality
meter from collecting data. Loss of the meter itself does not stop the
flow of electricity. There are procedures for dealing with an occasional
failure, including redundancy and inter-utility meter data reconciliation.
If the meter is only a revenue
meter, then it does not need to be a BCS.
If the meter also reports real-time flows and/or voltage, then it is a
BCS. That is what I meant by
inconsistent.
So Kevin is saying that, given the
current NERC CIP requirements, there are only two choices: The meter data collection
system is a BCS or it’s not. If it’s a BCS, it doesn’t get any break from any
other BCS, in terms of the number or types of requirements that apply to it. If
it’s not a BCS, it’s completely out of scope for CIP.
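To make that all-or-nothing logic concrete, here is a minimal Python sketch of
the test as Kevin describes it. It is a simplification for illustration, not
the official CIP-002 procedure. The names (CyberAsset, minutes_to_bes_impact,
used_for_real_time_operations) are hypothetical, and the example assets and
their values are my assumptions, not data from any utility.

```python
from dataclasses import dataclass
from typing import Optional

# Window taken from the BES Cyber Asset definition quoted above.
BCS_IMPACT_WINDOW_MINUTES = 15


@dataclass
class CyberAsset:
    """Hypothetical, simplified description of a system under review."""
    name: str
    # Estimated minutes until loss or misuse would affect the BES;
    # None means no reliability impact is expected at all.
    minutes_to_bes_impact: Optional[float]
    # True if operators rely on the system for real-time decisions.
    used_for_real_time_operations: bool = False


def is_bcs(asset: CyberAsset) -> bool:
    """Apply the two tests from Kevin's examples (a simplification)."""
    if asset.used_for_real_time_operations:
        return True
    if asset.minutes_to_bes_impact is None:
        return False
    return asset.minutes_to_bes_impact <= BCS_IMPACT_WINDOW_MINUTES


def cip_scope(asset: CyberAsset) -> str:
    """There is no middle ground: full BCS treatment or nothing."""
    if is_bcs(asset):
        return "BCS: all applicable CIP requirements"
    return "not a BCS: out of scope for CIP"


if __name__ == "__main__":
    # Illustrative assets only; the values are assumptions.
    examples = [
        CyberAsset("revenue-only meter", None),
        CyberAsset("meter reporting real-time flows", None, True),
        CyberAsset("historian used for engineering analysis", None),
        CyberAsset("historian driving map board displays", None, True),
    ]
    for asset in examples:
        print(f"{asset.name}: {cip_scope(asset)}")
```

Notice that cip_scope can only return two answers. That is the point of the
paragraph above: under the current standards, there is no partial or
"halfway" category for a system that matters to operations but fails the test.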
But there are certainly cases
where a lack of good security on the IT network can result in an outage of the
OT network. I described a dramatic example of that in this
post, where a ransomware attack that shut down the IT network but didn’t touch
the OT network (as in the case of Colonial), in the end resulted in two large
Control Centers being completely shut down for up to 24 hours, with the grid in
a multistate area being run by cell phone.
It’s safe to say that none of the
systems on the IT network of this utility met the definition of BCS, so there
was no single system that led to the Control Centers being brought down – yet
they were brought down anyway. This seems to me to point to the need for CIP to
be extended in some way to cover IT assets – perhaps as some sort of “halfway
house” asset. But there’s no way that the current CIP standards should be extended
to cover anything else. They first need to be completely rewritten as
risk-based. Then we can look at extending them to IT, based on the relative risk
levels of OT vs. IT.
I’ll turn to Tim Roxey’s comments
in my next post.
Any opinions expressed in this
blog post are strictly mine and are not necessarily shared by any of the
clients of Tom Alrich LLC. Nor
are they shared by the National Telecommunications and Information Administration’s
Software Component Transparency Initiative, for which I volunteer. If you would like to comment on what you
have read here, I would love to hear from you. Please email me at tom@tomalrich.com.