In the first post that I wrote on the Colonial Pipeline incident (debacle might be a better word), I pointed out a similarity between that attack and another serious ransomware attack in 2018 – this one on a large electric utility. The similarity was that in both cases, the OT network (the Control Centers, in the case of the utility attack) ended up being shut down, even though the victim organization swore up and down that only their IT network had been affected by the incident.
Of course, a lot of people have argued, in the case of
Colonial, that they must be lying: the ransomware actually did penetrate the OT
network, so they had no choice but to shut it down. I certainly have no way of
knowing whether that’s true or not. I do think (just because of what I know
about the NERC CIP requirements) that it’s very unlikely the ransomware
penetrated the utility’s Control Centers in the 2018 case.
But the point is that it doesn’t matter: In the case of
Colonial and in the 2018 case,
continuing to operate the OT network while the IT network was recovering from
ransomware made no sense. Both networks had to come down and be remediated.
In the 2018 case, the utility’s reason for bringing the
Control Centers down (and since they never admitted they had gone down, I
learned this from someone who at the time worked at a neighboring utility) was
that they couldn’t take the chance that even one of the Control Center systems
had become infected; they had to treat the entire CC network as infected, and
wipe and rebuild every system.
In my initial post, I didn’t state
Colonial’s reason for bringing the OT network down, since I didn’t know
it. However, in my second
post on Colonial (this is the third), I noted that WaPo had pointed
out that Colonial couldn’t do any invoicing while the IT network was down – so if
they had continued to operate the pipeline while the IT network was down, they
would have ended up delivering a lot of gasoline (all of it?) for free. The
practice of providing your product or service for free is frowned upon in
business school classes, from what I hear (it’s a great way to build the
customer base, but not such a great way to build profitability).
That alone would be a pretty good
reason to shut down the OT network (as it probably would be for most organizations
that are hit by ransomware), but I also received a comment on that post from
Unknown – who, along with his good friend Anonymous, is one of the two most prolific
commenters on my posts. Unknown pointed out that “Pipelines are like banks and
oil in the pipeline is like cash in the bank. If a bank loses its ability to
track who gave them cash (or who they loaned it to), then there is no point
opening the doors, even if they can safely store the money in the vault.”
Of course, this makes all the
sense in the world. Colonial doesn’t own the gasoline and other products that are
delivered by its pipelines; it just charges a fee for delivering them. If it
can’t track who delivered products to its pipeline, or who received them on the
other end, then it’s going to be on the hook for the full cost of those
products. Similarly, a bank can’t tell its customer “We’re sorry, but all of
your money is gone. We have no idea what happened to it. But don’t even think
of suing us, since we’re not liable for this. We apologize for any
inconvenience this might have caused you.”
So here was another very good
reason why the OT network might have been shut down. That made three reasons
(including the one from the 2018 incident) why an OT network would be shut down
if the IT network were compromised by ransomware. I thought I’d developed a
complete catalogue of reasons.
However, on Monday I read this article
in Utility Dive that pointed to another important reason why the OT
network would have to come down if the IT network did: If the ransomware did
jump to the OT network, it might cause an uncontrolled shutdown of operations,
which of course can be very damaging. By proactively shutting operations down
according to the required procedures, Colonial avoided this outcome.
But the article brought up another
reason as well: Continuing to operate the OT network while the IT network was still
infected raised the risk that the ransomware would jump to OT. (Remember, Colonial
paid the ransom to unlock their IT systems and start running again, but just
decrypting a system doesn’t remove the ransomware.) A prudent organization, before
they decided to leave their OT network running, would have to ask: Do we trust
our security controls enough that we’re sure we’ll prevent the ransomware from getting
into the OT network?
Colonial might well have decided
that the answer to this question was No, and with good reason: Because so much
having to do with OT (ICS) security is relatively new and untested, it would be
very hard for even the most seasoned ICS security professional to say – with a
straight face – that he or she is 100% sure that the ransomware won’t jump from
IT to OT, possibly leading to a catastrophic uncontrolled shutdown that might
leave the whole system down for a long time.
But the article wasn’t finished
with possible reasons for shutting down the OT network when the IT network succumbs
to ransomware. For another reason, it turned to Tim Conway of SANS, who is
quoted (in a webcast last Thursday) saying “(If you consider the networks to
be a bookshelf, with IT systems on one end and OT systems on the other), there's
a whole bunch of stuff that lives…in between (the IT and OT systems).” This
“in-between zone” includes both IT and OT assets, as well as a lot of business
intelligence data. Tim gave the example of the “manufacturing execution system
(MES)” as one of the systems that sits in this zone on the bookshelf.
The article continues:
Conway suggested using 2017's
NotPetya ransomware attack as an example: Maersk
was a collateral victim of the malware, but the shipping company didn't
have any issues on the "far end" of the OT side, such as crane
control or maritime shipping.
"But if the issue was, they
didn't know what was inside the containers, that impacts all of its
operations," he said. It would then complicate labeling the NotPetya
attack as one on IT or OT.
This is an excellent example of
another attack (one that almost sank Maersk, pun intended) on what
was technically the IT network, which required the organization to bring down
their OT network as well. It makes no sense to continue shipping operations when
there’s no way to know what you’re delivering and to whom. In fact, this is
almost exactly analogous to one of the reasons why Colonial might have shut
down their OT network, since they’re in the (liquid) goods transportation
business as well.
I’m sure there are other reasons
why a company might need to shut down their operational systems, even if only
their IT systems are brought down by a ransomware attack. So it seems clear to
me that it’s almost inevitable that what I call an “operations-focused” company
– as opposed to an information-focused company like an insurance company or a consulting
firm – will be forced to bring their OT network down if their IT network falls
victim to a ransomware attack.
You can refer to this from now on
as “Tom’s First Law of OT networks”. If I come up with another, I’ll be sure to
let you know.
Any opinions expressed in this
blog post are strictly mine and are not necessarily shared by any of the
clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would
love to hear from you. Please email me at tom@tomalrich.com.