On April 30,
I wrote a post
pointing to a story by Blake Sobczak of E&E News about what seems to be the first reported cyberattack that affected power grid operations (as opposed to, say, an attack on some utility's web site).
I updated this with two more posts that week, which you can find here and
here;
the second of these two posts is based on a new article by Blake.
Even though there were still big questions about the attack, and Blake came out with a new article last week, I haven't written about it since then. This isn't because there was nothing more to say, but because I just didn't have time to say it (and because there had been another momentous – at least in my book – event in the power industry earlier, namely the publication by Lew Folkerth of RF of a new article on CIP-013, which I wrote about in my only post last week).
In fact,
after my third post on the attack I had a very good discussion online with four
people who are very knowledgeable about securing the grid against cyber
attacks. All are longtime industry cyber professionals; three work for power
market participants, while the fourth works for an important industry equipment
vendor. One is Trevor Hodges of Tesla Energy, although the other three prefer
not to be named. These discussions,
along with Blake’s most recent article, allow me now to summarize what I think
has been either stated publicly so far, or can reasonably be inferred from
those public statements. This certainly doesn’t answer all questions about the
attack, but it at least lets the remaining questions be much more targeted.
First, my three posts two weeks ago were all focused on which entity was the victim of this attack (if multiple entities had been attacked, there would presumably have been multiple OE-417 reports). I was asking this not because I consider it vital to know this information for its own sake, but because there were still big questions about what kinds of assets were actually affected by this attack (or more specifically, what kinds of assets had their communications affected by the attack) – knowing the entity would help in understanding the attack itself.
It had
seemed to me and others that Peak Reliability, the Reliability Coordinator for most
of the Western US, including the four counties in California, Wyoming and Utah
that were specifically mentioned in the attack, was the logical victim of the
attack. But that was ruled out in Blake’s second article, which quoted Peak as
denying this (and there’s no reason to believe they wouldn’t tell the truth
about this).
That meant
that the victim must have been either a utility or an independent power
producer (with their own control center). Blake's second article quotes a DoE official as saying the attack was on an "electric utility", which rules out IPPs, as well as Peak. But since there's no utility whose service area covers all four affected counties, it is probably a utility whose control center – besides being a Balancing Authority for one or more regions – controls generation assets in those counties.
The only
problem with this theory is that there’s no obvious utility that meets this
definition. One of my four friends did a little checking in public reports and
identified a utility that controls generation in three of the four counties (the fourth being Los Angeles County), but none that controlled generation in all four. He forwarded the name to me, but that utility has also denied that
they were the one, so it looks like we’re at a dead end on the utility name
(although I wouldn’t reveal it anyway, unless it were already public).
So maybe the
utility was just one of the nodes attacked, and through the network the
attackers went after the generators – but the latter aren’t owned by the
utility. Whatever. But we now have at least enough information to put together
a rough scenario for what went on.
The attack
seems to have been initially against the control center of a large Western
utility (which probably wasn’t located in one of the four counties reported to
DoE for the actual event). And the NERC E-ISAC has reported that the attack was on Cisco ASA firewalls (or routers), using a vulnerability that had been reported in June 2018. This attack can cause a "Denial of Service" condition, which in this case meant the device rebooted, cutting off communications to (presumably) the three or four generating plants (the term "DoS" was initially thought by some, including me, to mean that the attack itself was a denial of service attack, or even a DDoS, but it seems clear now that the E-ISAC just meant that the attack caused the ASAs to reboot).
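One practical implication for asset owners: if a firewall or router in this communications path reboots unexpectedly, that fact should at least show up in the device's own logs. Below is a minimal sketch, in Python, of scanning an exported syslog file for messages that look like reload events. The file name, timestamp format and keyword list are assumptions I'm making for illustration – they aren't taken from the E-ISAC report – so they would need to be adapted to whatever your devices actually emit, and the hits would need to be reviewed by hand to rule out planned reloads.

```python
#!/usr/bin/env python3
"""Minimal sketch: scan an exported syslog file for messages that look
like device reload events, the effect the E-ISAC attributed to this
vulnerability (the firewall rebooting and dropping communications).

Assumptions (not from the original post): logs are collected one message
per line in a plain-text file, with a leading timestamp in a form like
"Jun 30 2018 12:34:56", and reload events contain one of the keywords
below. Adjust both to match your own logging configuration; expect false
positives (e.g. planned reloads) and review hits manually.
"""

import re
import sys

# Keywords that commonly appear when a device restarts; this is an
# illustrative assumption, not an authoritative list of ASA messages.
RELOAD_KEYWORDS = ("reload", "reboot", "startup completed")

# Matches a leading timestamp like "Jun 30 2018 12:34:56" (assumed format).
TIMESTAMP_RE = re.compile(r"^([A-Z][a-z]{2} +\d{1,2} \d{4} \d{2}:\d{2}:\d{2})")


def find_reload_events(log_path):
    """Return (timestamp, message) pairs for lines that look like reloads."""
    hits = []
    with open(log_path, "r", errors="replace") as f:
        for line in f:
            lowered = line.lower()
            if any(keyword in lowered for keyword in RELOAD_KEYWORDS):
                match = TIMESTAMP_RE.match(line)
                hits.append((match.group(1) if match else "unknown time",
                             line.rstrip()))
    return hits


if __name__ == "__main__":
    # Hypothetical default file name; pass your own export as an argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "asa_syslog.txt"
    for timestamp, message in find_reload_events(path):
        print(f"{timestamp}: {message}")
```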
As had been
stated from the beginning (in the OE-417 report), there was no loss of
generation, meaning none of the plants went down, even though they lost
communications with their BA. Loss of control center communications is a very
common occurrence in OE-417 reports, but the difference in this case is that
the cause of this loss was a cyber attack.
This is pretty much what has been publicly stated. Here are the questions that I feel should be answered, so the industry can file this in the "OMG, this is really serious. We have to get on this right away!" file, the "S__t happens. Get over it" file, or – most likely – something in between.
- It seems the attack came from the internet, but then the question becomes why the ASAs were directly accessible from the internet in the first place, since the utility in question could have had their own private network and wouldn't have had to be exposed like that.
- However, many control centers rely on public
internet-based VPNs for communication with generators, usually because of
the big cost savings compared to private carrier solutions. If so, the VPN
endpoints could have been the ASAs that were attacked. And if this is the
case, utilities need to know this, since they need to update their risk
estimates for using VPNs for control center communications.
- But there's also the possibility that a private VPN to the generating plants was cut off because the same router or firewall was also used for internet communications (presumably to the utility's IT network, since a control center should never be directly connected to the internet) – and that was where the attack came from. In other words, the attack came in from the internet and only brought the communications with the plants down because those communications happened to run through the same physical device. This seems pretty hard to believe, since it's hard to see a utility springing for an expensive private network, then cheaping out by not buying a separate ASA for their internet connection. In any case, any entity doing this should carefully consider the risks of continuing to use the same device for both IT and OT purposes, especially when the IT network involves a direct internet connection.
- And there’s at least one other possibility for this: The
ASA was being used for interactive remote access (through VPN on the
public internet) to systems in the control center, as well as for
communications with the plants. Again, the same risk assessment is needed
if any utility is doing this now.
- The E-ISAC report states that the communications outage
was only five minutes, yet the OE-417 states that the attack continued for
nine hours. What does this mean? Did someone keep forcing the ASA to
reboot – and sever the VPN each time – over a period of nine hours? Maybe
the attacker had reached the end of his shift, and the next guy didn’t
start until eight hours later? If the problem was solved in five minutes,
it’s hard to understand why the utility would say it lasted for nine
hours.
- What about the motivation of the attacker? One of my friends stated that it couldn't have been someone deliberately trying to damage the grid itself, since simply shutting off communications with a few power plants – which by itself definitely can't bring them down – wouldn't affect the grid. In fact, bringing communications down would almost never result in the plant itself going down, although it would mean that the people in the plant would have to make sure they had other communications active with the control center (which they need to have anyway, of course).
- However, one of my other friends pointed out that the attacker could have been trying to use the attack to gain information from the ASA, not to actually cause it to reboot; the CVE detail says "It is also possible on certain software releases that the ASA will not reload, but an attacker could view sensitive system information without authentication by using directory traversal techniques." In other words, the attacker(s) may have been doing reconnaissance for a much more sophisticated attack at a later time, but were set back when the ASA unexpectedly rebooted (I include a sketch after this list of one simple way to look for that kind of probing). It's not at all certain that this wasn't a targeted attack, even a sophisticated one.
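On that last bullet: directory traversal probing against a web-exposed management interface usually leaves a visible trace, namely request paths containing "../" sequences or their percent-encoded equivalents. Here is a minimal sketch, assuming you can export the request paths seen by the device (or by whatever sits in front of it) to a plain-text file, one path per line. The file name and log format are hypothetical; the point is the pattern matching, not the exact log plumbing.

```python
#!/usr/bin/env python3
"""Minimal sketch: flag HTTP request paths that look like directory
traversal probing (e.g. "../" sequences or percent-encoded versions of
them), the reconnaissance behavior the CVE text describes.

Assumption (not from the original post): request paths have been exported
one per line to a plain-text file such as "web_requests.txt". Adapt the
input handling to however your devices actually log requests.
"""

import re
import sys
from urllib.parse import unquote

# Match "../" or "..\" in a (possibly decoded) request path.
TRAVERSAL_RE = re.compile(r"\.\.[/\\]")


def looks_like_traversal(path, max_decodes=3):
    """Return True if the path contains a traversal sequence, even after
    a few rounds of percent-decoding (catches simple double-encoding)."""
    decoded = path
    for _ in range(max_decodes):
        if TRAVERSAL_RE.search(decoded):
            return True
        new = unquote(decoded)
        if new == decoded:
            break
        decoded = new
    return TRAVERSAL_RE.search(decoded) is not None


if __name__ == "__main__":
    # Hypothetical default file name; pass your own export as an argument.
    log_path = sys.argv[1] if len(sys.argv) > 1 else "web_requests.txt"
    with open(log_path, "r", errors="replace") as f:
        for lineno, line in enumerate(f, start=1):
            if looks_like_traversal(line.strip()):
                print(f"line {lineno}: possible traversal probe: {line.strip()}")
```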
The bottom line here is that much more information is needed about this attack, so that electric power entities can learn what they need to do to prevent something like this from happening in the future, and can re-assess risks to BES communications (which of course aren't covered by NERC CIP currently). I certainly hope that the E-ISAC will report their findings when they've finished their analysis.
Any opinions expressed in this blog post are strictly mine
and are not necessarily shared by any of the clients of Tom Alrich LLC.
If you would like to comment on what you have read here, I
would love to hear from you. Please email me at tom@tomalrich.com. Please keep in mind that
if you’re a NERC entity, Tom Alrich LLC can help you with NERC CIP issues or
challenges like what is discussed in this post – especially on compliance with
CIP-013. To discuss this, you can email me at the same address.