Yesterday’s online Wall Street Journal carried a very good article by Rebecca Smith, titled “Duke Energy Broke Rules Designed to Keep Electric Grid Safe”, on what I can finally call the Duke Energy penalty of $10MM for NERC CIP violations. The last paragraph (which was cut out of the print edition) reads:
“The state of compliance is pretty rotten,” said Tom Alrich, a utility consultant who helps utilities audit their security programs. He added that he knows Duke spends a lot of money on its critical infrastructure protections. “I really doubt they are much more insecure than anyone else,” he said.
While I did say those words, I want to point out that they were part of a much longer discussion with Rebecca on Friday, which I believe she will expand on in a future article. Because I think my comments above are likely to be misunderstood without context, here’s a summary of the argument I was making. (I’ll admit that it was only partway into our discussion that I began to understand what I now believe to be the biggest issue in the Duke penalty; the neatness of my exposition below – if it is neat – comes only with the benefit of hindsight!)
- I think the cyber security of the North American Bulk Electric System is very good, and that is in large part due to the CIP standards; I have had this opinion for years.
- But there is a big problem with CIP (actually, there are four or five, but this is the biggest): The requirements don’t follow a risk-based approach, in which the entity first identifies the most serious cyber security risks they face, as well as the most important systems that need to be protected. This would allow NERC entities to allocate their scarce cyber security resources toward mitigation of the biggest risks, and to the systems that most need to be protected.
- The CIP-013 and (to some extent) CIP-014 standards follow this risk-based approach, as do individual requirements like CIP-010-2 R4, CIP-011-2 R1 and CIP-007-6 R3. On the other hand, there are very prescriptive requirements (the two worst offenders being CIP-007 R2 and CIP-010 R1) that require an inordinate amount of resources to comply with. And even with a lot of resources available, it is all but inevitable that a large organization like Duke will regularly suffer multiple failures on these requirements. I know all of them do.
- Of course, Duke evidently had more than their share of failures; this was without a doubt in part due to the number of acquisitions they have made of late, which almost always leads to compliance problems. And NERC rightly points to management failures as the main cause of the problems, which probably explains the size of the fine. As we know, sometimes it takes a shocking event to get management’s attention.
- NERC also points out (in the second paragraph on page 12 of Part 1; because of the security settings on the PDF, I can’t copy any text from the document) that the risk to the BES in Duke’s 127 CIP violations is due much more to their sheer quantity, and the fact that many of them were of long duration and were repeated, than to the number that were “serious” (13 were serious, while the rest were either “minimal” or “moderate” risk).
- As further evidence, “Jason R” (whom I know well, but I’m withholding his name at his request – not an unusual request in the utility industry!) pointed out in a comment posted yesterday on my last post that a quick scan of the requirements that Duke violated didn’t indicate any violations of requirements that apply only to High impact assets. Since High impact Control Centers are the crown jewels of the BES, this implies that the violations were concentrated in Medium impact substations or generating stations. An attack on any one of these (or even a number of them) would be much less serious than an attack on a single High impact Control Center. This supports NERC’s statement in the previous bullet point.[i]
But don’t get me wrong. I’m certainly not saying that Duke is being unjustly punished for minor violations that everybody else does all the time. They are definitely a very serious violator of NERC CIP, and their penalty reflects that fact.
But here’s the rub: What does this mean for security? Was (or is) Duke a ripe plum just waiting for a halfway-competent Russian-sponsored hacker, or maybe even a script kiddie in Finland, to take over and hold maybe five or ten percent of the Eastern Interconnection hostage (perhaps until the US recognized the Donbass area of Ukraine as an independent country)? Or would destructive software like Shamoon or NotPetya have trashed their control centers and darkened major cities for days? No, I don’t think so. This was a major compliance miss by Duke, but I’m sure (and NERC says as much, multiple times) that the chance of any cascading outage (or even just a local outage) was always very remote.
So what will be the effect of Duke’s penalty? Obviously, they’re going to have to devote (and I’m sure already are devoting) even more resources to CIP compliance than before. They especially need to beef up their compliance management and oversight processes, which NERC says are the big problem. This will without a doubt make Duke much more CIP-compliant.
Now, let’s think about the overall security of the grid (or at least Duke’s portion of it); will that be enhanced to the same degree as Duke’s compliance posture? Not a chance, and here’s why. Any NERC CIP compliance professional will tell you that a large percentage of the money and time that their organization spends on CIP compliance doesn’t go to enhancing the security of the grid, but to mandated activities that consume large amounts of time, out of all proportion to whatever security benefit they provide.
My poster child for this is CIP-007 R2.2, which requires the entity to check, every 35 days, with every vendor of any software or firmware installed on any device within a Medium or High impact Electronic Security Perimeter, to see whether a new security patch has been released since the last check – and this regardless of whether the system is extremely critical or just mildly important, or whether the vendor has released a single security patch in the last twenty years. And the entity has to document – again, for every version of every piece of software or firmware installed on any device in a Medium or High impact ESP – that all of this was done every 35 days, since a NERC rule is that if you didn’t document it, you didn’t do it. For a large entity like Duke, this probably amounts to hundreds or even thousands of software or firmware packages (and each different version thereof, of course) that need to be checked every month, along with documentation that each one was checked.
I’ve asked a number of CIP people to estimate what percentage of every dollar they spend on NERC CIP compliance actually goes to cyber security; their estimates have ranged from 25% to 70%. Think about it. They’re saying that the absolute best case is that 70% of spending on CIP compliance actually goes to enhance BES security, and the worst case is that only 25% does! I honestly think the median is probably around fifty percent.
So it’s clear: Duke will spend a boatload of time and money on enhancing their compliance posture, and while this will all go to improve compliance, probably only about half of it will go to enhancing their security posture. Moreover, even if 100% went to security, that still wouldn’t make them secure, since there are so many cyber security threats (and they’re unfortunately growing all the time) that aren’t addressed in CIP at all. Four big ones I can think of without breaking a sweat:
- Phishing, the ultimate cause of the Ukraine attacks, and the main focus of the current Russian attacks on the US grid. It is very hard to even think of a major cyber attack in the last few years that didn’t start with phishing.
- Ransomware, which has already forced a couple of US utilities to pay ransom, but which fortunately hasn’t yet penetrated control networks. In this category I also include NotPetya, at $10 billion in damages by far the most destructive cyber attack ever (of course, courtesy of our good friend Mr. Putin).
- Machine-to-machine access to BES Cyber Systems. This is specifically exempted from control by CIP now. This threat will be mitigated to some degree when CIP-013 comes into effect on July 1, 2020. Meanwhile, it’s what worries me most about the Russian attacks on vendors – that they could take over vendor-operated devices that communicate directly with BCS. It’s very hard to see how an entity could cut off a well-planned MtM attack before it could cause serious damage.
- Vulnerabilities in custom software, running on BCS, that was developed by or for the NERC entity. Since regular patches typically aren’t made available for custom software, it isn’t covered at all by CIP-007 R2 – and this threat also won’t be addressed in CIP-013. When government supply chain security professionals heard, in a presentation by Howard Gugel of NERC at the Software and Supply Chain Assurance conference in McLean, VA last September, that vulnerabilities in custom software aren’t addressed in any of the CIP standards, including CIP-013, they were incredulous.
Of course, I’m sure utilities are spending money and time mitigating these threats (and many others not addressed at all in CIP), because they know it’s important. But here’s the root of the problem: Money doesn’t grow on trees, even for big utilities like Duke. Ultimately, if Duke has to spend a lot more money on NERC CIP compliance than they have been, at least some of it will come from mitigation of the cyber risks that aren’t addressed at all in CIP. And to the extent that these risks are more important than some of those addressed by CIP (e.g. does the risk of not identifying a new security patch within a month, for a few low-risk pieces of software, justify the enormous amount of resources currently being spent by NERC entities on compliance with CIP-007 R2.2? Especially when that same amount of resources might go to, say, additional anti-phishing training, or code review of custom software to find vulnerabilities?), this will probably – dare I say it? – result in an actual weakening of Duke’s cyber security posture, not a strengthening of it. And there’s nothing I just said that doesn’t apply to every other NERC entity with Medium or High impact BES assets, even those that have so far been fortunate enough to avoid $10MM fines.
I’ll admit that what I’ve just written goes far beyond what Rebecca and I discussed on Friday! Hopefully she’ll be able to explore this topic further in a future article. And whether or not she does, I will certainly do that in future posts (as I already have in a number of previous posts, attacking different sides of the problem). You have been warned.
Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC.
If you would like to comment on what you have read here, I would love to hear from you. Please email me at firstname.lastname@example.org. Please keep in mind that if you’re a NERC entity, Tom Alrich LLC can help you with NERC CIP issues or challenges like what is discussed in this post – especially on compliance with CIP-013; we also work with security product or service vendors that need help articulating their message to the power industry. To discuss this, you can email me at the same address.
[i] The WSJ article specifically points to violations of CIP-005 R2 as serious (i.e. it seems some Medium or High impact BES Cyber Systems weren’t protected by two-factor authentication and/or encryption of Interactive Remote Access). If these were widespread (the document is so highly redacted that it’s impossible to tell whether there were a few systems unprotected or a thousand), I would agree this was a serious vulnerability (hopefully it’s corrected now). This is especially a concern in light of the ongoing Russian cyber attacks, which are in part trying to use IRA as a means of accessing Electronic Security Perimeters.