Tom Alrich's Blog: September 2018

Wednesday, September 26, 2018

When will FERC approve CIP-013?

I must admit that I expected FERC to approve CIP-013 by now. I thought it was close to certain that they would approve it in September, since a) NERC turned it in for their approval at the end of September 2017, and b) September was the last month they could approve CIP-013 in time for it to come into effect on April Fool’s Day 2020. Now the compliance date will be July 1, 2020, unless FERC doesn’t approve CIP-013 until after Q4 – in which case the date will be October 1, 2020.
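The date arithmetic in the paragraph above can be checked with a minimal Python sketch. It assumes a rule that the post does not spell out: the standard takes effect on the first day of the first calendar quarter that begins after 18 months have elapsed from the month of approval. The function name `compliance_date`, the `lead_months` default and the day-of-month values are illustrative assumptions only.

```python
from datetime import date

def compliance_date(approval: date, lead_months: int = 18) -> date:
    """Assumed rule: first day of the calendar quarter that follows
    the month falling lead_months after the approval month."""
    # Count months from year 0 so the arithmetic carries across year ends.
    total = approval.year * 12 + (approval.month - 1) + lead_months
    year, month0 = divmod(total, 12)  # month0 is 0-based (0 = January)
    # 0-based first month of the next quarter: 3, 6, 9 or 12.
    next_quarter = (month0 // 3 + 1) * 3
    if next_quarter == 12:  # rolls over into January of the next year
        year, next_quarter = year + 1, 0
    return date(year, next_quarter + 1, 1)

# Only the month of approval matters under this assumed rule.
print(compliance_date(date(2018, 9, 15)))   # approval in September 2018
print(compliance_date(date(2018, 10, 15)))  # approval in Q4 2018
print(compliance_date(date(2019, 1, 15)))   # approval after Q4 2018
```

Under that assumption, the three cases reproduce the April 1, July 1 and October 1, 2020 dates mentioned above.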

The next possible date for FERC to approve CIP-013-1 is at their monthly Sunshine Meeting the third Thursday of October. FERC issued their Notice of Proposed Rulemaking (NOPR), which said they intend to approve CIP-013-1, in January. This means it will be at least nine months between the NOPR and the Order approving the standard.

This is definitely on the long side, although not unprecedented. FERC issued their NOPR saying they would approve CIP version 5 in April 2013 and they approved it in Order 791 on November 22, 2013[i], which is seven months. CIP v5 was a complete rewrite of all of CIP, including adding two new standards. While CIP-013-1 is a very different kind of standard from any of the currently enforced CIP standards and therefore requires a lot of scrutiny, it would be hard to argue that it needs as much as did CIP v5. In any case, this is why I don’t think FERC will continue their pondering beyond Q4, so I believe it is likely the compliance date will be 7/1/2020.

What’s ironic about this is that FERC, in their Order 829 mandating that NERC develop a supply chain security standard, gave NERC only a year to a) write a Standards Authorization Request and get it approved by the ballot body; b) form a drafting team and set them to work developing the first draft; c) submit the first draft to the ballot body and have it roundly voted down (I believe it received nine percent positive ballots - not exactly good enough, given that 68% is required for approval); d) redraft and re-submit the standard (which the drafting team had to do three times); e) have it approved at the next quarterly NERC Board of Trustees meeting after final approval by the ballot body; f) have the lawyers put on their final touches; then finally g) submit the standard to FERC for them to approve.

The amazing thing is that NERC was able to do all of this and still meet the one-year deadline. And guess what happened? FERC will now have taken at least 13 months to approve CIP-013, and maybe more than that. Hurry up and wait, it seems.
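The intervals quoted so far (nine months, seven months, 13 months) are simple month counts. A small sketch, assuming month-level granularity and a hypothetical helper named `months_between`, shows how each figure falls out:

```python
def months_between(start: tuple[int, int], end: tuple[int, int]) -> int:
    """Whole calendar months from a (year, month) start to a (year, month) end."""
    (y0, m0), (y1, m1) = start, end
    return (y1 - y0) * 12 + (m1 - m0)

# Intervals taken from the post, counted by month only.
print(months_between((2018, 1), (2018, 10)))  # CIP-013 NOPR to earliest possible Order
print(months_between((2013, 4), (2013, 11)))  # CIP v5 NOPR to Order 791
print(months_between((2017, 9), (2018, 10)))  # NERC filing to earliest possible Order
```

Because October is only the earliest possible approval month, the first and third figures are lower bounds.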

I do need to point out that there is only one FERC Commissioner still in office from when Order 829 was issued. And that Commissioner, Cheryl LaFleur, actually dissented from the Order because she thought that FERC should give NERC more time (to read her elegant six-page dissent, go to page 67 of the PDF of the Order).

I totally agreed with her position in my post on the Order. Now I am even more sure that she was right, for this reason: While I think that CIP-013 is very well-written, and is the closest approach yet to how I would rewrite all of CIP if given the chance, it suffers from the near-fatal flaw of being fundamentally un-auditable under NERC’s current prescriptive compliance enforcement process. I have discussed that problem in a number of different posts already, most recently here. Since the problem of making plan-based requirements (which is what the requirements of CIP-013 are) auditable by NERC had already been solved by the CIP v6 drafting team when they drafted CIP-010 R4 (as I explained in the post just linked), I think the CIP-013 drafting team would probably have discovered the same solution if they had had more time to develop the standard. As it is, they had a million fires to deal with just to get CIP-013 passed, and perfection wasn’t something they could afford to aim for.[ii]

In the NOPR, Commissioner LaFleur clearly identified this problem. She issued a statement with the NOPR that included this passage: “The proposed standards would provide significant flexibility to registered entities to determine how best to comply with their requirements. In my view, that flexibility presents both potential risks and benefits. It could allow effective, adaptable approaches to flourish, or allow compliance plans that meet the letter of the standards but do not effectively address supply chain threats. I hope that we will see more of the former, but I believe the Commission, NERC, and the Regional Entities should closely monitor implementation if the standards are ultimately approved” (my emphasis). In my opinion, this is exactly the big problem with CIP-013.

However, this problem isn’t insurmountable. The NERC Regions aren’t constrained to pass every CIP-013 supply chain cyber security risk management plan handed to them, simply because it has the correct title at the top. Even in the strictest auditing regime, an auditor would be allowed to use necessary judgment to determine what constitutes a “good” plan.

So I guess the real problem is not that CIP-013 is un-auditable, but that the auditors will be free to use lots of discretion in auditing, with one auditor stamping a plan as acceptable that another auditor – perhaps within the same Region – would deem unacceptable. This can be avoided if there is a serious effort to develop guidance that describes what should be in a good plan. This guidance might be developed by NERC or by a third party. Unfortunately, neither the CIP-013 Implementation Guidance document prepared by the standards drafting team, nor the recent document put out by the North American Transmission Forum, provides any serious guidance on how to put together a good CIP-013 plan.

Of course, such guidance can’t be considered binding either on the auditor or on the entity being audited, but at least it would provide an indication of the level of performance that should be deemed acceptable; the entity wouldn’t have to follow the guidance exactly, but if they turned in a very minimalist plan, they would need to be able to convince the auditor that it provided roughly the same level of protection as does the plan described in the (as yet unwritten) guidance.

CIP-002-5.1a R1 provides a good illustration of what I mean by this. Perhaps the biggest ambiguity in complying with this requirement (and that’s saying a lot) is that the definition of BES Cyber Asset uses the phrase “impact on the Bulk Electric System” without any further description of what that means. Yet an entity needs to have some idea of what BES impact means, in order for them to have any confidence that they have identified their BES Cyber Systems properly in complying with R1. This is because almost any device that uses electricity – my electric toothbrush, for example – could be considered to have some minuscule impact on the BES.

The Guidance and Technical Basis attached to the standard describes the BES Reliability Operating Services. The BROS were an official part of the CIP-002 R1 compliance process in the first draft of CIP v5 (which was soundly voted down in December 2011), since they formed part of the BES Cyber Asset definition itself – a BCA was defined then as a Cyber Asset that fulfilled a BROS. However, the drafting team, when they met to pick through the wreckage of the first draft at ERCOT’s headquarters in January 2012, decided that the BROS weren’t really an auditable concept – so they moved them into the Guidance and Technical Basis. But the important thing is that they didn't throw out the BROS altogether.

To be honest, I didn't think NERC entities would pay much attention to the BROS after this (since it was no longer mandatory to consider them), but I’ve been pleasantly surprised to see that a number of NERC entities still consider whether a Cyber Asset fulfills one or more BROS, as they decide whether or not it’s a BES Cyber Asset. So, while an entity isn’t required to identify any system that fulfills a BROS as a BCS, and while an auditor isn’t allowed to require the entity to perform the BROS analysis in identifying their BCA/BCS, in fact there has been a tacit agreement among entities and auditors that they will do exactly this.

So it’s good news that there is this tacit agreement regarding identifying BCA/BCS using the BROS, but at the same time it’s bad news that the BCA definition is so open-ended that unwritten and unspoken agreement is required to make audits something more than pin-the-tail-on-the-donkey exercises. By the same token, it’s bad news that CIP-013 R1.1 provides close to no guidance on what should be in the entity’s supply chain cyber security risk management plan, but it will be good news if there can be some tacit agreement between entities and auditors that a certain yet unwritten guidance document provides a good description of what should be included in a good plan.

Ya gotta count your blessings where you can find them, I guess.

Please note that the free CIP-013 webinar workshop offer I made this summer is still good! Just drop me an email and we can set up a time to discuss this by phone.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

[i] Fifty years almost to the hour after the assassination of President Kennedy in 1963. Coincidence, you say? I don’t know…

[ii] And I will admit that, while I did attend some of the on-site and phone-based drafting team meetings, it never occurred to me that this flaw was present, or that CIP-010 R4 exemplified a solution. This realization only came to me this year, as I’ve been working on a book about CIP’s problems and how they can be fixed. Had I realized this, I would certainly have brought it up to the drafting team, although they simply didn’t have the time to deal with it then.

Thursday, September 13, 2018

What does CIP-013 R1.1 tell us?

If you’re a CIP-013 groupie, you may have noticed that I focus on CIP-013 R1.1 a lot in this blog, and not so much on the other CIP-013 requirements. In fact, I’ve only discussed R3 once, and I’ve never discussed R2, since all it says is “Each Responsible Entity shall implement its supply chain cyber security risk management plan(s) specified in Requirement R1.” Not much to discuss there!

And in R1, I focus on R1.1, instead of R1.2. Is this because I don’t like big words like “authenticity” and “Interactive Remote Access” – both of which are found in R1.2? No. While it’s true that I don’t approve of excessive polysyllabilality, I myself have been known to use long words at times. The reason I focus on R1.1 is that this is without doubt the heart of CIP-013. In R1.1, the entity draws up their Supply Chain Cyber Security Risk Management plan, while in R2 and R3 the plan is implemented and then reviewed annually. If the entity doesn’t draw up a good plan because they don’t know what to put in it, they obviously won’t implement anything worth implementing, or review anything worth reviewing. For that entity, the whole CIP-013 exercise will be a waste of time and money.

But the loss will go beyond the entity. As the Russian cyber attacks recently brought into the open by DHS show, the bad guys have figured out that the best way into every large organization nowadays – and most certainly electric utilities – isn’t to mount a full assault on the front gate of the castle, with its myriad protections. It’s to go around to the back door with a single lock on it that the tradesmen use. To continue the somewhat strained metaphor, if the attackers can find a place to hide in the cart that brings in the hay for the animals, they stand a much better chance of breaking into the castle. So supply chain attacks are already becoming the vector of choice for the discriminating cyber hacker. This is the biggest vulnerability for the electric sector, even though as of now there haven't been any successful supply chain attacks on control networks, except for two wind turbines.

In this post from August, I stated that I think CIP-013 R1.1 is un-auditable, because it provides nothing for the entity to key on to include in their plan – and I used CIP-010 R4 Attachment 1 as my poster child for a good plan-based requirement. As long as NERC auditors are only allowed to focus on whether the entity has complied strictly with the specific wording of a requirement (which is the case now, unfortunately), a requirement that doesn’t have some specific wording for the auditors to key on is simply un-auditable. CIP-010 R4 Attachment 1 provides specific (although not prescriptive) criteria for what should be in the plan; CIP-013 R1.1 doesn’t.

However, I’m exaggerating when I imply that R1.1 provides no help at all to the entity as they develop their plan; there is some information in there, and it is enough to get you started in writing your plan. This post will list the information that I have found in R1.1. It doesn't provide anything near a workable guide to developing a CIP-013 plan, but it at least is a start.

Here’s the full text of R1.1:

(The plan(s) shall include)…one or more process(es) used in planning for the procurement of BES Cyber Systems to identify and assess cyber security risk(s) to the Bulk Electric System from vendor products or services resulting from: (i) procuring and installing vendor equipment and software; and (ii) transitions from one vendor(s) to another vendor(s).

And here’s what I get out of this all-too-brief text:

The last time I took a close look at R1.1 in this blog, I decided there was no significance to be assigned to the fact that it begins by mandating “..processes used in planning..” That seems to be the Department of Redundancy Department at work. You are developing a plan for supply chain cyber security risk management. You aren’t developing a plan for “processes used in planning supply chain cyber security risk management”. So ignore these words.
The plan is to “identify and assess cyber risks to the BES from vendor products and services…” You may notice that there’s nothing about mitigating those risks. I assume this is just an oversight. Of course, being required to identify and assess risks, but not having to do anything to mitigate those risks, wouldn’t make any sense. So you need to read this as a requirement to “identify, assess and mitigate” risks.
But what are these risks you have to mitigate? Aye, there’s the rub. CIP-010 R4 also requires you to develop a risk management plan. In Attachment 1 to that requirement, you find a list of types of risk that you need to mitigate in the plan, as well as high-level suggestions for how to mitigate these risks. What do you find in CIP-013 R1.1? You find the two bullet points (i) and (ii). What are these? Are these risks to be mitigated, too?
No, I call these “risk areas”. They are essentially subdivisions of the overall world of supply chain cyber security. They aren’t risks themselves, so you still need to find risks to address within each one of these areas. But this does at least provide guidance on where to start.
What specifically are the risk areas? Even though there are two bulleted points, there are actually five risk areas. Notice that the two points are preceded by the words “risks…from vendor products or services..” This means you need to consider each of the two bullet points from the points of view of both vendor products and vendor services.
Next, notice that bullet (i) is “procuring and installing vendor equipment and software”. Breaking this up yields “procuring vendor equipment and software” and “installing vendor equipment and software”. Each of these is itself a risk area, but remember that we are supposed to look at these from the points of view of both products (which means hardware and software) and services. So this means we have to add “procuring vendor services” and “installing vendor services” to the list of risk areas.
Of course, you don’t “install” services! But you do utilize them. So I reword the second of these as “utilizing vendor services”.
As for bullet point (ii), it’s already sufficiently general that we don’t need to list separate risk areas for products and services.

So here’s my list of risk areas that need to be addressed in your CIP-013 plan (and these are enforceable, since they are explicitly stated in the requirement, even if they’re a little hard to see initially):

a) Procuring vendor equipment and software;

b) Installing vendor equipment and software;

c) Procuring vendor services;

d) Utilizing vendor services; and

e) Transitions between vendors.
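The derivation of these five areas can be sketched in a few lines of Python. This is only an illustration of the reasoning above; the names `SOURCES`, `ACTIVITIES` and `risk_areas` are hypothetical, and nothing in CIP-013 defines such a structure.

```python
# Hypothetical names for illustration; the requirement defines no such structure.
SOURCES = ("equipment and software", "services")  # vendor products vs. vendor services
ACTIVITIES = ("Procuring", "Installing")          # the two halves of bullet (i)

def risk_areas() -> list[str]:
    areas = []
    # Split bullet (i) into its two activities and apply each one
    # to vendor products and to vendor services.
    for source in SOURCES:
        for activity in ACTIVITIES:
            # Services are used rather than installed, so reword that one case.
            if source == "services" and activity == "Installing":
                activity = "Utilizing"
            areas.append(f"{activity} vendor {source}")
    # Bullet (ii) is general enough to stay a single area.
    areas.append("Transitions between vendors")
    return areas

for label, area in zip("abcde", risk_areas()):
    print(f"{label}) {area}")
```

Splitting bullet (i) two ways and crossing it with products and services gives four areas; bullet (ii) adds the fifth.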

Now you know at least where to begin as you develop your plan. Your plan needs to address each of these five risk areas. From there, you need to find important risks to mitigate in each area. There’s more information for your plan, to be gleaned from R1.2. I’ll discuss that in another post soon.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com. Please keep in mind that if you’re a NERC entity, Tom Alrich LLC can help you with NERC CIP issues or challenges like what is discussed in this post – especially on compliance with CIP-013. And if you’re a security vendor to the power industry, TALLC can help you by developing marketing materials, delivering webinars, etc. To discuss any of this, you can email me at the same address.

Wednesday, September 12, 2018

500,000!

Today this blog passed 500,000 “pageviews” (meaning someone went to a page in the blog, even though they might have read more than one post), since its inception at the end of January 2013. This doesn’t include people who read the posts from the email feed – and as of today, there are 639 subscribers to the feed (by the way, I have no idea who they are).

Of course, 500,000 hits is probably a slow Thursday morning for one of the Kardashians’ blogs, but I’ll take it. I honestly thought a few years ago that, once NERC CIP version 5 came into effect, the CIP news would kind of die down and I’d start having trouble finding something to write about. Fortunately for me but unfortunately for you (if you work in CIP compliance for a NERC entity), there has continued to be lots of controversy and confusion, as well as new standards developed and approved – and when things were a little slow this summer, the Russians stepped in to save the day[i]! I guess I was just born lucky.

So I suspect it might be a little while before I run out of things to talk about. I once thought I might focus on reviews of online cat videos – I guess I’ll always have that as a fallback option.

[i] And when the story of the attacks themselves was flagging, DHS' confused and shifting portrayals of the attacks took up the slack very nicely. Verily, my cup runneth over.

Friday, September 7, 2018

Two comments on yesterday’s post

I received two very good comments on yesterday’s post. In neither case do I think it’s warranted to change the post, but I want to get both of these in the open for the sake of openness (we believe in transparency here at Tom Alrich’s Blog!).

First Comment

In yesterday’s post, I was making the point that DHS had clearly implied a number of times that the Russian cyber attackers had compromised control centers of US electric utilities. The examples I used were four quotations from two articles by a Wall Street Journal reporter who had attended the first of DHS’ four briefings on this matter. In both of these articles, she wrote about conversations on the Russian attacks that she had with DHS staff members (as well as staff from other government agencies, such as DoD), both before and after the briefings.

The third of the four quotations was this one:

“Here’s a real smoking gun. Later in the same article, you find this quote ‘In March, Homeland Security and the FBI pinned responsibility on…Energetic Bear, for intrusions into utilities that gave attackers remote access to critical industrial-control systems, called SCADA.’ SCADA is found in utility control centers, not plant control rooms. Again, this isn’t a direct quote from DHS, but I'm also sure the reporter didn’t dream that they said this.”

This morning I received this email from Kevin Perry, recently retired CIP compliance auditor from the SPP Regional Entity:

“SCADA is not limited to control centers. SCADA is Supervisory Control and Data Acquisition and is any system that performs those functions. In the control center, SCADA is often combined with network applications like state estimation and contingency analysis to be an Energy Management System (SCADA/EMS). The plant control system at a generating plant is a SCADA system. As is the process control system at a paper mill and other automated manufacturing facilities.”

By the way, I wish to take this occasion to wish Kevin a happy retirement (although it sounds like it might be anything but retirement, as is often the case in this industry). He has been a real thought leader on NERC CIP (in fact, I would say the thought leader, although his position as an auditor limited what he could say publicly). He taught me almost everything I know about the intricacies of CIP version 5 (although he and I have a couple of long-standing differences of opinion on that version – which of course is the foundation for all the current CIP standards, even though some of them are now on a higher version number. It’s unlikely that either of us will move to the other’s side on these issues, although we can always have a civil conversation about them). He was vice chair of the NERC CSO706 drafting team during its first year, when they drafted CIP versions 2 and 3 (the team went on to draft v4 and v5). He was also chairman of the NERC CIPC and a member of the team that drafted Urgent Action 1200, the predecessor to CIP. I believe he – until his retirement before Labor Day – was one of perhaps two people in the ERO Enterprise (i.e. NERC and the Regional Entities) who were most knowledgeable about CIP.

In his statement, Kevin is saying that it isn’t necessarily true that DHS was referring to control centers when they said SCADA systems had been penetrated, since generating plants are controlled by SCADA systems. I agree it’s technically true that generating plant control systems are SCADA, and it’s also true that, in all other industries, the systems that run a plant are called SCADA. But in the power industry, systems that run generating plants are called distributed control systems (DCS). I’ve never once heard the term SCADA used in reference to a generating plant.

However, I think the sentence from the WSJ article that appears right after the one I quoted makes it quite clear that the person who said this (from DHS or the FBI) had utility control centers in mind. It reads “These systems govern power flows and keep electricity supplies balanced with demand and thus prevent blackouts.” This could only refer to the control center of an electric utility.

Second Comment

The second comment was posted on yesterday’s post itself by “JasonR”. He commented:

“‘HMI screen shot showing a diagram of a gas combustion turbine’ - this evidence alone doesn't mean a Control Center was compromised. Almost all entities have read-only stations connected to a server which has a read-only historical feed from Production (typically via a data diode). Often times, the same exact "client" interface is used, and other than a lack of control access, it appears identical. Further, both the gas turbine HMI and wind farm could be monitored by a single entity with views into each system as I described, and no compromise on any control networks. This all could be just one CxO's laptop that was hacked who had read-only access to view both.”

JasonR is obviously a very technically savvy guy. For those like me who don’t quite fit that description, let me translate this. He’s saying two things. The first is that, just because the attackers obtained a screen shot of a Human-Machine Interface (HMI) screen and the HMI should always be on the control systems network[i], it doesn’t mean the attackers actually penetrated the control network. This is because there are various technologies (the most common being a “data diode”) that allow secure one-way transfer of data (like HMI screens) from the control network to the IT network. So the attackers could have viewed the HMI screen just by attacking the IT network, which is much easier than attacking the control network.

The implication of what Jason says is that there wasn’t actually any penetration of the control network at the small combustion turbine unit that was depicted in the HMI screen that DHS displayed during the web briefings on the Russian cyber attacks. And the implication of this statement is that I was wrong in asserting “This means that either a) Christopher Krebs, the person who said that only one facility - and at that facility only two wind turbines - was compromised was wrong; or b) Leslie Fulop, the earlier spokesperson who said that a single plant was compromised, was wrong.” In fact, if it turns out a CT plant’s control network wasn’t penetrated (because, as Jason implies, the Russians only accessed the IT network), then neither Christopher nor Leslie was wrong – rather, I was, for which I apologize if this is true.

However, I don’t think I was wrong. This is because Leslie Fulop emphasized that the asset that was penetrated was a very small generating plant, whose loss wouldn’t affect the grid at all. If it was a very small CT plant that was attacked, it’s unlikely the plant would have put in place a data diode (which isn’t cheap) to safely transfer data from the control to the IT network. What’s much more likely is that, in this small plant, there is no distinction at all between the IT and control networks – meaning that penetrating the IT network is the same as penetrating the control network. So the Russians had access to the control systems, no matter which “network” they thought they were attacking.

Jason’s second point is that it’s possible that only one generating entity was attacked, but it controlled both a wind farm and a small CT plant. There could have been a manager who receives production data from both assets on his or her laptop. As in Jason’s first point, this would be a safe practice if the production data were transferred securely, for example with a data diode. Again, the Russians could have penetrated the laptop without having to penetrate the control network. In this case, the control network would not have been penetrated at either the wind farm or the CT plant. Once again, if this were true both Christopher and Leslie would be right, and I would be wrong.

However, I find this scenario very hard to believe. For one thing, I doubt there are too many generators that have both a small wind farm and a small CT plant (it’s kind of like finding a very small company that operates both a bakery and a quick lube franchise. Not much synergy there). Almost all of the time, it will be one or the other. More importantly, if a manager is receiving real-time production data on his or her laptop, it must mean that it’s read-only data, meaning the manager doesn’t have any control of the power generation process. So it doesn’t matter that the Russians penetrated his or her laptop – they’re never going to be able to affect either the wind turbines or the CT plant! But in that case, what was the point of these briefings, if the Russians never once obtained the ability to make any impact on the US grid at all?

[i] In the first Ukraine attacks, the attackers entered via the IT network, but they were able to get into the control network (and thus to the HMIs) because the VPN connection didn't use multi-factor authentication – a definite violation of good ICS security practice and NERC CIP! Since the attackers had been rooting around the IT network for months, they had some engineer's credentials, and used those to get into the control network. This is how they were so easily able to trip circuit breakers to cause outages.

Thursday, September 6, 2018

What if it really was control centers?

As I described in my last post, until last Friday I believed two things: a) that despite whatever misleading statements were made in the DHS briefings on the Russian attacks (and duly broadcast in the press), the Russian attacks didn't penetrate any utility control center (i.e. EMS) networks; and b) that while it was unfortunate that the two walk backs by DHS were different (one said that only a single plant was compromised, but the second said that just two wind turbines were compromised), it was certainly conceivable that the first one was mistaken and the second was simply a correction of it.

However, last Friday a longtime industry observer pointed out to me that the single piece of concrete evidence of compromise of a control system network that was presented in the DHS briefings was an HMI screen shot showing a diagram of a gas combustion turbine, which had been taken by the attackers and uploaded. It seems that both a gas CT plant and a wind farm were compromised. This means that either a) Christopher Krebs, the person who said that only one facility - and at that facility only two wind turbines - was compromised was wrong; or b) Leslie Fulop, the earlier spokesperson who said that a single plant was compromised, was wrong.

However, I still believe that this confusion was just that – not intentional. But for me to continue to believe this much longer, I (and presumably others) need DHS to say, once and for all, exactly how many control system networks were compromised and what kinds of assets they were associated with (gas CT plants, wind farms, coal plants, nuclear plants - God forbid! - or anything else). Since this will be DHS’ fourth story, I sure hope this doesn’t have holes as well.

But now I want to ask, what if a) is wrong above? That is, what if there were actually one or more utility control centers penetrated – meaning the actual OT network, not the IT network that’s physically contained in the control center? Why am I asking this? Does it mean I've begun to think that really happened? No it doesn’t, mainly because the implications of that, if true, would range from serious to truly horrific. But there are obviously a lot of people outside of the utility industry (including in the technology press) who are quite ready to believe this, which is why the story rocketed around the world a month ago that the Russians had penetrated “hundreds” of US utilities and were poised to throw the entire US into darkness at a single word from Mr. P. I contend these people wouldn’t hold this view so easily if they knew the real implications – which I’ll outline shortly.

But why do these people believe the story? Of course, it’s because it came from major press reports. And where did those reports come from? Was it just the fertile imaginations of some reporters? Unfortunately not. There were a number of statements from DHS during the briefings that a reasonable person would assume meant that multiple utility control centers had been compromised. Here are a few of them:

Jonathan Homer of DHS said in the first briefing that “They got to the point where they could have thrown switches” and disrupted power flows[i]. Of course, we all know that he was talking about software “switches”, but by talking about disrupting power flows he could only be referring to software running in utility control centers, not control rooms of individual generating plants. An attacker that penetrated a plant control room wouldn’t be able to do anything more than shut down the plant. And since DHS has admitted that the “single plant” that was compromised was very small and couldn’t affect the grid if it was lost, there is simply no way that this is what Mr. Homer could have meant when he talked about disrupting power flows (assuming the walk back is correct).
The second WSJ article on this topic, dated August 7 (which I wrote about in this post), starts by saying “Top administration officials are..(discussing striking back)..to deter attacks such as the successful penetration of U.S. utilities by Russian agents last year.” One paragraph later, the article says “Hackers..claimed ‘hundreds of victims’ in a campaign against the energy sector that ultimately put them inside the control rooms of U.S. electric utilities where they could have caused blackouts..” Again, you can't cause a blackout by shutting down a single small generating plant (and you usually can't cause one by shutting down a big plant, given the redundancy built into the grid). Admittedly, neither of these is a direct quote from someone at DHS, but I sincerely doubt this reporter was just making the stuff up.
Here’s a real smoking gun. Later in the same article, you find this quote “In March, Homeland Security and the FBI pinned responsibility on…Energetic Bear, for intrusions into utilities that gave attackers remote access to critical industrial-control systems, called SCADA.” SCADA is found in utility control centers, not plant control rooms. Again, this isn’t a direct quote from DHS, but I'm also sure the reporter didn’t dream that they said this.
In the next paragraph, the article continues “In April, Russian hackers were using…internet routers..as another way to…maintain a hidden presence in control networks…” This is of course a different Russian attack campaign (those guys have been really busy!) that attacked routers. I hadn’t heard that this resulted in penetration of control networks at utilities, but they’re saying it did.

What’s most disturbing about this is not so much that these statements were reported, but that DHS has done nothing to set the record straight, except for the two walk backs which themselves need to be walked back. So when they do their third walk back, I would also like them to explicitly state whether they know of any penetration of U.S. utility control networks (meaning EMS or transmission SCADA) by the Russians, Chinese, Nigerians[ii], Maldive Islanders…whoever.

Nevertheless, I continue to believe that no utility control centers were penetrated. Why do I say this? There are two reasons. The first is that there would definitely have been a lot of very noticeable activity if that had happened. When the Ukraine attacks happened, there was a big inquiry and there were lots of briefings, both classified (initially) and unclassified; for an attack in the US, there would have been much more than this. I have heard nothing about any of these things happening.

The second reason is that, if it’s true that just a single small EMS system was penetrated, the two DHS people who said that only a small plant or two wind turbines were compromised should obviously be immediately fired, both because they lied and because they presumably inhibited DHS from taking the needed steps to notify the industry and the public of the danger. And if - let’s say - even one large utility control center (controlling a major urban area) was compromised, not only should these people be fired, but there should be a full investigation of how that happened and whether higher-ups were involved. If we’re really talking about a city being threatened with a major blackout (which will very likely result in deaths, especially if it’s more than a few hours) due to deliberate actions or inaction by people at DHS, we are now talking about treason, not just dereliction of duty.

And this is why I don’t believe any utility control centers were compromised, despite all the DHS statements implying (or stating) otherwise.

However, we seem to be forgetting something very important: If the Russians have compromised one or more major utility control networks and could be poised to cause a major outage (as most of the news stories on the attacks indicated), this constitutes a true national emergency. I am mystified that this hasn’t been done already, but somebody needs to get on the red phone to Mr. P and tell him very clearly that he needs to immediately cease all cyber attacks against US critical infrastructure (which now includes voting systems, of course), and make sure any malware that has been planted has been removed. He will have 48 hours to get this done, at which point a set of rapidly escalating sanctions will be put in place.

This time the sanctions won’t just consist of putting even more financial pressure on some of his oligarch cronies or exposing to all the world where he’s stashed the approximately $35 billion he’s reported to have amassed for himself (all on his modest salary, I’m sure). One of the progression points might be banning Russian aircraft from all airspace worldwide (of course, this would require coordination with our allies, which seems to be a lost art in Washington these days. Hopefully, somebody still remembers how to coordinate with, rather than bash, our allies) until full compensation is paid to the families of all victims of the shooting down of Malaysia Airlines Flight 17 in 2014, as well as to their governments for the direct expenses and general grief their countries have suffered because of this event.

P.S.

Someone suggested to me that the fact that we hadn’t come down harder on the Russians so far for their vigorous attempts to penetrate utility control networks was because we have been doing the same with their utility control networks, and – unlike the Russians – we may have actually succeeded in penetrating them. In other words, everyone does it, so we’d be self-righteous to make a big deal about it – especially since the Russians didn’t succeed.

Here’s a story about another national emergency: the Cuban missile crisis of 1962, when President Kennedy found out the Soviets had installed nuclear-armed missiles in Cuba, aimed of course at the US. Khrushchev had installed them in response to a) the US’ attempted invasion of Cuba at the Bay of Pigs in 1961, and b) NATO’s recent installation of Jupiter nuclear missiles in Italy and Turkey aimed at, naturally, Russia.

Kennedy didn’t tell people to calm down, since the Soviets were quite justified in being a little upset about these two events. Rather, he did what was necessary to defend the US and blockaded Cuba until the Russians withdrew the missiles. In the process, the world came closer to a full nuclear war than it ever has (and hopefully ever will) - in fact, a Russian sub almost set off World War III all by itself during the crisis, and it was only the action of one Russian commander that literally averted Armageddon.

So if in fact the US has penetrated Russian utility networks, great! We all know that the US isn’t going to launch a cyber “first strike” on a foreign power grid. And we also know that Russia has already launched one of these in the Ukraine. Let’s not wait around for them to launch one here.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC.

[i] This was quoted in the Wall Street Journal article of July 23. The quotation marks didn’t include the phrase ‘and disrupted power flows’. But I seriously doubt the reporter just inserted that phrase on her own – she was probably paraphrasing Mr. Homer.

[ii] Perhaps the Nigerian prince who has been emailing me to send him money has given up trying to get blood from a stone and has turned to hacking SCADA for a living.

Tuesday, September 4, 2018

What is going on at DHS, anyway?

I have closed a couple recent posts by saying something to the effect of “I wish this were the last time I had to write about what DHS did or didn’t do regarding the story of the Russian cyberattacks on the US power grid, but I fear it may not be”. I have good news and bad news regarding this sentence. The good news is that I don’t plan to have to use it any more in the near future. The bad news is that I don’t need to use it, because there’s no longer any doubt that I’ll be writing more posts on this subject – the story has a long way to go before it’s put to bed.

To be honest, up until last Friday, I thought I had this pretty well figured out. Here’s the timeline:

On Monday July 23, DHS gave a briefing on the Russian cyberattacks. The next day, a number of articles appeared, all saying that the Russians had penetrated “control rooms” for the grid – and usually hundreds of those. All the articles assumed that it was likely that malware had been planted that would allow the Russians to cause outages. As Jonathan Homer of DHS said in a quote in the Wall Street Journal’s article on the 24th, “‘They got to the point where they could have thrown switches’ and disrupted power flows…” Not to be outdone, Michael Carpenter of DoD was quoted in the same article as saying “They’ve been intruding into our networks and are positioning themselves for a limited or widespread attack…They are waging a covert war on the West”.
These two statements, and others, had only one interpretation for people knowledgeable about the electric power industry: Control centers (not “control rooms”, the term DHS erroneously used repeatedly) of US utilities had been penetrated by the Russians, and were probably at that minute harboring malware which could be activated at any point by the Russians to cause widespread outages or worse (a much-larger-scale version of the attacks that happened in the Ukraine).[i]
DHS repeated the briefing on Wednesday, July 25. I was able to attend this one, and I thought the overall tenor of this briefing was little different from what had been reported about the Monday briefing. I wrote my first post about this subject that day. In that post, I pointed out that obviously the biggest lesson of these attacks was that supply chain security should be the primary cyber concern of all utilities (this remains my position). However, beyond that, I thought there was too much missing information to be able to draw more conclusions. Finally, I proceeded to violate my own statement by drawing the conclusion that it was very likely that just generation was penetrated, and I guessed it was under 25 generation facilities.
For me, the most compelling part of that briefing was when the presenters displayed a screen shot of an HMI display[ii], which they said had been uploaded from the target system by the Russians. Of course, the only way the Russians could have taken the screen shot was by penetrating the control system itself – where they could presumably have planted malware. I didn’t pay close attention to the screen shot since the presentation was moving on, but a longtime industry observer emailed me later to point out that the screen shot was of a display of a combustion turbine in a natural gas power plant. Keep this in mind; we’ll come back to it later.
On Thursday the 26th, that same industry observer sent me a link to an article on Power Magazine’s web site that quoted spokesperson Leslie Fulop of DHS saying “While hundreds of energy and non-energy companies were targeted, the incident where they gained access to the industrial control system was a very small generation asset that would not have had any impact on the larger grid if taken offline.” Hold this thought too; it is also important.
That day, I wrote a pretty indignant post, pointing out the ways this statement contradicted various statements that had been made in the previous day’s briefing. My last statement was addressed to DHS: “…I can’t understand why you would want to pretend that a lot of assets had been penetrated, when it was only one small one. By doing so, you raised this threat from one that all power industry asset owners should be aware of and should be taking steps to prevent, to something approaching an imminent threat to our national security. And it just isn’t that.”
On Saturday July 28, I put up a post that quoted from the New York Times article on this story, published the day before, to the effect that “This week, the Department of Homeland Security reported that over the last year, Russia’s military intelligence agency had infiltrated the control rooms of power plants across the United States. In theory, that could enable it to take control of parts of the grid by remote control.” Despite the big contradiction between these two sentences (if power plants were infiltrated, this wouldn’t enable “taking control of parts of the grid”), the most amazing aspect of this quote was that it was talking about “…power plants across the United States”. It’s hard to reconcile this with Leslie Fulop’s statement. Of course, I know of no effort by DHS to correct the Times story, any more than the other stories.
The following Monday July 30, I wrote a post pointing to two slides (sent to me by a former colleague) that were shown at the briefing on the 25th and that directly contradicted the idea that only one small generator was penetrated. One of the slides said the Russians “Leveraged early victim to gain entry to two previously accessed utilities and one new victim”. This definitely says at least two “utilities” were “accessed”, and perhaps four. Other statements led to the conclusion that at least three “utilities” were accessed. I concluded the post by saying “…even though the DHS people who put together the briefings (and didn’t provide any immediate corrections when the alarming news stories started flying) were only trying to call attention to a problem, by exaggerating what had happened they have damaged their credibility for future advisories.”
On Wednesday August 1 (i.e. a week later), DHS conducted a high-level briefing for utility CEOs in New York City, attended by no less than the US Vice President, the Secretary of Energy and the Secretary of Homeland Security. It was reported by the indefatigable Blake Sobczak of E&E News, although he chose the (in hindsight) unfortunate title of “Grid leaders clear the air around Russian hacking”. In the article, he quoted Christopher Krebs, undersecretary for DHS's National Protection and Programs Directorate. Mr. Krebs started by saying (in what probably qualifies as the Understatement of the Year) “In the initial webinar, I think there was some context that was lacking…” He went on to say that the Russians had taken control of "a renewable source of energy that would not disrupt the grid." This was later clarified to mean two wind turbines.
A week later on August 8, I put out what would be my last post on this subject for three weeks. In the post, I strongly suggested that DHS needed to finally step up and try to take control of this story, hopefully by having a press conference to say that, while a few individuals at DHS may have exaggerated what the Russians had achieved, this was intended to be for the good purpose of alerting the power industry to the serious situation they face (and there’s no dispute anywhere that I know of – although I haven’t talked with Vladimir Putin lately – that this is very serious, and the utilities need to step up their cyber defense efforts, especially in supply chain security). Despite my earnest appeal (or perhaps because of it), DHS has yet to take me up on the suggestion.
My next post on this subject was three weeks later on Thursday August 30, when I reported on a new link the same industry observer had sent to me, describing Senator Markey’s announcement that he was sending queries to 14 utilities and four agencies asking what measures they were taking to combat the Russian threats. The announcement repeated the same erroneous ideas that had been promulgated in the press earlier. I very helpfully wrote out for DHS how they could respond to the Senator – and, although I didn’t mention it in the post, this could form the basis for a statement that they would provide to the entire US population (or at least those that are paying attention to issues like this in the last days of summer). Senator Markey hasn’t called me to say that DHS took up my suggestion, so I assume that once again they’ve ignored my advice (don’t worry, DHS. I’ve given lots of advice to NERC and FERC that they’ve ignored as well. You’re in good company).
Please note that, as of that post last Thursday, I still believed that the whole problem was due to a few DHS employees who got carried away (with good intentions, of course) and greatly exaggerated what the Russians had accomplished; moreover, the press took what they said as gospel and ran with it (which the press tends to do, of course. If you don’t want them to write something, don’t say it! And don’t be cute and try to imply something that you know isn’t true, without saying it directly!). Some of the higher-ups at DHS had tried to correct the record, but their efforts had been very limited and hadn’t gone very far, to the extent that three weeks later a US Senator still didn’t know that the news stories had been walked back.
Last Friday, the plot thickened with two events. One was a long phone conversation I had with a party I won’t identify, in which that party suggested two things that really floored me: First, the people at DHS who made the statements at the briefings might have really meant what they were trying to say (notwithstanding the fact that they confused control centers with control rooms): That the Russians had actually penetrated more than one utility control center, where they actually could control the flow of power on the grid itself. That meant to me that they had penetrated the Energy Management System (EMS) which forms the core of the mission of most utilities, and really does allow them to control power flows in a certain domain.
The second thing this person said was that there may be two warring camps at DHS. One camp is the people putting out the story that the Russians penetrated utility control centers. The other camp is the people who are trying to walk this story back by saying that only one small generation asset was penetrated. This is why the two contradictory stories have both continued in a kind of state of quantum superposition, just like an electron can be in two positions at the same time. While I was quite surprised to hear this, I continued to believe in the scenario I outlined in paragraph 12.[iii]
The second event last Friday – and this did cause me to question the position I held on Thursday – was when the same longtime industry observer emailed me to point out a contradiction he’d seen: As I pointed out in paragraph 4, the HMI screen shot shown in the webinar was of a combustion turbine (CT), and it could only have been obtained by the Russians penetrating the control systems that controlled that turbine – and by implication the plant itself.[iv] So the screen shot was taken when the Russians penetrated a gas CT plant.

This person had previously pointed this out to me. But now he said something I hadn’t thought of:

a) As described in paragraph 5, Leslie Fulop of DHS said that in fact only one small generating plant was compromised. Since this was right after the second DHS briefing where the HMI screen shot of the CT had been shown, it would be logical to conclude that the small plant she was referring to was a gas CT plant.

b) Yet a week later in New York, Christopher Krebs of DHS said only two wind turbines had been compromised. Since an HMI screen shot of those turbines would have looked completely different from what was shown in the briefing, there is no possibility that the single small plant referred to by Ms. Fulop was the wind farm referred to by Mr. Krebs (and if it had really just been two wind turbines that were compromised, Ms. Fulop would certainly have said that, not used the words she did).

So it appears that either Ms. Fulop was wrong when she said that one small plant was compromised, or Mr. Krebs was wrong when he said that just two wind turbines were compromised. At the minimum, it seems two plants were compromised – one wind farm and one gas CT plant – unless Ms. Fulop or Mr. Krebs is simply lying. Had these two statements been made the same day or one day apart, one could attribute this discrepancy to the likelihood that DHS people were rushing to walk back the initial story and two different people ended up with two different stories. This isn’t a good thing, but it wouldn’t be unprecedented in the annals of government. But the fact that there was a week (exactly) between the two statements, and that they were still contradictory, is disturbing. And it’s especially disturbing that, more than a month after all this started to happen, DHS hasn’t put out any statement or held any press conference to explain all of these discrepancies.

As I said, these contradictory statements – and the fact that they were separated by a whole week – made me reconsider the position I held last Thursday. While I still believe that the initial statements in the briefings were deliberate exaggerations made with good intentions (and that no utility control centers were penetrated), I am now suspicious of the two walk back attempts. I find it hard to believe that it’s just the normal fog of war that led to these contradictions. In any case, a single definitive statement of what happened – issued by someone presumably above the fray and the factions – could settle this and the other question. I hope that comes soon. This will be the fourth story that DHS has put out about the Russian attacks. I sincerely hope it will be the last!

Of course, as I just said, I continue to believe that no utility control centers were penetrated, even though the DHS briefings strongly implied that they had been. But if they were – even if only one small utility control center in West Texas was penetrated – this raises the seriousness of the situation to a higher level, and puts official DHS actions in a very different light from being mere bumbling. I will discuss this scenario in (hopefully) my next post, coming soon.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC.

[i] For an in-depth attempt to clarify the language that DHS used in their briefings – which was repeated by the various news outlets – and determine what they really meant to say, see this post.

[ii] HMI stands for human-machine interface. HMIs are probably the most important component of any control system, since they gather all the data being put out by the various devices in the system and impose it on a real-time schematic drawing of the system; this is what enables operators to understand what is going on, as well as intervene to make changes.

[iii] To continue my quantum metaphor, I had made my observation that caused the wave function to collapse into one of the two contradictory possibilities, just as an observation of an electron causes its wave function to collapse to one position or the other – or at least that’s what the Copenhagen Interpretation of quantum mechanics says. But then, as Richard Feynman famously said, “Anyone who says they understand quantum mechanics doesn’t understand quantum mechanics.”

[iv] Power plants are often controlled by a Distributed Control System (DCS). This controls the combustion turbines, if they’re present. And the DCS usually resides in a single control room. If systems in the room control more than one plant, then the room is actually a control center, and subject to much more stringent NERC CIP controls, as well as other NERC standards. No generation operator in their right mind would ever describe a true control room as a control center.