Tom Alrich's Blog

Sunday, July 26, 2020

The DoE RFI: No, there’s no monster under your bed, but let’s investigate the one peeking in your window

If you're looking for my pandemic posts, go here.

Tim Conway, Patrick Miller and Jason Christopher posted a draft response to the DoE Request for Information on Friday July 24, on the SANS ICS Community site. They asked for comments on it, so here are mine. I have already written two posts (here and here) on the RFI, addressing four of the questions in the RFI (and since a number of questions have a lot of sub-questions, there are probably 20-30 questions in total in the document). Reading the draft response made me want to discuss the RFI in general, since I haven’t done that yet.

Guys, I think you did an excellent job; you’ve clearly put a huge amount of time into this. But I feel a sense of regret that the three of you – all of whom have much better uses for your time – should have even had to make this effort. Because here’s the problem: If I were to have read the RFI with no knowledge of the Executive Order that’s behind it, my initial response would be “You gotta be kidding!” And I would have dumped it in the bit bucket. The same with the EO itself, which is of course what drove the RFI.

But the problem is that neither the EO nor the RFI can be ignored, simply because the EO itself presents a threat to the industry – almost certainly greater than the close-to-infinitesimal threat that the Chinese will cause a cascading outage on the US grid by implanting malware in a bunch of devices like transformers that aren’t controlled by any sort of processor anyway.

Yet the EO on its surface seems to say that, as of the date it was released (May 1), all purchases of any of the 25-odd types of equipment listed in the EO need to be paused, pending DoE giving guidance on “safe” vendors and “safe” products.

At this time I want to point out, as I have before, that I don’t think for one second that DoE was behind this EO. It came from the White House. I imagine that, when the people at DoE heard they were going to be in charge of implementing this order, they didn’t exactly jump for joy.

So here’s how my reply would read:

Dear DoE:

I will have some strong words below, but my objections are to the Executive Order, not to your attempts to make it into something that might at least lead to some benefits to the US electric grid. The basic problem is that the EO as written will lead to the electric power industry wasting huge amounts of money and time chasing highly unlikely – or even impossible – supply chain threats to grid control systems, while completely ignoring supply chain cybersecurity threats to the grid that have actually been realized and have been identified by multiple government agencies as well as the media – yet haven’t even been investigated.

First, let’s be clear about what constitutes a cybersecurity threat to the grid and what doesn’t:

1. Only devices that are controlled by some sort of built-in control logic (usually a microprocessor, but also FPGAs and perhaps other capabilities) can be subject to a cyberattack – these devices are all in some way “programmable”. Of the 25-odd devices listed in the EO, Kevin Perry and I, in a post in early May, only identified five that either always or sometimes meet this criterion. Since the NERC definition of Cyber Asset is “programmable electronic device”, we can summarize this criterion by saying only Cyber Assets are subject to cyberattack.

2. It’s important to point out that a device like a transformer – whose operation isn’t controlled by anything other than the laws of physics – might sometimes have auxiliary devices included with it that might themselves be programmable. A transformer sometimes has a load tap changer (LTC) associated with it, which does have a microprocessor. The LTC might be installed within the housing of the transformer or, in some cases, outside of it, and it might or might not be obtained from the supplier of the transformer itself. The LTC is a Cyber Asset, but the transformer itself isn’t one. So transformers shouldn’t even be on the list of devices covered by the EO (although LTCs might be added), and neither should any of the other 20-odd devices that aren’t Cyber Assets.

3. And why do you (i.e. DoE) even ask about minerals? They’re obviously not subject to a cyberattack. Why are they in the EO at all?

4. But even being a Cyber Asset shouldn’t mean a device should be a concern. The device has to be capable of impacting the Bulk Electric System if compromised or destroyed. Cyber Assets that meet this higher criterion are called BES Cyber Assets - and the ones that can actually damage the BES if compromised are Medium and High impact BCAs. In our post referenced above, Kevin Perry and I couldn’t identify any devices that meet the definition of BCA, that a) are actually found on the US grid now and b) are sold by entities headquartered in one of the six “foreign adversaries” identified in the RFI.

5. In fact, the only BCAs that we could identify that might even be assembled in one of the six adversary countries are servers and workstations from Dell or HP, which are sometimes assembled in China. But if these included an embedded supply chain attack, it would obviously affect every business in the US that buys HP or Dell computers – there would be no way to target the attack at the power industry.

6. In other words, there are simply no devices that are sold or assembled by any of the six adversary nations, that could be subject to a grid cyberattack of any kind. But there’s a further qualification: The attack would need to cause something more than a local outage, to justify the power industry devoting any sort of significant resources to preventing the attacks (probably the number one cause of local outages is squirrels. If the goal of the EO is preventing local outages, then we need to figure out what can be done about them. There might be a genetic engineering solution…). And that means it would almost have to be a coordinated attack on multiple devices at multiple locations on the grid. But there must be some means of triggering this attack remotely. Since medium and high impact transmission substations and grid Control Centers are already protected to a very high degree by the CIP standards, it would be very difficult to launch such an attack in the first place, except perhaps if there were a satellite transceiver embedded in the device, by whoever implanted the vulnerability itself.

7. But the EO and the RFI also address the problem of vulnerabilities that might be embedded in components of these devices, like chips. Since it’s hard to know what attacks might even come through components, it’s certainly impossible to rule out those attacks now. But component vulnerabilities – as long as they meet the criteria enumerated above – could conceivably be a worthwhile subject of investigation.

8. However, no utility in the US has the capability even to determine provenance of components (since they mostly come through a bewildering array of middlemen and brokers), let alone to subject them to the kinds of analysis required to identify embedded vulnerabilities or backdoors – which requires poring over schematic diagrams (and good luck getting those from the component manufacturer, if you can even identify the manufacturer in the first place) and examining traces with electron microscopes (of course, even if your average utility had a lab with electron microscopes, this analysis also requires pulling apart devices the utility has paid for, which will at a minimum void the warranty and may well make the device inoperable). DoD definitely has this capability, and I suggest that DoE talk with them about setting up a program like that – assuming this is really such a big problem. And if it is, then I suggest you ask for a substantial increase in your budget for next year, since you’ll need it.

9. But the whole idea that vulnerabilities embedded in hardware – at the device level or at the component level – are a serious threat is simply bogus. I have never heard of a supply chain cyberattack on hardware that could meet all the criteria above - in fact, I've only heard of one successful supply chain attack on hardware, period, although there are probably more. And this was on consumer IoT devices.

10. However, there is definitely one serious threat to BES Cyber Assets that comes through the supply chain, that isn’t addressed at all in the EO: That’s the idea of vulnerabilities and backdoors embedded in software. There have been a number of successful supply chain attacks on software in other industries, such as this one against Delta Airlines. So far, there has been no such attack against the power industry, although there was a famous – and successful – attack on Juniper Networks in 2015 (which is suspected to have been carried out by the US government, although I don’t believe the US is on the list of adversaries).

11. Much more so than for hardware components, it would be a legitimate exercise to go over all the software that controls the grid and look for vulnerabilities or backdoors, especially on the level of third-party or open source components of that software (usually third-party libraries used by and included with the software). There’s no question there are a lot of vulnerabilities in almost any software package, but of course the EO isn’t at all concerned with software for some reason, perhaps because China isn't thought of as a software supplier. There hasn’t been any software sold by a foreign adversary that is installed on BES Cyber Assets, since Kaspersky software was removed a couple of years ago, but since just about any software you buy now has loads of third-party and open source components embedded in it, there's no telling what you might find if you got down to that level.

However, there is one foreign adversary that – according to at least five agencies of our own government – has not only launched supply chain cyber attacks against the US grid, but has succeeded, through those attacks, in embedding malware at multiple locations in the grid, probably even in Control Centers. I wrote this post about that adversary in December, and followed it up with this open email to Karen Evans of DoE five days later. Ms. Evans left DoE shortly after that, and of course I’ve never received a response from anybody on this.

My main point here is that the Russians have implanted malware in the US grid entirely through supply chain attacks, of two types:

1. In their July 2018 briefings, DHS stated that at least two hundred vendors to the power industry had been penetrated through their remote access systems – which in very few cases were protected by two-factor authentication. The briefings strongly implied that the Russians had succeeded in penetrating more than one utility through this vector, and planting malware in control systems. By the way, this alone shows that requiring vendors to secure their own remote access systems should be a big concern of NERC entities as they develop their CIP-013 supply chain cybersecurity risk management plans. This isn’t explicitly stated in the requirements (as the risks found in R1.2 are), but it should definitely be identified as a risk to be mitigated in R1.1.

2. In January 2019, the Wall Street Journal published a great article describing how the Russians had conducted (and continued to conduct) an extensive and successful campaign using phishing emails to penetrate vendors to the grid (my post on the article is here. I don’t have a free link to the article itself, but if you drop me an email I’ll copy the text and send it to you); the article quotes Vikram Thakur of Symantec as saying that eight utility Control Centers were penetrated, and malware was planted. This shows that NERC entities should identify phishing as another significant risk to vendors (as well as to the entities themselves, of course) that needs to be mitigated in their CIP 13 program. Yet there has been no investigation of Vikram's statement, of DHS' statements, or of the statements by the FBI and CIA in the Worldwide Threat Assessment of 2019.

Of course, for the specific threat of malware already implanted in the US grid, the only mitigation is to investigate whether in fact the government reports are true, and if so describe the malware and immediately get that information out to US utilities. The fact that this threat has never even been investigated is an unending source of amazement (and disappointment, to be sure) to me.

So let’s be clear: (1) We have at least three government agencies and a major news outlet saying that supply chain attacks by a foreign adversary have succeeded not only in penetrating vendors to the power industry but penetrating grid assets like Control Centers. (2) Yet the White House says that the biggest threat to the US grid is a mainly theoretical type of supply chain attack for which it’s just about impossible to identify any vector that would have any likelihood at all of succeeding. But which of these is the subject of an Executive Order that could well cost US electric utilities tens of millions of dollars to comply with, and more importantly could hold up needed improvements to the US grid for maybe years? If you guessed Door No. 2, you’re right!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com. Are you hot at work – or should be – on getting ready for CIP-013-1 compliance on October 1? Here is my summary of what you need to do between now and then.

Or are you a vendor to the power industry that is wondering what your obligations will be under CIP-013, and how you might meet those obligations? Contact me as well.

Thursday, July 23, 2020

What’s the definition of “vendor”?

Recently, I attended a web conference that included CIP compliance people from NERC entities in one Region and some of the CIP auditors for that Region. The auditors provided a good summary of lessons they’d identified from reviewing CIP-013-1 supply chain cyber security risk management plans ahead of the 10/1 compliance date (and if you haven’t had your plan reviewed by your Region, I strongly suggest you do so, before 10/1 of course).

The auditors pointed out that one thing they’d observed in most plans they reviewed was that they didn’t include a definition of “vendor”, since there is no approved NERC Glossary definition of the term. They initially stated that NERC entities subject to CIP-013-1 should use the definition that is found in the Rationale section of the standard: “The term vendor(s) as used in the standard is limited to those persons, companies, or other organizations with whom the Responsible Entity, or its affiliates, contract with to supply BES Cyber Systems and related services.”

The first question you might have is “What’s the difference between a definition that’s found in the Rationale section of a standard and one that’s found in the NERC Glossary?” The difference is that a Glossary definition is voted and approved by the NERC ballot body, approved by the NERC BoT, and finally by FERC; the Rationale is none of these.

The CIP 13 Standards Drafting Team originally discussed developing a definition of Vendor that they would put through the whole approval process (along with CIP 13 itself). However, they were then worried this might delay the whole project (and they were mindful of FERC’s requirement that the new standard be developed, approved and delivered to them for approval within about a year after they issued Order 829 in July 2016 – which was way too short), so they decided to handle it differently.

They at first inserted the definition into a blue box in the second draft of the standard – there were a couple other items also inserted in blue boxes, as I recall. They supposedly tried to make clear that the blue boxes weren’t part of the standard, but a lot of people didn’t get the message, and thought that they were voting for the definition as part of the standard itself – since after all the blue boxes were included in the Requirements themselves. After an uproar about this, from then on the definition was relegated to the Rationale section, which is very explicitly stated not to be part of the standard – although FERC seemed to have missed that memo when they “endorsed” this definition in Order 850, which approved CIP-013-1 in October 2018.

So like a lot of things in the NERC CIP world, this definition lives a precarious existence between being kinda sorta an official one and something that doesn’t carry any more legal heft than my definition – you might call this definition one of the Living Dead of NERC CIP (there’s a whole host of such zombies walking around).

I often quote Lew Folkerth of the RF Region, who writes the great articles on CIP and BES cybersecurity that appear under the Lighthouse brand in the RF newsletter (I don’t know whether or not Lew has an online store with Lighthouse-branded merchandise. If so, I’ll buy a sweatshirt). In his article from the May/June 2019 newsletter (which you can get by going here, clicking on Standards and Compliance and then Outreach, and finally downloading the PDF labeled 00b. This contains all five of the articles Lew has written about CIP-013 and supply chain, so you get five for the price of one! Such a deal), Lew said (on page 15 of the PDF) “While the term “vendor” is defined in the Rationale section of the Standard, remember that this section is considered to be guidance and is not enforceable.”

In Lew’s July-August 2019 article (next in the PDF), he included this paragraph: “The term “contract” also appears in the definition of “vendor” in the Rationale section of the Standard, but that definition does not appear in the enforceable elements of the Standard. The definition may be useful as guidance, but be cautious about relying on the exact wording. For example, the use of “contract” in the definition appears to restrict the application of CIP-013-1 to only those parties with which the Responsible Entity has a formal contract. This restriction is not supported by the enforceable elements of the Standard, which means you cannot rely on that aspect of the definition.”

I completely agree with this statement. The word “contract” seems to limit vendors to organizations with which yours has a written contract. This ignores several cases where there is no contract.

The most important examples are Cisco and Microsoft, and other behemoths like them. Both of these organizations are very important for supply chain cyber risk management of the BES, but neither one of them directly sells you anything. Raise your hand if your organization buys directly from Cisco without any intermediary organization. How about Microsoft? HP? Dell? I didn’t think I’d see any hands. I can (almost) guarantee you don’t have a contract directly with any of them.

All of these companies – which I call Suppliers – develop software products or manufacture hardware products (or both), but they use dealers (which I call Product Vendors) to take care of delivering the products to you, collecting payment, etc. (many of the Product Vendors also provide services like support, installation and maintenance, but I call them Service Vendors when they do that. Of course, almost all Suppliers are Service Vendors as well, for example when they provide maintenance contracts). You do have a contract with the Product Vendor, but that’s not going to have any terms addressing the Risks that are due to the Supplier – such as a secure development environment, software integrity and authenticity, etc. The Product Vendor can’t sign a contract that binds Cisco or Microsoft to do something.

Another example is an emergency purchase of say a Cisco firewall from Best Buy. Your company doesn’t have a contract with BB, other than the fine print on the back of the receipt. But good luck trying to negotiate those terms, especially when you need the firewall immediately to help restore power. And again, the main sources of supply chain risk are the Suppliers, not the Product Vendors – whose primary risk is that they may ship the product to you insecurely.

Another example is another utility that your utility acquires products or services from. There might be some charge to your utility, but it also might be as part of a broader give and take between the two organizations. The other utility needs to be treated as a Vendor (in my nomenclature), even if no money changes hands when this happens and you don’t have a contract in place with them.
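To make the distinction concrete, here is a minimal Python sketch of the nomenclature above. The class and function names, and the reseller "Example Reseller Inc.", are my own illustrative assumptions, not anything defined by NERC or CIP-013. It simply contrasts the contract-based reading of the Rationale definition with a broader reading based on the role an organization plays in your supply chain.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    """Roles from this post's nomenclature (not NERC Glossary terms)."""
    SUPPLIER = "supplier"              # develops software or manufactures hardware
    PRODUCT_VENDOR = "product vendor"  # delivers the product and collects payment
    SERVICE_VENDOR = "service vendor"  # provides support, installation, maintenance


@dataclass(frozen=True)
class Organization:
    name: str
    roles: frozenset      # set of Role values the organization plays for you
    has_contract: bool    # True only if you hold a formal contract with it


def is_vendor_by_contract(org):
    """Narrow reading: only parties you have a formal contract with."""
    return org.has_contract


def is_vendor_by_role(org):
    """Broader reading: any party playing a supply chain role, contract or not."""
    return len(org.roles) > 0


# Illustrative entries only; the contract flags are assumptions for the example.
orgs = [
    Organization("Cisco", frozenset({Role.SUPPLIER}), has_contract=False),
    Organization("Example Reseller Inc.", frozenset({Role.PRODUCT_VENDOR}), has_contract=True),
    Organization("Neighboring utility", frozenset({Role.SERVICE_VENDOR}), has_contract=False),
]

# Organizations that the contract-based reading would leave out of scope.
missed = [o.name for o in orgs if is_vendor_by_role(o) and not is_vendor_by_contract(o)]
print(missed)
```

Under these assumptions, the contract-based reading drops both the Supplier and the neighboring utility, which are exactly the cases described above.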

I pointed this out during the call with the auditors, and another auditor answered that the Region wasn’t going to require entities to follow the definition of vendor found in the Rationale section. So this leaves you in the same position as you’re left by:

1. The lack of a definition of Procurement in CIP-013;

2. The lack of a definition of “programmable” in the Cyber Asset definition (the most fundamental definition in NERC CIP, BTW);

3. The lack of a definition of “affect the BES” in the definition of BES Cyber Asset; and

4. Other logical gaps in the CIP standards and definitions.

In all of these cases, and certainly with the term vendor, you need to document in your program (in this case, in the CIP 13 R1 plan itself) how you are interpreting the word. More importantly, you need to follow that definition consistently, as long as you’re following that version of your plan (you can change your CIP 13 plan at any time, though, as long as you follow a consistent process for doing that - which itself needs to be in the plan).

So if three years from now you get audited and the auditor disagrees with the definition of vendor that you used, you just need to say “There is no official definition, so we used one that, based on our reading about CIP-013 and supply chain security in general, seemed to make the most sense.” That’s all you need to say.

Sunday, July 19, 2020

The DoE RFI, part II

This is the second part of my post describing my response to the DoE RFI, which I will send in to DoE soon. The first part covered my response to question A-3 and also reproduced my extended quotation in an article in E&E News on the subject of the RFI. This part will discuss my responses to three of the other questions (I decided I didn’t have anything to say about the remaining four questions).

Question A-4 actually consists of about nine separate questions:

What information is available concerning the following: BPS electric equipment cyber vulnerability testing standards, analyses of vulnerabilities, and information on compromises of BPS electric equipment over the last five years, including results of independent BPS electric equipment testing and penetration testing of enterprise systems for vulnerabilities (including methodology for discovery and remediation)?

a. What process does the energy sector have to share information with utilities regarding vulnerabilities and vice versa? Are contingency plans in place?

How is the effectiveness of vulnerability testing and mitigation efforts monitored, tracked, and audited?

b. Is a record of an analysis of component vulnerabilities and any compromises of components and systems maintained for a specific period of time (e.g., five years)? If yes, are the results of independent component testing and penetration testing of enterprise systems for vulnerabilities (including timeline for discovery and remediation) also maintained?

c. How are the results of independent component testing and penetration testing of enterprise systems for vulnerabilities (including timeline for discovery and remediation) maintained?

d. How are vulnerabilities identified by external entities addressed? How is the distribution of information regarding patching security vulnerabilities in the supply chain facilitated?

e. What insecure by design/vulnerable communication protocols exist today that should be retired or cannot be disabled or mitigated from BPS electric equipment (examples of protocols include Distributed Network Protocol 3 [DNP3], File Transfer Protocol [FTP], Telnet, or Modbus)?

My basic answer to these questions is simple: Other than item a) (where the E-ISAC does a great job of letting the industry know about vulnerabilities and mitigations for those), there is very little public information available about any of these items. If DoE thinks all of these things are important to test, they will need to take the lead on getting it done. But the first thing DoE needs to do is focus their efforts on real, vs. imagined threats:

1. Any supposed threat to a piece of equipment that isn’t controlled by a microprocessor (or other logic circuits) doesn’t need to be tested, since only devices with such control logic meet the NERC definition of Cyber Asset. In this post, Kevin Perry and I pointed out something that should have been obvious to whoever wrote the EO: A cyberattack can only affect a Cyber Asset. Only about five of the 25-odd devices listed as targets of the EO are Cyber Assets.

2. But as Kevin and I went on to point out in that post, what’s really important is that the device is performing a function that can impact the Bulk Electric System. These devices are defined by NERC as BES Cyber Assets. Kevin and I couldn’t identify any BES Cyber Assets that are sold by Chinese companies (and certainly none sold by companies headquartered in any of the five other countries on the list of “foreign adversaries” in the RFI).

3. However, it’s beyond doubt that many of the components – especially chips – found in just about any BES Cyber Asset have some sort of Chinese connection. It’s unlikely that any of these could be used to cause damage to the BES, or even to steal important information and relay it back to China (given that all BES Control Centers and substations are required by the NERC CIP standards to prevent any outgoing communications traffic that doesn’t fulfill a known purpose). But if anything listed in the EO is worth investigating, the threat of compromised components is.

However, no electric utility of any size has the resources available to do the testing that’s needed to be assured that components are benign, and no government agency has stepped up to take the lead in doing this for the industry (I’m sure DoD does lots of this testing now, but I don’t know if they’re specifically testing power industry device components). It would be great if DoE could step up and do that.
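The device-level screening in items 1 and 2 above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Device` fields, the placeholder country names, and the sample devices (including the "hypothetical relay") are invented for the example, and the flags assigned to each device are not findings. It also says nothing about the component-level question in item 3, which needs lab analysis rather than a checklist.

```python
from dataclasses import dataclass

# Placeholder names; the RFI lists six "foreign adversaries", not reproduced here.
ADVERSARY_COUNTRIES = frozenset({"Country A", "Country B"})


@dataclass(frozen=True)
class Device:
    name: str
    programmable: bool    # Cyber Asset test: is it a programmable electronic device?
    can_impact_bes: bool  # BES Cyber Asset test: can compromise impact the BES?
    vendor_hq: str        # where the selling entity is headquartered


def worth_investigating(device, adversaries=ADVERSARY_COUNTRIES):
    """Apply the tests in order; a device must pass all of them."""
    if not device.programmable:
        return False      # not a Cyber Asset, so not subject to cyberattack
    if not device.can_impact_bes:
        return False      # a Cyber Asset, but not a BES Cyber Asset
    return device.vendor_hq in adversaries


# Sample data invented to exercise the filter.
devices = [
    Device("transformer", programmable=False, can_impact_bes=True, vendor_hq="Country A"),
    Device("load tap changer", programmable=True, can_impact_bes=True, vendor_hq="Country Z"),
    Device("hypothetical relay", programmable=True, can_impact_bes=True, vendor_hq="Country A"),
]

flagged = [d.name for d in devices if worth_investigating(d)]
print(flagged)
```

The transformer drops out at the first test no matter who sold it, which is the whole point of item 1.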

Question A-5 reads “What governance of sub-tier vendors do energy sector asset owners and/or vendors have in place? Is contract language for Supply Chain Security included in procurement contracts? Are metrics for supply chain security, along with cost, schedule, and performance maintained? What specific guidance should be developed for Integrator/Installer/Maintenance Service provider activities?”

I summarize this as “What steps does the power industry take to assure itself that “third party” (also sometimes called “fourth party”) suppliers of software and hardware components of BES Cyber Systems, as well as subcontractors of entities that provide services for BCS, follow good cybersecurity practices, especially when they relate to provision of these products and services?”

To be honest, I don’t know exactly what the industry in general is doing currently about the issue of third party security, but it’s definitely an important one. What I can say is that this should be a big focus of NERC entities’ supply chain cybersecurity risk management plans required by CIP-013-1. I can also say that the only viable approach is for the NERC entity to make sure their suppliers of BCS hardware and software have good programs for managing their own suppliers’ security. The utility shouldn’t even try to reach out to third-party suppliers themselves unless a) the supplier has made it clear they can’t be bothered with making sure their own suppliers are secure, and b) the supplier is so strategic that they can’t be replaced.

There’s a great example of good practices in third party supply chain cybersecurity risk management to be found starting on page 4 of this NIST document about Schweitzer Engineering Labs, and also in this post of mine, mostly written by SEL CEO Dave Whitehead last year. My favorite two practices are on page 7 of the NIST document:

1. For hardware, SEL tries to manufacture as much as it can in house and maintains strict controls on everything else.

2. For software, SEL makes sure they own every line of code in their software. That is, if they’re including code from a third party, they not only license the code for use, they buy the code outright. This allows them to troubleshoot later problems and fix vulnerabilities themselves, rather than have to wait for the third party to address the problem.

Neither the NIST document nor my post discusses SEL’s approach to risks from open source software that they incorporate into their products, but if you’d like to get some flavor of that, read (or hopefully re-read) the Guideline document from the NERC Supply Chain Working Group on this topic (which can be found here). Development of that paper was led by George Masters of SEL (and very ably led, I might add. I can’t say I contributed a lot to the paper myself, but I certainly learned a lot from the meetings of the group that developed it!).

Question A-6 reads: “Can energy sector asset owners and/or vendors document the level of engagement in information sharing and testing programs that identify threats and vulnerabilities and incorporation of indicators of compromise (e.g., Information Sharing and Analysis Center, Information Sharing and Analysis Organization)? Does the energy sector participate in a community for sharing supply chain risks? Does the energy sector encourage security related information exchange with external entities, including the Federal government?”

The answers to the first and third questions seem obvious to me: They’re both yes. I’ll leave these to others to answer. However, I do want to answer the second question: There’s currently no vehicle for the energy sector to participate in a community to share supply chain risks. And there needs to be. I’ve been saying since soon after CIP 13 was approved that its biggest problem is that it doesn’t provide a listing of risks (or really categories of risk, not the risks themselves) that need to be considered in the R1.1 supply chain cybersecurity risk management plan.

If there were such a list of risk categories included in the requirement (and this is essentially what CIP-010-2 R4 Attachment 1 does. I consider this to be a risk-based requirement, although R4 doesn’t explicitly state that its goal is risk management), then NERC entities would at least have some guidelines for what they need to consider in R1.1 – and the auditors would have something to hang their hat on in audits. The entity would need to show they at least considered each area of risk (e.g. risk due to third party hardware and software components, as discussed above) in developing their plan.
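As a rough illustration of what "showing you considered each area of risk" could look like, here is a minimal Python sketch. The three categories are examples drawn from these posts, not an official or complete list, and the function name and sample plan are hypothetical.

```python
# Example categories taken from these posts; NOT an official NERC list.
RISK_CATEGORIES = frozenset({
    "third-party hardware and software components",
    "vendor remote access",
    "phishing of vendor staff",
})


def unaddressed_categories(plan_categories, required=RISK_CATEGORIES):
    """Return the required categories the plan does not document, sorted by name."""
    return sorted(required - set(plan_categories))


# A hypothetical R1.1 plan that so far documents only one category.
plan = {"vendor remote access"}
gaps = unaddressed_categories(plan)
print(gaps)
```

An empty result wouldn't mean the risks are mitigated, only that each category was at least considered, which is all a list like this could give an auditor.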

There was no way the CIP-013 Standards Drafting Team could have included this list in the standard, since they had a very unrealistic one-year deadline from FERC to get the standard developed, completely approved by NERC, and on FERC’s desk for them to approve (at which point FERC waited 13 months to approve it, although for a part of that time they had lost all of their members but one, so they had no quorum and couldn’t approve anything). And NERC won’t even try to develop a list like this, since that would probably get slapped down by their lawyers as a backdoor rewriting of the standard (which I don't agree with, of course).

So there’s no way a list of risks (more specifically, categories of risk) will be developed by NERC in any official capacity, and I see no inclination on the part of other organizations to develop such a list. On the other hand, I think there needs to be some written discussion of categories of supply chain cyber risks to the BES, as well as at least examples of some of those risks. For this reason, Steve Briggs and I have decided to address this need in our upcoming book on CIP-013 compliance, which we expect to have published later this year.

Friday, July 17, 2020

A serious Microsoft vulnerability

Kevin Perry wants me to make sure everyone (or everyone who reads this blog, anyway) knows about the important Microsoft DNS vulnerability that was announced this week. The problem is that it’s “wormable” (new word of the day!), and it might lead to attacks like WannaCry and NotPetya – perhaps as early as next week. In case you haven’t heard of this, here’s your notice!

And here is CISA’s Emergency Directive.

Wednesday, July 15, 2020

The DoE RFI, Part I

DoE issued a Request for Information last week. It seems to function like an informal version of the Notice of Proposed Rulemaking that regulatory bodies like FERC issue: DoE is considering issuing rules to implement the May 1 Executive Order and wants comments to help them develop those rules. Here are my comments (I won’t submit every word below, though). Given how the new rule could easily turn out to require huge expenditures by industry with little or no benefit to anybody I can identify, it would be a good idea for as many industry players as possible to comment on it.

I liked the RFI. The DoE people who wrote it are obviously quite concerned about the possible costs of implementing the EO the way it was written. Moreover, it seems they’re not blindly going along with whoever wrote the EO (and I’m sure it wasn’t DoE! The White House drove the EO, but I imagine they had help from other parts of the government, and perhaps certain consultants).

That is, DoE isn’t assuming at the outset – as the EO directly implies - that the big supply chain security problem facing the US grid is something that is being implanted in products made in China, which on some dark day in the future will be used to bring the US to its knees with a massive power failure. They seem genuinely interested - although I may be reading too much into this - in actually learning what the real problems are before they burden the industry with a hugely expensive non-solution to a non-problem.

There are definitely real supply chain security risks to the US grid, which is why CIP-013-1 will come into effect on October 1. But I would put this Chinese supply chain attack idea at about number 1,000 on the list of supply chain security risks we need to be concerned about. It would be a huge shame if a large portion of the already limited utility resources available to address supply chain risks had to be diverted for this purpose. If you would like to know why I say this, here are two of the many posts I’ve written about this issue: one and two.

Christian Vasquez of Energywire wrote a good article on the RFI last Thursday. In it, he quoted me as follows:

DOE is also asking whether the energy sector participates in a "community for sharing supply chain risks." Grid security consultant Tom Alrich said the answer is "an emphatic no," but a centralized agency like DOE could help establish such a data-sharing hub rather than requiring utilities to identify risks on their own.

Alrich also said the list of equipment in DOE's notice "is so heavily weighted with almost completely 'dumb' boxes like transformers, when the real risks rest with computers and other programmable electronic devices like relays and programmable logic controllers, and just as importantly with their components." Alrich said that transformers aren't typically attached to a computer network and "act 100% according to the laws of physics," meaning a cyberattack doesn't pose as much of a threat as a physical attack.

The real risk, Alrich said, is in hardware, software, and firmware (components). It's difficult for any utility to complete a thorough investigation into the components that it uses on the grid, Alrich said, as few power providers have the resources to do so. "This is where DOE could add a lot of value — doing these investigations itself and sharing the results with the industry," Alrich said.

I appreciate that Christian didn’t just take one or two snippets of what I said in an email, but actually tried to capture my whole line of thought. This week, I’ve gone back through the RFI. I’d like to give you my full thoughts on it now (although I have to break this post into two parts).

Are changes needed to the standards?

Question A-3 starts by asking “Are non-standard incentives or changes to established standard development organizations’ SCRM standards (including NIST 800 series, ISA/IEC 62443, NERC CIP, and other Cyber Risk Maturity Model evaluations/practices) necessary to build capacity to protect source code, establish a secure software and firmware development lifecycle, and maintain software integrity?”

My answer to this question is almost exactly what I intend to say when I finally do my post in response to FERC’s recent Notice of Inquiry on the CIP standards (this will hopefully be soon): All real cyber security threats (supply chain or otherwise) should be addressed in the CIP standards. However, they all need to be addressed in a non-prescriptive, risk-based manner like in CIP-013, not in the prescriptive zero-tolerance model of most of the other CIP standards and requirements.

In the ideal CIP compliance regime, the NERC entity would be free to consider all of the BES cybersecurity threats they face and allocate their available mitigation resources toward those that pose the highest degree of risk (as they can in the CIP-013 compliance process). But this in turn requires:

Rewriting all of the prescriptive requirements like CIP-007 R2 and CIP-010 R1. Complying with just these two requirements diverts a huge amount of resources from other cybersecurity threats like ransomware, APTs and phishing, that aren’t addressed at all in CIP now. All threats need to be treated equally in my new CIP regime.
Having an independent body that identifies important threats and publishes a list regularly (along with recommendations for mitigation measures). Each NERC entity with High or Medium impact assets would need to consider each of these threats as they decide how they’ll allocate their mitigation resources (say at the beginning of each year). They won’t be able to mitigate every threat, but they will need to show that any threats they didn't mitigate posed a lower level of risk – in their estimate – than the ones they mitigated.
Drastically revising the NERC auditing regime, so that NERC CIP auditors are required to take a cooperative approach with the entities they’re auditing, not a basically adversarial one. In fact, the auditors would help the entities decide which threats they will mitigate and help them identify the best ways to mitigate them. Of course, this makes them much more consultants than auditors. But any other approach makes less and less sense all the time (for example, having these changes in place is required if NERC entities will ever be allowed to put BES Cyber Systems in the cloud). And I’ll bet there’s hardly a single CIP auditor that wouldn’t be happy as a clam if they could cooperate with the entities, rather than have to respond to requests for compliance advice by answering those questions in a low voice and never writing down any advice they provide, since officially they’re not allowed to provide any sort of compliance advice at all.
Exorcising NERC of the idea of auditor independence, as far as the CIP standards go (the Operations and Planning standards are a different story altogether. Nothing I’m saying here necessarily applies to them). This is the reason why NERC is never allowed to provide true guidance on any CIP standard, even though there’s still lots of uncertainty about basic concepts like “programmable”, as well as about exactly how the prescriptive CIP requirements are to be applied. It’s also why NERC staff have been shot down every time they’ve tried to provide real guidance on CIP requirements, as in the late, lamented Guidance and Technical Basis additions to the CIP standards - which have now officially become Unpersons.
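
The second item above, showing that any threat left unmitigated posed less risk than every threat that was mitigated, is easy to state as a check. Here is a minimal sketch in Python; the `Threat` class, the 0-10 risk scale and the scores are all hypothetical, made up purely for illustration (no such scoring scheme exists in CIP today).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    name: str
    risk: float       # the entity's own estimate, on a hypothetical 0-10 scale
    mitigated: bool   # did the entity allocate resources to this threat?


def unjustified_gaps(threats):
    """Return unmitigated threats whose risk is NOT lower than every mitigated threat.

    An empty result means the entity can show that everything it left
    unmitigated was, in its own estimate, lower risk than what it mitigated.
    """
    mitigated_risks = [t.risk for t in threats if t.mitigated]
    # If nothing was mitigated there is nothing to compare against,
    # so the comparison is vacuous and no gaps are reported.
    floor = min(mitigated_risks, default=float("inf"))
    return [t for t in threats if not t.mitigated and t.risk >= floor]


# Illustrative data only: the scores are invented for this example.
threats = [
    Threat("ransomware", 8.0, True),
    Threat("phishing", 6.0, True),
    Threat("APTs", 7.0, False),
    Threat("some low-risk threat", 2.0, False),
]

gaps = unjustified_gaps(threats)
```

With these made-up scores, `gaps` contains only the APTs entry: it was left unmitigated even though the entity rated it above phishing, which it did mitigate. That is the kind of decision the entity would have to explain or revisit.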

So let’s get back to question A-3. It continues on to ask how “benchmarks” are documented and tracked in several specific areas like software bill of materials (a mythical creature that on paper sounds like a great idea but in reality is very elusive – just like Bigfoot has so far eluded us). But just to answer the one-sentence question quoted above, CIP-013-1 already addresses protecting source code, establishing a secure software and firmware development lifecycle, and maintaining software integrity.

It does this in R1.1, where the NERC entity is required to survey the landscape of supply chain security risks, “identify” those that are most important, and “assess” each of those for the degree of risk they pose – so in theory they’ll consider the three risks just mentioned, along with all others. But in practice, my biggest problem with CIP-013 is that it lacks a list of risks (actually “risk areas” like secure software development lifecycle) that the entity needs to consider, such that the entity could be in violation if they can’t show they at least considered each one. Having this would give the entities some very helpful guidance, and it would also give the auditors something they can hang their hat on in audits.

In part II (coming soon to a blog near you. BTW, does anyone remember something called “movie theaters”? They all disappeared in about March, and I fear we’ll never see them again), I’ll give my answers to three more of the questions in the RFI. Try to contain your excitement.

Sunday, July 12, 2020

Will there (n)ever be any more face-to-face NERC meetings?

About a month ago, WECC asked their members whether they would be interested in having their September compliance meeting be conducted onsite or entirely online. I was surprised then that they would even ask the question, but I’m sure they wouldn’t even consider asking it today – given the crisis situations in two WECC states (Arizona and California), as well as rapidly growing cases in other states like Montana, Idaho, Oregon and Utah (where WECC is headquartered, of course). I’m not stretching at all when I say there will be no face-to-face meetings scheduled by NERC or any of the NERC Regions until next year.

And the real question isn’t even when NERC person-to-person meetings will resume next year; it’s whether they’ll resume at all next year – or for that matter, any year after that, until a safe, effective, and affordable vaccine for Covid-19 is developed and deployed to enough people (roughly 70% of the world population) that herd immunity will protect everybody. It will certainly be at the very least 3-4 years before this happens, even assuming the needed vaccine is developed next year.

A few “fun” facts:

It took twenty years to eradicate smallpox, and that’s the only disease that has been eradicated.
40 years after AIDS appeared, there’s still no HIV vaccine.
The SARS outbreak (caused by another coronavirus) happened 18 years ago, and there’s no vaccine for SARS yet, either. In fact, just about the fastest a vaccine for any disease was developed (let alone deployed to the general population) was four years, and that was for mumps.
When the WHO announced last week that the novel coronavirus could be spread through aerosols, this effectively showed that there's no such thing as a "safe" in-person meeting (unless perhaps everyone sits in their own glass booth with their own air filtration system. But that doesn't fit most people's definition of an in-person meeting!). If the air is constantly being replaced with outside air, or if it's filtered and recirculated, these measures will certainly help, but they're not a guarantee of safety (of course face masks and social distancing aren't a guarantee, either).

Does this mean I don’t think there will be meetings of any sort in the US for years? No. But here’s why I think NERC (and Regional) meetings will be just about the last to come back:

They’re really not “essential”. What’s essential in the power industry has to be discussed in real time, since a delay in responding by even seconds can cause serious problems. For that reason, there are conference calls all the time (of course, NERC doesn’t have real-time responsibility for the grid. That’s the job of the ISOs, RTOs and BAs). The onsite meetings are about issues like compliance, which can usually be handled almost as well using virtual means (as MRO has been proving with their recent excellent webinars on cybersecurity and CIP compliance – which are mostly open to anybody, by the way. MRO announced a month ago that their usual late-fall compliance workshop would be held virtually. What really surprises me is that other Regions haven’t all announced the same thing by now. They either will, or they won't have them at all).
By definition, the power industry is dispersed across North America. Any NERC meeting is guaranteed to have people from most states and provinces of the US and Canada. When you have a situation, as we do now, where some states are having severe outbreaks and other states are recovering from their outbreaks or never had a bad outbreak in the first place, bringing all these people together in one room almost guarantees spread from impacted states to less-impacted ones. In fact, some states like Illinois and New York are now requiring people coming from other states – even returning residents – to quarantine for 14 days. That alone probably kills almost all national meetings for any group until there’s a lot more uniformity among the states (and not uniformity in the fact that they’re all in terrible shape!).
There are a lot of people over 60 who attend NERC meetings (yours truly being one of them). Given the dramatically higher likelihood of dying from Covid-19 in that demographic, the disease will need to be almost eradicated in the US before they’ll feel safe attending. Looking in the papers this morning, there isn’t much evidence that Covid-19 is anywhere near being eradicated in the US!
Some utility employees who are needed for real-time operations, especially in Control Centers, are effectively banned from travel. A friend of mine, who is a senior person in one of the most important Control Centers in the US, told me this week that he’s sure he’ll never be allowed to travel anywhere until the pandemic subsides. The risk to the organization simply isn’t worth it.

Of course, I’ve attended a lot of NERC and Regional meetings, and I’ll be the first to say that I got a lot out of all the personal interaction, both after hours and at lunch and breaktime. This is a loss, pure and simple. One of millions of losses caused by the fact that the US has for the most part let the novel coronavirus run rampant, and we’re still very far from containing it, let alone defeating it.

I think NERC and the Regions should all plan on making all meetings virtual through the end of 2021. Maybe in the second half of next year, things will start to improve and there might possibly be some combination face-to-face and virtual meetings. But that won’t happen if things continue on their present course. If they do, we’ll be asking next year whether there will ever be face-to-face NERC meetings again, not when they’ll happen.

But here's an optimistic note: There are a lot of advantages to virtual meetings, the biggest being that no travel is required - so there can be much bigger "crowds". If NERC and the Regions stopped thinking there will be onsite meetings next year (including GridSecCon) and instead put some creativity into planning virtual ones, we'd all be a lot better off.

Thursday, July 9, 2020

(less than) Three months!

In case you forgot (like I did), July 1 was the day compliance with CIP-013-1 was originally due. Everybody did that, right? Of course you didn’t, because October 1 is now the compliance date. Will the date get pushed back again, since the pandemic isn’t over yet? I’d say it’s highly unlikely. There was a lot of sniping at NERC and FERC in some circles (the Professional CIP Snipers Association, I believe they’re called. You know who I mean…) for even pushing it back once; I doubt anyone at NERC or FERC is eager to see that sniping renewed.

This is especially true since the Executive Order is now out. That’s alleged to show there’s a serious foreign campaign against the US power grid, although the details of that remain to be filled in. I’ll agree there are serious supply chain risks to the electric power grid, but they have very little to do with foreign attackers and everything to do with poor cybersecurity practices at suppliers of OT systems, as well as in some cases lack of appropriate good cyber practices at electric utilities. We’d be much better off paying attention to the threat in our back yard, not just the hypothesized one in China.

I’m sure every NERC Entity who has to comply has made some progress on their CIP-013 program, so hopefully it’s not like you have to start from scratch today. And even if you do have to start from scratch (you don’t have to raise your hand if that’s the case), you certainly have time to put together a good program by October 1. I’ve been working on CIP-013 compliance for a year and a half with a number of electric utilities, and I’m now writing a book with Steve Briggs on it (if you ever want to find out how much you still need to learn about a subject you thought you were an expert in, I recommend writing a book about it!), so I think I can now give you a pretty complete picture of what you have to do, from start to finish.

You need to identify all of the Products in scope for CIP-013-1 at your organization. Of course, BES Cyber Systems are what’s in scope, but in many if not most cases NERC entities buy hardware and software components of BCS, rather than BCS themselves (e.g. they’ll buy a bunch of different servers, switches, etc. for their EMS, as well as the different software packages that run on them. You have to treat all of these components as different Products, even if they’re from one Supplier). For each component, you should identify Suppliers and Vendors, where the Supplier is the entity that develops or manufactures the Product and the Vendor is the entity that sells it to you. Of course, in many cases they are the same organization. Note that you can group both of these together under the word Vendor in your compliance documentation if you want (I don’t think it’s necessary), but the point is that many risks that apply to Suppliers don’t apply to Vendors at all (e.g. maintaining a secure development environment), and a few Risks that apply to Vendors don’t apply to Suppliers at all. I believe it’s much easier if you treat them separately, even if you end up classifying them as “Vendor type 1” or “Vendor type 2” in the actual documentation.
Similarly you need to identify all of the Services in scope; these are Services performed on BCS, either onsite or remotely. The big risk with Services is over-identifying them. The Service must be able to directly affect a BCS, like maintenance or troubleshooting. Even a cleaning contractor who has unescorted physical access to medium and high BCS is performing a Service that should be in scope for CIP-013 (the contractor should be in scope for CIP-004 as well). But consulting advice (including on compliance), engineering services, painting services, etc. don’t directly impact BCS, so they shouldn’t be on your list. Of course, you can always put them on the list, but then you’ll be smacking your head if you end up receiving a PNC for something you did or didn’t do with respect to a contractor that didn’t have to be listed in the first place. Note that there are only Vendors of Services, not Suppliers of Services.
For compliance with R1.1, you need to “identify” Risks that apply to Procurement of Products, Procurement of Services, Installation (and Use) of Products, Use of Services and Transitions between Vendors. (Four of these five items are required in the language of R1.1, and the need to address the fifth - Use of Services - is implied by the six items in R1.2, since five of those apply to Use of Services.) Of course, every Risk you identify should be a Risk to the BES, not for example a purely IT Risk, such as one having to do with encrypted data storage or data privacy (of course, IT risks are important when your organization is buying IT systems. You should consider having a separate supply chain cybersecurity risk management program for your IT systems, in which you could address this type of Risk. But trying to use the same list of Risks – and the same questionnaire questions, as discussed below – for both types of systems puts a lot more burden on you, and even more importantly significantly increases your CIP-013 compliance risk footprint, with no corresponding benefit in terms of supply chain risk reduction).
For each Risk in R1.1, you need to “assess” it to determine whether it has anything more than a low likelihood of being present in a Supplier’s or Vendor’s environment, or in your (the NERC entity’s) environment. Since I consider low likelihood to be the same thing as saying a Risk has already been mitigated, this means there’s no point in addressing low likelihood risks in your R1 plan (for example, there’s a Risk that a Supplier might not have a firewall, even though they’re connected to the internet. Since it’s just about impossible to believe that any organization that’s connected to the internet today doesn’t have some sort of firewall, I think this can be safely dismissed as a Risk that’s unlikely ever to have a Likelihood other than low).
How many Risks do you need to identify in R1.1? I and my clients have identified a little over 100 Risks, of which about 85 apply to Suppliers and Vendors and the rest apply to the NERC entity itself. These are all Risks that have some Likelihood of being present in a Supplier’s or Vendor’s environment, or in the NERC entity’s environment. Do you have to list this many Risks in R1.1? Certainly not. But what I discovered with my clients is that all of the Risks that apply to the NERC entity, and more than 95% of the Risks that apply to a Supplier or Vendor, require just a policy or procedure to mitigate (either the Supplier’s/Vendor’s policy or procedure, or your organization’s) – not installing an expensive new system, hiring an army of consultants, etc. And given that these are all real Risks to the BES (i.e. they could possibly have a likelihood greater than low), you should really try to address as many as you can, since the whole idea of R1.1 is to find the most important Risks and mitigate them.
The six items in R1.2 are mitigations of Risks; these are there because FERC ordered they be included in the NERC entity’s plan, in Order 829 in 2016. It’s not hard to state the Risks behind these - although there are actually eight Risks, since R1.2.5 and R1.2.6 address two Risks each. All of these have to be included in your list of Risks – but since they're all important, you probably would have identified them anyway in R1.1.
You need to identify Mitigations for all the Risks on your list (and often, one Mitigation will mitigate multiple Risks, just like one Risk might have multiple Mitigations). These Mitigations are almost entirely the policies or procedures that you or your Suppliers or Vendors will need to implement, to mitigate the Risks you have identified in R1.1 and R1.2.
You need to determine how you will assess Vendors and Suppliers for the Risks that apply to them. Essentially, there are two ways to do this. The preferred method is a questionnaire. What questions should you include in your questionnaire? That’s pretty simple: You should include at least one question for each Supplier/Vendor Risk in your list, and you shouldn’t include any questions that are based on Risks that aren’t on your list, as I explained in this recent post. The problem is that, if you throw in a bunch of questions that address Risks you haven’t identified as significant enough to be on your list – which of course often happens if you use another organization’s questionnaire - you’re implicitly adding those Risks to your list and you’ve now increased your compliance risk footprint, with no concomitant benefit to your supply chain risk mitigation program. In other words, if you use someone else's questionnaire that has 200 questions, that means you've identified 200 Risks to mitigate, with all that entails.
If the Supplier or Vendor won’t answer your questionnaire, you need to “assess” them using other means, such as reviewing any certifications they have (but you need to see a detailed audit report, since just knowing they have a particular certification provides no information on how they stand with respect to each of the Supplier/Vendor Risks on your list). Or reviewing their web site (where they may have posted documents describing their cybersecurity program in specific areas like secure development lifecycle and security vulnerability policy, as found on Cisco’s website). Or buying a third-party assessment like those offered by Fortress, as long as you don't thereby expand the list of Risks you're committed to mitigating. Or – if there’s no better solution available – relying on news articles, opinions of your peers, etc. to form an overall judgment on the firm’s security, even if you can’t develop Likelihood Scores for particular Risks. If you take the last approach, you still should consider the Supplier or Vendor to pose moderate or high likelihood for any Risks for which you can’t be sure where they stand. In other words, just because they have good overall security doesn't mean they have in place, say, the remote access protections that you want to see.
You need to put all of the above into a supply chain cyber risk management Plan (required by R1, of course) that also shows how you will mitigate supply chain cybersecurity Risks through RFPs, contract language, emergency procurements, open source software controls, “fourth-party” software Risks, the R3 review of the plan, etc.
The crucial element of your R1 Plan is the Procurement Risk Assessment (which I abbreviate PRA, even though I realize that has another meaning in CIP compliance). This is both a) the point where all the other elements of your Plan come together, and b) the basis of the CIP-013-1 compliance evidence required by the NERC Evidence Request Tool. In your PRA, you need to show that, in each procurement, you fulfilled R1.1 and R1.2, as requested by the ERT.
Once you have your Plan worked out, you need to develop procedures and policies to implement the Plan. As I’ve previously pointed out, CIP-013 provides great freedom in how you draw up your Plan in R1. But when you get to R2 and have to implement it, the Plan becomes a straitjacket. You need to implement it as written, just as you do with prescriptive requirements in the other CIP standards (the big difference in CIP-013 is that you don’t need evidence for every instance of compliance, just evidence that you’ve developed a good plan and implemented it).
In case you find the above item pretty scary, you need to keep in mind that, if you decide after the compliance date that you need to make some change to your plan, you can do that at any time. However, you also need to make sure that you change your procedures and policies to reflect the changes in your plan.
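
Two of the rules above lend themselves to a mechanical check: every Supplier/Vendor Risk on your list should have at least one questionnaire question, and no question should point to a Risk that isn't on your list. Here is a minimal Python sketch of that check, plus the "unknown is not low" rule for Suppliers or Vendors you couldn't fully assess. Everything in it is an assumption for illustration: the class names, the `R-01`-style identifiers, the sample Risks and questions, and the choice of "moderate" as the default. Nothing here comes from CIP-013 itself or from the NERC Evidence Request Tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    risk_id: str      # hypothetical identifier, e.g. "R-01"
    description: str
    applies_to: str   # "supplier_vendor" or "entity"


@dataclass(frozen=True)
class Question:
    text: str
    risk_id: str      # the Risk this question is meant to assess


def check_questionnaire(risks, questions):
    """Return (uncovered, off_list).

    uncovered: IDs of Supplier/Vendor Risks that no question addresses.
    off_list:  text of questions tied to a Risk that is not on the list,
               i.e. questions that would implicitly add Risks to the plan.
    """
    listed = {r.risk_id for r in risks}
    needs_question = {r.risk_id for r in risks if r.applies_to == "supplier_vendor"}
    asked = {q.risk_id for q in questions}
    uncovered = sorted(needs_question - asked)
    off_list = [q.text for q in questions if q.risk_id not in listed]
    return uncovered, off_list


def likelihood_for(assessed, risk_id, default="moderate"):
    """Likelihood for one Risk; anything not actually assessed gets the
    default rather than "low" ("moderate" here is an assumed choice)."""
    return assessed.get(risk_id, default)


# Illustrative data only.
risks = [
    Risk("R-01", "Supplier lacks a secure development environment", "supplier_vendor"),
    Risk("R-02", "Vendor remote access is not adequately protected", "supplier_vendor"),
    Risk("R-03", "Entity has no procedure for emergency procurements", "entity"),
]
questions = [
    Question("Do you maintain a secure development environment?", "R-01"),
    Question("Is stored customer data encrypted?", "R-99"),  # borrowed IT-style question
]

uncovered, off_list = check_questionnaire(risks, questions)
```

With this sample data, `uncovered` flags R-02 (no question asks about remote access) and `off_list` flags the data-encryption question, which is the kind of item that tends to ride in when you borrow someone else's questionnaire.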

Remember, we (OK, I) at Tom Alrich LLC are prepared to help you with any of the above items, from helping develop your plan from scratch, to reviewing what you have now and offering suggestions for improvements, to helping you draw up the procedures you need to implement your plan. Drop me an email if you’d like to discuss this!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.