Thursday, January 17, 2019

Lew on CIP 13, part 2 of (integer < 10)



Note: I expect to have the second post in my new series on the Russian cyber attacks up early next week. 

This post is the second of a set of posts on an excellent article by Lew Folkerth of RF (the NERC Region formerly known as ReliabilityFirst) on CIP-013, the new NERC supply chain security standard; the first post is here. That post dealt primarily with how Lew characterizes the standard and started to discuss what he says about how to comply; this one continues the discussion about how to comply with the standard. The next post will continue the compliance discussion and also discuss how CIP-013 will be audited, although it may not be the last in this series (sorry to disappoint you!).

I also want to point out that what I am saying in this series of posts goes beyond what Lew said in his article, for two reasons:

  1. Lew doesn’t have as much space for his articles as I do for my posts. So where he has to use ten words, I can write five paragraphs. And I have no problem with doing that, as any long-time reader will attest.
  2. While I firmly believe everything I say in this series of posts is directly implied by what he said in his article, it’s natural that I would be able to discuss these topics in more detail, because I’ve had to figure out a lot of the details already – since I’m currently working with clients on preparing for CIP-013 compliance. Of course, what I write in these posts is by necessity very high level; there are details and there are DETAILS. These posts provide the former (see the italicized wording at the end of this post to find out how to learn about the latter).

What risks are in scope?
As I have pointed out in several past posts, and again in part 1 of the first post in this series (in Section A, under the heading “Lew’s CIP-013 compliance methodology, unpacked for the first time!”), CIP-013 R1.1 requires the entity to assess all supply chain risks to BES Cyber Systems, but it doesn’t give you any sort of list (even a high-level one) of those risks. So R1.1 assumes that each entity will be quite well-read in the literature on supply chain security risks and will always be diligently searching for new ones; each entity will then put together a list of all of these risks and assess each one for inclusion (or not) in its plan.

This might be a good idea if every NERC entity with Medium or High impact BCS had security staff members who could devote a good part of every day to learning about supply chain security risks, so that they could always produce a list of the most important risks whenever required. While this might be true for some of the larger organizations, I know it’s not true for smaller ones. What are those people to do?

I’ve repeatedly expressed the hope that an industry organization like NATF or the NERC CIPC would put together this list of supply chain risks, although I’ve seen no sign of that happening yet. Another idea would be if the trade associations, including APPA, EEI, NRECA and EPSA, each put together a comprehensive list for their own members. While APPA and NRECA developed a good general discussion of supply chain security for the members of both organizations, it doesn’t contain such a list; I hope they will decide to do that in the future as well.

In the meantime, NERC entities subject to CIP-013 need to figure out on their own what their significant supply chain security risks are. Where can you go for ideas? Well, there are lots of documents and lots of ideas – and that’s the problem; there are far too many. There’s NIST 800-161 and parts of NIST 800-53, for starters. There’s the NERC/EPRI “Supply Chain Risk Assessment” document, which was issued in preliminary form in September and will be finalized in February; there’s the excellent (although too short!) document that Utilities Technology Council (UTC) put out in 2015 called “Cyber Supply Chain Risk Management for Utilities”; and there’s the APPA/NRECA paper I just mentioned. There are others as well. None of these, except for 800-161, can be considered a definitive list, though. And 800-161 is comprehensive to a fault; if any utility were to seriously try to address every risk found in that document, they would probably have to stop distributing electric power and assign the entire staff to implementing 800-161 compliance!

One drawback of all of these documents, from a CIP-013 compliance perspective, is that they don’t identify risks directly. Instead, they all describe various mitigations you can use to address those risks. This means that you need to reword these mitigations to articulate the risks behind them. To take the UTC document as an example, one of the mitigations listed is “Establish how you will want to monitor supplier adherence to requirements”. In other words, while it’s all well and good to require vendors (through contract language or other forms of commitment like a letter) to take certain steps, you need to have in place a program to regularly monitor that they’re taking those steps.

We need to ask “What is the risk for which this is a mitigation?” The answer would be something like “The risk that a vendor will not adhere to its commitments”. This is one of the risks you may want to add to your list of risks that need to be considered in your CIP-013 supply chain cyber security risk management plan. You can get a lot more by going through the documents I just listed.

So, in the absence of a list in Requirement CIP-013 R1.1 itself, and in the absence of any comprehensive, industry-tailored list put out by an industry group, this is one way to build the list of risks you need to assess in your CIP-013 supply chain cyber security risk management plan. The main point of this effort is to develop a list that comes as close as possible to covering, at least at a high level, all of the main areas of supply chain cyber risk.

But I know there’s a question hidden in every NERC CIP compliance person’s heart when I bring this point up: If I develop a comprehensive list of risks, am I going to be required by the auditor to address every one of them? In other words, if my list includes Risk X, but I decide this risk isn’t as important as the others so I won’t invest scarce funds in mitigating it, am I going to receive an NPV for not mitigating it?

And here’s where Uncle Lew comes to the rescue. He points out “You are not expected to address all areas of supply chain cyber security. You have the freedom, and the responsibility, to address those areas that pose the greatest risk to your organization and to your high and medium impact BES Cyber Systems.” There are two ways you can do this.

The first way is that you don’t even list risks in the first place that you believe are very small in your environment – e.g. the risk that a shipment of BCS hardware will be intercepted and then compromised during a hurricane emergency is very low for a utility in Wyoming, while it might be at least worth considering for a utility in South Carolina. The former utility would be justified in leaving it off its list altogether, and doesn’t need to document why it did that. Any risk that has almost zero probability doesn’t need to be considered at all – there are certainly a lot more that have much greater than zero probability!

The second way in which you can – quite legally – prune your list of risks to a manageable level is through the risk assessment process itself. R1.1 requires that you “assess” each risk. What does that mean? It means that you assign it a risk level. In my book, this means you first determine a) the likelihood that the risk will be realized, and b) its impact if it is realized. Then you combine those two measures into what I call a risk score.

Once you’ve assessed all your risks, you rank them by risk score. And guess what? You now need to mitigate the highest risks on the list. You can also mitigate some of the risks below these, perhaps to a lesser degree. But in any case, there will be some level on your list below which you won’t mitigate the risks at all, since their scores fall below those of everything above the line (although you will still need to document why you didn’t mitigate those risks, by briefly explaining why the risk score is so low for each of them).
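To make this assess-and-rank step concrete, here is a minimal sketch in Python. Everything in it – the example risks, the 1-to-5 scales, the multiplicative score, and the mitigation threshold of 8 – is my own illustrative assumption; CIP-013 prescribes none of these details, and your own scales and threshold will depend on your organization.

```python
# Illustrative sketch of risk assessment and ranking for a CIP-013 plan.
# The risks, scales, scoring formula and threshold are all hypothetical.

# Each risk gets a likelihood and an impact score (1 = low, 5 = high).
risks = [
    {"name": "Vendor does not adhere to its security commitments",
     "likelihood": 4, "impact": 4},
    {"name": "Compromised patch delivered through vendor update channel",
     "likelihood": 2, "impact": 5},
    {"name": "Hardware shipment intercepted in transit",
     "likelihood": 1, "impact": 3},
]

# Combine likelihood and impact into a single risk score.
for risk in risks:
    risk["score"] = risk["likelihood"] * risk["impact"]

# Rank the risks from highest score to lowest.
ranked = sorted(risks, key=lambda r: r["score"], reverse=True)

# Draw the line: mitigate risks at or above the threshold; for risks
# below it, document the low score rather than mitigating.
THRESHOLD = 8
to_mitigate = [r for r in ranked if r["score"] >= THRESHOLD]
to_document = [r for r in ranked if r["score"] < THRESHOLD]

for r in ranked:
    action = "mitigate" if r["score"] >= THRESHOLD else "document only"
    print(f"{r['score']:>2}  {action:13}  {r['name']}")
```

The usefulness of even a toy version like this is that the line-drawing decision becomes explicit and documentable: every risk below the threshold carries its own score as the brief explanation of why it wasn’t mitigated.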

Will you get into trouble for not mitigating the risks at the bottom? No. As Lew said, you need to “address those areas that pose the greatest risk to your organization and to your high and medium impact BES Cyber Systems.” The direct implication of these words is that you don’t need to address the risk areas that pose the least risk.

Why are you justified in not mitigating all of the risks listed in your supply chain cyber security risk management plan? Because no organization on this planet (or any other planet I know of) has an unlimited budget for cyber security. Everyone has limited funds, and the important thing is that you need to allocate them using a process that will mitigate the most risk possible. That process is the one I just described (at a very high level, of course).

You may notice that this is very different from the process to mitigate risk implicit in all of the other NERC standards, as well as in the majority of the CIP requirements. That process – a prescriptive one – tells you exactly what needs to be done to mitigate a particular risk, period. You either do that or you get your head cut off.

For example, CIP-007 R2 requires that, every 35 days, you contact the vendor (or other patch source) of every piece of software or firmware installed on every Cyber Asset within your Electronic Security Perimeter(s), to determine a) whether a new patch is available for that software, and b) whether it is applicable to your systems. Then, within another 35 days, you need to either install the patch or develop a mitigation plan for the vulnerability(ies) it addresses. It doesn’t matter if a particular system isn’t routably connected to any others, or if the vendor of a particular software package hasn’t issued a security patch in 20 years; you still need to do this every 35 days. You can’t have two schedules, say monthly for the most critical systems and those routably connected to them, and quarterly for all other systems. Needless to say, if CIP-007 R2 were a risk-based requirement like CIP-013 R1.1 (or CIP-010 R4 or CIP-003 R2, for that matter), you would have lots of options for mitigation, not just one.
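For illustration only, here is a sketch in Python of the fixed cadence just described. The example dates and the `next_deadlines` function are hypothetical, not anything from the standard’s text; the point is simply that the 35-day clocks run the same way for every in-scope system, with no risk-based branching anywhere.

```python
# Hypothetical sketch of the fixed 35-day cadence of CIP-007 R2.
from datetime import date, timedelta

EVALUATION_WINDOW = timedelta(days=35)  # check patch sources this often
ACTION_WINDOW = timedelta(days=35)      # then install or plan mitigation

def next_deadlines(last_evaluation: date):
    """Given the date of the last patch-source check, return the date by
    which the next check is due, and the date by which any applicable
    patch found then must be installed or covered by a mitigation plan."""
    next_evaluation = last_evaluation + EVALUATION_WINDOW
    action_due = next_evaluation + ACTION_WINDOW
    return next_evaluation, action_due

check_due, action_due = next_deadlines(date(2019, 1, 1))
print(check_due)   # 2019-02-05
print(action_due)  # 2019-03-12
```

Notice what isn’t in the sketch: no parameter for how critical the system is, whether it’s routably connected, or how often its vendor actually issues patches. The schedule is the schedule.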

As an aside, I do want to point out here that in CIP you never have complete freedom to choose how you will mitigate a particular risk, even when the requirement permits consideration of risk, for two reasons:

  1. The mitigation always has to be effective, as Lew pointed out a couple of years ago; and
  2. If you’re using a mitigation different from the one normally used – e.g. you’re not using patch management to mitigate the threat of unpatched software vulnerabilities, or you’re not using antivirus or whitelisting software to mitigate the threat of malware – you can rightfully be asked to justify why you took an alternative approach.

A final question you might ask about identifying risks for R1.1 is “Where do I draw the line? You said that I can draw a line through the ranked set of risks, so that all risks below that line don’t need to be mitigated at all. Of course, I would draw the line at the point where I had committed all of the funds I have budgeted for CIP-013 compliance (and I will obviously be willing to spend as much as I have budgeted for that purpose).

“But let’s suppose I don’t have a lot of funds available, and I have to draw the line after three items. This means that my plan will only require me to mitigate those three risks (even though I would definitely mitigate more if I had the funds). And let’s suppose further that the auditor believes that I left some significant risks unmitigated by drawing the line where I did. Can he or she give me an NPV for this? And will my mitigation plan for this violation require that I go back and get more funds to address these risks?”

It’s interesting that you bring this up, since I have considered this question a good deal myself. I think the answer is that it all gets down to reasonableness. If you can demonstrate to the auditor that your organization really can’t afford to spend more on supply chain cyber security risk mitigation (e.g. there was a natural disaster that was very expensive for the utility, for which there’s a serious question whether you will be able to get rate relief), they will hopefully be understanding.

Of course, if we were talking about CIP-007 R2 here and you used the argument that you didn’t fully comply with that requirement because your organization couldn’t afford it, I don’t know of any way that the auditor could be lenient, whether or not he or she wished to. This goes back to the fact that CIP-013 R1 is a risk-based requirement, while CIP-007 R2 is prescriptive. Reasonableness isn’t something an auditor is allowed to consider when auditing a prescriptive requirement (unless we’re talking about a reasonable interpretation of a particular term in the requirement, or something technical like that), while it’s inherent in the idea of a risk-based requirement. I’ll discuss this further in the next post (or maybe the fourth post, if there end up being that many) in this series.


Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC.

My offer to NERC entities of a free webinar workshop on CIP-013, described in this post, is still open! Let me know (at the email address below) if you would like to discuss that, so we can arrange a time to talk.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com. Please keep in mind that if you’re a NERC entity, Tom Alrich LLC can help you with NERC CIP issues or challenges like what is discussed in this post – especially on compliance with CIP-013; we also work with security product or service vendors that need help articulating their message to the power industry. To discuss this, you can email me at the same address.

Monday, January 14, 2019

The Russian attacks: A new WSJ article puts them in a whole new light


This week, I intended to write the second part of my last post on Lew Folkerth’s great article on CIP-013. However, I believe this topic has more urgency. I will write a second post on this topic, then get back to Lew's article (I hope) next week.


Last Friday morning, I opened my subscription copy of the Wall Street Journal to see a front-page article entitled “Russian Hack Exposes Weakness in U.S. Power Grid”.[i] Then I read the article, very carefully. What was my first reaction? It was “Well, there goes my weekend.” I realized that this article is very important, for two reasons. First, it points the way to an important cyber attack vector that the industry, and especially the NERC CIP standards, hasn’t paid too much attention to. And yet it turns out that this was the primary vector the Russians are using, not the one I thought they were, based on the first WSJ article and DHS briefings last July. That is the subject of this post and the one to follow.

The second reason why this article is important is that it makes me (and I’m sure it will others as well) far less certain that the DHS briefings in July constituted a gross exaggeration of the success that the Russians had. Those briefings implied that the Russians had penetrated a number of utility control centers, where they would have had the opportunity to plant malware that they might call into action at a later date. I expressed great skepticism about this conclusion, and two days later DHS put out a completely different story, in which they said that only one “insignificant” generating plant (presumably gas-fired, going by a diagram that was shown) had actually been penetrated (i.e. at the control system level). Yet this was followed up a week later by a different story: that in fact just two wind turbines had been impacted, not a whole plant.

In a post in early September (which was preceded by others, and followed by one more), after describing the timeline that produced these three mutually contradictory explanations from DHS, I stated that I continued to believe that statements made at the initial briefings were wildly exaggerated – if not actually factually wrong, since the wording seemed to be very carefully chosen. I also emphasized that I really wished DHS would come out with a straight story on what really happened. However, last Friday’s article makes me question that conclusion, so that I now think it’s possible that the initial briefings were correct, and the Russians did penetrate a number of utility control centers. My third (and probably fourth) posts will discuss how Friday’s WSJ article caused me to rethink my conclusion, and will go on to address some of the huge implications, if it’s actually true that utility control centers were penetrated. These implications aren’t so much cyber implications as political ones.

Before I get on with the discussion of the cyber implications of Friday’s story, I want to point out that this is a great reporting job, by Rebecca Smith and Rob Barry. Ms. Smith is a veteran WSJ writer on the electric power industry and cyber security, and is the author of the article last July that caused a firestorm in the US and elsewhere, with its implications that the Russians had used the supply chain to penetrate a number of U.S. utilities and plant malware in their control centers.  The big difference between Friday’s article and the one in July is that the latter was primarily based on the first DHS briefing. Ms. Smith published it the day after the briefing, and there was certainly no time to follow up with other industry sources, try to verify some of the statements made by DHS, etc.

By contrast, Friday’s article is based on a lot of really dogged reporting (which has probably been going on since soon after the briefings), tracing in great detail, with lots of quotations from victims, how the Russian attacks actually proceeded through a number of small vendors to actual utilities (the article names five utilities that were attacked). In the article, Ms. Smith provides evidence that convinces me that my original scenario for how the attacks unfolded is incorrect.

The July briefings and WSJ article didn’t directly provide a scenario for the attacks, but I made a few assumptions in developing my own implicit scenario. I never wrote it down, but it was behind all of the articles I wrote on the Russian attacks last year. This scenario was:

  1. The attackers were aiming for the Big Prize of cyberattacks on the US power grid: causing a cascading outage in the Bulk Electric System (this is obviously the way to cause the greatest total damage to the US economy). This means they would necessarily attack only transmission-level assets (i.e. BES assets), not distribution-only ones. You can’t cause a cascading outage by just attacking the latter.
  2. Because of this, the best way to proceed is to try to obtain direct access to the control systems that control or power the transmission grid – i.e. control systems located at control centers, generating plants over 75 megawatts (including larger wind farms), and substations connected to the grid at greater than 100 kilovolts. In NERC CIP terms, these are High, Medium and Low impact BES Cyber Systems, located at High, Medium and Low impact assets (control centers, substations and generating plants).
  3. Getting access to these systems is a formidable challenge. High- and Medium-impact assets (i.e. the more important control centers and substations, along with a small number of large or otherwise strategic generating plants) are almost all protected by two strong defenses (both required by NERC CIP).
  4. The first of these defenses is well-managed firewalls, which make it very hard to make a direct frontal attack on the network in the asset. Largely due to NERC CIP compliance, these firewalls will have very few, if any, open and unprotected ports that a hacker could exploit.
  5. The second defense at these assets is a well-protected system for Interactive Remote Access (IRA), including an Intermediate Server and two-factor authentication. This means that an attacker attempting remote access out of the blue will probably never get through the IRA system, unless they have found a way to break two-factor authentication – and I know of no verified cases to date in which an attacker has done that.
  6. Low impact assets don’t necessarily have these two strong protections (some do), so they are easier to penetrate. On the other hand, they’re classified as Low impact because if compromised their loss will cause a much less severe impact on the grid than the loss of a Medium or High-impact asset. So the poor Russians won’t even come close to causing a cascading outage if they bring down a single Low-impact asset (they could perhaps do it if they attacked a lot of Low-impact assets simultaneously, but that is hard to do).
  7. This means that no Transmission-level assets (BES assets) would be fruitful targets for Russian hackers. I assumed the attackers had tried to compromise these assets, not knowing how hard it would be to accomplish this goal. And I was for the same reason very skeptical of the initial DHS briefings and the WSJ article last July, which strongly implied (if they didn’t state it outright) that some Transmission-level assets (probably utility control centers) had been penetrated.
  8. When DHS came out with their new story (and a week later, a second story) that said only a very small generating plant had been compromised (far below the 75 MW threshold for being a part of the Bulk Electric System), I took this as confirmation that I was right, and the Russians had essentially wasted a lot of time and money trying to break into something that was pretty much impenetrable.

However, the Friday WSJ article implicitly describes a very different scenario for the attacks:

  1. The biggest difference between the new scenario and the one I was assuming is that the attackers weren’t obsessed with a cascading BES outage as their be-all and end-all. They were looking to cause whatever damage they could (or more specifically to position themselves to do so in the future if called upon), and they were fine with attacking the distribution system. In particular, they were looking at cutting off power distribution to military installations, which of course is a very understandable strategic purpose (and I assume the US is doing the same sort of reconnaissance and probing in the Russian grid).
  2. This means that the attackers weren’t going to be stymied by the fact that they couldn’t penetrate any Medium- or High-impact assets. A single military base could in most cases easily be attacked by disrupting a single Low-impact generating plant or substation, or even a distribution-level plant or substation. Because of this, the Russians’ universe of possible targets was much larger than I assumed last summer – so I was wrong last week when I told the large spike of Russian readers of my post (among whom, I assumed, were at least some of the people involved in attacking the US grid) that their attacks so far had been a “dismal failure”. Instead, they might well believe the attacks to be at least moderately successful, and Friday’s WSJ article provides some documentation for why they would be justified in this belief. (Of course, I’m not trying to lift the spirits of the Russian attackers by saying that! In any case, my spike of Russian readers quickly dissipated after that story, and now Russia is number four in my readership list, after the US – once again firmly in first place – Canada and Ukraine, where I seem to have a steady readership, unlike the fickle Russians.)
  3. Another big difference between my original scenario and the one from Friday’s article is that I was assuming that the Russians would want to attack US power entities through vendors of control systems, by compromising the remote-access channels they already had set up with their customers. But the vendors discussed in the Friday article are quite different. They are all fairly small firms, including two excavating companies, an office-renovation firm, individual engineers (attacked through a watering-hole attack on a publisher of magazines read by power engineers), and others. So I was entirely wrong in my idea of the vendor entities that served as the intermediaries for the Russian attacks.
  4. There’s no way that an attack on any of these vendor targets could ever get the Russians into the utility assets they needed to compromise in order to cause a cascading BES outage. But what could it do? It could get them into the IT networks of utilities. After all, every vendor probably interacts every day with utility staff using workstations attached to the IT network.
  5. And the Russians didn’t have to compromise a remote access system to get to these workstations. All they had to do was to follow the same path used in the Ukraine attacks, as well as just about every other successful cyberattack worldwide in recent years: use phishing emails (or watering-hole attacks) to load malware onto workstations on the IT network. And once they were on one or a few workstations, it was much easier to compromise almost any other workstation on the IT network, since most IT network assets are much better protected from external attacks than they are from internal ones. The WSJ article provides great detail on how some of these phishing attacks proceeded.

Of course, the goal of the attacks wasn’t to compromise the IT network, but somehow to reach the control systems (i.e. the “OT” network, meaning operational technology), where they could drop malware that would allow them to come back later and cause actual destruction. And here we need to ask, “Did the attackers reach any control systems?” The article answers this question in the affirmative – and the systems weren’t in just two wind turbines or one small natural gas-fired power plant, as DHS stated last summer. Here are four paragraphs from the last part of the article:

Federal officials say the attackers looked for ways to bridge the divide between the utilities’ corporate networks, which are connected to the internet, and their critical-control networks, which are walled off from the web for security purposes.

The bridges sometimes come in the form of “jump boxes,” computers that give technicians a way to move between the two systems. If not well defended, these junctions could allow operatives to tunnel under the moat and pop up inside the castle walls.

In briefings to utilities last summer, Jonathan Homer, industrial-control systems cybersecurity chief for Homeland Security, said the Russians had penetrated the control-system area of utilities through poorly protected jump boxes. The attackers had “legitimate access, the same as a technician,” he said in one briefing, and were positioned to take actions that could have temporarily knocked out power.
           ……..
Vikram Thakur, technical director of security response for Symantec Corp., a California-based cybersecurity firm, says his company knows firsthand that at least 60 utilities were targeted, including some outside the U.S., and about two dozen were breached. He says hackers penetrated far enough to reach the industrial-control systems at eight or more utilities. He declined to name them.

To make a long story short, it seems the Russian attackers had a much broader goal than just causing a cascading BES outage, which made it perfectly acceptable for them to attack Low impact Transmission-level assets, as well as distribution-level assets not part of the Bulk Electric System at all – since both of these types of assets are much less well-defended than BES assets. Because of this broader goal, they weren’t confined to attacking utilities by commandeering vendor access to their remote access systems; they were perfectly happy using the tried-and-true phishing route to get into the IT networks of utilities. And from there, they were able to penetrate the control system networks of at least eight utilities, where they might have been able to deposit malware.

My second post in this series will discuss the implications of this finding for cyber regulation of the electric power industry, including the NERC CIP standards.


Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com. Please keep in mind that if you’re a NERC entity, Tom Alrich LLC can help you with NERC CIP issues or challenges like what is discussed in this post – especially on compliance with CIP-013; we also work with security product or service vendors that need help articulating their message to the power industry. To discuss this, you can email me at the same address.


[i] The WSJ web site is behind a paywall, so you can’t read the article there. I requested that the site provide a free link to this article, since I think it is of very high importance to the North American power industry. In the meantime, I found this online reproduction of the article.

I think all of you should seriously consider subscribing to the Journal, either in print or online. It has the best coverage of cyber issues of any major American newspaper. It also has the best coverage of economic issues, which I’m also very interested in. I don’t agree with the majority of the editorials or op-eds, but even then they’re all very well-written and informed, so you can’t just dismiss them unread like you can in some other publications.

Wednesday, January 9, 2019

Lew Folkerth on CIP-013: Part 1



Two weeks ago, when I downloaded the most recent RF newsletter and went to Lew Folkerth’s column (called The Lighthouse), my heart started to beat faster when I saw that his topic this time is supply chain security – and really CIP-013. I’ve been quite disappointed with literally everything else I’ve read or heard so far from NERC about CIP-013, since none of it has addressed the fundamental difference between that standard and literally all other NERC standards (including the CIP ones): namely that CIP-013 is a standard for risk management first and supply chain security second (more specifically, the risks being managed by the standard are supply chain security risks). I was very pleased to see that Lew not only gets the point about CIP-013, but has a deep understanding that allows him to communicate what he knows about the standard to the rest of the NERC community.

I was quite heartened when I read Lew’s first sentence about the standard: “CIP-013-1 is the first CIP Standard that requires you to manage risk.” Yes! And it got better after that, since he not only described the standard very well, but he laid out a good (although too concise) roadmap for complying with it, and made some very good points about how it will be audited (Lew was a longtime CIP auditor until he moved into an Entity Development role at RF, where he still focuses on CIP). On the other hand, I do disagree with one thing he says, which I’ll discuss below. I’m dividing this post into two parts. This first part discusses what Lew says about CIP-013 itself and how to comply with it. The second part will discuss how CIP-013 will be audited, including a subsequent email discussion I had with Lew on that topic.

Lew makes three main points about CIP-013:

I. Plan-based
His first point is “CIP-013-1 is a plan-based Standard…You are required to develop (R1), implement (R2), and maintain (R3) a plan to manage supply chain cyber security risk. You should already be familiar with the needs of plan-based Standards, as many of the existing CIP Standards are also plan-based.” I don’t agree that any existing CIP standards (other than CIP-013 itself) are plan-based[i], although several requirements are. Specifically, CIP-003 R2, CIP-010 R4 and CIP-011 R1 all require that the entity have a plan or program to achieve a particular objective. I would also argue that CIP-007 R3 is plan-based, even though it doesn’t actually call for a plan. This is because I don’t see that there’s any way to comply with the requirement without having some sort of plan, although perhaps a fairly sketchy one.[ii]

What are the objectives of these other plan-based requirements? They are all stated differently, but in my opinion they are all about risk management, even though CIP-013 R1 is the first requirement to state that outright. And if you think about it (or even if you don’t), when you’re dealing with cyber security, risk management is simply the only way to go. Let me explain by contrasting the CIP standards with the NERC Operations and Planning (O&P) standards, which deal with technical matters required to keep the grid running reliably.

In the O&P standards, the whole objective is to substantially mitigate particular risks - specifically, risks that can lead to a cascading outage of the Bulk Electric System – not manage them. These standards are prescriptive by necessity: For example, if a utility doesn’t properly trim trees under its transmission lines, there is a real risk of a widespread, cascading outage (which is what happened in 2003 with the Northeast blackout, although there were other causes as well). Given the serious impact of a cascading outage, there needs to be a prescriptive standard telling transmission utilities exactly what needs to be done to prevent this from happening, and they need to follow it.

The prescriptive requirements in CIP take this same approach: They are designed with the belief that, if you take certain steps like those required for CIP-007 R2, you will substantially mitigate a certain area of risk - which, in the case of that requirement, is the risk posed by unpatched software vulnerabilities. You need to follow those steps (which include rigid timelines in CIP-007 R2), and if you don’t, there will almost inevitably be severe consequences.

But are the severe consequences really inevitable in this case? If one utility doesn’t patch five systems in their Control Center for two months, will there inevitably be some sort of BES outage (let alone a cascading one)? Certainly not. How about if all the utilities in one region of the country don’t patch any of their Control Center servers for one year? Will there inevitably be an outage then? Again, the answer is no, but obviously the probability of a BES outage – and even a cascading one – is much higher in this case than in the first one.

The risk in the second case is obviously much greater than in the first. Therefore, the utilities as a group should put much more effort (which equals money, of course) into mitigating the second risk than the first. But a prescriptive requirement like CIP-007 R2 doesn’t allow for consideration of risk at all – every NERC entity with Medium or High impact BES Cyber Systems needs to follow exactly the same patch management process for every system, whether it’s the EMS that controls power in a major metropolitan area, or a relay on a relatively insignificant 135kV transmission line. The extra funds required to comply with a requirement like this (and CIP-007 R2 is without doubt the most resource-intensive of all the CIP requirements, although I hear that CIP-010 R1 gives it a pretty good run for its money) have to come from somewhere, and since every entity I know has a fairly fixed budget for cyber security and CIP compliance, it will have to come from mitigation of other cyber risks – such as phishing or ransomware, which aren’t addressed at all in CIP now.

A requirement like CIP-013 R1 is different. It requires the entity to develop a plan to mitigate risks in one area – in this case, supply chain security. It is up to the entity to decide how they’ll allocate their funds among different supply chain risks. And the best way to do that is to put the most resources into mitigating the biggest risks and the least resources – or none at all – into mitigating the smallest risks. That’s why I have always believed that the first step in CIP-013 compliance - and Lew confirms this in his article - is to identify the important supply chain risks to BES Cyber Systems that your organization faces, then rank them by their degree of risk to the BES. The amount of resources you allocate toward mitigating each risk should be directly proportional to its degree (and the lower risks won’t receive any mitigation resources). This way, your limited funds can achieve the greatest results, because they will mitigate the greatest possible amount of overall risk.

This is why CIP-013 doesn’t say the entity must take certain steps to mitigate risk X and other steps to mitigate risk Y, ignoring all of the other risks. Instead, CIP-013 does exactly what I think all cyber security standards should do: require the entity to follow the same process to mitigate cyber risks that they would follow if they weren’t subject to mandatory cyber security requirements. But entities are mandated to follow this process. It’s not a “framework” that they can follow or not, with no serious consequences if they don’t.

The point about mandatory is key: We all know that utility management makes much more money available to mitigate cyber risks because NERC CIP is in place and violations carry hefty consequences (and the non-monetary consequences are at least as bad as the actual penalties), than they would if CIP weren’t in the picture. The problem is to rewrite the CIP standards so that they don’t distort the process of risk identification, prioritization and mitigation that an entity would follow in their absence – yet still keep the money spigot open because they’re mandatory. CIP-013 comes close to achieving this goal in the domain of supply chain security risk management. We need similar standards for all the other cyber security domains.

Fortunately, the CIP standards are gradually moving toward eliminating prescriptive requirements and implementing plan-based (i.e. risk-based) ones. In fact, the two major requirements drafted (or revised) and approved since CIP v5 (CIP-003 R2 and CIP-010 R4) are both plan-based, and the three new standards drafted since v5 (CIP-014, CIP-013 and CIP-012) are also all plan-based; moreover, it’s almost impossible to imagine a new prescriptive CIP requirement being drafted. But prescriptive requirements like CIP-007 R2, CIP-010 R1 and CIP-007 R1 remain in place, where they continue to require much more than their “fair share” of mitigation resources.

II. Objective-based
Lew’s second point is “CIP-013-1 is an objective-based Standard.” I agree with this, too, but I think it’s redundant. If a requirement is plan-based, it’s ipso facto objective-based, since the purpose of any plan is to achieve its objective. I pointed this out to Lew in an email, and he replied that the redundancy was intended. He continues “I’m trying to lay a strong foundation for future discussion. A plan without an objective isn’t worth the media it’s recorded on. But some entities have had difficulty grasping this idea and need to have it reinforced.” Consider it reinforced!  

Lew goes on to identify the objectives of CIP-013, and this is where I disagree with him (although FERC deserves partial blame for this. See below). To identify the objectives, he goes to the second paragraph of FERC Order 850, which approved CIP-013 in October. Here, FERC states that the four objectives of CIP-013 are

  1. Software integrity and authenticity;
  2. Vendor remote access protections;
  3. Information system planning; and
  4. Vendor risk management and procurement controls.
And where did FERC determine that these are the objectives of CIP-013? If you pore through the standard, I can promise you’ll never find these stated together anywhere, although you’ll find them individually in different places – along with other objectives that don’t seem to have made it into the Final Four, for some reason.

But FERC didn’t make these objectives up; they came from an authoritative source – FERC itself! Specifically, they came from FERC’s Order 829 of June 2016, which FERC issued when they ordered NERC to develop a supply chain security standard in the first place. So it seems FERC, when looking for the purpose of CIP-013, decided that the people who drafted the standard weren’t to be trusted to understand its real purpose, and the best source of information on this topic is…FERC (although, since only one of the five Commissioners who approved Order 829 is still on the Commission, it’s very hard to say that FERC 2018 is the same as FERC 2016).

This would all just be an amusing piece of trivia, if it weren’t for two things. First, FERC’s four objectives are very specific, and are far from being the only objectives found in CIP-013. For example, the first two objectives are found in R1.2, but there are four more items in R1.2 that FERC didn’t include in their list, for some reason. I see no reason why all six of the items in R1.2 shouldn’t be included in a list of objectives of CIP-013, although even that would hardly constitute a complete inventory of CIP-013’s objectives.

Since FERC didn’t do a good job of it, how can we summarize the objectives of CIP-013? It’s not hard at all. We just need to go to the statement of purpose in Section 3 at the beginning of the standard: “To mitigate cyber security risks to the reliable operation of the Bulk Electric System (BES) by implementing security controls for supply chain risk management of BES Cyber Systems.” In my opinion, this is a close-to-perfect summary of what CIP-013 is intended to do.

But Lew isn’t just quoting FERC for people’s edification; he’s saying that the objectives FERC lists should be the objectives that your supply chain cyber security risk management plan aims to achieve. Specifically, he says “Your actions in developing and implementing your plan should be directed toward achieving these four objectives. You should be prepared to demonstrate to an audit team that you meet each of these objectives. These objectives are not explicitly referenced in the Standard language. However, as outlined in the FERC Order, the achievement of these objectives is the reason the Standard was written.”

You’ll notice that Lew states that a NERC entity will need to demonstrate to the auditors that their plan achieves FERC’s four objectives. Now, even though Lew isn’t an auditor any more, I know that his words are taken very seriously by the auditors in all of the NERC Regions. This means most, if not all, auditors will pay attention to this sentence, and therefore you can expect many or even most auditors to ask you to show them that your plan meets these four objectives.

Since I obviously don’t think that FERC’s four objectives are a completely accurate summary of the purpose of CIP-013, am I now saying that Lew has provided misleading advice to NERC entities, so that they’ll end up addressing meaningless or even harmful objectives in their plans? No, there’s no harm in telling NERC entities that their auditors will want to determine if their CIP-013 plan meets each of FERC’s four objectives, since as I’ve said those objectives are all found somewhere in the standard anyway. The problem is that the real objective of CIP-013 is what’s found in the Purpose statement in Section 3; that statement encompasses FERC’s four objectives, and a lot more. This needs to be brought to people’s attention, since neither FERC nor NERC has done so yet.

Why doesn’t Lew instead say that auditors should make sure the entity’s CIP-013 plan meets the stated objective (purpose) of the standard? This could still be followed by FERC’s four things – in order to provide more detail. I think that would work, as long as it’s made clear that FERC’s four things are in no way a summary of everything that needs to be addressed in the plan. The Purpose statement provides that summary. But is that enough detail to make the requirement auditable? That’s a question I’ll discuss below and in Part 2 of this post.

III. Lew’s (implicit) methodology for CIP-013 compliance
Lew’s third point is “CIP-013-1 is a risk-based Standard”. He explains what that means, and in the process specifies (very concisely) a complete compliance methodology, when he writes:

You are not expected to address all areas of supply chain cyber security. You have the freedom, and the responsibility, to address those areas that pose the greatest risk to your organization and to your high and medium impact BES Cyber Systems.

You will need to be able to show an audit team that you have identified possible supply chain risks to your high and medium impact BES Cyber Systems, assessed those risks, and put processes and controls in place to address those risks that pose the highest risk to the BES.

This passage actually describes the whole process of developing your supply chain cyber security risk management plan to comply with CIP-013 R1.1, although it is very densely packed in the passage. Since I’m a Certified Lew Unpacker (CLU), I will now unpack[iii] it for you (although my unpacked version is still very high-level):

Lew’s CIP-013 compliance methodology, unpacked for the first time!
A. The first step in developing your plan (in Lew’s implicit methodology) is that you need to consider “all areas” of supply chain cyber security. I interpret that to mean you should in principle consider every supply chain cyber threat as you develop your plan. Of course, it would be impossible to do this – there are probably an almost infinite number of threats, especially if you want to get down to a lot of detail on threat actors, means used, etc. Could you simplify that by just listing the most important high-level supply chain cyber threats likely to impact the electric power industry? Sure you could, but do you have that list?

And here’s the rub: It would be great if there were a list like that, and it probably wouldn’t be too hard for a group of industry experts to get together and compile it (I’m thinking it probably wouldn’t have many more than ten items). Even better: The CIP-013 SDT was a group of industry experts. Why didn’t they put together a list like that and include it in CIP-013 R1? As it is, there is no list of threats (or “risks”, the word the requirement uses. I have my reasons for preferring to use “threats” – which I’ll describe in a moment) in the requirement, and every NERC entity is on its own to decide what are the most important supply chain cyber security threats it faces. This inevitably means they’ll all start with different lists, some big and some small.

There’s a good reason why the SDT didn’t include a list in the requirement (and, even though I attended a few of the SDT meetings, I’ll admit this omission never even occurred to me): FERC only gave them one year to a) draft the standard, b) get it approved by the NERC ballot body (it took four ballots to do that, each with a comment period), c) have the NERC Board approve it, and d) submit it to FERC[iv] for their approval (and FERC’s approval took 13 months, longer than they gave NERC to develop and approve the standard in the first place). If the SDT had taken the time to have a debate over what should be on the list of risks, or even whether there should be a list at all in the standard, they would never have made their deadline.

This is a shame, though. To understand why, consider one plan-based requirement that does include a list of the risks that need to be considered: CIP-010 R4. Looking at this requirement can give you a good idea of the benefits of having the list of risks in the requirement.

CIP-010-2 R4 requires the entity to implement (and, implicitly, to develop in the first place) “one or more documented plan(s) for Transient Cyber Assets and Removable Media that include the sections in Attachment 1”.  When you go to Attachment 1, you find that it starts with the words “Responsible Entities shall include each of the sections provided below in their plan(s) for Transient Cyber Assets and Removable Media as required under Requirement R4.” Each of the sections describes an area of risk to include in the plan. These are stated as mitigations that need to be considered, but you can work back from each mitigation to identify the risk it mitigates very easily.

For example, Section 1.3 reads “Use one or a combination of the following methods to achieve the objective of mitigating the risk of vulnerabilities posed by unpatched software on the Transient Cyber Asset: Security patching, including manual or managed updates;  Live operating system and software executable only from read-only media; System hardening; or Other method(s) to mitigate software vulnerabilities.” (my emphasis)

The risk targeted by all of these mitigations can be loosely described as the risk of malware spreading in your ESP due to unpatched software on a TCA. What are you supposed to do about it? Note the words I’ve italicized. They don’t say you need to “consider” doing something, they say you need to do it. And since Attachment 1 is part of CIP-010 R4, this means you need to insert the word “must” before this whole passage (in fact, that word needs to be inserted at the beginning of each of the other sections in Attachment 1 as well). You must achieve the objective of mitigating this particular type of risk.

But if I’m saying CIP-010 R4 is a plan-based (as well as objective-based and risk-based) requirement, how is that compatible with the (implicit) use of the word “must”, at the beginning of this section as well as all the other sections? Does this turn R4 into a prescriptive requirement?

I’m glad you asked that question. Even though you have to address the risk in your plan, you have complete flexibility in how you mitigate that risk. The requirement still isn’t prescriptive, because it doesn’t prescribe any particular actions. The same approach applies to each of the other sections of Attachment 1: The risk underlying each one needs to be addressed in the plan, while the entity can mitigate the risk in whatever way it thinks is best (although it must be an effective mitigation. Lew has previously addressed what that means).

Obviously, somebody who is complying with CIP-010 R4 will know exactly what risks to include in their plan for Transient Cyber Assets and Removable Media, due to Attachment 1 (and remember, since Attachment 1 is called out by R4 itself, it is actually part of the requirement – not just guidance). They can add some risks if they want (my guess is very few will do that), but at least they have a comprehensive list to start with.

And more importantly, the auditors have something to hang their hat on when they come by to audit. They can ask the entity to show them that they’ve addressed each of the items listed in Attachment 1, then they can judge them by how well they’ve addressed each one (i.e. how effective the mitigations described in their plan are likely to be, and how effective they’ve actually turned out to be – since most audits will happen after implementation of the plan, when there’s a year or two of data to consider). This is the main reason why I now realize it’s much better for a plan-based requirement to have a list of risks to address in the requirement itself, although that’s not currently in the cards for CIP-013 (it would be nice if NERC added this to the new SAR that will have to be developed to address the two or three changes in CIP-013 that FERC mandated in Order 850, but for some reason they don’t take orders from me at NERC).

You can see the difference this makes – i.e. the difference it makes to have the list of risks that must be addressed in the plan in the requirement itself – by comparing the RSAW[v] sections for CIP-010 R4 and CIP-013 R1. The former reproduces all of the detail in Attachment 1 – making the RSAW a great guide both for auditors and for the entity itself, as it prepares its plan for Transient Cyber Assets and Removable Media. The R4 Compliance Assessment Approach section goes on for more than a page.

And how about the CIP-013 R1 RSAW? Here’s the entirety of what it says for the R1 Compliance Approach: “Verify the Responsible Entity has developed one or more documented supply chain cyber security risk management plans that collectively address the controls specified in Part 1.1 and Part 1.2.” In other words, make sure the entity has complied with the requirement, period. Not too helpful if you’re drawing up your plan, but what more can be said? The RSAW can only point to what is required by the wording of R1.1, and since there is no Attachment 1 to give the entity a comprehensive idea of what they need to address in their plan, all the RSAW can do is point the reader to the wording of the requirement, which only says “The plan shall include one or more process(es) used in planning for the procurement of BES Cyber Systems to identify and assess cyber security risk(s) to the Bulk Electric System from vendor products or services resulting from: (i) procuring and installing vendor equipment and software; and (ii) transitions from one vendor(s) to another vendor(s).”

Not a lot to go on here, although I guess the RSAW could have just turned this into a question: “Does the plan include one or more processes used in planning for the procurement of BES Cyber Systems to identify and assess cyber security risk(s) to the Bulk Electric System from vendor products or services resulting from: (i) procuring and installing vendor equipment and software; and (ii) transitions from one vendor(s) to another vendor(s)?” That would have at least pointed to the need to address these two items - although I break them into three:

  1. Procuring vendor equipment and software;
  2. Installing vendor equipment and software; and
  3. Transitions between vendors.
I think it would be good if the RSAW specifically listed these items (whether it’s two or three doesn’t matter to me) as being required in the plan, since they’re definitely in the requirement.

Even though it’s too late to have a list of risks to address in CIP-013-1 R1 itself, it’s not too late for some group or groups – like the CIPC or NATF, or perhaps the trade associations, for each of their memberships – to develop a comprehensive high-level supply chain security risk list for NERC entities complying with CIP-013 (as well as any utilities or IPPs who don’t have to comply, but still want to).

While the auditors couldn’t give a Potential Non-Compliance finding to an entity that didn’t include all of the risks on the list in their plan, they would be able to point out an Area of Concern – and frankly, that’s probably better anyway. I don’t think there should be a lot of violations identified for CIP-013. Given that R1.1 lacks any specific criteria for what should be in the plan, I see no basis for an auditor to assess any violation of R1.1, unless the entity submits a plan that doesn’t make an attempt to seriously identify and mitigate supply chain security risks at all. More on auditing in Part 2 of this post, coming soon to a computer or smartphone near you!

B. The second step in developing your supply chain cyber security risk management plan, in compliance with CIP-013 R1.1, is to decide the degree of risk posed by each threat. (This is why I preferred to talk about threats at the beginning of this methodology, even though the standard just talks about risks. It’s very awkward to talk about assigning a degree of risk to a risk. A risk is inherently a numerical concept; a threat is just a statement like “A software vendor’s development environment will be compromised and malware will be embedded in the software”. You can legitimately ask “What is the risk posed by this threat, and how does it compare to - say - the threat that malware will be inserted into a software patch and delivered to our systems?” It’s much more difficult – although not impossible – to ask “What is the degree of risk posed by this risk, and which of these two risks is riskier?” It begins to sound like the old Abbott and Costello routine, “Who’s on first?”)

How do you quantify the degree of risk posed by a particular threat? You need to consider a) the potential impact on the BES[vi] (remember, all risks in CIP-013, like all NERC standards, are risks to the BES, not to your organization) if the threat is realized in your environment, as well as b) the likelihood that will happen. You need to combine these two measures in some way, to come up with a risk score. Assuming that you’re assigning a high/medium/low value to both impact and likelihood (rather than trying to pretend you know enough to say the likelihood is 38% vs. 30%, or the potential impact on the BES is 500MW vs. 250MW, which you don’t), I recommend adding them. So if you assign values of 1, 2 and 3 to low, medium and high, and the likelihood is low but impact is high (or vice versa), this means the risk score for this threat is 4 out of a possible 6 (with 2 being the lowest possible score).
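The additive scheme just described can be sketched in a few lines of code. This is only an illustration of the arithmetic, assuming the high/medium/low values of 1, 2 and 3 from the paragraph above; nothing here is mandated by CIP-013.

```python
# Sketch of the additive high/medium/low risk scoring described above.
# The 1/2/3 values follow the example in the text; they are a choice,
# not anything required by the standard.
LEVELS = {"low": 1, "medium": 2, "high": 3}

def risk_score(impact: str, likelihood: str) -> int:
    """Combine BES impact and likelihood (each 'low'/'medium'/'high') by addition."""
    return LEVELS[impact] + LEVELS[likelihood]

# High impact but low likelihood (or vice versa): 3 + 1 = 4 out of a possible 6.
print(risk_score("high", "low"))  # 4
print(risk_score("low", "low"))   # 2 (the lowest possible score)
```

Adding the two values (rather than, say, multiplying them) keeps the scale small and easy to explain to an auditor; either convention works as long as you apply it consistently across all threats.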

C. The third step is to rank all of the threats by their risk scores. Once you have your ranked threat list, you instantly know which are the most serious supply chain cyber security threats you face: they’re the ones at the top of the list.

D. The fourth step is to develop a risk mitigation plan for each of the top threats. As mentioned earlier, there’s no question that you won’t be able to completely mitigate any cyber threat. The most you should aim for is to bring the level of risk for each threat on your “most serious” list down to a common lower level (say, you’ll aim to bring all threats with a risk score of 5 or 6 down to a risk score level of 3 or 4), at which point other unmitigated threats will then pose higher levels of risk; if you still have resources available to you, you should consider mitigating those “second-tier” threats as well. But whatever your available budget, you should invest it in mitigating the highest risks – that way, you’re getting the most bang for each hard-earned buck.
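Steps C and D can be sketched the same way: rank the scored threats, then flag everything at or above a chosen cutoff for mitigation. The threat names, scores, and the cutoff of 5 below are all illustrative assumptions (the cutoff follows the "bring scores of 5 or 6 down to 3 or 4" example in the text), not anything the standard prescribes.

```python
# Illustrative ranking and tiering of scored threats (steps C and D).
# Threat names and scores are made up for the sketch.
threats = {
    "compromised vendor development environment": 6,
    "malware inserted into a software patch": 5,
    "insecure vendor remote access": 4,
    "counterfeit hardware component": 2,
}

# Step C: rank threats from highest to lowest risk score.
ranked = sorted(threats.items(), key=lambda kv: kv[1], reverse=True)

# Step D: every threat at or above the cutoff gets a mitigation plan;
# lower-scoring threats wait until (and unless) budget remains.
CUTOFF = 5
to_mitigate = [name for name, score in ranked if score >= CUTOFF]
print(to_mitigate)
```

Once the top-tier threats are mitigated down to the target level, re-scoring the list will tell you which "second-tier" threats now sit at the top, which is exactly the budget-allocation logic described above.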

In Part 2 of this post, I’ll discuss what Lew says (or implies) regarding how CIP-013 will be audited, as well as relate an email discussion he and I had on this question.

While you’re anxiously awaiting Part 2, you might re-read (or read for the first time) this post describing my free CIP-013 workshop offer, which is still on the table. While I have had a number of takers for this so far, I regret to say that an even larger percentage of the 500 or so NERC entities with High and/or Medium impact BES Cyber Systems have yet to contact me about the offer – as unbelievable as this may seem (sniff). If you’re one of those entities, I promise I’ll have no hard feelings if you drop me an email at tom@tomalrich.com saying you’d like to set up a time to discuss this on the phone. All is forgiven!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC.

[i] You could certainly argue that CIP-014 is a plan-based standard, although I think it falls short in a few ways. So let’s leave it out now, and say we’re just talking about cyber security standards.

[ii] On the other hand, I don’t call CIP-008 and CIP-009 plan-based, even though they both explicitly call for plans. They are definitely objective-based, with the objective being an incident response plan or backup/recovery plan(s) respectively. But in my view of a plan-based requirement, the objective is always managing a certain area of risk. In CIP-010-2 R4, it’s risk from use of Transient Cyber Assets and Removable Media. In CIP-003-7 R2, it’s risk of cyber or physical compromise of BCS at Low impact assets. In CIP-011 R1 it’s risk of compromise of BES Cyber System Information. And in CIP-007 R3 it’s risk of infection by malware. But CIP-008 and CIP-009 aren’t directly risk-based, not that they’re bad standards, of course. They both call for development and testing of a plan, but risk only enters tangentially into the plan (if at all), since, say, a CSIRP for an entity in a very risky environment should probably be more rigorous than one for an entity in a “normal” environment.

[iii] I readily admit that what I write in the rest of this post isn’t all just an expansion of what Lew says in these two short paragraphs! However, each of the compliance steps that I discuss below is implicit in those two paragraphs. If you don’t believe me, I can prove it to you using advanced mathematics.

[iv] I’ll admit this is just my speculation. It’s not so much that the SDT wanted to draw up the list and didn’t have time to do it, as that they never had the leisure to even consider more philosophical questions like this; they were racing against the clock during their whole existence.

[v] RSAW stands for Reliability Standard Audit Worksheet. It’s the guide that NERC auditors use for audits. And for that reason, it’s also the guide that NERC entities use as they set up their compliance program for a particular standard.

[vi] Lew made the following comment at this point in the post: “This is a concept that needs more attention. I think entities should give consideration to those threats that could result in damage to difficult-to-repair equipment, such as large transformers, generators, turbines, boiler feed pumps, etc. If you can take over the control system for such equipment and run it to destruction, that is a risk with higher consequence than merely opening a breaker and causing a short outage. And I think this is the type of compromise that a nation-state would be interested in.” A word to the wise!

Wednesday, January 2, 2019

Solving the vendor problem (both of them!)



Introduction
I have been thinking about vendors a lot this fall, for three main reasons. First, the Russian attacks that were played up by DHS this summer (and I owe a new post on that subject – please don’t cancel your subscription. It will come soon) came entirely through vendors. It doesn’t seem like the Russians succeeded in their goal of implanting malware in the systems that control the grid, but they definitely were able to penetrate a lot of vendors. Second, I’ve been doing work for a vendor who is considering registering with NERC as a Generator Operator. And third, CIP-013 has a lot to do with vendors, in case you hadn’t noticed that.

It seems to me there are two big problems related to power industry vendors (at least those that sell control systems – although those are a pretty significant fraction of total product vendors to the industry, since it seems there are very few products – well, maybe insulation and wire – that don’t come with their own control systems, or at least the capability of being controlled). One is the industry’s problem, the other is the vendors’ problem. I will deal with these separately.

The industry’s problem
The industry’s problem is due to the current big concern with supply chain security, and with CIP-013 compliance in particular. The problem can be simply stated: On the one hand, having good supply chain security requires getting vendors to do a number of things, six of which are described in CIP-013 R1.2 (although these are far from being the only concerns that the industry has with vendors). On the other hand, at least for larger vendors, individual utilities don’t have much leverage to get them to do something they don’t want to do.

Moreover, the industry can’t gang up on a recalcitrant vendor and say “Mr. Vendor, either you take the actions on this list or you can say goodbye to any further power industry business.” This would be a clear violation of antitrust laws. Anyone who has attended even just a few NERC meetings probably can cite the antitrust disclaimer by heart. It specifically forbids any discussion of vendors at the meeting.

So what does the industry do about this? Unfortunately, some people – including, it seems, many at NERC – seem to think that the ultimate solution to this problem is getting vendors to agree to certain contract language. Their thinking seems to be that, as long as a few words are on some parchment somewhere, the “vendor problem” is more or less solved. In fact, I heard a senior NERC official summarize CIP-013 for a group of security professionals as “a standard for incorporating cyber security language into vendor contracts” – or words to that effect. This person was clearly saying that contract language was the whole purpose of CIP-013, although he backtracked when I questioned him on it afterwards.

I wrote a post on contract language early this year, and followed it up with a post summarizing an email exchange with an auditor, which was prompted by the original post. The point of both posts was that, while you do need to be able to demonstrate that you tried to get a vendor to do certain things to comply with CIP-013 R1, and while there needs to be some written documentation of whatever the vendor agrees to, this in no way requires contract language itself. In fact, I think of contract language as the last resort, not the first one, for these reasons:

  1. It is by far the most expensive way to get a vendor to commit to doing something, since it will require lots of lawyers’ time on both sides – and this is even assuming the vendor is willing to negotiate in the first place. If they won’t negotiate this, you have to start examining other vendors and determining what kind of contract language they’ll accept, then threaten to leave the original vendor if they don’t give in. None of that is easy or fun to do.
  2. It is even more “expensive” in the political sense, since hammering your vendor with lawyers is a terrible way to achieve what should be your primary aim: having a close relationship, based on mutual trust, in which both parties work together to achieve the common goal of secure products and services.
  3. You’re still going to have to provide some documentation that the vendor is actually doing what they promised to do in the contract. This is because, contrary to the NERC official I paraphrased earlier, CIP-013 is a standard for management and mitigation of supply chain security risks, not contract language – which is just one of many possible mitigation measures. Just getting a vendor to put language in a contract doesn’t in itself mitigate one iota of risk. You need to take steps to mitigate each risk, and if your first steps are contract language and they fail (either because the vendor wouldn’t agree to the language you wanted, or because they agreed to it but didn’t live up to the agreement), you still need to do something to mitigate the risk, either with or without the cooperation of your vendor.
  4. Think of this: What if you get your vendor to agree to your contract language, but they don’t live up to the agreement? You then have your lawyers send them multiple letters, and they ignore all of them. Furthermore – and this is very commonly the case – suppose that switching to another vendor would be difficult, if not impossible. What’s your leverage at this point? I guess you can sue the vendor – and with a little luck, in three years you may get some money from them, or at least a promise to actually live up to the language they agreed to. How has all of this in any way made the Bulk Electric System more secure? All of this time, you’ve presumably been without protection against the risks that the vendor refuses to mitigate.
  5. Perhaps most important, even if you get the vendor to agree to your language, you very well may not be allowed by your lawyers to show the auditors the main piece of evidence: the contract. Because of the special provisions in the requirement, the auditors won’t be able to demand you show them the contract, but – as an auditor pointed out to me in the second of the two posts linked above – they will be able to require that you show them some written evidence that the vendor agreed with the language. That might be a separate letter, or even an email. But in that case, what’s the point of the contract? The vendor is agreeing in a letter to do what you’re asking. Why not just go for the letter in the first place, and save thousands of dollars and maybe a couple months of delay by not worrying about the contract in the first place?
So contract language isn’t the magic bullet for getting vendors to take security seriously. Then what is? How about certification? There are various certifications that the vendor can achieve (e.g. UL 2900), which will presumably demonstrate that their products are secure. Why not just require they get this certification? Well, for one thing, getting these certifications is very expensive. Unless a lot of other NERC entities are demanding the same thing, you’re unlikely to be able to force a vendor to do this, unless you’re negotiating a huge contract with them.

Well, how about getting a bunch of friends at other utilities together and all demanding the same certification of the vendor? Or maybe you can “name and shame” the vendor in industry forums or online? You know the answer to that: You (and your friends) will be violating antitrust laws. The fact is that there isn’t currently any way that one or more NERC entities can force a vendor to comply with a certain set of cyber security requirements, if the vendor really doesn’t want to do it and is prepared to sacrifice some industry market share because of that decision.

The vendors’ problem
Let’s turn now to the other problem: the one the vendors have. That problem can be stated very simply: Even before CIP-013 made its appearance, but certainly after that, vendors have been trying to figure out a way to head off demands for contract language at the pass. They know how expensive it is to negotiate contract language with just one customer. What do you do if every one of your customers starts coming at you with their own customized set of contract language (in some cases, simply downloaded from the internet, and having little or nothing to do with threats to control systems)? I know of one vendor that complained of exactly this situation in early 2018, and I’m sure they’re not alone. How can a vendor assure their power industry customers as a whole that they have good cyber security?

Now let’s go back to the second reason why I’ve been thinking about vendors lately: because I’ve been working for one. This vendor – who sells a product extensively used in power generation, and only in power generation – contacted me over the summer and said they wanted to register with NERC, so that they could be subject to compliance with the NERC CIP standards.

I naturally first thought they should see a psychiatrist, not me, but as I worked with them I began to see why they want to do this: They ultimately want to sell additional services to their customers that would require their having a secure environment to deliver those services. What better way to do that than to point out to your customers “Hey, we comply with NERC CIP, just as you do”? Complying with CIP is expensive, but negotiating a bunch of contracts with every customer is probably even more expensive, and much less productive.

Does this solve the vendors’ problem? I think it does. While NERC entities might have differences on what would be an ideal standard to have vendors comply with, I think they will (almost) all agree that a vendor that complies with CIP will have good security, for the simple reason that the NERC entities themselves are CIP compliant and (hopefully) have good security – why shouldn’t that be the case with a vendor as well? In fact, NERC entities could simply put wording in their RFPs and contracts saying that the vendor must be registered with NERC and compliant with the CIP standards. Since they’re not ganging up with other NERC entities in doing so (and I’m sure some NERC entities wouldn’t want to have this language in their contracts – that’s fine), I don’t think there’s a question of running afoul of antitrust law.

But I suppose some lawyer might find a reason why even this violates antitrust laws. In that case, nobody needs to mention CIP compliance in their RFPs or contracts. I contend that many vendors will still feel just as compelled to become CIP compliant as they would if forced to by a contract – especially if a lot of their customers make clear to them that it would be really nice if they became CIP compliant.

Is this a solution to both problems?
And how about the industry’s problem discussed at the beginning of this post, which – to refresh your memory – is that they don’t have a way of enforcing standard cyber security practices on vendors? Will having a way for vendors to register with NERC and comply with CIP solve this problem? I believe it will, with one caveat: This can’t be a forced process. For one thing, NERC and FERC don’t have any jurisdiction over vendors, so they certainly can’t force them to register – as they can with power market participants.[i]

But this is an advantage, not a drawback, of this idea. If vendor registration is voluntary, there can be no question of antitrust violations, as there would inevitably be with any sort of forced registration. If a vendor doesn’t see the need to register and comply with CIP, fine. The NERC entities will be able to decide individually in what cases they will continue to be a customer of a non-CIP-compliant vendor. Some entities might want all their OT hardware and software vendors to be registered and CIP compliant; others might want just their most strategic vendors to be registered; while others (perhaps smaller entities with only Low impact assets, who don’t have to comply with CIP-013) would continue business as usual with their current vendors, compliant or not.

Meanwhile, some vendors wouldn’t want to lose a single power industry customer, and will be the first to line up at NERC’s door, demanding to register and be made to comply with CIP. Others (probably larger ones that sell to lots of industries, not primarily the power industry) might assign someone to start studying this issue and figuring out what it would cost to register and become CIP compliant. If they decide the costs are less than the benefits (retained power industry customers), they’ll probably register; if they don’t, they probably won’t. In the end, you’ll have a mix of CIP-compliant and non-CIP-compliant vendors, and a mix of NERC entities that want their vendors to be compliant and others that couldn’t care less about vendor CIP compliance. All voluntary – no coercion. Life is good.

And there’s a special benefit for NERC entities that are subject to CIP-013 (i.e. ones that own or operate High and/or Medium impact assets): I think that they will be well advised to include registration and CIP compliance in all of their RFPs, as well as in future contract language (again, subject to antitrust vetting). With one fell swoop, they will be gaining a level of assurance about their vendors’ security, and – most importantly – their CIP-013 auditors are very likely to share that level of assurance. After all, CIP-013 auditors are likely to already audit the other CIP standards. While they may not agree with those standards 100%, they will certainly have a strong level of comfort that, if a vendor has passed a CIP audit, the vendor has pretty good security.

Of course, this can’t be codified in the CIP-013 requirements themselves – i.e. the next version of CIP-013 can’t include a requirement that the entity’s vendors be CIP compliant; this would again be a pretty clear antitrust violation. But it won’t have to be codified: Since there will likely need to be “vendor” versions of the CIP standards (see below), it will be easy for auditors and NERC entities to weigh in on the standards drafting process, to make sure that “Vendor CIP” includes the types of requirements that the industry thinks are necessary for vendors. As a result, Vendor CIP will ipso facto be as close as possible to an “industry standard” for vendors, but the fact that compliance will be audited makes this a standard that will definitely be respected.

A couple problems, and solutions
However, all of this isn’t a straightforward proposition, for two reasons. The bigger reason is that, even though vendors typically won’t fulfill the tasks of any of the NERC Functional Model designations, they still need to register as something. My client chose GOP as the closest designation to what they actually do (and that’s not very close). But a GOP doesn’t just have to comply with the CIP standards – they have to comply with a number of Operations and Planning standards, very few (if any) of which have anything to do with what my client actually does.

Rather than have to pretend to comply with all of these other standards, my client will most likely try to get their compliance scope reduced by their Regional Entity for most of the Operations and Planning (i.e. “693”) standards, although there’s no assurance they’ll be allowed to do this. So it’s possible they will have a big compliance documentation burden, even before CIP comes into the picture.

What would be much better would be if there were a NERC Functional Model designation for vendors – let’s call it VEN. The definition of VEN would be something like “an entity that has the capability to monitor or control BES Cyber Systems in real time, through remote electronic access or on-site physical access, but does not otherwise meet any Functional Model designation”. A VEN would probably only have to comply with CIP, although I’m not ruling out some other standard that might apply to them. But a VEN certainly wouldn’t have to comply with the full range of O&P standards.[ii]

Of course, there’s another vendor infrastructure that needs to be secured besides the infrastructure (and people) used for remote access – and this is far more challenging. I’m speaking of the systems and processes used in designing, manufacturing and integrating the products the vendor sells, and even more importantly those used by the vendors’ suppliers (and those suppliers’ suppliers). Were a line of motherboards to be compromised in China so that they phoned home information about power flows, etc. (as a long article in Bloomberg Businessweek alleged had happened – the credibility of that article has been called into question, but in principle this is something that could happen), this would pose a very serious threat, and one that is very hard to mitigate.

So maybe there should be two designations. VE1 would be the VEN designation I described above. VE2 could be something like “an entity that designs, manufactures or integrates hardware or software products used in BES Cyber Systems”. Vendors might sign up for one or both of these designations.

However, as just discussed above, I think there should really be a vendor version of CIP – although it’s certainly not out of the question that a vendor would be able to comply with the “regular” CIP standards (and my client intends to do exactly that). The problem isn’t so much that the vendors can’t comply with the current CIP standards, as that the BES Cyber System definition (including the BES Cyber Asset and Cyber Asset definitions) doesn’t apply to all of the systems that should be in scope for the vendors.

While it would be interesting to sit down and try to draw up a detailed Vendor CIP, for now I’ll just list what I think should be some of its main features (and challenges):

  • There is a real applicability problem, since a vendor by definition doesn’t own BES Cyber Systems. Even if they regularly operate BCS, they’re the customer’s systems, and the customer itself needs to comply with CIP for them. There might even need to be a new type of Cyber System called a “Vendor Control System”, which could be defined as something like “A system used to monitor and/or control BES Cyber Systems, whether remotely (which would mean from the vendor’s own remote control facility) or onsite (which would usually mean a vendor laptop used by a technician to control the vendor’s systems while onsite at one of the customer’s BES assets)”.
  • There could also be “Vendor Development Systems” and “Vendor Manufacturing Systems”, which would only be found at vendors of the second type described above. The requirements for these systems would obviously be very different from those for systems used in remote control or monitoring of BCS (for one thing, Secure Development Lifecycle would be a big component of the VE2 CIP standard(s), while the VE1 standard(s) would be very focused on securing any system that would ever be used for interactive or machine-to-machine access to customer BCS).
  • Now that I think of it, developing requirements for the VE2 vendors will be much more involved than for the VE1 vendors. Since the Russian attacks show that the latter are at the moment a much bigger threat to the BES, it might be best to first concentrate on developing CIP standards just for them, unless both can be done at once.
  • Along with controls for Interactive Remote Access, vendors would need controls for “Interactive Remote Monitoring or Control” (IRMC) – i.e. what they do for their customers’ systems. There would also need to be personnel controls for these vendors (especially for employees that either come onsite to customers or access customer systems remotely) and of course regular computer controls like patch management, ports and services control, etc. (since both the systems used for IRMC and those used for machine-to-machine access to BCS clearly pose a high risk to the BES).

As I said, I could go on and on about this subject, but this post is already long enough. Please let me know what you think of this idea.


Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com. Please keep in mind that if you’re a NERC entity, Tom Alrich LLC can help you with NERC CIP issues or challenges like what is discussed in this post – especially on compliance with CIP-013; we also work with security product or service vendors that need help articulating their message to the power industry. To discuss this, you can email me at the same address.

[i] Of course, there are a small number of vendors who are power market participants. These include vendors that provide outsourced services to NERC entities and other power market participants – services that those entities would otherwise have to provide themselves. For example, an organization that runs power plants owned by NERC entities – probably from the vendor’s own Control Center – is a true GOP, and I’m sure they will be compelled to register as such, if they don’t do so voluntarily. Of course, these aren’t the vendors this post is about. It’s about vendors of hardware and software products, as well as vendors who provide services on a contractual basis and don’t actually take on responsibility for the services of one of the NERC Functional Model designations.

[ii] I wish to point out here that great minds think alike, and it seems that Tobias Whitney of EPRI (and previously of NERC, as I’m sure many of you know) also came up with the idea of having vendors register with NERC and comply with CIP – he brought it up in a discussion of cloud vendors at the NERC CIPC meeting in Atlanta a few weeks ago. I discussed this with him after the meeting, and he hadn’t thought about the idea that there would need to be a new Functional Model registration for vendors. However, I don’t think many vendors will register if they may be on the hook for at least some of the 693 standards as well, even though they really don’t perform functions that need to be regulated by NERC, as cyber security does.