Tom Alrich's Blog: Has CIP-013 worked?

When FERC mandated in July 2016 that NERC develop a supply chain cybersecurity risk management standard, just about everyone (including me) focused on the fact that it was probably the first supply chain cybersecurity standard outside of the military – in the US and possibly anywhere. This was true, but NERC CIP-013 (produced in response to FERC’s order) was also one of the first cybersecurity risk management standards: i.e., a standard whose explicit goal was establishment of a cyber risk management program. I think that will be CIP-013’s most important legacy. This is why I say that:

A brief, admittedly biased history of the CIP standards

No NERC standards before CIP version 1 had even mentioned anything about risk. And why should they have? They were all about physical actions: trimming trees under Transmission lines, balancing load and supply in real time, etc. The results of these actions could be objectively measured and needed to be kept within certain operating limits. Risk management had nothing to do with those standards. They were based ultimately on the laws of physics, which in themselves don’t understand risk.

However, the team that drafted CIP version 1 knew that cybersecurity was different. They knew there’s just about zero historical data that will let you make predictions like, “If we decrease our patching interval from 45 to 30 days, our chances of suffering a cyberattack – that could cause an adverse grid event – will go down by 15%.”

Instead, the only thing you can say with certainty is, “Patching more frequently will in general lower the risk that we will be compromised through an unpatched software vulnerability.” Does that mean a utility should patch every hour, even if it means shutting down the rest of its operations and putting its entire staff to work on patch management? No. Like every organization, electric utilities have limited resources and have many other risks (both cyber and otherwise) that they need to manage. They need to balance their various risk management activities against each other, so they can reduce as much overall risk as possible, given their available resources. Oh, and they have to keep the lights on at the same time.

In other words, cybersecurity is at heart about risk management. An important principle of risk management is accepting risks that are too expensive to mitigate, when weighed against the benefits that would be realized by mitigating them. Sure, a meteor strike on your headquarters would have a devastating effect on an organization (to say nothing of their people). Does that mean that every organization in the world should spend every extra dime (shekel, euro, rupee, etc.) they have on protecting their headquarters against a meteor strike?

No, because risk is a combination of likelihood and impact. The impact of a meteor strike would be huge, but the likelihood is so small that the risk itself is close to zero. No organization that I know of spends anything to fortify their headquarters (or anything else) against meteor strikes. By the same token, there’s no amount of spending on cybersecurity that would allow any organization – electric utility, dry cleaners, pizza parlor, the US military, etc. – to say it was perfectly cybersecure. There ain’t no such thing. There will always be some residual risk that the organization needs to accept.
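The likelihood-and-impact point can be put in numbers with a minimal Python sketch. Everything in it is an assumption made up for illustration: the `Risk` class, the annual-probability scale and the dollar figures. None of it comes from NERC or from any CIP standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: float  # assumed annual probability, 0.0 to 1.0 (illustrative)
    impact: float      # assumed loss in dollars if the event occurs (illustrative)

    def exposure(self) -> float:
        """Risk expressed as the product of likelihood and impact."""
        return self.likelihood * self.impact


# Made-up numbers, chosen only to show the shape of the argument.
meteor = Risk("meteor strike on headquarters", likelihood=1e-9, impact=500_000_000)
unpatched = Risk("compromise via unpatched software", likelihood=0.05, impact=2_000_000)

# Rank the risks from largest to smallest exposure.
ranked = sorted([meteor, unpatched], key=Risk.exposure, reverse=True)
for risk in ranked:
    print(f"{risk.name}: {risk.exposure():,.2f}")
```

With these hypothetical inputs the meteor strike has by far the larger impact, yet it lands at the bottom of the ranking because its likelihood is so small.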

For this reason, the CIP v1 drafting team included in many of the CIP v1 requirements the words “…or acceptance of risk…” In other words, if a utility believed that the cost of fully complying with the words of the requirement would outweigh whatever benefits (in risk reduction) would be achieved by compliance – and could document its reasons – it would have the option of not fully complying with those words. And it wouldn’t be held in violation of the requirement.

But FERC wouldn’t have any of this. When they approved CIP v1 – after 17 months of consideration – in early 2008, they said, somewhere in their 800 (or so)-page Order 706, that they wanted NERC to start work on a new version that would, among many other things, eliminate the wording about acceptance of risk. And in the meantime, NERC needed to audit as if those forbidden words weren’t there (by the way, if you want to see a very impressionistic history of NERC CIP that I wrote in 2018, go here).

But this caused a problem: The team that drafted CIP v1, believing that the language about acceptance of risk would remain in the requirements they were drafting, deliberately made the requirements quite prescriptive. After all, if a NERC entity found they couldn’t follow the prescriptive wording of the requirement, they could always just accept the risk. What could possibly go wrong?

As it turns out, a lot. By eliminating “acceptance of risk” but not changing the overly prescriptive nature of the CIP v1 requirements, FERC left NERC with the worst of both worlds: hard-edged prescriptive requirements with no means of “softening” them through risk considerations.

Over the first four or five years of CIP enforcement – and as CIP v1 was replaced by v2 and then v3 – there were loud and growing complaints from NERC entities about the amount of time and money they were being required to spend to comply with rigidly prescriptive requirements. Yea, great was the weeping and wailing and gnashing of teeth over NERC’s “zero-tolerance” (i.e. non-risk-based) enforcement of the CIP requirements.

It wasn’t until CIP version 5 came into effect in 2017 that there were any CIP requirements (let alone entire standards) that at least implicitly allowed NERC entities to take risk into account: these pioneering requirements, both part of CIP v5 (which was a complete rewrite of all the CIP standards), were CIP-011-5 R1 (Information Protection) and CIP-007-5 R3 (Anti-Malware)[i].

But the watershed event in moving to a risk management approach was FERC Order 829, which in July 2016 ordered NERC to develop a supply chain security standard. Even though NERC had never – since their unfortunate experience with “acceptance of risk” – officially referred to “risk” in any standard, in Order 829 FERC specifically called for a “supply chain (cyber) risk management” (my emphasis) standard. And they specifically warned against “one size fits all” – i.e. prescriptive – requirements.

Why this change of heart? For one thing, FERC certainly had heard the complaints about the then-current standards (when they issued Order 829 in 2016, the industry was still complying with CIP version 3, although they were preparing for CIP v5, which came into effect on July 1, 2017), and knew that the last thing NERC entities needed was another highly prescriptive CIP standard.

But I believe (and believed in 2016) that the main reason FERC wanted the new standard to be risk-based was that there was simply no other way to do it. Remember, even though the burden of compliance with CIP-013 falls on electric utilities (and other bulk electric system entities like federal power marketing agencies and independent power producers), the standard is really aimed at the suppliers of the hardware, software and services that those utilities rely on to operate the BES.

Let’s say a utility decided to hold all of those suppliers to a very high standard of cybersecurity, e.g. ISO 27001. For some suppliers, such as the supplier of the energy management system (EMS) that literally runs the utility’s own corner of the grid, it makes sense to require this. However, for other suppliers, such as perhaps the vendor of maintenance services in a power plant, this would be overkill.

If the utility tried to force the maintenance services vendor to comply with ISO 27001, they would quickly find themselves looking for a new vendor. A vendor isn’t going to spend many times the profit they may make from a customer in order not to lose that customer; they’ll simply try to find another customer (perhaps in another industry) that isn’t so demanding. And if the vendor can’t find another customer to replace the utility, they would still be much better off financially than if they’d agreed to pay for an ISO 27001 audit just to keep one customer.

And this is the problem with supply chain risk management: The organization has to convince its vendors to incur costs in order to keep it as a customer. The vendors will never do that if the organization requires them all to comply with the same high standards, rather than tailoring its requirements to the degree of risk posed by each vendor. A supply chain risk management standard like CIP-013 has to be risk-based, if it is to have any chance of succeeding.
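One way to picture tailoring requirements to vendor risk is the short Python sketch below. It is only an illustration under stated assumptions: the 1-to-5 scoring scale, the tier thresholds, the `tier_for` function and the requirement lists are all hypothetical, and CIP-013 prescribes none of them.

```python
from enum import Enum


class Tier(Enum):
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"


# Hypothetical requirement sets per tier; a real plan would define its own.
REQUIREMENTS = {
    Tier.HIGH: [
        "independent security certification (e.g. ISO 27001)",
        "security questionnaire",
        "contract language on vulnerability disclosure",
    ],
    Tier.MEDIUM: [
        "security questionnaire",
        "contract language on vulnerability disclosure",
    ],
    Tier.LOW: ["contract language on vulnerability disclosure"],
}


def tier_for(likelihood: int, impact: int) -> Tier:
    """Assign a tier from 1-5 likelihood and impact scores (assumed scale)."""
    if not (1 <= likelihood <= 5 and 1 <= impact <= 5):
        raise ValueError("scores must be between 1 and 5")
    score = likelihood * impact
    if score >= 15:  # assumed threshold
        return Tier.HIGH
    if score >= 6:   # assumed threshold
        return Tier.MEDIUM
    return Tier.LOW


# Illustrative scores for the two vendors discussed above.
vendors = {
    "EMS supplier": (4, 5),
    "plant maintenance services": (2, 2),
}
for name, (likelihood, impact) in vendors.items():
    tier = tier_for(likelihood, impact)
    print(f"{name}: {tier.value} -> {REQUIREMENTS[tier]}")
```

With these made-up scores, the EMS supplier lands in the high tier and is asked for certification, while the maintenance vendor only has to accept contract language.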

The NERC standards drafting team (SDT) took Order 829 to heart when they developed CIP-013. In fact, in my opinion, the SDT went a little too far. CIP-013 consists of a grand total of five sentences (although with a number of clauses), divided into three requirements:

1. The first part of requirement R1 (R1.1) tells the utility to develop a “supply chain cyber security risk management plan(s)”. The plan needs to “..identify and assess cyber security risk(s) to the Bulk Electric System from vendor products or services resulting from: (i) procuring and installing vendor equipment and software; and (ii) transitions from one vendor(s) to another vendor(s)..”

2. The second part of R1 (R1.2) identifies six risks that need to be included in the plan, such as “Disclosure by vendors of known vulnerabilities related to the products or services provided to the Responsible Entity”. While all six of these risks are important ones, they weren’t developed as some sort of comprehensive catalog. Rather, the drafting team (and I attended a number of their meetings) simply gathered in one place six statements about particular risks that FERC made at various random points in Order 829. These six risks are by no means the only six that need to be included in the utility’s supply chain cyber risk management plan. But these are the only risks that FERC specifically required to be included in the plan; the utility is free to decide for themselves what other risks should be included.

3. R2 requires the utility to “..implement its supply chain cyber security risk management plan(s) specified in Requirement R1.” That’s literally all it says.

4. R3 requires the utility to review its plan every 15 months.

I just summarized the entirety of CIP-013. The utility has lots of freedom to develop their supply chain cyber risk management plan in R1, but in R2 they have no freedom at all: they have to implement the plan as written. However, this isn’t as bad as it sounds: If the utility decides they made a mistake in their original plan – or if they realize that changed circumstances require a change in the plan – they’re free to change it at any time; they just have to document the changes they made and why they made them.

When NERC entities were starting to think about CIP-013 compliance, I know some of them made a very understandable mistake (especially for NERC entities): They focused on the six items in R1.2 and considered these to be the total of what’s required by CIP-013. After all, those six requirement parts were the most like the existing CIP requirements; why shouldn’t they be the only real “requirements” in CIP-013?

However, taking that attitude was based on completely ignoring the wording at the beginning of CIP-013 R1: “Each Responsible Entity shall develop one or more documented supply chain cyber security risk management plan(s)… The plan(s) shall include:..” This is followed by the text of R1.1 and R1.2, meaning that what’s described in both R1.1 and R1.2 needs to be included in the plan. So, without a doubt, the plan needs to include the six R1.2 items. But it also needs to include R1.1.

What threw these utilities off was probably the fact that R1.1 doesn’t prescribe anything in particular: It just requires the entity to develop a plan to “identify and assess” supply chain cybersecurity risks. In other words, it’s up to the entity themselves to determine what are the risks they face. Of course, that idea was a 180-degree change from all previous NERC standards. If you were complying with any of the CIP version 3 standards, or the FAC, BAL, etc. standards, and you told an auditor that you should be allowed to decide what goes in your compliance plan, not NERC, how far do you think that would get you?...You’re right – not very far at all.

For this reason, it’s understandable that NERC entities wouldn’t know what to do with a requirement that says it’s up to them to decide what goes into their supply chain cyber risk management plan. What’s to prevent an entity from simply including the six R1.2 items in their plan? In other words, how could they be found non-compliant if that’s all they put in their plan?

They can be found non-compliant because they’d have to convince the auditor, as part of their R1.1 compliance evidence, that they searched high and low to find any other supply chain cyber risks that applied to them, beyond the six risks in R1.2 – and they just couldn’t find any others at all. If I were the auditor and that evidence were presented to me, I’d just ask some questions like:

1. What about SolarWinds-type attacks? Don’t you think you should be concerned about those? More importantly, what have you done that makes you immune to such attacks?

2. What about Log4j-based attacks? Have you determined that you don’t have any log4j at all in your environment (even in a component of a component of a component)?

3. How about the risks identified in the NATF Criteria? They include the six R1.2 items, but they go well beyond those (there are 60 Criteria now), including:

a. The risk that a software or device supplier won’t follow a secure development lifecycle (criterion #47);

b. The risk that a supplier might install a backdoor while developing a product and not remove it before they ship it to you, leading to your being compromised (criterion #15);

c. The risk that a product supplier might not conduct 7-year background checks on its employees (criterion #4); etc.

These are all supply chain cyber risks. The entity will have to convince the auditor that they at least considered all of these risks when they were developing their plan. And if they left any of them out of the plan, they’ll have to provide some justification, at least for each of the 60 NATF Criteria they omitted. From what I’ve heard about how CIP-013 is being audited, and from the presentations I’ve seen by regional auditors on the subject, I really think an entity that only included the R1.2 risks in their plan would find the auditor to be fairly skeptical.

In retrospect, it would have been good if the drafting team had included a list of say ten areas of supply chain security risk that the entity needs to consider in their plan; if they decide one or more of those areas don’t apply to them, they need to document that fact. These areas might include:

1. Software supply chain risks, including the risk that a malicious party might have implanted a backdoor in the software while it was being developed, as in SolarWinds; or the risk that a serious vulnerability would be identified in a widely used open source component like Log4j, making it difficult even to find all the vulnerable instances on your network.

2. Inadequate protection of the supplier’s remote access system (i.e. no MFA). DHS said in 2018 that at least 200 vendors to the power industry had been penetrated by the Russians through their remote access systems, in attempts to penetrate US electric utilities through them.

3. Inadequate anti-phishing training and other anti-phishing measures, making the vendor a possible vector for attacks aimed at the utility. In early 2019, the Wall Street Journal published a great article on how the Russians were penetrating vendors to the power industry through phishing attacks, then utilizing that access to penetrate electric utilities; the article listed four utilities that had been penetrated this way.

Had the drafting team included this list in R1 (and I certainly never even thought to suggest it to them), this would have made it clear to utilities that the purpose of R1 was to allow them to decide the risks that were most important to them, given their configuration of assets and vendors. Being able to allocate your resources toward the risks that are most important is a key element of supply chain cyber risk management.

And there would have been another benefit: Including this list in the requirement itself would have made it auditable. That is, the auditors would have been able to determine whether or not the utility gave serious consideration to each of these areas of risk, based on the documentation they were shown. If the utility hadn’t even considered each of these areas, they would have been eligible for a PNC (potential non-compliance) finding in an audit.
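To show what “auditable” could mean in practice, here is a minimal Python sketch of the check an auditor (or the utility itself) might run against such a list. It is purely hypothetical: the area names are taken from the examples in this post, and `Consideration` and `undocumented_areas` are invented for illustration, not drawn from CIP-013 or any NERC audit procedure.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical risk areas, based on the examples listed above.
REQUIRED_AREAS = [
    "software supply chain",
    "supplier remote access",
    "supplier anti-phishing measures",
]


@dataclass
class Consideration:
    area: str
    included_in_plan: bool
    justification: Optional[str] = None  # needed when the area is excluded


def undocumented_areas(plan: list[Consideration]) -> list[str]:
    """Return areas with no entry at all, or excluded without a stated reason."""
    by_area = {c.area: c for c in plan}
    gaps = []
    for area in REQUIRED_AREAS:
        entry = by_area.get(area)
        if entry is None:
            gaps.append(area)  # never considered
        elif not entry.included_in_plan and not (entry.justification or "").strip():
            gaps.append(area)  # excluded, but no reason documented
    return gaps


plan = [
    Consideration("software supply chain", included_in_plan=True),
    Consideration(
        "supplier remote access",
        included_in_plan=False,
        justification="No vendor has remote access to in-scope systems.",
    ),
]
print(undocumented_areas(plan))
```

In this sketch an area passes if it is either in the plan or excluded with a documented reason; only an area that was never addressed, or was dropped without explanation, shows up as a gap.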

However, as far as I know, a substantial majority of NERC entities (who have medium or high impact BES Cyber Systems, since they are the only ones in scope for CIP-013 so far) did a good job of identifying and assessing supply chain cyber risks in R1, in spite of there not being a list of risks in the requirement itself. This is because organizations – especially the North American Transmission Forum (NATF) – stepped up to provide the guidance that the drafting team didn’t want to include in the requirement itself.

To get back to the question in the title, has CIP-013 worked? Yes, it has. It’s worked both as a supply chain cybersecurity standard, and as a cybersecurity risk management standard. It could have been better, but it could also have been a lot worse.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

[i] They were joined by CIP-010-2 (now -3) and CIP-003-7 (now -8), which are also risk-based requirements – although neither of them mentions risk, either.

Sunday, April 17, 2022
