Tuesday, May 16, 2017

What Systems Should be in Scope for CIP?

In my post yesterday, my second on WannaCry, I addressed the emergency patches made available on Friday for Windows XP, Vista and Server 2003 – all out-of-support operating systems. While I had already published a “Public Service Announcement” on the need to apply that patch on all systems at High and Medium impact assets in scope for CIP, an auditor had emailed me to point out that, for security rather than compliance reasons, the patch should be applied to all devices on the OT network that run one of the three old operating systems; this includes devices found in Low impact assets, Distribution substations and generating plants that are below the threshold for inclusion in the BES. The auditor’s reasoning was sound: even though these devices aren’t directly networked with Cyber Assets in scope for CIP (if they were, they’d at least be PCAs), if they become infected with WannaCry they will still pose a substantial risk to the BES.

Of course, many NERC entities will argue that they already have great defenses protecting their networks subject to CIP – those in High and Medium impact Control Centers, and in Medium impact substations and generating stations – from their networks that aren’t subject to CIP. And I’m sure this is almost always the case, due in large part to CIP, which does require thorough separation of ESPs from other networks. But this didn’t deter the auditor from advocating (coming as close to “requiring” as he could) that systems running the discontinued operating systems on non-ESP networks also be patched.

And the reason for this is simple: there is no such thing as a 100% effective security measure. For a threat as serious as the one posed by WannaCry, however small the chance that it could spread from, say, a distribution substation to a High impact Control Center, almost any security measure would be justified to prevent that from happening.

But if this is the case, why aren’t these other systems subject to CIP? If there’s even a small chance that they could be the vector for an attack like WannaCry that could lead to a serious event on the Bulk Electric System, shouldn’t there be at least some protections (e.g. patching, in the event of a serious threat like WannaCry) that would apply to them?

Or to use another attack as an example, the Ukraine attacks in December 2015 didn’t originate on the OT network; they started with phishing emails opened by people who had no connection at all to actual operations. Yet by opening these emails, these people inadvertently gave the attackers free rein of the IT network to search diligently for a way into the OT network – which they eventually found.

As I’ve said before, I do think IT assets need to be included in CIP in some way. I also believe that non-CIP OT assets (such as the ones discussed above with reference to patching) should also be included. More generally, I think that every cyber asset either owned or controlled by the NERC entity should be included in scope for CIP. But there are a few caveats to that:

  1. I certainly don’t want these new assets to be treated as BES Cyber Systems or Protected Cyber Assets. This would impose a huge burden on NERC entities, for a much-less-than-proportional benefit.
  2. The only way the new assets should be included is if CIP – and the enforcement regime that goes with it – is totally rewritten, along the lines of the six principles I discussed in this post.
  3. My fifth principle is “Mitigations need to apply to all assets and cyber assets in the entity’s control, although the degree of mitigation required will depend on the risk that misuse or loss of the particular asset or cyber asset poses to the process being protected.” In practice, I think there need to be at least two categories[i] of cyber assets in scope: direct and indirect impact. Direct impact cyber assets are those whose loss or misuse would immediately impact the BES; these are essentially BCS, but I would of course change the definition to fix some of the current problems. Indirect impact cyber assets are those that can never themselves directly impact the BES but can facilitate an attacker, as happened in the Ukraine (and as would have happened had any utilities been compromised by WannaCry – since their OT networks aren’t connected to the Internet, the initial infection would have been on the IT network). Essentially, all systems on the IT network, as well as systems at Low impact BES assets and at Distribution assets, would fall into this category.[ii]

As I said in my WannaCry post from Saturday, I’m now leaning more toward the idea of having a separate agency – probably within DHS – regulate cyber security of critical infrastructure. This includes the power industry, oil and gas pipelines, water systems, chemical plants, etc. I’m not suggesting this to punish NERC, but because I believe there would be a lot of advantages to having one regulator overseeing all of these industries, as opposed to a separate regulator for each one. For one thing, there would be a lot of synergies, since the similarities among critical infrastructure in these industries are much greater than the differences between them (for example, if you look at my six principles, you’ll see they don’t refer to power at all). For another, I think the power industry, which has had by far the most experience with cyber regulation, would be in a good position to share its lessons learned with the others.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.

[i] Note these categories don’t have anything to do with the High, Medium and Low impact categories in the current CIP v5/6/7 (and soon 8!). As I pointed out what seems like 50 times a few years ago when I was digging into the swamp known as CIP-002-5.1 R1 and Attachment 1, those are really not categories of BES Cyber Systems (even though they are identified as such in the requirement); they’re categories of BES assets (substations, etc.). I think I first pointed this out when FERC issued their NOPR saying they’d approve CIP v5 in April 2013 (see footnote vii as well as my response to the first comment at the end of the post).

[ii] I’m not ruling out the possibility that there might need to be other categories, or sub-categories of these two.

Monday, May 15, 2017

Follow-Up on WannaCry

I received a few interesting comments on my Saturday post on the WannaCry worm, which I would like to share with you.

First, an auditor wrote me regarding the Public Service Announcement from Lew Folkerth of RF that I included in this post. That announcement pointed out that there is now a patch for Windows XP, Vista and Server 2003. If you have Medium or High impact BES Cyber Systems and have BCS or PCAs running one of those OS’s, you are now on notice that a security patch is available for them (for the first time since support was discontinued, I believe). You’re required to install that patch per the schedule in CIP-007 R2. But you should really install it ASAP, not wait 35 days. This isn’t required by CIP, but it should be required by common sense – and by reading the news reports.

However, the auditor also wants me to point out that NERC entities that have any systems running one of the three discontinued OS’s on their OT networks – say, systems in Distribution substations or perhaps Low impact generating stations – should also quickly patch them. For one thing, you shouldn’t want Distribution outages any more than you want Transmission ones (even though the latter are the only kind that might involve CIP violations). But for another, even if your only concern were Transmission assets and you in theory have these wonderfully isolated from the Distribution and corporate networks, if you’re wrong and there is a connection you didn’t know about between your Distribution and Transmission networks, a WannaCry infection on the former could lead to real disaster on the latter – both for your utility and your NERC peers.

This observation points to an implication for the Big Picture in NERC CIP. And since I’m a Big Picture sort of guy, I’d like to elaborate on that. However, I’ll spare you this elaboration until my next post.

Second, a security manager for the High impact Control Centers of a large utility pointed out an interesting caveat regarding the section of the post entitled “Not a Public Service Announcement, but still Interesting”. This referred to the fact that there is a “kill switch” embedded in the code for the worm, which requires it to look for a certain domain name on the web; if it finds something at that domain, it deactivates itself. A then-unknown cyber security researcher found this domain name wasn’t registered, registered it, and pointed it at a live server. That act killed the worm and probably prevented a lot of further damage, especially in the US.

The security manager pointed out that at his Control Centers he has disabled DNS recursion and forwarding (I would imagine he’s not alone, when it comes to Control Centers with High or Medium BCS). Of course, this is normally a very good thing, since if a machine at the Control Center becomes infected with almost any other malware and starts trying to phone home to a command and control server, it won’t be able to get through.

However, this does mean that, assuming he doesn’t take any other precautions, if any of his machines within the Control Center do get infected with WannaCry, this security measure would in theory end up enabling the worm to run. The worm would try to resolve the domain name in question, but of course it would receive a message saying it can’t be found. That means the kill switch would be ineffective, and the worm would proceed on its merry way to try to infect all of the machines in the Control Center. Of course, that won’t happen, since his machines are fully patched against WannaCry and he has updated his antivirus DAT files to be sure to catch WannaCry if it somehow does get into the ESP. But it does show how you have to think these things through.
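To make that inverted logic concrete, here is a simplified sketch in Python. The real worm issued an HTTP request to the kill-switch domain rather than a bare DNS lookup, and the domain name below is invented; this just illustrates the check and why a no-recursion environment defeats it:

```python
import socket

def kill_switch_triggered(domain: str) -> bool:
    """Return True if the kill-switch domain can be reached.

    Note the inverted logic: a SUCCESSFUL lookup means the worm
    deactivates itself; a FAILED lookup means it keeps spreading.
    """
    try:
        socket.gethostbyname(domain)
        return True    # domain resolves -> worm shuts itself off
    except socket.gaierror:
        return False   # lookup fails -> kill switch never fires

# With DNS recursion disabled, every external lookup fails, so this
# always returns False and the worm would proceed:
# kill_switch_triggered("made-up-killswitch-domain.invalid")
```

In an environment where outbound DNS resolution is blocked, the lookup fails for registered and unregistered domains alike, so the kill switch is ineffective – exactly the caveat the security manager raised.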

Finally, this is a comment I received from myself, regarding the final section of the post titled “Also not a Public Service Announcement, but also still Interesting”. This regarded the nation-state whose security services are suspected of being behind the Shadow Brokers, the group that stole hacking tools from the NSA and dumped them online. That same nation-state was (by far) the biggest victim of WannaCry (at least as of Saturday). I want to point out that the Good Book, which isn’t normally my number one source of information on cyber security issues, has this one nailed: “…whatsoever a man soweth, that shall he also reap.”

However, that quotation also needs to be applied to another large country that was severely impacted, but with a one-day delay – the biggest impact in that country was today, Monday. The country runs a lot of pirated Windows software that of course isn’t receiving regular patches. As a result of that lack of patching, systems across that country booted up today and found their files had been encrypted.

But before I get on a high horse and start being smug about other countries bringing their troubles on themselves, I do want to point out that the Original Sin in all of this is the fact that a serious software vulnerability was discovered by a government agency in the US, but not reported to the vendor. If it had been reported, it could have been patched before the bad guys also discovered it. Instead, the agency used the vulnerability as the basis for a potent cyber weapon. That sounds like a great idea at first glance, but it assumes knowledge of the vulnerability will never leave your control. Unfortunately, here that knowledge did leave their control.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.

Saturday, May 13, 2017

Makes you WannaCry (and a Public Service Announcement)

Yesterday’s events were a real eye-opener to me. And I think they should be an eye-opener for anybody involved in critical infrastructure security. Here are some initial thoughts:

  1. This was an infrastructure event, not just a bunch of individual computers that fell prey to ransomware. Sure, reports say up to a billion dollars may need to be paid in ransom, but that isn’t what’s significant, IMHO. What is significant is the fact that at least one critical infrastructure, that of the National Health Service in the UK, was severely impacted[i]. If nobody lost their life because of this, it will be a miracle. But there were certainly a lot of people whose health will suffer in various ways due to their lack of access to care yesterday.
  2. As far as I know, all ransomware until yesterday has infected only individual machines (some were servers, of course, which impacts many users). And in all cases, what was affected was data. It was of course painful to pay the ransom, but that restored the data (in most cases), and there were few if any further direct effects. Even a successful ransomware attack on a US electric utility last year didn’t have any impact on operations.
  3. Compare this to yesterday’s events in the UK, in which surgeries and regular doctors’ appointments had to be cancelled, people were turned away from the ER, patient records and test results couldn’t be accessed, etc. Even though it wasn’t intended as such, this turned out to be an attack on the UK health care infrastructure. This is all due to the fact that WannaCry (and there have been some variants appearing as I write this) is a worm[ii] and a very fast-spreading one at that[iii].
  4. Now suppose that other critical infrastructure in the UK, such as the power grid, water systems, traffic systems, etc. had also been successfully attacked by WannaCry. If a lot of people had been sickened by impure water, or had traffic accidents when the stoplights in London suddenly went out, where would they have gone for treatment? And with the lights out and the Underground shut down, how would they have gotten there anyway?

So I’d say there are at least two major lessons from this, for the critical infrastructure community. First, an infrastructure attack doesn’t have to be deliberately caused – it can be a side effect of an attack with another purpose. Specifically, a worm-based ransomware attack can have a huge CI impact, even though it was never intended to do this.

Second, the need for coordination among critical infrastructures – both locally and nationally – is greater than ever. In fact, I’m beginning to think that it’s now becoming an unaffordable anachronism to have separate cyber regulatory structures for the Bulk Electric System, electric power distribution, natural gas pipelines, natural gas distribution, water treatment, health services[iv], etc. Maybe there should be a single organization – perhaps under DHS – that regulates cyber security of all critical infrastructures.

Public Service Announcement
Lew Folkerth of RF emailed me this afternoon to ask me to point out that there is now a security patch for Windows XP, Vista and Server 2003 (Microsoft released the patch yesterday). As Lew points out (and this applies to all NERC regions, not just RF), “This means there IS a patch source for those systems, and entities need to identify the source, assess the patch for applicability, and install the patch (or create/update a mitigation plan[v]).” Of course, this only applies to High or Medium impact systems running this software.

Not a Public Service Announcement, but still Interesting
You’ll notice the Binary Defense link I just provided thanks “MalwareTechBlog” for initiating the kill switch that shut the worm off. It points out that this move undoubtedly saved lives. I think the idea is that by shutting the malware off early (US time) on Friday morning, this move greatly inhibited its spreading here, since most workers weren’t in their offices yet and able to open the phishing emails that spread the worm.

But it turns out that the unnamed person behind MalwareTechBlog didn’t actually know he was killing it – you can read the story here. Of course, he still deserves lots of accolades (if he were willing to come forward) and perhaps a Presidential Medal of Freedom. But it just proves an adage I’ve repeated since I was a boy (20 years ago): “Rational planning is good, but in the end there’s no substitute for dumb luck.”

Also not a Public Service Announcement, but also still Interesting
The exploit that made WannaCry so effective was one that had been stolen from the NSA and dumped online by the Shadow Brokers group; this group has been linked to a certain country’s intelligence services. And guess which country – as of today, anyway – is listed as the number one victim of WannaCry? Hmmm…

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.

[i] Other infrastructure events included factories that had to be shut down, and multiple government bodies in Russia that had to curtail operations.

[ii] More specifically, it is delivered on a “worm delivery system” built on the EternalBlue exploit.

[iii] Although the all-time champ for speed of spreading has to be 2003’s SQL Slammer, which infected its 75,000 victims worldwide within ten minutes. In fact, I read somewhere that this figure was something like 85% of the potential victims (MS SQL systems that hadn’t received a recent patch) worldwide. Talk about efficiency!

[iv] When I speak about health services, I’m not talking about patient data privacy. Cyber regulations like HIPAA in the US are already addressing that. What they aren’t addressing now is the infrastructure required to keep the health system running smoothly. Of course, individual hospitals, doctors’ offices, ambulance services, etc. have a lot of incentive to protect the systems required for their individual operations. But I don’t believe there’s any organization – like NERC for electric power – that is specifically charged with regulating cyber security for the purpose of maintaining reliability of the health care system.

[v] And if you’re not sure what should be in a mitigation plan, see my previous post.

Friday, May 12, 2017

What is a Patch Mitigation Plan?

Recently, a NERC entity emailed me with a question about CIP-007 R2, patch management. Specifically, the question was whether the mitigation plan needs to do more than simply explain why the patch can’t be installed at the time, and state that it will be installed by a specific future date; it seems their auditor had informed them that wasn’t enough.

I knew the answer to this, but I reached out to an auditor for his opinion and I was glad I did – he had some very helpful suggestions. Here is his response in full:

“The requirement is to create (or update) a mitigation plan if the patch cannot be implemented within 35 days of it being determined to be applicable.  The Registered Entity is expected to document when and how the vulnerability will be addressed, and the expectation as expressed in the Measures is to specifically document the actions to be taken by the Responsible Entity to mitigate the vulnerabilities addressed by the security patch and a time frame for the completion of these mitigations.  Simply stating the patch will be installed sometime in the future is not an action that mitigates the vulnerability in the interim.

“The Registered Entity needs to understand what the vulnerability is and how it can be exploited in order to document what mitigating controls are in place to reduce the risk of exploit until the patch can be installed.  Often, but not always, the proper implementation of the CIP Requirements will mitigate the risk.  For example, if the vulnerability can be exploited across the network, tight firewall rules will likely be a mitigation as long as there is no requirement for broad access to the Cyber Asset that counteracts the control.  The Registered Entity might also update its anti-malware signature files more frequently and/or increase monitoring of the impacted Cyber Asset.

“But, if the exploit requires physical access to the Cyber Asset, asserting the device is behind a firewall is meaningless.  Rather, the mitigations would include physical access restrictions, possibly current or enhanced restrictions on the use of removable media; in other words mitigation steps that counter the exploit mechanism.  And, while not stated as an explicit requirement, the Registered Entity really needs to monitor the vulnerability until the patch is installed in case the exploit risk changes, possibly requiring additional protections.  That would be a good cyber security (best) practice.”
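To illustrate the auditor’s points, here is a sketch of what such a mitigation plan entry might record, expressed as a simple Python structure. Everything here – the asset names, dates, and specific controls – is invented for illustration; it is not a compliance template:

```python
# Hypothetical mitigation plan entry for a patch that can't be
# installed within 35 days of being assessed as applicable.
# All names and values below are invented examples.
mitigation_plan_entry = {
    "patch": "MS17-010 (SMBv1 remote code execution)",
    "applicable_cyber_assets": ["HIST-01", "HIST-02"],  # invented asset IDs
    "reason_deferred": "Vendor has not yet certified the patch "
                       "for the historian platform",
    "exploit_vector": "network (TCP/445)",  # this drives the choice of controls
    "interim_mitigations": [
        "Firewall rule limiting TCP/445 to the patch management server",
        "Daily (rather than weekly) anti-malware signature updates",
        "Increased monitoring of the impacted Cyber Assets",
    ],
    "patch_install_target": "2017-07-15",
    "monitoring_note": "Re-assess if the exploit risk changes before install",
}

# The auditor's key point: the interim mitigations must counter the
# actual exploit vector, not just promise a future install date.
```

Note that if the exploit vector were physical access rather than the network, the firewall rule above would be meaningless, and the interim mitigations would instead be physical access and removable-media controls – exactly as the auditor says.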

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.

Thursday, May 11, 2017

Webinar: Third-Party Cyber Risk Management

A very big concern today – in almost all industries - is third-party cyber risk. Of course, this often manifests itself in the form of vendor risk, which is why NERC is now finishing development on CIP-013 and related changes in two other CIP standards. Vendor cyber security can pose a risk both to the Bulk Electric System (which is of course why we will have CIP-013) and to your organization itself (a great example of that is the Target breach, which started because one of their suppliers had unneeded access to the actual production network).

On Tuesday, May 23 from 12:30 – 1:30 EDT, Deloitte and the law firm Morgan Lewis will present a webinar on Third Party Risk Management. This webinar will address:
  • The third-party risk landscape
  • How third parties exacerbate an organization’s cyber risk
  • The growing regulatory and legal importance of managing third-party cyber risk
  • The complexity and impacts of responding to a third-party cyber risk incident
  • Solutions for managing third-party cyber risk

To register, please go here.

I have said before that Deloitte’s Cyber Risk Services group is one of the largest, if not the largest, cyber security consulting organizations in the world, with over 3,000 US-based cyber consultants. However, we are part of a much larger organization, Deloitte Advisory, which advises organizations on dealing with many kinds of risk, including Financial, Regulatory, Legal, and Third-Party.

This webinar is a joint effort of the Third-Party and Cyber Risk groups. I hope you will find it gives you a perspective on the larger problem that CIP-013 is trying to address. Feel free to forward this post to anyone in your legal, risk management, supply chain or other departments who you think would be interested in attending.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.

Wednesday, May 10, 2017

The News from RF, Part III: What Happens on 9/1/18?

My last post was the first of at least three or four dealing with interesting things I learned from the presentation by Felek Abbas of NERC and Lew Folkerth of RF at RF’s CIP compliance workshop last month in Baltimore (the download includes all of the presentations at the workshop; Felek and Lew’s presentation starts on slide 19).

Felek’s part of the presentation started with a discussion of compliance dates for the Low impact requirements in version 6. Since those should be well known to you already, I won’t discuss them now. What I found interesting was slide 29, which says in part:

  • CIP-003-7 was filed with FERC on March 3, 2017
  • However, CIP-003-7 is very unlikely to come into effect before September 1, 2018. You will need to comply with the CIP-003-6 version of these requirements beginning September 1, 2018, until the effective date of CIP-003-7.

This was interesting because, in the whole LERC discussion last year, it had never even occurred to me that CIP-003-7 wouldn’t be in place by 9/1/18; I never thought there would be a serious possibility that entities would have to comply with version 6 of this standard, and then version 7 (as I’ll discuss below, having to do both doesn’t change what you have to do to comply, but it certainly does change the language you need to use to document compliance). However, this was probably because I hadn’t bothered to read the implementation plan that got passed with CIP-003-7(i) this year.

When I read the plan after hearing Felek’s discussion, I realized that the words “very unlikely” on the slide should have been replaced with “mathematically impossible”. This is because the plan says “…Reliability Standard CIP-003- 7(i) shall become effective on the first day of the first calendar quarter that is eighteen (18) calendar months after the effective date of the applicable governmental authority’s order approving the standard…”

So let’s do the math. CIP-003-7(i) was filed with FERC on March 3. If FERC had approved it that same day (which I doubt has ever happened for a NERC standard, or almost anything else), the effective date would have been October 1, 2018 – not September 1. Of course, I doubt this would have been a big deal, since the Regional Entities wouldn’t have issued any PNCs (the successor of PVs) for turning in v7 documentation during September 2018. And even if FERC had taken just 3–6 months, I think the Regions would still have followed the same approach.
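Under the natural reading of the implementation-plan language quoted above (the first day of the first calendar quarter that falls at least eighteen calendar months after approval), the arithmetic can be sketched as follows. This is just a Python illustration of the calculation, not anything official:

```python
from datetime import date

def cip_effective_date(approval: date) -> date:
    """First day of the first calendar quarter that falls at least
    18 calendar months after the approval date."""
    # Add 18 calendar months to the approval date.
    months = approval.month - 1 + 18
    year, month = approval.year + months // 12, months % 12 + 1
    # If that lands exactly on a quarter boundary, that quarter counts...
    if month in (1, 4, 7, 10) and approval.day == 1:
        return date(year, month, 1)
    # ...otherwise advance to the start of the next calendar quarter.
    next_q = ((month - 1) // 3 + 1) * 3 + 1
    if next_q > 12:
        return date(year + 1, 1, 1)
    return date(year, next_q, 1)

# A same-day approval on the March 3, 2017 filing date would have yielded
# October 1, 2018 - one month after the 9/1/18 CIP-003-6 compliance date:
print(cip_effective_date(date(2017, 3, 3)))  # 2018-10-01
```

March 3, 2017 plus 18 months is September 3, 2018, which is past the start of Q3, so the effective date rolls forward to October 1, 2018 – hence "mathematically impossible" to beat the September 1, 2018 v6 date.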

However, unless you’ve been living in a cave in the Himalayas for the past year, you have probably heard that there is a new administration in Washington and they have been very slow to make high-level appointments to almost any Federal agency. In FERC’s case, this situation was made worse by the fact that a key resignation in January left the Commission with only two Commissioners (out of a normal five), meaning they don’t have a quorum to conduct business.

And since one of the remaining commissioners has announced her intention to resign, two more new Commissioners need to be appointed and confirmed by the Senate, then get comfortable in their new jobs, before there is any chance at all that CIP-003-7(i) will be approved. So it’s almost certain now that it will be summer 2019 at the earliest before the new standard comes into effect, and that there will be a period of at least nine months during which entities will have to comply with CIP-003-6 (and when I’m talking about CIP-003 v6 or v7, I’m specifically talking about Section 3.1 of Attachment 1 of CIP-003. The single sentence in this section is the only substantial change between the two versions).

However, as I implied above, this shouldn’t require entities to implement procedures or technologies to comply with CIP-003-6, then rip them out when CIP-003-7 finally comes into play. As I discussed in this post last November, almost everything – with one exception that I’ll discuss below – that you could do to comply with Section 3.1 of Attachment 1 of CIP-003-6 will still work under CIP-003-7.

The one exception to this statement is if you had been counting on the presence of a routable-to-serial protocol conversion within, say, a substation as your reason for not applying any further protections to the routable connection to that substation. I believe this was the one case that FERC had in mind when they ordered NERC to eliminate the word “direct” from the LERC definition, when they approved CIP v6 in Order 822. So don’t plan on being found compliant if you do this.

Given that practically everything else you can do to comply with CIP-003-6 will work for CIP-003-7 as well, this means that the only difference will be your documentation. But you definitely can’t use the same words to describe the same setup in both versions. For one thing, you need to lose the word LERC. It sleeps with the fishes. If you’re looking for clues on how to do the v7 documentation, you might look at the post from last November, referenced two paragraphs above.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.

Friday, May 5, 2017

The News from RF, Part II: As Usual, Lew Hits the Nail on the Head

I have already said that Reliability First’s CIP workshop two weeks ago was the best regional CIP meeting I have attended. Probably the highlight of the meeting for me was the joint presentation by Lew Folkerth of RF and Felek Abbas of NERC (you can find their slides here, in the single file that includes all of the day’s presentations. Their slides begin at slide 19). I’ll have at least a few more posts on points that were made by Lew or Felek.

Lew addressed a number of interesting topics, including the RSAW for the new CIP-003-7 standard; of course, the standard itself is awaiting FERC approval. One of his points was a real lightbulb moment for me, which I’d like to share here. On slide 68 in the second section, Lew listed what the new RSAW says regarding auditing Attachment 1, Section 5 of CIP-003-7 (this is the new requirement that addresses Transient Cyber Assets used at Low impact assets): “For Transient Cyber Assets managed by the Responsible Entity in an ongoing manner, verify that the Transient Cyber Assets have an effective means of mitigating the risk of the introduction of malicious code onto the Transient Cyber Asset.”

Lew emphasized the word “effective”, then pointed out that he thought this is really the key to auditing non-prescriptive, results-based requirements (although I prefer the term “objectives-based[i]”), such as this one. That is, since this type of requirement only specifies an objective that needs to be met, not the method to achieve it, there has to be some criterion that the auditor uses to determine what is an acceptable method and what is not.

For example, in CIP-007 R3 (another objectives-based requirement), the entity is required to achieve the objective of mitigating the threat posed by malware to BCS. Suppose an SME at an entity told the auditor that, based on the advice of his brother-in-law, his method of mitigating the malware threat to one or more BCS is to say a certain chant every morning at 7 AM. I think the auditor would be justified in finding the entity in violation – not just issuing an Area of Concern, as might be the case if the entity had chosen IDS signatures over anti-virus or application whitelisting methodologies. In the latter case, the auditor might issue an Area of Concern and ask the entity to either justify this decision or implement a different solution. IDS signatures are a plausible methodology for effectively mitigating the malware threat, whereas chants are not (and please don’t send me emails arguing that chants are just as effective as IDS signatures! I pride myself on having a fairly open mind, but I do have my limits).

Lew wrote an article about auditing non-prescriptive CIP requirements for the January/February RF newsletter, and I wrote about that article in my own post. I just checked to see how the use of “effective” as a criterion fits into what he said in that article. He lists four components of a good evidence package for the requirement he wrote about in that article, CIP-010-2 R4 (of course, another non-prescriptive requirement). The third component is that the plan must show “how methods documented in the plan achieve the objectives” (the “plan” Lew refers to is the one required by R4. You could say that the plan is the same thing as the objective of this requirement).

Of course, the word “effective” isn’t in here, but I would argue that “methods that achieve the objective” is the same thing as saying “effective methods for achieving the objective”. So I call this a match (not that I would hold it against Lew if his thinking had evolved since he wrote the article. My thinking is always evolving - to put it kindly - and my unofficial motto is “Often wrong, but never in doubt!”).

To sum up this post, I think that the word “effective” (or an equivalent word or phrase) should be understood (and if possible, explicitly stated) in every non-prescriptive, objective-based requirement. This will effectively (I couldn’t help that one. Sorry) indicate that the entity must not just utilize one or more methods to achieve the objective, but that the chosen methods must be effective. Of course, none of the current non-prescriptive CIP requirements (such as CIP-010 R4 and CIP-007 R3) uses this word, but I imagine the RSAWs effectively (OK, I did it again!) remediate that omission. In any case, you should always understand that this word is at least implicitly in place.

As a postscript, I want to point out that one questioner at the RF CIP workshop implied to Lew that the use of the word “effective” would increase use of “auditor discretion”, and thus was a bad thing. I can’t remember Lew’s answer, but I know my answer – if I were in Lew’s place - would be: “The fact that this requirement is non-prescriptive means auditor discretion will definitely be required, whether or not the word ‘effective’ (or its equivalent) is present in the requirement – and the decision to make the requirement non-prescriptive was made by the Standards Drafting Team, not me. However, as I discussed in this post, auditor discretion is already required to audit most of the current CIP v5 and v6 requirements – both prescriptive and non-prescriptive - due to the presence of many ambiguities and missing definitions. The auditor is expected (perhaps with assistance from the Regional Entity) to use whatever training they have in legal logic to audit in spite of these flaws.

“The difference with non-prescriptive requirements is that the auditor is required to use discretion regarding matters of cyber security, including making judgments about whether the entity has used an effective methodology for addressing a particular requirement. Since the auditors are chosen for their posts in part because of cyber security expertise, not legal training, I think it is much preferable to have them exercising judgment in cyber issues, rather than legal/logical ones. But in any case, non-prescriptive requirements are clearly here to stay. No CIP drafting team has drafted prescriptive requirements since v5; I predict that no more will be drafted, regardless of what happens with the current prescriptive requirements[ii] and the current compliance regime.”

Note: When I showed a draft of this post to Lew, he commented that he wasn’t sure what his answer was, but he should have referred to GAGAS, which requires use of professional judgment when performing audits. So his answer would be something like “Use of professional judgment isn’t the exception in auditing, but the rule. Even the ‘Bible’ of our profession requires that we exercise professional judgment, since no requirement ever perfectly addresses every possible case you may throw at it.” Amen.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.

[i] After reading a draft of this post, Lew commented that he prefers this term as well.

[ii] I used to view the current CIP v5 and v6 requirements as being almost entirely prescriptive, except for a few notable exceptions like CIP-007 R3 and CIP-010 R4. I now think that the majority of the current requirements and requirement parts are non-prescriptive, perhaps the great majority. I hope to sit down in the not-too-distant future and determine whether each requirement and/or requirement part is prescriptive or not. However, in my opinion there are a few very prescriptive requirements – including CIP-007 R2 and CIP-010 R1 – that require NERC entities to devote inordinate amounts of resources to them, way out of proportion to whatever benefits they provide. 

Monday, May 1, 2017

A Clarification to my Last Post

In my last post, I concluded with this paragraph: “And this, folks, is the canary in the coal mine, which has just keeled over dead: It’s clear (to me, anyway) that no new compliance area (like supply chain or virtualization) can be incorporated into CIP unless this is done with non-prescriptive requirements. Indeed, no new prescriptive CIP requirements have been proposed since 2012, when CIP v5 was approved by the NERC membership. Yet it is also now clear to me that no new non-prescriptive standards (or requirements) will ever be freely accepted by the NERC membership until the CIP compliance regime itself has changed. I will be watching to see what happens.”

To summarize the argument in this paragraph:

  1. New “compliance areas” won’t be incorporated into CIP unless the requirements implementing them are non-prescriptive.[i] The two examples of new compliance areas that I used are supply chain security and virtualization.
  2. But (as discussed in the last post) no new non-prescriptive requirements will be “freely accepted” by the NERC membership until the overall NERC CIP compliance regime has been changed. Of course, I used the phrase “freely accepted” because I was arguing in that post that, even though CIP-013 was very likely to be approved by NERC, it was going to be approved either without a passing vote by the NERC ballot body (i.e. through use of the Section 321 process) or as a result of a substantial effort by an important industry trade group to “persuade” their members to vote yes on the next ballot, despite what are likely to be substantial misgivings.
  3. Therefore (by implication, although I didn’t explicitly state it), only a compliance regime change will allow CIP to be expanded to address these and other new compliance areas.

While I still agree with my conclusion in point 3, I do have two misgivings about point 2:

First, it is still possible for new non-prescriptive requirements to be freely accepted by the NERC ballot body. The changes in CIP-003 (the “LERC” requirement and the new requirement for Transient Cyber Assets used at Low assets) that were recently approved by both the NERC ballot body and the NERC Board were freely approved, although I think the fact that a FERC deadline was looming for the LERC requirement helped get that one passed. I was frankly surprised, given the negative comments by many entities on the second ballot, that this requirement nevertheless received the required super-majority.  

However, I now don’t want to go so far as to say that a new area like supply chain can never be implemented through exclusively non-prescriptive requirements – i.e. that the non-prescriptive requirements required to incorporate a new compliance area (such as supply chain security) into CIP will never be freely accepted by the NERC membership. I believe CIP-013 would eventually have been balloted, modified and re-balloted enough times that it would have passed without any external intervention; it is just that FERC’s very aggressive one-year deadline for developing the standard precluded this from happening.

But virtualization is an example of an area that can only be addressed through modifications of a number of existing prescriptive requirements, as well as new requirements and definitions. I see no end to the balloting that will be required to effect all of the changes that might be required to implement virtualization, to the extent that I am close to certain the current CIP Modifications SDT will never complete the mandate in their SAR to incorporate virtualization into CIP.

Second, it may seem that I am implying in the second point above that a new standard or requirement is “legitimate” only if it is “freely accepted”. I agree that, given the standards approval process described in the NERC Rules of Procedure, the only “normal” process for approving a new standard is one in which it receives the required super-majority of votes; this implies that standards not approved by a super-majority in a ballot are somehow less legitimate than others. However, while I am a great fan of democracy in general, I’m not saying that this is the only way to develop mandatory cyber security requirements. In my opinion, it would be quite legitimate if, for example, a group composed of stakeholders (mainly NERC entities) were to both draft and approve new CIP standards; these would then be submitted to FERC for final approval. Of course, this would require a change to the NERC Rules of Procedure.

What I was actually trying to say by using the phrase “freely accepted” was that, if there isn’t broad consensus among NERC members that a particular standard is a legitimate one worthy of their attention, the entire compliance process – which requires a huge amount of cooperation by the entities themselves – will be thrown into chaos. It will be very hard to convince the NERC membership that they need to comply with a standard if a majority of members believe complying with this standard will simply be a waste of time and money with no benefit to the reliability or stability of the BES. As long as NERC is charged with developing new or revised standards for cyber security, any standard that is approved in such a case will almost inevitably end up being changed or withdrawn before it ever reaches FERC for approval. There is going to have to be substantial input by the entities into the final product, whether or not it is expressed in an actual ballot.

However, implementing a new area of compliance like virtualization (which is, of course, on the docket for the current CIP Modifications drafting team) will require a lot of changes to existing prescriptive requirements, as well as some new requirements, which presumably could be non-prescriptive. To push beyond the points I made in this post, I believe just getting the needed non-prescriptive requirements approved will take literally years (I’m guessing there might be at least five or six new non-prescriptive requirements required to incorporate virtualization into CIP, each of which will take at least six months of the SDT’s full attention[ii]).

But modifying the many existing prescriptive requirements that will need to be changed to accommodate virtualization will take far longer, both because there will be more of them and because there are likely to be epic battles over the wording of the changes; in prescriptive requirements, the wording is everything. This is the main reason why I believe the drafting team will never complete their virtualization mandate.

The same reasoning would apply to any future effort to allow entities to put BCS in the cloud (as opposed to putting BCS Information in the cloud, which is already permitted by the existing CIP requirements), although such an effort is not even being contemplated at this point. Besides probably some new requirements, this effort would require substantial changes to almost all of the existing CIP requirements, prescriptive and non-prescriptive. I think trying to make these changes in the existing set of CIP requirements would require something like a generation of effort, so SDT members would need to be recruited fresh out of college with the somewhat slim hope that they will finally accomplish their goal before they retire. Not exactly a formula for success, IMHO.

To summarize, in this post I tried to clarify the conclusion of my last post (not sure I succeeded, though). Instead of implying (as I did in that post) that there can never be any expansion of CIP to address new areas of compliance, I want to say that there can be expansion that doesn’t require changes to existing CIP requirements (as is the case for CIP-013. Even though the new draft will include changes in several existing CIP requirements as well as a revised CIP-013, the wording of the existing requirements themselves wasn’t modified, just added to). But for areas like virtualization and BCS in the cloud that do require changes to existing requirements, I believe that expanding CIP to cover these areas will never be possible until all of the CIP standards are rewritten from scratch, and incorporated into a new compliance regime similar to the one I outlined in my last post.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.

[i] In fact, as I will point out in my next post, no new prescriptive CIP requirements at all have been drafted since CIP v5. This is not by chance: Both NERC and FERC now realize that non-prescriptive requirements are the only way to go when it comes to cyber security.

[ii] I’m basing this estimate on the fact that the LERC requirement – which was non-prescriptive – took more than six full months of the SDT’s time, by my informal observation. And this requirement was a modification of an existing prescriptive requirement. The SDT made it non-prescriptive and thus allowed entities more options for complying with it, without removing any options they previously had, yet it was still greeted with a lot of concerns and even outright hostility. So my guess is that getting a brand new non-prescriptive requirement approved would take significantly more than six months of the SDT’s time – which is why I think my estimate of six months for each new requirement is quite conservative.

Thursday, April 27, 2017

Why the Canary Died

In early January, I wrote a post comparing the two CIP Standards Drafting Teams currently operating: the CIP Modifications SDT and the Supply Chain Security SDT. In that post, I made the argument that one team (the Supply Chain team) is likely to achieve their goals fairly easily and on time, while the other team isn’t likely ever to fully achieve their goals.

It is now time to update that observation; I have good news and bad news. The good news is that I no longer see such a disparity between the two teams. The bad news is that I am now quite sure neither team will fully achieve its goals.

In fact, for the Supply Chain team, it is just about certain they won’t achieve their goal, which was to draft and get approved – by the NERC ballot body, the NERC Board of Trustees, and finally by FERC – probably the first non-defense mandatory supply chain security standard in the history of the universe.[i]

As I explained in a recent post, it is now very possible that CIP-013, the draft Supply Chain Security standard (along with related changes in three other CIP standards), will never be approved by a NERC ballot, and that NERC will for the first time have to invoke the Section 321 process in order to have a CIP-013 standard ready for FERC’s approval in September[ii] (of course, it is almost 100% certain that FERC won’t have a quorum in September, since they don’t have one now and no new Commissioners have even been nominated yet. This means the standard will probably just sit on their desk for many months, if not for a year or more, once it does get delivered to FERC. But that’s another story). How did this situation come to pass?

For the Supply Chain SDT, the post just linked will give you a good summary of why they likely won’t achieve their goal. So the question is: What has changed since January, when I was very optimistic that CIP-013 would quickly cruise to approval (at least on the second ballot. Approval on the first ballot rarely happens in NERC, and has never happened for any of the CIP standards)?

I admit that when I wrote the January post – which was before the first draft of CIP-013 was even posted for comment, let alone balloted – I assumed that everyone would love a non-prescriptive standard. After all, non-prescriptive, objective-based requirements allow the entity to determine for itself what is the best means of achieving the stated objective. Why wouldn’t NERC entities want that?

The two examples I use most often to illustrate the difference between prescriptive and objective-based requirements are CIP-007 R2 and CIP-007 R3. R2 (patch management) is very prescriptive; I know that many entities are spending large amounts of resources in trying to comply with it (and even then they aren’t able to be 100% compliant, as this post shows). On the other hand, R3 (malware protection) is very non-prescriptive, simply saying that the entity should “Deploy method(s) to deter, detect, or prevent malicious code” (R3.1) and “Mitigate the threat of detected malicious code” (R3.2).[iii]

There has been a large amount of grumbling about R2 – and as I’ve said in another post, one entity told me that half of all the compliance documentation they’re producing in their control centers is tied to that one requirement – vs. almost no grumbling (that I’ve heard of) about R3. This is especially true given that R3 replaced the anti-malware requirement in CIP v3, which required entities to take a Technical Feasibility Exception for every switch, router, PLC etc. that wouldn’t take antivirus software. Given that NERC entities seem to prefer non-prescriptive requirements once they’ve started using them, I just assumed people would be quite happy with the first draft of CIP-013, where all of the requirements were much more like CIP-007 R3 than CIP-007 R2.

However, the first draft of CIP-013 went down to ignominious defeat, garnering a whopping 10% positive vote. While the comments revealed many reasons for the overwhelmingly negative vote, one important theme was that many CIP compliance professionals at NERC entities don’t trust the auditors to be able to audit non-prescriptive requirements; they want the “safety” of requirements for which there can be no question how they will be audited – i.e. prescriptive requirements. This is usually expressed with a statement to the effect that the auditors shouldn’t be allowed to exercise “discretion”.

I wrote about this phenomenon in another recent post, where I pointed out that CIP auditors are already required to use a lot of discretion in their CIP audits, especially since v5 became enforceable. In effect, I said (although I didn’t explicitly make this point) that it would require absolutely perfectly written requirements (and definitions) for the auditors not to need to use any discretion at all. Furthermore, I said the auditors currently have to use discretion mostly about things they’re not trained on, such as how to interpret the meaning of ambiguous or missing terms in quasi-legal definitions. Wouldn’t it be better to make the requirements non-prescriptive, removing the need for auditors to make these kinds of judgments and instead requiring them to exercise judgment in an area they do understand – namely cyber security?

To be honest, that post essentially said to the people who are worried about auditor discretion “You’re not being rational. Get over it.” I now realize that it would have been much more productive to assume these people are rational (which they of course are, especially considering experiences some entities had with auditors under CIP v3) and ask what could be a rational basis for their fears.

After much thought, I’ve decided the answer to this question is that, even if all of the CIP requirements were made non-prescriptive (objectives-based) tomorrow, the overall compliance regime for CIP – and for the other NERC standards – remains very prescriptive. So even though all the requirements in a new standard like CIP-013 are non-prescriptive, the fact that the overall compliance regime remains the same old prescriptive one will almost inevitably lead to fears – grounded or not – that the auditors will have a tendency to fall back to their old ways and audit non-prescriptive standards as if they were prescriptive ones. There needs to be a different compliance regime.

Before I describe the new compliance regime I’m proposing, let me describe how I view the current NERC compliance regime:

  1. NERC standards are designed to protect the Bulk Electric System from cascading outages caused by errors of omission or commission on the part of electric utilities and other participants in the BES.
  2. The requirements in those standards dictate certain actions which are either required or prohibited.
  3. If an entity is found to have violated one of those requirements – that is, they have either not performed a required act or performed a forbidden act – they are subject to fines and need to implement controls designed to prevent recurrence of the violation.[iv]

I want to say now that I am not criticizing this regime, which has been in place since NERC was founded in 1968.[v] In fact, I’ll certainly stipulate that it’s the only compliance regime that could successfully accomplish NERC’s original purpose. The problem is that cyber security is very different from the reliability issues addressed by the original NERC standards, as well as by what are called the NERC 693 standards today. No matter how many times NERC officials may swear on a stack of Bibles that the CIP standards are reliability standards just like all the others, that doesn’t change the fact: cyber security is very different from protection against cascading BES outages. The compliance regime for CIP can and should be different from the regime for the other NERC standards.[vi]

What should this new regime look like? In my last post, I listed six principles that any regime of mandatory cyber standards for critical infrastructure should follow. Since that post was nominally about regulation of natural gas pipelines, I will repeat them here:

  1. The process being protected needs to be clearly defined (the Bulk Electric System, the interstate gas transmission system, a safe water supply for City X, etc).
  2. The compliance regime must be threat-based, meaning there needs to be a list of threats that each entity should address (or else demonstrate why a particular threat doesn’t apply to it).
  3. The list of threats needs to be regularly updated, as new threats emerge and perhaps some old ones become less important.
  4. While no particular steps will be prescribed to mitigate any threat, the entity will need to be able to show that what they have done to mitigate each threat has been effective[vii].
  5. Mitigations need to apply to all assets and cyber assets in the entity’s control, although the degree of mitigation required will depend on the risk that misuse or loss of the particular asset or cyber asset poses to the process being protected.
  6. It should be up to the entity to determine how it will prioritize its expenditures (expenditures of money and of time) on these threats, although it will need to document how it determined its prioritization.

I hope you can see why the fact that the three principles of the current NERC regime still apply to enforcement of all CIP standards, including CIP-013, means that NERC entities are rightly going to be suspicious of any standard, no matter how non-prescriptive its wording, that will be enforced under that regime (and if you can’t, well, it’s late at night now and I’m tired from writing this post – almost as tired as you are from reading it! I’m sure I’ll come back to this topic in future posts, and at least in an upcoming book, which – although I have yet to put a single word on paper for it – has already been at least half written in posts like this one).

And this, folks, is the canary in the coal mine, which has just keeled over dead: It’s clear (to me, anyway) that no new compliance area (like supply chain or virtualization) can be incorporated into CIP unless this is done with non-prescriptive requirements. Yet it is also now clear to me that no new non-prescriptive standards (or requirements) will ever be freely accepted by the NERC membership until the CIP compliance regime itself has changed. I will be watching to see what happens.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.

[i] Well, at least of the solar system. With so many possibly habitable planets outside of the solar system, it’s almost inevitable that on at least one of them such a standard has already been developed, although I will guess that on at least one other planet the effort to develop such a standard has collapsed in confusion and recrimination. And we’re not even talking about the uncountable number of alternative universes, where it’s almost inevitable there is not only at least one planet with a continent called North America and an organization called the North American Electric Reliability Corporation, but further that this organization has a CEO named Gerry Cauley. Such is the power of having an infinite number of instances, with no possible means of verification for your assertions.

[ii] As the post just referenced points out, there is the possibility that a large push by a major industry trade organization will put the next (and last) ballot over the top, despite the fact that so many of the yes votes will be by entities that strongly opposed the first draft and still have substantial misgivings about the second. Since this would technically count as a ballot victory for CIP-013, some might say that this constituted the SDT’s fulfillment of their mission. But I think the mission of the SDT is to develop a standard that will win the outright approval of 68% of the ballot body, not a grudging approval based on the idea of “Well, it’s either this or have the Standards Committee write the standard, and go against the strong wishes of our trade organization and our CEO”.

[iii] There is also a third requirement part (R3.3) that states the entity should “For those methods identified in Part 3.1 that use signatures or patterns, have a process for the update of the signatures or patterns.” Even this part is non-prescriptive, of course, since it also simply states an objective and requires the entity to have a process to address the objective.

[iv] You may wonder where Risk-Based CMEP (formerly known as the Reliability Assurance initiative) fits in with the three principles of NERC compliance I’ve just listed. I haven’t thought about this too closely yet, but I will point out that the risks addressed by RAI are compliance risks – i.e. the risk that an entity will violate a NERC requirement. You can think of RAI as kind of an overlay to the three principles of the NERC compliance regime; it doesn’t modify them, but it’s designed to lower risk of violations, and perhaps of cascading outages caused by a NERC entity’s doing or not doing something. It doesn't at all change the prescriptive nature of the current NERC CIP compliance regime.

[v] Of course, NERC standards weren’t mandatory until 2007, but I believe the principles of compliance were the same before and after that date.

[vi] The question may now occur to you: Is Tom advocating that the CIP standards be removed from NERC’s purview? I’m not necessarily saying that, although I reserve the right to change my mind on that point in the future. It seems to me that NERC has already put in place much of what is needed to have a successful program for regulating the cyber security of the BES, not least a large team of auditors – some of whom are very competent – who understand both the cyber and electrical issues involved in auditing compliance with mandatory cyber regulation of the BES. What isn’t in place now is a compliance regime very different from the current CMEP – one described by the six principles listed in this post. I believe that technically speaking this regime could be built within the NERC organizational framework – you might have to have two CMEPs, for example. However, I’m not sure that politically this will be possible within the NERC organization itself; there are few organizations flexible enough to undergo such a wrenching change. I certainly hope NERC proves to be the exception to this rule, since I believe there is no alternative to implementing the changes to the CIP compliance regime that I’m proposing. But if NERC can’t do it, some other organization will have to.

[vii] My eyes were opened at the RF CIP workshop this past week, when Lew Folkerth pointed out that the key to being able to audit non-prescriptive requirements is for the entity to have to demonstrate that the measures they took were effective. I will do a post on this soon.