Tom Alrich's Blog: July 2015

Tuesday, July 28, 2015

FERC’s New NOPR, Part II

This is the second half of my post on FERC’s new NOPR, which was released on July 16, 2015. You can find the first half here. I recommend you read the first post before you read this. Besides repeating the not-so-surprising news that FERC intends to approve the v6 standards, the first post discussed FERC’s request for comments on their proposal (not yet a directive) that CIP be revised to provide protection for communications between all control centers in the Bulk Electric System. This post discusses the remaining issues that FERC is considering for new or revised standards. FERC is asking for comments on all of these topics.

Protections for Remote Access

FERC says they wish to have comments on “the value achieved if the CIP standards were to require the incorporation of additional network segmentation controls, connection monitoring, and session termination controls behind responsible entity intermediate systems.” They base this on statements made by two of the speakers at the April Technical Conference that suggested the need for further protections.[i] I didn’t realize that Intermediate Systems and Interactive Remote Access were so poorly protected with the current wording, but I’ll defer to others who know more than I do about this.

A New Supply Chain Security Standard?

FERC surprised me – and I suspect a lot of others in the industry – with their seemingly out-of-the-blue suggestion (Section E, starting on page 37) that there should be a standard for security of the supply chain. I don’t believe this means having armed guards accompanying trucks delivering transformers, but rather – among other things – protection against the introduction of malicious code at the “factory” level; that is, introducing malware into control system software and firmware.

I’ve certainly known for a while that some people were quite concerned about the possibility that – say – all switches sold by vendor X would have embedded malware that could be “woken up” with the touch of a button by some evil genius in a secure mountain fortress overseas, and that this would lead to a nationwide power outage. But I certainly hadn’t heard a lot of buzz from Congress about this danger, and since the FERC Commissioners and staff always seem to be paying close attention to what is being said in Congress, it was surprising – indeed, refreshing – to see this concern come out of the blue. In fact, FERC freely admits that Order 791 never mentioned supply chain risks at all, so this issue is completely different from all the others discussed in the NOPR, which otherwise all deal with concerns raised in 791.

What surprises me even more is that it’s not at all clear how FERC and NERC – which of course have no direct authority over hardware and software vendors – can write and implement a standard that will result in big improvements in this area. It will obviously have to be done indirectly, by, say, requiring that new vendor contracts contain certain provisions. And that makes me wonder about any antitrust implications of this effort.

In any case, I do agree with FERC that this is an important concern, and if it can be sufficiently addressed through a standard that applies to a group of customers, not the vendors themselves, then it’s certainly worth examining in more depth. Of course, this concern definitely needed to be addressed in a NOPR, not an Order. However, I believe a separate NOPR would have been better, given that this issue is so divorced from the others discussed in this document. I really doubt FERC will be able to decide what it will do on this issue for quite a while; I certainly hope they don’t hold up addressing all of the other issues in the NOPR until this one can be addressed.[ii]

LERC

This delightful acronym (EnergySec describes it as “Seuss-ian” in a recent newsletter), of course, stands for Low impact External Routable Connectivity. This concept is used (in CIP-003-6 Attachment 1) as a qualifier for how much certain Low impact assets have to be protected, analogously to how External Routable Connectivity serves as a qualifier for some Medium impact BES Cyber Systems. I believe the v6 SDT invented a different phrase because they were worried that, by using ERC in qualifying the Low assets, they might end up having the auditors treat individual cyber assets at a Low asset differently depending on whether or not they had ERC – and that would then lead to being audited at the cyber asset level, which supposedly is strictly verboten for Low assets. But guess what, SDT? It seems the new term may not have prevented that from happening after all…. However, I’m getting ahead of myself.

FERC discusses LERC in paragraphs 68-70 on pages 43 and 44. They have an issue with the first part of the definition: “Direct user-initiated interactive access or a direct device-to-device connection to a low impact BES Cyber System(s) from a Cyber Asset outside the asset containing those low impact BES Cyber System(s) via a bidirectional routable protocol connection.”

FERC’s issue is not really with the definition itself, but with the fact that they think the word “Direct” isn’t being properly interpreted by NERC. They point in particular to Reference Model 6 on page 36 of CIP-003-6, which purports to show a situation where there is no LERC. In the diagram, an outside system connects routably to a “Cyber Asset” (not a BCS) that is itself non-routably (serially) connected to a BCS.

At first glance, this doesn’t differ significantly from Reference Model 4, in which an “IP/Serial Converter” is in the same logical position as the Cyber Asset in Model 6. However, in Model 4 it is stated that there is LERC, while in Model 6 it is stated that there isn’t. Specifically, the description for Model 6 says “There is a Layer 7 application layer break or the Cyber Asset requires authentication and then establishes a new connection to the Low impact BES Cyber System.” This, in the SDT’s opinion, separates LERC from non-LERC.

FERC states in paragraph 70, “...we seek comment on the implementation of the ‘layer 7 application layer break’ contained in certain reference diagrams in the Guidelines and Technical Basis section of proposed Reliability Standard CIP-003-6. It appears that guidance provided in the Guidelines and Technical Basis section of the proposed standard may conflict with the plain reading of the term ‘direct.’”

To summarize the above, it appears to me (and FERC could have been a little clearer, in my opinion) that FERC wants NERC to provide explicit guidance on what “application layer break” means and why it results in the connection no longer being “direct”. If NERC doesn’t explain this well enough, it sounds like FERC will order that the LERC definition be rewritten. Were that to happen, they should probably order that the ERC definition be rewritten as well, since the best discussion I’ve heard from NERC or the regions on ERC (by Morgan King of WECC. See my recent post on ERC) also relies heavily on the idea of an application layer break (Morgan calls it a “protocol break”).

This might sound like it’s a nice discussion about semantics, but there is quite an important implication for NERC entities with Low impact assets. Remember that the v5 and v6 SDTs have gone through contortions to make it clear that an inventory of Low impact Cyber Assets isn’t required for compliance – any requirements that apply to Lows are supposed to apply only at the asset level.[iii] However, if the entity is going to prove that there is no LERC at a particular Low asset – when there is clearly some routable connection going into the asset – then they are potentially going to have to do at least part of what Mediums and Highs have to do: identify any BCS and show that there is no LERC to any of them.

I started to write this section with my sympathies on NERC’s side of the argument; it appeared to me at first glance that FERC was taking things too far. However, I now see that the v6 SDT may have made a mistake by trying to assert that, when there is an external routable connection coming into a Low asset, in some cases there is LERC and in others there isn’t. It seems the only way this can be demonstrated is by looking at the individual Cyber Assets, and that will require some sort of inventory. It would probably be better if the definition just said that if there is any routable connection coming into the asset, there is LERC; if not, there isn’t LERC.
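To make the contrast concrete, here is a minimal Python sketch of the two rules. It is only an illustration: the `Path` record, its field names, and both function names are hypothetical, and the first rule is a simplified reading of the Model 4 and Model 6 descriptions quoted above, not NERC's official test.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Path:
    """Hypothetical record of one communication path from outside a Low
    impact asset to a BES Cyber System (BCS) inside it."""
    routable_into_asset: bool    # a bidirectional routable protocol crosses the asset boundary
    app_layer_break: bool        # an intermediate Cyber Asset breaks the session at Layer 7
    reauth_new_connection: bool  # it authenticates, then opens a new connection to the BCS


def lerc_per_reference_models(paths: Iterable[Path]) -> bool:
    """Simplified reading of the reference models: a routable path counts as
    LERC unless a break or re-authentication makes it no longer 'direct'.
    Answering this means examining each path, i.e. individual Cyber Assets."""
    return any(
        p.routable_into_asset
        and not (p.app_layer_break or p.reauth_new_connection)
        for p in paths
    )


def lerc_asset_level(has_routable_connection_into_asset: bool) -> bool:
    """The simpler rule suggested above: any routable connection coming into
    the asset is LERC. No device-level inventory is needed to answer it."""
    return has_routable_connection_into_asset


# Model 4 (IP/Serial Converter) versus Model 6 (Cyber Asset with a Layer 7 break)
model_4 = Path(routable_into_asset=True, app_layer_break=False, reauth_new_connection=False)
model_6 = Path(routable_into_asset=True, app_layer_break=True, reauth_new_connection=False)
```

Under the first rule, Model 4 yields LERC and Model 6 does not, but only after each path has been inspected. Under the asset-level rule both yield LERC, and the question can be answered at the asset boundary.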

What can NERC do to satisfy FERC now? Presumably a guidance document would be sufficient, since the issue isn’t in the requirement itself, but in the Guidelines and Technical Basis section of CIP-003-6. This issue should not require a new definition. But if NERC doesn’t provide guidance, or if they do and FERC isn’t satisfied with what they provide, then FERC will probably order a new definition be drafted. But note that, whatever happens, the clarification should apply to ERC as well as to LERC, since the idea of a protocol break applies to both definitions.[iv]

Transient Devices

FERC discusses the new requirement CIP-010-2 R4 for Transient Devices in Section C of the NOPR, which starts with paragraph 33 on page 20. They agree that the requirement, which applies only to High and Medium impact BES Cyber Systems, is good as far as it goes. However, they are concerned with the fact that it doesn’t apply to Lows.

To illustrate their concern, FERC says (page 26, paragraph 42) “For example, malware inserted via a USB flash drive at a single Low Impact substation could propagate through a network of many substations without encountering a single security control under NERC’s proposal. In addition, we note that Low Impact security controls do not provide for the use of mandatory anti-malware/antivirus protections within the Low Impact facilities, heightening the risk that malware or malicious code could propagate through these systems without being detected.”

To discuss the first sentence first, I wonder how many substations are connected like FERC imagines – a lot of substations all on one routable network? If there is any external connectivity in a substation, it is almost always with the control center (which should always have good security controls in place). I’m not an expert on this, but I really don’t think there are many networks where all the substations are routably connected in a peer-to-peer fashion. So I tend to agree with the SDT that the risk of a virus – carried in on a USB stick – propagating like wildfire from one substation (or generating station) to another is fairly minimal.

FERC’s second sentence points out that CIP v5 doesn’t require anti-malware measures at Lows, which FERC says heightens the risk. But I’d like to point out:

Just because anti-malware isn’t mandated doesn’t mean it’s not being used. Given the dangers, I doubt there are many NERC entities that wouldn’t always have measures like antivirus deployed wherever possible.
For devices where it isn’t deployed, my guess is they mostly are a.) not susceptible to normal malware because they don’t employ a standard OS like Windows or Linux or b.) performing real-time operations that might be hindered through antivirus software.
If there really isn’t the widespread connectivity among substations that FERC seems to think there is, the fact that CIP doesn’t require anti-malware at Lows doesn’t really change anything. An infection isn’t likely to spread beyond the substation regardless of whether anti-malware is deployed or not.

FERC goes on in paragraph 43 to request that NERC provide more justification for limiting this requirement to High and Medium impact BCS, and says that if NERC still can’t satisfy them, they will likely direct NERC to extend the requirement to Low BCS. This might be quite a difficult requirement to comply with, since substations are typically not manned. How do you prevent a contractor from plugging in a USB stick if it hasn’t been approved by the owner of the substation – since they will presumably already be granted the physical access they need to accomplish what they were sent to do?

This is a situation where the control, however beneficial, would be mis-applied. The controls need to be applied to the employees and contractors visiting the substation, through policies and training, as well as – in the contractors’ case – legal agreements that make clear what is expected regarding transient devices. But there should not be device-level controls for Low impact assets. If this really is an important issue, FERC should order NERC to make changes to CIP-004[v] so that the training requirements apply to Lows (or a subset of those requirements), not extend CIP-010 R4 to Low impact assets.

Here is Part III of this post.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] Of course, an “Intermediate System” is required of Medium and High impact facilities to control interactive remote access to systems within the ESP. So FERC seems to feel there could be more protection applied to IRA.

[ii] However, see my next post, where I speculate whether FERC actually might want to move the enforceable date for CIP v5 and v6 back from 4/1/16, and whether they’ll use a delay in approving v6 as a means of accomplishing that goal.

[iii] The first draft of CIP v5, which was roundly voted down in December 2011, contained a single requirement that applied to Lows: that default vendor passwords needed to be changed. Since the only way to audit that would have been for the entity to show they’d changed passwords on all of their Cyber Assets, this would obviously have required an inventory of those Cyber Assets. The reaction was so negative that the SDT made sure to remove this requirement in the second draft and just left requirement CIP-003-5 R2, which requires the entity to have four policies – until FERC said that wasn’t enough in Order 791.

[iv] If NERC decides to take my advice and just say that whenever there is ERC into a Low asset there is LERC, then this clarification would not involve the protocol break concept – and it wouldn’t apply at all to ERC (which is definitely a cyber asset-level concept, not an asset-level one).

[v] CIP-004-6 R2.1 already references Transient Cyber Assets in the list of training required, for Highs and Mediums. If this is so important, maybe there should be a similar training requirement applicable to Lows, and also applicable to contractors. But even though that wouldn’t be a device-level requirement, it would also be very difficult to implement; I also don’t believe the benefit from this approach would outweigh the costs required for compliance.

Monday, July 27, 2015

FERC’s New NOPR, Part I

On July 16, 2015, FERC issued a new Notice of Proposed Rulemaking (NOPR)[i]. Since they have had the eight CIP version 6 standards on their desk for approval since February, it was widely anticipated that this document would discuss v6, which it certainly does. However, what was not widely anticipated was the form this document would take, as well as the other issues that are discussed in it.

For a little background, FERC could have issued two types of documents regarding v6. A NOPR (such as the one they issued in April 2013 stating they were going to approve CIP version 5) solicits comments on a rule that FERC is considering approving – in this case, the CIP v6 standards. Once FERC has gathered the comments, it makes its decision on whether or not to approve the rule, then issues an Order approving it. An example is Order 791, which approved the v5 standards as submitted but then asked for four significant changes. Implementing these changes resulted in the eight v6 standards, which now need to be approved.

However, what I (and others) anticipated was that FERC would simply issue an Order approving v6, not a NOPR requesting comments. It seemed to me that, since CIP v6 was developed specifically to address the four changes FERC had asked for in Order 791, and since FERC staff members had observed – and commented on[ii] – the Standards Drafting Team meetings, there shouldn’t have been any big surprises when v6 was submitted in February. If there were a few points that they wanted changed, they could still have ordered NERC to make those (as they did in Order 791); these changes would become CIP version 7. But CIP v6 would have become the law of the land, and any entities who still had doubts whether it would be approved would have at long last had certainty.

However, FERC once again showed that anticipating what they will do is dangerous. They hinted strongly (in the very first sentence in paragraph 1, page 1[iii]) that they will approve the v6 standards as written; however, they also asked for comments on a number of issues that are either new or ones they don’t think NERC has adequately addressed. These are quite interesting, and I’ll discuss them at length below, and in part II of this post (which should follow in a day or two).

Before I do that, I want to emphasize the main takeaway of the NOPR: The CIP v6 standards will be approved as written, although once again there will be new and revised requirements (and probably a new CIP standard for supply chain security) coming in the future[iv]. And if v6 is approved at FERC’s October meeting[v], this means the compliance dates for the v6 requirements won’t be pushed back from what is shown in the v6 Implementation Plan.[vi]

The remainder of this post, and Part II, deals with topics on which FERC is soliciting comments. These may or may not result in revised requirements; one may result in an entirely new CIP standard.

CIP-006-6 R1.10 – “Communications Networks”

This is clearly an important topic for FERC; it is discussed in Section D of the NOPR, starting on page 27. CIP-006-6 R1.10 was added to CIP-006 in response to FERC’s directive in Order 791, which ordered NERC to develop a requirement to protect “the non-programmable components of communication networks”[vii]. By this they mean cabling and devices like dumb hubs and switches that might be physically tampered with to cause loss or alteration of communications between CIP-protected devices.

NERC chose to interpret this FERC directive to refer to a fairly narrow domain: cabling and non-programmable components between Cyber Assets within an ESP, when that cabling exits a PSP. R1.10 specifically addresses protecting such cabling and components. There are two other domains that are not addressed by R1.10. The first is cabling and components within the ESP that are also enclosed within a PSP. The SDT argued – and FERC agreed – that such items were now physically protected by the fact that all the Cyber Assets (BCS and PCAs) within a PSP already have physical protection. Thus, as long as the wiring between those Cyber Assets doesn’t exit the PSP, it is already protected. FERC doesn’t dispute this assertion.

The second domain is cabling and components that facilitate communications between ESPs. The fact is that these aren’t protected by CIP v5 at all now; it seems FERC believes they should be. This is a very interesting problem, and it wasn’t at all helped by the fact that NERC mistakenly claimed, in the Petition for approval of v6 filed in February, that this domain is already protected. Let me elaborate on this last point.

FERC says in paragraph 51 on page 34 “NERC further states that Part 1.10 only applies to nonprogrammable components used for connection between applicable Cyber Assets within the same Electronic Security Perimeter because Reliability Standard CIP-005-5 already requires logical protections for communications between discrete Electronic Security Perimeters.”

When I first read this, I thought FERC had to be wrong, since all of the CIP v5 and v6 standards include (in Section 4.2.3.2) an exclusion for “Cyber Assets associated with communication networks and data communication links between discrete Electronic Security Perimeters.”[viii] But then I realized that this exclusion doesn’t cover the cabling and components that FERC is talking about; rather, it just covers cyber assets like routers that are between ESPs.

I then thought FERC had to be simply misquoting NERC. NERC couldn’t have said that CIP-005-5 requires logical protections for communications between ESPs, could they? Because this is certainly not the case. But when I looked at page 49 of the Petition (cited by FERC), I saw this: “…Reliability Standard CIP-005-5 already requires logical protections for communications between discrete ESPs. For instance, under CIP-005-5, Requirement R2 responsible entities must do the following for Interactive Remote Access into an ESP: (1) use an Intermediate System such that the Cyber Asset initiating Interactive Remote Access does not directly access an applicable Cyber Asset; (2) use encryption that terminates at an Intermediate System; and (3) require multi-factor authentication for all Interactive Remote Access sessions.”

In other words, NERC said in their petition that the fact that there are protections for Interactive Remote Access means there is protection for “communications between discrete ESPs”. There is only one problem with this statement: it’s false. The whole idea of IRA is that it is communication into an ESP from a system that is outside of any ESP (e.g., if I use my laptop in my living room to monitor or control systems within an ESP), not from one ESP to another. FERC points this out in paragraph 56 on page 35. I find it quite odd that NERC would have made this mistake, but let’s move on.
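The three domains discussed above can be summarized in a short Python sketch. The enum and function names are hypothetical, and the sketch covers only the cabling and non-programmable components this post describes; it is a reading aid, not the wording of the standard.

```python
from enum import Enum


class Coverage(Enum):
    PSP_PHYSICAL = "inside a PSP, so already physically protected"
    R1_10 = "covered by CIP-006-6 R1.10"
    NOT_ADDRESSED = "between discrete ESPs, not addressed by R1.10"


def coverage(same_esp: bool, exits_psp: bool) -> Coverage:
    """Classify cabling or a non-programmable component by the domain it falls in.

    same_esp:  both endpoints are Cyber Assets within the same ESP
    exits_psp: the cabling leaves the Physical Security Perimeter
    """
    if not same_esp:
        # Second domain: communications between ESPs, the gap FERC is asking about.
        return Coverage.NOT_ADDRESSED
    if exits_psp:
        # The narrow domain NERC chose for R1.10.
        return Coverage.R1_10
    # First domain: protected by the PSP that already encloses the Cyber Assets.
    return Coverage.PSP_PHYSICAL
```

For instance, `coverage(same_esp=False, exits_psp=True)` falls in the unaddressed domain, which is where links between control centers sit.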

FERC goes further to state (paragraph 57) that they are concerned about protecting communications between ESPs, in particular communications between control centers; they are clearly considering ordering such protections, and want to hear comments on this topic. Given what this will entail if it becomes a requirement, I anticipate there will be a lot of comments.

To speed up the comment process, FERC concedes one point. They state in paragraph 58, “We also recognize that third-party communication infrastructure (e.g., facilities owned by a telecommunications company) cannot necessarily be physically protected by responsible entities.” But they go on to say that logical controls could be applied (and point out that CIP-006-6 R1.10 does allow for logical protections if physical protection isn’t possible).[ix]

They conclude this discussion in paragraph 59, where they state, “...we propose to direct NERC to develop a modification to proposed Reliability Standard CIP-006-6 to require responsible entities to implement controls to protect, at a minimum, all communication links and sensitive bulk electric system data communicated between all bulk electric system Control Centers.”

I don’t think I need to tell you that this is going to be a huge issue. It will certainly be an interesting discussion.

Here is Part II of this post.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] To find it, go to www.ferc.gov and click on Documents, then eLibrary. Choose General Search and search for docket “RM15-14-”. Or drop me an email at talrich@deloitte.com and I’ll send it to you.

[ii] Of course, FERC staff members always preface any comments they make at any meeting by saying, “I don’t speak for the Commissioners….” And that’s true – the five Commissioners don’t reveal their thoughts or plans on future actions to anybody else (including each other. I’ve always wondered what they talk about in the lunchroom. Must be the weather).

[iii] In this post, page numbers will refer to the actual page numbers in the document, not to the page number of the PDF file.

[iv] These will of course be numbered as CIP version 7; they will require a new Standards Drafting Team (unless the current team feels like doing this again – my guess is they won’t), new ballots, new NERC Board approval and finally new FERC approval. My fervent hope is that this time – unlike with v6 – NERC won’t just revise the standards that actually have changes, but will instead “rev” all of them to CIP version 7. Having to comply with two different CIP versions at the same time (v5 and v6) is causing – and will cause – lots of completely unnecessary confusion. Let’s not compound that by adding a third version, v7, to the mix.

[v] Here’s why October is the latest FERC can approve v6, without the implementation dates being pushed back: The dates shown in the v6 Implementation Plan for each v6 standard are all qualified by “or the first day of the first calendar quarter that is three calendar months after the date that the standard is approved by an applicable governmental authority...” Since “approval” usually refers to the date 60 days after the Order is published in the Federal Register (and since it will take a few days for the Order to be published), this means that October is probably the last month in which FERC can approve v6 in time for it to be “approved” in Q4. In the Implementation Plan, the compliance dates for the standards themselves (although there are a lot of exceptions for particular requirements, or even parts of requirements) are all April 1, 2016. This happens to be exactly the date that would result from the above qualification, if approval is in the fourth quarter. I believe the 60-day delay is to give Congress the chance to object to the Order, although my guess is that NERC CIP version 6 won’t be at the top of Congress’ mind in the fourth quarter and they will pass on this great opportunity. I want to thank an auditor friend who explained this to me.
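As a rough illustration of the arithmetic in this footnote, here is a minimal Python sketch. It rests on several assumptions that are not in the Implementation Plan text quoted above: that the qualifier combines with April 1, 2016 as "the later of" the two dates, that a quarter beginning exactly three months after approval would qualify, and that "approval" falls exactly 60 days after Federal Register publication. The function names and the sample publication dates are hypothetical.

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's last day."""
    year, month0 = divmod(d.year * 12 + (d.month - 1) + months, 12)
    month = month0 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def next_quarter_start(d: date) -> date:
    """First day of the first calendar quarter beginning on or after d (assumption)."""
    for year in (d.year, d.year + 1):
        for month in (1, 4, 7, 10):
            start = date(year, month, 1)
            if start >= d:
                return start
    raise AssertionError("unreachable: a quarter always starts within a year")


def compliance_date(published: date, stated: date = date(2016, 4, 1)) -> date:
    """Later of the stated date and the quarter-based qualifier (assumed reading)."""
    approved = published + timedelta(days=60)  # assumed 60-day delay after publication
    return max(stated, next_quarter_start(add_months(approved, 3)))


# Hypothetical Federal Register publication dates, for illustration only
october_case = compliance_date(date(2015, 10, 20))
november_case = compliance_date(date(2015, 11, 20))
```

Under these assumptions, an Order published on October 20, 2015 is "approved" on December 19, 2015 and the date stays at April 1, 2016, while one published on November 20, 2015 is "approved" on January 19, 2016 and the date moves to July 1, 2016.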

[vi] I will very shortly publish a new post that raises the question of whether FERC may actually delay approving v6, thus forcing the v5/v6 implementation date past 4/1/16.

[vii] Paragraph 46, page 28

[viii] In a webinar I did in June with EnergySec, we had a big discussion on this exclusion, although this was in the context of NERC’s April Memorandum on Network Devices.

[ix] They add, “Also, if latency concerns mitigate against use of encryption as a logical control for any inter-Control Center communications, our understanding is that other logical protections are available, and we seek comment on this point.” I’m sure there will be a lively technical debate on this issue.

Wednesday, July 15, 2015

Nine Months

On July 6, 2015 NERC emailed me and a couple thousand of my closest friends to say that a special meeting had been held on July 1 (to which I unfortunately was not invited – obviously a simple oversight); you can find the attachment to that email here. In what may qualify as the understatement of the year, NERC said they called the meeting because they “became aware that industry continued to have concerns over the issues after it (NERC) issued CIP Version 5 Memoranda dated April 21, 2015.” Furthermore, the meeting “was organized to discuss a way forward to resolve the issues and (identify) remaining questions or concerns for consideration through standards development or other means.”

Let me summarize what I think are the most important takeaways from this document:

NERC is withdrawing the Memoranda from April. This is not hugely surprising, since as I said in this recent post, half of the regions said they didn’t consider the Memoranda “auditable”, as NERC had termed them – and I’m sure the other regions agreed with them.
Since even I agree that one or two of the Memoranda – and parts of others – provide good guidance in themselves, NERC does say that “Portions of their content will be included in industry guidance, as appropriate.”
Quite significantly, NERC says that “Guidance documents will provide approach(es) to meet requirements of the standards, though entities may have other ways to achieve the same goal.” I interpret this to mean NERC is admitting they will never be able to provide any sort of “definitive” guidance on many or even most of the interpretation issues in CIP version 5; entities are ultimately responsible for determining how they will define undefined terms like “programmable”, as well as how they’ll interpret requirements that are vague or contradictory, most notably CIP-002-5.1 R1. I have been calling this approach “Roll Your Own”, but others have described it differently. For example, see the newsletter article by Lew Folkerth of RFC referenced in this post (to be honest, I like Lew’s methodology better than my own. I want to write a new post on that soon).
NERC states that future guidance will be based on the “Section 11” process, which means Lessons Learned. This is good news, since this process does allow for stakeholder input. One reason the Memoranda failed is that there was no input solicited before they were released in April.[i]
From a number of references to the “Standards Development process” in the document, it seems clear that NERC will be working on a few SARs (Standards Authorization Requests, which if approved can lead to development of a new standard or definition) to address various issues. This is of course the only way that problems with the CIP v5 standards (or any NERC standard) can really be fixed, but we are talking about a long process – it will take at the bare minimum two years for a SAR to produce a revised standard, and more likely three to four years.
Of course, what is really important is what happens next. It’s great that a) NERC has withdrawn the Memoranda, b) they’re going back to Lessons Learned as the primary guidance tool, c) they admit that entities are empowered to resolve ambiguities for themselves, and d) they’re restarting the Standards Development engine. But the compliance deadline remains 4/1/16, and it’s pretty late in the game for NERC to be still developing guidance at all, let alone putting out a document that provides no guidance in itself but merely promises guidance is coming on an unspecified schedule. Remember, the Lessons Learned typically take a long time to develop. NERC started developing the LLs last September, and since then only two have been finalized, both on completely non-controversial topics. The document doesn’t give any timetable for developing new Lessons Learned, but even some sort of huge rush effort (and there’s no indication in the document that NERC is planning such an effort) would take at a minimum three to four months. That puts us in the fourth quarter, less than six months before the compliance date. Are entities supposed to hold off starting their compliance efforts until then?
More importantly, there seem to be only about 20 LLs in the development process now, and there are many more issues that haven’t even been acknowledged by NERC (probably hundreds, and I suspect there are even more lurking in the Attachment 1 criteria. I have more than once compared issues with those criteria to the Hydra of Greek mythology; if you cut off one of its heads, two more would grow in its place. It seems as if there is no end to the questions that can be raised, once you look closely at any one of the criteria).
Even more importantly, there are problems with CIP-002 - and its associated definitions, including Cyber Asset, BES Cyber Asset and BES Cyber System - that couldn’t be fixed with LLs if you wrote one a day for the next century; there are fundamental contradictions built into the standard, which can’t be resolved through parsing the words further. CIP-002 and these three definitions need to be re-thought and rewritten. I don’t think this has to be a completely “from scratch” effort. I believe the way most people now interpret CIP-002 R1 (and Attachment 1, and the three definitions) is correct, and can provide a consistent methodology for identification and classification of BES Cyber Systems. However, this methodology (that more or less everyone is using today) doesn’t in fact correspond with most of the wording of CIP-002, meaning the only way to comply with the true meaning of CIP-002 R1 is to violate the requirement as written[ii]. The standard needs to be rewritten so it provides a consistent methodology and ontology – both internally consistent and consistent with the methodology and ontology that virtually all NERC entities are in fact now using in their compliance programs.
These problems wouldn’t be insoluble if it weren’t for the other significant (from a CIP point of view) event that transpired on July 1: As of that day, there are only nine months until the CIP v5 compliance date. I will cut to the chase: there is simply no way the interpretation problems of CIP v5 can be addressed in time for the standards to be enforceable on April 1, 2016. This would be the case even if tomorrow all of the questions regarding v5 interpretation were miraculously answered[iii] - and I can guarantee you they won’t be answered tomorrow.

So NERC - and the whole NERC community – needs to acknowledge reality and admit that CIP version 5 won’t be enforceable on April 1, 2016. Once that happens (and the sooner the better), the community can decide – perhaps in some sort of general meeting – the proper way forward, so that ultimately there will be a CIP v5[iv] that is well understood and completely enforceable. I have thought a lot about how this could actually come about; I’ll sketch below how I think this goal can be achieved (NOTE: I had planned to first write a few posts on why the current situation is untenable, before I laid out my ideas for the future. But the new NERC document led me to believe that maybe things are moving faster than I had thought they were, and it’s important to get these ideas on the table now – even though almost all of them have appeared separately in my various posts over the last nine months or so. I still plan to write the other posts in the near future – especially on why CIP-002 needs to be rewritten).

The first step is for NERC to get the SAR process moving, so that CIP-002 and portions of other standards (as well as certain definitions) can be rewritten. A SAR (or SARs) is needed to rewrite CIP-002, but probably also to address certain problem areas in other standards – e.g. a definition of “external routable connectivity” for CIP-005, and a definition of “software” for CIP-010 R1.
Next, NERC needs to acknowledge there is substantial uncertainty about way too many fundamental questions in CIP version 5 (primarily of course in CIP-002, but certainly not limited to that standard) for the standards to be enforceable on April 1, 2016.
NERC needs to go back to examine the CIP version 1 rollout, where entities had two compliance dates. The first was their “Compliant” date and the second – one year later – was their “Auditably Compliant” date. I think the same principle would work with v5. April 1, 2016 should be the Compliant date.[v] Entities need to do their best to be compliant on that date, but if they’re not – and if they’re audited in the following twelve months – they won’t be assessed any violations. Rather, the auditor will simply note the areas of improvement required, and discuss with the entity why they missed the mark on one or more requirements.[vi] On April 1, 2017, entities will have to be Auditably Compliant[vii], meaning they can be assessed violations.
However, I’m not even saying that pushing the Auditably Compliant date back to 4/1/17 is enough. In order for that to be the case, there needs to be definitive guidance (presumably from NERC) on all of the major interpretation issues with CIP v5 – the definitions of “programmable” and External Routable Connectivity; the procedure for determining “adverse impact” on the BES in the BES Cyber Asset definition; the BES Cyber System identification methodology, etc. – by 4/1/16. This will give NERC entities an entire year to implement compliance, based on an agreed-upon foundation. I’m not pretending that this effort in itself won’t be a huge one. When you consider that exactly two Lessons Learned have been finalized in the last nine months, and that fewer than nine months remain before 4/1/16, you can see why I have doubts even this could happen. But if this guidance hasn’t been provided by 4/1/16, then the Compliant date also needs to be moved back – by whatever time it takes to develop complete guidance; and the Auditably Compliant date needs to be moved back so that it’s still a year after the Compliant date. For example, let’s say the guidance is complete on 8/1/16. That would be the new Compliant date, and the Auditably Compliant date would need to move from 4/1/17 to 8/1/17. I know this sounds like a very tall order, but we don’t want to end up in the same situation this time next year that we are in now: fewer than nine months remaining to auditably comply, and lots of uncertainty still hanging over the meaning of the most fundamental concepts in CIP v5.
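The date rule in the paragraph above is simple enough to write down. What follows is only a sketch of that proposal, not anything NERC or FERC has adopted; the function names and the February 29 fallback are assumptions added for illustration.

```python
from datetime import date
from typing import Tuple

# The Compliant date as it stands today (4/1/16).
ORIGINAL_COMPLIANT = date(2016, 4, 1)


def add_one_year(d: date) -> date:
    """Return the same calendar day one year later."""
    try:
        return d.replace(year=d.year + 1)
    except ValueError:
        # February 29 has no counterpart in the following year; use February 28.
        # (Assumption for illustration; the proposal does not address this case.)
        return d.replace(year=d.year + 1, day=28)


def compliance_dates(guidance_complete: date) -> Tuple[date, date]:
    """Compute (Compliant, Auditably Compliant) under the proposed rule.

    The Compliant date slips only if complete guidance arrives after 4/1/16;
    the Auditably Compliant date always stays one year behind it.
    """
    compliant = max(ORIGINAL_COMPLIANT, guidance_complete)
    return compliant, add_one_year(compliant)


if __name__ == "__main__":
    compliant, auditable = compliance_dates(date(2016, 8, 1))
    print(compliant.isoformat(), auditable.isoformat())
```

With guidance complete on 8/1/16, this yields 8/1/16 and 8/1/17, matching the example above; with guidance complete any time before 4/1/16, the dates stay at 4/1/16 and 4/1/17.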
But there is one big exception to the two-step compliance process I’ve just described: Since CIP-002 needs to be rewritten so it can provide a consistent methodology for compliance, it obviously can’t be enforceable until it is rewritten. And it won’t be rewritten and approved by 4/1/17. This leads to an issue: How can CIP-003 through CIP-011 be enforceable if CIP-002 isn’t enforceable? The other standards all assume the entity has “properly” identified and classified its BES Cyber Systems in CIP-002 R1. Won’t there be all sorts of problems if NERC tries to enforce these other standards without at the same time enforcing 002?
This might sound like a big problem, but I don’t really think it is. As I said above, NERC entities – and the regions and NERC itself – are almost all on roughly the same page as to how to comply with CIP-002 R1: First you identify your assets (and perhaps Facilities) that are High or Medium impact through Attachment 1 (the “big iron”). Next, you identify the BES Cyber Assets and BES Cyber Systems that control those assets (the “little iron”). If the asset/Facility is High impact, the BCS are High; if Medium, the BCS are Medium. Finally, the entity takes its list of BES assets, subtracts those that are High and Medium impact, and identifies the remainder – those containing at least one control system – as “assets containing a Low impact BES Cyber System”. The problem of course is that CIP-002 isn’t written this way, leaving two choices: either every single NERC entity will be in violation of CIP-002, or CIP-002 needs to be rewritten so it reflects how everyone is actually trying to comply with it (as well as to clear up problems like the missing definitions). I submit that the latter is the more sensible approach.
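To make the three steps concrete, here is a rough sketch of that de facto methodology. It is a simplification, as the steps themselves are: the `Asset` class, its field names, and the idea that the Attachment 1 rating arrives as a ready-made input are all assumptions for illustration, and nothing here comes from the wording of CIP-002 itself.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Asset:
    """A BES asset or Facility (the "big iron"). All names are hypothetical."""
    name: str
    # "High" or "Medium" if an Attachment 1 criterion applies, "" otherwise.
    attachment1_rating: str
    # BES Cyber Systems that control this asset (the "little iron").
    cyber_systems: List[str] = field(default_factory=list)


def classify(assets: List[Asset]) -> Dict[str, List[str]]:
    """Apply the three steps: rate assets, let BCS inherit, collect the Low remainder."""
    result: Dict[str, List[str]] = {"High": [], "Medium": [], "LowAssets": []}
    for asset in assets:
        if asset.attachment1_rating in ("High", "Medium"):
            # BCS take the impact level of the asset they control.
            result[asset.attachment1_rating].extend(asset.cyber_systems)
        elif asset.cyber_systems:
            # Remaining assets with at least one control system are
            # "assets containing a Low impact BES Cyber System".
            result["LowAssets"].append(asset.name)
    return result


if __name__ == "__main__":
    inventory = [
        Asset("Control Center", "High", ["EMS"]),
        Asset("Substation A", "Medium", ["Protection system"]),
        Asset("Substation B", "", ["RTU"]),
        Asset("Warehouse", "", []),
    ]
    print(classify(inventory))
```

Note that Low impact is tracked at the asset level, not as a list of individual systems, which is the point of the phrase “assets containing a Low impact BES Cyber System”.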
Of course, the above is a simplification of the process that entities are using to comply with CIP-002, but it is in principle one that can be followed (indeed, it is basically the same process as with CIP versions 1-3, with the exception that there are now three classes of assets, rather than just two – Critical Assets and others). What is needed is for all parties (meaning NERC, the eight Regional Entities, and a healthy majority of the NERC entities subject to all of CIP v5) to agree that, until CIP-002 can be rewritten so that it actually reflects this process, they will abide by the above methodology (with a lot more detail added, of course) as the official “interpretation” of CIP-002-5.1. I don’t think there should be any actual PVs issued for CIP-002 R1 until the standard is rewritten[viii]. But I do think that, with this general understanding in place, it will be possible for an auditor to accept an entity’s lists of High and Medium BCS as legitimate, meaning that CIP-003 through -011 will be enforceable.
The corollary of the above is, of course, that CIP-002 will never be enforceable until it is rewritten. But my opinion is this will happen whether or not NERC officially acknowledges it. Given the huge problems with 002, I find it very hard to believe that auditors will issue PVs – or certainly that those PVs will turn into actual violations – if the entity has made a good faith effort to comply. Let’s say the entity has based their “Programmable” definition on the January Lesson Learned, while the auditor believes that NERC’s April Memorandum is a better guide. Is the auditor really going to write a PV, given that the entity was following what was NERC’s official guidance early this year – and that NERC withdrew the Memorandum two months after issuing it? No, the auditor isn’t going to issue a PV. This makes CIP-002 unenforceable, whether or not it’s officially declared so. But it’s much better for this to be made official, so that CIP-002 is indisputably unenforceable. If that doesn’t happen, there will be a bunch of fruitless arguments at audit time over – in this case – the “true” meaning of “Programmable”, which of course can never be determined before CIP-002 is rewritten[ix]. So whether or not NERC officially declares it so, CIP-002 R1 will be unenforceable on 4/1/16; of that I am certain.[x]
Once CIP-002 is rewritten (and approved by the NERC ballot body, the BoT and FERC), along with perhaps portions of other v5 standards, there will actually be a CIP version 5 that can be completely complied with. Which is a good thing, because I think v5 is an excellent family of standards. It can be the basis[xi] for NERC cyber security standards going forward many years.[xii]

Note: My analysis of FERC's recent NOPR raised the idea that the whole complicated process described above could be avoided if FERC simply delays approving CIP v6 until 2016. If this happens, no extraordinary actions will be required on either NERC's or FERC's part. Of course, this assumes that FERC agrees with me that NERC entities should be given more time to become auditably compliant with CIP v5/v6. For more on this idea, see this post.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] Another reason they failed was that they were described as providing “auditable” guidance – meaning there was no discretion as to whether an entity followed them or not. This of course went over like a lead balloon, and was the chief reason why there was such a big revolt against the Memoranda.

[ii] I describe what I mean by this further on in this post.

[iii] I wrote a post on January 2 saying it was now too late for v5 to be enforceable on 4/1/16, and I continue to believe that is the case – the big problems with CIP v5 would have all needed to be resolved in some way by the end of last year, for the 4/1/16 date to have a chance of succeeding. The main problem is CIP-002, since that is the foundation for the rest of the standards. Until entities can be sure they’ve properly classified their BES assets and Facilities, and properly identified and classified their BES Cyber Systems, there is simply no way they can be sure the rest of their compliance program is good. While I said this in January, it is obviously much more compelling today: If the entities all knew tomorrow exactly how to comply with CIP-002 (and the other standards, since they aren’t free of ambiguities either), there still wouldn’t be enough time to finalize their asset identification and get all the standards complied with by 4/1/16. And as you can see by reading the NERC document on the July 1 meeting, there is zero chance that all the important issues will be addressed in even three months: NERC doesn’t give any timelines at all for providing guidance, and in fact doesn’t even list what guidance actually needs to be provided. But three months from now is six months before the compliance date, and even that would be way too late to allow the 4/1/16 enforcement date to be retained. So it has to go.

[iv] As usual, by “CIP v5” I mean the three standards from the actual v5, as well as the seven v6 standards (likely to be approved by FERC on Friday, July 17), that make up what everyone calls “CIP version 5”.

[v] In this post, I’m slinging a bunch of items around that NERC “should do”. I realize full well that NERC by itself doesn’t have the authority to change compliance dates, etc; FERC has that. But just petitioning FERC out of the blue on all of this probably won’t work, either. That’s why I think there need to be meetings of NERC, FERC, the regions and the entities, at which all of this is discussed and a consensus reached. None of what I’m advocating can be considered a purely formal process, and it will fail if it is addressed as such. Instead of being based on the letter of NERC’s Rules of Procedure, it will be based on a consensus among NERC, the Regional Entities, and the NERC entities themselves. Extraordinary times require extraordinary measures.

[vi] I’m assuming here that the entity has made a good faith effort to comply – read all the guidance they can, worked hard on compliance with all the v5 requirements, etc. If they have just sat back and done very little, they should be assessed whatever violations apply. Having said this, I know this will require that some clear line is drawn between what constitutes a good faith effort and a “bad faith” one. This is another question that will simply have to be resolved through consensus among NERC, FERC, the regions and the entities. It can’t be addressed through some sort of SAR process, in time for that to do any good.

[vii] Some of the v6 compliance dates may have to be pushed back along with the first v5 date (for a complete list of v5 and v6 dates, see this post). For example, the four v6 requirements that come due on January 1, 2017 will obviously have to be pushed back at least three months and maybe more. But I think some of the compliance dates – such as for the two Low impact requirements – may not need to be pushed back, since they may not be seriously affected by pushing the 4/1/16 date (for Highs/Mediums) back a year. These are all matters for discussion with the NERC community, and of course will require FERC approval.

[viii] Except, again, for an entity that doesn’t care about the whole process and doesn’t really try to comply with CIP-002 at all. They should be treated severely, although the same provision I mention in footnote vi above applies here.

[ix] If NERC only wanted to fix the “Programmable” definition, they could issue a SAR for just that; it might be developed and approved in around a year. But given the many problems in CIP-002, I don’t think just defining "Programmable" helps; the standard and its closely related definitions (especially “programmable” and “adversely impact”; possibly one or two others like “associated with”) need to be fixed at the same time. I'll have more to say on this in one of my soon-to-come posts on how to fix CIP-002.

[x] The reason I’m certain of this is that the enforceability of CIP, or any other NERC standard, is ultimately up to the US court system. NERC can issue violations and FERC can uphold them, but in the end any violation can be appealed (and I know entities have at least seriously considered doing this, although there haven't been any cases that have gone through so far). I’m sure that a judge won’t have to spend 15 minutes reading CIP-002-5.1 R1 to realize it’s too vague and contradictory to be enforceable. When that happens, it’s game over for CIP-002 (and perhaps for all of CIP v5); once all NERC entities know this has happened, you can be sure the auditors won’t want to waste their time issuing PVs that are sure to be thrown out if appealed – and the entities will know they only have to threaten to appeal in order to get NERC and FERC to back off a violation. It will be a very ugly scene, and for that reason I don’t think it will ever actually come to pass. I have faith that good sense will prevail, and may be prevailing already. And if good sense comes to the NERC community, maybe there’s even hope for Congress!

[xi] There is another “step” that I’m less certain about, so I’ll put it in a footnote; it regards the Attachment 1 criteria. I did a post in 2012 – which I reproduced here in 2013 – that said that, no matter how many problems there were with the bright-line criteria, it is useless to try to spend a lot of time refining them. I don’t think God Himself could write a set of short criteria that would capture the huge variability of the electric power industry. The only realistic way to deal with the many problems that have already come up with the BLC – and the exponentially larger number that will come up in future years – is to have some sort of “Supreme Court of BLC”, that will rule on each BLC dispute. This may seem arbitrary, but it’s still better than the RBAMs in CIP v1-3; FERC obviously thinks those were even more arbitrary, and intentionally so. Having “bright-line criteria” (FERC’s term, I believe) was in principle a good idea, but I just don’t see it ever working without some sort of arbitration mechanism like this.

[xii] I’m not naïve enough to say there won’t need to be any future versions of CIP (in fact, it seems that FERC is already proposing further extensions and changes to the CIP standards, in their July 16 NOPR). But I think the framework of the standards – once CIP-002 is fixed – can carry CIP forward for many years after that. Of course, IMHO the best thing would be a purely risk-based standard like CIP-014, as I described at the end of this post. I don’t see that happening very soon, though.

Friday, July 10, 2015

What’s a High-Impact PACS like You Doing in a Low-Impact Substation like This?

I very recently engaged in a good email discussion that I’d like to share with you. I think it’s worth reading because:

It can be important to certain entities that might be in a similar situation;
The discussion illustrates how this kind of CIP-002 question can be addressed; and
As an afterthought, a good – and much less expensive – solution to the problem came up. It’s important always to be looking for these.

The discussion began when a consultant whom I have known for a long time – and with whom I have had a number of CIP discussions, some of which have made it into this blog – emailed me about my recent post on the question of “far-end” relays. He stated that he has a client with a PACS system in a Low impact substation, and that this system is controlled by the same PACS server that protects the client’s High impact Control Center.

He had determined that, even though the PACS was in a Low substation, the fact that it was associated with the High impact PACS meant it was, itself, High impact. The client at first resisted this finding (understandably, given the higher compliance costs it would entail), but when he pointed out that an attacker could use the substation PACS workstation to alter the configuration of the PACS server in the Control Center, the client came around. The consultant wondered whether I agreed with his position.[i]

My answer was a very safe “It depends”.[ii] If the question were whether a BES Cyber System at a substation could be High impact because it’s associated with a BCS at a High Control Center, my answer would be “Absolutely not”. High BCS are limited to those “used by and located at” High assets (which are all Control Centers, of course). Since the device in question isn’t located at the Control Center, it clearly can’t be a High BCS.

But since we’re talking about a PACS system, not a BCS, the question is different. “Used by and located at” doesn’t apply in this case. What does apply is the word “associated”. Normally, when I talk about this word – which is a lot – I’m referring to the preamble to Section 2 of CIP-002-5.1 Attachment 1, which says that Medium BCS are those that are “associated with” the assets or Facilities that are the subject of the 2.X criteria.

However, “associated” is used in another location in v5 (actually, many locations): that is the Applicability column of all of the CIP v5 standards (except for CIP-002 and -003). Just about every requirement applies to some variation of Medium and/or High impact BCS and “their associated…PACS”. The question now becomes, “Is the PACS workstation in the Low substation associated with the BCS in the High Control Center?” If so, then many (but not all) of the High impact requirements will apply to it.

Note the question isn’t whether the workstation is associated with the High impact PACS server itself; since it uses the server, the answer to that is clearly yes. But is the workstation in the Low impact substation really associated with the different BES Cyber Systems – ICCP servers, etc. – in the Control Center?

Two things are clear: First, the fact that the PACS workstation in the Low substation could be used to attack the PACS server in the Control Center is completely irrelevant here. Any computer in the world that’s connected to the Internet could, in theory, be used to attack the PACS server. Obviously, you want to take good precautions to protect the PACS workstation in any case, but it won’t be “High impact” for this reason.

Second, the “high water mark” concept has nothing to do with this situation. The fact that the workstation in the substation forms part of the overall PACS system – which includes the “High impact” PACS server – doesn’t mean that all of the components of the PACS system suddenly become “High impact”. Unless, of course, they’re all on a single network in a single ESP. But I really don’t recommend having an ESP that spans locations.[iii]

An implicit assumption so far is that the PACS server in the High Control Center is protecting the High BCS in the Control Center. If the server were really controlling just substations and/or generating stations, but was simply located at the Control Center for convenience, then I contend it would be a) Medium impact if one or more of the assets for which it controlled access were Medium, or b) Low if it controlled only Low assets.[iv]

At this point in the discussion (actually earlier – I’m condensing some), I brought in an auditor friend, with whom I often have conversations about different CIP v5 issues. He said he thought the PACS workstation in the Low impact substation was actually “associated” with the BCS in the High Control Center, and therefore would itself be subject to many of the High requirements, through the Applicability references in the v5 requirements that apply to High BCS. I believe he bases this interpretation on the idea that the workstation could actually change the security settings of the High BCS. This is because the workstation presumably would have access to the entire database on the server, not just the settings for the users with access to the substation it’s located at.[v] This means it would be associated with High BCS.

The auditor elaborated by saying:

“There are multiple components to a PACS and they all have different roles and types of access. Starting at the top, you have the PACS server that contains the database of access rights for all access points being controlled by the PACS. So, if you have multiple sites being controlled, the PACS server is associated with each and every one of them, and by extension, the BCS at those sites for which the PACS is controlling access.

“Then you have the door control panels, which is where the local site control takes place. The PACS server pushes the access rights for the access points controlled by the door control panel to the panel itself. The panel then operates autonomously, opening (or not) the doors and logging all transactions. It will ship its logged transactions to the server if the server is available, waiting if necessary until communications with the server are reestablished. But, it can and will operate in the absence of the server using its latest download. So, the door control panel is associated with whatever is behind the door(s) it is controlling.

“And, of course, you have the badge readers, door strikes/mag locks, and door open/closed sensors located at each access point. They interface with the door controller. The badge reader sends the information from the badge just presented (along with the scanned finger print, entered PIN, or whatever in a multi-factor access) to the door controller and the door controller sends the signal to unlock the door if access is to be granted. These PACS components are explicitly excluded from the CIP standards.

“The odd man out is the workstation. The workstation is how you interface with the servers to see the logs and/or configure/revoke a badge’s access rights. A number of entities have tried to argue that the workstation is not part of the PACS because the workstation is not “controlling” access. The Regions have argued that the workstation is part of the PACS because granting access is more than reading a badge and opening the door. Granting access includes assigning access rights and that is done with the workstation.[vi] So, the workstation is associated with every access point for which it can manage access rights; usually the entire PACS database. And, by extension, it is therefore associated with the BCS sitting behind the door(s). If you have multiple impact levels, then the workstation takes on the highest impact level of what is behind the doors the workstation can affect.[vii]

“Most, but not all, entities segregate their PACS network from their operational networks. The workstations will either be on the PACS network, or they will be on the corporate network, with the server bridging the workstation and door controller networks. Some entities have put their PACS inside their ESP; we usually recommend they rethink that decision.”
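The auditor's rule for the workstation amounts to a "highest level reachable" calculation. The sketch below illustrates his reading only; the site names, the `site_impact` mapping, and the function name are hypothetical, and no PACS product exposes its access points in exactly this form.

```python
from typing import Dict, Iterable

# Ordering of impact levels, lowest to highest.
RANK = {"Low": 1, "Medium": 2, "High": 3}


def workstation_impact(manageable_sites: Iterable[str],
                       site_impact: Dict[str, str]) -> str:
    """Return the highest impact level among the sites whose access rights
    the workstation can manage (the auditor's reading of "associated")."""
    levels = [site_impact[site] for site in manageable_sites]
    if not levels:
        raise ValueError("workstation manages no access points")
    return max(levels, key=RANK.__getitem__)


if __name__ == "__main__":
    impact = {"control_center": "High", "substation_a": "Low"}
    # A workstation that can reach the entire PACS database:
    print(workstation_impact(["control_center", "substation_a"], impact))
    # A workstation that can only manage its own substation:
    print(workstation_impact(["substation_a"], impact))
```

Under this reading, what matters is not where the workstation sits but which doors it can affect.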

I’m stopping here. This particular auditor says the PACS workstation in the Low substation is associated with High impact BCS, in the case presented to him. I don’t agree with him, because I believe this is stretching the meaning of the word “associated”. It would be nice if NERC were to develop a Lesson Learned on this topic, but I doubt that will happen any time before the compliance date next April, and perhaps not even after that. I’m afraid this has to go on top of the already big pile of questions for which the entity – that finds itself in this situation – needs to consider all available guidance, then “roll their own” approach.

I also want to call your attention to footnote v, which lays out a technical control that might by itself prevent this situation from happening. It will probably be much easier to simply implement that control than have to apply a bunch of High impact requirements to the workstation in the Low substation.
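For readers who think in code, the control in footnote v can be sketched roughly as follows. This is a toy model under stated assumptions, not a real PACS API: the `PacsAccount` class, the certificate identifier, and the site names are all hypothetical, and the auditor said only that such a restriction could possibly change the “associated with” determination.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class PacsAccount:
    """A server account bound to one workstation and scoped to specific sites."""
    username: str
    workstation_cert_id: str        # certificate the account is linked to
    allowed_sites: FrozenSet[str]   # sites whose access settings it may change


def may_change_settings(account: PacsAccount,
                        presented_cert_id: str,
                        target_site: str) -> bool:
    """Allow a change only from the bound workstation and only for in-scope sites."""
    return (presented_cert_id == account.workstation_cert_id
            and target_site in account.allowed_sites)


if __name__ == "__main__":
    account = PacsAccount(
        username="sub_a_operator",
        workstation_cert_id="cert-sub-a-ws",
        allowed_sites=frozenset({"substation_a"}),
    )
    print(may_change_settings(account, "cert-sub-a-ws", "substation_a"))
    print(may_change_settings(account, "cert-sub-a-ws", "control_center"))
```

The password check is left out for brevity; the point is that the account cannot touch the Control Center's settings no matter who is sitting at the substation workstation.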

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] Of course, this question is tied to the far-end relay question by the fact that both involve the issue of a High or Medium impact cyber asset being found in a Low impact substation – which can potentially be expensive, from the compliance point of view.

[ii] It was also followed by my standard disclaimer that the ultimate authority on any question of CIP v5 for any NERC entity is the applicable Regional Entity. In the end, it matters naught what I say, what NERC says, or what the other regions say. None of us will audit the entity.

[iii] An auditor friend – the same one quoted below in this post – pointed out that if an ESP encompasses multiple assets (locations), the communications equipment that connects the two assets will no longer be exempt from CIP v5, per Section 4.2.3.2 of each of the Standards. This is because that exemption applies to devices that are between ESPs; if there’s just one ESP, all the devices are potentially BES Cyber Assets.

[iv] The PACS server wouldn’t be subject to High requirements because in this case it wouldn’t be “associated with” any High BCS. If it controlled access at any Medium substations or plants, it would be subject to some Medium requirements, because then it would be “associated” with Medium BCS. If it only controlled access at Low assets, it would only be subject to CIP-003-6 R2.

[v] As I wrote these words, it occurred to me that there could be a way to restrict any user of the substation workstation from changing any security settings on the PACS server, other than those that apply to that particular substation. I discussed this with the auditor, who confirmed that, if an account on the server were linked to only that workstation – through say a digital certificate, as well as the usual user name/password – and if that account were restricted to only being able to change the settings for access to the Low substation, then that could possibly mean the workstation would no longer be considered “associated with” High BCS. Thus, it wouldn’t have to comply with any High impact requirements. This really seems to be the best solution to this problem, rather than making the PACS in the Low substation subject to a number of the High requirements.

Another assumption we made is that the PACS server isn’t networked with the High BCS in the Control Center. If it were, that might conceivably strengthen the case that the workstation in the Low substation was “associated” with the High BCS. But I don’t think having the PACS server on the same network as the BCS is a good practice in any case – for one thing, it increases the compliance burden since the PACS server now also becomes a Protected Cyber Asset.

[vi] The auditor made this note:

“(Here is) the definition of PACS from the NERC Glossary:

Physical Access Control Systems (PACS): Cyber Assets that control, alert, or log access to the Physical Security Perimeter(s), exclusive of locally mounted hardware or devices at the Physical Security Perimeter such as motion sensors, electronic lock control mechanisms, and badge readers.

“It could be a bit more clear for sure, but the workstation is needed for the control function (the act of granting and revoking access rights) and the alerting function (alarm display). That is why we have, since Day 1, viewed the workstation as part of the PACS.”

[vii] I think this is a good principle for entities to follow. In practice, it means that, when there are Medium and Low BCS at a single substation or generating station, the PACS will be Medium impact. This follows pretty clearly from the requirements.