Sunday, April 10, 2016

What’s in the CIP v7 SAR

This is the first of two posts on the draft NERC CIP Standards Authorization Request (SAR) recently posted for comment by NERC. This post discusses what is in the SAR; a subsequent post will discuss what isn’t there. The second post will be longer, since there’s more to say about what isn’t in the SAR than about what is in it.

The SAR starts by pointing out that there are two primary reasons why a new CIP version needs to be developed (or if you prefer, why CIP v5 and v6 need to be “revised”). The easier reason to discuss is the fact that FERC ordered some changes to NERC CIP in their Order 822. The harder reason is that the CIP v5 Transition Advisory Group (V5TAG) – the group that developed the Lessons Learned on v5 – decided there were certain v5 issues that couldn’t be addressed simply by providing guidance to NERC entities; they require revisions to the actual standards. The V5TAG provided a document discussing issues they would like the Standards Drafting Team (SDT) to address.[i]

What FERC Ordered
To address the easier part first, the SAR lists three items that were ordered by FERC:

  • Protection of transient electronic devices used at low-impact bulk electric system cyber systems,
  • Protections for communication network components between control centers, and
  • Refinement of the definition for Low Impact External Routable Connectivity (LERC). 

I was disappointed when I saw these listed in the SAR, since – as I reported in this recent post – this list is incomplete. You can read my discussion in my post on Order 822, but I believe the most significant of FERC’s directives was where FERC stated (in paragraph 56) “NERC’s response to the directives in this Final Rule should identify the scope of sensitive bulk electric system data that must be protected and specify how the confidentiality, integrity, and availability of each type of bulk electric system data should be protected while it is being transmitted or at rest.” (Emphasis mine)

In other words, FERC wants protections for both data at rest in control centers and data in motion between control centers; neither of those protections is included in the second point above.[ii] I assume the SDT will in fact correct this problem. I’m sure FERC wasn’t joking when they said they want protections for data being transmitted or at rest (in control centers).[iii]

I discussed all of these items in my post on Order 822, so I won’t repeat that discussion here.

What the V5TAG Requested
Besides what FERC ordered, the SAR tasks the SDT with a number of changes requested by the CIP v5 Transition Advisory Group (aka V5TAG). Here are what I consider to be the main changes requested by the V5TAG:

  1. Cyber Asset definition. Of course, the problem with this definition is that the word “Programmable” is undefined. As I’ve previously stated, I think it would be a waste of time to try to develop a dictionary-style definition of “programmable”. I think a set of use cases – e.g. “a device with any of the following characteristics is programmable…” or even better, “a device that doesn’t have the following characteristics isn’t programmable” – is the right approach here. The V5TAG’s document seems to favor this idea.

  2. BES Cyber Asset definition. The SAR lists three desired modifications to the BCA definition. The first is “Focusing the definition so that it does not subsume all other cyber asset types.” This actually sounds a little scary (“The definition that ate all cyber asset types!”), but what it seems to be aimed at is the idea that, if all devices on the OT network (including EACMS and PCAs) are considered to have adverse impact if misused, then everything on the OT network will be a BCA. Of course, this leads to an infinite regress, where you now need other EACMS to protect the current EACMS/BCAs, and you need further EACMS to protect those, etc. I think one way to avoid this might be to distinguish between adverse impact caused by the misuse of the device itself and adverse impact caused by the fact that the device is networked with other devices that definitely are BCAs. The device in the former case would be a BCA, while in the latter case it wouldn’t. I believe this would keep both PCAs and EACMS from automatically being BCAs.

  3. The lower bound. The second item related to the BCA definition is “Considering if there is a lower bound to the term ‘adverse’ in ‘adverse impact’.” The TAG continues “For example, is the focus of a typical generating unit the servers and operator human machine interfaces (HMI) and controller cabinets and Programmable Logic Controllers (PLCs) or is it the thousands of individual sensors and transmitters throughout the plant?” They’re clearly saying they would like there to be a clear line that would keep all those sensors and transmitters from becoming BCAs. All I can say on this one is, Good Luck!

This might seem like a question that has a straightforward answer, but suppose we come up with a technical boundary line where “impact” kicks in. For example, impact could be defined as “causing a variation in frequency of .001%” or something like that. How would an entity ever be able to establish how much impact the loss of a particular Cyber Asset would cause? And how would an auditor disprove an entity’s assertion that a Cyber Asset’s impact was below the threshold? Even if there were a straightforward way to do this, it seems to me it would lead to a nightmare situation, in which a test would have to be done at a certain point on the grid before and after any device was installed or removed from service. But how could the tester be sure that an electrical change wasn’t caused by some completely different event that happened in another part of the grid at exactly the same time? This request strikes me as impossible to fulfill. I honestly don’t see any alternative to simply having this be a matter of judgment, on both the entity’s and auditor’s parts, as is the case now with v5.[iv]

  4. Double impact. The third item related to the BCA definition is “Clarifying the double impact criteria (cyber asset affects a facility and that facility affects the reliable operation of the BES) such that ‘N-1 contingency’ is not a valid methodology that can eliminate an entire site and all of its Cyber Assets from scope.” I actually think this contains two issues.

The first issue is the fact that the BCA definition talks about the impact that the loss or misuse of a Cyber Asset would have on the BES, when in fact – in my opinion – it is almost impossible to conceive of a Cyber Asset impacting the BES by itself; rather, that impact occurs because it first impacts an asset or Facility, and that asset or Facility impacts the BES.[v] For example, the loss of the DCS in a unit of a generating station will certainly have an impact on the BES, but only if it actually controls that unit; if the unit happens to be offline anyway, the loss of the DCS wouldn’t impact the BES. This relates to what I have referred to both as the Fundamental Problem and the Original Sin of CIP v5: the fact that a lot of CIP-002 is written under the assumption that Cyber Assets can directly impact the BES, while other parts are written under the assumption that they only have an impact through the actual assets or Facilities.[vi] I definitely support making it clear (in the BCA definition) that a Cyber Asset only impacts the BES if it impacts an asset or Facility, and the latter impacts the BES.

The second issue is whether “N-1 contingency” is a valid “excuse” for removing an asset or Facility from scope in CIP v5. I think the answer to that is definitely no. In fact, I’m surprised it’s still considered a legitimate question, since I thought it had been laid to rest – with a stake driven through its heart – in the CIP v1-v3 days. An asset’s loss may not impact the grid if there’s another asset that will “fill in” for it if it suddenly goes down. But in the event of a widespread cyber attack that brings down a number of assets (e.g. generating plants) or Facilities (e.g. lines) at once, this reasoning goes out the window. So I think there needs to be some phrase added to the BCA definition to the effect of “N-1 contingency is not a valid reason for determining there is no BES impact”.
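To make the two points above concrete, here is a minimal sketch in Python of how I read the “double impact” test. Nothing in it comes from the SAR or the standards: the class, the field names and the two example devices are all hypothetical, made up purely for illustration. The one design choice worth noting is that the N-1 flag is recorded but deliberately never consulted.

```python
from dataclasses import dataclass


@dataclass
class CyberAsset:
    """Hypothetical record for one Cyber Asset (all field names are assumptions)."""
    name: str
    impacts_facility: bool       # would loss or misuse affect its asset or Facility?
    facility_impacts_bes: bool   # would that asset or Facility affect the BES?
    n_minus_1_backup: bool = False  # another asset could "fill in" for this one


def is_bes_cyber_asset(ca: CyberAsset) -> bool:
    """Two-part test: both links in the chain must hold.

    The n_minus_1_backup field is intentionally ignored, reflecting the
    view that N-1 contingency is not a valid reason to find no BES impact.
    """
    return ca.impacts_facility and ca.facility_impacts_bes


# Hypothetical examples, for illustration only.
dcs = CyberAsset("DCS controlling a generating unit", True, True, n_minus_1_backup=True)
lobby_display = CyberAsset("lobby status display", False, True)

print(is_bes_cyber_asset(dcs))            # both answers are yes, so it is a BCA
print(is_bes_cyber_asset(lobby_display))  # first answer is no, so it is not
```

Of course, the hard part is deciding what the two yes/no answers should be for a real device; the sketch only shows how they combine once you have them.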

  5. The exemption section. We now get to a new section entitled “Network and Externally Accessible Devices”. The first item in this section refers to an exemption section that is found in all of the CIP v5 and v6 standards. The issue here is that this section exempts from CIP v5 “Cyber Assets associated with communication networks and data communication links between discrete Electronic Security Perimeters.” This provision is actually meant to exempt communication links between assets like control centers and substations; there is a very similar provision in CIP v3. However, in drafting CIP v5 the SDT forgot that – unlike in v3 – there can now be BES Cyber Systems in an asset where there is no routable network and therefore no ESP. A strict reading of this exemption would lead to the communications links to such assets not being exempt from CIP v5, which was clearly never the original intent of the SDT.[vii] The V5TAG has already developed a good Lesson Learned on this problem, but it now needs to be incorporated into the standards themselves.

  6. TO / TOP Issue. For anybody affected by this, I know it’s a very big deal, and the TAG spends a lot of time discussing it. But since I don’t have any special knowledge of this issue, I won’t comment.

  7. Virtualization. I don’t think anyone in the NERC community disputes the idea that CIP needs to take specific account of virtualization. There wasn’t any account taken in CIP versions 1-3 either, but the effect of that was that most entities didn’t even try to use any form of virtualization within the ESP. However, virtualization has become so prevalent now that many entities are using it in their CIP v5 ESPs, notwithstanding the fact that it’s no more “permitted” in v5 than it was in v3.

The only way I differ from what the TAG says on this topic is in my idea of what will be required to make CIP “virtualization-friendly”. They just suggest “revisions to CIP-005 and the definitions of Cyber Asset and Electronic Access Point”. Of course, all of these are necessary, but I believe that a lot of the other requirements will need to be updated as well. To give one example, TRE pointed out in a webinar last year that the hypervisor and its VMs all need to be within the same PSP. This should be stated in CIP-006.

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] NERC has also stated that existing Requests for Interpretation (RFIs) will be incorporated into the SDT’s “agenda”, but this isn’t mentioned in the SAR.

[ii] I brought this up at the NERC CIPC meeting in March and was told by Tobias Whitney and Scott Mix that it was a mistake to leave this item off the list. However, it seems the SAR was being finalized that very day, and it wasn’t corrected. I did notice that the agenda for the NERC Technical Conference in Atlanta in two weeks seems to include data in motion between control centers.

[iii] I’m told that protecting data in motion between control centers with encryption won’t be too hard. I’m also told that trying to encrypt data at rest in a control center will be problematic, since this might hinder response to a crisis. However, there are other ways to provide special protection to data at rest; the SDT will have to determine what is most appropriate.

I also just noticed that FERC says (in the quote above) that NERC should specify how the CIA of each type of BES data should be protected in motion or at rest. This is actually a bigger mandate than I’ve been interpreting it to be. It will be interesting to see how the SDT addresses this.

[iv] Of course, when you have to rely on the auditor’s judgment in something as fundamental as the BCA definition, you’re conceding the fact that CIP v5 and v6 can never really be enforceable in the strict sense, as I’ve been saying for a while. I got over this problem by becoming convinced that it’s time to move to a different approach for NERC CIP altogether; if you meditate on that idea three times a day, you may get over the problem, too. You’ll sleep much better at night!

[v] I discussed this in a post in April 2015, where I pointed out that a Cyber Asset’s “impact on the BES” needs to be broken into two parts. The first is its impact on an asset or Facility; the second is that asset or Facility’s impact on the BES. In order to know whether the Cyber Asset is a BCA, you need to ask whether there is an impact in both cases. If so, then it is a BCA; if one or both of the answers is no, then it isn’t a BCA.

[vi] I remember raising this issue a couple times with the v5 SDT in 2011. In fact, I asked if anyone could give me an example of a Cyber Asset that had a direct impact on the BES, independently of any asset or Facility. The only example that was brought up was that of a leak detector, which sits directly on a line and, if tampered with, might not identify a leak that could itself have an effect on the BES. However, as discussed in this recent post, the leak detector wouldn’t even be in scope as a BES Cyber System in CIP v5, since it’s not located at one of the six asset types in CIP-002 R1. And even if it were so located, I think it’s a case of the tail wagging the dog if we pretend that the leak detector is actually the paradigm case for any Cyber Asset that impacts the BES; it is much more the exception than the rule.

[vii] Of course, as mentioned above, FERC now wants protection of cyber assets on communications networks between control centers – although they’re not asking for it for communications with substations.
