Tom Alrich's Blog: An Auditor Comments on My Last Post

My last post discussed interesting things I’d learned at WECC’s recent Advanced CIP Training. An auditor (not from WECC) made a few good comments on the post, which I’d like to share with you – along with comments from Morgan King of WECC and a well-known relay vendor. None of this will make sense unless you read the last post.

“Protocol Break”

First, let’s go to the topic of ERC, which Steve Parker of EnergySec has referred to (in one of EnergySec’s newsletters) as “the gift that keeps on giving”.[i] In the last post, I paraphrased Morgan King of WECC, who stated at the training that, “if a device that “translates” routable to serial communications requires authentication of the user, this still does break both ERC and LERC.” In addition, he acknowledged there could be other ways in which ERC and LERC are broken.

Since the Schweitzer SEL-3620 is a widely-used device for substation communications, I approached Schweitzer and got a technician’s view on what this device does. The technician said, “The (3620’s) user session…is terminated on the 3620. The user is not communicating directly with the end device, and the IP address used to talk to the device is that of the 3620, not the relay. In addition to requiring authentication of the user, the 3620 enforces restrictions on user roles based upon group membership, reformulating commands to the end device as required, acting as the user's proxy.”

When I sent this description to the auditor, he replied “The key information…is the reformulation of the user’s commands and the acting as a proxy for the user. Yes, the IP address is that of the gateway and the port number identifies which serially-connected device is to be acted upon. That is how a terminal server works regardless of any other functionality. At audit, the technical information your source provided is exactly the type of information the entity will need to provide to demonstrate whether there is a protocol break.” In a subsequent email, the auditor said, “A terminal server may or may not include session authentication, but unless there are two clearly independent conversations like is seen in the 3620 or in an RTU, there is no protocol break[ii].”

To make his point clear, I’ll paraphrase the auditor by saying that just requiring authentication and substituting the 3620’s IP address don’t in themselves constitute a protocol break, since a pure terminal server could do the same things. What does break ERC in this case is the “reformulation of the user’s commands and the acting as a proxy for the user,” as well as creating “two clearly independent conversations.”
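To make the distinction concrete, here is a minimal Python sketch of the two behaviors the auditor contrasts. It is an illustration under stated assumptions, not a model of how the SEL-3620 or any real terminal server is implemented: the class names, the role table, and the command strings are all hypothetical. The pass-through device forwards the user's bytes unchanged, while the proxying gateway ends the user's conversation at itself, checks the user's role, and then sends its own reformulated command to the serial device.

```python
class SerialDevice:
    """Stand-in for a serially connected relay; records every command it receives."""

    def __init__(self):
        self.received = []

    def write(self, data: bytes) -> bytes:
        self.received.append(data)
        return b"OK"


class TerminalServer:
    """Pass-through: the user's bytes reach the device unchanged (one conversation)."""

    def __init__(self, device: SerialDevice):
        self.device = device

    def handle(self, user_bytes: bytes) -> bytes:
        return self.device.write(user_bytes)


class ProxyGateway:
    """Ends the user session at the gateway and issues its own commands to the device."""

    # Hypothetical role table and device command syntax, for illustration only.
    ROLE_COMMANDS = {
        "operator": {"READ_STATUS"},
        "engineer": {"READ_STATUS", "SET_POINT"},
    }
    DEVICE_SYNTAX = {"READ_STATUS": b"STA\r", "SET_POINT": b"SET\r"}

    def __init__(self, device: SerialDevice, user_roles: dict):
        self.device = device
        self.user_roles = user_roles

    def handle(self, user: str, request: str) -> bytes:
        role = self.user_roles.get(user)
        if role is None:
            raise PermissionError("unknown user")
        if request not in self.ROLE_COMMANDS[role]:
            raise PermissionError(f"role {role!r} may not issue {request}")
        # Second, independent conversation: the gateway builds the device command itself.
        reply = self.device.write(self.DEVICE_SYNTAX[request])
        # The answer returned to the user is composed by the gateway, not relayed raw.
        return b"gateway:" + reply
```

In this sketch, nothing the user types ever reaches the relay through `ProxyGateway`; only commands the gateway itself composes do. That is the property an entity would presumably need to document at audit, as opposed to authentication alone, which a pass-through device could also perform.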

This might on the surface look like a contradiction of what Morgan King said at the WECC meeting, but I now realize this ERC discussion just keeps jumping up to a new level every time it comes up – so both auditors are right for the level at which the discussion stood when they made their statements. The fact that devices like the 3620 require authentication is necessary but not sufficient for them to break ERC. The auditor is pointing out that more is required for ERC to be truly broken, and I’m sure Morgan would agree with that (of course, it seems funny for me to refer to “the auditor” here, since Morgan is an auditor, too. I really mean “the other auditor”).

When I first wrote this post, I concluded this discussion by saying, “I would love to say that I think this settles the question of ERC with serially-connected devices, if I hadn’t thought the same thing at least four times before this.” Lo and behold, I’ve now rewritten this section just a few days later, even before it is posted – and I’m absolutely certain this isn’t the final word on this topic. ERC is a black hole. No amount of Lessons Learned and blog posts will ever be able to settle this question. What does this mean for folks who don’t have the luxury of endless debate, since they have a 4/1/16 compliance deadline looming? I really don’t know.

You might point out that NERC did release a Lesson Learned on ERC in September. However, this issue isn’t clarified by that document. There is a section on pages 2 and 3 entitled “System-to-system process controls”. This has a diagram showing a setup like we’ve been discussing, with a port server and serially-connected BCAs. The LL states that “Serially connected BES Cyber Assets that can be accessed via a protocol converter (identified as a port server in Figure 1) were not considered to be BES Cyber Assets with External Routable Connectivity as defined below.” This seems to say that a port server or terminal server would actually break ERC. As you can see from the above discussion, two very influential auditors don’t believe this will normally be the case - and NERC themselves didn’t think this was the case in their April Memorandum on this topic, which was subsequently withdrawn. So this LL hardly settles the matter.

LERC

In the next section of my post, I pointed out that Dr. Joe Baugh of WECC had confirmed (in answer to my question) that an entity asserting there was no LERC at a Low impact asset that had one or more routable connections to the outside world would most likely be required to inventory its cyber assets at the location and show none of them actually had LERC. Dr. Joe said this because the entity would have to be able to prove that the ERC coming to the asset was not actually shared by any of the cyber assets there.

However, the auditor commented “The key to demonstrating there is no LERC to Low Impact BCS is documentation of the network/communications design to and within the asset. I can envision this being demonstrated without having to have a detailed inventory of Low Impact BCS. Show me what is connected to the routable network (e.g., detailed network diagrams, router/switch/firewall rules) and you can demonstrate no BCS are attached without enumerating the BCS themselves. Now, if you have a terminal server device on your routable network, I will be asking questions. But I have seen plenty of substations where the BCS are all serially connected within the asset and communicate with the Control Center with a serial link while there is a routable network for non-BCS stuff including the station PC, substation automation, and digital fault recorders.”

The auditor is stating that an inventory of individual cyber assets may not be required to prove there is no LERC at a Low asset with an external routable connection. You just have to be able to show there are no BES Cyber Assets (whether serially or routably connected locally) that are connected to that link in a way that would lead to one or more of them having ERC (e.g. you would need to show that the device that connects them to that link is doing something similar to what the SEL-3620 does in the above discussion).

Intermediate Systems

In a discussion of Morgan King’s presentation at the WECC meeting, I stated that he said “an Intermediate System (IS) should be in the DMZ, but an overriding consideration is that it must be in a PSP; this is because an IS must be declared an EACMS (per the definition of IS). So if your DMZ is outside of the PSP, you need to include the IS within the ESP/PSP.”

However, Morgan pointed out to me that he didn’t say the IS should be within the “ESP/PSP”, but just within the PSP. In fact, he and the (other) auditor both pointed out that the IS can never be in an ESP. As the auditor said, “… per the very definition of Intermediate System, it absolutely cannot be within the ESP. Period. It is an EACMS and must be in a PSP. Putting it in a DMZ is a best practice, not explicitly required by the standard.” So I apologize to Morgan for misquoting him.

Physical Ports

Discussing a presentation on CIP-007, I paraphrased the presenter as saying “For CIP-007 R1.2 compliance (physical ports), it’s not OK to just put signs warning against use of USB devices on the PSP. They need to be on the device itself (or its cabinet if locked).”

The auditor commented, “Not exactly. The standard allows signage. The standard does not prescribe placement. We will evaluate the placement and its proximity to the protected Cyber Asset to determine if it has a deterrence capability.” Morgan said he agreed with the auditor, but added “The important thing is simply this: If an SME has physical access to a cabinet that contains Cyber Assets in scope for CIP v5 and others that are not in scope, the signage should remind the SME with physical access of this directive control. This will hopefully make him/her think before plugging anything into a Cyber Asset in scope without being authorized. If every asset in the cabinet is an Applicable System, then it may be a reasonable approach to place signage on the cabinet itself.”

Real-Time Alerting

From the same presentation on CIP-007, I quoted the presenter as saying “For R4.2, the Guidance implies that real-time alerting can be accomplished with technical means only. But the actual requirement allows for procedural means as well.” The auditor pointed out “I cannot envision a procedural control that provides for “real-time” alerting. Procedural controls can detect an alertable issue, but it will not be in real time.” Good point, Auditor!

The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] I certainly can’t argue with that statement, since I’ve done at least ten posts on ERC and LERC in various permutations. This is second of course only to the (probably) 60-80 posts I’ve written on problems with CIP-002 R1 and Attachment 1 - and associated topics like the meanings of “programmable” and “affect the BES”. It isn’t surprising that all of these topics should be what Steve Parker calls a “gift that keeps on giving” – and what I call black holes – since they all are really parts of the general issue of identifying and classifying cyber assets that are in scope for CIP v5. I believe CIP v5 is an excellent set of standards, once you get beyond the first step of identifying what they apply to. But with the current wording of the standards, there is nothing but confusion and contradiction in that first step.

This is why I’m advocating that CIP-002 needs to be rewritten if it’s ever going to be enforceable, even though this will take several years. It’s the foundation for the other standards, and no matter how good they are, I don't believe they can survive in the long run, being built on top of a rotten foundation. However, I’m not sure what to do about ERC. ERC is defined, and I don’t think there’s a problem with the definition as far as it goes. But coming up with a rule for how that definition applies in the case of BCS connected serially to an externally routably-connected device like an RTU has so far proved completely elusive.

It’s possible that a definitive guidance document (probably pretty long, covering every possible case of ERC) would suffice to make ERC an enforceable concept. But I do know that each of the ten or so times I’ve addressed this issue, I thought I had the problems solved – and then it turned out I hadn’t considered something else, which I then addressed in a subsequent post. Until that situation changes, I will continue to call ERC a black hole along with CIP-002 R1. If NERC comes to agree with me that ERC is a black hole – meaning that no document could possibly solve its problems – then I think all of the v5 and v6 requirements that apply to BCS with ERC will need to be rewritten. This is a daunting task and will probably have to wait for the next major revision of CIP. Until that time, there will need to be agreement that no violations will be issued for honest differences of opinion on what constitutes ERC. My next post will discuss one region that has already stated this is their policy.

[ii] The auditor went on to say “Another, maybe better example, is a standard firewall that performs AAA authentication at the initiation of the session and then allows communication into the protected network. Yes, the firewall is inspecting each message, but once authenticated, the only decision by the firewall is whether the subsequent packets in the authenticated session are permitted or denied by the applied Access Control List rules. There is no further interception, reformulation, and proxying of the traffic. It is just passed along if the ACL permits it. If the firewall is fronting a relatively unsophisticated terminal server, there is no apparent protocol break anywhere in the path between the end user and the target Cyber Asset serially connected to the terminal server. And I said “maybe” because there are proxying firewalls out there. That is not the type of firewall I am talking about.”

Thursday, October 8, 2015
