Thursday, August 31, 2017

What about Virtualization?


In my last post, I listed six items in the Standards Authorization Request (SAR) for the current CIP Modifications drafting team that I believe will never be accomplished. I discussed my reasons for thinking the first five won’t be accomplished, but I deliberately left the last one – virtualization – for another day. Another day has come, and I’ll explain why I don’t think the drafting team will accomplish this task, either.

First off, what is this task? NERC included virtualization in this drafting team’s SAR because CIP v5 and v6, like all of the other CIP versions before them, say nothing about virtualized cyber assets. But many NERC entities with High and Medium impact assets have been using virtualization – of systems, of networks, and of storage – for years. This is especially true in control centers, where the benefits of virtualization are most pronounced.

So what does it mean that CIP doesn’t mention virtualization? Does that mean it’s forbidden? No, not at all. It means it’s completely off the radar for CIP. For instance, if you use virtualized servers in your control centers, all you have to do is treat the single piece of hardware that they all run on as a BES Cyber System; this is because CIP currently only applies to Cyber Assets, defined as “Programmable electronic devices.” And a “device”, in my naïve way of looking at things, is something that will hurt if you drop it on your foot. Even if you can figure out how to drop a virtual machine on your foot, it’s pretty clear that it won’t hurt you.

This means that you don’t have to declare your virtual servers themselves as BES Cyber Assets/Systems, and therefore that the requirements in CIP-003 through -011 don’t apply to them. In practice, of course, I’m sure that any NERC entity with High or Medium assets is declaring all of their virtual servers within the ESP to be either BCAs or PCAs; I’m also sure every Regional Entity is keeping an eagle eye out for VMs that haven’t been declared. But if a NERC entity were to be fined for not declaring virtual servers to be BCAs and they were to challenge the fine in court (which they are always allowed to do, since all NERC standards are administrative law, arbitrated by administrative law courts), I believe it is highly likely the fine would be thrown out, simply because they aren’t violating any current CIP requirements.[i]

But even leaving the question of virtual servers aside, there are lots of thorny questions that come up when you look at virtual networks and virtual storage – and these can’t be fixed with just a tweak or reinterpretation of the word “devices” in the Cyber Asset definition. The CIP Modifications drafting team didn’t try to sweep any of these questions under the rug; on the contrary, they actively sought them out.

Early in the team’s existence (they started operation in the spring of 2016), they formed a committee to work on virtualization. That committee quickly realized that there are some very fundamental concepts in CIP – besides the concept of Cyber Asset – that need to be changed to allow virtualization to be encompassed. For example, the idea of an Electronic Security Perimeter has limited use in a virtualized environment, since it focuses entirely on devices – the physical device is either in or out of the ESP, along with all of the virtual devices that may be housed in it. But since switches, storage and servers can all be “divided” in some way into different virtual components, and since these components won’t always all be part of the ESP, there needs to be another way of talking about a security perimeter when these are present; so the committee came up with the concept of an “Electronic Security Zone”.

I have good news and bad news. The good news is that this committee has really done some pioneering work, and they have developed a new way of thinking about cyber security in virtualized environments (not just on OT networks, either); to see some of their work, you can go to the drafting team’s web site and look at the slides and listen to the recordings from the webinars they’ve done. The team could document what they’ve done in a really nifty white paper (or even a book), which should have applicability far beyond the CIP world.

But here’s the bad news: Implementing the new concepts the committee is talking about will require a huge number of new or changed CIP requirements and definitions. Each of these will need to be drafted, posted for 2-4 ballots, commented on, revised multiple times, and finally approved. In this post from early January, I did a little math. I pointed out that, in the case of the changes to the LERC definition and the underlying requirement, this whole process had taken more than six months of this drafting team’s almost exclusive time in the second half of 2016 and the first 2-3 months of 2017.

I then combined this with an estimate that one of the virtualization committee members had given me, that there were probably 15-20 (or even more) definitions or requirements that would need to be changed or developed from scratch. Assuming this is true, and that each of these will take up 6 months of the SDT’s time to get approved by the ballot body, I came up with the (perhaps conservative) estimate that just virtualization will take ten years for the drafting team to accomplish! Needless to say, this is never going to happen.
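For anyone who wants to check my math, here is a back-of-the-envelope sketch in Python. It assumes the items go through the ballot process strictly one at a time, at six months each (the LERC experience), and uses the 15-20 item range I was given; the function and variable names are just mine for illustration.

```python
# Rough estimate only: assumes items are balloted one after another,
# each taking about six months of the drafting team's time (as LERC did).
MONTHS_PER_ITEM = 6


def years_needed(items, months_per_item=MONTHS_PER_ITEM):
    """Return the total years to get `items` changes through balloting."""
    return items * months_per_item / 12


low_estimate = years_needed(15)   # lower end of the 15-20 item range
high_estimate = years_needed(20)  # upper end of the 15-20 item range
print(f"{low_estimate} to {high_estimate} years")
```

If the true number of items is above 20, or any single item takes longer than LERC did, the total only goes up from there.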

I know that, at one point at least, there were some on the drafting team who were advocating they just draft all of the changes required for virtualization and submit these as one huge ballot (or maybe 2 or 3 big ballots), that NERC entities could either approve or disapprove in toto. This strikes me as a wonderful prescription for spending an incredible amount of time (just this effort would take well over a year, maybe two years), and likely coming up with exactly zero to show for your efforts, since – unlike the case with LERC – there is no mandate from FERC that this be done. And that seems to be what it takes to get new CIP requirements, or substantial changes to existing ones, approved nowadays, as was also demonstrated by CIP-013.

At one point, I heard a member of the drafting team compare the work they’re doing to the CIP v5 standards. Those standards were of course submitted as a whole to the NERC ballot body, and – while their path to passage certainly wasn’t easy, taking close to two years and at least four ballots – they were ultimately approved and are now in effect. And it’s certainly true that CIP v5 required many more individual changes than virtualization will. So why shouldn’t we expect the same thing to happen here – that it might be a rough year or two, but ultimately the changes required for virtualization will pass?

I’ll tell you why the same thing won’t happen with virtualization as happened with CIP v5. First off, remember that CIP v5 was very much ordered by FERC, in Order 706 from January 2008, although it took five years for NERC to “respond” to the order[ii]. (There were of course a lot of diversions along the way, including CIP versions 2-4. Of these, v3 was due to a new FERC directive, while v2 and v4 were deliberate choices on NERC’s part.) So the NERC ballot body was very much aware that FERC wanted them to approve v5. And FERC had added another incentive when they unexpectedly approved[iii] CIP v4 in April 2012. In doing that, they in effect said “You’d better approve v5 soon. If you don’t, you’ll end up having to comply with v4 and then with v5. Does that sound like much fun?”

The other reason why the same thing won’t happen has to do with the CIP v5 experience itself. While there was a lot of optimism around v5 initially (including on my part), as NERC entities started their v5 implementation work in early 2014 all sorts of questions about what the requirements and definitions really meant started popping up. NERC tried various expedients to address these questions without doing the one thing that is allowed by the Rules of Procedure: forming a drafting team and tasking it with fixing many if not most of the worst problems. Most of the “fixes” NERC tried ended up being withdrawn, with the result that today, August 31, 2017, NERC entities are no closer to having answers to their CIP v5 interpretation questions than they were the day FERC approved v5 in 2013. (This sorry process is discussed in a little bit of detail in this post. If you want to read the whole story, I suggest you read every post I wrote between this one and say January 1, 2016. Then maybe you can summarize them for me, since I have no intention of reading them all any time soon. A human being can only take so much punishment.)

I’m not a psychologist, but I think the NERC community has been in a kind of shell-shock ever since the CIP v5 experience. This has manifested itself in two ways. The first is a suspicion of anything that a NERC drafting team puts out for a ballot, no matter how good it is. I saw this most vividly in the debate over the “LERC” changes last year, where there was a firestorm of opposition to the first ballot, and the second ballot probably only passed because there was a deadline of early February for NERC to get this change to FERC. This was in spite of my opinion that a) the second ballot, which passed, wasn’t substantially different from the first one, which didn’t; and b) both ballots (actually, drafts) should have been no-brainers to approve – they allowed the entity to do everything allowed in the CIP v6 version of this requirement, plus much more.

The second manifestation of shell-shock was vividly brought home in a regional CIP workshop I attended last year. At that meeting, a member of the CIP Modifications drafting team gave a presentation on their progress, including some of the changes that virtualization would require. Then a member of the CIP v5 drafting team (in fact, the team that drafted CIP versions 2-5) made an eloquent appeal that the team not subject the NERC membership to debating and voting on any more far-reaching requirements or definitions (and I believe he had the LERC experience in mind).

Think about this: Here was a very well-respected member of the CIP v5 drafting team essentially saying there shouldn’t be any more improvement or expansion of NERC CIP! And simply because the NERC membership was “too tired” (I think he used some words to that effect) to be able to debate these issues anymore.

Of course, what this person was talking about wasn’t just virtualization, but the whole agenda (SAR) for that team. If the team actually came out with a new requirement and/or definition that addressed any one of items 1-5 in the list of SAR directives in my last post (e.g. clarification of what “Programmable” means in the Cyber Asset definition), it’s almost inevitable there would be another big, divisive debate, just as there would be if the team started trying to codify far-reaching changes like the Electronic Security Zone. So this is another reason why I think the drafting team will need to get their SAR reduced to the items they’ve addressed already and the two items they’re now actively working on: CIP-012 and the TOCC issue. If they try to slog through with the rest of their agenda, they should all literally ask for a reduction in their day job responsibilities for the next 15-20 years; I’m sure that won’t be any problem at all.

Assuming the SDT does get their SAR reduced, the chance there will ever be a future SAR aimed at addressing the fundamental problems in CIP v5/v6 is exactly zero. This is another reason (besides the one discussed in my previous post) why I see no likelihood there will ever be any further changes or additions to the current CIP standards, except where explicitly allowed by FERC.

There is also a corollary to this statement, which I will bring up in one of my next posts.


The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte.


[i] I have heard a few people around NERC make the argument that, because the definition of Cyber Asset reads “Programmable electronic devices, and the hardware, software and data in those devices” (my emphasis), this means that virtual servers all do have to be treated just the same as physical servers. This is a pretty tenuous argument in my opinion, since this wording clearly was intended to cover the normal situation where application software is stored on a device and utilized in that device’s normal operation, not where virtual devices themselves – each with their own application software – are housed in a physical server.

[ii] And the fact that it did take five years was very much on FERC’s mind from then on. Since then, every significant new or changed standard that they have ordered has come with a deadline – this applies to CIP-014, LERC and CIP-013.

[iii] I didn’t start this blog until January 2013, but I did write about that event, and what it meant, on a Honeywell blog that no longer exists. I can send anyone who’s interested the post I wrote discussing this idea, which was later obliquely confirmed to me by a source who knows something about what goes on internally at FERC. The fact that FERC took this maneuver turned out to have some serious unfortunate consequences, as discussed in a series of posts starting with this one.

2 comments:

  1. I think fundamentally this goes back to the issue of devices being capable of having multiple designations or functions. Is an EAP also a BCA? Or, can a BCA be an EAP?

    Putting on my security hat, an EAP should never be a BCA and should solely be used for physically segmenting the ESP from other networks.

    Otherwise, if an EAP can be a BCA, why can't I have a VM environment that does everything and just be designated an EAP/BCA, with some EACM guests, BCA guests, intermediate system guests, etc?

    Again, putting on my security hat, the problem is that a compromise in the switching, hypervisor or storage layers can result in the entire VM infrastructure being vulnerable.

    Conversely, I'd put forward that a VM environment with a virtual guest that operates as a BCA should be designated a BCA as well and all components (blades, hypervisors, switching, storage, etc.) also be PCA and all guests be at minimum a PCA.

    A physical firewall should be required to isolate all components of the VM environment, including storage. If someone can take control of SAN storage (even from a totally different system), or cause a guest VM to act badly and compromise a hypervisor, it can adversely affect the entire VM environment.

    Just look at all the security patches' release notes available from the VM manufacturers. No matter how well patched, there will be new flaws.

    JasonR

  2. Thanks, Jason. This is a good discussion to have with the virtualization committee of the SDT, assuming they decide to develop a white paper or something like that. But it illustrates the fact that there will be huge debates if the SDT tries to actually make the required changes in CIP. My estimate of 10 years is probably too low....
