Wednesday, December 9, 2015

An Auditor Responds on Phone Systems, XML Listeners, and More!


After my recent post discussing why phone systems (as well as fire suppression and HVAC systems) aren’t BES Cyber Assets, an auditor emailed to add the following points. I’m reproducing his words verbatim, although I’ve put some comments of my own in italics.


Auditor: A quick note on redundancy.  Redundancy as envisioned by the CIP standards is redundancy of functionality, not necessarily redundancy of physical Cyber Assets.  For example, many market systems use XML as a backup to ICCP to send generation deployment instructions.  If the ISO/RTO market rules require a primary and a backup method and XML happens to be the backup to ICCP, then both are considered BCS.

Tom: The auditor is referring to my discussion where I implied that redundant systems are usually identically configured. He makes a good point that systems can be quite different but still be “redundant” for purposes of the BCA definition – meaning they both could potentially be BES Cyber Systems. You might now ask, “OK, if the ICCP and XML systems can be considered redundant for the purpose of the BCA definition, why aren’t the ICCP and IP phone systems also redundant?” The answer is that the XML system is in place specifically to provide backup to ICCP, whereas the phone system obviously wasn’t put in place for that purpose, and does much more than back up ICCP.

Auditor: But, I also call your attention to the NERC CIP V5 FAQs.  For all of the complaining about NERC guidance, this issue is squarely addressed in the FAQ.  See FAQ 3-2014, found here. The question asked was “Some of the systems not previously covered under the CIP Standards before may fall under the assessment process under CIP V5. Do we assess the systems that could cause the EMS (BES Cyber Assets) to fail such as UPS, HVAC (building power control system and cooling for computer room)?”

The response was “If a device meets the definition of a Cyber Asset, as defined in the NERC Glossary of Terms, then it is subject to consideration as a BES Cyber Asset as defined in the NERC Glossary of Terms.  HVAC, UPS, and other support systems are not the focus of the CIP Standards and will not be the focus of compliance monitoring, unless any such support systems, including HVAC and UPS, are within an ESP. If such support systems are within an ESP, these systems would be a PCA inheriting the highest impact rating within the ESP.”

While not explicitly calling out phone systems, the reference to “other support systems” quite properly includes telephone communication systems as being excluded.  The only exception, as noted, is if the phone system is, for some unknown reason, connected to a network segment inside an Electronic Security Perimeter, making the phone system a Protected Cyber Asset.

Tom: I discussed this FAQ in this post from April. I frankly find it quite disappointing. Of course, I agree with the conclusion that “HVAC and UPS” aren’t BCAs. However, I don’t understand NERC’s reasoning. They seem to be saying simply that “support systems” aren’t BES Cyber Assets, notwithstanding the wording of the BCA definition.

I feel there are two problems with that. One is that “support systems” isn’t a NERC defined term. Someone might argue that their EMS is a support system, since it obviously supports the BES. So an EMS isn’t a BES Cyber Asset? And how about substation relays? They support the BES. Are they also out of scope? There won’t be much left in scope in CIP v5 if anything that seems like it might be a “support system” is ruled out.

Second, even if this were a defined term (really a phrase), NERC isn’t saying here that the BCA definition – as it currently reads – excludes support systems; to do that, they would have to first define the term, then show why these systems don’t adversely impact the reliable operation of the BES within fifteen minutes when needed if they are misused, etc. In other words, they would have to do what I did for phone systems and HVAC in my previous post – although I wasn’t grouping these under a general term like “support systems”.

In other words, they seem to have implicitly added a sentence to the end of the BCA definition, reading something like “Support systems are not BES Cyber Assets.” Now, I have said repeatedly that somebody – be it NERC, FERC, the Regions, President Obama, the United Nations – needs to go beyond the wording of the standards to clarify issues that can’t be addressed in pure Lessons Learned, so the fact that NERC is modifying the BCA definition doesn’t itself upset me. What does upset me is that NERC isn’t acknowledging that this is what they’re doing – in fact, I don’t think whoever wrote this FAQ was even aware of it.

But in this case I don’t think it was necessary to go beyond what the standards say. As I showed in the post from April linked above (and in the follow-on post on phone systems as well as the post just previous to this one, also on phone systems), the BCA definition as it currently stands seems to exclude phone systems, HVAC and fire suppression systems as BCAs. There was no reason for NERC to have to amend the BCA definition and invent a new – but undefined – class of “support systems”, just so they could eliminate HVAC and UPS from being considered BCAs. There was a completely “by the book” way to do this.

Auditor: Were I to make a case at all for including a phone system in my list of BES Cyber Systems, it would be on the basis that the phone system was the primary and only means to conduct reliability operations and that inability to conduct reliability operations resulted in a sub-15 minute reliability impact.  I have not found too many registered entities so configured, certainly not enough to justify such a sweeping “in scope” declaration proffered by the other Region.



The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

Tuesday, December 8, 2015

Back to the Phones


To be honest, I thought the issue of whether or not IP phone systems would be BES Cyber Systems was dead. I wrote a post on this question in April; it gave my reasoning (although a full account of the reasoning needs to include the two previous posts in that “series”: this was the first, and this was the second), which had been developed through email discussions with an auditor. I thought the issue was pretty settled, not because I’d written a post on it, but because I assumed that all of the auditors would be of more or less the same opinion (an auditor in another region had also confirmed my reasoning).

However, I was surprised to hear from at least three entities in one region that they had been told that IP phone systems were in scope as BCS (in one case, it was a 25-year-old general communications system that combined radio, phone, pagers, etc. It obviously wasn’t an IP phone system, but the entity was told it was in scope because it had a microprocessor – an 80286, no less! – and was thus technically a programmable electronic device). So this idea is evidently not quite dead. Let me try to drive a stake through its heart now.[i]

I’ve reread the original post, and I don’t have new arguments to add to that. However, in retrospect I see my emphasis should have been different. Here’s my new argument:

Of course, the issue is whether the phone system meets the definition of a BES Cyber Asset. That definition states that a Cyber Asset’s loss, misuse, etc. must “adversely impact” the Bulk Electric System within 15 minutes for it to be a BES Cyber Asset.

Before I started the actual argument in the post, I pointed out that, strictly speaking, almost no cyber assets directly affect the BES. Most have their impact through their control of either an asset (e.g. the Distributed Control System controlling a generating station) or a Facility (e.g. a relay controlling a circuit breaker, whose operation impacts one or more lines). There are almost no instances I can think of where a cyber asset impacts the BES on its own;[ii] almost all of them first impact a BES asset or Facility, which in turn impacts the BES itself. This fact makes it hard to understand what the BCA definition means when it talks about the Cyber Asset impacting the BES.

In the post (actually, in my second post in the “series”, linked above), I suggested that the way the BCA definition could be interpreted was by looking at both steps of the process: the impact of the Cyber Asset on the asset or Facility it’s associated with, and the impact of the asset or Facility on the BES. You need to ask whether there is an impact at either step. If there is an impact at both steps, and the BES impact is within 15 minutes of the loss, misuse, etc. of the Cyber Asset,[iii] then the Cyber Asset is a BES Cyber Asset. If there is no impact at either of the steps, or impact at one step but not the other, then the Cyber Asset isn’t a BCA.
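The two-step test can be sketched as a small decision function. This is purely illustrative; the function name and parameters are my own shorthand for the questions in the paragraph above, not NERC terminology:

```python
def is_bes_cyber_asset(impacts_asset_or_facility: bool,
                       asset_or_facility_impacts_bes: bool,
                       within_15_minutes: bool) -> bool:
    """Two-step BCA test: the Cyber Asset's loss/misuse must impact
    its asset or Facility, that asset or Facility must in turn impact
    the BES, and the BES impact must occur within 15 minutes of the
    Cyber Asset's loss, misuse, etc."""
    return (impacts_asset_or_facility
            and asset_or_facility_impacts_bes
            and within_15_minutes)
```

On this sketch, a plant DCS answers yes to all three questions and comes out a BCA; a Cyber Asset that fails either step, or whose BES impact isn’t within 15 minutes, does not.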

A generating plant’s Distributed Control System is unambiguously a BES Cyber Asset, since a) its failure will immediately shut down either the entire plant or one or a few units; and b) the shutdown of the plant (or its units) will have an immediate BES impact. How about the phone system?

Typically, an IP phone system that might be considered to be a BCA would be the system in a control center, which was used for various purposes including dispatching generation. There will often be an Automated Generation Control system that does the dispatching, and I don’t think there’s any dispute that this should be a BCA. But if the AGC goes down, the phone system may become the primary means of dispatching.

What happens if the RTO signals to the control center that a plant needs to be dispatched, but AGC is down? Obviously, the dispatcher will pick up a phone. But what happens if the phone system is down? Will the entity not be able to dispatch the plant? It’s certainly possible that could happen, but I would think someone would realize they have a cell phone in their pocket or purse that can achieve the same purpose. Wouldn’t they use that? So it’s hard to say that not having the phone system will have a 15-minute BES impact.

At this point, I know some will jump in to point out that the BCA definition states “Redundancy of affected Facilities, systems, and equipment shall not be considered when determining adverse impact.” Does this mean the phone system has to be a BES Cyber Asset, since it’s the alternative dispatching method to the AGC system (I hate to say the two systems are “redundant”, since an AGC system and a phone system are very different)? From what I’m told, this is in fact the argument that at least one Regional Entity is using.

What are we usually talking about when we say two systems are redundant? Typically, they will be identically configured servers, so one can take over if the other stops for some reason. More importantly, if both of the servers are down at the same time, no other system can step up to perform the tasks they were doing (and if another system can step up, then it should be considered a third redundant system and thus in scope for CIP v5). In other words, for redundant systems, if both systems go down, the task they performed will necessarily go undone; this will almost certainly impact the asset or Facility (since we’re talking about control systems here. If the loss of the systems doesn’t impact an asset or Facility, then they’re probably not control systems in the first place).

How about with the AGC and the phone system? As I’ve just said, if they are both down, the dispatcher will presumably pull out his cell phone to make the dispatch call. Or he might walk over to another office where the phones are still working. Or he might go to a phone booth (that’s just a joke. I’m not sure I’ve even seen a phone booth in years). Or if the plant is just a couple minutes away, he might get in his car and drive there. Or he could use a carrier pigeon or smoke signals. The point is that the loss of AGC and the IP phone system won’t necessarily mean the dispatch instruction won’t get through, causing an adverse impact on the BES. That is the big difference between the redundant servers and the AGC/phone system.[iv]

This is why I said in the April post – and I’m reiterating here – that it’s important to insert the word “necessarily” (or “inevitably”) into the BCA definition, i.e. “…would, within 15 minutes of its required operation, misoperation, or non-operation, necessarily adversely impact one or more Facilities, systems, or equipment…” In other words, only Cyber Assets whose loss or misuse will necessarily impact the BES within 15 minutes when needed are BES Cyber Assets. And phone systems just don’t make the cut.

There are a couple other systems people have brought up, for which the same argument applies as for phone systems. First, there’s one that I previously thought had to be a BCS: a fire suppression system in a substation. I thought, “Surely this is a system that, if not able to function when needed (i.e. during a fire), would adversely impact the BES”.

However, the same auditor referred to at the beginning of this post pointed out to me that a fire breaking out in a substation, with no fire suppression system there to extinguish it, doesn’t necessarily mean there’s a BES impact. For example, someone who’s there could grab a fire extinguisher and put the fire out before it causes damage. Or the wind might be blowing in a direction that takes the fire away from the lines and equipment, so there is still no BES impact. Just as with the phone system, the fact that there isn’t a necessary impact means the fire suppression system isn’t a BCS.

Another set of systems that comes up often is HVAC in control centers and generating plants. Let’s say the heating fails in the control room of a plant in Grand Forks, ND in January. Obviously, one way of dealing with that could be shutting the plant down, but there are certainly other ways. Maybe everyone can put on coats while the system is being fixed. Maybe they can have shifts of people alternating one hour off and one on to keep the plant running. Again, there isn’t necessarily a BES impact.

The moral of this story: If you think IP phone systems have to be BCS, you should take a cue from the character Sportin’ Life in Gershwin’s Porgy and Bess: “It ain’t necessarily so…”


The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.


[i] I may additionally have to bury it at midnight in a lead-lined coffin with its head pointed downward. You can never be too careful with these things!

[ii] At one of the meetings of the CSO 706 Standards Drafting Team (which drafted CIPs v2, 3, 4 and 5), I asked if someone could give me an example of a cyber asset that did impact the BES without the mediation of an asset or Facility. The only example I was given was that of a leak detector, which I hadn’t heard of but evidently sits in the middle of a line and alerts the control center to current leaks – thus potentially fulfilling the Situational Awareness BROS. I’ll stipulate this may be true, but of course this wouldn’t be in scope for CIP v5 anyway. The only BCS that are in scope are those a) “used by and located at” one of the six asset types listed in CIP-002-5.1 R1 that is High impact; b) “associated with” an asset or Facility that is Medium impact; or c) “contained” by a Low impact asset. And of course, Low impact BCS aren’t themselves in scope for CIP v5; only the assets that contain them are.

[iii] “When needed”, of course.

[iv] Of course, this assumes that all of these alternative systems – cell phones, your car, etc. – wouldn’t be considered redundant systems, and themselves wouldn’t have to be declared BCS. Yet if they aren’t BCS, then the phone system shouldn’t be, either.

Friday, December 4, 2015

What’s a Security Patch?


My recent post on patch management brought up a new issue having to do with CIP-007-6 R2. An auditor made this statement:

“Probably the biggest question is whether an update that provides for configuring and entering a complex password constitutes a security patch.  While I would dearly like to assert that it is, I cannot.  A security patch is an update that corrects a vulnerability in the affected code.  Modifying the code to allow a better password is a feature update and does not address a vulnerability.  Therefore, the patch that allows for stronger passwords is not subject to the CIP standards[i].”

Let’s unpack what the auditor is saying:

  1. R2 says the entity needs to consider “cyber security patches” for application; by implication, any patches that may be released by the “patching source” (typically the software vendor) that are not “cyber security patches” do not need to be considered. So what is a cyber security patch?
  2. I’m sure this will surprise you, but there is no NERC Glossary definition of this phrase. However, the Guidance and Technical Basis discussion of R2 says “The requirement applies to patches only, which are fixes released to handle a specific vulnerability in a hardware or software product. The requirement covers only patches that involve cyber security fixes and does not cover patches that are purely functionality-related with no cyber security impact.”
  3. In the quote above, the auditor is saying that a hypothetical update that provides the capability of entering complex passwords (presumably to a device that previously didn’t have this capability) is a “functionality-related” patch, not a cyber security one. This means the entity isn’t violating R2 if they decide not to even consider applying this patch.
  4. The auditor’s statement led two people – a cyber security and compliance professional at a NERC entity, and Prince Riley of Pinnacle Systems Group – to write in and say they didn’t see how the hypothetical patch couldn’t be called a security patch. After all, there aren’t too many patches that could enhance security more than one that made possible the use of complex passwords, right?
  5. The important thing here is to look at what the SDT said in the Guidance: “A security patch is an update that corrects a vulnerability (italics mine) in the affected code.” Does the lack of complex password capability constitute a vulnerability? If you look at the previous post to the one cited above, you will see that one software vendor (quoted by Brandon Workentin of EnergySec in end note iii) states that "Security vulnerabilities involve inadvertent weaknesses; by-design weaknesses may sometimes occur in a product, but these aren't security vulnerabilities." If you follow their reasoning, the lack of complex password capability in a device is certainly not inadvertent. It is most likely there because, at the time the device firmware was developed, there was some reason why this capability couldn’t be provided. This position implies that a firmware patch that provided this capability isn’t a “cyber security patch” as defined by the SDT.
  6. The discussion so far would seem to bring the whole question of whether the patch in question is or isn’t a security one down to the question of what is a “security vulnerability”. Of course, that’s not a defined NERC term[ii], which would normally lead me to say that this is a case just like the phrase “Programmable electronic device” in the definition of Cyber Asset. There is no NERC definition of “programmable”, nor is there likely to be some sort of definitive guidance on this anytime soon. So it is up to each entity to decide how they will determine whether a device is programmable or not[iii]. Similarly, the discussion so far leads me to want to say that, since there is no NERC definition of “cyber security vulnerability” (and since I have never heard that one is on the horizon – indeed, I have never even heard this issue raised until now), it is simply up to each entity to determine how they will decide whether or not a patch is a cyber security one. As in all cases of missing definitions or ambiguity, the entity needs to document a) how they resolved the ambiguity (e.g., by developing a definition) and b) that they have applied that resolution on a consistent basis (meaning they can’t, for example, use one definition for “programmable” in one case but a different one in another).
  7. However, I’m not ready to say this. The reason for my hesitation is the second sentence in the quotation from the CIP-007-6 Guidance and Technical Basis that I quoted above: “The requirement covers only patches that involve cyber security fixes and does not cover patches that are purely functionality-related with no cyber security impact (italics mine).” Here, the SDT is saying that one of the criteria for a patch that isn’t covered by this requirement is that it doesn’t have cyber security impact. By implication, this means that any patch that impacts cyber security is subject to the requirement. Doesn’t adding the capability for complex passwords impact cyber security? It’s very hard to say no to that. This may be why the auditor stated in the previous post that it was likely other auditors would take the other side on this issue.[iv]

What’s the point of this discussion? It’s that there is a fundamental ambiguity in CIP-007-6 R2 that is unlikely to be resolved without a SAR. The term “cyber security patch” is simply not defined anywhere, and the Guidance can be read two different ways (plus the Guidance isn’t mandatory in any case). As with “programmable” and many other issues, it is up to the entity to a) decide how they will address the issue, b) document their decision, and c) apply that decision consistently in complying with R2.
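One way to operationalize that decide/document/apply-consistently approach is to keep a simple record for each ambiguous term. This is only a sketch of the idea; the structure and all of the names in it are mine, not anything required (or even suggested) by the standards:

```python
from dataclasses import dataclass, field

@dataclass
class AmbiguityResolution:
    """A documented resolution of an undefined CIP term or phrase."""
    term: str                  # e.g. "cyber security patch"
    working_definition: str    # how the entity chose to resolve the ambiguity
    rationale: str             # basis for the resolution (Guidance wording, etc.)
    applications: list = field(default_factory=list)

    def record_application(self, item: str, decision: str) -> None:
        # Log each use of the definition, so consistent application
        # can be demonstrated later
        self.applications.append((item, decision))
```

The point of the applications log is item (b) above: being able to show that the same resolution was applied the same way every time, rather than one definition in one case and a different one in another.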

Of course, doing all this will require a large amount of time, which is why I’ve said many times that the effort required for CIP v5 compliance is somewhere between 5 and probably 50 times more than what was required for v3 (and this is assuming the same number of assets are in scope. For many if not most NERC entities, there are many more assets that are Medium or High impact under v5 than were Critical Assets under v3. This only compounds the increased workload). I used to think there was a white knight that was going to come in and clear up all of the issues with CIP v5 that are making so many peoples’ lives (and budgets) much more complicated. But I don’t see any white knights on the horizon. If you see one, let me know.


The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.


[i] Of course, the issue of functionality vs. security patches would normally only come up in the context of firmware upgrades. I doubt there are many functionality updates that are pure software patches. They would normally be software upgrades, which aren’t in scope for R2.

[ii] The auditor pointed out to me that there is a definition of “security vulnerability” in NIST 800-30: “A flaw or weakness in system security procedures, design, implementation, or internal controls that could be exercised (accidentally triggered or intentionally exploited) and result in a security breach or a violation of the system's security policy.” The auditor notes that, were this to be adopted as NERC’s definition, his opinion on the issue of the hypothetical functionality patch would most likely change 180 degrees. But he has to deal with what is in the standards now, not what he would like to see in them.

[iii] Notice that I’m not saying here that each entity has to come up with their own “definition” of programmable. This is because it’s not at all clear to me that there could even be a definition in the dictionary sense. I’ve written several posts on this issue, but the one I like best is the first one from September 2014. In that, a CIP professional discusses the method he developed – and documented – for distinguishing programmable from non-programmable electronic devices. As you can see, this is certainly not a dictionary definition, but it has worked very well for this person; a couple others told me they adopted it for their organizations.

[iv] I made this point to the auditor, and he said that—in his opinion—the Guidance for CIP-007-6 R2 as a whole supports the idea that a cyber security patch is one that addresses a specific vulnerability, not one that updates functionality. Of course, when an auditor sees ambiguity in the wording of a requirement (or in this case, in the Guidance), he or she should never issue a PV just because their personal opinion on the issue is different from that of the entity being audited. This means that, even if the auditor held the opposite opinion that the patch that increases password complexity should be implemented, he certainly wouldn't issue a PV to an entity that had decided they didn’t need to consider that patch.

Tuesday, December 1, 2015

A Free Offer: CIP Version 5 “Quick Check”

Note from Tom: Deloitte Advisory has offered this workshop to a number of NERC entities in the past couple of months, and has found an excellent reception for it. If you are interested in this, please get in touch with me. Note that I can’t guarantee we can accommodate everybody in this offer, but I would certainly like to have a conversation with you!

The compliance date for NERC CIP version 5 is rapidly approaching. Most NERC entities have already put a lot of effort into their CIP v5 compliance projects. But what is “compliance”? Because of the many new complexities in v5 and the changing guidance from NERC and the Regions, many – if not most – NERC entities still have substantial questions about what the CIP v5 requirements and definitions mean. This is especially true given the recent notice that FERC will be conducting some CIP v5 audits.

In addition, many NERC entities are still struggling with questions on appropriate procedures and technologies for compliance with the requirements of CIP version 5. There may be many possible ways to comply with a given requirement, but which is the most efficient for your organization? What have other entities done?

In order to help your organization answer some of these questions and to introduce you to our NERC CIP consulting services, Deloitte Advisory is offering a free one-day on-site CIP version 5 “Quick Check.” Two senior individuals from Deloitte Advisory will provide the Quick Check:

·       Tom Alrich writes a popular blog on NERC CIP. In his blog, he has identified, and tried to resolve, many interpretation issues in CIP v5. Tom will solicit beforehand a list of the biggest interpretation issues your organization has come across regarding CIP v5 and discuss his views on them - as well as discuss how other NERC entities and Regional Entities view them. He will also discuss the “implicit requirements” he has identified in CIP v5 and v6, and provide a written document listing them.

·       Dave Nowak is Project Manager for a multi-year Deloitte Advisory project to help one of the largest NERC entities come into full CIP v5 compliance. As Deloitte Advisory has been deeply involved with implementing almost all of the CIP v5 compliance procedures and technologies at this entity – and because Deloitte Advisory has permission from the client to discuss these with other NERC entities – Dave can provide good insight into how his client, as well as other Deloitte Advisory clients, is complying with a particular requirement. He will solicit beforehand a list of your organization’s questions on how other NERC entities are complying with particular requirements.

If you would like to take advantage of this free offer or have other questions, please email Tom at talrich@deloitte.com.



The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

Tuesday, November 24, 2015

An Auditor Responds to my Firmware Post


Judging by the large number of page views, my recent post on firmware – and the fact that firmware patches are subject to CIP-007-6 R2 – has clearly struck a chord. The day after the post appeared, I received an email from a well-known CIP auditor. As usual, this person had some very good points to make. I reproduce his email in full, with my comments interspersed in italics.


Auditor: First, firmware is nothing more than software burned onto a chip.  It might be updatable (through a process of “flashing” the PROM), or it might require replacement of the chip to implement the update.  However, to emphasize what you referenced in your blog, it is software and it is subject to CIP-007-6 until such time as an official, FERC-approved definition of Programmable Electronic Device determines otherwise.

Tom: It hadn’t occurred to me that the definition of “Programmable” – or rather the lack thereof – would have anything to do with this. In any case, I wouldn’t count on a definition coming out any time soon, and especially one that ruled out firmware updates being subject to CIP-007-6 R2.

A: Probably the biggest question is whether an update that provides for configuring and entering a complex password constitutes a security patch.  While I would dearly like to assert that it is, I cannot.  A security patch is an update that corrects a vulnerability in the affected code.  Modifying the code to allow a better password is a feature update and does not address a vulnerability.  Therefore, the patch that allows for stronger passwords is not subject to the CIP standards.  Having said that, I am not convinced that all Regional auditors share the same view, so the entity would be well advised to have this discussion with their Region sooner rather than later.

T: This point was also made by Brandon Workentin of EnergySec, in my long footnote quoting him. The bottom line is that, if you want to be safe, you need to consider a firmware patch adding a feature like stronger passwords to be a security update, even though it isn’t one from a purely technical point of view. Of course, like all CIP questions, you can always discuss this with your region. And like all questions since a few weeks ago, you probably can’t count on your region’s answer (if you get one) guiding your auditor, since your auditor may now be from FERC, not your region. In general, it’s better to take the more “hawkish” interpretation, until FERC comes out with guidance on these and a lot of other issues (and there is no assurance now that they’ll come out with guidance on any issue regarding CIP v5 interpretation).

A: And, to confirm what others have stated, you have to assess security patches for applicability.  You do not have to implement them as long as you have a mitigation plan that addresses the vulnerabilities covered by the patch.  I disagree that you have to test and determine with certainty that implementation of the patch “breaks” the system before you can choose to not implement the patch.  It is well established that outside of the traditional servers and workstations found in the Control Center, (1) updates are often cumulative, meaning you cannot pick and choose what you are going to implement, (2) vendors to this point have very frequently not identified whether a patch or update includes security fixes, (3) vendors have been reluctant to go back and try to provide information necessary to identify security patches for updates released over the past many years, and (4) even if the next released update does explicitly include a security patch, the fact that updates are generally cumulative and the vast majority of the previously released, unimplemented updates are feature releases still causes an entity to appropriately be wary of installing the update.  Extensive testing would be required and even then there is no guarantee that some feature or function change won’t be overlooked or misunderstood, introducing reliability risks that exceed whatever the security patch may be addressing.

T: The second part of this paragraph brings up something I’ve heard elsewhere: namely, that the patching process is very different once you move away from the standard IT OS platforms and go to devices like relays. As the auditor points out, the fact that the firmware updates are often cumulative, and that security updates are often not explicitly identified as such, means the entity is probably justified in being very cautious about applying even updates that are identified as being for security purposes. So having a way of mitigating the risk addressed by a security update – without actually applying the update – is especially important for non-standard OS’s.

A: The bottom line is that the entities need to make sure they are on the top of their game, that they have implemented as many defensive controls as they can regardless of whether there is a pending security patch, and that they are focused on cyber security rather than the absolute minimum necessary to demonstrate compliance.

T: This paragraph makes yet another good point: That it is important to have defensive controls (like IDS or firewall rules) that go beyond immediate compliance needs. You never know when a patch will come out that can’t be installed without impacting reliability (or for another reason). This will require you to have an alternative means of mitigating the risk, and those controls may turn out to be absolutely necessary for compliance.

T: I do have another point to make. I got an email from a compliance professional that referred to the part near the end of the post where I mentioned that firmware for video cards, etc. needs to be updated as well.  This person said:

“In line with this question I feel like there’s the potential for this to bring device drivers into scope…It could be argued that on top of checking my source for firmware updates for my network card, I also need to check my source for driver updates for the same NIC.”

I really don’t see why this wouldn’t be the case. Device drivers would seem to have the same status as firmware updates – they’re “software” updates that affect the basic operation of the system.[i] I can hear 1,000 hearts stop beating as I’ve just added yet another item to your list of “Things I Need to Address in my CIP v5 Compliance Program.”

Believe me, I’m not writing this post (and the previous one) because I’m trying to make your job harder. But that is the way CIP v5 has been ever since its approval in 2013 (two years and two days before the day I’m writing this): as entities (and auditors) get further into the implementation effort, they discover more and more “implicit requirements”. These were in theory implied by CIP v5 all along, but just hadn’t been discovered until now. And guess what: These discoveries will continue, probably long after April 1, 2016.

Here’s a tip: I’m soon going to start writing about how I think CIP v5 needs to be rewritten, so we can finally get off this hamster wheel of constantly-increasing requirements, explicit or implicit. And I’ll be talking about it (although “preaching” might be a better word) at Digital Bond’s S4X16 conference in Miami Beach in January.


The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.

[i] I showed this comment to the auditor. He agreed with it, and had the following additional comments: “The CIP standard is explicit in that the registered entity is not expected to separately baseline drivers and other components that come with the operating system (see the discussion found in the Guidelines and Technical Basis section of CIP-010-1/2).  But, video drivers and network drivers are often provided by different vendors from the operating system vendor and are separately installed from the operating system.  In other words, these drivers are not part of the installed operating system.  So, yes, you must baseline, track, and patch the nvidia display driver and the Intel NIC driver software packages that were installed on the PC (and show up as installed, versioned software under Programs and Features on a Windows system).  This auditor, at least, is applying the same concepts to installed third-party software as apply to the operating system.  You must baseline and manage the package.  You do not have to separately baseline all of the drivers and other components comprising the package.  Again, registered entities should check with their Regions to make sure their auditors share the same view.”
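The distinction the auditor draws – baseline and manage the installed, versioned package, not every file inside it – can be sketched as a simple package-level baseline comparison. This is my own illustrative sketch, not anything from CIP-010 itself, and the package names and function are invented for the example:

```python
# Sketch of a package-level baseline check, per the auditor's comments above:
# track each installed, versioned software package (e.g. a separately
# installed NIC or display driver), not the individual files comprising it.
# The data structures and example package names are hypothetical.

def baseline_diff(baseline: dict, current: dict) -> dict:
    """Compare two installed-package baselines of the form {package: version}.

    Returns packages added, removed, or changed since the baseline was taken.
    """
    return {
        "added":   sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(p for p in baseline.keys() & current.keys()
                          if baseline[p] != current[p]),
    }
```

Any of the three result lists being non-empty would signal a deviation from the documented baseline that needs to be investigated and the baseline updated.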

Monday, November 23, 2015

Tom's Lessons Learned No. 4: “Firmware is Software”


Did you know that, under CIP-007 R2, you must evaluate patches (or updates) to firmware in the same way you evaluate “pure” software patches? I must admit I didn’t realize this until recently, and it seems many others didn’t either. Here is how I was led to this conclusion, as well as a few other issues that came up along the way.

In my post on RF’s CIP v5 workshop in early October, I included this paragraph, discussing a presentation by Todd Thompson, one of the RF auditors:

“Todd Thompson addressed the monster requirement, CIP-007. After Todd’s presentation, a questioner asked if an entity must upgrade their firmware if the upgrade will assist with compliance – for example, it will enable longer passwords – given that firmware upgrades can sometimes lead to unanticipated issues. Todd answered that the entity needs to test to make sure the upgrade won’t cause a problem. If it does, they don’t need to apply the upgrade, but they do need to document why they couldn’t apply it.”

I must admit I didn’t think too much about the full meaning of this question when I wrote the post. I knew that a number of NERC entities were making a concerted effort to upgrade firmware on lots of devices in substations in advance of the v5 compliance date, April 1, 2016. I knew there was no requirement to have the latest firmware as of the compliance date, but I assumed the entities were doing this as a good practice, perhaps with a nudge from their NERC Region. I assumed the questioner was referring to this process.

However, one reader with whom I’ve exchanged other emails about interpretation issues in the past, Brandon Workentin of EnergySec, saw something different in there. He thought this was really a CIP-007-6 R2 question; in other words, the real meaning of the question was “I know that R2 covers firmware patches as well as pure ‘software’ ones. But am I still obligated to apply one if I believe it might lead to a reliability problem?” Firmware patches are nowhere explicitly mentioned in R2, nor are they explicitly mentioned in the Guidance and Technical Basis for R2.

He pointed out that the auditor’s answer just addressed compliance with R2.2; he was saying that, if you evaluate a firmware patch and determine it will cause a reliability problem, you need to document that finding but you don’t have to apply the patch. However, Brandon pointed out that you still need to comply with R2.3, which requires the entity to develop a mitigation plan that addresses the vulnerability the patch was meant to fix.

In other words, the questioner was probably pretty happy with the auditor’s response, since he may have thought that he had no further obligation for a particular firmware patch, as long as he could document that applying it would impact reliability. But the questioner would probably not be so happy if he had received the full answer, which is that he would still have to go through the full mitigation plan process for firmware patches, just like pure “software” patches.
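The R2.2/R2.3 logic Brandon describes can be sketched as a small decision function. To be clear, this is my own illustrative rendering of the flow, not language from the standard, and all the names in it are invented:

```python
# Illustrative sketch of the CIP-007-6 R2.2/R2.3 decision flow for a
# firmware patch, as described above. Class, function, and field names
# are my own inventions, not terms from the standard.

from dataclasses import dataclass, field
from typing import List

@dataclass
class PatchEvaluation:
    patch_id: str
    is_security_related: bool   # per R2, only security patches are in scope
    breaks_reliability: bool    # outcome of the entity's testing
    actions: List[str] = field(default_factory=list)

def evaluate_patch(ev: PatchEvaluation) -> PatchEvaluation:
    if not ev.is_security_related:
        ev.actions.append("not a security patch; out of scope for R2")
        return ev
    if ev.breaks_reliability:
        # R2.2: document why the patch was not applied...
        ev.actions.append("document reason for not applying (R2.2)")
        # ...but R2.3 still requires mitigating the underlying vulnerability.
        ev.actions.append("create/revise mitigation plan (R2.3)")
    else:
        ev.actions.append("apply the patch (R2.2)")
    return ev
```

The point of the sketch is that the “breaks reliability” branch produces two obligations, not one: documenting why you didn’t apply the patch does not relieve you of the mitigation plan.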

This struck me as potentially a lot of work, and before I advised people that they need to treat firmware patches the same as pure “software” ones, I decided to check with some people I know on whether they agreed with this. Two of those people were auditors (different regions); they both agreed that firmware patches (although they’re usually called “updates”) are covered by CIP-007-6 R2, even though the requirement does not explicitly say that (as one auditor stated, “Firmware is software”).[i]

I also asked several entities (large and small), as well as one consultant, whether they knew that “Firmware is software”. Only one of them (the small entity) said he knew about this. He said he’d heard about it at a WECC (Western Electricity Coordinating Council) meeting a year or two ago, but couldn’t remember which one. I have attended a lot of regional meetings, but I couldn’t remember hearing this at any of them. However, I did go back and read the presentation on CIP-007-6 that was given at WECC’s Advanced CIP Training in early September. While I can’t remember this being stated in the actual meeting, slides 66, 67 and 72 do say that firmware patches need to be treated like pure “software” ones.[ii]

Of course, just the fact that WECC has stated this in one or two meetings doesn’t mean that all NERC entities in the US and Canada have thereby received notification that firmware is software. In case you want to counter that by saying “Since firmware is software, the burden isn’t on NERC or the regions to notify people of this; they should have known that”, I wish to counter you by pointing out that Brandon did an excellent analysis (see the end note[iii]) of the wording in the Guidance and Technical Basis. His analysis indicates that even the Standards Drafting Team members who wrote the Guidance didn’t always seem to understand this themselves. I think it can quite legitimately be argued that NERC entities have not been made sufficiently aware that they should treat firmware as software under CIP-007-6 R2 to be held to strict compliance with that particular “implied requirement” on April 1, 2016. Even if NERC (or FERC) comes out with guidance on this issue tomorrow, I think it’s too late for entities to be expected to be in compliance on that date.

Before I go, I’d like to point out three other items that came up in these discussions:

First, for both pure “software” and firmware patches (or “updates”), CIP-007-6 R2 requires the entity to distinguish security patches from non-security ones; you only need to deal with the former. However, as one of the auditors pointed out to me, these security patches are usually identified as such by the vendor releasing them, but not always. So for a vendor that doesn’t distinguish between the two patch types, it is important for the entity to read the description of every patch released to see if it is a security patch or not.
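For vendors that don’t flag their patches, a first-pass keyword screen over the release notes can at least surface the candidates a human then has to read in full. This is my own rough sketch – the keyword list is a starting point I made up, not an authoritative one, and it will produce both false positives and misses:

```python
# Illustrative first-pass screen for patch descriptions from vendors that
# don't distinguish security patches from feature updates. The keyword
# list is my own rough, non-authoritative starting point; a person still
# has to read the description of every patch, as noted above.

SECURITY_KEYWORDS = (
    "vulnerability", "cve-", "security", "exploit",
    "authentication", "privilege", "denial of service", "buffer overflow",
)

def looks_security_related(description: str) -> bool:
    """Flag a patch description for human review as possibly security-related."""
    text = description.lower()
    return any(kw in text for kw in SECURITY_KEYWORDS)
```

A screen like this only prioritizes the reading; it doesn’t replace the obligation to evaluate each patch.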

Second, at GridSecCon I met Monta Elkins of FoxGuard Solutions (who gave an excellent presentation on patch management). He pointed out to me that servers and workstations will usually have different pieces of hardware – video card, I/O controller, etc. – that are made by different vendors. Does this mean you have to inventory each component of each server or workstation and monitor each component vendor for firmware patches? This would lead to a huge effort.

I think the answer to that issue – and Monta agreed with me – is that you should confirm with the server or workstation vendor that it will provide firmware patches for any of the third-party devices in your system. That way, you still only have to monitor one patch source for all of those patches.
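Once you have that single consolidated source, the tracking itself is just a version comparison per component. Here is a minimal sketch, assuming hypothetical data structures of my own (a dictionary of installed component firmware versions and one of the vendor’s latest published versions):

```python
# Sketch of tracking component firmware against a single consolidated
# (server-vendor) patch source, per the approach above. The dictionary
# shapes and component names are hypothetical.

def outdated_components(installed: dict, vendor_feed: dict) -> list:
    """Return components whose installed firmware lags the vendor feed.

    installed:   {component: installed_version}
    vendor_feed: {component: latest_version}  # the one source you monitor
    Versions are dotted numeric strings compared as tuples of ints,
    e.g. "1.4.2" -> (1, 4, 2).
    """
    def vtuple(v: str) -> tuple:
        return tuple(int(part) for part in v.split("."))
    return [c for c, v in installed.items()
            if c in vendor_feed and vtuple(vendor_feed[c]) > vtuple(v)]
```

Each component this returns would then go through the normal R2 evaluation – including the security/non-security determination discussed above.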

Third, in Brandon Workentin’s analysis (in end note iii below) of the Guidance and Technical Basis for CIP-007-6 R2, he pointed out another issue on which the Standards Drafting Team left contradictory guidance. The first paragraph of the R2 guidance reads:

“The SDT’s intent of Requirement R2 is to require entities to know, track, and mitigate the known software vulnerabilities associated with their BES Cyber Assets. It is not strictly an ‘install every security patch’ requirement; the main intention is to ‘be aware of in a timely manner and manage all known vulnerabilities’ requirement.”

I, like probably most people, didn’t notice on first reading that this raises a big problem, but after Brandon’s email it stands out to me like a sore thumb: my understanding (and I think most others’) was that R2 requires entities to evaluate security patches and make sure the vulnerabilities they address are mitigated, either by applying the patch or (if the patch can’t be applied) through alternative measures described in a mitigation plan.

But here the SDT is saying something quite different. They’re saying that the entity is responsible for knowing, tracking and mitigating all of the “known” software vulnerabilities associated with BES Cyber Assets! How do they do this? Clearly, just considering all patches that are released isn’t sufficient, since there are many “known” vulnerabilities – in probably most software products – that are never patched, either because the software vendor doesn’t consider them serious vulnerabilities, or because they have determined that the cost of patching them is too high.

This wording seems to imply that entities must continually scan all of the many sources of vulnerability information for any mention of a vulnerability in any software or firmware that’s installed anywhere in their ESPs. When they find mention of a vulnerability they didn’t know about, they need to first see if there’s been a patch released that addresses it. Presumably, if a patch has already been released, the entity has either been patched or mitigated the vulnerability, through complying with R2. However, this can’t automatically be assumed.

But what if no patch has been released? According to this wording, the entity is still responsible for mitigating any “known” vulnerability. And how would your auditor determine whether or not you had mitigated all of those? Will NERC scan all of the literature and publish a continuously-updated list? Clearly, that’s not going to happen. Also quite clearly, R2 is for patching, not for vulnerability management.

I think it’s safe to say that your CIP v5 compliance burden has not actually taken another significant leap up, and that you are not now responsible for assessing all vulnerabilities as well as all patches. As everywhere in CIP v5, only the language of the Requirement itself is auditable, and that language clearly establishes a requirement for patch management, not vulnerability management. But it’s quite striking that the Guidance could be so far off base in this case.


The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.


[i] One of the auditors pointed out that CIP-010-2 R1.1.1 says configurations need to be tracked for “Operating system(s) (including version) or firmware where no independent operating system exists”. Of course, this doesn’t necessarily mean that firmware needs to be covered in CIP-007 R2. It also indicates that firmware configurations do not need to be tracked when the device runs an O/S.

[ii] Slide 91 also mentions that a device’s lack of External Routable Connectivity, while not getting the device in question off the hook for compliance with R2, will make the compliance burden for that device easier. This is because some patches may simply not be applicable to it, due to the device not having ERC.

[iii]This is the analysis Brandon provided, to show that the Guidance and Technical Basis for CIP-007-6 R2 seems to be confused as to whether or not firmware patches are in scope for the requirement:
·       R2 exclusively uses the term "security patches." There are various places this term is attempted to be defined, but using the definition from those various places can lead to differing conclusions.

·       Guidance and Technical Basis for Requirement R2 ¶ 1: "The SDT’s intent of Requirement R2 is to require entities to know, track, and mitigate the known software vulnerabilities associated with their BES Cyber Assets...the main intention is to 'be aware of in a timely manner and manage all known vulnerabilities'"
o   (Let’s take the example of a firmware patch that addresses security, such as increasing the password length.)  Password length functionality isn't generally considered a "vulnerability." This would imply that a firmware update which adds functionality, even if that functionality is security-related, wouldn't be applicable. A firmware update which removed/addressed a vulnerability, however, would. (An example of this is if an undocumented admin account were hard-coded into the firmware. When that happens and is discovered, those are generally addressed as vulnerabilities in the sense that they get an ICS-CERT alert or something similar.)

·       G&TB §2.1 ¶ 1: "The requirement applies to patches only, which are fixes released to handle a specific vulnerability in a hardware or software product."
o   Again, is the inability to have long passwords a vulnerability? I would say no. It would qualify as an "insecure-by-design" feature, but I don't think it's a "vulnerability."
o   Tom’s comment: Here, the SDT seems to move away from what they said about the focus of R2 being vulnerabilities. I would take this as the more authoritative statement, not the quote from the first paragraph of the Guidance that I cited in the last section of the post.

·       (A major software vendor defines “vulnerability” as) “a security exposure that results from a product weakness that the product developer did not intend to introduce and should fix once it is discovered." They formally define it as, "a weakness in a product that could allow an attacker to compromise the integrity, availability, or confidentiality of that product," and state, "Security vulnerabilities involve inadvertent weaknesses; by-design weaknesses may sometimes occur in a product, but these aren't security vulnerabilities."
o   Tom’s comment: I can’t believe that the vendor is saying that the same piece of code would be a vulnerability if not intended, but wouldn’t be one if intended!

·        G&TB §2.1 ¶ 1: "The requirement covers only patches that involve cyber security fixes and does not cover patches that are purely functionality related with no cyber security impact."
o   A firmware update which allows for longer passwords or more complex passwords pretty clearly has a "cyber security impact" (even though it would technically be a functionality update). If you ignore that this uses the term "patches" again, this sentence would imply that such a firmware update would be "in-scope" for the requirement.

·       G&TB §2.1 ¶ 1: "The National Vulnerability Database, Operating System vendors, or Control System vendors could all be sources to monitor for release of security related patches, hotfixes, and/or updates."
o   This one pretty clearly states that "updates" would need to go through the R2 process.

·       G&TB §2.1 ¶ 1: "A patch source is not required for Cyber Assets that have no updateable software or firmware (there is no user accessible way to update the internal software or firmware executing on the Cyber Asset)"
o   This part pretty clearly says that updatable firmware needs to be tracked, if possible.

·       G&TB § 2.2 ¶ 1: the first several sentences all use the term "patch," which as discussed above seems to have only a limited definition in this document.

·       G&TB § 2.2 ¶ 1: "Considerable care must be taken in applying security related patches, hotfixes, and/or updates or applying compensating measures to BES Cyber Systems or BES Cyber Assets that are no longer supported by vendors."
o   Again, uses the term "updates" and assumes they are included in the R2 process.

·       G&TB § 2.3: This section again uses "patch" and "vulnerability," so it implicitly includes the limitation those terms imply.

Thursday, November 19, 2015

Whistling Past the Graveyard


On November 10-12, 2015, I attended the Northeast Power Coordinating Council’s (NPCC) Fall Compliance Workshop near White Plains, NY. It was a very good workshop, with good presentations on both CIP and more general NERC compliance issues such as the Reliability Assurance Initiative (or Risk-Based Compliance Monitoring Enforcement Plan, as it’s now called). I plan to have one or two more posts on that conference. This post discusses one interesting thing I noticed at the conference.

First, I’ll say there was a lot of discussion – both by the speakers and in informal conversations among participants – of the fact that FERC has announced that it will start auditing compliance with CIP v5 and with CIP-014 next year. It appeared that almost everyone was in agreement that the full implications of this announcement may not be known for a while; I have no argument with that idea.

However, there also seemed to be general agreement that this probably will not be such a big deal. I didn’t hear a single entity say they were going to start doing things differently because of the announcement; I also didn’t hear any of the speakers say that things were likely to be very different.

I’ll be blunt: There was a lot of skepticism that FERC really has the manpower and the industry knowledge to pull this off.[i] While I know they have some really top-notch cyber security professionals on their team, I also don’t believe they have the staff today to start doing a number of audits at once - although this partly depends on what you mean by “audit”. NERC’s model includes about 90 days of offsite document discovery and say 1-3 weeks of onsite audit. I’m told that FERC’s audits can – and often do – take years, and the entity being audited can go for months without ever hearing from their auditors. If FERC decides to use their traditional model, they actually could conduct a number of simultaneous audits without a huge staff increase. But I also don’t doubt they could get a lot more audit staff if needed. Remember, they likely won’t be doing any v5 audits until next fall; they certainly aren’t going to come knocking on your door on April 2, 2016. 

But the general feeling that things aren’t really going to be too different under FERC went beyond issues of staffing. I feel it was due to the very human reaction to any potentially big news that isn’t immediately accompanied by a change in circumstances. When something big happens far away – say, the stock market crash of 1929 – we cling to the idea that its full impact isn’t known, and we gravitate to the best possible interpretation of what might happen. Of course, just the news that the stock market had crashed didn’t cause any immediate change to the majority of US citizens in 1929, unless they owned significant stock holdings. The reaction of some was glee: “Those guys had it coming.” It was a couple years later, as the banks started failing and unemployment climbed, that there was no denying things had drastically changed.

FERC has confirmed that they will be doing auditing, but they’ve said nothing more. It’s natural to simply assume the best outcome will occur: That they will do a few audits, but just of the “really big guys” (of course, if you work for a “really big guy”, this isn’t much comfort). The regions will continue to do most of the audits, and their approach won’t really change from what it is now.

Folks, I beg to differ. I think FERC is under a lot of pressure today – mainly from their bosses, the US Congress – to crack down on what is perceived to be a lax attitude toward cyber security on the part of the electric power industry. Why do I think that? Just the fact that FERC is going to be doing this auditing, and that it is a big change from the past, makes me believe this isn’t some idle whim of the Commissioners. I’m sure they thought all of this out before making their move. And I’m sure they know how they are going to get the staff they need to handle the audits.

If you want to experience firsthand the pressure FERC is under, I recommend you pick up Ted Koppel’s new book, Lights Out. Whatever your opinion of the book may be, it is going to have a big influence on the public. I doubt the average man on the street has heard of cyber attacks on the grid, except possibly from Hollywood.  But that will be different now. In fact, Congressmen and women will read the book and try to jump ahead of their constituents by demanding changes.

I don’t want to exaggerate the influence of Ted Koppel’s book by itself. The point is that pressure on FERC is coming from a lot of directions and is growing, not diminishing. To expect them to back away from the audit idea due to not having the expertise or the manpower is very dangerous. If this is what we’re thinking, we’re just whistling past the graveyard.


The views and opinions expressed here are my own and don’t necessarily represent the views or opinions of Deloitte Advisory.


[i] I’ve also heard a couple people question whether FERC even has the authority to conduct these audits. Rest assured, FERC has that authority, although they haven’t been exercising it too much in the past. They are the regulator, not NERC.