Wednesday, October 16, 2024

NERC CIP: What’s the difference between SaaS and BES Cyber Systems in the cloud?

My most recent post concluded with this paragraph:

But that doesn’t mean you have to stay away from the cloud altogether for six years. You can’t deploy medium or high impact systems in the cloud, but you can certainly use SaaS to perform the functions of medium or high impact systems. More on that topic is coming soon to a blog near you.

The post had already made it clear there’s no good way to deploy or utilize medium and high impact BES Cyber Systems (BCS), Electronic Access Control or Monitoring Systems (EACMS) and Physical Access Control Systems (PACS) in the cloud today. Why did I say you can use SaaS to perform the functions of those systems? Isn’t SaaS just software that the vendor has implemented in the cloud for other organizations to access? Why is that different from BCS in the cloud?

The difference is this: If a SCADA vendor implements their software in the cloud with the intention of having multiple users, none of the normal I/O that handles communications with substations and generating facilities will be implemented with it; this is because the I/O is always customer specific. This means the cloud implementation will not have an impact on the BES in 15 minutes or otherwise, so it will clearly not be a BCS. It will be SaaS, which is now “allowed” in the cloud.[i]

However, if the same vendor implemented their software in the cloud for a particular customer and implemented all the customer’s required I/O with it, that would be a BCS in the cloud. This isn’t currently “legal” for medium or high impact systems. Moreover, it will never be permitted until there is a major revision to the CIP standards (fortunately, this long process has at least started).

As I discussed in the previous post, there will still be a compliance obligation for the EMS-as-SaaS, since some of the data it utilizes will be BCSI. This means that, while the obligation to comply will fall entirely on the NERC entity, the SaaS provider will need to provide appropriate compliance evidence, which I described in the previous post. The NERC entity must also take account of the SaaS provider’s use of their BCSI in their CIP-011-3 R1 Information Protection Plan.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] It isn’t likely that most (or even any) SCADA implementations for electric utilities would tolerate not having direct I/O to substations and/or generating stations. Those communications usually need to be as real-time as possible. On the other hand, a renewables Control Center (which manages multiple wind and/or solar installations) will not usually require real-time communications.

Tuesday, October 15, 2024

NERC CIP: Who is responsible for compliance in the cloud?


I have heard NERC entities ask the question in the title at least a few times regarding cloud service providers (note that I am using this term broadly to include not just “Platform CSPs” but providers of cloud-based services like SaaS and security monitoring services). My guess is they’re doing this just to show they have a sense of humor, since the answer is very clear: The entity that is responsible for compliance with any CIP requirement, whether the systems in scope are deployed onsite, in a third party’s cloud, or both, is the entity that is listed in Section 4.1 of each currently enforced CIP standard. That section is titled “Functional Entities”.

Of course, you’ll note there is no Functional Entity called “CSP”. The only entity responsible for CIP compliance is you, Mr./Ms. NERC entity. Even if NERC decided tomorrow that CSPs need to comply with the CIP Reliability Standards, NERC has no authority to enforce such a decision, since its regulatory authority comes from FERC – and FERC has no authority over CSPs, even if they happen to serve NERC entities (should the FDA have authority over CSPs, just because the CSPs provide services to pharmaceutical manufacturers?).

However, saying that the CSP isn’t responsible for CIP compliance is not the same as saying the CSP has no role to play in CIP compliance. If the NERC entity entrusts workloads subject to CIP compliance considerations to a CSP, often only the CSP will be able to provide the evidence required for the NERC entity to prove compliance. But the NERC entity should never assume the CSP knows what evidence they are on the hook to provide, or that they have implicitly agreed to provide it. For the time being, the NERC entity should assume it’s necessary to explain to the CSP exactly what evidence they will need and when they will need it. This would ideally be done during contract negotiations.

Recently, I wrote a post stating there are only two types of workloads subject to CIP compliance that can safely be entrusted to the cloud today (meaning no compliance problems are likely to arise from doing so): BCSI used by a SaaS application and low impact Control Centers. I described in nausea-inducing detail what evidence should be required for each, although I need to point out that your mileage may vary, since I certainly don’t know what evidence your auditor will require.

I also pointed out that, unlike for medium or high impact BCS, EACMS or PACS implemented in the cloud, a CSP should be able to provide this evidence without a lot of trouble. But I didn’t point out that I sincerely wonder what kind of response you’ll get when you ask your CSP to take these special measures on your behalf.

Even though I combined both SaaS providers (those that require access to BCSI) and platform CSPs under the “CSP” moniker at the beginning of this post, I’ll break the two categories apart now:

First, I think SaaS providers (who are providing evidence for CIP-004-7 Requirement R6 Part 6.1 compliance) are likely to agree to provide evidence, for two reasons:

1.      They’re a lot smaller than the platform CSPs, and

2.      If they need to utilize BCSI, they’re obviously focused on power industry customers; they at least know that entities subject to NERC CIP compliance can make some strange requests for evidence. Rather than waste time trying to convince the entity that they don’t need that evidence (a guaranteed losing battle), they should just provide what’s asked for. Fortunately, if one entity asks for certain evidence, other entities will as well, so the SaaS provider won’t have to provide different documentation for each customer. NERC entities won’t make outlandish requests of their SaaS provider unless they think it’s likely their auditors will ask for that evidence.

However, platform CSPs (which will presumably be required to provide evidence regarding low impact Control Centers deployed on their platform) are a quite different story:

1.      For one thing, they’re huge; it’s going to be very difficult to get them to agree to do anything that’s not part of their normal services.

2.      For another…how can I say this?...While I haven’t surveyed the platform CSPs on this issue, my guess is they’re not very inclined to bend over backwards for a small sliver of a small industry: electric utilities and IPPs subject to NERC CIP compliance. In other words, I don’t advise NERC entities to stomp on the floor and scream bloody murder if the CSP won’t do what you’re asking. And certainly don’t threaten to take your business elsewhere – that’s likely to be counterproductive at best.

All this is to say that the chances of convincing a platform CSP to provide compliance evidence for even a low impact Control Center (LICC) in the cloud (and not much evidence is required in that case; I detailed what’s required for an LICC in the post linked above) are very small. That is another reason why deploying medium or high impact BCS, EACMS or PACS in the cloud today is the stuff of fantasy.

The day will likely come when such systems can be safely deployed in the cloud while maintaining CIP compliance, but that will be under a different set of CIP standards - one in which cloud-based systems (perhaps called “Cloud BCS”) are subject to their own requirements. That day is 5-6 years away, although it’s good there’s now a Standards Drafting Team that’s at least starting the process.

But that doesn’t mean you have to stay away from the cloud altogether for six years. You can’t deploy medium or high impact systems in the cloud, but you can certainly use SaaS to perform the functions of medium or high impact systems. More on that topic is coming soon to a blog near you.

“CIP in the cloud” is one of the most important issues facing the NERC CIP community; its importance is increasing every day. If your organization is a NERC entity or a provider/potential provider of software or cloud services to NERC entities, I would love to discuss this topic with you. Please email me to set up a time for this.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

Monday, October 14, 2024

How can we truly automate software vulnerability identification?

Given the proliferation of serious software vulnerabilities like the log4shell vulnerabilities in the log4j library, software vulnerability management is an important component of any organization’s security program. Successful vulnerability management starts with successful vulnerability identification. This requires that:

1.      The supplier of the software reports vulnerabilities they find in their products. These reports are incorporated into vulnerability databases, especially the US National Vulnerability Database (NVD). Almost all software vulnerabilities are reported by the supplier of the software, not a third party.

2.      Later, users of the software can search the NVD for new vulnerabilities that apply to software products they use. Learning about these vulnerabilities enables the user to coordinate with the suppliers of those products, to learn when they will patch the vulnerabilities and encourage them to speed up patches for the most important vulnerabilities.

However, one important assumption underlies these two requirements: that the user will always be able to learn about vulnerabilities that apply to a product they use when they search a vulnerability database like the NVD. The user will only be able to do this if they know how the supplier has identified the product in the database.

It might seem like the solution to this problem is obvious: the supplier will report the vulnerability using the name of the product, and the user will search for that name. The problem is that software products are notorious for having many names, because they are sold under different brands or in different venues, are acquired by another supplier, etc. Even within a large software supplier, employees may know the company’s own products by different names. Trying to create – and especially maintain – a database that lists all the names for a particular software product would be hugely expensive and would ultimately fail, due to the rapidly increasing volume of new software products.

Given there will never be a definitive database of all the names by which a single software product is known, how can a user be sure their search will find the correct product in a vulnerability database? There needs to be a single machine-readable identifier for the product, which the supplier includes in the vulnerability report and the user searches for in the vulnerability database. We have already ruled out the idea of a centralized database that lists all the possible names for a single software product. How can we accomplish this goal without a central database?

The solution is for the identifier to be based on something that the supplier will always know before they report a vulnerability for their product, and that the user will also know (or can easily learn) before they search for that product in a vulnerability database. A good analogy for this is the case of the formula for a chemical compound.

If a chemist has identified a compound whose molecules consist of two hydrogen atoms and one oxygen atom, the chemist will write it as “H2O” (of course, the “2” is normally written as a subscript). Every other chemist will recognize that as water. Similarly, a compound of one sodium and one chlorine atom is NaCl, which is table salt. Note that all chemists can create and interpret these identifiers, without having to look them up in a central database. A chemist who reads “NaCl” always knows which compound that refers to.

There is a software identifier that works in the same way. It’s called “purl”, which stands for “package URL”. It is in widespread use as an identifier in vulnerability databases for open source software that is made available for download through package managers (these are the primary locations through which open source software is made available for download, although not all open source software is available in a package manager).

To create a purl for an open source product, the supplier or user only needs to know the product name, the version number (usually called a “version string”) and the package manager name (such as PyPI). Because every product name/version string combination will always be unique within one package manager (although the same product/version might be available in a different package manager), the purl that includes those three pieces of information is guaranteed to be unique; it is also guaranteed always to point to the same product, since the combination of product name and version string will never change for that product/version.

For example, the purl for version 1.11.1 of the package named “django” in the PyPI package manager is “pkg:pypi/django@1.11.1”. If a user wants to learn about vulnerabilities for version 1.11.1 of django in the pypi package manager, they will always be able to find them using that purl. If they upgrade their instance of django to version 1.12.1, they will search for “pkg:pypi/django@1.12.1” (every purl begins with the “pkg” scheme). Since the supplier will always use the same purl to report vulnerabilities, the user can be sure their search will find all reported vulnerabilities for that product/version.
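Assembling a purl from those three pieces of information takes only a few lines of code. The sketch below is illustrative only; the `make_purl` helper is mine, not part of any official tooling (real implementations such as the packageurl libraries handle the full spec, including namespaces, qualifiers and subpaths):

```python
from urllib.parse import quote

def make_purl(pkg_type: str, name: str, version: str) -> str:
    """Build a basic purl of the form pkg:type/name@version.

    Illustrative sketch only: the full purl spec also defines
    namespaces, qualifiers and subpaths, which are omitted here.
    """
    # PyPI names are case-insensitive, so the name is lowercased;
    # components are percent-encoded.
    return f"pkg:{pkg_type.lower()}/{quote(name.lower())}@{quote(version)}"

print(make_purl("pypi", "django", "1.11.1"))  # pkg:pypi/django@1.11.1
```

Because the supplier and the user each construct the purl from information they already have, the two always arrive at the same string without consulting any central database.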

Besides purl, the only software identifier in widespread use in vulnerability databases is CPE, which stands for “Common Platform Enumeration”. Without going into a lot of detail, CPE is the identifier used in the National Vulnerability Database. It was developed more than 20 years ago by the National Institute of Standards and Technology (NIST), which operates the NVD.

A CPE is created by a NIST employee or contractor and added to a vulnerability (CVE) record in the NVD. Unfortunately, there is no way that anyone can predict with certainty the CPE that this person will create. Some of the reasons why this is the case are described on pages 4-6 of the OWASP SBOM Forum’s 2022 white paper titled “A proposal to operationalize component identification for vulnerability management”.
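To see why a CPE cannot be predicted the way a purl can, it helps to look at the shape of a CPE 2.3 “formatted string”. The vendor and product fields below are assigned by the NVD analyst rather than derived mechanically from anything the supplier or user necessarily knows. This sketch uses a naive colon split for illustration (the real grammar also allows escaped colons):

```python
# A CPE 2.3 "formatted string" has 13 colon-separated components:
# cpe:2.3:part:vendor:product:version:update:edition:language:
#   sw_edition:target_sw:target_hw:other
cpe = "cpe:2.3:a:apache:log4j:2.14.1:*:*:*:*:*:*:*"

# Naive split for illustration; the spec also permits escaped colons.
fields = cpe.split(":")
part, vendor, product, version = fields[2:6]
print(part, vendor, product, version)  # a apache log4j 2.14.1
```

Nothing forces the analyst to choose “apache” rather than, say, “apache_software_foundation” as the vendor string, which is exactly the unpredictability problem the OWASP SBOM Forum paper describes.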

Currently (as of the fall of 2024), there is an even more serious problem with CPE: since February, the NVD staff has drastically reduced the number of CPEs it creates. As a result, over two thirds of new CVE records created in 2024 have no CPE name attached to them, making those CVEs invisible to automated searches using a CPE name. A user who searches with a CPE name today may never learn about two thirds of the vulnerabilities that apply to their product/version.

The upshot of this situation is that, if truly automated software vulnerability management is going to be possible again, purl needs to be the default software identifier, both in CVE records and the National Vulnerability Database. While most of the groundwork for achieving this result has already been laid, there remains one big obstacle: Currently, there is no workable way for purl to identify proprietary software. Since the majority of private and public sector organizations in the world rely primarily on proprietary software to run their businesses, this obstacle needs to be removed, so that users of proprietary software products can easily learn about vulnerabilities present in those products.

The OWASP SBOM Forum has identified two methods by which the purl specification can be expanded to make vulnerabilities in proprietary software products as easily discoverable as are vulnerabilities in open source products today. We will soon be starting a working group to address this problem. If you would like to participate in that group and/or provide financial support through a donation to OWASP, please email me.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com. 

Sunday, October 13, 2024

NERC CIP: What is “legal” in the cloud today?


If you have been following this blog for – say – the last eight years or so, you probably know that a big problem in the world of NERC CIP compliance is the fact that NERC entities are severely limited in what kinds of workloads they can implement or utilize in the cloud. While this has been the case for many years, the problem is becoming more acute all the time, as software products and security services announce that henceforth they will only be available in the cloud, or else most of their upgrades and enhancements will only be available in the cloud.

As you may also know, a new NERC Standards Drafting Team (SDT) is now meeting to consider what changes may be required to the CIP standards in order to fix this problem. However, they have a long road ahead of them, as I described in this post in January. I doubt that the final set of new or revised CIP standards will become mandatory for at least 5-6 years from today. This isn’t because NERC is dilatory, but because the NERC standards development process includes many steps designed to ensure that NERC members (as well as members of the public) are able to participate in the standards development process at all stages.

So, the good news is the new (or revised) “cloud CIP” standards are guaranteed to be well thought out. The bad news is this will take a long time. I’m sure many NERC entities want to make more use of the cloud now, but are being held back by uncertainty over what exactly is “legal” today - and especially how they will prove at their next audit that they are still compliant.

I must admit that I can only find two use cases in which I am sure that a NERC entity will be found compliant if they utilize the cloud today (although in both cases there’s a catch, which I’ll describe below).

The first of these is low impact BES Cyber Systems in the cloud, and especially low impact Control Centers.[i] This post describes how – after I was initially skeptical that it’s possible for a CSP to provide evidence of compliance with CIP-003-8 Requirement R2, especially Section 3 of Attachment 1 – a retired CIP auditor convinced me that in fact this is possible[ii]. However, just because it’s possible doesn’t mean that NERC entities with a low impact Control Center are going to rush to redeploy it in the cloud today. See below.

The second use case is BCSI (BES Cyber System Information) in the cloud. Since BCSI is only defined for information regarding medium and high impact BCS, EACMS (Electronic Access Control or Monitoring Systems) and PACS (Physical Access Control Systems), this isn’t a low impact problem. BCSI in the cloud was effectively verboten before January of this year, but the “BCSI-in-the-cloud” problem was in theory solved when CIP-004-7 and CIP-011-3 came into effect on January 1. Why do we need to discuss this now?

It's because, unfortunately, the single new BCSI requirement, CIP-004-7 Requirement R6, was not written with the most important use case for BCSI-in-the-cloud in mind: SaaS that needs access to BCSI. Instead, the requirement was written for simple storage of BCSI in the cloud. However, why would any NERC entity bother to store their BCSI in the cloud? BCSI is almost never voluminous, and usually, on-premises BCSI can be easily (and inexpensively) enclosed within the NERC entity’s ESP and PSP, with zero compliance risk.

However, if a SaaS application for, say, configuration or vulnerability management requires access to BCSI, the wording of the new CIP-004-7 Requirement R6 Part 6.1.1 poses a problem. Here’s a little background:

The first sentence of Requirement R6 reads, “Each Responsible Entity shall implement one or more documented access management program(s) to authorize, verify, and revoke provisioned access to BCSI…”

The second and third sentences read, “To be considered access to BCSI in the context of this requirement, an individual has both the ability to obtain and use BCSI. Provisioned access is to be considered the result of the specific actions taken to provide an individual(s) the means to access BCSI (e.g., may include physical keys or access cards, user accounts and associated rights and privileges, encryption keys).”

In other words, an individual is considered to have “provisioned access” to BCSI when it is possible for them to view the unencrypted data, regardless of whether or not they actually do so. Therefore, if the person has access to encrypted BCSI but also has access, however briefly, to the decryption key(s), they have provisioned access to BCSI, even if they never view the unencrypted data.

Under Requirement R6 Part 6.1.1, the NERC entity’s access management program must, “Prior to provisioning, authorize…based on need, as determined by the Responsible Entity…Provisioned electronic access to electronic BCSI.” In other words, the entity’s BCSI access management program needs to specifically address how individuals will be granted provisioned access.

Note that, if the case for BCSI in the cloud were simply storage of the encrypted BCSI, there wouldn’t be any question regarding provisioned access. No CSP employee should ever need access to the decryption keys for data that is merely stored in the cloud; the NERC entity would retain full control over the keys the entire time that the BCSI was stored in the cloud.

However, if a SaaS application needs to process BCSI, it will normally require that the BCSI be decrypted first. There is a technology called “homomorphic encryption” that enables an application to utilize encrypted data without decrypting it, but unless the application already supports this, it is unlikely to be available. Thus, an employee of the SaaS provider (or perhaps of the platform CSP on which the SaaS resides) will need provisioned access to BCSI, if only for a few seconds.

If the NERC entity needs to authorize provisioned access for the cloud employees, that’s a problem, since the SaaS provider would probably need the permission of every NERC CIP customer whenever they want a new or existing employee to receive provisioned access to BCSI. In fact, each customer would need to give authorization for each individual employee that receives provisioned access; it can’t be granted to, for example, every CSP employee who meets certain criteria.

Last winter, there was some panic over this issue among NERC and Regional Entity staff members, along with suggestions that the issue needs to be kicked back to the new “cloud” SDT – which would mean years before it is resolved. However, it now seems that, if a NERC entity has signed a delegation agreement with the SaaS provider (or the CSP), that might be considered sufficient evidence of compliance.

But how can the NERC entity be sure this is the case? Currently, they can’t, since even NERC ERO-endorsed “Implementation Guidance” isn’t binding on auditors (officially, they have to “give deference” to it, whatever that means). However, the closest thing to a document that commits the auditors to a particular interpretation of a requirement is a “CMEP Practice Guide”. This must be developed by a committee of Regional auditors, although they are allowed to take input from the wider NERC CIP community.

If a CMEP Practice Guide is developed for BCSI in the cloud, it is likely (in my opinion) that it would recommend that a NERC entity sign a delegation agreement with the SaaS provider, if they wish to utilize a SaaS product that utilizes BCSI. Of course, they would do this to demonstrate their compliance with CIP-004-7 Requirement R6 Part 6.1.

I’ve just described the two use cases in which I think cloud use for NERC CIP workloads is “legal” for NERC entities today. However, being legal doesn’t mean the NERC entity’s work is done. To prove compliance in either of these cases, the entity will need to get the CSP to cooperate with them and provide certain evidence of actions they have taken regarding the CIP requirements in scope for that use case.

In the case of a low impact Control Center in the cloud, the CSP will need to provide evidence that:

1.      The CSP has documented security policies that cover the topics in CIP-003-8 Requirement R1 Part R1.2.

2.      The CSP has documented security plans for low impact BCS that include each of the five sections of CIP-003-8 Requirement R2 Attachment 1 (found on pages 23-25 of CIP-003-8). Since sections 1, 2, 4 and 5 all require policies or procedures, and since it is likely that most CSPs will already have these in place as part of their compliance with a standard like ISO 27001/2, proving compliance in those cases should not be difficult.[iii]

3.      The NERC entity's cloud environment permits “only necessary inbound and outbound electronic access as determined by the Responsible Entity for any communications that are…between a low impact BES Cyber System(s) and a Cyber Asset(s) outside the asset containing low impact BES Cyber System(s)” as required by Section 3 of Attachment 1. This section is a little more difficult, since it is a technical requirement, not a policy or procedure. On the other hand, demonstrating compliance with it should be quite simple. Kevin Perry pointed out to me that the NERC entity will normally control electronic access in their environment, so they won't need the CSP to provide them this evidence; they can gather it themselves.
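Kevin Perry’s point lends itself to a simple self-check: the entity can export its cloud firewall or security-group rules and compare them against its documented list of necessary inbound and outbound access. The sketch below is purely hypothetical; the rule format is illustrative and does not correspond to any particular CSP’s API:

```python
# Hypothetical export of inbound rules from the entity's cloud
# environment; names, addresses and structure are illustrative.
inbound_rules = [
    {"port": 443, "source": "10.20.0.0/24", "purpose": "ICCP data link"},
    {"port": 22, "source": "0.0.0.0/0", "purpose": "admin access"},
]

# The entity's documented "necessary access" list, per Section 3
# of Attachment 1 (again, purely illustrative values).
necessary = {(443, "10.20.0.0/24")}

# Flag any rule that isn't on the documented list for review.
for rule in inbound_rules:
    key = (rule["port"], rule["source"])
    status = "necessary" if key in necessary else "REVIEW: not documented"
    print(f"port {rule['port']} from {rule['source']}: {status}")
```

Since the NERC entity controls these rules itself, a periodic comparison like this can generate the Section 3 evidence without any cooperation from the CSP.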

In the use case of BCSI in the cloud, the NERC entity will need to provide evidence that they signed a delegation agreement for authorization of provisioned access to BCSI with the SaaS provider. The entity will also need to provide evidence that the SaaS provider complied with the terms of the agreement whenever they authorized provisioned access to the entity’s BCSI (which will hopefully not be a very frequent occurrence). I believe that evidence will need to include the name of each individual authorized, as well as when they were authorized.

In both use cases, it should not be difficult for the CSP or SaaS provider to furnish this evidence, although this will most likely require negotiating contract terms to ensure they do so. Will they agree to this? I hope so.

I’ve identified two cloud use cases that are “legal” today. Are there any others? I really don’t think so, although if anybody knows of one, I’d be pleased to hear about it. It seems to me that all other cloud use cases won’t work today, mainly because they require deploying or utilizing high or medium impact systems in the cloud.

If a CSP did that, they would need to provide the NERC entity with evidence of compliance with most of the CIP requirements and requirement parts that apply to high or medium impact systems. For example, they would need to implement a Physical Security Perimeter and an Electronic Security Perimeter in their cloud. Implementing either of those is impossible for a CSP, unless they’re willing to break the cloud model and constrain your data to reside on a single set of systems in a locked room with access controlled and documented.

If they’re going to do that, most of the advantages of cloud use go away, which raises the question why any NERC entity would pay the higher cost they would likely incur for using the cloud for these systems. I don’t think any would do that, which is of course why I doubt there are any high or medium impact BCS, EACMS or PACS deployed in the cloud today.

However, there may be some hope regarding EACMS in the cloud, which may be the most serious of the CIP/cloud problems. Some well-known cloud-based security monitoring services are effectively off limits to NERC entities with high or medium impact BCS, because their service is considered to meet the definition of EACMS: “Cyber Assets that perform electronic access control or electronic access monitoring of the Electronic Security Perimeter(s) or BES Cyber Systems. This includes Intermediate Devices.” In other words, these services are considered cloud-based EACMS, making them subject to almost all of the medium and high impact CIP requirements.

Some current and former NERC CIP auditors are wondering whether the term “monitoring” in the EACMS definition is currently being too broadly interpreted by auditors. If the standard interpretation of that term (which doesn’t have a NERC Glossary definition) were narrowed, those auditors believe there would be no impact on on-premises security monitoring systems, while cloud-based monitoring systems would have less likelihood of being identified by auditors as EACMS.

If this could be accomplished (perhaps with another CMEP Practice Guide), this would be a significant achievement, since it would allow NERC entities to start using cloud-based security services they currently cannot use. This would increase security of the Bulk Electric System and would not require that NERC entities wait 5-6 years for the full set of “cloud CIP” requirements to come into effect.

“CIP in the cloud” is one of the most important issues facing the NERC CIP community today, and its importance is increasing every day. If your organization is a NERC entity or a provider/potential provider of software or cloud services to NERC entities, I would love to discuss this topic with you. Please email me to set up a time for this.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] I don’t think it is likely that NERC entities will deploy workloads from either substations or synchronous generating stations in the cloud, since both of those environments require very low latency.

[ii] Of course, low impact systems are subject to compliance with other CIP requirements and requirement parts as well (e.g., having an incident response plan and physical security controls), but most CSPs should have no problem providing evidence for those. 

[iii] There is currently no NERC policy stating that, for “policies or procedures” requirements like these, it is sufficient evidence of compliance to point to where the substance of the requirement is addressed in ISO 27001 or another certification (note that FedRAMP is an authorization for certain federal agencies to utilize the service in question, not a certification). However, I would hope it would not be a heavy lift for NERC to create such a policy, perhaps in a CMEP Practice Guide.

Thursday, October 10, 2024

A great irony of supply chain cybersecurity

When FERC ordered NERC to develop a supply chain cybersecurity risk management standard in 2016, they listed four areas they wanted that standard to address: (1) software integrity and authenticity; (2) vendor remote access; (3) information system planning; and (4) vendor risk management and procurement controls. When FERC approved CIP-013-1 in 2018 in Order 850, they did so in large part because NERC had encompassed all four of those items in the standard.

The first of those items was addressed in two Requirement Parts: CIP-013-1 Requirement R1 Part R1.2.5 and CIP-010-3 Requirement R1 Part R1.6. FERC summarizes the latter on page 18 with these two sentences:

NERC asserts that the security objective of proposed Requirement R1.6 is to ensure that the software being installed in the BES Cyber System was not modified without the awareness of the software supplier and is not counterfeit. NERC contends that these steps help reduce the likelihood that an attacker could exploit legitimate vendor patch management processes to deliver compromised software updates or patches to a BES Cyber System.

In reading these sentences yesterday, I was struck by a huge irony: This provision is meant to protect against a “poisoned” software update that introduces malware into the system. It accomplishes this purpose by requiring the NERC entity to verify that the update a) was provided by the supplier of the product and not a malicious third party (authenticity), and b) wasn’t modified in some way before or while it was being downloaded (integrity).
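To make the integrity half of that check concrete, here is a minimal Python sketch. The file contents and digest are stand-ins I made up for illustration; a real check would compare against a digest the vendor publishes out-of-band, and authenticity would normally require a second step (verifying the vendor’s digital signature on the file), which is only noted in a comment here.

```python
import hashlib
import os
import tempfile

def verify_update_integrity(path, published_sha256):
    """Integrity check: hash the downloaded update file and compare it to
    the digest the vendor published out-of-band (e.g. on their HTTPS site).
    Authenticity is a separate step -- typically verifying the vendor's
    digital signature on the file -- and is not shown here."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest() == published_sha256.lower()

# Demo with a stand-in "update" file (hypothetical data, not a real vendor update)
data = b"example update payload"
expected = hashlib.sha256(data).hexdigest()
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(data)
    path = f.name

ok = verify_update_integrity(path, expected)        # matching digest: True
tampered = verify_update_integrity(path, "0" * 64)  # wrong digest: False
os.unlink(path)
print(ok, tampered)
```

Note that this kind of check only detects modification after the vendor produced the file; as the SolarWinds and CrowdStrike cases below show, it cannot catch an update that was already bad when the vendor built and signed it.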

Yet, since FERC issued Order 850, what have been probably the two most devastating supply chain cyberattacks anywhere? I’d say they’re the SolarWinds and CrowdStrike attacks (you may want to tell me that CrowdStrike wasn’t actually a cyberattack because it was caused by human error, not malice. However, this is a distinction without a difference, as I pointed out in this post last summer).

Ironically, both attacks were conveyed through software updates. Could a user organization (of any type, whether or not they were subject to NERC CIP compliance) have verified integrity and authenticity before applying the update and prevented the damage? No, for two reasons:

First, both updates were exactly what the developer had created. In the SolarWinds case, the update had been poisoned during the software build process itself, through one of the most sophisticated cyberattacks ever. Since an attack on the build process had seldom been attempted and in any case had never succeeded on any large scale, it would have been quite hard to prevent[i].

What might have prevented the attack was an improvement in SolarWinds’ fundamental security posture, which turned out to be quite deficient. This allowed the attackers to penetrate the development network with relative ease.

In the case of CrowdStrike, the update hadn’t been thoroughly tested, but it hadn’t been modified by any party other than CrowdStrike itself. Both updates would have passed the authenticity and integrity checks with flying colors.

Second, both updates were completely automatic, albeit with the user’s pre-authorization. While neither the SolarWinds nor the CrowdStrike users were forced to accept automatic software updates, I’m sure most of those users trusted the developers completely. They saw no point in spending a lot of time trying to test integrity or authenticity of these updates. Of course, it turns out their trust was misplaced. But without some prior indication that SolarWinds didn’t do basic security very well, or that CrowdStrike didn’t always test its updates adequately before shipping them out, it’s hard to believe many users would have gone to the trouble of trying to verify every update. In fact, I doubt many of them do that now.

It turns out that, practically speaking, verifying integrity and authenticity of software updates wouldn’t have prevented either the SolarWinds or the CrowdStrike incidents, since a) both updates would have easily passed the tests, and b) both vendors were highly trusted by their users (and still are, from all evidence). What would have prevented the two incidents?

Don’t say regulation. I’m sure both vendors have plenty of controls in place now to prevent the same problem from recurring. Regulations are like generals; they’re always good at re-fighting the last war.

What’s needed are controls that can prevent a different problem (of similar magnitude) from occurring. The most important of those controls is imagination. Are there products that will imagine attack scenarios that nobody has thought of before? I doubt there are today, but that might be a good idea for an AI startup.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] Aficionados of the in-toto open source software tool point out that it might have prevented the SolarWinds attack, although that assertion always comes with qualifications about actions the supplier and their customers would have needed to take. While the benefit of taking those actions (or similar ones) is now much more apparent, that need wasn’t apparent at the time.

Wednesday, October 9, 2024

It’s time to give NERC a break!

 

Last week, I pointed out that FERC, in their recent Notice of Proposed Rulemaking (NOPR), demonstrated they’re not happy with the way that CIP-013-2 (and by extension CIP-013-1) has been implemented by NERC and NERC entities. Although FERC didn’t assign blame for this situation, they made it clear they want it fixed. They’re allowing two months for comment, with a deadline of early December. Early next year, they’ll issue an order requiring that NERC draft a revised standard, which will address the problems they discuss in the NOPR.

The NOPR suggests (at a high level) various changes that FERC is considering ordering in CIP-013-2. I’ve seen a number of FERC NOPRs that deal with existing CIP standards; almost all have essentially said, “We don’t have any problem with your first version of the standard, but now we’re going to have you do something more.” However, in this NOPR, FERC effectively said, “The standard you drafted originally (which remained virtually the same in the second version, except it was expanded to cover EACMS and PACS, as well as BES Cyber Systems) was insufficient. We want you to do better this time. Here are some changes we’re considering requiring you to make in our Final Rule next year.”

If my interpretation is correct and this is FERC’s meaning, I don’t think they are being fair to NERC or to the team that drafted CIP-013-1. Here’s why:

·        In their Order 829 of July 2016, FERC handed the standards drafting team (SDT) an almost impossible task: They had to develop and get approved probably the first supply chain cybersecurity standard outside of the military, which would also be the first completely risk-based NERC standard. Most importantly, they had to do all of this – meaning FERC wanted the standard completely approved by NERC and ready for FERC’s consideration – in 12 months.

·        All new or revised NERC standards are drafted by a Standards Drafting Team (composed of subject matter experts from NERC entities) and submitted for approval to a Ballot Body composed of NERC entities that choose to participate. The balloting process is very complicated, but approval of any standard requires a supermajority of the ballot body.

·        Usually, new or revised CIP standards have required four ballots for final approval. With each ballot, NERC entities can submit comments on the standards. The SDT is required to respond to all comments. Including the commenting process, each ballot can easily require 3-4 months.

·        Since the comments often explain why an entity has voted no, the SDT scrutinizes them carefully, trying to identify changes that could be made in the draft standard that would increase its chances of approval. Having attended some of the CIP-013 SDT meetings, I know they received a lot of negative comments and made a lot of changes that some observers (including me) thought were “watering down” the requirements of the standard. However, the team members were always keenly aware of the deadline they faced. They had to make some tough choices, to have a chance of meeting that deadline (which they did, of course).

·        After having pushed NERC to meet the one-year deadline, did FERC rush to approve the standard? Well…not exactly. Even though CIP-013-1 was on FERC’s desk by the middle of 2017, they didn’t approve it until more than a year later. There was a reason for that. You may remember there was some sort of upheaval in Washington around the end of 2016 and a lot of people departed their jobs (voluntarily and otherwise). In all of that, FERC lost most of its members and was left with one or two Commissioners, which wasn’t a quorum. That’s why it took them longer to approve CIP-013 (in October 2018) than it took NERC to draft it.

In their new NOPR, FERC states they’re considering imposing a 12-month deadline for NERC to revise the standard, fully approve it, and send it to FERC for their approval. This is a terrible idea, since in that case it’s almost certain the new standard will be no more to FERC’s liking than the current one. Fortunately, near the end of the NOPR, FERC suggested they would be open to considering an 18-month deadline. I think that’s a great idea!

This will give the SDT time to discuss and submit for a ballot some of the items FERC listed in their NOPR, as well as perhaps some items that the earlier team considered in 2016-2017, but had to remove in the face of strong opposition. I remember a couple of them (although I don’t have time to go back to the original records to verify every detail of this):

1.      It seems obvious that a supply chain security standard should have a definition of “vendor”. Since there is no such definition in the NERC Glossary, the “CIP-013” SDT drafted one. When a new or revised NERC standard requires a new definition, it usually gets balloted along with the standard itself; that happened in this case (I believe it was the first ballot). The definition was solidly voted down. I remember the discussion in an SDT meeting after this happened; the team decided their one-year deadline would be in jeopardy if they kept revising and re-balloting the definition. This is why even today, there’s no NERC Glossary definition of “vendor”.

2.      As originally drafted, Requirement R3 mandated that every 15 months, the NERC entity would review and, where needed, revise the supply chain cybersecurity risk management plan that they developed for Requirement R1. That led to negative comments in the early ballots, which prompted the SDT to water down R3 to the current language: “Each Responsible Entity shall review and obtain CIP Senior Manager or delegate approval of its supply chain cyber security risk management plan(s) specified in Requirement R1 at least once every 15 calendar months.” In other words, the CIP Senior Manager needs to approve the plan every 15 months. If they don’t even look at it to see what if anything has changed, that’s perfectly fine.

To be honest, I felt (and still feel) that CIP-013-1 was a missed opportunity to develop a risk-based NERC CIP standard that could serve as a model for future risk-based CIP standards. In fact, the NERC community will need such a model, since whatever standards or requirements are developed by the new Project 2023-09 Risk Management for Third-Party Cloud Services drafting team will have to be risk-based: nothing else will work in the cloud.

Fortunately (or unfortunately), the new “cloud” SDT hasn’t even started to consider (except at a very high level) what any new standard will look like, and they won’t be able to do that until next year at the earliest. By that time, FERC will have issued their Final Rule and the CIP-013-3 drafting team should be well into the balloting process. They may have some good advice for the cloud team.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

 

Tuesday, October 8, 2024

NERC CIP-013: Vendor Risk vs. Vendor Risks


In last week’s post regarding FERC’s recent Notice of Proposed Rulemaking (NOPR) for CIP-013-2, the NERC supply chain cybersecurity risk management standard, I noted that FERC is apparently quite concerned that NERC entities aren’t properly performing the three essential elements of any risk management activity:

1.      Identify risks. Of course, in the case of CIP-013, those are supply chain security risks to the Bulk Electric System (BES). They primarily arise through the vendors in the supply chain for intelligent hardware and software that operate or monitor the BES, although they can also arise through the NERC entity itself (for example, does the entity have a policy of always buying from the lowest-cost vendor, without being concerned with frou-frou like cybersecurity?).

2.      Assess those risks. The whole point of risk management is that some risks are very serious and need to be addressed as soon as possible, and others are not important and can simply be accepted. The assessment tries to distinguish high from low risks.

3.      Respond to those risks. FERC notes that CIP-013-2 only mentions identifying and assessing risks, but never requires the entity to do anything about them. It’s a sure bet that the order FERC issues sometime after the NOPR comment period ends (in early December) will focus heavily on response.
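The three elements can be sketched in a few lines of Python. Everything here is a made-up illustration: the risk names, the 1-5 likelihood and impact scales, and the mitigation threshold are my own assumptions, not anything taken from the NOPR or from CIP-013.

```python
# Hypothetical risk register; names and 1-5 scales are illustrative only.
risks = [
    {"name": "Vendor lacks MFA on dev network", "likelihood": 4, "impact": 5},
    {"name": "Unsigned firmware updates",       "likelihood": 3, "impact": 4},
    {"name": "Paper invoices only",             "likelihood": 2, "impact": 1},
]

THRESHOLD = 8  # assumed cutoff: scores at or above this get mitigated; below, accepted

def assess(risk):
    """Assess: a simple likelihood x impact score."""
    return risk["likelihood"] * risk["impact"]

# Respond: split the register into risks to mitigate (highest score first)
# and risks to simply accept.
mitigate = sorted((r for r in risks if assess(r) >= THRESHOLD),
                  key=assess, reverse=True)
accept = [r for r in risks if assess(r) < THRESHOLD]

for r in mitigate:
    print(f"MITIGATE (score {assess(r)}): {r['name']}")
for r in accept:
    print(f"ACCEPT   (score {assess(r)}): {r['name']}")
```

The point of the sketch is simply that identification and assessment by themselves produce nothing but a ranked list; the response step is where anything actually gets done, which is exactly the gap FERC is pointing at.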

The NOPR states that FERC proposes to issue an order for NERC to revise CIP-013-2 (i.e., replace it with a new standard, which will be CIP-013-3). Since FERC addresses each of the three risk management elements separately in the NOPR, I’ll discuss them in three separate posts, starting with risk identification in this post.

Probably the most important lesson I took from my experience working with NERC entities during the runup to compliance with CIP-013-1 was the difference between “vendor risk” and “vendor risks”. By vendor risk, I mean the risk posed by the vendor itself. Vendor risk is important when you’re considering whether to buy from a particular vendor at all; for example, when you’re evaluating five vendors with a competitive RFP. In that case, you ideally want to rank all five vendors by their overall level of risk (although that’s hard to do in practice).

In the OT world (i.e., the world that CIP compliance lives in), overall vendor risk isn’t usually a consideration. This is because the decision of which vendor to buy electronic relays from, for example, is almost always “the same guys we’ve been buying from for the last 15 years”. Your relay vendor was chosen long ago for technical reasons, and your organization is very familiar with their products. Only a huge screw-up by that vendor would justify even talking to other vendors – and it still isn’t likely you would drop them, unless they had done something horrendous.

There are various services that will sell you risk scores for vendors, which are based on many factors (you can also compute your own scores, of course). For those rare occasions when you might switch vendors (or you’re starting to purchase a new type of product that you’ve never bought before), overall vendor risk scores can be very helpful. But for most OT procurements, vendor risk scores don’t help much. The engineers will make the decision based on functionality. Even if one vendor has an overall risk score that’s higher than another vendor’s, that’s not usually going to sway the decision.

So, why do you need to “identify” risks, if the decision which vendor to buy from was made long ago and won’t change for just one procurement? It’s because the risks that apply to one vendor can change substantially from one procurement to the next.

For example, suppose your organization’s last procurement from Vendor A was a year ago. The vendor hasn’t changed very much, but the environment certainly has. Let’s say that in the past year, the SolarWinds attack happened. Before SolarWinds, your organization never even considered whether a vendor maintained a secure software development environment. This year, you decide you need to ask them questions like

1.      Do you require MFA for access to your development network?

2.      How do you vet your developers?

3.      Do you utilize a tool like in-toto (an open source framework) to verify the integrity of your development chain?

Or, suppose Vendor A has recently been acquired by Vendor B. Of course, they assure you up and down that they’re still the same Vendor A you’ve known and (mostly) loved for the last ten years. However, you decide that the safe path is to re-assess them before you start a new procurement. Along with the questions you’ve asked them in previous assessments, you add new ones like

A.     How have your security practices changed since you were acquired?

B.     What security certifications does Vendor B have? If you (Vendor A) don’t have one of them, when will you get it?

C.      Will you merge your network with Vendor B’s, and if so, what measures will you take to make sure there’s no degradation of security controls?

All six of the above questions are based on new risks; that is, you have identified these as risks that apply to Vendor A, even though you’re not reconsidering whether you want to continue using them. You’ll keep using them, but at the same time you need to identify all the risks that now apply to them, so you can pressure the vendor to mitigate them if needed.

In other words, it isn’t a question of whether Vendor A is now too risky to continue buying from. Rather, it’s a question of what are the new risks that apply to A, that didn’t apply the last time you bought from them? You’re not asking about Vendor A’s overall risk level; you’re asking what new risks they may pose to you, that arose since the last time you assessed them.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.