Monday, October 14, 2024

How can we truly automate software vulnerability identification?

Given the proliferation of serious software vulnerabilities like Log4Shell in the log4j library, software vulnerability management is an important component of any organization’s security program. Successful vulnerability management starts with successful vulnerability identification. This requires that:

1.      The supplier of the software reports vulnerabilities they find in their products. These reports are incorporated into vulnerability databases, especially the US National Vulnerability Database (NVD). Almost all software vulnerabilities are reported by the supplier of the software, not a third party.

2.      Later, users of the software can search the NVD for new vulnerabilities that apply to software products they use. Learning about these vulnerabilities enables the user to coordinate with the suppliers of those products, to learn when they will patch the vulnerabilities and encourage them to speed up patches for the most important vulnerabilities.

However, one important assumption underlies these two requirements: that the user will always be able to learn about vulnerabilities that apply to a product they use when they search a vulnerability database like the NVD. The user will only be able to do this if they know how the supplier has identified the product in the database.

It might seem like the solution to this problem is obvious: the supplier will report the vulnerability using the name of the product, and the user will search for that name. The problem is that software products are notorious for having many names, due to being sold under different brands or in different sales venues, acquisition by a different supplier, etc. Even the employees of a large software supplier may know their own products by different names. Trying to create – and especially maintain – a database that lists all the names for a particular software product would be hugely expensive and would ultimately fail, due to the rapidly increasing volume of new software products.

Given there will never be a definitive database of all the names by which a single software product is known, how can a user be sure their search will find the correct product in a vulnerability database? There needs to be a single machine-readable identifier for the product, which the supplier includes in the vulnerability report and the user searches for in the vulnerability database. We have already ruled out the idea of a centralized database that lists all the possible names for a single software product. How can we accomplish this goal without a central database?

The solution is for the identifier to be based on something that the supplier will always know before they report a vulnerability for their product, and that the user will also know (or can easily learn) before they search for that product in a vulnerability database. A good analogy for this is the case of the formula for a chemical compound.

If a chemist has identified a compound whose molecules consist of two hydrogen atoms and one oxygen atom, the chemist will write it as “H2O” (of course, the “2” is normally written as a subscript). Every other chemist will recognize that as water. Similarly, a compound of one sodium and one chlorine atom is NaCl, which is table salt. Note that all chemists can create and interpret these identifiers, without having to look them up in a central database. A chemist who reads “NaCl” always knows which compound that refers to.

There is a software identifier that works in the same way. It’s called “purl”, which stands for “package URL”. It is in widespread use as an identifier in vulnerability databases for open source software distributed through package managers (the primary channels through which open source software is made available for download, although not all open source software is available in a package manager).

To create a purl for an open source product, the supplier or user only needs to know the product name, the version number (usually called a “version string”) and the package manager name (such as PyPI). Because every product name/version string combination will always be unique within one package manager (although the same product/version might be available in a different package manager), the purl that includes those three pieces of information is guaranteed to be unique; it is also guaranteed always to point to the same product, since the combination of product name and version string will never change for that product/version.

For example, the purl for version 1.11.1 of the package named “django” in the PyPI package manager is “pkg:pypi/django@1.11.1”. If a user wants to learn about vulnerabilities for version 1.11.1 of django in the PyPI package manager, they will always be able to find them using that purl. If they upgrade their instance of django to version 1.12.1, they will search for “pkg:pypi/django@1.12.1” (every purl begins with the “pkg” scheme). Since the supplier will always use the same purl to report vulnerabilities, the user can be sure their search will find all reported vulnerabilities for that product/version.
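To illustrate how mechanical purl construction is, here is a minimal sketch in Python. The helper name `make_purl` and the simplified encoding are mine; the official purl specification (and libraries like packageurl-python) also handle namespaces, qualifiers, subpaths and per-ecosystem normalization rules.

```python
from urllib.parse import quote

def make_purl(pkg_type: str, name: str, version: str) -> str:
    """Build a basic package URL (purl) of the form pkg:type/name@version.

    Simplified sketch: the full purl spec also supports namespaces,
    qualifiers and subpaths, and has per-ecosystem rules (e.g., PyPI
    package names are lowercased).
    """
    pkg_type = pkg_type.lower()
    name = quote(name.lower(), safe="")   # percent-encode unsafe characters
    version = quote(version, safe=".")    # keep dots in version strings
    return f"pkg:{pkg_type}/{name}@{version}"

# Anyone who knows the package manager, product name and version string
# can derive the identical identifier -- no central database required.
print(make_purl("pypi", "django", "1.11.1"))  # pkg:pypi/django@1.11.1
```

The point of the sketch is that both the supplier reporting a vulnerability and the user searching for it can compute the same string independently, which is exactly the “chemical formula” property described above.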

Besides purl, the only software identifier in widespread use in vulnerability databases is CPE, which stands for “Common Platform Enumeration”. Without going into a lot of detail, CPE is the identifier used in the National Vulnerability Database. It was developed more than 20 years ago by the National Institute of Standards and Technology (NIST), which operates the NVD.

A CPE is created by a NIST employee or contractor and added to a vulnerability (CVE) record in the NVD. Unfortunately, there is no way that anyone can predict with certainty the CPE that this person will create. Some of the reasons why this is the case are described on pages 4-6 of the OWASP SBOM Forum’s 2022 white paper titled “A proposal to operationalize component identification for vulnerability management”.
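For comparison, a CPE 2.3 “formatted string” packs eleven colon-separated attributes into one name. The example below is illustrative only; the vendor and product values NIST assigns are exactly what a supplier or user cannot reliably predict in advance, which is the problem just described.

```python
# Illustrative CPE 2.3 formatted string for the same django release.
# Attribute order per the CPE 2.3 naming spec:
#   cpe:2.3:part:vendor:product:version:update:edition:language:
#   sw_edition:target_sw:target_hw:other
cpe = "cpe:2.3:a:djangoproject:django:1.11.1:*:*:*:*:*:*:*"

fields = cpe.split(":")
part, vendor, product, version = fields[2:6]
# "a" means application. The vendor value ("djangoproject" here) is
# NIST's choice, not something derivable from the package itself --
# unlike a purl, which can be computed by anyone.
print(part, vendor, product, version)  # a djangoproject django 1.11.1
```

Contrast this with the purl for the same package: every piece of a purl is known to the supplier and user in advance, while the vendor and product fields of a CPE are assigned by a third party after the fact.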

Currently (as of the fall of 2024), there is an even more serious problem with CPE, in that since February the NVD staff has drastically reduced the number of CPEs it creates. The result is that over two thirds of new CVE records entered in 2024 do not have a CPE name attached to them. This makes those CVEs invisible to automated searches using a CPE name. A user that searches with a CPE name today may potentially never learn about two thirds of the vulnerabilities that apply to their product/version.

The upshot of this situation is that, if truly automated software vulnerability management is going to be possible again, purl needs to be the default software identifier, both in CVE records and the National Vulnerability Database. While most of the groundwork for achieving this result has already been laid, there remains one big obstacle: Currently, there is no workable way for purl to identify proprietary software. Since the majority of private and public sector organizations in the world rely primarily on proprietary software to run their businesses, this obstacle needs to be removed, so that users of proprietary software products can easily learn about vulnerabilities present in those products.

The OWASP SBOM Forum has identified two methods by which the purl specification can be expanded to make vulnerabilities in proprietary software products as easily discoverable as are vulnerabilities in open source products today. We will soon be starting a working group to address this problem. If you would like to participate in that group and/or provide financial support through a donation to OWASP, please email me.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com. 

Sunday, October 13, 2024

NERC CIP: What is “legal” in the cloud today?


If you have been following this blog for – say – the last eight years or so, you probably know that a big problem in the world of NERC CIP compliance is the fact that NERC entities are severely limited in what kinds of workloads they can implement or utilize in the cloud. While this has been the case for many years, the problem is becoming more acute all the time, as software products and security services announce that henceforth they will only be available in the cloud, or else most of their upgrades and enhancements will only be available in the cloud.

As you may also know, a new NERC Standards Drafting Team (SDT) is now meeting to consider what changes may be required to the CIP standards in order to fix this problem. However, they have a long road ahead of them, as I described in this post in January. I doubt the final set of new or revised CIP standards will become mandatory sooner than 5-6 years from today. This isn’t because NERC is dilatory, but because the NERC standards development process includes many steps designed to ensure that NERC members (as well as members of the public) are able to participate at all stages.

So, the good news is the new (or revised) “cloud CIP” standards are guaranteed to be well thought out. The bad news is this will take a long time. I’m sure many NERC entities want to make more use of the cloud now, but are being held back by uncertainty over what exactly is “legal” today - and especially how they will prove at their next audit that they are still compliant.

I must admit that I can only find two use cases in which I am sure that a NERC entity will be found compliant if they utilize the cloud today (although in both cases there’s a catch, which I’ll describe below).

The first of these is low impact BES Cyber Systems in the cloud, and especially low impact Control Centers.[i] This post describes how – after I was initially skeptical that it’s possible for a CSP to provide evidence of compliance with CIP-003-8 Requirement R2, especially Section 3 of Attachment 1 – a retired CIP auditor convinced me that in fact this is possible[ii]. However, just because it’s possible doesn’t mean that NERC entities with a low impact Control Center are going to rush to redeploy it in the cloud today. See below.

The second use case is BCSI (BES Cyber System Information) in the cloud. Since BCSI is only defined for information regarding medium and high impact BCS, EACMS (Electronic Access Control or Monitoring Systems) and PACS (Physical Access Control Systems), this isn’t a low impact problem. BCSI in the cloud was effectively verboten before January of this year, but the “BCSI-in-the-cloud” problem was in theory solved when CIP-004-7 and CIP-011-3 came into effect on January 1. Why do we need to discuss this now?

It's because, unfortunately, the single new BCSI requirement, CIP-004-7 Requirement R6, was not written with the most important use case for BCSI-in-the-cloud in mind: SaaS that needs access to BCSI. Instead, the requirement was written for simple storage of BCSI in the cloud. However, why would any NERC entity bother to store their BCSI in the cloud? BCSI is almost never voluminous, and usually, on-premises BCSI can be easily (and inexpensively) enclosed within the NERC entity’s ESP and PSP, with zero compliance risk.

However, if a SaaS application for, say, configuration or vulnerability management requires access to BCSI, the wording of the new CIP-004-7 Requirement R6 Part 6.1.1 poses a problem. Here’s a little background:

The first sentence of Requirement R6 reads, “Each Responsible Entity shall implement one or more documented access management program(s) to authorize, verify, and revoke provisioned access to BCSI…”

The second and third sentences read, “To be considered access to BCSI in the context of this requirement, an individual has both the ability to obtain and use BCSI. Provisioned access is to be considered the result of the specific actions taken to provide an individual(s) the means to access BCSI (e.g., may include physical keys or access cards, user accounts and associated rights and privileges, encryption keys).”

In other words, an individual is considered to have “provisioned access” to BCSI when it is possible for them to view the unencrypted data, regardless of whether or not they actually do so. Therefore, if the person has access to encrypted BCSI but also has access, however briefly, to the decryption key(s), they have provisioned access to BCSI, even if they never view the unencrypted data.

Requirement R6 Part 6.1.1 requires the NERC entity’s access management program to, “Prior to provisioning, authorize…based on need, as determined by the Responsible Entity…Provisioned electronic access to electronic BCSI.” In other words, the entity’s BCSI access management program needs to specifically address how individuals will be granted provisioned access.

Note that, if the case for BCSI in the cloud were simply storage of the encrypted BCSI, there wouldn’t be any question regarding provisioned access. No CSP employee should ever need access to the decryption keys for data that is merely stored in the cloud; the NERC entity would retain full control over the keys the entire time that the BCSI was stored in the cloud.

However, if a SaaS application needs to process BCSI, it will normally require that the BCSI be decrypted first. There is a technology called “homomorphic encryption” that enables an application to utilize encrypted data without decrypting it, but unless the application already supports this, it is unlikely to be available. Thus, an employee of the SaaS provider (or perhaps of the platform CSP on which the SaaS resides) will need provisioned access to BCSI, if only for a few seconds.

If the NERC entity needs to authorize provisioned access for the cloud employees, that’s a problem: it would probably require the SaaS provider to get the permission of every NERC CIP customer whenever they want a new or existing employee to receive provisioned access to BCSI. In fact, each customer would need to give authorization for each individual employee that receives provisioned access; it can’t be granted to, for example, every CSP employee that meets certain criteria.

Last winter, there was some panic over this issue among staff members of the NERC ERO (including the Regional Entities), along with suggestions that the issue should be kicked back to the new “cloud” SDT – which would mean years before it is resolved. However, it now seems that, if a NERC entity has signed a delegation agreement with the SaaS provider (or the CSP), that might be considered sufficient evidence of compliance.

But how can the NERC entity be sure this is the case? Currently, they can’t, since even NERC ERO-endorsed “Implementation Guidance” isn’t binding on auditors (officially, they have to “give deference” to it, whatever that means). However, the closest thing to a document that commits the auditors to supporting a particular interpretation of a requirement is a “CMEP Practice Guide”. This must be developed by a committee of Regional auditors, although they are allowed to take input from the wider NERC CIP community.

If a CMEP Practice Guide is developed for BCSI in the cloud, it is likely (in my opinion) that it would recommend that a NERC entity sign a delegation agreement with the SaaS provider, if they wish to utilize a SaaS product that utilizes BCSI. Of course, they would do this to demonstrate their compliance with CIP-004-7 Requirement R6 Part 6.1.1.

I’ve just described the two use cases in which I think cloud use for NERC CIP workloads is “legal” for NERC entities today. However, being legal doesn’t mean the NERC entity’s work is done. To prove compliance in either of these cases, the entity will need to get the CSP to cooperate with them and provide certain evidence of actions they have taken regarding the CIP requirements in scope for that use case.

In the case of a low impact Control Center in the cloud, the CSP will need to provide evidence that:

1.      The CSP has documented security policies that cover the topics in CIP-003-8 Requirement R1 Part R1.2.

2.      The CSP has documented security plans for low impact BCS that include each of the five sections of CIP-003-8 Requirement R2 Attachment 1 (found on pages 23-25 of CIP-003-8). Since sections 1, 2, 4 and 5 all require policies or procedures, and since it is likely that most CSPs will already have these in place as part of their compliance with a standard like ISO 27001/2, proving compliance in those cases should not be difficult.[iii]

3.      The NERC entity's cloud environment permits “only necessary inbound and outbound electronic access as determined by the Responsible Entity for any communications that are…between a low impact BES Cyber System(s) and a Cyber Asset(s) outside the asset containing low impact BES Cyber System(s)” as required by Section 3 of Attachment 1. This section is a little more difficult, since it is a technical requirement, not a policy or procedure. On the other hand, demonstrating compliance with it should be quite simple. Kevin Perry pointed out to me that the NERC entity will normally control electronic access in their environment, so they won't need the CSP to provide them this evidence; they can gather it themselves.

In the use case of BCSI in the cloud, the NERC entity will need to provide evidence that they signed a delegation agreement for authorization of provisioned access to BCSI with the SaaS provider. The entity will also need to provide evidence that the SaaS provider complied with the terms of the agreement whenever they authorized provisioned access to the entity’s BCSI (which will hopefully not be a very frequent occurrence). I believe that evidence will need to include the name of each individual authorized, as well as when they were authorized.

In both use cases, it should not be difficult for the CSP or SaaS provider to furnish this evidence, although this will most likely require negotiating contract terms to ensure they do so. Will the provider agree to this? I hope so.

I’ve identified two cloud use cases that are “legal” today. Are there any others? I really don’t think so, although if anybody knows of one, I’d be pleased to hear about it. It seems to me that all other cloud use cases won’t work today, mainly because they require deploying or utilizing high or medium impact systems in the cloud.

If a CSP hosted high or medium impact systems, they would need to provide the NERC entity with evidence of compliance with most of the CIP requirements and requirement parts that apply to those systems. For example, they would need to implement a Physical Security Perimeter and an Electronic Security Perimeter in their cloud. Implementing either of those is impossible for a CSP, unless they’re willing to break the cloud model and constrain a customer’s data to reside on a single set of systems in a locked room, with access controlled and documented.

If they’re going to do that, most of the advantages of cloud use go away, which raises the question why any NERC entity would pay the higher cost they would likely incur for using the cloud for these systems. I don’t think any would do that, which is of course why I doubt there are any high or medium impact BCS, EACMS or PACS deployed in the cloud today.

However, there may be some hope regarding EACMS in the cloud, which may be the most serious of the CIP/cloud problems. Some well-known cloud-based security monitoring services are effectively off limits to NERC entities with high or medium BCS, because their service is considered to meet the definition of EACMS: “Cyber Assets that perform electronic access control or electronic access monitoring of the Electronic Security Perimeter(s) or BES Cyber Systems. This includes Intermediate Devices.” In other words, these services are considered cloud-based EACMS, making them subject to almost all of the medium and high impact CIP requirements.

Some current and former NERC CIP auditors are wondering whether the term “monitoring” in the EACMS definition is currently being too broadly interpreted by auditors. If the standard interpretation of that term (which doesn’t have a NERC Glossary definition) were narrowed, those auditors believe there would be no impact on on-premises security monitoring systems, while cloud-based monitoring systems would have less likelihood of being identified by auditors as EACMS.

If this could be accomplished (perhaps with another CMEP Practice Guide), this would be a significant achievement, since it would allow NERC entities to start using cloud-based security services they currently cannot use. This would increase security of the Bulk Electric System and would not require that NERC entities wait 5-6 years for the full set of “cloud CIP” requirements to come into effect.

“CIP in the cloud” is one of the most important issues facing the NERC CIP community today, and its importance is increasing every day. If your organization is a NERC entity or a provider/potential provider of software or cloud services to NERC entities, I would love to discuss this topic with you. Please email me to set up a time for this.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] I don’t think it is likely that NERC entities will deploy workloads from either substations or synchronous generating stations in the cloud, because both of those environments require very low latency.

[ii] Of course, low impact systems are subject to compliance with other CIP requirements and requirement parts as well (e.g., having an incident response plan and physical security controls), but most CSPs should have no problem providing evidence for those. 

[iii] There is currently no NERC policy in place stating that, for “policies or procedures” requirements like these, it is sufficient evidence of compliance to point to where the substance of the requirement is addressed in ISO 27001 (or any other certification; note that FedRAMP is an authorization for certain federal agencies to utilize the service in question, not a certification). However, I would hope it would not be a heavy lift for NERC to create such a policy, perhaps in a CMEP Practice Guide.

Thursday, October 10, 2024

A great irony of supply chain cybersecurity

When FERC ordered NERC to develop a supply chain cybersecurity risk management standard in 2016, they listed four areas they wanted that standard to address: (1) software integrity and authenticity; (2) vendor remote access; (3) information system planning; and (4) vendor risk management and procurement controls. When FERC approved CIP-013-1 in 2018 in Order 850, they did so in large part because NERC had encompassed all four of those items in the standard.

The first of those items was addressed in two Requirement Parts: CIP-013-1 Requirement R1 Part R1.2.5 and CIP-010-3 Requirement R1 Part R1.6. FERC summarizes the latter on page 18 with these two sentences:

NERC asserts that the security objective of proposed Requirement R1.6 is to ensure that the software being installed in the BES Cyber System was not modified without the awareness of the software supplier and is not counterfeit. NERC contends that these steps help reduce the likelihood that an attacker could exploit legitimate vendor patch management processes to deliver compromised software updates or patches to a BES Cyber System.

In reading these sentences yesterday, I was struck by a huge irony: This provision is meant to protect against a “poisoned” software update that introduces malware into the system. It accomplishes this purpose by requiring the NERC entity to verify that the update a) was provided by the supplier of the product and not a malicious third party (authenticity), and b) wasn’t modified in some way before or while it was being downloaded (integrity).

Yet, since FERC issued Order 850, what have been probably the two most devastating supply chain cyberattacks anywhere? I’d say they’re the SolarWinds and CrowdStrike attacks (you may want to tell me that CrowdStrike wasn’t actually a cyberattack because it was caused by human error, not malice. However, this is a distinction without a difference, as I pointed out in this post last summer).

Ironically, both attacks were conveyed through software updates. Could a user organization (of any type, whether or not they were subject to NERC CIP compliance) have verified integrity and authenticity before applying the update and prevented the damage? No, for two reasons:

First, both updates were exactly what the developer had created. In the SolarWinds case, the update had been poisoned during the software build process itself, through one of the most sophisticated cyberattacks ever. Since an attack on the build process had seldom been attempted and in any case had never succeeded on any large scale, it would have been quite hard to prevent[i].

What might have prevented the attack was an improvement in SolarWinds’ fundamental security posture, which turned out to be quite deficient. This allowed the attackers to penetrate the development network with relative ease.

In the case of CrowdStrike, the update hadn’t been thoroughly tested, but it hadn’t been modified by any party other than CrowdStrike itself. Both updates would have passed the authenticity and integrity checks with flying colors.

Second, both updates were completely automatic, albeit with the user’s pre-authorization. While neither the SolarWinds nor the CrowdStrike users were forced to accept automatic software updates, I’m sure most of those users trusted the developers completely. They saw no point in spending a lot of time trying to test integrity or authenticity of these updates. Of course, it turns out their trust was misplaced. But without some prior indication that SolarWinds didn’t do basic security very well, or that CrowdStrike didn’t always test its updates adequately before shipping them out, it’s hard to believe many users would have gone through the trouble of trying to verify every update. In fact, I doubt many of them do that now.

It turns out that, practically speaking, verifying integrity and authenticity of software updates wouldn’t have prevented either the SolarWinds or the CrowdStrike incidents, since a) both updates would have easily passed the tests, and b) both vendors were highly trusted by their users (and still are, from all evidence). What would have prevented the two incidents?

Don’t say regulation. I’m sure both vendors have plenty of controls in place now to prevent the same problem from recurring. Regulations are like generals; they’re always good at re-fighting the last war.

What’s needed are controls that can prevent a different problem (of similar magnitude) from occurring. The most important of those controls is imagination. Are there products that will imagine attack scenarios that nobody has thought of before? I doubt there are today, but that might be a good idea for an AI startup.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] Aficionados of the in-toto open source software tool point out that it might have prevented the SolarWinds attack, although that assertion always comes with qualifications about actions the supplier and their customers would need to have taken. While the benefit of taking those actions (or similar ones) is now much more apparent, that need wasn’t apparent at the time.

Wednesday, October 9, 2024

It’s time to give NERC a break!

 

Last week, I pointed out that FERC, in their recent Notice of Proposed Rulemaking (NOPR), demonstrated they’re not happy with the way that CIP-013-2 (and by extension CIP-013-1) has been implemented by NERC and NERC entities. Although FERC didn’t assign blame for this situation, they made it clear they want it fixed. They’re allowing two months for comment, with a deadline of early December. Early next year, they’ll issue an order requiring that NERC draft a revised standard, which will address the problems they discuss in the NOPR.

The NOPR suggests (at a high level) various changes that FERC is considering ordering in CIP-013-2. I’ve seen a number of FERC NOPRs that deal with existing CIP standards; almost all have essentially said, “We don’t have any problem with your first version of the standard, but now we’re going to have you do something more.” However, in this NOPR, FERC effectively said, “The standard you drafted originally (which remained virtually the same in the second version, except it was expanded to cover EACMS and PACS, as well as BES Cyber Systems) was insufficient. We want you to do better this time. Here are some changes we’re considering requiring you to make in our Final Rule next year.”

If my interpretation is correct and this is FERC’s meaning, I don’t think they are being fair to NERC or to the team that drafted CIP-013-1. Here’s why:

·        In their Order 829 of July 2016, FERC handed the standards drafting team (SDT) an almost impossible task: They had to develop and get approved probably the first supply chain cybersecurity standard outside of the military, which would also be the first completely risk-based NERC standard. Most importantly, they had to do all of this – meaning they wanted it completely approved by NERC and ready for their consideration – in 12 months.

·        All new or revised NERC standards are drafted by a Standards Drafting Team (composed of subject matter experts from NERC entities) and submitted for approval to a Ballot Body composed of NERC entities that choose to participate. The balloting process is very complicated, but approval of any standard requires a supermajority of the ballot body.

·        Usually, new or revised CIP standards have required four ballots for final approval. With each ballot, NERC entities can submit comments on the standards. The SDT is required to respond to all comments. Including the commenting process, each ballot can easily require 3-4 months.

·        Since the comments often explain why an entity has voted no, the SDT scrutinizes them carefully, trying to identify changes that could be made in the draft standard that would increase its chances of approval. Having attended some of the CIP-013 SDT meetings, I know they received a lot of negative comments and made a lot of changes that some observers (including me) thought were “watering down” the requirements of the standard. However, the team members were always keenly aware of the deadline they faced. They had to make some tough choices, to have a chance of meeting that deadline (which they did, of course).

·        After having pushed NERC to meet the one-year deadline, did FERC rush to approve the standard? Well…not exactly. Even though CIP-013-1 was on FERC’s desk by the middle of 2017, they didn’t approve it until more than a year later. There was a reason for that. You may remember there was some sort of upheaval in Washington around the end of 2016 and a lot of people departed their jobs (voluntarily and otherwise). In all of that, FERC lost most of its members and was left with one or two Commissioners, which wasn’t a quorum. That’s why it took them longer to approve CIP-013 (in October 2018) than it took NERC to draft it.

In their new NOPR, FERC states they’re considering imposing a 12-month deadline for NERC to revise the standard, fully approve it, and send it to FERC for their approval. This is a terrible idea, since in that case it’s almost certain the new standard will be no more to FERC’s liking than the current one.  Fortunately, near the end of the NOPR, FERC suggested they would be open to considering an 18-month deadline. I think that’s a great idea!

This will give the SDT time to discuss and submit for a ballot some of the items FERC listed in their NOPR, as well as perhaps some items that the earlier team considered in 2016-2017, but had to remove in the face of strong opposition. I remember a couple of them (although I don’t have time to go back to the original records to verify every detail of this):

1.      It seems obvious that a supply chain security standard should have a definition of “vendor”. Since there is no such definition in the NERC Glossary, the “CIP-013” SDT drafted one. When a new or revised NERC standard requires a new definition, it usually gets balloted along with the standard itself; that happened in this case (I believe it was the first ballot). The definition was solidly voted down. I remember the discussion in an SDT meeting after this happened; the team decided their one-year deadline would be in jeopardy if they kept revising and re-balloting the definition. This is why even today, there’s no NERC Glossary definition of “vendor”.

2.      As originally drafted, Requirement R3 mandated that every 15 months, the NERC entity would review and, where needed, revise the supply chain cybersecurity risk management plan that they developed for Requirement R1. That led to negative comments in the early ballots, which led the SDT to water down R3 to the current language: “Each Responsible Entity shall review and obtain CIP Senior Manager or delegate approval of its supply chain cyber security risk management plan(s) specified in Requirement R1 at least once every 15 calendar months.” In other words, the CIP Senior Manager needs to approve the plan every 15 months. If they don’t even look at it to see what if anything has changed, that’s perfectly fine.

To be honest, I felt (and still feel) that CIP-013-1 was a missed opportunity to develop a risk-based NERC CIP standard that could serve as a model for future risk-based CIP standards. In fact, the NERC community will need such a model, since whatever standards or requirements are developed by the new Project 2023-09 Risk Management for Third-Party Cloud Services drafting team will have to be risk-based: nothing else will work in the cloud.

Fortunately (or unfortunately), the new “cloud” SDT hasn’t even started to consider (except at a very high level) what any new standard will look like, and they won’t be able to do that until next year at the earliest. By that time, FERC will have issued their Final Rule and the CIP-013-3 drafting team should be well into the balloting process. They may have some good advice for the cloud team.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

 

Tuesday, October 8, 2024

NERC CIP-013: Vendor Risk vs. Vendor Risks


In last week’s post regarding FERC’s recent Notice of Proposed Rulemaking (NOPR) for CIP-013-2, the NERC supply chain cybersecurity risk management standard, I noted that FERC is apparently quite concerned that NERC entities aren’t properly performing the three essential elements of any risk management activity:

1.      Identify risks. Of course, in the case of CIP-013, those are supply chain security risks to the Bulk Electric System (BES). They primarily arise through the vendors in the supply chain for intelligent hardware and software that operate or monitor the BES, although they can also arise through the NERC entity itself (for example, does the entity have a policy always to buy from the lowest cost vendor, without being concerned with frou-frou like cybersecurity?).

2.      Assess those risks. The whole point of risk management is that some risks are very serious and need to be addressed as soon as possible, and others are not important and can simply be accepted. The assessment tries to distinguish high from low risks.

3.      Respond to those risks. FERC notes that CIP-013-2 only mentions identifying and assessing risks, but never requires the entity to do anything about them. It’s a sure bet that the order FERC issues sometime after the NOPR comment period ends (in early December) will focus heavily on response.

The NOPR states that FERC proposes to issue an order for NERC to revise CIP-013-2 (i.e., replace it with a new standard, which will be CIP-013-3). Since FERC addresses each of the three risk management elements separately in the NOPR, I’ll discuss them in three separate posts, starting with risk identification in this post.

Probably the most important lesson I took from my experience working with NERC entities during the runup to compliance with CIP-013-1 was the difference between “vendor risk” and “vendor risks”. By vendor risk, I mean the risk posed by the vendor itself. Vendor risk is important when you’re considering whether to buy from a particular vendor at all; for example, when you’re evaluating five vendors with a competitive RFP. In that case, you ideally want to rank all five vendors by their overall level of risk (although that’s hard to do in practice).

In the OT world (i.e., the world that CIP compliance lives in), overall vendor risk isn’t usually a consideration. This is because the decision of which vendor to buy electronic relays from (for example) is almost always “the same guys we’ve been buying from for the last 15 years”. Your relay vendor was chosen long ago for technical reasons, and your organization is very familiar with their products. Only a huge screw-up by that vendor would justify even talking to other vendors – and even then, it isn’t likely you would drop them unless they had done something horrendous.

There are various services that will sell you risk scores for vendors, which are based on many factors (you can also compute your own scores, of course). For those rare occasions when you might switch vendors (or you’re starting to purchase a new type of product that you’ve never bought before), overall vendor risk scores can be very helpful. But for most OT procurements, vendor risk scores don’t help much. The engineers will make the decision based on functionality. Even if one vendor has an overall risk score that’s higher than another vendor’s, that’s not usually going to sway the decision.
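Such scoring services typically roll many weighted factors into a single number. Here’s a minimal sketch of that idea in Python; the factor names, weights, and scores are purely illustrative, not those of any actual scoring service:

```python
# Hypothetical overall vendor risk score: a weighted average of factor
# scores, each on a 0 (low risk) to 10 (high risk) scale.
WEIGHTS = {
    "breach_history": 0.30,
    "patch_responsiveness": 0.25,
    "external_attack_surface": 0.25,
    "certifications": 0.20,
}

def vendor_risk_score(factors: dict) -> float:
    """Return the weighted overall risk score for one vendor."""
    return round(sum(WEIGHTS[f] * factors[f] for f in WEIGHTS), 2)

vendor_a = {"breach_history": 2, "patch_responsiveness": 4,
            "external_attack_surface": 5, "certifications": 3}
print(vendor_risk_score(vendor_a))  # → 3.45
```

A single aggregate number like this is useful for an RFP-style comparison of several vendors; as noted above, though, it rarely drives an OT purchasing decision.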

So, why do you need to “identify” risks, if the decision which vendor to buy from was made long ago and won’t change for just one procurement? It’s because the risks that apply to one vendor can change substantially from one procurement to the next.

For example, suppose your organization’s last procurement from Vendor A was a year ago. The vendor hasn’t changed very much, but the environment certainly has. Let’s say that in the past year, the SolarWinds attack happened. Before SolarWinds, your organization never even considered whether a vendor maintained a secure software development environment. This year, you decide you need to ask them questions like

1.      Do you require MFA for access to your development network?

2.      How do you vet your developers?

3.      Do you use a tool like in-toto (an open source framework) to verify the integrity of your development chain?

Or, suppose Vendor A has recently been acquired by Vendor B. Of course, they assure you up and down that they’re still the same Vendor A you’ve known and (mostly) loved for the last ten years. However, you decide that the safe path is to re-assess them before you start a new procurement. Along with the questions you’ve asked them in previous assessments, you add new ones like

A.     How have your security practices changed since you were acquired?

B.     What security certifications does Vendor B have? If you (Vendor A) don’t have one of them, when will you get it?

C.      Will you merge your network with Vendor B’s, and if so, what measures will you take to make sure there’s no degradation of security controls?

All six of the above questions are based on new risks; that is, you have identified these as risks that apply to Vendor A, even though you’re not reconsidering whether you want to continue using them at all. You’ll keep using them, but at the same time you need to identify all the risks that now apply to them, so you can pressure the vendor to mitigate them if needed.

In other words, it isn’t a question of whether Vendor A is now too risky to continue buying from. Rather, it’s a question of what are the new risks that apply to A, that didn’t apply the last time you bought from them? You’re not asking about Vendor A’s overall risk level; you’re asking what new risks they may pose to you, that arose since the last time you assessed them.
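The distinction can be made concrete by comparing the risk registers from two successive assessments of the same vendor. The risk names below are invented for illustration:

```python
# Compare the sets of risks identified for one vendor at two assessments.
# The question isn't "what is Vendor A's overall risk?" but
# "which risks apply now that didn't apply last time?"
last_year = {"weak remote access controls", "no incident response plan"}
this_year = {"weak remote access controls",
             "insecure development environment",   # post-SolarWinds concern
             "pending acquisition by Vendor B"}

new_risks = this_year - last_year   # risks needing follow-up now
resolved  = last_year - this_year   # risks mitigated since last time

print(sorted(new_risks))
print(sorted(resolved))
```

The output of this comparison – the new risks – is the list you take to the vendor, not a revised overall score.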

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

 

Sunday, October 6, 2024

NERC CIP needs vulnerability management, not patch management


For many years, I’ve believed that the big problem with the NERC CIP cybersecurity standards isn’t that they’re not strong enough to be effective. Au contraire, the problem is that they’re very effective, but that effectiveness comes at a huge cost. In other words, although CIP compliance has protected the North American Bulk Electric System (BES) very effectively, the cost of compliance paid by NERC entities (in both time and dollars) has been huge.

One of the main reasons for this is that there are a few very prescriptive CIP requirements that require the NERC entity to document compliance in a huge number of single instances, usually regarding individual Cyber Assets. My biggest examples for this statement by far have always been CIP-007 R2 for patch management and CIP-010 R1 for configuration management. Both require tracking and documenting a huge number of individual instances of compliance across many individual devices.

But there’s a difference between the two requirements. CIP-010 R1 compliance requires that the NERC entity track every change to every Cyber Asset in scope (mostly BES Cyber Assets that are part of medium or high impact BES Cyber Systems), as well as whether it was authorized, whether the baseline configuration information was updated after the change, etc. All these items are important, but if this requirement (and its requirement parts) were rewritten as a risk-based one, the change wouldn’t impact security very much, yet it would greatly ease the burden of compliance.

CIP-007 R2 is a different story. It’s very hard to say that all the steps it requires are important for security. For example, is it really necessary for a NERC entity to check every 35 days to see if a new security patch is available for every software or firmware product (as well as every version of every software or firmware product) that is installed on any Cyber Asset within their ESP?

For some very important products that often issue patches, the answer is yes. However, for other products that seldom if ever issue patches, the answer is probably no. If this requirement and its parts were rewritten as risk-based, NERC entities could save a huge amount of time and money every year, with little impact on security.
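To illustrate what “risk-based” might look like here, a short Python sketch that varies the patch-source check cadence by product. The thresholds, categories, and intervals are my own invention, not anything in CIP-007:

```python
# Hypothetical risk-based patch-source check cadence: products that patch
# frequently or sit on critical systems get checked often; products that
# seldom or never issue patches get a longer interval.
def check_interval_days(patches_last_year: int, on_critical_system: bool) -> int:
    if on_critical_system or patches_last_year >= 12:
        return 35       # the current CIP-007 R2 cadence
    if patches_last_year >= 1:
        return 90
    return 180          # product almost never patches

print(check_interval_days(24, False))  # frequently patched product → 35
print(check_interval_days(2, False))   # occasional patcher → 90
print(check_interval_days(0, False))   # dormant product → 180
```

The point isn’t this particular scheme; it’s that the entity, not the requirement, would decide where the 35-day cadence is actually buying security.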

The heart of CIP-007 R2 is the requirement to apply every security patch issued by every supplier of any software or firmware product installed in the ESP. Why is that important? Of course, it’s important because software “develops” vulnerabilities over time and they (at least in many cases) need to be mitigated.

But what about vulnerabilities for which a patch is never issued? For example, what if your organization uses a custom software product for which there is no third-party “supplier”? Maybe the person who developed the software left the scene long ago; no current staff member has the slightest idea how to make any change to the software, let alone issue a patch for it. Does CIP-007 R2 require you to remove the software and/or rewrite it? No, it says nothing about this situation at all.

Even more importantly, what about an end of life product like Windows XP? I believe there are some devices in use in OT environments that are still running XP. Of course, Microsoft is no longer issuing regular patches for it – they stopped doing that in 2014. If a serious vulnerability appears, your EOL status may turn into SOL, and your entire ESP might be compromised. But don’t worry, you won’t be in violation of CIP-007 R2!

These two examples show clearly that the real risk isn’t not applying a patch, but not mitigating a serious software vulnerability. R2 needs to be replaced with a risk-based requirement for vulnerability management. That would require the NERC entity to:

1.      Track vulnerabilities for the software products installed in their ESP;

2.      Identify those vulnerabilities whose exploitation might lead to serious damage to the BES, if those vulnerabilities were being exploited in the wild (perhaps as indicated by the vulnerability’s presence in CISA’s Known Exploited Vulnerabilities (KEV) catalog);

3.      Make sure the supplier of the vulnerable product plans to issue a patch for the vulnerability soon; and

4.      Apply the patch soon after it becomes available.
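Steps 1 and 2 above can be sketched in a few lines of Python. The product names and CVE IDs below are made up; in practice, the installed-product CVE lists would come from a vulnerability database lookup, and the KEV set from CISA’s published catalog:

```python
# Step 1: vulnerabilities tracked for products installed in the ESP
# (illustrative product names and CVE IDs, not tied to real products).
installed_product_cves = {
    "relay-firmware-9.2": ["CVE-2024-11111", "CVE-2024-22222"],
    "hmi-software-4.0":   ["CVE-2024-33333"],
}

# Step 2: flag the vulnerabilities that appear in the KEV catalog, i.e.
# are known to be exploited in the wild and deserve priority follow-up.
kev_catalog = {"CVE-2024-22222", "CVE-2023-99999"}

priority = {product: [c for c in cves if c in kev_catalog]
            for product, cves in installed_product_cves.items()}

print(priority)
```

Steps 3 and 4 – following up with the supplier and applying the patch – are where the real work is, but this triage is what keeps that work focused on the vulnerabilities that matter.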

Of course, if the supplier of the software product doesn’t issue a patch for the vulnerability, the entity should ideally remove that product from their environment and replace it with a more secure product. Unfortunately, that’s easier said than done, since OT software products are often “one of a kind” and can’t be easily replaced, if at all. In such cases, the best course of action is to take whatever steps are possible to mitigate the increased risk caused by the unpatched vulnerability.

There are other problems with a vulnerability management requirement as well. For example, I’ve learned that suppliers of intelligent devices often don’t report vulnerabilities for their devices. Since almost all vulnerabilities (including almost all CVEs) are reported by the manufacturer of the device or the developer of the software, this means that the first step above will be impossible in the case of many devices: any search of a vulnerability database such as the NVD will identify zero vulnerabilities.

However, the device may be very far from being vulnerability-free, as was the case for the device described in the post I just linked: A search of the NVD for that device (which is mostly used in military and other high assurance environments, by the way) will yield zero applicable CVEs, when in fact a researcher (Tom Pace of NetRise) estimates there are at least 40,000 firmware vulnerabilities in it.

Without much doubt, the biggest reason why automated vulnerability management isn’t possible today is that the US National Vulnerability Database, which up until early this year was the most widely used vulnerability database in the world, has been experiencing serious problems that call into question its continued functioning. I’ve described these problems in many recent posts, including this one.

Until the NVD fixes its problems and restores its credibility (which I strongly doubt will happen anytime soon), vulnerability management will remain what it is now: more of an art than a science. Unfortunately, this means it’s currently impossible to develop a usable vulnerability management requirement for NERC CIP.

Does this mean the NERC CIP community has no choice but to continue to invest big resources in compliance with CIP-007 R2, while realizing sub-optimal returns on its investment? I’m afraid so. It would be nice if someone would draft a SAR (Standards Authorization Request) to replace CIP-007 R2 with a true vulnerability management requirement. However, even if a new requirement were drafted and approved, I don’t think there’s a realistic way to implement it, given the current problems with the NVD and the ongoing problems with the CPE identifier.[i]

Currently, the two big focuses of the CIP standards development effort are virtualization and the cloud. The former process is close to complete, but the latter is just getting started. As the “cloud CIP” effort gathers momentum, it’s likely to suck a lot of the air out of other concerns like CIP-007 R2[ii]. And it’s likely that the cloud effort will take years to be completed.

Of course, if a NERC entity wants to free itself from having to comply with CIP-007 R2 and CIP-010 R1, one way to do this will be to move as many medium and high impact BES Cyber Systems as possible to the cloud, as soon as it becomes “legal” to do so. But if a lot of other NERC entities have the same idea, some would argue that the result will be that the BES is less secure, not more secure. Wouldn’t that be ironic?

Are you a vendor of current or future cloud-based services or software that would like to figure out an appropriate strategy for selling to customers subject to NERC CIP compliance? Or are you a NERC entity that is struggling to understand what your current options are regarding cloud-based software and services? Please drop me an email so we can set up a time to discuss this!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] The OWASP SBOM Forum will soon release a proposal for finally addressing the problems with CPE, by expanding use of the purl identifier to proprietary software (which purl does not currently address). Stay tuned to this blog!

[ii] Since any new “cloud CIP” requirements will need to be completely risk-based, it is likely that CIP-007 R2 will be replaced with a risk-based vulnerability management requirement for use by systems deployed in the cloud. However, it’s also likely that any new CIP standards will have two “branches”: one for use by cloud systems and one for use by on-premises systems. The latter will bear a close resemblance to the current CIP standards, if they aren’t virtually identical. Thus, a prescriptive patch management requirement is likely to remain part of the CIP standards for years, even after the “cloud problem” in CIP is finally fixed.

Thursday, October 3, 2024

Why is FERC so concerned with NERC CIP-013?

Two weeks ago, FERC issued an important Notice of Proposed Rulemaking (NOPR) regarding NERC CIP-013, the CIP standard for supply chain cybersecurity risk management. There was a good discussion of the NOPR in the NERC Supply Chain Working Group meeting on Monday, and yesterday Mike Johnson (author of an excellent blog on NERC CIP that has been on hiatus for a couple of years) put up a post about this.

The NOPR states (in paragraph 2 on page 3), “…we preliminarily find that gaps remain in the SCRM Reliability Standards related to the: (A) sufficiency of responsible entities’ SCRM plans related to the (1) identification of, (2) assessment of, and (3) response to supply chain risks, and (B) applicability of SCRM Reliability Standards to PCAs.”

To address (B) first, PCA stands for Protected Cyber Asset. This is the only type of Cyber Asset subject to NERC CIP compliance that is not already covered by CIP-013-2. It’s certainly not a controversial move to extend CIP-013 to cover PCAs as well, and I agree with their concern.

However, I find (A) quite interesting. Why? Because it seems to just repeat what’s been in CIP-013 from the beginning. FERC ordered NERC to develop a SCRM standard in 2016. CIP-013-1 came into effect on October 1, 2020. CIP-013-1 R1 states that the NERC entity with high and/or medium impact BES Cyber Systems must develop a plan

…to identify and assess cyber security risk(s) to the Bulk Electric System from vendor products or services…

In my experience, most FERC NOPRs that deal with an existing CIP standard start with an affirmation that the standard has met its objective and continue to say that more is needed (of course, adding Protected Cyber Assets to the three Cyber Asset types already addressed in CIP-013, as item B suggests, is an example of that). However, this NOPR seems to be saying, “NERC, we’re not happy with how CIP-013-1[i] has turned out. Not only do we want you to add PCAs to the scope of the standard, but we want you to go back and fix the language of the standard itself.”

The specifics of why FERC is unhappy with the existing Supply Chain standards (which include CIP-013-2, CIP-005-7 and CIP-010-4) are found starting in paragraph 24 on page 21 of the NOPR. These include:

“While providing a baseline of protection, the Reliability Standards do not provide specific requirements as to when and how an entity should identify and assess supply chain risks, nor do the Standards require entities to respond to those risks identified through their SCRM plans.” (paragraph 24)

“A responsible entity’s failure to properly identify and assess supply chain risks could lead to an entity installing vulnerable products and allowing compromise of its systems, “effectively bypassing security controls established by CIP Reliability Standards.” Further, incomplete or inaccurate risk identification may result in entity assessments of the likelihood and potential impact of supply chain risks that do not reflect the actual threat and risk posed to the responsible entity. In the absence of clear criteria, procedures of entities with ad hoc approaches do not include steps to validate the completeness and accuracy of the vendor responses, assess the risks, consider the vendors’ mitigation activities, or respond to any residual risks.” (paragraph 25)

“…a lack of consistency and effectiveness in SCRM plans for evaluating vendors and their supplied equipment and software. While a minority of audited entities had comprehensive vendor risk evaluation processes in place and displayed a consistent application of the risk identification process to each of their vendors, other entities displayed inconsistent and ad hoc vendor risk identification processes. These risk identification processes were typically completed by only using vendor questionnaires. Further, using only vendor questionnaires resulted in inconsistency of the information collected and was limited to only “yes/no” responses regarding the vendors’ security posture.” (paragraph 27)

“…many SCRM plans did not establish procedures to respond to risks once identified.”

The last comment relates to the fact that CIP-013-2 R1.1 (quoted above) only requires NERC entities to develop a plan to “identify and assess” supply chain cybersecurity risks to the Bulk Electric System (BES). It says nothing directly about mitigating those risks. I pointed out in my posts during the runup to enforcement of CIP-013-1 that any NERC entity that developed a CIP-013 R1 plan that didn’t address risk mitigation at all was inviting compliance problems; I doubt any entity actually tried that.

In paragraph 30 on page 26, FERC summarizes their concerns:

In light of these identified gaps, we are concerned that the existing SCRM Reliability Standards lack a detailed and consistent approach for entities to develop adequate SCRM plans related to the (1) identification of, (2) assessment of, and (3) response to supply chain risk.  Specifically, we are concerned that the SCRM Reliability Standards lack clear requirements for when responsible entities should perform risk assessments to identify risks and how those risk assessments should be conducted to properly assess risk.  Further, we are concerned that the Reliability Standards lack any requirement for an entity to respond to supply chain risks once identified and assessed, regardless of severity. 

In my opinion (which dovetails with FERC’s), there were two big problems with most NERC entities’ R1.1 plans. The first is that the plans didn’t try to identify any risks beyond the six that are “required” by items R1.2.1 through R1.2.6. Those six items had been mentioned in various random places in FERC’s Order 829 of July 2016, which ordered development of what became CIP-013. However, they were never intended to constitute the total set of risks that needed to be identified in the NERC entity’s plan. Yet, it seems that many, if not most, NERC entities considered them to be exactly that.

The second problem is that, even though NERC entities took the “assessment” requirement in R1.1 seriously, utilizing supplier questionnaires to accomplish that purpose, the responses to the questions weren’t considered to be measures of individual risks, requiring specific follow-up from the entity. Instead, they were considered to be simply one datapoint (among many) contributing to an answer to the question, “Is this supplier safe to buy from in the first place?”

It’s certainly important to ask whether your supplier is safe to buy from in the first place. But let’s face it: in the OT world, there’s seldom a question of who you will buy from. It’s the supplier you’ve been buying from for the last 30-40 years. Is your utility really going to drop your current relay vendor and move to another one, just because the incumbent gave one unsatisfactory answer out of 50 on a security questionnaire? Certainly not. On the other hand, the unsatisfactory answer needs to be looked at for what it is: an indication that your current supplier poses a degree of risk in the area addressed by that question.

Suppose your questionnaire asked whether the supplier had implemented multifactor authentication for their own remote access system, and that your incumbent relay supplier indicated this was in their budget for FY2026. In my opinion, that’s an unsatisfactory answer. In 2018, a DHS briefing indicated that nation-state attackers had penetrated a large number of suppliers to electric utilities (and through them at least some utilities), often through exploiting weak passwords in remote access systems. Clearly, no supplier to the power industry should still have MFA on their to-do list; it should already be in place.

A NERC entity that receives the above response on a supplier questionnaire should immediately contact the supplier and ask when they will implement MFA. The right answer is “By 5:00 today”, although “By noon tomorrow” might also be acceptable. Whatever deadline the supplier sets, the entity should follow up with them on that day to find out whether they kept their promise. If they did, great. If they didn’t, when will they implement MFA and what price or other concessions are they ready to accept if they don’t make the new date? After all, by not mitigating this risk, they’re transferring the mitigation burden to your organization. Since you will have to implement additional controls due to the supplier’s lax attitude, you need to be compensated for that fact.

The point is that each question in your questionnaire should address a specific risk that you think is important; if you don’t think the risk is important, don’t ask the question – you’re just wasting your and the supplier’s time. But since the risk is important, you need to follow up and make sure it gets mitigated. Presumably, you’ll get the supplier’s attention, and it will get mitigated. But if it isn’t, you will receive at least partial compensation for the additional costs the supplier is imposing on you, since you will probably have to spend money and/or time implementing mitigations.
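Put in code terms, each questionnaire answer should generate its own follow-up item, rather than just feeding an aggregate supplier score. A sketch, with invented questions and answers:

```python
# Each questionnaire item maps to a specific risk; an unsatisfactory
# answer generates a follow-up action for the entity to pursue, rather
# than merely lowering an overall supplier score.
responses = {
    "MFA on remote access": "planned FY2026",
    "Signed firmware updates": "yes",
    "Background checks for developers": "no",
}
SATISFACTORY = {"yes"}

follow_ups = [f"Follow up with supplier: '{q}' answered '{a}'"
              for q, a in responses.items() if a not in SATISFACTORY]

for item in follow_ups:
    print(item)
```

In this invented example, the “planned FY2026” answer on MFA is exactly the kind of response that should trigger the immediate follow-up described above, not a small deduction from a composite score.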

Since this is a Notice of Proposed Rulemaking, FERC concludes this section with a discussion (on pages 26-31) of what changes they might require NERC to make in CIP-013-2 (of course, those changes would be in a new standard CIP-013-3). All of their suggestions are good, although I will provide a more comprehensive discussion of what I propose in a later post. FERC is giving the NERC community until December 2, 2024 to submit comments on this NOPR.

When FERC ordered the supply chain standard in 2016, there had been few true supply chain cybersecurity attacks to point to, although the potential for them was clear. That certainly isn’t true today, when new attacks seem to be reported every day and some of them, such as SolarWinds and the attacks based on log4j, have caused immense damage. FERC is quite right to ring the alarm bell on this.

Are you either a NERC entity or a supplier to NERC entities that is trying to figure out what this NOPR means for your organization? Please drop me an email so we can set up a time to discuss this!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] And presumably CIP-013-2 as well. That standard didn’t change CIP-013-1 much, other than to add EACMS and PACS to the Cyber Asset types in scope for the standard.