Thursday, October 10, 2024

A great irony of supply chain cybersecurity

When FERC ordered NERC to develop a supply chain cybersecurity risk management standard in 2016, they listed four areas they wanted that standard to address: (1) software integrity and authenticity; (2) vendor remote access; (3) information system planning; and (4) vendor risk management and procurement controls. When FERC approved CIP-013-1 in 2018 in Order 850, they did so in large part because NERC had encompassed all four of those items in the standard.

The first of those items was addressed in two Requirement Parts: CIP-013-1 Requirement R1 Part R1.2.5 and CIP-010-3 Requirement R1 Part R1.6. FERC summarizes the latter on page 18 with these two sentences:

NERC asserts that the security objective of proposed Requirement R1.6 is to ensure that the software being installed in the BES Cyber System was not modified without the awareness of the software supplier and is not counterfeit. NERC contends that these steps help reduce the likelihood that an attacker could exploit legitimate vendor patch management processes to deliver compromised software updates or patches to a BES Cyber System.

In reading these sentences yesterday, I was struck by a huge irony: This provision is meant to protect against a “poisoned” software update that introduces malware into the system. It accomplishes this purpose by requiring the NERC entity to verify that the update a) was provided by the supplier of the product and not a malicious third party (authenticity), and b) wasn’t modified in some way before or while it was being downloaded (integrity).

Yet, since FERC issued Order 850, what have been probably the two most devastating supply chain cyberattacks anywhere? I’d say they’re the SolarWinds and CrowdStrike attacks (you may want to tell me that CrowdStrike wasn’t actually a cyberattack because it was caused by human error, not malice. However, this is a distinction without a difference, as I pointed out in this post last summer).

Ironically, both attacks were conveyed through software updates. Could a user organization (of any type, whether or not they were subject to NERC CIP compliance) have verified integrity and authenticity before applying the update and prevented the damage? No, for two reasons:

First, both updates were exactly what the developer had created. In the SolarWinds case, the update had been poisoned during the software build process itself, through one of the most sophisticated cyberattacks ever. Since an attack on the build process had seldom been attempted and in any case had never succeeded on any large scale, it would have been quite hard to prevent[i].

What might have prevented the attack was an improvement in SolarWinds’ fundamental security posture, which turned out to be quite deficient. This allowed the attackers to penetrate the development network with relative ease.

In the case of CrowdStrike, the update hadn’t been thoroughly tested, but it hadn’t been modified by any party other than CrowdStrike itself. Both updates would have passed the authenticity and integrity checks with flying colors.

Second, both updates were completely automatic, albeit with the user’s pre-authorization. While neither the SolarWinds nor the CrowdStrike users were forced to accept automatic software updates, I’m sure most of those users trusted the developers completely. They saw no point in spending a lot of time trying to test integrity or authenticity of these updates. Of course, it turns out their trust was misplaced. But without some prior indication that SolarWinds didn’t do basic security very well, or that CrowdStrike didn’t always test its updates adequately before shipping them out, it’s hard to believe many users would have gone through the trouble of trying to verify every update. In fact, I doubt many of them do that now.

It turns out that, practically speaking, verifying integrity and authenticity of software updates wouldn’t have prevented either the SolarWinds or the CrowdStrike incidents, since a) both updates would have easily passed the tests, and b) both vendors were highly trusted by their users (and still are, from all evidence). What would have prevented the two incidents?

Don’t say regulation. I’m sure both vendors have plenty of controls in place now to prevent the same problem from recurring. Regulations are like generals; they’re always good at re-fighting the last war.

What’s needed are controls that can prevent a different problem (of similar magnitude) from occurring. The most important of those controls is imagination. Are there products that will imagine attack scenarios that nobody has thought of before? I doubt there are today, but that might be a good idea for an AI startup.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] Affandicios of the in-toto open source software tool point out that it might have prevented the SolarWinds attack, although that assertion always comes with qualifications about actions the supplier and their customers would need to have taken. While the benefit of taking those actions (or similar ones) today is now much more apparent, that need wasn’t apparent at the time.

Wednesday, October 9, 2024

It’s time to give NERC a break!

 

Last week, I pointed out that FERC, in their recent Notice of Proposed Rulemaking (NOPR), demonstrated they’re not happy with the way that CIP-013-2 (and by extension CIP-013-1) has been implemented by NERC and NERC entities. Although FERC didn’t assign blame for this situation, they made it clear they want it fixed. They’re allowing two months for comment, with a deadline of early December. Early next year, they’ll issue an order requiring that NERC draft a revised standard, which will address the problems they discuss in the NOPR.

The NOPR suggests (at a high level) various changes that FERC is considering ordering in CIP-013-2. I’ve seen a number of FERC NOPRs that deal with existing CIP standards; almost all have essentially said, “We don’t have any problem with your first version of the standard, but now we’re going to have you do something more.” However, in this NOPR, FERC effectively said, “The standard you drafted originally (which remained virtually the same in the second version, except it was expanded to cover EACMS and PACS, as well as BES Cyber Systems) was insufficient. We want you to do better this time. Here are some changes we’re considering requiring you to make in our Final Rule next year.”

If my interpretation is correct and this is FERC’s meaning, I don’t think they are being fair to NERC or to the team that drafted CIP-013-1. Here’s why:

·        In their Order 829 of July 2016, FERC handed the standards drafting team (SDT) an almost impossible task: They had to develop and get approved probably the first supply chain cybersecurity standard outside of the military, which would also be the first completely risk-based NERC standard. Most importantly, they had to do all of this – meaning they wanted it completely approved by NERC and ready for their consideration – in 12 months.

·        All new or revised NERC standards are drafted by a Standards Drafting Team (composed of subject matter experts from NERC entities) and submitted for approval to a Ballot Body composed of NERC entities that choose to participate. The balloting process is very complicated, but approval of any standard requires a supermajority of the ballot body.

·        Usually, new or revised CIP standards have required four ballots for final approval. With each ballot, NERC entities can submit comments on the standards. The SDT is required to respond to all comments. Including the commenting process, each ballot can easily require 3-4 months.

·        Since the comments often explain why an entity has voted no, the SDT scrutinizes them carefully, trying to identify changes that could be made in the draft standard that would increase its chances of approval. Having attended some of the CIP-013 SDT meetings, I know they received a lot of negative comments and made a lot of changes that some observers (including me) thought were “watering down” the requirements of the standard. However, the team members were always keenly aware of the deadline they faced. They had to make some tough choices, to have a chance of meeting that deadline (which they did, of course).

·        After having pushed NERC to meet the one-year deadline, did FERC rush to approve the standard? Well…not exactly. Even though CIP-013-1 was on FERC’s desk by the middle of 2017, they didn’t approve it until more than a year later. There was a reason for that. You may remember there was some sort of upheaval in Washington around the end of 2016 and a lot of people departed their jobs (voluntarily and otherwise). In all of that, FERC lost most of its members and was left with one or two Commissioners, which wasn’t a quorum. That’s why it took them longer to approve CIP-013 (in October 2018) than it took NERC to draft it.

In their new NOPR, FERC states they’re considering imposing a 12-month deadline for NERC to revise the standard, fully approve it, and send it to FERC for their approval. This is a terrible idea, since in that case it’s almost certain the new standard will be no more to FERC’s liking than the current one.  Fortunately, near the end of the NOPR, FERC suggested they would be open to considering an 18-month deadline. I think that’s a great idea!

This will give the SDT time to discuss and submit for a ballot some of the items FERC listed in their NOPR, as well as perhaps some items that the earlier team considered in 2016-2017, but had to remove in the face of strong opposition. I remember a couple of them (although I don’t have time to go back to the original records to verify every detail of this):

1.      It seems obvious that a supply chain security standard should have a definition of “vendor”. Since there is no such definition in the NERC Glossary, the “CIP-013” SDT drafted one. When a new or revised NERC standard requires a new definition, it usually gets balloted along with the standard itself; that happened in this case (I believe it was the first ballot). The definition was solidly voted down. I remember the discussion in an SDT meeting after this happened; the team decided their one-year deadline would be in jeopardy if they kept revising and re-balloting the definition. This is why even today, there’s no NERC Glossary definition of “vendor”.

2.      As originally drafted, Requirement R3 mandated that every 15 months, the NERC entity would review and, where needed, revise the supply chain cybersecurity risk management plan that they developed for Requirement R1. That led to negative comments in the early ballots, which led the SDT to water down R3 to the current language: “Each Responsible Entity shall review and obtain CIP Senior Manager or delegate approval of its supply chain cyber security risk management plan(s) specified in Requirement R1 at least once every 15 calendar months.” In other words, the CIP Senior Manager needs to approve the plan every 15 months. If they don’t even look at it to see what if anything has changed, that’s perfectly fine.

To be honest, I felt (and still feel) that CIP-013-1 was a missed opportunity to develop a risk-based NERC CIP standard that could serve as a model for future risk-based CIP standards. In fact, the NERC community will need such a model, since whatever standards or requirements are developed by the new Project 2023-09 Risk Management for Third-Party Cloud Services drafting team will have to be risk-based: nothing else will work in the cloud.

Fortunately (or unfortunately), the new “cloud” SDT hasn’t even started to consider (except at a very high level) what any new standard will look like, and they won’t be able to do that until next year at the earliest. By that time, FERC will have issued their Final Rule and the CIP-013-3 drafting team should be well into the balloting process. They may have some good advice for the cloud team.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

 

Tuesday, October 8, 2024

NERC CIP-013: Vendor Risk vs. Vendor Risks


In last week’s post regarding FERC’s recent Notice of Proposed Rulemaking (NOPR) for CIP-013-2, the NERC supply chain cybersecurity risk management standard, I noted that FERC is apparently quite concerned that NERC entities aren’t properly performing the three essential elements of any risk management activity:

1.      Identify risks. Of course, in the case of CIP-013, those are supply chain security risks to the Bulk Electric System (BES). They primarily arise through the vendors in the supply chain for intelligent hardware and software that operate or monitor the BES, although they can also arise through the NERC entity itself (for example, does the entity have a policy always to buy from the lowest cost vendor, without being concerned with frou-frou like cybersecurity?).

2.      Assess those risks. The whole point of risk management is that some risks are very serious and need to be addressed as soon as possible, and others are not important and can simply be accepted. The assessment tries to distinguish high from low risks.

3.      Respond to those risks. FERC notes that CIP-013-2 only mentions identifying and assessing risks, but never requires the entity to do anything about them. It’s a sure bet that the order FERC issues sometime after the NOPR comment period ends (in early December) will focus heavily on response.

The NOPR states that FERC proposes to issue an order for NERC to revise CIP-013-2 (i.e., replace it with a new standard, which will be CIP-013-3). Since FERC addresses each of the three risk management elements separately in the NOPR, I’ll discuss them in three separate posts, starting with risk identification in this post.

Probably the most important lesson I took from my experience working with NERC entities during the runup to compliance with CIP-013-1 was the difference between “vendor risk” and “vendor risks”. By vendor risk, I mean the risk posed by the vendor itself. Vendor risk is important when you’re considering whether to buy from a particular vendor at all; for example, when you’re evaluating five vendors with a competitive RFP. In that case, you ideally want to rank all five vendors by their overall level of risk (although that’s hard to do in practice).

In the OT world (i.e., the world that CIP compliance lives in), overall vendor risk isn’t usually a consideration. This is because the decision which vendor to buy (for example) electronic relays from is almost always “the same guys we’ve been buying from for the last 15 years”. Your relay vendor was chosen long ago for technical reasons, and your organization is very familiar with their products. Only a huge screw-up by that vendor would justify even talking to other vendors – and it still isn’t likely you would drop them, unless they had done something horrendous.

There are various services that will sell you risk scores for vendors, which are based on many factors (you can also compute your own scores, of course). For those rare occasions when you might switch vendors (or you’re starting to purchase a new type of product that you’ve never bought before), overall vendor risk scores can be very helpful. But for most OT procurements, vendor risk scores don’t help much. The engineers will make the decision based on functionality. Even if one vendor has an overall risk score that’s higher than another vendor’s, that’s not usually going to sway the decision.

So, why do you need to “identify” risks, if the decision which vendor to buy from was made long ago and won’t change for just one procurement? It’s because the risks that apply to one vendor can change substantially from one procurement to the next.

For example, suppose your organization’s last procurement from Vendor A was a year ago. The vendor hasn’t changed very much, but the environment certainly has. Let’s say that in the past year, the SolarWinds attack happened. Before SolarWinds, your organization never even considered whether a vendor maintained a secure software development environment. This year, you decide you need to ask them questions like

1.      Do you require MFA for access to your development network?

2.      How do you vet your developers?

3.      Do you utilize a product like in-toto (an open source product) to verify the integrity of your development chain?

Or, suppose Vendor A has recently been acquired by Vendor B. Of course, they assure you up and down that they’re still the same Vendor A you’ve known and (mostly) loved for the last ten years. However, you decide that the safe path is to re-assess them before you start a new procurement. Along with the questions you’ve asked them in previous assessments, you add new ones like

A.     How have your security practices changed since you were acquired?

B.     What security certifications does Vendor B have? If you (Vendor A) don’t have one of them, when will you get it?

C.      Will you merge your network with Vendor B’s, and if so, what measures will you take to make sure there’s no degradation of security controls?

All six of the above questions are based on new risks; that is, you have identified these as risks that apply to Vendor A, even though you’re not now even considering whether you want to continue using them. You’ll keep using them, but at the same time you need to identify all the risks that now apply to them, so you can pressure the vendor to mitigate them if needed.

In other words, it isn’t a question of whether Vendor A is now too risky to continue buying from. Rather, it’s a question of what are the new risks that apply to A, that didn’t apply the last time you bought from them? You’re not asking about Vendor A’s overall risk level; you’re asking what new risks they may pose to you, that arose since the last time you assessed them.

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

 

Sunday, October 6, 2024

NERC CIP needs vulnerability management, not patch management


For many years, I’ve believed that the big problem with the NERC CIP cybersecurity standards isn’t that they’re not strong enough to be effective. Au contraire, the problem is that they’re very effective, but that effectiveness comes at a huge cost. In other words, although CIP compliance has protected the North American Bulk Electric System (BES) very effectively, the cost of compliance paid by NERC entities (in both time and dollars) has been huge.

One of the main reasons for this is that there are a few very prescriptive CIP requirements that require the NERC entity to document compliance in a huge number of single instances, usually regarding individual Cyber Assets. My biggest examples for this statement by far have always been CIP-007 R2 for patch management and CIP-010 R1 for configuration management. Both require tracking and documenting a huge number of individual instances of compliance across many individual devices.

But there’s a difference between the two requirements. CIP-010 R1 compliance requires that the NERC entity track every change to every Cyber Asset in scope (mostly BES Cyber Assets that are part of medium or high impact BES Cyber Systems), as well as whether it was authorised, whether the baseline configuration information was updated after the change, etc. All these items are important, but if this requirement (and its requirement parts) were rewritten as a risk-based one, the change wouldn’t impact security very much, yet it would greatly ease the burden of compliance.

CIP-007 R2 is a different story. It’s very hard to say that all the steps it requires are important for security. For example, is it really necessary for a NERC entity to check every 35 days to see if a new security patch is available for every software or firmware product (as well as every version of every software or firmware product) that is installed on any Cyber Asset within their ESP?

For some very important products that often issue patches, the answer is yes. However, for other products that seldom if ever issue patches, the answer is probably no. If this requirement and its parts were rewritten as risk-based, NERC entities could save a huge amount of time and money every year, with little impact on security.

The heart of CIP-007 R2 is the requirement to apply every security patch issued by every supplier of any software or firmware product installed in the ESP. Why is that important? Of course, it’s important because software “develops” vulnerabilities over time and they (at least in many cases) need to be mitigated.

But what about vulnerabilities for which a patch is never issued? For example, what if your organization uses a custom software product for which there is no third-party “supplier”? Maybe the person who developed the software left the scene long ago; no current staff member has the slightest idea how to make any change to the software, let alone issue a patch for it. Does CIP-007 R2 require you to remove the software and/or rewrite it? No, it says nothing about this situation at all.

Even more importantly, what about an end of life product like Windows XP? I believe there are some devices in use in OT environments that are still running XP. Of course, Microsoft is no longer issuing regular patches for it – they stopped doing that in 2014. If a serious vulnerability appears, your EOL status may turn into SOL, and your entire ESP might be compromised. But don’t worry, you won’t be in violation of CIP-007 R2!

These two examples show clearly that the real risk isn’t not applying a patch, but not mitigating a serious software vulnerability. R2 needs to be replaced with a risk-based requirement for vulnerability management. That would require the NERC entity to:

1.      Track vulnerabilities for the software products installed in their ESP;

2.      Identify those vulnerabilities whose exploitation might lead to serious damage to the BES, if those vulnerabilities were being exploited in the wild (perhaps as indicated by the vulnerability’s presence in CISA’s Key Exploited Vulnerabilities catalog);

3.      Make sure the supplier of the vulnerable product plans to issue a patch for the vulnerability soon; and

4.      Apply the patch soon after it becomes available.

Of course, if the supplier of the software product doesn’t issue a patch for the vulnerability, the entity should ideally remove that product from their environment and replace it with a more secure product. Unfortunately, that’s easier said than done, since OT software products are often “one of a kind” and can’t be easily replaced, if at all. In such cases, the best course of action is to take whatever steps are possible to mitigate the increased risk caused by the unpatched vulnerability.

There are other problems with a vulnerability management requirement as well. For example, I’ve learned that suppliers of intelligent devices often don’t report vulnerabilities for their devices. Since almost all vulnerabilities (including almost all CVEs) are reported by the manufacturer of the device or the developer of the software, this means that the first step above will be impossible in the case of many devices: any search of a vulnerability database such as the NVD will identify zero vulnerabilities.

However, the device may be very far from being vulnerability-free, as was the case for the device described in the post I just linked: A search of the NVD for that device (which is mostly used in military and other high assurance environments, by the way) will yield zero applicable CVEs, when in fact a researcher (Tom Pace of NetRise) estimates there are at least 40,000 firmware vulnerabilities in it.

Without much doubt, the biggest reason why automated vulnerability management isn’t possible today is that the US National Vulnerability Database, which up until early this year was the most widely used vulnerability database in the world, has been experiencing serious problems that call into question its continued functioning. I’ve described these problems in many recent posts, including this one.

Until the NVD fixes its problems and restores its credibility (which I strongly doubt will happen anytime soon), vulnerability management will remain what it is now: more of an art than a science. Unfortunately, this means it’s currently impossible to develop a usable vulnerability management requirement for NERC CIP.

Does this mean the NERC CIP community has no choice but to continue to invest big resources in compliance with CIP-007 R2, while realizing sub-optimal returns on its investment? I’m afraid so. It would be nice if someone would draft a SAR (Standards Authorization Request) to replace CIP-007 R2 with a true vulnerability management requirement. However, even if a new requirement were drafted and approved, I don’t think there’s a realistic way to implement it, given the current problems with the NVD and the ongoing problems with the CPE identifier.[i]

Currently, the two big focuses of the CIP standards development effort are virtualization and the cloud. The former process is close to complete, but the latter is just getting started. As the “cloud CIP” effort gathers momentum, it’s likely to suck a lot of the air out of other concerns like CIP-007 R2[ii]. And it’s likely that the cloud effort will take years to be completed.

Of course, if a NERC entity wants to free itself from having to comply with CIP-007 R2 and CIP-010 R1, one way to do this will be to move as many medium and high impact BES Cyber Systems as possible to the cloud, as soon as it becomes “legal” to do so. But if a lot of other NERC entities have the same idea, some would argue that the result will be that the BES is less secure, not more secure. Wouldn’t that be ironic?

Are you a vendor of current or future cloud-based services or software that would like to figure out an appropriate strategy for selling to customers subject to NERC CIP compliance? Or are you a NERC entity that is struggling to understand what your current options are regarding cloud-based software and services? Please drop me an email so we can set up a time to discuss this!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] The OWASP SBOM Forum will soon release a proposal for finally addressing the problems with CPE, by expanding use of the purl identifier to proprietary software (which purl does not currently address). Stay tuned to this blog!

[ii] Since any new “cloud CIP” requirements will need to be completely risk-based, it is likely that CIP-007 R2 will be replaced with a risk-based vulnerability management requirement for use by systems deployed in the cloud. However, it’s also likely that any new CIP standards will have two “branches”: one for use by cloud systems and one for use by on-premises systems. The latter will bear a close resemblance to the current CIP standards, if they aren’t virtually identical. Thus, a prescriptive patch management requirement is likely to remain part of the CIP standards for years, even after the “cloud problem” in CIP is finally fixed.

Thursday, October 3, 2024

Why is FERC so concerned with NERC CIP-013?

Two weeks ago, FERC issued an important Notice of ProposedRulemaking (NOPR) regarding NERC CIP-013, the CIP standard for supply chain cybersecurity risk management. There was a good discussion of the NOPR in the NERC Supply Chain Working Group meeting on Monday, and yesterday Mike Johnson (author of an excellent blog on NERC CIP that has been on hiatus for a couple of years) put up a post about this.

The NOPR states (in paragraph 2 on page 3), “…we preliminarily find that gaps remain in the SCRM Reliability Standards related to the: (A) sufficiency of responsible entities’ SCRM plans related to the (1) identification of, (2) assessment of, and (3) response to supply chain risks, and (B) applicability of SCRM Reliability Standards to PCAs.”

To address (B) first, PCA stands for Protected Cyber Asset. This is the only type of Cyber Asset subject to NERC CIP compliance that is not already covered by CIP-013-2. It’s certainly not a controversial move to extend CIP-013 to cover PCAs as well, and I agree with their concern.

However, I find (A) quite interesting. Why? Because it seems to just repeat what’s been in CIP-013 from the beginning. FERC ordered NERC to develop a SCRM standard in 2016. CIP-013-1 came into effect on October 1, 2020. CIP-013-1 R1 states that the NERC entity with high and/or medium impact BES Cyber Systems must develop a plan

…to identify and assess cyber security risk(s) to the Bulk Electric System from vendor products or services…

In my experience, most FERC NOPRs that deal with an existing CIP standard start with an affirmation that the standard has met its objective and continue to say that more is needed (of course, adding Protected Cyber Assets to the three Cyber Asset types already addressed in CIP-013, as item B suggests, is an example of that). However, this NOPR seems to be saying, “NERC, we’re not happy with how CIP-013-1[i] has turned out. Not only do we want you to add PCAs to the scope of the standard, but we want you to go back and fix the language of the standard itself.”

The specifics of why FERC is unhappy with the existing Supply Chain standards (which include CIP-013-2, CIP-005- 7 and CIP-010-4) are found starting in paragraph 24 on page 21 of the NOPR. These include:

“While providing a baseline of protection, the Reliability Standards do not provide specific requirements as to when and how an entity should identify and assess supply chain risks, nor do the Standards require entities to respond to those risks identified through their SCRM plans.” (paragraph 24)

“A responsible entity’s failure to properly identify and assess supply chain risks could lead to an entity installing vulnerable products and allowing compromise of its systems, “effectively bypassing security controls established by CIP Reliability Standards.” Further, incomplete or inaccurate risk identification may result in entity assessments of the likelihood and potential impact of supply chain risks that do not reflect the actual threat and risk posed to the responsible entity. In the absence of clear criteria, procedures of entities with ad hoc approaches do not include steps to validate the completeness and accuracy of the vendor responses, assess the risks, consider the vendors’ mitigation activities, or respond to any residual risks.” (paragraph 25)

“…a lack of consistency and effectiveness in SCRM plans for evaluating vendors and their supplied equipment and software. While a minority of audited entities had comprehensive vendor risk evaluation processes in place and displayed a consistent application of the risk identification process to each of their vendors, other entities displayed inconsistent and ad hoc vendor risk identification processes. These risk identification processes were typically completed by only using vendor questionnaires. Further, using only vendor questionnaires resulted in inconsistency of the information collected and was limited to only “yes/no” responses regarding the vendors’ security posture.” (paragraph 27)

“…many SCRM plans did not establish procedures to respond to risks once identified.”

The last comment relates to the fact that CIP-013-2 R1.1 (quoted above) only requires NERC entities’ to develop a plan to “identify and assess” supply chain cybersecurity risks to the Bulk Electric System (BES)”. It says nothing directly about mitigating those risks. I pointed out in my posts during the runup to enforcement of CIP-013-1 that any NERC entity that developed a CIP-013 R1 plan that didn’t address risk mitigation at all was inviting compliance problems; I doubt any entity tried to do that, either.

In paragraph 30 on page 26, FERC summarizes their concerns:

In light of these identified gaps, we are concerned that the existing SCRM Reliability Standards lack a detailed and consistent approach for entities to develop adequate SCRM plans related to the (1) identification of, (2) assessment of, and (3) response to supply chain risk.  Specifically, we are concerned that the SCRM Reliability Standards lack clear requirements for when responsible entities should perform risk assessments to identify risks and how those risk assessments should be conducted to properly assess risk.  Further, we are concerned that the Reliability Standards lack any requirement for an entity to respond to supply chain risks once identified and assessed, regardless of severity. 

In my opinion (which dovetails with FERC’s), there were two big problems with most NERC entities’ R1.1 plans. The first is that the plans didn’t try to identify any risks beyond the six that are “required” by items R1.2.1 through R1.2.6. Those six items had been mentioned in various random places in FERC’s Order 829 of July 2016, which ordered development of what became CIP-013. However, they were never intended to constitute the total set of risks that needed to be identified in the NERC entity’s plan. Yet, it seems that many, if not most, NERC entities considered them to be exactly that.

The second problem is that, even though NERC entities took the “assessment” requirement in R1.1 seriously, utilizing supplier questionnaires to accomplish that purpose, the responses to the questions weren’t considered to be measures of individual risks, requiring specific follow-up from the entity. Instead, they were considered to be simply one datapoint (among many) contributing to an answer to the question, “Is this supplier safe to buy from in the first place?”

It's certainly important to ask that whether your supplier is safe to buy from. But let’s face it: In the OT world, there’s seldom a question who you will buy from. It’s the supplier you’ve been buying from for the last 30-40 years. Is your utility really going to drop your current relay vendor and move to another one, just because the incumbent gave one unsatisfactory answer out of 50 on a security questionnaire? Certainly not. On the other hand, the unsatisfactory answer needs to be looked at for what it is: an indication that your current supplier poses a degree of risk in the area addressed by that question.

Suppose your questionnaire asked whether the supplier had implemented multifactor authentication for their own remote access system, and that your incumbent relay supplier indicated this was in their budget for FY2026. In my opinion, that’s an unsatisfactory answer. In 2018, a DHS briefing indicated that nation-state attackers had penetrated a large number of suppliers to electric utilities (and through them at least some utilities), often through exploiting weak passwords in remote access systems. Clearly, no supplier to the power industry should still have MFA on their to-do list; it should already be in place.

A NERC entity that receives the above response on a supplier questionnaire should immediately contact the supplier and ask when they will implement MFA. The right answer is “By 5:00 today”, although “By noon tomorrow” might also be acceptable. Whatever deadline the supplier sets, the entity should follow up with them on that day to find out whether they kept their promise. If they did, great. If they didn’t, when will they implement MFA and what price or other concessions are they ready to accept if they don’t make the new date? After all, by not mitigating this risk, they’re transferring the mitigation burden to your organization. Since you will have to implement additional controls due to the supplier’s lax attitude, you need to be compensated for that fact.

The point is that each question in your questionnaire should address a specific risk that you think is important; if you don’t think the risk is important, don’t ask the question – you’re just wasting your and the supplier’s time. But since the risk is important, you need to follow up and make sure it gets mitigated. Presumably, you’ll get the supplier’s attention, and it will get mitigated. But if it isn’t, you will receive at least partial compensation for the additional costs the supplier is imposing on you, since you will probably have to spend money and/or time implementing mitigations.

Since this is a Notice of Proposed Rulemaking, FERC concludes this section with a discussion (on pages 26-31) of what changes they might require NERC to make in CIP-013-2 (of course, those changes would be in a new standard CIP-013-3). All of their suggestions are good, although I will provide a more comprehensive discussion of what I propose in a later post. FERC is giving the NERC community until December 2, 2024 to submit comments on this NOPR.

When FERC ordered the supply chain standard in 2016, there had been few true supply chain cybersecurity attacks to point to, although the potential for them was clear. That certainly isn’t true today, when new attacks seem to be reported every day and some of them, such as SolarWinds and the attacks based on log4j, have caused immense damage. FERC is quite right to ring the alarm bell on this.

Are you either a NERC entity or a supplier to NERC entities that is trying to figure out what this NOPR means for your organization? Please drop me an email so we can set up a time to discuss this!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] And presumably CIP-013-2 as well. That standard didn’t change CIP-013-1 much, other than to add EACMS and PACS to the Cyber Asset types in scope for the standard.

Wednesday, October 2, 2024

It’s September 30. Do you know where your CVE backlog is?

 

Monday was September 30, the end of the federal fiscal year. That morning, Patrick Garrity of VulnCheck decided to look in on the National Vulnerability Database (NVD), to see whether they had kept the promise they made on May 29. He pointed out in this blog post,

On May 29, NIST announced that it awarded a contract to a third-party, Maryland-based Analygence, to help address the backlog. Its announcement expressed confidence that “this additional support will allow us to return to the processing rates prior to February 2024 within the next few months.” It further stated that it expects the “backlog to be cleared by the end of the fiscal year.”

What is this backlog? It’s the set of “unenriched” CVE Records. You can learn what those are in this post of mine. Briefly, a CVE record describes a newly-identified software vulnerability and includes a textual description of one or more software products (or intelligent devices) in which this vulnerability was found. Of course, it is good to have this information, but another piece of information is essential: a machine-readable identifier for the vulnerable software product(s), called a CPE name.

If the CPE name isn’t present in the CVE Record, a software user who is trying to learn about newly identified vulnerabilities in a software product they operate (or previously identified vulnerabilities in products they recently acquired) will not learn about this vulnerability through an automated search using the CPE name for the product. This is because the search will only identify the vulnerability if the CPE name is included in the CVE record.

In other words, a CVE Record with no CPE name (an unenriched Record) is invisible to automated searches; the only way for the user of a software product to be sure to identify vulnerabilities applicable to their product, if the Record doesn’t have a CPE name, is by doing a “manual” text search.

CVE Records are produced by CVE.org (part of DHS) and forwarded to the NVD (part of NIST, which is part of the Department of Commerce). Until February of this year, NIST almost always “enriched” the record by adding one or more CPE names that correspond to the product(s) described in the text of the record, meaning that in general almost all CVE Records were enriched soon after they were added to the NVD.

However, that has been far from the case this year. In fact, Patrick stated that the number of unenriched 2024 CVE Records as of September 21 was 18,358. This is roughly the number that Bruce Lowenthal of Oracle estimated two weeks ago, which I reported in my post of September 18. The fact that Patrick’s number is lower than Bruce’s is probably not statistically significant.

Patrick went on to post a few more interesting observations:

1.      The 18,358 unenriched CVE records amount to about 73% of 2024 CVEs. In other words, the NVD has only enriched about one quarter of the CVE records it has received this year. Normally, that number doesn’t deviate much from 100%.

2.      46.7% of the vulnerabilities in CISA’s Known Exploited Vulnerabilities (KEV) catalog have yet to be enriched. This is important, because the fact that a vulnerability is on the list means it is currently being exploited. In other words, it would be a good idea to patch that vulnerability as soon as possible. The fact that the NVD doesn’t yet show – at least in response to an automated search -close to one half of the vulnerabilities on the KEV list is, to say the least, disappointing. The NVD was supposedly prioritizing CVEs on the KEV list for enrichment. What happened to that?

3.      On the other hand, only 14% of 2024 CVEs do not now include a CVSS score. This is quite impressive, given that 14% was the number of CVEs with CVSS scores not very long ago. This is because, along with adding CPEs to CVE records, the NVD previously also added CVSS scores; in other words, the CVSS enrichment rate fell almost as much as did the CPE enrichment rate last February. What’s behind this impressive turnaround? I believe it’s CISA’s “CNA Enrichment Recognition List”, which I wrote about in this post recently (as Patrick points out in his post, CISA’s separate Vulnrichment program is also contributing to this improvement).

It's good to see that Patrick is working his beat!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

I co-lead the OWASP SBOM Forum and its Purl Expansion Working Group. These groups work to understand and address issues like what’s discussed in this post; please email me to learn more about what we do or to join us. You can also support our work through easy directed donations to OWASP, a 501(c)(3) nonprofit, which are passed through to the SBOM Forum. Please email me to discuss that, or to receive easy instructions on how to donate online.

My book "Introduction to SBOM and VEX" is available in paperback and Kindle versions! For background on the book and the link to order it, see this post.

 

Friday, September 27, 2024

Can you implement a low impact Control Center in the cloud today and maintain NERC CIP compliance?

In this recent post, I explained why I think low impact BES[i] Cyber Systems (specifically, those LIBCS that are found in low impact Control Centers) can be deployed in the cloud without any CIP compliance implications today. My reason for saying this is I don’t think low impact BCS are even defined in the cloud, because of wording in CIP-002-5.1a. In fact, I think there would not be a CIP compliance obligation if a NERC entity deployed their entire low impact Control Center in the cloud.

However, I got pushback from a former NERC CIP auditor on that post. He thinks my readers who put LIBCS in the cloud, and especially a low impact Control Center in the cloud, will be at compliance risk if they don’t comply with the low impact CIP requirements. He based his statement on the NERC definition of Control Center.

While I didn’t agree with my friend’s reasoning in this matter, I realized I may be making a false analogy between medium/high impact requirements and low impact requirements. I don’t see any way that medium or high impact BCS (let alone Control Centers) can be implemented in the cloud today without a) putting the NERC entity in violation of many CIP requirements, or b) requiring the CSP to break the cloud model by enclosing all of the systems that hold the entity’s BCS within a Physical Security Perimeter and an Electronic Security Perimeter.

My mistake was making the same assumption about the CIP requirements that apply to low impact BCS (LIBCS). It took a long exchange of emails, but my friend – Kevin Perry, retired Chief CIP Auditor of the NERC SPP Regional Entity – finally convinced me that it’s in fact possible for a NERC entity to deploy a low impact Control Center (LICC) in the cloud, and also comply with the same set of CIP requirements that an on premises LICC would need to comply with; more specifically, it will be possible for the CSP to provide the NERC entity the evidence they need to prove compliance with each low impact requirement. Here’s why I now think this is possible:

1.      All the CIP requirements that apply to LIBCS are found in CIP-003-8. Three of them are simple to understand; the CSP will have no problem providing compliance evidence for them:

a.      Requirement R1 mandates that the entity develop security policies.

b.      Requirement R3 requires designation of a CIP Senior Manager.

c.      Requirement R4 prescribes development of a policy for the CIP Senior Manager to delegate their authority in certain circumstances.

2.      However, one of the requirements isn’t as simple to understand. It’s Requirement R2, which states that a Responsible Entity with LIBCS needs to develop and document a plan “that include(s) the sections in Attachment 1.”

3.      Attachment 1 appears on pages 23-25 of CIP-003-8. It consists of five Sections. Despite the terminology that’s used, I believe it’s better to think of each Section as a Requirement Part of CIP-003-8 Requirement R2.

4.      There are four Sections that prescribe policies or practices that the Responsible entity needs to have in place: “Cyber Security Awareness” (Section 1), “Physical Security Controls” (Section 2), “Cyber Security Incident Response” (Section 4), and “Transient Cyber Asset[ii] and Removable Media[iii] Malicious Code Risk Mitigation” (Section 5).

5.      All four of the above are functionally equivalent to policies and practices that the CSP is certain to have in place now, perhaps in the wording of standards the CSP has been certified on, such as ISO 27001[iv]. It will be up to the NERC entity to demonstrate that the CSP’s policies and practices address each of the above requirements, based on evidence provided by the CSP.[v]

CIP-003-8 Requirement 2 Attachment 1 Section 3 is different from all the above requirements in that it’s a technical requirement. A few of the technical requirements that apply to medium and high impact systems, including CIP-007 R2, CIP-010 R1 and CIP-005 R1, are literally impossible for a CSP to comply with, since they require tracking the actual devices (physical and virtual) on which BES Cyber Systems reside – and then providing evidence showing that the literal wording of the requirement was applied to every device on which the BCS resided over the 3-year audit period. Since systems in the cloud move from device to device and datacenter to datacenter all the time, this is impossible.

However, CIP-003-8 Requirement 2 Attachment 1 Section 3 doesn’t apply to individual devices. This is because requiring an inventory of low impact BCS is strictly prohibited by wording in CIP-002 and CIP-003. Section 3 requires the Responsible Entity to inter alia implement electronic access controls that “Permit only necessary inbound and outbound electronic access… for any communications that are... between a low impact BES Cyber System(s) and a Cyber Asset(s) outside the asset containing low impact BES Cyber System(s)…”

Although I’m not an auditor, it seems to me that the CSP will need to provide evidence like the following:

·        A list of all routable protocols used to communicate between any low impact BCS in the Control Center (without identifying which BCS use which protocols) and any system outside of the CSP’s cloud (without identifying those systems, where they’re located or who they belong to).[vi]

·        Documentation showing how the Responsible Entity manages the virtual firewalls to ensure that only necessary inbound and outbound electronic access is permitted (the NERC entity will probably be responsible for this evidence, assuming they are allowed to manage the firewalls).

This evidence seems to me to be very much in the realm of possibility. Thus, it seems it’s possible to implement a low impact Control Center in the cloud and be fully CIP compliant at the same time.

Are you a vendor of current or future cloud-based services or software that would like to figure out an appropriate strategy for selling to customers subject to NERC CIP compliance? Or are you a NERC entity that is struggling to understand what your current options are regarding cloud-based software and services? Please drop me an email so we can set up a time to discuss this!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.


[i] BES stands for Bulk Electric System.

[ii] Transient Cyber Assets are usually laptops of third parties that are used for less than thirty days on the NERC entity’s network.

[iii] These include thumb drives, removable optical drives, etc.

[iv] If you’re wondering why I don’t also mention FedRAMP, I need to point out that the CSP isn’t “certified” as “FedRAMP-compliant”. FedRAMP is a means of approving a federal agency’s use of a third-party service, such as a cloud service. It isn’t meant to provide an overall certification of the service’s security for the private sector, as are ISO 27001 and other certifications.

[v] Since the CSP won’t be able to provide evidence of compliance with the exact wording of any of these requirements, it’s likely that some other NERC ERO document, like a CMEP Practice Guide, may need to be in place to “legalize” all of this. The Regions (including the auditors) need to draft and approve CMEP Practice Guides; that might take 6-9 months. But that’s still a lot better than the approximately six years that I estimate will be required before changes to the CIP standards to “legalize” full cloud use will be in effect (the Standards Drafting Team started that process in August).

[vi] I admit it’s stretching a point to say that “outside of the CSP’s cloud” is the equivalent of “outside the asset containing low impact BES Cyber System(s)”. The latter is really longhand for “low impact asset”. That means one of the six BES asset types listed in CIP-002-5.1a R1, which wasn’t already classified as high or medium impact in CIP-002-5.1a R1 Attachment 1. Of course, the cloud, or even a cloud data center, isn’t one of those six asset types, although I think it could well be considered to be such.

My guess is there will need to be a CMEP Practice Guide to clarify this point. Again, waiting the 6-9 months required to draft the Practice Guide is a lot better than waiting six years for a full rewrite of the CIP standards (although, since I’ve already advocated for two Practice Guides in this blog post - and others, including Guides regarding BCSI and the meaning of “access monitoring” in the EACMS definition, will be required as well - there probably needs to be some organized effort to draft and approve the Guides, with multiple approvals in process at the same time).