Tom Alrich's Blog: October 2023

Friday, October 27, 2023

You’ve got a new VEX format? Great! How will it be used?

Red Hat came out with their new VEX format this week and described it in this blog post (Pete Allor of Red Hat also discussed it in today's OWASP SBOM Forum meeting). You might say that VEX is on a roll lately, with Cisco having announced their VEX format in September. Also, Oracle is publishing VEX documents, but I haven’t seen a description of their format yet. However, it’s clear that all three of these formats differ from each other.

Is it a problem that they’re different formats, all carrying the name VEX? There’s certainly no trademark on the VEX name (any more than there’s a trademark on “SBOM”). Moreover, the only published definition of VEX is in the CISA “VEX Use Cases” document. It reads, “A VEX document is a binding of product information, vulnerability information, and the relevant status details relating them.” In other words, any document that mentions a product, a vulnerability, and the “status details relating them” (with status being undefined, so it could mean almost anything) can be called a VEX document. All these documents, as well as OpenVEX documents and probably anything else that’s ever been called VEX, easily meet that low bar.

However, there’s a much higher bar that needs to be met, and I know of no VEX format in use today that can clear it. That bar includes three parts:

1. There must be a defined use case for the document, and it must be a machine-readable use case. In other words, simply “providing information on vulnerabilities that affect/don’t affect Product A” isn’t a machine-readable use case. If that’s your goal, I can save you a lot of time and effort: You can enter that information into a PDF document and email it to your customers. Only if the document you produce is machine readable and intended to be utilized as part of a larger automated use case – e.g., “allowing customers of Product A Version 2.3 to automatically remove from their list of component vulnerabilities those that have been determined by the supplier not to be exploitable in that product/version”[i] – is it worthwhile to produce VEX documents.

2. There must be a tightly-constrained specification for the format; specifically, it needs to address only one use case[ii]. Even if the VEX document “sits on top of” a more general vulnerability reporting format like CSAF or CycloneDX, that fact alone doesn’t constitute a specification (and by the way, there will never be a single VEX specification applicable to all the different platforms that currently form the base for documents called VEX – CSAF, CycloneDX, and OpenVEX. They are just too diverse. For that matter, there is no single SBOM specification that encompasses both CycloneDX and SPDX SBOMs; they are too diverse as well).

3. The specification needs to be constrained tightly enough that it will be possible to develop two completely interoperable tools. The first is a VEX production tool that allows a supplier to produce a VEX simply by answering a set of questions that require no prior knowledge of the format (e.g., “Which vulnerability/vulnerabilities will be addressed in the document?”).[iii] The second is a VEX consumption tool[iv] that is limited to reading documents produced in the tightly constrained format on which the production tool is based.
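To make the idea of a tightly constrained format concrete, here is a minimal Python sketch. The field names, the question wording, and the two allowed status values are all hypothetical and chosen only for illustration; they are not taken from CSAF, CycloneDX, OpenVEX, or any specification the sub-group may produce. The producer builds a document from answers to plain questions, and the consumer rejects anything outside the constrained shape.

```python
# Hypothetical constrained format: exactly these fields and nothing else.
REQUIRED_FIELDS = {"product", "version", "vulnerability", "status"}
ALLOWED_STATUSES = {"not_affected", "affected"}  # illustrative values only


def consume_vex(doc):
    """Accept only documents in the constrained shape; raise otherwise."""
    if set(doc) != REQUIRED_FIELDS:
        odd = sorted(set(doc) ^ REQUIRED_FIELDS)
        raise ValueError(f"unexpected or missing fields: {odd}")
    if doc["status"] not in ALLOWED_STATUSES:
        raise ValueError(f"status not allowed: {doc['status']!r}")
    return doc


def produce_vex(answers):
    """Build a document from answers to plain-language questions."""
    doc = {
        "product": answers["Which product is this about?"],
        "version": answers["Which version?"],
        "vulnerability": answers["Which vulnerability is addressed?"],
        "status": answers["Is the product affected?"],
    }
    # The producer only emits what the consumer is able to accept.
    return consume_vex(doc)
```

Because both tools share one narrow definition of a valid document, anything the production tool emits is readable by the consumption tool, which is the interoperability property described above.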

As I’ve already advertised, the OWASP SBOM Forum has formed a sub-group, now meeting bi-weekly, that is going to produce a VEX format that meets all three of the above criteria. Our first step is to produce a detailed use case for VEX, which is in progress in this Google Doc. You are invited not only to review the document, but to add comments or suggest changes. No permission is required to do that.

We plan to finalize the use case document within a few weeks and move on to developing the specification. Since we have decided that a CSAF specification is more urgently needed, we will develop that specification first, and one of our members might develop prototype tools to produce and consume VEX documents that follow that specification.

Then, we’ll start on the specification based on the CycloneDX platform, and probably produce tools for that format as well. Since there are already lots of tools that produce and consume documents using the base CycloneDX format (which underlies CDX SBOM and a number of other BOMs, as well as VEX and VDR documents), and since we will have learned a lot in developing the CSAF specification, we expect the CDX VEX tools will be much easier to produce. In fact, there are already tools that produce and read “VEX” documents based on the CDX platform, but none of these are likely to follow the VEX specification we will develop.

What is the use case we will base our VEX specifications on? I’ll be honest that I want it to be the one laid out in the Google Doc, which can be summarized in this quotation from the only document on VEX produced by the NTIA VEX working group:

The primary use cases for VEX are to provide users (e.g., operators, developers, and services providers) additional information on whether a product is impacted by a specific vulnerability in an included component and, if affected, whether there are actions recommended to remediate.

Why do I want this use case? Having attended most VEX working group meetings from the second meeting under NTIA until recently, I can attest that this was the only use case even discussed for at least the first year of the group’s existence. VEX was driven by a few very large software (and device) suppliers that were quite concerned that their help desks would be overwhelmed with calls from frantic customers, once they started receiving SBOMs and looking up the components in the NVD.

The great majority of component vulnerabilities they learn about will be false positives, meaning they only affect the component if tested independently; they won’t affect the product itself. The VEX will be ingested by a vulnerability management tool that first ingests an SBOM and then guides the organization’s response to the vulnerabilities identified through that SBOM. Ideally, the VEX will reduce that list to include only exploitable vulnerabilities.
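The filtering step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data shapes (a list of component/CVE pairs and a dictionary of statuses), the function name, and the identifiers are hypothetical, and a real tool would read an actual SBOM and VEX format rather than these in-memory structures.

```python
def exploitable_only(sbom_vulns, vex_statements):
    """Drop vulnerabilities the supplier has declared not exploitable.

    sbom_vulns: list of (component, cve_id) pairs found by looking up
        the SBOM's components in a vulnerability database.
    vex_statements: dict mapping cve_id -> status for this product/version.
    """
    return [
        (component, cve)
        for component, cve in sbom_vulns
        if vex_statements.get(cve) != "not_affected"
    ]


# Made-up components and identifiers, for illustration only.
found = [("libfoo 1.2", "CVE-0000-0001"), ("libbar 3.4", "CVE-0000-0002")]
vex = {"CVE-0000-0001": "not_affected"}
remaining = exploitable_only(found, vex)
```

Here `remaining` keeps only the `libbar 3.4` entry. Note the design choice: a vulnerability with no VEX statement at all stays on the list, since silence from the supplier is not the same as a statement that the vulnerability is not exploitable.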

It will be years before there are true end-user tools (licensed commercially so the user organization has just one throat to choke when problems arise) for this purpose. However, as discussed in the use case document, third party service providers should quickly arise to address this use case. That is unlikely to happen until there is a single agreed-upon VEX specification, albeit with two variants (CSAF and CycloneDX).

But there’s an even more important reason for this to happen: As I’ve said many times before, there are a small number of “showstopper” problems that are preventing SBOMs from being used to any significant degree by organizations whose primary business isn’t software development. I used to think the naming problem was the number one showstopper, but more recently I’ve come to realize that VEX is number one. I’ve heard a number of suppliers say that they won’t put out SBOMs to their customers with any regularity, until they’re sure they’ll be able to identify for their customers – in a machine-readable fashion – which of the component vulnerabilities they learn about aren’t exploitable in the product itself, and therefore can be safely ignored.

The existence of this tightly constrained spec, as well as at least one service provider that is willing to prototype the service described in the “VEX Use Cases” document, should be enough to at least have a real proof of concept for production and utilization of machine readable SBOM and VEX documents. At that point, it will be much easier to frame the remaining issues. Those issues will have much more to do with practices and processes than with technical specifications. Since those issues will be much more amenable to being addressed through jointly agreed upon guidelines, we should be able to make much more progress toward widespread use of SBOMs themselves at that point. Perhaps there’s light at the end of the tunnel after all!

Any opinions expressed in this blog post are strictly mine and are not necessarily shared by any of the clients of Tom Alrich LLC. If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com.

I lead the OWASP SBOM Forum. If you would like to learn more about what that group does, please go here.

[i] This was the original use case for VEX under the NTIA. However, it was never explicitly articulated, with the result that we find ourselves today in the situation where VEX can be anything and is therefore nothing.

[ii] This isn’t to say there’s only one use case for VEX; there are already a number of them. But it is to say that each of those use cases needs to be addressed in a different specification, and therefore using different tools. If that isn’t done, we immediately get into the situation where it becomes impossible to develop tools, as described in this post. I’m sure that in the future, there will be tools that can create or consume multiple VEX formats; but it’s useless to concern ourselves with them now.

[iii] This is especially important for VEX documents in CSAF, which has a tremendously complex specification. Very few software developers have the resources or time available to learn that spec well enough to create documents on their own, and even fewer end users could learn it well enough to create a useful consumption tool on their own.

[iv] This will always be part of a larger tool, since, as already discussed, simply reading the information in a VEX document doesn’t constitute a use case for VEX. A PDF attached to an email is a much better way to address that use case.

Tuesday, October 24, 2023

Cybersecurity Training for the Utility Workforce

There's a great free cybersecurity training opportunity for electric utilities, sponsored by DoE's CESER office. It's three days of training with top-notch instructors, provided in six different locations. A utility can send as many of its employees as desired. Attendees need to pay travel and meals expenses, but nothing more than that. I was told today that there are lots of open slots available currently.

The full story is here.

Sunday, October 22, 2023

NERC CIP: What is required to implement medium or high impact BCS in the cloud?

Note from Tom: I prepared this document for my discussion with Lew Folkerth of the RF NERC Regional Entity on Monday October 23, at RF's monthly Tech Talk. The Tech Talk is open to anybody and does not require registration. The event runs from 2:00 to 3:30 Eastern Time. Go here to find the agenda and the link to the webinar. RF never records the Tech Talks, so there will be no recording available (the small set of slides I'll use will be posted, though).

My discussion with Lew will start around 2:30 to 2:45, but the previous two speakers are both very good, so I recommend you join the webinar as soon after 2:00 as possible. Note that Lew and I are speaking third, even though I'm listed second in RF's announcement.

If you would like a PDF of this post, please email me at tom@tomalrich.com.

Executive summary

This document is based entirely on the opinions of Tom Alrich, owner of Tom Alrich LLC.

This document describes a possible path by which changes could be made to the NERC CIP standards that would allow high and medium impact BES Cyber Systems (BCS) to be implemented in cloud environments. It is not meant to be an actual blueprint for achieving that purpose, but is more of a “proof of concept” that this is something that could be achieved.

While achieving this goal will require that a Standards Authorization Request (SAR) be prepared and submitted to the NERC Standards Committee, the author believes it is still too early to do that, since any SAR will need to guide the drafting team regarding a feasible path to achieve this goal. Since no FERC order mandates what needs to be done (as has been the case with almost all other new CIP standards or changes to existing standards), the author believes it is up to the NERC community to come to agreement regarding generally how to proceed.

When rough agreement has been achieved, a SAR can then be prepared and submitted to the Standards Committee. When the SAR has been approved and a Standards Drafting Team (SDT) has been seated, that team can go to work on filling in the details of what is required. But the SDT should not be required to come up with the overall plan; that is not the purpose of an SDT.

This document proposes that:

1. There will be two “tracks” for NERC CIP compliance: Track 1 for “On-premises BCS” and Track 2 for “Cloud BCS”. Track 1 will be almost identical to the current CIP-004 to CIP-013 standards. All on-premises BCS will follow this track, meaning there should be few if any changes required to existing CIP compliance programs, for on-premises BCS.

2. Systems implemented in the cloud will follow CIP-003, CIP-013, and a new standard (CIP-015?) that will apply only to medium or high impact BCS implemented in the cloud. This standard will function a lot like CIP-013, in that it will require a NERC entity with medium and/or high impact Cloud BCS to develop and implement a cloud security risk management plan.

3. CIP-003 may be unchanged and will apply to both on-premises and cloud based low impact BCS.

4. CIP-002 will be where the two tracks diverge. Some fairly small changes will allow CIP-002 to define both Track 1 and Track 2.

The new standard will require:

That CSPs be assessed by NERC or a NERC-designated organization every year,
That the NERC entity compare the results of the assessment to the criteria in the entity’s cloud risk management plan, and
That the CSP be assessed on other criteria identified by the NERC Standards Drafting Team, specifically related to how well they have mitigated risks to CSPs that are not now addressed in certification regimes like FedRAMP.

I. What do we mean by “implement BCS in the cloud”?

We envision the following as one possible scenario for implementation of BES Cyber Systems in the cloud, although there can certainly be others as well. Note that the discussion below does not address issues with use of SaaS applications that are in fact BCS in the cloud (e.g., outsourced SCADA). It also does not address the issue of utilizing an EACMS that is hosted in the cloud (as may be the case with a managed security service). Those two cases need to be addressed separately.

1. The NERC entity contracts with a major cloud service provider (CSP) to host some of their BCS.

2. The CSP agrees in the contract to provide to the NERC entity the “Infrastructure as a Service” (IaaS) level of service defined by NIST (currently, we’re restricting “CSP” to refer to an IaaS provider, even though PaaS and SaaS providers are technically CSPs as well).

3. The NERC entity (or a third party engaged by the entity) installs the operating system and software required to implement one or more Cloud BCS on the CSP’s infrastructure.

4. The NERC entity will be responsible for implementing and/or configuring cloud security controls in cases where these are not provided by their contract with the CSP. The entity will also be responsible for all controls – virtual firewalls, etc. – required to protect their own environment in the cloud.

5. The entity will be responsible for any connectivity required between Cloud BCS and other systems or databases in the cloud, and between Cloud BCS and systems outside the cloud. Important considerations include the latency, security, redundancy, and reliability of the connections.

A critical assumption behind this document is that there will be separate “compliance tracks” for on-premises BCS (meaning the set of systems in scope for CIP compliance today) and cloud-based medium and high impact BCS (which are not being implemented in the cloud at all now, to our knowledge). It is not envisioned that a NERC entity will need to do anything different with respect to CIP compliance for its on-premises systems than it needs to do today. Only if the entity wishes to implement one or more medium or high impact BCS in the cloud will it need to implement different CIP compliance processes, and even then, it will only need to implement them with respect to the cloud based BCS.

This does not mean it would not be good in the future to entertain ideas for improvements to the current CIP standards with respect to on-premises systems. However, it does mean that now would be the wrong time to entertain those ideas, since it would greatly delay opening a path to CIP compliance for medium and high impact BCS deployed in the cloud.

II. Why do we have this problem?

No CIP requirement currently forbids deploying medium and high impact BCS in the cloud. In fact, none of the CIP requirements even mentions the cloud. Despite that fact, there have been no reports of a NERC entity deploying medium and/or high impact BCS in the cloud. If a NERC entity were to do this, they would have to demonstrate during an audit that they have remained compliant with every CIP requirement that applies to medium and/or high impact BCS. Here are just three of the items that the NERC entity would have to prove[1]:

· That the CSP performed a background check on any CSP employee who might have just walked by a server that contained a part of a BCS anywhere in the US. This presumably might include every employee of every data center in which some portion of the NERC entity’s BCS resided at any time during the audit period. The background check must be according to the NERC entity’s methodology. If more than one NERC entity implements medium and/or high impact BCS in one CSP’s cloud, the check will need to be performed according to the methodology of each of those entities. In other words, if the CSP has 20 such NERC entity customers, every employee of each affected data center will need to have 20 separate background checks (required by CIP-004 R3).

· That the CSP declared a PSP around every one of their data centers that contains any piece of a medium or high impact BCS during the three-year audit period, and put up card readers (controlled by the utility) to which all their employees must badge in. If the CSP has ten NERC entities with high and/or medium impact Cloud BCS as customers, each one of those utilities will need to have their own card readers on every cloud data center that might house even part of one of their BCS at any time. Every employee of the data center will need to badge in – with a separate badge – to each of those card readers, whenever they enter the building (required by CIP-006 R1).

· That the CSP granted to each NERC entity with high and/or medium impact BCS the ability to control which of the CSP’s employees can enter any data center that houses any part of their BCS. It is possible that each such NERC entity will need to approve every CSP employee’s ability to enter the data center where they work and renew that approval periodically (required by CIP-006 R1).
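The scaling problem in the first two items is simple multiplication, sketched below in Python. The figures of 20 and 10 NERC entities come from the examples above; the employee and data center counts are made-up numbers used only to show how quickly the totals grow.

```python
def background_checks_required(entities, employees_per_data_center, data_centers):
    """One check per employee, per affected data center, per entity methodology."""
    return entities * employees_per_data_center * data_centers


def badge_ins_per_building_entry(entities):
    """Each entity controls its own card reader, so one badge-in per entity."""
    return entities


# 200 employees and 5 data centers are hypothetical values.
checks = background_checks_required(
    entities=20, employees_per_data_center=200, data_centers=5
)
badge_ins = badge_ins_per_building_entry(entities=10)
```

With these illustrative inputs, `checks` comes to 20,000 background checks, and every employee would badge in 10 times each time they enter a building.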

Of course, there is no way that a CSP could ever provide the evidence required for a NERC entity to prove compliance with any of these requirements, or with most of the other NERC CIP requirements and requirement parts that apply to medium and/or high impact BCS. This lack of evidence means a NERC entity that implemented medium or high impact BCS in the cloud might well be found in violation of a large number of CIP requirements and requirement parts, with each day – and each data center – for which there is no evidence counting as a separate violation. Even though the NERC entity might assert that the CSP “more than complies” with the CIP standards, the auditor would point out that, with the NERC CIP requirements, the only meaning of “compliance” is being able to provide evidence that the required actions were taken in every required instance. Anything less than that is non-compliance.[2]

At this point, it should be clear that a lot of work – by NERC, the six NERC Regional Entities, and NERC entities themselves – will be required to make it possible for NERC entities to implement medium and/or high impact BES Cyber Systems in cloud environments. Three separate workstreams are required (they could be engaged in simultaneously). These are described in the three sections below.

III. Workstream one: Changes to the definition of BES Cyber System and to CIP-002

As mentioned earlier, there will be two NERC CIP “compliance tracks”: one for BES Cyber Systems implemented “on-premises” (as is the case with all BES Cyber Systems today) and one for BCS implemented in the cloud. The two tracks will be implemented using two definitions of BCS, as well as some changes to CIP-002.

1. Definition of On-premises BES Cyber System

The definition of On-premises BCS will read something like, “A BES Cyber System is one or more BES Cyber Assets logically grouped by a responsible entity to perform one or more reliability tasks for a NERC functional entity and deployed on the premises of the Responsible Entity.”

2. Identifying and classifying On-premises BES Cyber Systems

The required steps for identifying and classifying on-premises BCS will not be changed from the current steps. Unless re-deployed in a cloud environment, every current BCS will become an on-premises BCS, once the changes to CIP-002 take effect. Briefly (and omitting some steps that are not crucial to this narrative), to identify its medium and high impact BCS, a NERC entity must currently perform the following steps:

1. Identify BES assets (most commonly Control Centers, transmission substations and generating resources, including renewable generation) in its footprint that contain or utilize Cyber Assets, defined as “programmable electronic devices”.

2. Classify these assets[3] into high, medium and low impact, based on the criteria in Attachment 1 of CIP-002.

3. Identify Cyber Assets that meet the definition of BES Cyber Asset (BCA), that are either

a. “Used by and located at” any asset described in Attachment 1, Section 1 (known unofficially as “high impact assets”); or

b. “Associated with” any asset described in Attachment 1, Section 2 (known unofficially as “medium impact assets”).

4. Incorporate each BCA into one or more BES Cyber Systems (BCS). A high impact BCS contains one or more high impact BCAs. A medium impact BCS contains no high impact BCAs and one or more medium impact BCAs.

5. All other BCS are low impact if they are “associated with” any of the asset types listed in Section 3 of Attachment 1. They also must “meet the applicability qualifications in Section 4 - Applicability, part 4.2 – Facilities”, of CIP-002-5.1a. Note that CIP-002 R1.3 says, “a discrete list of low impact BES Cyber Systems is not required”, meaning a NERC entity with low impact BCS should not be required to produce a list of low impact BCS, as they are required to do for medium and high impact BCS.[4]

Note that none of the above steps are explicitly required by CIP-002, but they are all implicitly required by the wording of CIP-002-5.1a R1.1 – R1.3 and by the definitions of Cyber Asset, BES Cyber Asset and BES Cyber System.
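Steps 4 and 5 amount to a small decision rule, sketched here in Python. This is an illustration rather than compliance logic: the function name and the string labels are hypothetical, and the "low" result assumes the BCS meets the Section 3 and applicability conditions described in step 5, which the sketch does not check.

```python
def classify_bcs(bca_impacts):
    """Return the impact rating of a BCS from the ratings of its BCAs.

    bca_impacts: list of "high", "medium", or "low" strings, one per
        BES Cyber Asset grouped into the BES Cyber System.
    """
    if "high" in bca_impacts:
        # A high impact BCS contains one or more high impact BCAs.
        return "high"
    if "medium" in bca_impacts:
        # No high impact BCAs, and at least one medium impact BCA.
        return "medium"
    # Assumes the Section 3 / applicability conditions in step 5 hold.
    return "low"
```

For example, `classify_bcs(["medium", "high"])` returns `"high"`, while `classify_bcs(["low", "medium"])` returns `"medium"`.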

3. New Cloud BCS definition

For high and medium impact BCS to be deployed in the cloud, it is essential that the terms Cyber Asset and BES Cyber Asset not be used in any new CIP requirements; this is because systems in the cloud are not deployed on particular Cyber Assets (i.e., servers). The CSP will never be able to provide compliance evidence showing every Cyber Asset (or Virtual Cyber Asset) on which any component of a cloud BCS was deployed. Rather, BES Cyber System needs to be the fundamental NERC CIP compliance term for systems deployed in the cloud, since that term carries no implicit reference to any device.

However, even though the concepts of Cyber Asset and BES Cyber Asset (BCA) are not required to identify cloud-based BES Cyber Systems, the contents of the BCA definition are required. This is because, in the current CIP ontology, the BCA definition provides the link between the system and the BES. Therefore, the contents of the BCA definition need to be moved into the Cloud BCS definition, which then might read, “A System that if rendered unavailable, degraded, or misused would, within 15 minutes of its required operation, misoperation, or non-operation, adversely impact one or more Facilities, systems, or equipment, which, if destroyed, degraded, or otherwise rendered unavailable when needed, would affect the reliable operation of the Bulk Electric System. Redundancy of affected Facilities, systems, and equipment shall not be considered when determining adverse impact.”

4. Identifying and classifying high and medium impact Cloud BCS

Neither “Cyber Asset” nor “BES Cyber Asset” is used in CIP-002 (or anywhere else in the NERC CIP requirements). Thus, the following minimal changes should be sufficient to allow a NERC entity to identify and classify “Cloud BCS” in CIP-002. If the NERC entity also has on-premises BCS, it will continue to follow the five steps outlined above to identify and classify those systems. Note that, since BCS are always associated with a BES asset described in Attachment 1 of CIP-002, and since whether the BCS is deployed on premises or in the cloud should make no difference for its classification, no big changes are needed for CIP-002 R1 or Attachment 1.

Below are the changes needed to accommodate Cloud BCS.

5. Addition to CIP-002 R1

The following requirement parts should be added to CIP-002 R1 (phrases in red are new compared with the wording of R1.1, R1.2, and R1.3):

1.4. Identify each of the high impact Cloud BES Cyber Systems according to Attachment 1, Section 4, if any, ~~at~~ associated with each asset;

1.5. Identify each of the medium impact Cloud BES Cyber Systems according to Attachment 1, Section 5, if any, ~~at~~ associated with each asset; and

1.6. Identify each asset that ~~contains~~ is associated with a low impact Cloud BES Cyber System according to Attachment 1, Section 6, if any.

6. Changes to CIP-002 Attachment 1

Since sections 1-3 of Attachment 1 deal with classification of high, medium, and low impact BCS respectively (i.e., on premises BCS), there need to be three new sections that deal with classification of high, medium, and low impact Cloud BCS. As already stated, the classification of the BCS has nothing to do with where it is deployed, so none of the “bright line criteria” in Attachment 1 need to change for Cloud BCS.

New Section 4:

The first line of Section 4 should read, “Each Cloud BES Cyber System used by ~~and located~~ at[5] any of the following” (followed by the same list found in Section 1). These Cloud BCS are high impact.

New Section 5:

This section should begin, “Each Cloud BES Cyber System, not included in Section 4 above, associated with any of the following” (followed by the same list found in Section 2). These Cloud BCS are medium impact.

New Section 6:

The section should begin, “Cloud BES Cyber Systems not included in Sections 4 or 5 above that are associated with any of the following assets and that meet the applicability qualifications in Section 4 - Applicability, part 4.2 – Facilities, of this standard” (followed by the same list found in Section 3). These Cloud BCS are low impact.

IV. Workstream two: Changes to the other existing CIP standards

After BCS and Cloud BCS are identified and classified in CIP-002, the “On-premises” CIP track will follow CIP-003 through CIP-013. However, the “Cloud” track will follow CIP-003 and then skip CIP-004 through CIP-012; it will also include CIP-013.

CIP-003 will be in both tracks, although it needs to be modified to acknowledge that there are now both On-premises and Cloud BCS; it is likely that no further changes will be needed. No substantial changes are required because CIP-003 is dedicated to low impact BCS, and it has always been “legal” to locate low impact BCS in the cloud.
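The two tracks can be summarized as a simple lookup, sketched below in Python. The track names and function are hypothetical, and "CIP-015" is only the tentative number floated in the executive summary for the new cloud standard, not an assigned one. CIP-002 comes before both tracks, since that is where they diverge.

```python
# On-premises track: CIP-003 through CIP-013, as today.
ON_PREM_TRACK = [f"CIP-{n:03d}" for n in range(3, 14)]

# Tentative number for the proposed cloud standard (an assumption).
NEW_CLOUD_STANDARD = "CIP-015"

# Cloud track: CIP-003, skip CIP-004 through CIP-012, keep CIP-013,
# and add the new cloud standard.
CLOUD_TRACK = ["CIP-003", "CIP-013", NEW_CLOUD_STANDARD]


def standards_for(track):
    """Return the standards that follow CIP-002 for the given track."""
    tracks = {"on-premises": ON_PREM_TRACK, "cloud": CLOUD_TRACK}
    if track not in tracks:
        raise ValueError(f"unknown track: {track!r}")
    return list(tracks[track])
```

Calling `standards_for("cloud")` returns the three-item cloud list, which makes the point of the proposal visible: the cloud track replaces nine prescriptive standards with one risk-based standard.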

V. Workstream three: New CIP standard(s)

The Cloud track will need to include a new CIP standard (or standards), perhaps numbered CIP-015. The purpose of this standard will be to mitigate risks attendant on implementing medium and high impact BCS in the cloud. Of course, this standard will apply to NERC entities, not to the CSPs. The new standard will be drafted through the normal SAR/SDT process.[6]

One way of envisioning the new standard is to analogize to CIP-013. Like CIP-013, the Cloud BCS standard will be aimed at mitigating vendor risks (in this case, CSP risks), both during procurement of cloud services and during use of those services. The NERC entity’s practices with respect to CSP services (e.g., protection of access to the CSP’s cloud, coming through the NERC entity’s networks) will also be in scope. The new standard will operate by requiring the NERC entity that wishes to implement medium and/or high impact BCS in the cloud to develop a cybersecurity risk management plan (in this case, a Cloud Cybersecurity risk management plan instead of a Supply Chain Cybersecurity risk management plan, as in the case of CIP-013).

In CIP-013, the NERC Entity is required to do the following:

1. Develop a Supply Chain Cybersecurity risk management plan to identify, assess, and mitigate supply chain cybersecurity risks for medium and high impact BCS.

2. Implement that plan.

3. Review the plan at least once every 15 calendar months.
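The 15-month review interval in step 3 is easy to track with a date calculation, sketched here in Python using only the standard library. The function names are hypothetical, and treating the deadline as the same day of the month 15 months later (clamped to the end of shorter months) is an assumption made for illustration; an entity should rely on its own reading of the standard and its auditors' guidance.

```python
import calendar
from datetime import date


def add_months(d, months):
    """Return the date `months` calendar months after `d`.

    The day is clamped to the last day of the target month.
    """
    month_index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def next_review_due(last_review):
    """Latest date for the next plan review, under the assumption above."""
    return add_months(last_review, 15)
```

For example, a plan reviewed on October 27, 2023 would be due for review again by January 27, 2025 under this reading.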

One important deficiency of CIP-013 was that it did not list the areas of risk that must be addressed in the plan. As a result, many NERC entities took the default path of just focusing on the six items (found in R1.2.1 through R1.2.6) that are specified in the standard, even though these were never intended to be the entirety of CIP-013 compliance.

The NERC entity’s Cloud Cybersecurity risk management plan will need to address two broad categories of cloud cybersecurity risk: general IT risks and risks specific to cloud service providers. It is our opinion that auditing the general IT risks is a waste of time if the CSP has a FedRAMP or SOC 2[7] audit report that can be made available for inspection[8]. However, since many NERC entities may request the same audit report(s) from one of the major CSPs for CIP compliance purposes, it would be better if NERC could examine the audit report(s) and disseminate the results of their analysis to NERC entities that request them. It is also possible to have some trusted third party examine the CSPs’ audit reports, rather than NERC itself.

In reviewing the CSPs’ audit reports, NERC would not make a decision whether or not any NERC entity should use a particular CSP. Rather, NERC would summarize the audit report(s) in enough detail for entities to make their own decision whether a CSP had sufficiently mitigated the general IT risks included in the report. The decision regarding which CSP(s) to use (if any), as well as the degree to which they should be used, will be entirely up to the NERC entity.

In addition to general IT cybersecurity risks (which apply to many types of organizations), there are cybersecurity risks that apply almost exclusively to CSPs; these are most likely not addressed in certifications like FedRAMP. A few of these are described below. These and other cloud-specific risks could be listed in the new standard as risks that the NERC entity needs to consider in developing their cloud cybersecurity risk management plan (if the NERC entity decides that one of these risks does not apply in their situation, they can indicate in their plan that this risk does not apply to them. However, they will not be free to ignore any of these risks altogether).

a. Failure to adequately protect BES Cyber System Information (BCSI). Because CIP-004-7 R6 and CIP-011-3 R1, which both take effect on January 1, 2024, were developed specifically to address this risk, the SDT may decide that these two requirements constitute adequate mitigation for BCSI risk. However, the SDT may also decide that additional controls are required, either on the NERC entity’s part or the CSP’s part.

b. Personnel risks. While it is likely that FedRAMP certification addresses most personnel-related general IT risks, it is also likely there are personnel risks specific to cloud providers that are not addressed in FedRAMP at all. For example, Paige Thompson, who breached Capital One after she was fired by her CSP employer, bragged online that she had been able to penetrate 30 other customers of the same CSP, who had made the same mistake in configuring their cloud controls that Capital One made.

Since it is likely that her access was removed from all accounts within a short time after she was fired, this indicates that access removal alone does not guarantee adequate protection for the CSP’s customers, after a knowledgeable employee has been terminated for cause. Additional steps may be needed, and they may not be adequately addressed by FedRAMP.

c. Risks from third parties that sell access to applications running in the CSP’s cloud environment. Third-party cloud “access brokers” do not always have adequate security themselves. This could potentially lead to cloud breaches that can compromise users of those third-party services. This may have been the vehicle by which the SolarWinds attackers were able to penetrate SolarWinds’ network and plant their malware in its software development pipeline.

d. Risks from misconfiguration of an organization’s cloud environment. The major cloud service providers usually operate under a “shared responsibility” model, in which the customer is responsible for most of the security controls in their cloud environment (the CSPs may also charge extra to assist customers who don’t wish to take responsibility for their security in the cloud). Thus, a NERC entity may misconfigure their cloud security controls and be breached as a result.

Some NERC entities may argue that they can only mitigate risks that are “under their control” (as some have argued regarding CIP-013) – meaning that, if a vendor won’t take the steps necessary to secure themselves, the NERC entity can’t do anything about that.

However, important risks must be mitigated. If the cloud provider needs to mitigate a serious risk but refuses to do so, the NERC entity should utilize other means to mitigate the risk, including:

1. Alternative mitigations if available;

2. Escalating help desk tickets;

3. Discussion of the issue in online forums; and

4. Legal action up to and including termination of the contract.

[1] Since there are currently no examples of NERC entities that have implemented medium and/or high impact BCS in the cloud, each of these three statements is speculative. However, it is possible that the current NERC CIP requirements, and the Measures listed for those requirements, would be interpreted to require these actions, as well as similar actions for most of the other CIP requirements and requirement parts, if a NERC entity were to try to implement medium or high impact BCS in the cloud.

[2] Of course, the fact that the NERC entity is found to have a huge number of violations does not necessarily mean they would face a big fine for each violation, although their total fine would almost certainly be huge. The bigger danger for the NERC entity is that they would be required by their state’s Public Utility Commission to stop providing electric power to customers in their service area, due to their massive violations of NERC standards.

[3] Technically, the assets themselves are not classified low, medium, or high impact, but the BCS associated with them are. However, since most of the NERC community finds it much easier to assign the classification to the asset, we have adopted that convention here.

[4] However, there are reports that at least one or two NERC Regional Entities do in fact require a list of low impact BCS from the entities in their Region. Of course, for low impact BCS deployed in the cloud, it would make no sense if the NERC entity did not have a list of them.

[5] Obviously, “located at” won’t work for BCS located in the cloud. On the other hand, removing those two words ignores the fact that the CSO706 SDT inserted them here for a reason: to prevent Cyber Assets in substations (and perhaps generating stations) that are “used by” the high impact control center from becoming high impact themselves. The new “Cloud SDT” should probably discuss this with members of the CSO706 SDT to identify a workable solution to this problem.

[6] The changes to CIP-002 and to the BCS definition will also go through the SAR/SDT process, but these don’t have to be addressed in the same SAR and with the same SDT as the new CIP standard. The changes to CIP-002 and the BCS definition are almost entirely a question of NERC process, while the new CIP standard will be entirely a question of cloud security. The required skill sets for SDT members will be different in the two cases.

Of course, as with all NERC meetings including SDT meetings, the meetings of both drafting teams will be open to any member of the public. This can include cloud security experts, security tool vendors and CSPs, as well as staff members of NERC entities.

[7] SOC II does not provide a set of required controls, but instead assesses whether the organization (in this case the CSP) has properly implemented whatever controls it has decided to follow. Therefore, proper assessment of SOC II audit results will require the CSP to provide documentation of the controls assessed.

[8] Other certifications like ISO 27001 might also be acceptable. This will have to be determined by the team that drafts the new CIP standard.

Thursday, October 19, 2023

Making VEX work

"He who defends everything defends nothing."

Frederick the Great

Steve Springett, leader of the OWASP Dependency Track and CycloneDX projects, led off last Friday’s OWASP SBOM Forum meeting (which didn’t have a fixed topic. This often happens, and as usual led to a better meeting than we could ever have planned) by saying that the biggest reason why suppliers aren’t regularly producing VEX documents for customers is that it’s so expensive to do so.

This surprised me. I knew there are multiple reasons why suppliers aren’t producing VEXes, but I had never even thought about cost as a reason. Since there are already many open source tools for producing and interpreting CycloneDX VEX documents (because the CDX VEX format is based on the same base code that CDX SBOMs, HBOMs, OBOMs, MLBOMs, VDR, etc. are built on), it certainly didn’t seem at first glance that there should be much if any cost to producing VEXes. And since Steve’s day job is ensuring that over 1,000 developers at ServiceNow follow best practices for software security, it certainly seems that one of them might be assigned to producing VEXes on a part-time basis. So, where does the cost come from?

Steve elaborated, saying that the big problem is that nobody knows exactly what VEX is and there is no fixed specification for it in either of the two primary VEX formats, CSAF and CycloneDX (the spec will need to be specific to the format. The formats are very different, and there’s no way to produce a common spec. Even the only SBOM spec – the 7 minimum fields listed in the NTIA Minimum Elements document – kinda sorta applies to both the SPDX and CDX SBOM formats, but it is so minimal that it is literally useless by itself. If you want to produce useful SBOMs, you have to go beyond those 7 fields).

While Steve has no control over CSAF of course, he can certainly find someone to develop a CDX VEX spec. But, until someone can tell him exactly what a VEX is and – just as importantly – what it isn’t, there’s no point in even trying to develop a spec. If you wanted to address all the possible VEX use cases being discussed by the CISA VEX working group, you would have to develop a bloated spec that covered all of these use cases, which would then require complicated, bloated tools to produce and interpret VEXes.

But why can’t somebody tell Steve what a VEX is? Surely, he has friends that know. I’m his friend – he could ask me. However, I’ve already told him that as far as I’m concerned, the term VEX has no meaning anymore, other than that it’s a document that makes statements about the status of vulnerabilities in software products. A good example of this fact is an OpenSSF document that was put in the chat at this week’s CISA VEX meeting. I haven’t read even half of the document and don’t intend to read any more, but just by reading the first 4 or 5 paragraphs, I’ve identified at least 5 separate use cases, all of which the authors consider to be VEX.

You might ask, what’s the problem with spreading a big tent and allowing a very diverse group in? I don’t have a problem with diversity when it comes to human beings, but when it comes to a format for machine-readable documents, it quickly leads to the need for tooling (both to produce and consume the documents) that is hugely time-consuming to develop – which, of course, is exactly what Steve was talking about.

Here's the problem. I’ve never seen it articulated this way, and I would certainly like to hear from anybody who says they see a problem with my logic (and is willing to articulate that problem, of course. Idle sniping is not allowed in comments on my blog posts):

1. The cost of developing a software tool to produce or consume machine-readable documents depends on the number of independent operations that need to be accommodated in the tool. For example, if you develop a tool to add positive integers, but then you decide to incorporate subtraction into the tool, you have doubled the number of independent operations. And if you add to that a requirement to display the result of the operation in red if it is over 100, you have tripled the number of independent operations.

2. It’s important to note that having a bunch of mandatory fields in a document format does not in itself increase the number of operations, as long as those fields just insert text at various places in the document (which is all the same operation, even though the location of the text changes).

3. However, the cost of developing the tool doesn’t go up linearly with the number of independent operations; in other words, the cost doesn’t double when one new operation is added, triple when two new operations are added, etc. Instead, the cost is proportional to the factorial of the number of independent operations. The factorial of X is the number of ways that a group of X independent (not identical) objects can be arranged. 1 factorial (written 1!) equals one. 2 factorial (2!) equals 2 X 1 = 2. 3 factorial equals 3 X 2 X 1 = 6. 4 factorial equals 4 X 3 X 2 X 1 = 24. Etc.

4. Why do the tool costs go up according to the factorial of independent operations? It’s because the developer needs to take account of each possible arrangement of independent operations. To take our example of 3 independent operations, namely addition, subtraction, and a display rule that requires changing the result’s color to red, the tool will have to be able to produce or consume a document that contains those 3 operations in any order. There are 3 choices for the first operation, 2 for the second, and only 1 for the last. Multiplying these together, you get 6 possible arrangements (please check my math. It’s been a long time since Mrs. Clauser’s first grade class went over arithmetic).
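The count of 6 for the three-operation example can be checked by listing every ordering directly. This is only a minimal sketch in Python; the operation labels are made up for illustration and don't come from any VEX format or tool.

```python
from itertools import permutations

# Hypothetical labels for the three operations in the example above.
operations = ["add", "subtract", "show_red_if_over_100"]

# Every distinct ordering of the three operations.
orderings = list(permutations(operations))

for ordering in orderings:
    print(" -> ".join(ordering))

print(f"{len(orderings)} orderings in total")
```

Each printed line is one arrangement of the same three operations, which is exactly what the factorial counts.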

Of course, handling 6 possible combinations of operations doesn’t seem like a huge hurdle for most developers (although it would be for me!). What about when the number is 5? Then it’s 120. How about 10? That’s 3.6 million. And how about 20? That’s 2.4 quintillion, give or take a hundred trillion. You get the idea…the cost of developing a tool to produce or consume a machine-readable document will rapidly escalate as each new independent operation is added.
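Anyone who wants to check these figures can do so in a few lines of Python. This sketch assumes only the claim made above, that the number of arrangements of n independent operations is n factorial; the function name `arrangements` is my own.

```python
import math


def arrangements(n: int) -> int:
    """Number of orderings of n independent operations (n factorial)."""
    return math.factorial(n)


# Print the exact count for each number of operations mentioned above.
for n in (3, 5, 10, 20):
    print(f"{n:>2} operations -> {arrangements(n):,} arrangements")
```

The point isn't the exact figures, but how quickly they grow with each added operation.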

So here’s the question: If we want to develop a tool that creates or consumes VEX documents, how many independent operations does it need to perform? Another way to put that is to ask, “If I were to develop a tool to produce a VEX document from scratch, for a user (e.g., a software supplier) that doesn’t know anything about VEX, how many independent questions would the tool need to ask them, in order to produce a VEX in one of the two formats?” Note there should be no need to ask a question about a text field, since the user can just fill it in themselves.

Let’s start with CycloneDX VEX. I once asked myself how many questions would need to be asked, to produce a CDX VEX document. To answer that question, I looked at the examples contained in the CISA VEX Use Cases document, which is the best document on VEX written so far. In fact, I don’t recommend you even read any of the NTIA or CISA VEX documents other than Use Cases and Status Justifications. The answer was about nine. 9! is 362,880, meaning a tool to produce or consume a CDX VEX document would need to be able to accommodate 362,880 independent use cases. Does that seem like a lot?

Not in comparison to CSAF. If you look at the closest thing to a specification for a CSAF 2.0 VEX, the VEX profile that Thomas Schmidt of the German BSI created (BTW, there was no CSAF 1.0. CVRF, the predecessor to CSAF, was renamed CSAF 2.0 when it came time to develop an update to CVRF 1.0. The OASIS committee that originally developed CVRF was called CSAF, so they named the new version after themselves. Nothing wrong with that, of course), you will think that this is a really simple format. In fact, I believe that the profile only has two or three independent operations. Any field that is mandatory in every VEX document just counts as a single operation, but there are a couple of fields in the CSAF VEX profile that depend on the contents of another field - that's an independent operation.

However, there’s a huge omission in the VEX profile: Every CSAF document requires the “product tree” and “branches” fields. If these were simple text fields, that would be no problem; they would add at most one new operation.

Unfortunately, these two fields add a lot more operations than one. How many do they add? I have never even tried to answer that question, since I have never felt like devoting the week or two (I kid you not) that would probably be required to develop a good understanding of those fields. In order to understand those two fields, you should open the 100-page (or so) CSAF 2.0 specification and start reading at least with Section 2 (Design Considerations); then read everything up to Section 4. At that point, you might have an idea of how many independent operations are required adequately to address all the possibilities in just those two fields. Of course, I have no idea how many operations that is (I can’t even count the number of pages, since they’re not numbered). But I’m sure there are at least 50 independent operations.

What’s 50 factorial? 3.0414093e+64; for comparison purposes, the number of atoms in the universe is between 1e+78 and 1e+82, although I admit I haven’t counted them lately. How many lifetimes of the universe would it take for every software developer who’s ever lived, or will live, to code every one of those operations in a tool? I don’t know that, but I’m sure even that is a big number.
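For the skeptical, here is a rough Python check of the 50 factorial figure. The atom-count bounds are just the commonly quoted range of 10^78 to 10^82, entered here as assumptions; the variable names are my own.

```python
import math

fifty_factorial = math.factorial(50)

# Number of decimal digits in the exact integer value of 50!.
digits = len(str(fifty_factorial))

# Assumed rough bounds for the number of atoms in the universe.
atoms_low = 10**78
atoms_high = 10**82

print(f"50! in scientific notation: {fifty_factorial:.7e}")
print(f"50! has {digits} digits")
print(f"50! is below the low atom estimate: {fifty_factorial < atoms_low}")
```

So 50 factorial is still well short of the atom count, but it is far beyond anything a tool developer could ever enumerate.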

But even if we go back to CycloneDX VEX, with its lowly count of 362,880 possible arrangements, you can see it would be ridiculous even to try to account for all of them in any tool. What needs to be done is to constrain both VEX specs (CDX and CSAF) so they only include 1-3 independent operations each. This will of course mean that some people will be disappointed that their favorite use case can’t be accommodated in our spec; on the other hand, we will make sure we include the use cases in the NTIA VEX One-pager, as well as in this unpublished Google Docs paper, which Allan Friedman drafted and which I still think is the best document (published or unpublished) about VEX.

Once we have the spec developed, toolmakers will be able to begin developing both VEX production and consumption tools – in fact, one of the members of our group, Anthony Harrison of the UK, has already developed proof of concept tools for CSAF production and consumption, so he might modify those to accommodate VEX.

At that point, we will hopefully have a workable VEX spec!

I lead the OWASP SBOM Forum. If you would like to learn more about what that group does, please go here.

Wednesday, October 11, 2023

NERC CIP: Which way to the cloud?

I’m pleased to report that probably the number one issue in the world of NERC CIP – the fact that NERC entities with medium and high impact BES Cyber Systems (BCS) can’t implement them in the cloud today, while remaining compliant with the CIP standards – is now officially “heating up”. In fact, it seems to me there is widespread consensus in the NERC community, including among NERC entities, NERC and Regional Entity staff members, third party software and service providers, and – last but certainly not least – the cloud service providers themselves, that there needs to be a “legal” pathway to the cloud for those NERC entities that feel comfortable implementing some of their medium and high impact systems in the cloud.

I used to be quite pessimistic that this day would ever dawn, since it seemed to me that we would never get the whole NERC CIP community to agree to fundamental changes in the CIP standards, especially given that these would entail major changes in NERC entities’ compliance programs. And guess what? I still believe we’ll never get that agreement, at least for a long time.

Then, what has changed? A Standards Authorization Request (SAR) was submitted to the NERC Standards Committee earlier this year[i], which contained an innovation I hadn’t thought of: having two “compliance tracks” for medium and high impact BCS. One track would be what I call “Classic CIP”. Entities wishing to follow that track could continue complying with the existing CIP standards (although with small modifications to CIP-002); nothing at all about their current compliance programs would need to change.

However, if a NERC entity wishes to locate high and/or medium impact BCS[ii] in the cloud, they would need to follow a new “Cloud CIP” track.[iii] I have recorded my ideas on how this new program would work in a document that will be the basis for a discussion that I will engage in with Lew Folkerth of the RF NERC Regional Entity, at RF’s monthly Tech Talk webinar. It will be held on Monday, October 23 from 2:00 to 3:30 Eastern Time (note 10/22: I mistakenly put 1:00 to 2:30 ET in this post when I originally put it up). It is open to anyone, and no pre-registration is required. The attendance URL is included in the page linked just above.

I will be one of two speakers; I’m not sure whether I’ll be first or second. However, the other speaker is someone I’ve known for a long time, who is (without exaggeration) one of my favorite people: Shon Austin, a CIP auditor with RF (and previously with the SPP Regional Entity). His topic is directly related to mine: He is discussing the project that developed CIP-004-7 and CIP-011-3, which will become enforceable on January 1, 2024. Note that neither my discussion with Lew nor Shon's presentation will be recorded, in accordance with RF's longtime policy.

The changes in these two standards are what will allow BES Cyber System Information to be stored in the cloud (BCSI has also been “prohibited”, with even less justification than BCS) – among other “modern third‐party data storage and analysis systems”. Shon, being an auditor discussing CIP requirements that will be in effect in less than three months, will have to be much more careful than I will – since I’ll discuss my general ideas regarding a way forward. If Shon makes a mistake, it will be painfully obvious in half a year, whereas if I make a mistake, I’ll be retired by the time it becomes obvious - if anybody even remembers what I said then. Some of us have all the luck.

See you there!

[i] I don’t know what the status of that SAR is. My guess is it wasn’t accepted, but that’s OK. Before there can be any serious SAR, there needs to be rough consensus in the NERC CIP community about how to address this problem. There has been almost no serious discussion of the topic so far. Maybe the webinar described in this post will help get that discussion going.

[ii] High and medium impact Electronic Access Control or Monitoring Systems (EACMS) and Physical Access Control Systems (PACS) are also currently prevented from being located in the cloud, due to similar (but not identical) considerations as those preventing high and medium BCS. In fact, it can be argued that the fact that EACMS are not currently allowed in the cloud has significantly degraded the level of BES security from what it could be otherwise; this gap is continually widening, as managed security service providers increasingly move toward providing their services primarily or entirely using cloud-based analysis.

Any plan to permit BCS in the cloud should include a plan to permit EACMS and PACS. In fact, it seems to me that, since addressing EACMS (and perhaps PACS) in the cloud will be much easier than addressing medium and high impact BCS, and since the security return on investment will be higher (at least initially), it would make sense to make EACMS the initial focus of the “Cloud CIP” effort. The icing on the cake is that it’s likely that everything required to make EACMS in the cloud “legal” would also be required as part of the BCS effort. So, the BCS effort won’t be set back much (if at all) by the decision initially to focus on EACMS. Indeed, the SAR that was submitted this year makes that suggestion.

[iii] Entities that wish to implement some of their BCS in the cloud, and other BCS on premises, would need to follow the Classic CIP track for the latter and the Cloud CIP track for the former. It is likely that only a small percentage of entities that decide to take advantage of the cloud for BCS will be able to implement all of them in the cloud; some BCS need to remain local by their very nature.