Sunday, August 17, 2025

What cloud risks does CIP need to address? (full version)


Note from Tom: I am now putting up all of my full posts on my Substack blog, but only select ones on Blogspot. Even though I put this up as a paywalled post on Substack two days ago, I'm now making the full post available because of its importance for the NERC CIP community. Enjoy this post, but please also subscribe to my Substack blog on a paid ($30 per year) basis. You will be able to read all of my new posts, as well as all 1,200+ “legacy” posts that I originally put up on Blogspot – and which are now also available to paid Substack subscribers.

 

Two days ago, I participated in two lengthy web conversations regarding NERC CIP and the cloud. The first was the bi-weekly meeting of the informal Cloud Technical Advisory Group (CTAG), a group led by Lew Folkerth of RF and Chris Holmquest of SERC. The group discusses how NERC entities can make use of the cloud for systems that are subject to compliance with the CIP standards.

A lot of this group’s discussions are about why there is so little cloud use by NERC entities today. There are two main reasons:

1.      Most use of the cloud by high or medium impact BES Cyber Systems (BCS) is impossible today, because the cloud service provider (CSP) will not be able to provide the NERC entity with the required compliance evidence for the over 100 CIP Requirements and Requirement Parts that apply to high and medium impact BCS.

2.      At least two uses of the cloud for BCS are possible today: low impact Control Centers and medium or high impact BES Cyber System Information (BCSI) used in SaaS applications. However, I know of only a few instances of either use today. I believe this is because neither NERC nor the NERC ERO has provided guidance on how to do this while maintaining CIP compliance.

The second meeting was a normal meeting of the NERC Risk Management for Third Party Cloud Services Standards Drafting Team. Of course, this is very much an official NERC group; it is charged with drafting new and/or revised CIP standards that, once in place, should allow much more extensive use of the cloud by systems subject to CIP compliance than is possible today. This team has been meeting for more than a year, and a few months ago it started putting virtual pen to virtual paper to draft some real requirements.

How long will it take before what I call the “Cloud CIP” standards (i.e., the standards the SDT is drafting) come into effect? The new standards will be in effect once they have a) been drafted (which will take at least 1-2 more years), b) gone through at least four ballots by NERC members (including a two-month comment period after each ballot, plus another month for the SDT to respond to the comments and make changes to the draft standards), c) been approved by the NERC Board of Trustees, d) been submitted to FERC for approval, e) been approved by FERC, most likely more than a year after submission, and f) gone through a 2-3 year implementation period.

With some luck, all the above steps will be completed by…drumroll, please…2031. However, I think even this estimate may be over-optimistic. I say this because I haven’t allowed for changes to the NERC Rules of Procedure (RoP) – and I think they are likely to be needed. Since I don’t believe any new or revised CIP standard has ever required an RoP change, nobody I know can even tell me how this can be accomplished, except that it’s certain the SDT does not have the power to do it themselves. Therefore, I can’t quantify the time required for this step. The best case is that it might be possible to make the RoP changes during the multiyear implementation period for the “Cloud CIP” standards. If that happens, my estimated implementation year would remain 2031.

However, one thing I didn’t allow for in my 2031 estimate is that the SDT might spend a year or two in what turns out to be totally unproductive activity – for example, if the team spends 1-2 years polishing off a set of standards that are subsequently rejected by NERC or FERC for being unenforceable. Were this to happen, the SDT would need to start over from the beginning. This might sound like a joke, but it isn’t – it could very well happen. Of course, this would be a big disappointment, since it would push the implementation date for the new standards to 2033 or later.

This is why I’ve decided to create an online Cloud CIP Working Group whose goal will be to help move up the implementation date for the Cloud CIP standards. This group will be open to all members of the NERC community who are concerned about both the arrival time and the quality of the Cloud CIP standards; this includes NERC entities, NERC ERO staff members, software vendors, CSPs, consultants, etc.

Fortunately, there is a way to accelerate development of the new or revised standards. It is to start performing tasks that the SDT will otherwise need to perform themselves, if they are to produce a quality product. In other words, I propose to have this new group “break a path” for the SDT by working on steps that the SDT will inevitably have to take in the future, if they’re going to produce a worthwhile set of standards.

I want this group to start at the very beginning. The first question that needs to be answered is, “What is the problem to be addressed in the new standards?” Cybersecurity is about risk mitigation. A cybersecurity standard needs to mitigate a certain set of risks. The risks addressed by CIP in general are those that apply to systems used to maintain the reliability of the Bulk Electric System.

The problem that the SDT is addressing is the fact that the current CIP standards, except for three requirements having to do with BCSI that came into effect on January 1, 2024, were written on the assumption that all systems in scope will be on premises. At first glance, the solution to that problem appears to be to rewrite the existing requirements so they take account of the cloud (in fact, I believe this is the approach the SDT is currently taking).

However, that approach overlooks a bigger issue: beyond rewording, the entity will need to provide, for each requirement, evidence that the CSP has complied with it. The CSPs are more than willing to make available their audit reports for ISO 27001, SOC 2 Type 2 and FedRAMP, but they are not willing or able to provide evidence for any requirement that relates to specific devices or network configurations. This is because cloud data is always spread across devices and data centers. To comply with a requirement like CIP-007 R2 Patch Management, the CSP would need to identify every device in any data center that might hold even one part of a BCS during a 3-year audit period; the CSP would then need to apply all of the parts of CIP-007 R2 to every one of those devices.

Of course, this would literally be impossible; but even if it were possible, the CSPs would – rightly – never agree to do it, because the cost of doing so would be astronomical. Moreover, they certainly won’t be able to pass that cost on to their NERC entity customers, who expect to be on the same rate schedule as any other customer.

The cloud business model is based on providing the same services to huge numbers of customers. If 100 NERC entities start requiring compliance information for over 100 NERC CIP Requirements and Requirement Parts (which is the number of requirements in scope for each BES Cyber System or Electronic Access Control or Monitoring System – EACMS – deployed in the cloud), that will be 10,000 pieces of information per system in scope. If each of those entities has 100 BES Cyber Systems deployed in the cloud, that will be one million pieces of compliance information required for just those 100 NERC entities.
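The arithmetic above is easy to verify. Here is a back-of-the-envelope sketch in Python, using the illustrative round numbers from the text ("over 100" is treated as 100):

```python
# Illustrative count of compliance artifacts a CSP would have to produce
# if CIP evidence were required per entity, per requirement, per system.
entities = 100            # NERC entities with cloud-deployed systems
requirements = 100        # CIP Requirements/Parts in scope per BCS or EACMS
systems_per_entity = 100  # cloud-deployed BES Cyber Systems per entity

# Each of the 100 entities needs evidence for each of the 100 requirements,
# so every system in scope generates 10,000 pieces of evidence in aggregate.
pieces_per_system = entities * requirements

# With 100 systems per entity, the total balloons to one million pieces.
total_pieces = pieces_per_system * systems_per_entity

print(pieces_per_system)  # 10000
print(total_pieces)       # 1000000
```

The point of the exercise is simply that per-entity, per-system evidence scales multiplicatively, which is incompatible with a business model built on delivering identical services to all customers.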

This is one reason why any CIP requirement that refers to particular systems or network configurations will need to be changed, if it is to apply to cloud-based systems. Moreover, even requirements that don’t apply to particular systems at all, such as the CIP-008 and CIP-009 requirements, have a similar problem. This is because they require specific deliverables from the CSP – e.g., an incident response plan and a backup and recovery plan, as well as testing of those plans. The CSPs are no more inclined to prepare those plans for individual NERC entities than they are to provide information on individual devices.

However, even if the CSPs were able to provide compliance evidence for the existing CIP requirements, I don’t understand why the drafting team would waste their time making the existing CIP requirements “cloud-friendly”. Almost all of those requirements closely resemble provisions already covered in ISO 27001 and SOC 2 Type 2 certifications, as well as FedRAMP authorizations.

Of course, simply changing the existing CIP requirements won’t work, since those requirements still need to apply to on-premises systems. What’s needed is a separate set of Cloud CIP requirements that apply only to BES Cyber Systems deployed in the cloud (I’ll call them “Cloud BCS”). Each of these would address a risk similar to the one addressed by the “equivalent” on-premises CIP requirement, but it would be based on wording found in a roughly equivalent requirement in ISO 27001, SOC 2 or FedRAMP. If that is done, the only compliance evidence required would be evidence that the CSP’s audit for the standard in question didn’t produce any findings for the requirement in question.

For example, Requirement CIP-004-7 Part 5.1 reads, “A process to initiate removal of an individual’s ability for unescorted physical access and Interactive Remote Access upon a termination action, and complete the removals within 24 hours of the termination action (Removal of the ability for access may be different than deletion, disabling, revocation, or removal of all access rights).” FedRAMP AC-2.h reads:

The organization notifies account managers when accounts are no longer required; when users are terminated or transferred; and when individual information system usage or need-to-know changes.

Of course, this isn’t exactly the same as Part 5.1 – it is both more comprehensive and less so (there are other FedRAMP requirements that also correspond to Part 5.1). However, let’s say the drafting team decides that the wording is close enough that this can be considered a rough equivalent of Requirement CIP-004-7 Part 5.1. The language of the FedRAMP requirement would be the basis for a CIP requirement applicable to Cloud-based BCS. The “Measures” section of the requirement would require evidence of the absence of audit findings for FedRAMP Requirement AC-2.h.
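The mapping exercise just described could be recorded as a simple crosswalk table. The sketch below is purely illustrative: only the CIP-004-7 Part 5.1 → AC-2.h pairing comes from this post, and the data structure and helper function are hypothetical conveniences, not anything the SDT has produced.

```python
# Hypothetical crosswalk from CIP requirement parts to roughly equivalent
# FedRAMP/NIST 800-53 controls. A CIP part may map to several controls,
# since the correspondences are only approximate.
crosswalk = {
    # Access removal within 24 hours of a termination action.
    # AC-2.h (account manager notification on termination) is the rough
    # equivalent discussed in the text; others could be added.
    "CIP-004-7 Part 5.1": ["AC-2.h"],
    # ...further mappings would be filled in by the drafting team.
}

def evidence_needed(cip_part: str) -> str:
    """Describe the audit evidence a NERC entity would point to."""
    controls = crosswalk.get(cip_part)
    if not controls:
        # No mapped control: the requirement would need its own evidence.
        return f"{cip_part}: no mapped control; separate evidence needed"
    return (f"{cip_part}: FedRAMP audit report showing no findings "
            f"for {', '.join(controls)}")

print(evidence_needed("CIP-004-7 Part 5.1"))
```

Under this scheme, the “Measures” section of each cloud requirement reduces to the string produced above: point at the relevant section of the CSP’s existing audit report.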

By using this approach, the SDT will avoid the fate of developing a set of CIP requirements that can’t be audited, since the CSP will never provide evidence of compliance with a requirement that they consider already covered by ISO 27001, SOC 2 Type 2 or FedRAMP. Compliance evidence under the approach I’m suggesting will consist solely of pointing to a particular section of an audit report.

Of course, if a CSP’s evidence is found to comply with a Cloud CIP requirement by one NERC entity, all other NERC entities should reach the same finding. This means it would be silly to have each NERC entity require the CSP to provide the same evidence for each Cloud CIP requirement. Instead, there needs to be a mechanism by which the CSP provides the evidence to NERC (or some third party designated by NERC), which then makes it available to each NERC entity.[i]

Once the SDT drafts “cloud versions” of the existing CIP requirements, will they be finished? Hardly. Those cloud requirements address just one type of risk: the risks already addressed in existing CIP requirements. There are two other types of risk that the SDT (or the group I’m proposing) should examine, to determine which of those risks are important enough to merit their own requirements in Cloud CIP.

The second risk type consists of risks addressed by requirements in ISO 27001, SOC 2 Type 2 and FedRAMP that have no “near-equivalent” CIP requirement, but that are important enough to include in the Cloud CIP requirements. For example, FedRAMP requirement AC-2.k reads, “The organization establishes a process for reissuing shared/group account credentials (if deployed) when individuals are removed from the group.”

This requirement doesn’t match any existing CIP requirement, but our working group (or the SDT) might decide it addresses a source of risk that is important enough to warrant its own CIP requirement (applicable to Cloud BCS, not onsite BCS). Therefore, this requirement could be reworded to be applicable to Cloud BCS. The “Measures” section of the requirement would call for evidence of “no findings” in the FedRAMP audit report.[ii]

The third risk type is by far the most important of the three, since there is no “up front” evidence (such as a certification or authorization) that the CSP has already mitigated risks of this type. This third type consists of risks that apply only to cloud-based systems. Since both ISO 27001 and FedRAMP can apply to both on-premises and cloud-based systems, I believe they don’t include cloud-only risks (although I’m open to correction if someone knows otherwise).

Three cloud-only risks that I’ve identified just by reading the news are:

1. The CSP doesn’t make sure their customers are adequately trained in the security measures required to protect their cloud environment. As I described in this post, Paige Thompson, the woman who single-handedly almost brought down Capital One, was a technical staff member who had recently been fired by a CSP. She got revenge by breaking into the cloud environments of at least 30 customers of that CSP, one of which was Capital One. She bragged online that all those customers had made the same mistake in configuring their security controls; moreover, she said there were many other customers that had made that mistake.

Of course, the CSP shouldn’t be held responsible for every configuration mistake made by a customer. However, if 30+ customers have all made the same mistake, that’s clearly a problem that needs to be addressed by the CSP. And the answer can’t be, “We offered a class for $695 that included discussion of this issue, but they didn’t take it.” The CSP should provide the training for free, if reasonably security-proficient customers are prone to make a serious mistake like this one.[iii]

If this risk were to be addressed in a CIP requirement, the requirement might call for the CSP to explain what they are doing to make sure their customers understand how to securely configure their cloud environment.

2. The CSP didn’t properly vet the security of their third party access providers. The Russian attackers who perpetrated the SolarWinds attack were quite smart. Perhaps the smartest thing they did was to mount a classic supply chain attack: to reach the SolarWinds development environment, they first compromised a third party that sells access to a popular SaaS application used by SolarWinds. Through that third party, they gained access to the cloud environment on which the SaaS application was running (the SaaS app itself was not compromised). From that environment, they launched the entire attack on SolarWinds – without much doubt one of the most sophisticated cyberattacks of all time.

Of course, the solution to this problem is for the platform CSP to tighten security requirements on the third party access brokers. A CIP requirement to address this risk might ask the CSP to describe the security requirements they place on third party access brokers and how they enforce them, as well as whether they have experienced any breaches through these access brokers.

3. Cloud Hopper. This attack was revealed in a Wall Street Journal article[iv] by Rob Barry and Dustin Volz in 2018. What’s most scary about it is that the attackers were able to jump from customer to customer within the clouds of multiple CSPs, and that they used a variety of techniques to penetrate different customers. This shows that your security in the cloud is at least partially dependent on whether your fellow cloud customers also practice good security – i.e., it’s a kind of herd immunity.

Of course, a platform CSP can’t police the general cybersecurity practices of all companies that utilize their cloud. However, the CSP should have measures in place today to detect and counteract attempts to hop from one customer to another. A CIP requirement might ask the CSP to describe (at least in general terms) the measures they have in place, as well as whether those measures have been successful in preventing Cloud Hopper-type breaches.

Of course, the three risks listed above are not the only cloud risks faced by NERC entities! The primary task of the group I want to form will be to review cloud risks identified by various organizations – federal agencies, the military, CSPs themselves, etc. – and decide which of them should be included as the basis for Cloud CIP requirements.

You may have noticed that, when I reached cloud-only risks (the third type of cloud cybersecurity risks), I abandoned my earlier concern that the CSPs won’t be willing to provide answers to unique questions like these. This is because the first two types of risks are already addressed in ISO 27001 and FedRAMP, with which the CSP is presumably already compliant. To provide evidence of compliance for CIP requirements based on those two types of risks, the CSP can simply point the customer to the audit reports.

However, risks of the third type – cloud-only risks – aren’t addressed by those other standards. Therefore, the CSP should feel obligated to provide compliance evidence for CIP requirements based on those risks. Again, since the CSP’s response to any of these requirements will be the same for all NERC entities, NERC, or a third party designated by them, should be the single point of contact for the CSP.

NERC will “audit” each of the CIP cloud-only risk requirements by evaluating the CSP’s response to a question or small set of questions, e.g., “Describe the security requirements you place on third party access brokers and how you enforce them. Have you experienced any security breaches that came through one of these access brokers?” The response will be evaluated by asking and answering the question whether the CSP has adequately mitigated the risk that is the basis for the CIP requirement. If the response has not convinced NERC (or the third party) that the CSP has mitigated that risk, they will need to ask the CSP for more evidence.

The problem with the process I’ve just described is that it’s not currently allowed by anything in the NERC Rules of Procedure. As I’ve already said, I doubt there is any way to implement this process without making RoP changes.

However, in my opinion these cloud-only risks should be the primary focus of evaluation of the CSPs. The fact that the major CSPs have all passed ISO 27001 and SOC 2 Type 2 audits, and have all been authorized for use by federal agencies under FedRAMP, means they don’t present a big problem when it comes to “normal” risks, like lack of patch management or configuration management programs. It’s fine to have Cloud CIP requirements that apply to normal risks, but the only compliance evidence the CSPs should have to provide is what’s in their audit reports for those three compliance regimes (and perhaps others as well).

On the other hand, I strongly doubt that the cloud-only risks I’ve listed (and many more identified by others) are found in any standard compliance regimes today. The working group I want to put together will have one primary responsibility: compile a list of cloud-only risks that are not currently addressed by standard compliance regimes, then decide which of these are important enough to be addressed in the Cloud CIP standards. You’re welcome to participate in this effort if you are with a NERC entity, a vendor of cloud or software services to NERC entities (including SaaS providers and platform CSPs), NERC or the NERC ERO, a consulting organization that provides services based on NERC CIP, or if you’re just a user of electricity.

If you don’t use electricity, you obviously have no stake in what this group will do, so you’re not welcome. On the other hand, if you don’t use electricity, I’d like to know how you’re reading this blog post.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com or comment on this blog’s Substack community chat.



[i] There’s no mechanism today by which NERC can receive audit evidence from a third party and distribute it to individual entities. This is one reason why I think there will need to be changes to the NERC Rules of Procedure. As you’ll see later in this post, this isn’t the only reason why I think that.

[ii] It’s possible that the SDT (or our Cloud CIP Working Group) might decide that, if the CSP has received the appropriate certifications (ISO 27001 and SOC 2 Type 2) and authorization (FedRAMP), there’s no need to consider the other requirements in the certification or authorization, besides those that map to CIP requirements. This is because the certification means the CSP has either received no audit finding on each requirement, or they received a finding but have already mitigated the risk satisfactorily.

Therefore, the SDT and/or the Cloud CIP Working Group might skip this second risk type altogether and address only the third risk type. The third type is much more important than the other two, since risks of the third type presumably are not already addressed by any of the three compliance regimes.

[iii] After my posts on this incident appeared in 2018, Dick Brooks of Reliable Energy Analytics and I talked with someone from that CSP, who had contacted me about those posts. They convinced me that they had already addressed the problem (the meeting took place at least a year after the incident).

[iv] The linked article was originally made open access but may have slipped behind the paywall. Rob Barry gave me a link to a PDF of the article on his personal website; if you can’t access the article itself, email me and I’ll send you that link.

 


What cloud risks does CIP need to address?

Note from Tom: I am now putting up all my posts in my Substack blog, but only select ones in Blogspot. Enjoy this post, but please also subscribe to my Substack blog on a paid ($30 per year) basis, in order to see all of my new posts, as well all 1200+ “legacy” posts that I originally put up in Blogspot, but which are now available to all Substack subscribers.

 

Two days ago, I participated in two lengthy web conversations regarding NERC CIP and the cloud. The first was the bi-weekly meeting of the informal Cloud Technical Advisory Group (CTAG), a group led by Lew Folkerth of RF and Chris Holmquest of SERC. The group discusses how NERC entities can make use of the cloud for systems that are subject to compliance with the CIP standards.

A lot of this group’s discussions are about why there is so little cloud use by NERC entities today. There are two main reasons:

1.      Most use of the cloud by high or medium impact BES Cyber Systems (BCS) is impossible today, because the cloud service provider (CSP) will not be able to provide the NERC entity with the required compliance evidence for the over 100 CIP Requirements and Requirement Parts that apply to high and medium impact BCS.

2.      At least two uses of the cloud for BCS are possible today, including low impact Control Centers and medium or high impact BES Cyber System Information (BCSI) used in SaaS applications. However, I know of only a few instances of these uses today. I believe this is because neither NERC nor the NERC ERO has provided guidance on how to do this, while maintaining CIP compliance.

The second meeting was a normal meeting of the NERC Risk Management for Third Party Cloud Services Standards Drafting Team. Of course, this is very much an official NERC group; they are charged with drafting new and/or revised CIP standards that should, when put in place, allow much more extensive use of the cloud by systems subject to CIP compliance, than is possible today. This team has been meeting for more than a year and a few months ago started putting virtual pen to virtual paper to draft some real requirements.

How long will it take before what I call the “Cloud CIP” standards (i.e., the standards the SDT is drafting) come into effect? The new standards will be in effect once they have a) been drafted (which will take a at least 1-2 more years), b) gone through at least four ballots by NERC members (including a two month comment period after each ballot, plus another month for the SDT to respond to the comments and make changes to the draft standards), c) been approved by the NERC Board of Trustees, d) been submitted to FERC for their approval, e) been approved by FERC, most likely more than a year after they are submitted, and f) gone through a 2-3 year implementation period.

With some luck, all the above steps will be completed by…drumroll, please…2031. However, I think even this estimate may be over-optimistic. I say this because I haven’t allowed for changes to the NERC Rules of Procedure (RoP) – and I think they are likely to be needed. Since I don’t believe any new or revised CIP standard has ever required an RoP change, nobody I know can even tell me how this can be accomplished, except that it’s certain the SDT does not have the power to do it themselves. Therefore, I can’t quantify the time required for this step. The best case is that it might be possible to make the RoP changes during the multiyear implementation period for the “Cloud CIP” standards. If that happens, my estimated implementation year would remain 2031.

However, one thing I didn’t allow for in my 2031 estimate is that the SDT might spend a year or two in what turns out to be totally unproductive activity – for example, if the team spends 1-2 years polishing off a set of standards that are subsequently rejected by NERC or FERC for being unenforceable. Were this to happen the SDT would need to start over from the beginning. This might sound like a joke, but it isn’t – it could very well happen. Of course, this would be a big disappointment, since it would push the implementation date for the new standards to 2033 or later.

This is why I’ve decided to create an online Cloud CIP Working Group whose goal will be to help move up the implementation date for the Cloud CIP standards. This group will be open to all members of the NERC community who are concerned about both the arrival time and the quality of the Cloud CIP standards; this includes NERC entities, NERC ERO staff members, software vendors, CSPs, consultants, etc.

Fortunately, there is a way to accelerate development of the new or revised standards. It is to start performing tasks that the SDT will otherwise need to perform themselves, if they are to produce a quality product. In other words, I propose to have this new group “break a path” for the SDT by working on steps that the SDT will inevitably have to take in the future, if they’re going to produce a worthwhile set of standards.

I want this group to start at the very beginning. The first question that needs to be answered is, “What is the problem to be addressed in the new standards?” Cybersecurity is about risk mitigation. A cybersecurity standard needs to mitigate a certain set of risks. The risks addressed by CIP in general are those that apply to systems used to maintain the reliability of the Bulk Electric System.

The problem that the SDT is addressing is the fact that the current CIP standards, except for three requirements having to do with BCSI that came into effect on January 1, 2024, were written on the assumption that all systems in scope will be on premises. At first glance, the solution to that problem appears to be to rewrite the existing requirements so they take account of the cloud (in fact, I believe this is the approach the SDT is currently taking).

However, that approach doesn’t take account of the fact that, while the wording of the current standards needs to be changed, the bigger issue is that the entity will need to provide, for each requirement, evidence that the CSP has complied with the requirement. The CSPs are more than willing to make available their audit reports for ISO 27001, Soc 2 Type 2 audits, and FedRAMP, but they are not willing or able to provide evidence for any requirement that relates to specific devices or network configurations. This is because cloud data is always spread across devices and data centers. In order to comply with a requirement like CIP-007 R2 Patch Management, the CSP would need to identify each device in any data center that might hold just one part of a BCS during a 3-year audit period; the CSP would need to apply all of the parts of CIP-007 R2 to every one of those devices.

Of course, this would literally be impossible, but even if it were, the CSPs would - rightly – never agree to do it, because the cost of doing so would be astronomical. Moreover, they certainly won’t be able to pass that cost on to their NERC entity customers, who expect to be on the same rate schedule as any other customer.

The cloud business model is based on providing the same services to huge numbers of customers. If 100 NERC entities start requiring compliance information for over 100 NERC CIP Requirements and Requirement Parts (which is the number of requirements in scope for each BES Cyber System or Electronic Access Control or Monitoring System – EACMS – deployed in the cloud), that will be 10,000 pieces of information per system in scope. If each of those entities has 100 BES Cyber Systems deployed in the cloud, that will be one million pieces of compliance information required for just those 100 NERC entities.

This is one reason why any CIP requirement that refers to particular systems or network configurations will need to be changed, if it is to apply to cloud-based systems. Moreover, even requirements that don’t apply to particular systems at all, such as the CIP-008 and CIP-009 requirements, have a similar problem. This is because they require specific deliverables from the CSP – e.g., an incident response plan and a backup and recovery plan, as well as testing of those plans. The CSPs are no more inclined to prepare those plans for individual NERC entities than they are to provide information on individual devices.

However, even if the CSPs were able to provide compliance evidence for the existing CIP requirements, I don’t understand why the drafting team would even waste their time making the existing CIP requirements “cloud-friendly”. This is because almost all those requirements are like provisions that are already covered in ISO 27001 and Soc 2 Type 2 certifications, as well as FedRAMP authorizations.

Of course, changing the existing CIP requirements won’t work, since those requirements still need to apply to on-premises systems. What’s needed is a separate set of Cloud CIP requirements that just apply to BES Cyber Systems deployed in the cloud (I’ll call them “Cloud BCS”). Each of these would be intended to address a similar risk to the “equivalent” on-premises CIP requirement, but it would be based on wording found in a roughly equivalent requirement in 27001, Soc 2 or FedRAMP. If that is done, the only compliance evidence required would evidence that the CSP’s audit for the standard in question didn’t produce any findings for the requirement in question.

For example, Requirement CIP-004-7 Part 5.1 reads, “A process to initiate removal of an individual’s ability for unescorted physical access and Interactive Remote Access upon a termination action, and complete the removals within 24 hours of the termination action (Removal of the ability for access may be different than deletion, disabling, revocation, or removal of all access rights).” FedRAMP AC-2.h reads:

The organization notifies account managers when accounts are no longer required; when users are terminated or transferred; and when individual information system usage or need-to-know changes.

Of course, this isn’t exactly the same as Part 5.1; it is broader in some respects and narrower in others (there are other FedRAMP requirements that also correspond to Part 5.1). However, let’s say the drafting team decides that the wording is close enough that this can be considered a rough equivalent of Requirement CIP-004-7 Part 5.1. The language of the FedRAMP requirement would then be the basis for a CIP requirement applicable to Cloud BCS. The “Measures” section of the requirement would require evidence of the absence of audit findings for FedRAMP Requirement AC-2.h.
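The evidence mechanism just described can be sketched as a simple lookup: each Cloud CIP requirement maps to the certification control whose audit result serves as its compliance evidence. This is only an illustration; the data structure and function names are hypothetical, and the only mapping taken from the post is the CIP-004-7 Part 5.1 / AC-2.h pairing above.

```python
# Hypothetical sketch of the "audit evidence as compliance evidence" idea.
# The mapping structure and names are illustrative, not from any NERC document.

# Each cloud CIP requirement points at the compliance regime and control
# whose audit result would serve as its evidence.
CLOUD_CIP_EVIDENCE_MAP = {
    "CIP-004-7 R5.1": {"regime": "FedRAMP", "control": "AC-2.h"},
    # ...further mappings would be added as the SDT identifies rough equivalents
}

def requirement_satisfied(requirement: str, audit_findings: set[str]) -> bool:
    """A requirement passes if the mapped control produced no audit findings."""
    control = CLOUD_CIP_EVIDENCE_MAP[requirement]["control"]
    return control not in audit_findings

# A clean audit report (no findings against AC-2.h) satisfies the requirement;
# a finding against AC-2.h means the CSP must supply further evidence.
```

The point of the sketch is that compliance evidence reduces to a lookup in an existing audit report, rather than bespoke evidence produced for each NERC entity.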

By using this approach, the SDT will avoid developing a set of CIP requirements that can’t be audited, since the CSP will never provide evidence of compliance with a requirement that they consider to be already covered in ISO 27001, SOC 2 Type 2, or FedRAMP. Compliance evidence under the approach I’m suggesting will consist solely of pointing to a particular section in an audit report.

Of course, if a CSP’s evidence is found to comply with a Cloud CIP requirement by one NERC entity, all other NERC entities should have the same finding. This means it would be silly to have each NERC entity require the CSP to provide the same evidence for each Cloud CIP requirement. Instead, there needs to be a mechanism in which the CSP provides the evidence to NERC (or some third party designated by them), which then makes it available to each NERC entity.[i]

Once the SDT drafts “cloud versions” of the existing CIP requirements, will they be finished? Hardly. Those cloud requirements are based on just one type of risk: risks addressed in existing CIP requirements. There are two other types of risk that the SDT (or the group I’m proposing) should examine, to determine which of those risks are important enough to merit their own requirements in Cloud CIP.

The second risk type consists of requirements in ISO 27001, SOC 2 Type 2 and FedRAMP that don’t have “near-equivalent” CIP requirements, but that are important enough to include in the Cloud CIP requirements. For example, FedRAMP requirement AC-2.k reads, “The organization establishes a process for reissuing shared/group account credentials (if deployed) when individuals are removed from the group.”

This requirement doesn’t match any existing CIP requirement, but our working group (or the SDT) might decide it addresses a source of risk that is important enough to warrant its own CIP requirement (applicable to Cloud BCS, not onsite BCS). Therefore, this requirement could be reworded to be applicable to Cloud BCS. The “Measures” section of the requirement would call for evidence of “no findings” in the FedRAMP audit report.[ii]

The third risk type is by far the most important of the three, since there is no “up front” evidence (such as a certification or authorization) that the CSP has already mitigated risks of this type. This third type consists of risks that only apply to cloud-based systems. Since both ISO 27001 and FedRAMP can apply to both on-premises and cloud-based systems, I believe they don’t include cloud-only risks (although I’m open to correction if someone knows otherwise).

Three cloud-only risks that I’ve identified just by reading the news are:

1. The CSP doesn’t make sure their customers are adequately trained in the security measures required to protect their cloud environment. As I described in this post, Paige Thompson, the woman who single-handedly almost brought down Capital One, was a technical staff member who had recently been fired by a CSP. She got revenge by breaking into the cloud environments of at least 30 customers of that CSP, one of which was Capital One. She bragged online that all those customers had made the same mistake in configuring their security controls; moreover, she said there were many other customers that made that mistake.

Of course, the CSP shouldn’t be held responsible for every configuration mistake made by a customer. However, if 30+ customers have all made the same mistake, that’s clearly a problem that needs to be addressed by the CSP. And the answer can’t be, “We offered a class for $695 that included discussion of this issue, but they didn’t take it.” The CSP should provide the training for free, if reasonably security-proficient customers are prone to make a serious mistake like this one.[iii]

If this risk were to be addressed in a CIP requirement, the requirement might call for the CSP to explain what they are doing to make sure their customers understand how to securely configure their cloud environment.

2. The CSP didn’t properly vet the security of third parties that sell access to their platform. The Russian attackers who perpetrated the SolarWinds attack were quite smart. Perhaps their smartest move was a classic supply chain attack: they attacked the SolarWinds development environment by first compromising a popular SaaS application used by SolarWinds. They did that by compromising a third party that sells access to that SaaS application. Through that third party, they gained access to the cloud environment on which the SaaS application was running (the SaaS app itself was not compromised). From that environment, they launched the entire attack on SolarWinds, which was without much doubt one of the most sophisticated cyberattacks of all time.

Of course, the solution to this problem is for the platform CSP to tighten security requirements on the third party access brokers. A CIP requirement to address this risk might ask the CSP to describe the security requirements they place on third party access brokers and how they enforce them, as well as whether they have experienced any breaches through these access brokers.

3. Cloud Hopper. This attack was revealed in a Wall Street Journal article[iv] by Rob Barry and Dustin Volz in 2018. What’s most scary about it is that the attackers were able to jump from customer to customer within the clouds of multiple CSPs, and that they used a variety of techniques to penetrate different customers. This shows that your security in the cloud is at least partially dependent on whether your fellow cloud customers also practice good security – i.e., it’s a kind of herd immunity.

Of course, a platform CSP can’t police the general cybersecurity practices of all companies that utilize their cloud. However, the CSP should have measures in place today to detect and counteract attempts to hop from one customer to another. A CIP requirement might ask the CSP to describe (at least in general terms) the measures they have in place, as well as whether those measures have been successful in preventing Cloud Hopper-type breaches.

Of course, the three risks listed above are not the only cloud risks faced by NERC entities! The primary task of the group I want to form will be to review cloud risks identified by various organizations – federal agencies, the military, CSPs themselves, etc. – and decide which of them should be included as the basis for Cloud CIP requirements.

You may have noticed that, when I reached cloud-only risks (the third type of cloud cybersecurity risks), I abandoned my earlier concern that the CSPs won’t be willing to provide answers to unique questions like these. This is because the first two types of risks are already addressed in ISO 27001 and FedRAMP, with which the CSP is presumably already compliant. To provide evidence of compliance for CIP requirements based on those two types of risks, the CSP can simply point the customer to the audit reports.

However, risks of the third type – cloud-only risks – aren’t addressed by those other standards. Therefore, the CSP should feel obligated to provide compliance evidence for CIP requirements based on those risks. Again, since the CSP’s response to any of these requirements will be the same for all NERC entities, NERC, or a third party designated by them, should be the single point of contact for the CSP.

NERC will “audit” each of the CIP cloud-only risk requirements by evaluating the CSP’s response to a question or small set of questions, e.g., “Describe the security requirements you place on third party access brokers and how you enforce them. Have you experienced any security breaches that came through one of these access brokers?” The response will be evaluated by asking whether the CSP has adequately mitigated the risk that is the basis for the CIP requirement. If the response has not convinced NERC (or the third party) that the CSP has mitigated that risk, they will need to ask the CSP for more evidence.

The problem with the process I’ve just described is that it’s not currently allowed by anything in the NERC Rules of Procedure. As I’ve already said, I doubt there is any way to implement this process without making RoP changes.

However, in my opinion these cloud-only risks should be the primary focus of evaluation of the CSPs. The fact that the major CSPs have all passed ISO 27001 and SOC 2 Type 2 audits, and have all been authorized for use by federal agencies under FedRAMP, means they don’t present a big problem when it comes to “normal” risks, like lack of patch management or configuration management programs. It’s fine to have Cloud CIP requirements that apply to normal risks, but the only compliance evidence the CSPs should have to provide is what’s in their audit reports for those three compliance regimes (and perhaps others as well).

On the other hand, I strongly doubt that the cloud-only risks I’ve listed (and many more identified by others) are found in any standard compliance regimes today. The working group I want to put together will have one primary responsibility: compile a list of cloud-only risks that are not currently addressed by standard compliance regimes, then decide which of these are important enough to be addressed in the Cloud CIP standards. You’re welcome to participate in this effort if you are with a NERC entity, a vendor of cloud or software services to NERC entities (including SaaS providers and platform CSPs), NERC or the NERC ERO, a consulting organization that provides services based on NERC CIP, or if you’re just a user of electricity.

If you don’t use electricity, you obviously have no stake in what this group will do, so you’re not welcome. On the other hand, if you don’t use electricity, I’d like to know how you’re reading this blog post.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com or comment on my free Substack community chat.


[i] There’s no mechanism today by which NERC can receive audit evidence from a third party and distribute it to individual entities. This is one reason why I think there will need to be changes to the NERC Rules of Procedure. As you’ll see later in this post, this isn’t the only reason why I think that.

[ii] It’s possible that the SDT (or our Cloud CIP Working Group) might decide that, if the CSP has received the appropriate certifications (ISO 27001 and SOC 2 Type 2) and authorization (FedRAMP), there’s no need to consider the other requirements in the certification or authorization, besides those that map to CIP requirements. This is because the certification means the CSP has either received no audit finding on each requirement, or they received a finding but have already mitigated the risk satisfactorily.

Therefore, the SDT and/or the Cloud Security Working Group might just skip this second risk type altogether and address the third risk type. The third type is much more important than the first two, since risks of the third type presumably are not already addressed by any of the three compliance regimes.

[iii] After my posts on this incident appeared in 2018, Dick Brooks of Reliable Energy Analytics and I talked with someone from that CSP, who had contacted me about those posts. They convinced me that the CSP had already addressed the problem (the meeting was at least a year after the incident).

[iv] The linked article was originally made open access but may have slipped behind the paywall. Rob Barry gave me a link to a PDF of the article on his personal website; if you can’t access the article itself, email me and I’ll send you that link.

 

Saturday, August 9, 2025

CISA affirms they support the CVE Program. Is that good or bad news?

Note from Tom: As of August 11, my new posts will only be available to paid subscribers on Substack. Subscriptions cost $30 per year (or $5 per month); anyone who can’t afford to pay that should email me, since I want everyone to be able to read the posts. To have uninterrupted access to my new posts, please open a paid Substack subscription or upgrade your free Substack subscription to a paid one.

Last Thursday, at the Black Hat conference in Las Vegas, two CISA officials committed to “supporting the MITRE-backed Common Vulnerabilities and Exposures Program, just months after it faced a near complete lapse in funding” (quoting from Nextgov/FCW). Given that someone at CISA almost cut off funding for the program in April (although others tried, very unconvincingly, to deny this was anything more than an administrative glitch), it was good to hear this.

A MITRE[i] official set off this firestorm with his letter to the CVE Board members on April 15. The letter stated that the contract wasn’t going to be renewed and the program would be cancelled. However, this was followed shortly afterwards by an announcement that a group of CVE Board members (and others) were already putting together the framework (and funding) for a privately run nonprofit organization called the CVE Foundation. Over the next few weeks, the group proceeded to fill in many of the details of their story (this effort had been ongoing for a few months, but it hadn’t been announced previously; at the time, there didn’t seem to be any need to rush the announcement).

The Foundation is an international effort, which already – from what I hear – has more than enough funding promised for them to take over the MITRE contract when it comes up for renewal next March (the funding will come from both private and government sources, although I’m guessing that the US government isn’t currently supporting it). However, they intend to be much more than an “In case of emergency, break glass” option if CISA doesn’t renew the contract (which I still think is very likely, no matter what the two gentlemen – neither of whom has been at CISA very long – said at Black Hat).

The CVE Foundation was founded (and is led) by a few CVE Board members who have been involved with the CVE Program since its early days and who have taken part in the numerous discussions about how the program can be improved. The Foundation is now led by Pete Allor, former Director of Product Security for Red Hat, who has been heavily involved with the CVE Program since 1999 and remains an active Board member.

While the CVE Program, in my opinion, has done an exceptional job and continues to do so, the fact is that government-run programs almost without exception are hampered by the constraints imposed by the same bureaucracy that often makes government agencies a stable, not-terribly-challenging place to work. That is, they don’t exactly welcome new, innovative ideas and they make it hard to get anything done in what most of us consider a reasonable amount of time.

This week, one well-regarded person who has worked with the CVE Program for 10-15 years and is a longtime Board member wrote on an email thread for one of the CVE working groups that he was happy to be part of the CVE Foundation from now on. He wrote that, while he enjoyed working with the CVE Program, “…we measure progress in months and years instead of weeks.” Like others, he has many ideas for improvements that can be made to the program, but hasn’t seen it make much progress in implementing them so far. I’m sure he’s quite happy to have the chance to have a serious discussion about these and other changes, assuming the CVE Foundation is placed in charge of the CVE Program.

However, if CISA somehow remains in control of the CVE Program (i.e., the contract remains with them), it will be a very different picture. I don’t think CISA ever had a big role in the operation of the program (beyond having one or two people on the CVE Board and of course paying MITRE under their contract). Moreover, CISA is unlikely to take a big role if it remains as the funder of the program.

If CISA retains control of the contract, MITRE will remain in day-to-day charge of the program. As I said, I think MITRE has done a good job so far, but like any government contractor, they must adhere strictly to the terms of their contract. If someone comes up with a great new idea that requires more money, or even just re-deploying people from what they’re doing now, the only thing that can be done is put it on the to-do list for the next contract negotiation.

My guess is that, when MITRE’s contract comes up for negotiation next year, the CVE Foundation will take it over from CISA; it’s hard to imagine that, given the huge personnel cuts that are being executed now in the agency, there will be a big effort to retain control of a contract that costs CISA around $47 million a year.

There’s also no question that the CVE Foundation will write their own contract with MITRE. It will require MITRE staff members to do the day-to-day work of the CVE Program, but it will give the Foundation a big role in determining its priorities. Frankly, I think the MITRE people – who are all quite smart, at least the ones I’ve worked with – will be just as happy as anyone else to see the program achieve more of its potential than it does now.

I also think the CVE Foundation will try to resolve some serious problems with the current CVE Program. Doing that has been put off so far, because the problems are very difficult to fix. For example, up until about ten years ago, MITRE created all new CVE records. That meant that CVE Records were fairly consistent, but as the number of new records increased every year, MITRE simply couldn’t keep up with the new workload.

At that point, the CVE Program moved to a “federated” approach, in which CVE Numbering Authorities (CNAs) were appointed. These included some of the largest software developers, who report vulnerabilities in their own software as well as vulnerabilities in the products of other developers within their “scope”. Today, there are 463 CNAs of many types (including GitHub, ENISA, JPCERT/CC and the Linux Foundation).

Of course, it’s good that so many organizations have volunteered to become CNAs; the problem is that this has led to huge inconsistencies in CVE records. For example, a lot of CNAs don’t include CVSS scores or CPE names in the new records they create[ii]; the CVE Program (i.e., MITRE staff members) has been reluctant to press them to do this. If CISA had made this problem a priority, they could have addressed it during contract negotiations with MITRE.

So, I see good things ahead for the CVE Program. However, that requires moving MITRE’s contract from CISA to the CVE Foundation next March. I confess I don’t want this to happen next March; I want it to happen tomorrow.

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com, or even better, sign up as a free subscriber to this blog’s Substack community chat and make your comment there.


[i] MITRE is a nonprofit operator of Federally Funded Research and Development Centers (FFRDCs); it has operated the CVE Program on behalf of DHS since the program’s inception in 1999 (CISA came into being six years ago). The idea for CVE came from MITRE researchers.

[ii] Many CNAs will tell you that the National Vulnerability Database (NVD) had a longstanding policy of creating CVSS scores and CPE names itself and adding them to the record; in fact, if the CNA created either of these items, the NVD would discard what the CNA created and substitute its own. Fortunately, the NVD now has a new leader. Hopefully, that will lead to a lot of change there; it’s sorely needed.

Friday, August 8, 2025

One of many good reasons to fix the cloud problem in NERC CIP


Note from Tom: As of August 11, all but a few of my new posts will only be available on Substack to paid subscribers. Subscriptions cost $30 per year (or $5 per month); anyone who can’t afford to pay that should email me, since I want everyone to be able to read the posts. To have uninterrupted access to my new posts, please open a paid Substack subscription or upgrade your free Substack subscription to a paid one. 

On Wednesday evening, Microsoft and CISA announced a “high-severity vulnerability” that affects on-premises versions of Exchange. The vulnerability also affects the Entra cloud-based authentication system.

I won’t discuss the details of the vulnerability, since they’re not important for this post. What is important is the fact that this high-severity vulnerability only affects the on-premises version of Exchange, not the cloud version (Exchange Online). Of course, since it’s on-premises, users have to a) see the patch availability notification, b) locate and download the patch, and c) apply the patch, to fix the vulnerability. None of these steps are hard, but since human beings miss emails or forget to follow up on them, leave on vacation without performing all 1,963 items on their to-do list, etc., it’s certain that some users won’t have the patch applied even a year from now.

This is a reminder of one of the biggest reasons for using the cloud (especially SaaS applications in the cloud): The CSP just needs to apply a patch once, for all their users to be protected. The users don’t necessarily need to be told about the patch, although they should be informed for peace of mind.

Of course, this is one of many reasons why it’s important that the “Cloud CIP” problem be solved as soon as possible, so that full use of the cloud will be possible for NERC entities with medium and high impact CIP environments. Fortunately, I think the solution is right around the corner in…2031.

What, you say it’s unacceptable that we need to wait so long for the solution? If it will make you feel better, I’ll point out that it’s possible that 1) the current Standards Drafting Team will produce their first draft of the new standards sometime next year, 2) that it will take just a year for the standards to be debated and balloted at least four times by the NERC ballot body (I believe this has historically been the minimum number of ballots required to pass any major change to the CIP standards), 3) that it will be approved in six months by FERC, and 4) that the ballot body will agree to a one-year implementation period.

If all of these things come to pass, and with a helping of good luck, the new and/or revised CIP standards will be in place in mid-2029. You might think even that is slow, but I can assure you it’s lightning-fast by NERC standards; it took five and a half years for the last major change to CIP – CIP version 5 – to go through these same steps. To be honest, I consider the above to be a wildly over-optimistic scenario. In fact, I think that, if the required processes are all followed, even the 2031 target may be over-optimistic.

What can be done to shorten this time period? There is an “In case of emergency, break glass” provision in the NERC Rules of Procedure that might be used to speed up the whole process. However, it would require a well-thought-out plan of action that will need to be approved by the NERC Board of Trustees. I doubt they’re even thinking about this now.

The important thing to remember here is that there are some influential NERC entities that not only swear they will never use the cloud (on either their IT or OT sides), but they also are opposed to use of the cloud by any NERC entity – even though they know they won’t be required to use the cloud themselves.

Another thing to remember: Unlike almost any other change in the CIP standards, FERC didn’t order this one. This means they might take a long time to approve the new standards (I believe it took FERC at least a year and a half to approve CIP version 1); it also means they might order a number of changes. These changes would be included in version 2 of the “Cloud CIP” standards, which would appear 2-3 years after approval of the version 1 standards. FERC could also remand the v1 standards and send NERC back to the drawing board. However, since one or two FERC staff members are closely monitoring the standards development process, that is unlikely.

The danger is that, if the standards development process is rushed and the standards are watered down to get the required supermajority approval by the NERC ballot body, what comes out in the end won’t address the real risks posed by use of the cloud by medium and high impact CIP environments. In fact, this is what happened with CIP-013-1: It didn’t address most of the major supply chain security risks for critical infrastructure. The fault in that case was FERC’s, since they gave NERC only one year to draft and approve the new standard - which was one of the first supply chain security standards outside of the military.

This is why FERC put out a new Notice of Proposed Rulemaking (NOPR) last fall. Essentially, it said, “We admit we should never have approved CIP-013-1 mostly as is. Now we intend to rectify that error.” The NOPR suggested a few changes, but its main purpose was to request suggestions for improving the standard by early December 2024. I thought that, once that deadline had passed, FERC would quickly come out with a new NOPR – or even an Order – that laid out what changes they want to see in CIP-013-3 (CIP-013-2 is the current version, although its only changes were adding EACMS and PACS to the scope of CIP-013-1). However, as my sixth grade teacher often said, “You thought wrong.” There’s been nary a peep from FERC on this topic since December. In my opinion, a revised CIP-013 is still very much needed.

So, I hope the current SDT doesn’t feel rushed to put out a first draft of the new or revised standard(s) they’re going to propose. Just like for on-premises systems, there are big risks for systems deployed in the cloud – and few of them are the same as risks that apply to on-premises systems. It’s those cloud-only risks that need to be addressed in the new standards. There’s more to be said about this topic, coming soon to a blog near you. 

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com, or even better, sign up as a free subscriber to this blog’s Substack community chat and make your comment there.

Wednesday, August 6, 2025

AI is already powering half the US economy. And that’s only half the story.

 

Note from Tom: Since 2013, I’ve been publishing “Tom Alrich’s blog” on Blogspot. I’m now publishing my posts in this Substack blog, named “Tom Alrich’s blog, too”. I’m posting for free on Substack now, but after August 11, new posts on Substack will only be available to paid subscribers. A subscription to this blog costs $30 per year (or $5 per month); anyone who can’t pay that should email me. To have uninterrupted access to my new posts, please open a paid account on Substack, or upgrade your free account to a paid one. There are lots of good posts to come!

 

My latest post, which was based mostly on last Saturday’s column by Greg Ip of the Wall Street Journal, described three negative societal impacts of the massive AI buildout that is going on:

1.      Investment in other tech areas besides AI is being squeezed because of the huge amounts that companies like Microsoft and Meta are spending on the AI rollout (Microsoft alone is likely to spend $80 Bn this year, mostly on new data centers. I’ve been told they’re opening a new data center almost every day).

2.      The huge amounts of cash being spent on the AI buildout are starting to raise interest rates. Given the minuscule revenues now coming in to the big AI players, they need to finance a lot of the buildout, whether by borrowing from banks or the bond market, or by diverting cash from other revenue streams (e.g., I’m sure revenue from Facebook finances at least some of Meta’s AI buildout). If anything, this trend will accelerate; for example, Microsoft is likely to spend over $100 billion on AI next year.

What’s the third negative societal impact? While Greg didn’t mention this in his column, I wrote in a blog post last year that the huge power needs of AI data centers are causing more and more electric utilities to postpone retirement of coal plants. Of course, this will damage our (i.e., humanity’s) ability to combat climate change.

However, I noted at the end of my latest post that my next post would talk about the benefits of AI. That goal has been aided by two new newspaper articles, one in the Wall Street Journal (this time not by Greg Ip) and the other in the Washington Post. Both articles discuss huge economic benefits that are accruing to the US today, due to the current AI boom.

The fact that these are accruing today is important, since Greg Ip’s column had spoken of AI’s benefits as coming far in the future. This isn’t a contradiction, because Greg discusses capital markets; his big concern in this article is whether the stock market is justified in its apparent belief that the huge AI buildouts will return concomitant benefits in a reasonable time frame (say, 5-10 years). He is clearly skeptical that this will happen; he thinks the full benefits to the companies doing the buildouts won’t arrive for 10-15 years.

On the other hand, both the WaPo article and the new WSJ article point out that just about half of the growth projected for the US economy this year will be due to the AI buildout, since most of that money stays in the US. For example, lots of people are employed in that buildout (at decent wages, hopefully); those people eat at restaurants, buy clothes for their kids, buy new TVs, etc. I don’t know how often in the past a single industry has accounted for half of GDP growth - other than in World War II, when I’m sure the military was the dominant industry (for example, a lot of factories that made cars, planes, etc. were converted to wartime production).

Of course, a lot of the chips, motherboards, and pieces of furniture those people are installing are manufactured overseas. Will these expenditures result in an overstatement of the GDP benefits of the buildout? No. In fact, the result is just the opposite: imports are subtracted from GDP, so the imported hardware doesn’t count toward GDP at all. The half-of-growth figure therefore reflects only the domestic labor and domestically produced goods involved in the buildout, net of everything imported. This makes the fact that the AI buildout accounts for half of GDP growth even more impressive.
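The national-accounts arithmetic behind that point can be sketched with made-up numbers. The expenditure formula GDP = C + I + G + X − M is standard; the dollar figures below are purely illustrative.

```python
# Expenditure approach to GDP; all figures are illustrative (in $ billions).
def gdp(consumption, investment, government, exports, imports):
    return consumption + investment + government + exports - imports

before = gdp(1000, 200, 300, 150, 100)

# Suppose a data-center buildout adds $100B of investment, of which $40B is
# imported chips and hardware. The $40B shows up in investment but is also
# subtracted as imports, so only the $60B of domestic spending raises GDP.
after = gdp(1000, 200 + 100, 300, 150, 100 + 40)

print(after - before)  # 60: only the domestic portion counts toward GDP
```

In other words, the imported share of the buildout nets out to zero, so the buildout’s GDP contribution is measured entirely in domestic activity.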

To quote the article,

“The AI complex seems to be carrying the economy on its back now,” said Callie Cox, a market strategist with investment firm Ritholtz Wealth Management. “In a healthy economy, consumers and businesses from all backgrounds and industries should be participating meaningfully. That’s not the case right now.”

AI executives argue the spending boom will create more jobs and bring about scientific breakthroughs with advancements in the technology. OpenAI has said that once its AI data centers are built, the resulting economic boom will create “hundreds of thousands of American jobs.”[i]

The WSJ becomes Mr. Softee

The Wall Street Journal usually focuses on hard numbers that can be easily verified – closing stock prices, trade statistics, etc. True to form, this WSJ article starts by focusing on a hard economic number: productivity. This is defined as the ratio of output to input – that is, how much output changes from one period to the next, once changes in the “factors of production” (usually grouped into labor and capital) are accounted for.

For example, suppose a plant has 100 workers in period 1 and 200 in period 2. The plant also has $1,000 of capital (machinery, buildings, cash on hand, etc.) in period 1, which increases to $2,000 in period 2. If output increases from 300 widgets in period 1 to 600 in period 2, both inputs and output have doubled, so the ratio of output to input - and therefore productivity - stays the same.

On the other hand, if the inputs doubled but output only increased from 300 to 450, productivity fell, since doubled inputs produced less than double the output. Of course, this isn’t a good thing. Conversely, if inputs doubled but output increased from 300 widgets to 750, output more than doubled and productivity increased, which is a good thing: there is more money for raises for workers and bonuses for management, as well as for investment.
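The widget arithmetic above can be captured in a few lines, treating productivity as the ratio of output growth to input growth (a deliberately crude index; it assumes labor and capital can be rolled into a single input measure):

```python
def productivity_change(output_before, output_after, input_growth_ratio):
    """Ratio of output growth to input growth: 1.0 means unchanged,
    below 1.0 means productivity fell, above 1.0 means it rose."""
    return (output_after / output_before) / input_growth_ratio

# In all three scenarios, workers and capital both double (ratio 2.0).
print(productivity_change(300, 600, 2.0))  # 1.0  -> unchanged
print(productivity_change(300, 450, 2.0))  # 0.75 -> productivity fell
print(productivity_change(300, 750, 2.0))  # 1.25 -> productivity rose
```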

When you look at an entire economy, productivity needs to grow at a certain rate every year, just to keep up with growth of the population. Let’s assume the population grows at 2% per year. This means that productivity will also need to grow at 2%, just for the population to maintain its current standard of living. If productivity grows at more than 2%, the standard of living can rise. Conversely, if it grows at less than 2%, the standard of living will fall, unless the government increases its borrowing to prop up living standards. But as the US is learning now, there are limits to the borrowing strategy.
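Under that framing, living standards change at roughly the productivity growth rate minus the population growth rate; a quick sketch (the rates are illustrative) shows how the gap compounds over a decade:

```python
# Illustrative rates; under the post's framing, living standards change
# at roughly (productivity growth - population growth) per year.
pop_growth = 0.02

for prod_growth in (0.01, 0.02, 0.03):
    net = prod_growth - pop_growth
    factor_10y = (1 + net) ** 10  # compounded over 10 years
    print(f"productivity {prod_growth:.0%}: "
          f"living standard x{factor_10y:.2f} after 10 years")
```

A one-point shortfall or surplus in productivity growth compounds to roughly a 10% fall or rise in living standards over ten years.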

The best way to increase productivity in the short term is to grow the amount and/or quality of capital that is used for production (it takes much longer to “grow” workers). For example, if productive capital grows by 10% but the labor force only grows by 2%, then output per worker will grow enough that the standard of living can increase.

But the increased capital needs to be the kind that will allow more output to be produced. For example, suppose there are two types of capital: Type A machines that produce clothes and food, and Type B machines that produce pencils. Obviously, if the entire capital investment is in B machines, the increase in output will consist entirely of pencils; meanwhile, the workers will all be naked and starve to death.

As Greg Ip pointed out, the AI buildout isn’t designed to raise economic output much in the near term; in that sense, it’s much more like Type B investment than Type A. What keeps valuations of the AI companies high is the widely held expectation that AI-driven productivity gains will produce a huge increase in economic output at some point in the future – but that point is currently unknown. Therefore, traditional economic analysis, which assumes that productivity is the key to prosperity, finds the AI buildout to be a colossal waste.

However, the authors of the second WSJ article point out that there’s another economic measure that paints a completely different picture of the AI buildout. This measure can’t be quantified exactly but can be estimated through surveys. It’s called “consumer surplus”; it’s the difference between the price a consumer would be willing to pay for a product or service and its actual price. Of course, this quantity varies by the consumer, the product, and even the time of day, so it can never be directly measured. However, the authors (both academics) have conducted surveys that allow them to estimate the consumer surplus from AI products at $97 billion (here, “consumers” means individuals and organizations).

Of course, AI products today are mostly free, or at least free enhancements to existing paid products (e.g., Microsoft’s Copilot add-on to its Office 365 suite). The authors point out that free AI products are almost never included in GDP, which is based almost entirely on sales data. However, they definitely produce benefits for consumers, just as paid products do:

“When a consumer takes advantage of a free-tier chatbot or image generator, no market transaction occurs, so the benefits that users derive—saving an hour drafting a brief, automating a birthday-party invitation, tutoring a child in algebra—don’t get tallied. That mismeasurement grows when people replace a costly service like stock photos with a free alternative like Bing Image Creator or Google’s ImageFX.”

In other words, consumer surplus can be considered a quantity to be maximized just like GDP, even though it will probably never be possible to include it in GDP. The authors describe how they arrived at the $97 billion estimate in this passage:

“Rather than asking what people pay for a good, we ask what they would need to be paid to give it up. In late 2024, a nationally representative survey of U.S. adults revealed that 40% were regular users of generative AI. Our own survey found that their average valuation to forgo these tools for one month is $98. Multiply that by 82 million users and 12 months, and the $97 billion surplus surfaces.”
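Plugging the quoted figures into that multiplication reproduces the estimate (the product comes to about $96.4 billion, which the authors evidently round to $97 billion):

```python
# Figures from the quoted passage.
value_per_user_month = 98  # dollars to forgo generative AI for a month
users = 82_000_000         # regular US users of generative AI
months = 12

annual_surplus = value_per_user_month * users * months
print(f"${annual_surplus / 1e9:.1f} billion")  # $96.4 billion
```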

They continue,

“William Nordhaus calculated that, in the 20th century, 97% of welfare gains from major innovations accrued to consumers, not firms. Our early AI estimates fit that pattern. While the consumer benefits are already piling up, we believe that measured GDP and productivity will improve as well. History shows that once complementary infrastructure matures, the numbers climb.

“Tyler Cowen forecasts a 0.5% annual boost to U.S. productivity, while a report by the National Academies puts the figure at more than 1% and Goldman Sachs at 1.5%. Even if the skeptics prove right and the officially measured GDP gains top out under 1%, we would be wrong to call AI a disappointment. Life may improve far faster than the spreadsheets imply, especially for lower-income households, which gain most, relative to their baseline earnings, from free tools.”

To paraphrase these two paragraphs, the authors expect AI use to eventually deliver a big boost to GDP, even though today most of the benefit falls outside of GDP. Note that they are talking about an increase in GDP due to the use of AI, whereas the earlier estimate - that half of GDP growth this year will be due to AI - refers to the massive spending on the infrastructure rollout going on now.

In other words, AI will deliver two big boosts to GDP: one from the infrastructure rollout (starting this year, but certainly not ending anytime soon), and one from the productivity gains caused by widespread use of AI products. The latter gains can’t be measured today, but they will be in the future.

The authors conclude,

“As more digital goods become available free, measuring benefits as well as costs will become increasingly important. The absence of evidence in GDP isn’t evidence of absence in real life. AI’s value proposition already sits in millions of browser tabs and smartphone keyboards. Our statistical mirrors haven’t caught the reflection. The productivity revolution is brewing beneath the surface, but the welfare revolution is already on tap.”

 

If you would like to comment on what you have read here, I would love to hear from you. Please email me at tom@tomalrich.com, or even better, sign up as a free subscriber to the Substack community chat for my subscribers and make your comment there.


[i] The WaPo article points out that a large portion of the growth due to AI is simply Nvidia’s profits. But it is certainly not the lion’s share of that growth.