Last Friday, the Wall Street Journal carried a very good article[i] on the February grid failure in Texas, written by Rebecca Smith. I’ve appreciated her (and also become friends with her) since she wrote a great article on Russian supply chain grid attacks in 2018. In my opinion, that article had a big impact whose consequences continue to be felt today. I hope to write a post to elaborate on that cryptic comment soon.
The main focus of the article is blackstart resources, in particular the power plant units that are required to start the grid again after it’s been completely shut down (e.g. as happened in the Northeast Blackout of 2003). The article provides probably the best explanation of blackstart for laypeople that I’ve ever seen, along with some really great graphics illustrating what happens in blackstart.
The main point of Rebecca’s article is that a large number of blackstart resources (9 of 13 primary generators and 6 of 15 secondary generators) weren’t operating on the critical morning of February 15, when the ERCOT grid came within five minutes of completely shutting down. Because of this, had the grid actually gone down, it’s very possible that it would not have been fully operational again for weeks or even longer. The article says:
Pat Wood III, former head of the
Public Utility Commission of Texas and former chairman of the Federal
Energy Regulatory Commission, or FERC, said the poor performance of black
starts in Texas stunned him. Were there an uncontrolled grid collapse, whether
from extreme weather, a cyberattack, or some other cause, Mr. Wood said,
they are “what keeps us from going back to the Stone Age.”
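For a sense of scale, the unit counts quoted above can be turned into percentages. This is only a quick arithmetic sketch in Python; the variable and function names are mine, and the only inputs are the figures from the article (9 of 13 primary and 6 of 15 secondary units not operating).

```python
# Counts reported in the WSJ article for the morning of February 15.
primary_total, primary_down = 13, 9
secondary_total, secondary_down = 15, 6


def share_down(down, total):
    """Return the percentage of units that were not operating."""
    return 100.0 * down / total


primary_pct = share_down(primary_down, primary_total)
secondary_pct = share_down(secondary_down, secondary_total)
combined_pct = share_down(primary_down + secondary_down,
                          primary_total + secondary_total)

print(f"Primary units down:   {primary_pct:.1f}%")    # 69.2%
print(f"Secondary units down: {secondary_pct:.1f}%")  # 40.0%
print(f"All blackstart units: {combined_pct:.1f}%")   # 53.6%
```

In other words, more than two thirds of the primary units, and just over half of all the designated blackstart units, were not available that morning.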
Because my friend Kevin Perry – former Chief CIP Auditor of the SPP Regional Entity and previously a member of the IT leadership team for SPP itself – had provided some excellent insight on blackstart and the Texas grid collapse in this post and this one, I asked him to comment on the article. Here is what he said (with my comments in italics):
The article is mostly on the mark.
Except for renewable resources (hydro, solar, wind), blackstart units
need fossil fuel. Some are oil or diesel; many are gas combustion
turbines that need natural gas. And, in some cases, the blackstart unit
is actually a commercial generator (500 kW to a MW or two) that is just enough
to start a combustion turbine. The turbine then provides the power to
start the thermal units. Regardless, if you don’t have fuel, the units
cannot run.
I am still having trouble accepting
the idea of a months-long outage if the blackstart units cannot all start up.
The blackstart unit supplies the 40-60 MW of power needed to run the station
auxiliaries and fire up the big thermal units. But, once you get a big
unit online, the blackstart unit’s role is done. Power from the restarted
thermal unit is then used to start additional units/plants as well as serve
load (where the distribution infrastructure is intact). Yes, it is a
delicate balancing act, and voltage or frequency instability can bring it all
crashing back down again.
My contention is that once you get
one or more main thermal units online somewhere in the Interconnection
(which in this case means the ERCOT grid, since it isn’t part of either the
Eastern or Western Interconnect), you can build cranking paths from an
energized TOP (Transmission Operator) to a TOP who has no available
blackstart capability for some reason and get a thermal unit started up there. Once that is accomplished, the TOP can then
start up the rest of its area as it normally would.
In an emergency, you do what you
need to do to get the grid back up and sort out payment for cranking power
served after the fact (there’s a lot of payment sorting-out going
on in Texas now as a result of the outages. It will continue for
years. Sorting out blackstart – or “cranking power” – payments would be child’s
play compared to the huge amounts being fought over now). And, yes,
you will likely need to design a cranking path on the fly, but it shouldn’t
take months. The big concern is that you deplete the substation batteries
before you can get power restored, further complicating matters. I don’t
know, maybe that is where the months-long estimate is coming from.
And that leads me to the concept of
blackstart markets. It used to be that each TOP had to have blackstart
units or defined contracts with an adjoining utility, per EOP-005. Where
there is a blackstart resources market (as in Texas), blackstart owners now
bid their capabilities into the market and the selected providers are
compensated for making the capabilities available in case they are needed.
Restoration plans for the market participant TOPs are now built based on
the market results. So, there are likely (or at least were) more
blackstart units than what the market requires. Because units have to be
bid and not every bid is successful, there is an incentive for repeatedly
unsuccessful bidders to take their units out of service, reducing options in an
emergency.
Back when the CIP standards were
being modified in accordance with Order 706, blackstart owners currently in an
ISO/RTO with a blackstart market came out and asserted that if their blackstart
units were subject to the CIP standards (the pending Version 4 at the time),
they would not bid their units into the market. Period. End of discussion.
Their argument was that they would not spend huge dollars to comply with
the CIP Standards only to not be selected for the market (and so no longer need to
comply) in the next bid cycle (1-2 years). Today, blackstart units are
specifically designated as low impact, partly because of this issue.
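Kevin’s cranking-path argument above can be pictured as a simple reachability question: starting from a TOP that already has a thermal unit online, which other TOPs can be reached over intact tie lines? The Python sketch below is only an illustration of that idea. The TOP names and connections are invented, and it deliberately ignores everything that makes real restoration hard (voltage and frequency balancing, line ratings, substation battery depletion).

```python
from collections import deque

# Hypothetical tie lines between Transmission Operators (TOPs).
# The names and connections are invented for illustration only.
TIE_LINES = {
    "TOP_A": ["TOP_B", "TOP_C"],
    "TOP_B": ["TOP_A", "TOP_D"],
    "TOP_C": ["TOP_A"],
    "TOP_D": ["TOP_B"],
    "TOP_E": [],  # islanded: no intact path to any neighbor
}


def cranking_paths(tie_lines, energized):
    """Breadth-first search outward from TOPs that already have a unit online.

    Returns a dict mapping each reachable TOP to the shortest chain of
    TOPs over which cranking power could, in principle, be routed to it.
    """
    paths = {top: [top] for top in energized}
    queue = deque(energized)
    while queue:
        current = queue.popleft()
        for neighbor in tie_lines.get(current, []):
            if neighbor not in paths:
                paths[neighbor] = paths[current] + [neighbor]
                queue.append(neighbor)
    return paths


paths = cranking_paths(TIE_LINES, ["TOP_A"])
for top in TIE_LINES:
    route = " -> ".join(paths[top]) if top in paths else "no path found"
    print(f"{top}: {route}")
```

With only TOP_A energized, the sketch finds a route to TOP_B, TOP_C and TOP_D, while the islanded TOP_E would still need a blackstart unit of its own.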
Of course, Kevin’s last paragraph is quite interesting. He points out that, despite the fact that blackstart units should be considered critical facilities, they aren’t considered such by the CIP standards; they are “low impact” and aren’t subject to the much-more-stringent requirements that “medium impact” plants are subject to.[ii]
It’s very hard not to sympathize with the blackstart plant owners when they say (as in Rebecca’s article) that they couldn’t afford to pay the cost of complying with CIP at the medium impact level, given that they have to bid into a cutthroat market in order to have the blackstart designation in the first place. Why bother to even bid, if the return you get won’t make up for the additional CIP compliance costs (as well as costs for compliance with NERC’s EOP-005 and EOP-006 standards for blackstart units)?
So here’s the real problem: Blackstart plant owners aren’t compensated for the cost required to “harden” themselves against threats that would prevent their plants from being available when needed. There are two ways to deal with that: the “free market” solution (currently in effect in Texas and many other states) and the “regulated” solution.
In a pure free market solution, if a class of resources (in this case, blackstart generators) isn’t adequately compensated for their product, they will go out of business (or withdraw from the market, in this case). This will reduce the supply of that product, resulting in a shortage that will cause the price to rise. The rise in price will cause more companies to get into the market (or companies that had withdrawn to return), and the price will settle at a more sustainable level. And we’ll all live happily ever after.
Of course, this is a pretty good description of the system that was in place for the Texas market as a whole, and – in case you didn’t notice – that didn’t work too well in February. And, had the low-frequency event in the early morning of February 15 not finally been arrested by load shedding (i.e. putting people in the dark and cold), it’s very likely that the system wouldn’t have worked too well for blackstart resources, either – in other words, they wouldn’t have been there when needed to restore the ERCOT grid from a widespread shutdown. Perhaps people in Texas would still be in the dark today – although at least they wouldn’t be cold now.
So obviously there needs to be some regulated solution, where blackstart resources would be compensated through the rate base and they would be allowed to spend a reasonable amount to meet both their security and their compliance costs, including the cost of storing an adequate amount of fuel to enable them to do their job when they’re needed. In turn, they would be inspected regularly (as required by EOP-005). If you look at all the problems faced by the power grid nowadays, this looks like one of the easier (and cheaper) ones to remediate.
[i] You probably won’t be able to read the article, unless you’re a WSJ subscriber or you want to sign up for the free trial. But if you drop me an email, I’ll send you a PDF of the article.
[ii] On the other hand, very few power plants of any size have medium impact BES Cyber Systems (medium is the only other level applicable to plants; only large Control Centers can have high impact BES Cyber Systems). Almost all BES Cyber Systems (BCS) in power plants are low impact, regardless of the plant size. And technically, the CIP standards apply to BCS, not to assets like plants, substations and Control Centers.