Monday, May 31, 2021

By the skin of their teeth

Last Friday, the Wall Street Journal carried a very good article[i] on the February grid failure in Texas, written by Rebecca Smith. I’ve admired her work (and also become friends with her) since she wrote a great article on Russian supply chain attacks on the grid in 2018. In my opinion, that article had a big impact whose consequences continue to be felt today. I hope to write a post to elaborate on that cryptic comment soon.

The main focus of the article is blackstart resources, in particular the power plant units that are required to start the grid again after it’s been completely shut down (e.g. as happened in the Northeast Blackout of 2003). The article provides probably the best explanation of blackstart for laypeople that I’ve ever seen, along with some really great graphics illustrating what happens in blackstart. 

The main point of Rebecca’s article is that a large number of blackstart resources (9 of 13 primary generators and 6 of 15 secondary generators) weren’t operating on the critical morning of February 15, when the ERCOT grid came within five minutes of completely shutting down. Had the grid shut down, it’s very possible that, with so few blackstart resources available, it would not have been completely operational for weeks or even longer. The article says:

Pat Wood III, former head of the Public Utility Commission of Texas and former chairman of the Federal Energy Regulatory Commission, or FERC, said the poor performance of black starts in Texas stunned him. Were there an uncontrolled grid collapse, whether from extreme weather, a cyberattack, or some other cause, Mr. Wood said, they are “what keeps us from going back to the Stone Age.”

Because my friend Kevin Perry – former Chief CIP Auditor of the SPP Regional Entity and previously member of the IT leadership team for SPP itself – had provided some excellent insight on blackstart and the Texas grid collapse in this post and this one, I asked him to comment on the article. Here is what he said (with my comments in italics):

The article is mostly on the mark. Except for renewable resources (hydro, solar, wind), blackstart units need fossil fuel. Some are oil or diesel; many are gas combustion turbines that need natural gas. And, in some cases, the blackstart unit is actually a commercial generator (500 kW to a MW or two) that is just enough to start a combustion turbine. The turbine then provides the power to start the thermal units. Regardless, if you don’t have fuel, the units cannot run.

I am still having trouble accepting the idea of a months-long outage if the blackstart units cannot all start up. The blackstart unit supplies the 40-60 MW of power needed to run the station auxiliaries and fire up the big thermal units. But, once you get a big unit online, the blackstart unit’s role is done. Power from the restarted thermal unit is then used to start additional units/plants as well as serve load (where the distribution infrastructure is intact). Yes, it is a delicate balancing act, and voltage or frequency instability can bring it all crashing back down again.

My contention is that once you get one or more main thermal units online somewhere in the Interconnection (which in this case means the ERCOT grid, since it isn’t part of either the Eastern or Western Interconnection), you can build cranking paths from an energized TOP (Transmission Operator) to a TOP who has no available blackstart capability for some reason and get a thermal unit started up there. Once that is accomplished, the TOP can then start up the rest of its area as it normally would.

In an emergency, you do what you need to do to get the grid back up and sort out payment for cranking power served after the fact (there’s a lot of payment sorting-out going on in Texas now as a result of the outages. It will continue for years. Sorting out blackstart – or “cranking power” – payments would be child’s play compared to the huge amounts being fought over now). And, yes, you will likely need to design a cranking path on the fly, but it shouldn’t take months. The big concern is that you deplete the substation batteries before you can get power restored, further complicating matters. I don’t know, maybe that is where the months-long estimate is coming from.

And that leads me to the concept of blackstart markets. It used to be that each TOP had to have blackstart units or defined contracts with an adjoining utility, per EOP-005. Where there is a blackstart resources market (as in Texas), blackstart owners now bid their capabilities into the market and the selected providers are compensated for making the capabilities available in case they are needed. Restoration plans for the market participant TOPs are now built based on the market results. So, there are likely (or at least were) more blackstart units than what the market requires. Because units have to be bid and not every bid is successful, there is an incentive for repeatedly unsuccessful bidders to take their units out of service, reducing options in an emergency.

Back when the CIP standards were being modified in accordance with Order 706, blackstart owners currently in an ISO/RTO with a blackstart market came out and asserted that if their blackstart units were subject to the CIP standards (the pending Version 4 at the time), they would not bid their units into the market. Period. End of discussion. Their argument was that they would not spend huge dollars to comply with the CIP Standards only to not be selected for the market (and no longer need to comply) with the next bid cycle (1-2 years). Today, blackstart units are specifically designated as low impact, partly because of this issue.

Of course, Kevin’s last paragraph is quite interesting. He points out that, despite the fact that blackstart units should be considered critical facilities, they aren’t considered such by the CIP standards; they are “low impact” and aren’t subject to the much-more-stringent requirements that “medium impact” plants are subject to.[ii]

It’s very hard not to sympathize with the blackstart plant owners when they say (as in Rebecca’s article) that they couldn’t afford to pay the cost of complying with CIP on the medium impact level, given that they have to bid into a cutthroat market in order to have the blackstart designation in the first place. Why bother to even bid, if the return you get won’t make up for the additional CIP compliance costs (as well as costs for compliance with NERC’s EOP-005 and EOP-006 standards for blackstart units)?

So here’s the real problem: Blackstart plant owners aren’t compensated for the cost required to “harden” themselves against threats that would prevent their plants from being available when needed. There are two ways to deal with that: the “free market” solution (currently in effect in Texas and many other states) and the “regulated” solution.

In a pure free market solution, if a class of resources (in this case, blackstart generators) isn’t adequately compensated for their product, they will go out of business (or withdraw from the market, in this case). This will reduce the supply of that product, resulting in a shortage that will cause the price to rise. The rise in price will cause more companies to get into the market (or companies that had withdrawn to return), and the price will settle at a more sustainable level. And we’ll all live happily ever after.

Of course, this is a pretty good description of the system that was in place for the Texas market as a whole, and – in case you didn’t notice – that didn’t work too well in February. And, had the low-frequency event in the early morning of February 15 not finally been halted by load shedding (i.e. putting people in the dark and cold), it’s very likely that the system wouldn’t have worked too well for blackstart resources, either – in other words, they wouldn’t have been there when needed to restore the ERCOT grid from a widespread shutdown. Perhaps people in Texas would still be in the dark today – although at least they wouldn’t be cold now.

So obviously there needs to be some regulated solution, where blackstart resources would be compensated through the rate base and they would be allowed to spend a reasonable amount to meet both their security and their compliance costs, including the cost of storing an adequate amount of fuel to enable them to do their job when they’re needed. In turn, they would be inspected regularly (as required by EOP-005). If you look at all the problems faced by the power grid nowadays, this looks like one of the easier (and cheaper) ones to remediate.


[i] You probably won’t be able to read the article, unless you’re a WSJ subscriber or you want to sign up for the free trial. But if you drop me an email, I’ll send you a PDF of the article.

[ii] On the other hand, very few power plants of any size have medium impact BES Cyber Systems (medium is the only other impact level applicable to plants; only large Control Centers can have high impact BES Cyber Systems). Almost all BES Cyber Systems (BCS) in power plants are low impact, regardless of plant size. And technically, the CIP standards apply to BCS, not to assets like plants, substations and Control Centers.

2 comments:

  1. Good morning Tom,

    Great post and I concur with your and Kevin's analyses of the blackstart and cranking path situations. Please send me a copy of Ms. Smith's article.

    Thanks, Joe

    Dr. Joseph Baugh
    Associate Director
    Risk, Compliance & Security Team
    Guidehouse, Inc.

    joseph.baugh@guidehouse.com

  2. Nice hearing from you, Joe! I just sent the article.
