Improve handling of DC lines when loading HIFLD data #233

Closed
1 task done
danielolsen opened this issue Oct 13, 2021 · 14 comments
Assignees
Labels
feature request Request for a new feature. (Only lives in Backlog) hifld Related to ingestion of the HIFLD data

Comments

@danielolsen
Contributor

danielolsen commented Oct 13, 2021

🚀

  • Is your feature request essential for your project?

Describe the workflow you want to enable

I wish when we loaded the HIFLD lines data (i.e. prereise.gather.griddata.hifld.data_access.load.get_hifld_electric_power_transmission_lines):

  • We filtered out DC lines based on the TYPE column of the data frame (i.e. explicitly removed "DC; OVERHEAD" and "DC; UNDERGROUND"), so that they eventually become part of the Grid.dcline table instead of Grid.branch.
  • We made sure that all real-world DC lines got added somehow. Filtering just on {"DC; OVERHEAD", "DC; UNDERGROUND"} and looking at the substation names, I can identify:
    • the Pacific DC Intertie (Path 65): ID 200823 from CELILO to NOT AVAILABLE
    • the Intermountain Power Project DC Line (Path 27): ID 308464 from INTERMOUNTAIN to ADELANTO
    • the TransBay Cable: ID 310053 from PITTSBURG to TRANS BAY CABLE FACILITY
    • Square Butte: ID 108354 from ARROWHEAD to UNKNOWN133416
    • CU: ID 150123 from DICKINSON to UNDERWOOD
  • Looking at the map, I was able to find:
    • The Cross-Sound Cable: ID 157627 from NEW HAVEN HARBOR to SHOREHAM POWER STATION (ambiguously classified as UNDERGROUND)
    • The Hudson Project: ID 157629 from HVDC B2B STATION HUDSON to WEST 49TH STREET (accurately classified as AC; UNDERGROUND, since it connects to a B2B facility)
    • The Neptune Cable: ID 113313 from SAYREVILLE to DUFFY AVENUE CONVERTER STATION (erroneously classified as AC; UNDERGROUND)

I also see the Quebec/New England Transmission line, in two sections (ID 158515 from NOT AVAILABLE [Canada] to MONROE and ID 131914 from UNKNOWN133682 [Ayer, MA] to MONROE).

I don't see the Neptune cable, the Cross-Sound cable, or the Hudson Project, although they might be part of the more general {"OVERHEAD", "UNDERGROUND", or "NOT AVAILABLE"} type lines (16% of the total).

Later, when we're building the Grid.dcline table, I wish we could add back-to-back converter stations as applicable, and remove DC lines coming from Canada, since we don't currently model the Canadian grid (but maybe keep the HVDC line from Monroe to Ayer, MA, since this line is multi-terminal?).

Describe your proposed implementation

  • get_hifld_electric_power_transmission_lines should be updated to filter out {"DC; OVERHEAD", "DC; UNDERGROUND"} before we use the table to build AC transmission lines. Maybe we change the return to a table of AC lines and a table of DC lines.
  • When we add a function to build the Grid.dcline table using the {"DC; OVERHEAD", "DC; UNDERGROUND"} transmission lines, we should add the ability to define additional DC lines not present in the original data, for DC lines/cables/B2Bs that we know exist in real life. We may also want to add the ability to further filter lines, to avoid adding HVDC lines that originate outside of the U.S.
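
A minimal sketch of the split proposed above, using plain dictionaries in place of the data frame. The function name split_ac_dc is hypothetical, and the TYPE values attached to the sample IDs are illustrative rather than copied from the HIFLD file. With a pandas data frame, the same split could presumably be done with a boolean mask such as lines["TYPE"].isin(DC_TYPES).

```python
# The two TYPE values named in the issue as explicit DC lines.
DC_TYPES = {"DC; OVERHEAD", "DC; UNDERGROUND"}


def split_ac_dc(lines):
    """Split line records into (ac_lines, dc_lines) based on the TYPE field.

    Hypothetical helper: the real loader works on a data frame, not dicts.
    """
    ac_lines = [line for line in lines if line["TYPE"] not in DC_TYPES]
    dc_lines = [line for line in lines if line["TYPE"] in DC_TYPES]
    return ac_lines, dc_lines


# Sample records: the IDs come from the issue, the TYPE values are illustrative.
lines = [
    {"ID": 200823, "TYPE": "DC; OVERHEAD"},
    {"ID": 310053, "TYPE": "DC; UNDERGROUND"},
    {"ID": 113313, "TYPE": "AC; UNDERGROUND"},
    {"ID": 111784, "TYPE": "AC; OVERHEAD"},
]

ac, dc = split_ac_dc(lines)
print([line["ID"] for line in dc])
```

Returning two tables keeps the AC-only steps (voltages, buses, impedances) from ever seeing the DC rows.
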
@danielolsen danielolsen added feature request Request for a new feature. (Only lives in Backlog) hifld Related to ingestion of the HIFLD data labels Oct 13, 2021
@danielolsen
Contributor Author

danielolsen commented Oct 28, 2021

It seems like some of the AC lines that connect to HVDC B2B stations can also be identified. For these, we'll want to ensure that they don't connect to the substations closest to both endpoints, since one end will need to connect to a zero-distance HVDC link instead in order to keep the interconnections separate. Going from north to south along the east/west seam:

  • Miles City, MT (200 MW): there are three distinct MILES CITY substations (IDs 203572, 203573, 203574) with slightly different lat/lon values. They will only be grouped together by lat/lon if we round coordinates to a single decimal place: two of them have a longitude of approximately 105.79 W, and one has a longitude of approximately 105.79504 W, which just barely rounds up to 105.80 at two decimal places. We may want to add some sort of manual substation-clustering step to be able to handle these sorts of cases. Besides this point, Miles City seems like a fairly clean break. On some maps, I also see a long finger of the Eastern interconnect that stretches through northern Montana, and the grid looks connected, but I don't know of any B2B facilities up there, so we may need to somehow create a disconnect somewhere between Great Falls and Glasgow, MT. The MALTA substation looks like a good candidate: I don't see any generators listed in EIA Form 860 that are in Montana, part of the SPP or MISO NERC regions, and west of this substation. Since the generators split cleanly there, and there's probably not too much demand up there, we may be able to get away with picking a somewhat arbitrary point for now.
  • Rapid City, SD (200 MW): there's a RAPID CITY DC TIE substation, and three lines which appear to be connected to it when looking at the map (IDs 111784, 128047, and 161012). This seems to be a good example of why we want to use line endpoint locations rather than substation names: 111784 has RAPID CITY DC TIE listed as both substation endpoints, and 161012 doesn't have the Rapid City substation listed for either endpoint. This is a densely connected area, and we may also need to manually exclude certain branches or substations in order to get a clean break.
  • Stegall, NE (100-110 MW): there are many lines which connect to the STEGALL substation.
  • Virginia Smith/Sidney, NE (200 MW): there are two lines which seem to connect Sidney to Colorado, and they both terminate at the same substation UNKNOWN202159. ID 201877 connects to PEETZ substation, and ID 201938 connects to UNKNOWN201534. Substation UNKNOWN202159 also seems to have several other connections, including two to a substation named VIRGINIA SMITH CONVERTER STATION. There are several other connections between Colorado and Nebraska, since the service territory of the Highline Electric Association crosses over into parts of Nebraska, and these may have been connected back in the 60s/70s when the Eastern & Western grids were connected. We'll need to split these somehow in a way that doesn't add HVDC B2Bs.
  • Lamar Tie, CO (210 MW): ID 211991 connects the LAMAR HVDC TIE substation to the UNKNOWN209699 substation.
  • Blackwater/Clovis/Roosevelt County, NM (200 MW): ID 304971 connects the BLACKWATER TIE substation with the TAIBAN MESA substation.
  • Artesia/Eddy County, NM (200 MW): ID 307684 connects the AMRAD substation with the EDDY AC-DC-AC TIE substation.
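
The Miles City rounding problem can be reproduced with a small sketch. The helper name group_by_rounded_coords is hypothetical, and the coordinates below are placeholders chosen only to match the longitudes described above (about 105.79 W twice and 105.79504 W once); they are not the values from the HIFLD file, and which ID has which longitude is an assumption.

```python
from collections import defaultdict


def group_by_rounded_coords(substations, decimals):
    """Group substation IDs by (lat, lon) rounded to `decimals` places."""
    groups = defaultdict(list)
    for sub_id, (lat, lon) in substations.items():
        key = (round(lat, decimals), round(lon, decimals))
        groups[key].append(sub_id)
    return dict(groups)


# Placeholder coordinates for the three MILES CITY substations.
miles_city = {
    203572: (46.4081, -105.7901),
    203573: (46.4083, -105.7902),
    203574: (46.4085, -105.79504),
}

two_decimals = group_by_rounded_coords(miles_city, decimals=2)
one_decimal = group_by_rounded_coords(miles_city, decimals=1)
print(len(two_decimals), len(one_decimal))
```

Under these placeholder values, two-decimal rounding leaves one substation in its own group while one-decimal rounding merges all three, which is why a manual clustering override may be more robust than tuning the precision.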

For the ERCOT/Eastern seam:

  • Oklaunion, TX (220 MW): the UNKNOWN304477 substation looks like it's the actual location of the converter station, although there's another OKLAUNION substation that's connected to this first substation by two spur lines.
  • Welsh HVDC Converter Station, TX (600 MW): similarly, the UNKNOWN304994 seems to be the location of the B2B, although there's one spur line connecting to the WELSH substation.

I think one approach we could take to interpret this data is to designate a set of substations as 'B2B substations' (potentially after grouping some substations together, to deal with cases like Miles City), then somehow split the lines connecting to these substations into two groups (one for each interconnection), and then replace the one original substation with a pair of substations, each connected to one set of lines, with a single HVDC link between them.
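
A rough sketch of that substation-splitting step, assuming lines are (line_id, from_sub, to_sub) tuples and that the assignment of each line to an interconnection is supplied by hand. The function name, the "(Eastern)"/"(Western)" naming scheme, line 999999 and its substation are all hypothetical, and the side assigned to line 211991 is an assumption made for illustration.

```python
def split_b2b_substation(lines, b2b_sub, side_of_line):
    """Replace one B2B substation with a pair joined by a single HVDC link.

    lines: list of (line_id, from_sub, to_sub) tuples.
    side_of_line: dict mapping the ID of each line touching b2b_sub to an
        interconnection label.
    Returns (new_lines, dc_link), where dc_link is the pair of new substations.
    """
    sides = sorted(set(side_of_line.values()))
    if len(sides) != 2:
        raise ValueError("expected lines on exactly two interconnections")
    new_name = {side: f"{b2b_sub} ({side})" for side in sides}

    new_lines = []
    for line_id, from_sub, to_sub in lines:
        if b2b_sub in (from_sub, to_sub):
            replacement = new_name[side_of_line[line_id]]
            from_sub = replacement if from_sub == b2b_sub else from_sub
            to_sub = replacement if to_sub == b2b_sub else to_sub
        new_lines.append((line_id, from_sub, to_sub))

    dc_link = (new_name[sides[0]], new_name[sides[1]])
    return new_lines, dc_link


lines = [
    (211991, "LAMAR HVDC TIE", "UNKNOWN209699"),
    (999999, "HYPOTHETICAL EAST SUB", "LAMAR HVDC TIE"),  # made-up line
]
side_of_line = {211991: "Western", 999999: "Eastern"}  # assumed assignment

new_lines, dc_link = split_b2b_substation(lines, "LAMAR HVDC TIE", side_of_line)
print(new_lines, dc_link)
```

The hard part in practice is building side_of_line, which this sketch deliberately leaves as a manual input.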

We will also probably need to somehow manually exclude certain lines or substations, in order to avoid cross-interconnection connections when we know they don't actually connect, but there's no B2B facility there.
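
One way to check whether a set of manual exclusions is enough is to count connected components with and without them. The sketch below uses a plain breadth-first search in place of a graph library; the toy network, its substation names and the line IDs are all made up for illustration.

```python
from collections import deque


def connected_components(lines, excluded_ids=frozenset()):
    """Return the connected components (sets of substations) of the network.

    lines: list of (line_id, from_sub, to_sub) tuples.
    excluded_ids: line IDs to drop before building the graph.
    """
    adjacency = {}
    for line_id, from_sub, to_sub in lines:
        # Register both endpoints so substations isolated by an exclusion
        # still appear as their own component.
        adjacency.setdefault(from_sub, set())
        adjacency.setdefault(to_sub, set())
        if line_id not in excluded_ids:
            adjacency[from_sub].add(to_sub)
            adjacency[to_sub].add(from_sub)

    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        component, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


# Toy network: line 3 is a spurious cross-seam connection.
toy_lines = [(1, "WEST A", "WEST B"), (2, "EAST A", "EAST B"), (3, "WEST B", "EAST A")]

print(len(connected_components(toy_lines)))
print(len(connected_components(toy_lines, excluded_ids={3})))
```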

@BainanXia
Collaborator

@danielolsen I like the proposed approach. I was thinking of the same thing when fixing the HVDC layout in the current usa_tamu model a while ago, but went with workarounds instead, since the synthetic network doesn't have substations right at the coordinates you posted above. All the material from my previous exploration can be found in Explorations/BainanX/HVDC analysis.

@rouille
Collaborator

rouille commented Oct 28, 2021

Why don't we use the DC lines table from TAMU, look for the closest substations in HIFLD data and remove all entries related to DC lines in the HIFLD transmission file?

@danielolsen
Contributor Author

We added the DC lines ourselves to the usa_tamu grid (see Breakthrough-Energy/PowerSimData#180), and added them at 'reasonable' existing locations within the synthetic grid, but the data from the HIFLD dataset should be better for the real HVDC lines, and should line up better with the rest of that network data I think. The B2B facilities are all missing, but I think starting from the HIFLD network should also be better there, unless there's something I'm missing.

@rouille
Collaborator

rouille commented Oct 28, 2021

Ok. I thought the DC lines in the TAMU grid were a close representation of the existing ones (of course with some geographical adjustment of the from/to ends to find a bus in the model).

@danielolsen
Contributor Author

Close-ish, but now that we've identified all the HVDC lines in the HIFLD dataset (I've updated the original post with details), I think using these is the best approach.

Based on the information we have, I think we can:

  • When loading the HIFLD Transmission data, refactor build_transmission to take additional inputs for certain lines to ignore (e.g. the half of the New England/Quebec HVDC line that stretches into Canada), and certain lines which aren't labelled as DC to consider as DC anyway. The information that we have now can be added to const.py and then passed to build_transmission within the orchestration function (see feat: add top-level HIFLD grid orchestration function #236), or some alternative avenue.
  • Refactor augment_line_voltages and create_buses functions within prereise.gather.griddata.hifld.data_process.transmission to ignore DC lines, or to separate out the DC lines from AC lines initially and only pass the AC lines to these functions, since HVDC voltages should not influence neighboring AC voltages, and should not be considered when creating buses or transformers within substations.
  • Don't calculate impedance or rating for DC lines (rating information will need to come from elsewhere, possibly another dictionary passed to build_transmission)
  • Either have DC lines as an additional return from build_transmission, or separate it out in the highest-level orchestration function before writing to CSVs.
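
The override inputs described in the first bullet could look roughly like this. The constant names, the classify_lines helper and its signature are hypothetical, not the actual const.py or build_transmission interface. The line IDs are the ones listed earlier in this issue; the TYPE values for 158515 and 131914 are assumed for illustration.

```python
DC_TYPES = {"DC; OVERHEAD", "DC; UNDERGROUND"}

# Hypothetical const.py entries.
LINES_TO_IGNORE = {158515}  # Quebec/New England section coming from Canada
LINES_TO_TREAT_AS_DC = {113313, 157627}  # Neptune Cable, Cross-Sound Cable


def classify_lines(lines, ignore_ids, force_dc_ids):
    """Drop ignored lines, then split the rest into (ac_lines, dc_lines)."""
    ac_lines, dc_lines = [], []
    for line in lines:
        if line["ID"] in ignore_ids:
            continue
        if line["TYPE"] in DC_TYPES or line["ID"] in force_dc_ids:
            dc_lines.append(line)
        else:
            ac_lines.append(line)
    return ac_lines, dc_lines


lines = [
    {"ID": 158515, "TYPE": "DC; OVERHEAD"},  # TYPE assumed
    {"ID": 131914, "TYPE": "DC; OVERHEAD"},  # TYPE assumed
    {"ID": 113313, "TYPE": "AC; UNDERGROUND"},
    {"ID": 157627, "TYPE": "UNDERGROUND"},
    {"ID": 157629, "TYPE": "AC; UNDERGROUND"},  # Hudson Project, stays AC
]

ac, dc = classify_lines(lines, LINES_TO_IGNORE, LINES_TO_TREAT_AS_DC)
print([line["ID"] for line in ac], [line["ID"] for line in dc])
```

Only the AC list would then be passed on to the voltage, bus and impedance steps.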

@danielolsen
Contributor Author

I was playing around with some NetworkX tools to help identify which break-points were necessary to separate the interconnections, and getting unusual results, until I realized that currently we're only loading transmission lines with STATUS == 'IN SERVICE', which discards about 20% of the total lines which are marked as 'NOT AVAILABLE', and these 20% are not spread evenly across service territories. Of the service territories with at least 200 NOT AVAILABLE lines:

  • 89% of Pacificorp's lines are NOT AVAILABLE (Utah, Wyoming, parts of Idaho and Oregon)
  • 43% of Oncor Electric Delivery Company (which serves Dallas and the surrounding area)
  • 84% of Bonneville Power Administration (Oregon, Washington, Northern Idaho, Montana)
  • 98% of Idaho Power (Southern Idaho, eastern Oregon)
  • 99% of NorthWestern Energy LLC (MT)
  • 99% of the Public Service Company of Colorado
  • 97% of Portland General Electric
  • 66% of Puget Sound Energy
  • 99.7% of Avista (Eastern Washington)
  • 55% of Western Area Power Administration (along the East/West seam from Colorado & Nebraska northward)
  • 94% of the Tri-State G & T Assn (Colorado and some surrounding areas)

...the point of which is to say: ignoring 'NOT AVAILABLE' lines loses us a big part of the network topology near the east/west seam, and we probably need to correct this going forward by adding all these lines, and then potentially later removing some as necessary to keep interconnection separation.
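
For reference, a per-owner breakdown like the one above can be computed along these lines. STATUS is the column named in this thread; the OWNER field name is an assumption, and the records are toy data, not HIFLD rows.

```python
from collections import Counter


def not_available_share(lines, min_count=200):
    """Share of NOT AVAILABLE lines per owner.

    Only owners with at least `min_count` NOT AVAILABLE lines are reported.
    """
    total = Counter(line["OWNER"] for line in lines)
    missing = Counter(
        line["OWNER"] for line in lines if line["STATUS"] == "NOT AVAILABLE"
    )
    return {
        owner: missing[owner] / total[owner]
        for owner in missing
        if missing[owner] >= min_count
    }


# Toy records with made-up owners; min_count is lowered to fit the sample.
toy = (
    [{"OWNER": "OWNER A", "STATUS": "NOT AVAILABLE"}] * 3
    + [{"OWNER": "OWNER A", "STATUS": "IN SERVICE"}]
    + [{"OWNER": "OWNER B", "STATUS": "NOT AVAILABLE"}]
    + [{"OWNER": "OWNER B", "STATUS": "IN SERVICE"}] * 2
)

shares = not_available_share(toy, min_count=2)
print(shares)
```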

@rouille
Collaborator

rouille commented Nov 3, 2021

What values does STATUS take besides IN SERVICE and NOT AVAILABLE?

@danielolsen
Contributor Author

INACTIVE (25 lines, 0.09%), UNDER_CONST (7 lines, 0.03%), PROPOSED (1 line, 0.001%).

@rouille
Collaborator

rouille commented Nov 3, 2021

What is the breakdown of the connected components when all the lines are considered?

@danielolsen
Contributor Author

If we allow both IN SERVICE and NOT AVAILABLE, there are 236 connected components. However, looking into these data pointed out another issue: when we refactored from (lon, lat) to (lat, lon) for line path coordinates, we failed to also update the logic within the map_lines_to_substations_using_coords function, so un-mapped lines were nearly universally getting mapped to the northwestern-most substation in the contiguous U.S., at Neah Bay. With that bug corrected, it seems there are 341 connected components. Now only 164 lines are mapped to the Neah Bay substation, and they seem to be mostly branches that are actually in Alaska, so we probably want to filter out a set of known Alaskan transmission owners.
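
Two cheap sanity checks could have flagged this earlier; the sketch below is not the actual fix to map_lines_to_substations_using_coords. The bounding box values are rough assumptions for the contiguous U.S., both helper names are hypothetical, and the mapping is toy data.

```python
from collections import Counter

# Rough bounding box for the contiguous U.S. (assumed, illustrative values).
LAT_RANGE = (24.0, 50.0)
LON_RANGE = (-125.0, -66.0)


def is_lat_lon(point):
    """True if `point` is plausibly (lat, lon) inside the contiguous U.S."""
    lat, lon = point
    return (
        LAT_RANGE[0] <= lat <= LAT_RANGE[1]
        and LON_RANGE[0] <= lon <= LON_RANGE[1]
    )


def overloaded_substations(line_to_substation, max_lines):
    """Substations with more than `max_lines` lines mapped to them."""
    counts = Counter(line_to_substation.values())
    return {sub: n for sub, n in counts.items() if n > max_lines}


print(is_lat_lon((46.4, -105.8)))   # (lat, lon) order
print(is_lat_lon((-105.8, 46.4)))   # same point in (lon, lat) order

toy_mapping = {1: "NEAH BAY", 2: "NEAH BAY", 3: "NEAH BAY", 4: "MILES CITY"}
print(overloaded_substations(toy_mapping, max_lines=2))
```

Endpoints in Alaska would also fall outside this box, so the first check might double as a coarse filter for those branches.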

@rouille
Collaborator

rouille commented Nov 3, 2021

Ah. I thought I took care of reversing Lat and Lon in this function.

@danielolsen
Contributor Author

Somehow we all missed it together, but I figured something was up when there were seven thousand transmission lines all connected to the far corner of Washington state.

@danielolsen
Contributor Author

Closed via #240.


5 participants