Capacity planning ct scaling output #143

dmuldrew · 2020-04-13T19:53:02Z

Purpose

This PR adds a function to output capacity scaling factors with respect to the base grid capacities. In addition, a GridInfo class is created which contains grid-specific code now inherited by the ScenarioInfo class.

What is the code doing?

There are two new functions:
Outputs a dataframe of region capacities
output_capacities_table(self, base_grid)

Creates a change table from the capacities dataframe with optional external input
create_scale_factor_table(self, base_grid, gen_capacity=None)
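
As a rough illustration of what a scaling factor "with respect to the base grid capacities" means, here is a minimal sketch using plain dictionaries. The helper name `scale_factors` and the `{resource: {region: MW}}` shape are assumptions made for this example, not the signatures used in this PR; the target values are made up.

```python
def scale_factors(target_capacity, base_capacity):
    """Return {resource: {region: factor}} with factor = target / base.

    Both inputs are hypothetical {resource: {region: MW}} dictionaries.
    Entries with zero (or missing) base capacity are skipped, because no
    finite factor can scale zero capacity up to a positive target.
    """
    factors = {}
    for resource, by_region in target_capacity.items():
        for region, target_mw in by_region.items():
            base_mw = base_capacity.get(resource, {}).get(region, 0.0)
            if base_mw == 0:
                continue
            factors.setdefault(resource, {})[region] = target_mw / base_mw
    return factors


# Base capacities in MW; the targets are invented for illustration.
base = {"solar": {"Maine": 1.0}, "wind": {"Maine": 898.8}, "coal": {"Maine": 0.0}}
target = {"solar": {"Maine": 250.0}, "wind": {"Maine": 1797.6}, "coal": {"Maine": 0.0}}
factors = scale_factors(target, base)
print(factors)
```

Skipping zero-capacity entries is one possible design choice; raising an error instead would surface targets that cannot be met by scaling alone.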

Where to look

powersimdata/design/clean_capacity_scaling.py
powersimdata/design/tests/test_change_table_output.py
powersimdata/design/scenario_info.py

Time estimate

60 min

danielolsen · 2020-04-13T20:03:12Z

Is this PR ready to review, or is this branch still a work-in-progress?

dmuldrew · 2020-04-13T20:33:26Z

I expect there will be feedback from @BainanXia to address before it is ready...

BainanXia · 2020-04-13T20:41:22Z

After taking a closer look at the code, I found it is correct in its current form, given that ScenarioInfo overwrites the grid attribute after inheriting from GridInfo. Just to make it clear: ScenarioInfo retrieves the grid specific to the loaded scenario, which is a scaled grid. The capacity scaling framework needs that to calculate the target capacities for the upcoming scenario. However, when we fill out the change table, we calculate the scaling factors based on a fresh copy of the original grid from PowerSimData.

dmuldrew · 2020-04-13T20:43:30Z

I didn't include the final change table step:

# Build change table
for gen_type in scale_factor_input:
    scenario.state.builder.change_table.scale_plant_capacity(gen_type, zone_name=scale_factor_input[gen_type])

because that step is straightforward to do already. Another consideration for leaving it out is that we'll likely be changing the overall change table schema.

dmuldrew · 2020-04-13T20:47:29Z

@BainanXia Feel free to suggest a better way to create the GridInfo class and handle the constructors. My main goal here was to avoid having two copies of the same functions that we'd have to maintain.

danielolsen · 2020-04-13T20:48:26Z

I don't think the final step is as straightforward as you think. Targets are defined by state (anywhere from 1 to N zones), while generators in the change table are specified by zone_id or by plant_id. A test case that tests create_scale_factor_table() for a change table compatible with the current format would be helpful.

powersimdata/design/clean_capacity_scaling.py

powersimdata/design/scenario_info.py

BainanXia · 2020-04-13T21:50:04Z

As @danielolsen mentioned, before we run an integration test on this to compare the change tables generated by the two approaches, we should add some simple unit tests on the function create_scale_factor_table().

danielolsen · 2020-04-15T18:21:29Z

I think you could get a meaningful test case of change table output by taking an Eastern grid and aggregating together all plants by type/zone into one single plant, so that you have sufficient complexity (primarily around the zone to state mapping: some states have 1 zone, some states have multiple zones with all different names, some states have multiple zones where one zone matches the state name) without an excessive amount of input data.

EDIT: or you could get even simpler by doing the above and dropping all states besides Florida, North Carolina, and Maine. Those three states cover the three cases I mentioned above, which I think is all of the edge cases we want to be sure of.
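
The aggregation described above (all plants of one type in one zone collapsed into a single plant) could be sketched as follows. The plant records are invented, and `aggregate_plants` is a hypothetical helper, not part of PowerSimData; only the 'zone_id', 'type' and 'Pmax' fields are used.

```python
from collections import defaultdict

# Hypothetical plant records (capacities in MW).
plants = [
    {"plant_id": 0, "zone_id": 1, "type": "wind", "Pmax": 400.0},
    {"plant_id": 1, "zone_id": 1, "type": "wind", "Pmax": 498.8},
    {"plant_id": 2, "zone_id": 16, "type": "solar", "Pmax": 3000.0},
    {"plant_id": 3, "zone_id": 16, "type": "solar", "Pmax": 193.134},
    {"plant_id": 4, "zone_id": 17, "type": "solar", "Pmax": 357.082},
]


def aggregate_plants(plants):
    """Sum Pmax per (zone_id, type) and emit one plant per group."""
    totals = defaultdict(float)
    for plant in plants:
        totals[(plant["zone_id"], plant["type"])] += plant["Pmax"]
    # Sort the groups so the new plant_id values are deterministic.
    return [
        {"plant_id": new_id, "zone_id": zone_id, "type": gen_type, "Pmax": pmax}
        for new_id, ((zone_id, gen_type), pmax) in enumerate(sorted(totals.items()))
    ]


aggregated = aggregate_plants(plants)
for plant in aggregated:
    print(plant)
```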

dmuldrew · 2020-04-15T23:29:39Z

We can potentially use the approach Daniel suggested above using real grid data, and also, I just worked out a simple integration test here building on the previous capacity planning tests:
https://github.com/intvenlab/PowerSimData/blob/capacity_planning_ct_output/powersimdata/design/tests/test_change_table_output.py
though I still need to tweak it so the output is a bit more interesting.

danielolsen · 2020-04-16T21:02:58Z

Here are the summed generator capacities for the six zones which are required to handle the three cases I mentioned above. Each capacity sum can be aggregated into a single generator in a MockGrid; I believe we will only need the 'Pmax' and 'zone_id' columns.

For a change table output that can integrate with the rest of our process, the calculated scaling factors for the Maine TargetManager need to be applied to that state's one zone, the scaling factors for the North Carolina TargetManager need to be applied to those two zones, and the scaling factors for the Florida TargetManager need to be applied to those three zones. See state2loadzone in https://github.com/intvenlab/PreREISE/blob/develop/prereise/gather/constants.py

Maine (zone_id=1)
    coal           0.000
    dfo          917.597
    hydro        714.800
    ng          1758.198
    other        361.000
    solar          1.000
    wind         898.800
North Carolina (zone_id=16)
    coal        9531.899
    dfo          443.427
    hydro        650.273
    ng          9719.910
    nuclear     4875.788
    other        360.565
    solar       3193.134
    wind         208.000
Western North Carolina (zone_id=17)
    coal        1962.306
    dfo           47.374
    hydro       1335.117
    ng          3434.384
    other          5.189
    solar        357.082
    wind           1.000
Florida Panhandle (zone_id=21)
    coal        1767.298
    dfo           29.532
    hydro         55.701
    ng          1794.551
    other         92.258
    solar         54.357
    wind           1.000
Florida North (zone_id=22)
    coal        6055.942
    dfo         1537.361
    ng         14474.008
    other        227.127
    solar        687.533
    wind           1.000
Florida South (zone_id=23)
    coal        3267.056
    dfo         4096.413
    ng         31718.224
    nuclear     3341.230
    other        567.613
    solar       1121.009
    wind           1.000
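
To make the state-to-zone step concrete, here is a small sketch that copies each state-level factor onto every zone of that state. The mapping below is a hand-written subset using the zone names listed above; the real `state2loadzone` may have a different container type, and `expand_to_zones` and the factor values are hypothetical.

```python
# Hand-written subset of a state -> load zone mapping (assumed shape).
state2loadzone = {
    "Maine": ["Maine"],
    "North Carolina": ["North Carolina", "Western North Carolina"],
    "Florida": ["Florida Panhandle", "Florida North", "Florida South"],
}


def expand_to_zones(state_factors, state2loadzone):
    """Turn {resource: {state: factor}} into {resource: {zone_name: factor}}."""
    zone_factors = {}
    for resource, by_state in state_factors.items():
        zone_factors[resource] = {}
        for state, factor in by_state.items():
            for zone in state2loadzone[state]:
                zone_factors[resource][zone] = factor
    return zone_factors


# Invented state-level factors, one per target region.
state_factors = {"solar": {"Maine": 2.0, "North Carolina": 1.5, "Florida": 1.25}}
zone_factors = expand_to_zones(state_factors, state2loadzone)
print(zone_factors)
```

Note that 'Florida' disappears as a key, 'North Carolina' survives only because a zone happens to share the state's name, and 'Western North Carolina' picks up the same factor.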

danielolsen · 2020-04-22T18:35:56Z

If this is now ready to go, could you please rebase onto develop, ensure all tests still work, and edit some more context into the first post? Lately I've been following the template in the PR Request Etiquette wiki page, and found it's very helpful in organizing my explanations of what the code is doing and why, and also in discovering things that may be incomplete or not explained well.

danielolsen · 2020-04-23T21:21:16Z

Trying the new function using the objects we've been using to set up Eastern scenarios seems to fail:

PS DROPBOX_PATH\Results\Scenarios\ScenarioRuns\Eastern2030Independent> python
Python 3.7.4 (tags/v3.7.4:e09359112e, Jul  8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pandas as pd
>>> from powersimdata.scenario.scenario import Scenario
>>> from powersimdata.design.scenario_info import GridInfo
>>> from powersimdata.design.scenario_info import ScenarioInfo
>>> from powersimdata.design.clean_capacity_scaling import IndependentStrategyManager
>>> scenario_string = '403'
>>> targets_info_location = 'Eastern 2030 Clean Energy Targets - natural log.csv'
>>> scenario = Scenario(scenario_string)
SCENARIO: test | EasternBase_2020_3

--> State
analyze
100%|#####################################| 8.12k/8.12k [00:00<00:00, 70.3kb/s]
>>> scenario_info = ScenarioInfo(scenario)
[...excessive ScenarioInfo output truncated...]
>>> eastern = pd.read_csv(targets_info_location)
>>> eastern['external_ce_historical_amount'] = eastern['external_ce_historical_amount'].fillna(0)
>>> eastern['allowed_resources'] = eastern['allowed_resources'].fillna('solar,wind')
>>> eastern['ce_category'] = eastern['ce_category'].fillna('TBD')
>>> eastern['solar_percentage'] = eastern['solar_percentage'].fillna('None')
>>> independent_strategy_manager = IndependentStrategyManager()
>>> independent_strategy_manager.targets_from_data_frame(eastern)
>>> start_time = '2016-01-01 00:00:00'
>>> end_time = '2016-12-31 23:00:00'
>>> independent_strategy_manager.populate_targets_with_resources(scenario_info, start_time, end_time)

[...Alabama and Arkansas output truncated...]


Connecticut

Invalid resource type
Invalid resource type
Added resource coal!

Invalid resource type
Invalid resource type
Added resource dfo!

Invalid resource type
Invalid resource type
Added resource other!

C:\Python37\lib\site-packages\powersimdata\design\scenario_info.py:118: UserWarning: No such type of generator in the area specified!
  warnings.warn('No such type of generator in the area specified!')
No existing resource geothermal!
Invalid resource type
No such type of generator in the area specified. Division by zero.
Added resource geothermal!

Added resource solar!

Added resource hydro!

Invalid resource type
Invalid resource type
Added resource ng!

Invalid resource type
Invalid resource type
Added resource nuclear!

Added resource wind!

[...output after Connecticut truncated]

>>> from powersimdata.input.grid import Grid
>>> base_grid = Grid(['Eastern'])
Reading bus.csv
Reading plant.csv
Reading gencost.csv
Reading branch.csv
Reading dcline.csv
Reading sub.csv
Reading bus2sub.csv
Reading zone.csv
>>> independent_strategy_manager.create_scale_factor_table(base_grid)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python37\lib\site-packages\powersimdata\design\clean_capacity_scaling.py", line 138, in create_scale_factor_table
    gen_capacity = self.output_capacities_table(base_grid)
  File "C:\Python37\lib\site-packages\powersimdata\design\clean_capacity_scaling.py", line 106, in output_capacities_table
    .issubset(set(grid_resources)), f"{tar} region contains " \
AssertionError: Connecticut region contains generator types not contained within the grid!
>>> grid_info = GridInfo(base_grid)
>>> grid_resources = grid_info.get_available_resource('all')
>>> grid_resources
['ng', 'dfo', 'hydro', 'wind', 'coal', 'nuclear', 'solar', 'other']
>>> independent_strategy_manager.targets['Connecticut'].resources.resources.keys()
dict_keys(['coal', 'dfo', 'other', 'geothermal', 'solar', 'hydro', 'ng', 'nuclear', 'wind'])

danielolsen · 2020-04-23T22:23:30Z

The docstring is missing from AbstractStrategyManager.output_capacities_table, and appears to be incomplete for AbstractStrategyManager.create_scale_factor_table.

dmuldrew · 2020-04-24T00:32:45Z

@danielolsen This is an assertion to verify that all resources within the capacity planning target object exist within the base grid. I suppose if we need to have resources in future scenario grids that are not present in the base grid, it will have to be removed or replaced...

Right now we have the ability to specify resources not in the grid if they are present in the available_resources list. So another option might be to just remove geothermal and ng from the available_resources list of Connecticut.
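
For reference, the subset check can be reproduced with the two lists printed in the session above; a set difference shows which resource trips the assertion. This is only a sketch of the comparison, not the code in output_capacities_table.

```python
# Values copied from the terminal session earlier in this thread.
grid_resources = ['ng', 'dfo', 'hydro', 'wind', 'coal', 'nuclear', 'solar', 'other']
target_resources = ['coal', 'dfo', 'other', 'geothermal', 'solar', 'hydro', 'ng',
                    'nuclear', 'wind']

# Resources requested by the target that the base grid does not contain.
missing = set(target_resources) - set(grid_resources)
print(sorted(missing))  # ['geothermal']
```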

danielolsen · 2020-04-24T21:06:31Z

The script from my previous comment now runs, but the outputs are still not compatible with the way our change table specifies scaling. Scaling has to be specified by zone_id or plant_id, not by a string for the name of the state (which may be made up of several zones).

dmuldrew · 2020-04-27T17:53:45Z

The last step is accomplished in the notebook using:

# Build change table
for gen_type in scale_factor_input:
    scenario.state.builder.change_table.scale_plant_capacity(gen_type, zone_name=scale_factor_input[gen_type])

I think we should keep this step; otherwise, there would be a scenario object dependency within the capacity planning framework, which I'd like to avoid.

rouille · 2020-04-27T18:02:50Z

You can instantiate a ChangeTable object and fill the table with the methods defined in the object, all without any mention of the Scenario object.

danielolsen · 2020-04-27T18:32:12Z

I took a closer look at the ChangeTable.scale_plant_capacity method, and I realized I was mistaken. We do not specify the zone by zone_id, we specify it by zone_name, which looks up a zone_id. That said, my original point in #143 (comment) remains: we cannot use the list of state names that currently come out of this code as zone names for the change table, because naively doing this will result in trying to use the zone name of 'Florida', which does not correspond to any zone in our grid, and trying to use the zone name 'North Carolina' which does match one zone in our grid (number 16) but does not match the other zone that is also part of the state of North Carolina, named 'Western North Carolina' (number 17).

I don't see any update to any notebooks as part of this PR, so I'm not seeing how you are able to generate a change table that will work the way we need it to. I am not asking you to apply the change table as part of this feature, just to demonstrate somehow that the change table that is generated is compatible with the input required by the change table method.

danielolsen · 2020-04-27T20:11:02Z

Here's the dictionary that I get when running my code pasted above (which mimics what we are doing now for Eastern runs):

{'ng': {'Alabama': 1.0032506021583836, 'Arkansas': 1.000000414972369, 'Connecticut': 1.0204997663847009, 'Delaware': 0.9875339831576154, 'Florida': 0.9684479161689177, 'Georgia': 0.9999999463873118, 'Illinois': 1.0000569918842521, 'Indiana': 1.0044732724056193, 'Iowa': 0.8880397157452772, 'Kansas': 0.8614981906286743, 'Kentucky': 1.0000004713148365, 'Louisiana': 0.9337869132555634, 'Maine': 0.9453997786369908, 'Maryland': 1.0057792777158723, 'Massachusetts': 1.1114740765842086, 'Michigan': 1.016367574982658, 'Minnesota': 1.0750730973302938, 'Mississippi': 0.9123217303976288, 'Missouri': 0.9978717263692048, 'Montana Eastern': 1.0000046168264858, 'Nebraska': 0.975573147634249, 'New Hampshire': 0.9999999999999993, 'New Jersey': 0.9177177526117556, 'New Mexico Eastern': 0.988831201930389, 'New York': 0.9974629266151973, 'North Carolina': 1.0320508269010866, 'North Dakota': 1.5073097215042996, 'Ohio': 1.0000685939593852, 'Oklahoma': 1.0031298217720679, 'Pennsylvania': 1.0020369445578527, 'Rhode Island': 1.000001520452365, 'South Carolina': 0.9936210843287235, 'South Dakota': 0.999999062793756, 'Tennessee': 1.0008258236200247, 'Texas': 0.994642312338667, 'Virginia': 0.9633879045948965, 'West Virginia': 0.9943975081454138, 'Wisconsin': 0.9978605183961953}, 'dfo': {'Alabama': 0.9999790799355662, 'Arkansas': 1.0000000000000007, 'Connecticut': 0.9999971599058217, 'Delaware': 1.0, 'Florida': 0.6699796903080989, 'Georgia': 0.9999925415436025, 'Illinois': 1.0034557978339684, 'Indiana': 0.4042744586047457, 'Iowa': 0.9576596459454403, 'Kansas': 1.0367338330872906, 'Kentucky': 0.9998261171970092, 'Louisiana': 1.0000362331968546, 'Maine': 1.0000032694091199, 'Maryland': 0.9532071749718317, 'Massachusetts': 0.8663954778382386, 'Michigan': 0.966804994058939, 'Minnesota': 0.9808934206541243, 'Mississippi': 0.3749999999999997, 'Missouri': 1.0152037509218936, 'Nebraska': 1.000022179588126, 'New Hampshire': 0.9999909091735524, 'New Jersey': 0.3482931699709367, 'New York': 
0.9642971771478442, 'North Carolina': 0.8139755216472672, 'North Dakota': 0.9999999999999998, 'Ohio': 0.9518273123310287, 'Oklahoma': 1.0, 'Pennsylvania': 1.0000009555214326, 'Rhode Island': 1.0, 'South Carolina': 0.9101068867925857, 'South Dakota': 1.0000070822031317, 'Tennessee': 0.9999789920379826, 'Vermont': 0.9999774271461137, 'Virginia': 1.000000000000004, 'West Virginia': 1.0, 'Wisconsin': 0.9781541029876054}, 'hydro': {'Alabama': 1.0000024101525256, 'Arkansas': 0.9999985173125991, 'Connecticut': 1.0000066890079526, 'Florida': 0.9999820470009514, 'Georgia': 1.0000019456372282, 'Illinois': 1.0000503803718066, 'Indiana': 1.000021715998176, 'Iowa': 0.9999922601218256, 'Kansas': 0.999714367323622, 'Kentucky': 0.9999980326617491, 'Louisiana': 1.0, 'Maine': 1.0000000000000018, 'Maryland': 0.9999963689311221, 'Massachusetts': 1.0000022112935187, 'Michigan': 1.000002990950666, 'Minnesota': 0.9999814216178056, 'Missouri': 0.99999819233877, 'Nebraska': 1.0000000000000002, 'New Hampshire': 0.9999835219287811, 'New Jersey': 1.0, 'New York': 0.9999971272705335, 'North Carolina': 1.000005036793779, 'North Dakota': 1.0, 'Ohio': 0.9999844481423308, 'Oklahoma': 1.0000009358923125, 'Pennsylvania': 0.9999971550578803, 'Rhode Island': 1.0, 'South Carolina': 0.999997286905547, 'South Dakota': 0.9999987516400328, 'Tennessee': 1.0000004747213267, 'Texas': 1.0, 'Vermont': 0.9999664030835853, 'Virginia': 1.0000022890913867, 'West Virginia': 0.9999919050844165, 'Wisconsin': 0.9999925498785628}, 'wind': {'Alabama': 1.0, 'Arkansas': 1.0, 'Connecticut': 263.66173239788685, 'Delaware': 183.62629293460802, 'Florida': 1.0, 'Georgia': 1.0, 'Illinois': 3.012378439179425, 'Indiana': 1.8277598950997465, 'Iowa': 1.0580403343030635, 'Kansas': 0.9999998373005163, 'Kentucky': 1.0, 'Louisiana': 1.0, 'Maine': 2.660423026247092, 'Maryland': 38.57980183945357, 'Massachusetts': 22.040297553986676, 'Michigan': 6.562671444569763, 'Minnesota': 1.2812401305104153, 'Mississippi': 1.0, 'Missouri': 
4.484511782435197, 'Montana Eastern': 1.0, 'Nebraska': 1.0755694301422507, 'New Hampshire': 7.246082756147602, 'New Jersey': 521.0767010771756, 'New Mexico Eastern': 1.0618257613197657, 'New York': 14.676253540730988, 'North Carolina': 4.042248561205534, 'North Dakota': 1.1113760940246078, 'Ohio': 7.957554140145644, 'Oklahoma': 1.0419111544501412, 'Pennsylvania': 5.85778002506734, 'Rhode Island': 11.987150122467542, 'South Carolina': 1.0, 'South Dakota': 1.03891429127347, 'Tennessee': 0.9999999999999999, 'Texas': 1.0248192158483063, 'Vermont': 6.181377134828993, 'Virginia': 310.68348073014613, 'West Virginia': 1.0, 'Wisconsin': 3.447680654964134}, 'coal': {'Alabama': 0.8251137988497128, 'Arkansas': 1.0, 'Connecticut': 1.0, 'Delaware': 1.0, 'Florida': 0.7906010804400528, 'Georgia': 0.8881670736521574, 'Illinois': 0.8569402648406381, 'Indiana': 0.9656815390763844, 'Iowa': 1.0000004971416834, 'Kansas': 0.9834228946882186, 'Kentucky': 0.8650083003775229, 'Louisiana': 1.0000000000000002, 'Maryland': 0.9093388041902145, 'Massachusetts': 0.0, 'Michigan': 0.911514960463835, 'Minnesota': 0.9677820528268913, 'Mississippi': 0.8010239257584183, 'Missouri': 0.9191389719135256, 'Montana Eastern': 1.0, 'Nebraska': 1.0000007532205832, 'New Hampshire': 1.0, 'New Jersey': 0.392540757835182, 'New York': 1.0000005406285022, 'North Carolina': 0.9661042238240921, 'North Dakota': 0.9562768671984551, 'Ohio': 0.7548696051642251, 'Oklahoma': 1.0000003723702746, 'Pennsylvania': 0.7940446194340689, 'South Carolina': 1.000000361866639, 'South Dakota': 1.0, 'Tennessee': 0.8166491334954616, 'Texas': 1.0000004934862285, 'Virginia': 0.7707470744632976, 'West Virginia': 0.9674929131293989, 'Wisconsin': 0.7655943379099679}, 'nuclear': {'Alabama': 1.0000003846801777, 'Arkansas': 0.999999705269626, 'Connecticut': 1.0000002442179687, 'Florida': 1.0000002402563577, 'Georgia': 0.9999998598059594, 'Illinois': 1.0000003279216907, 'Iowa': 1.0000000893008618, 'Kansas': 0.9999997393232486, 'Louisiana': 
0.9999999656936831, 'Maryland': 1.0000000292354225, 'Massachusetts': 0.0, 'Michigan': 1.0000002673975597, 'Minnesota': 1.0000001835494006, 'Mississippi': 1.0000003671203896, 'Missouri': 0.9999998333215375, 'Nebraska': 1.0000001751025456, 'New Hampshire': 1.0000001432365833, 'New Jersey': 1.0000001424732798, 'New York': 1.0000000310147061, 'North Carolina': 0.9999998628010267, 'Ohio': 0.9999995045938845, 'Pennsylvania': 1.0000001199178026, 'South Carolina': 1.0000001810571604, 'Tennessee': 0.999999727560539, 'Virginia': 1.0000002188271357, 'Wisconsin': 0.9999996035604168}, 'solar': {'Alabama': 1.2769130998702987, 'Arkansas': 1.1702127659574468, 'Connecticut': 227.41437264232292, 'Delaware': 47.781461610657054, 'Florida': 1.150572306925926, 'Georgia': 1.0771962218592432, 'Illinois': 51.3967979599255, 'Indiana': 4.428253397767584, 'Iowa': 4.73076923076923, 'Kansas': 10.0, 'Kentucky': 2.63, 'Louisiana': 1.0, 'Maine': 253.82044754986876, 'Maryland': 55.20536023434069, 'Massachusetts': 14.280948514822134, 'Michigan': 195.33318948709177, 'Minnesota': 3.966826083238876, 'Mississippi': 1.040553435114504, 'Missouri': 35.111812133363536, 'Montana Eastern': 1.0, 'Nebraska': 3.079365079365079, 'New Hampshire': 126.39990385712784, 'New Jersey': 43.21574086325935, 'New Mexico Eastern': 1.0303030303030303, 'New York': 95.29733439781107, 'North Carolina': 1.9987328221954552, 'North Dakota': 1.0, 'Ohio': 27.251285112601057, 'Oklahoma': 12.2, 'Pennsylvania': 32.676943323133, 'Rhode Island': 44.58809962982195, 'South Carolina': 2.092496765847348, 'South Dakota': 1.0, 'Tennessee': 1.460987261146497, 'Texas': 1.0, 'Vermont': 9.15918037240631, 'Virginia': 16.255644082702236, 'West Virginia': 1.0, 'Wisconsin': 133.36231544118925}, 'other': {'Alabama': 1.0000064565016746, 'Arkansas': 0.9999968436881873, 'Connecticut': 1.0, 'Delaware': 1.0000077284707973, 'Florida': 1.000002254796516, 'Georgia': 1.0000003780127775, 'Illinois': 0.9999974410248127, 'Indiana': 1.0, 'Iowa': 1.0000039922442505, 
'Kansas': 0.9999620635472295, 'Kentucky': 1.0000019125015416, 'Louisiana': 0.9999990420598269, 'Maine': 1.0, 'Maryland': 1.0, 'Massachusetts': 1.0, 'Michigan': 1.0, 'Minnesota': 0.9999977048797855, 'Mississippi': 1.0000018293917894, 'Missouri': 1.0000007567622458, 'Montana Eastern': 0.9999932450252547, 'Nebraska': 0.9999863522922233, 'New Hampshire': 1.0000020831038694, 'New Jersey': 1.0000031975762218, 'New Mexico Eastern': 1.0, 'New York': 1.0000027322479021, 'North Carolina': 1.0000011114973641, 'North Dakota': 0.9999144460652031, 'Ohio': 1.0000048698760164, 'Oklahoma': 0.9999959958722264, 'Pennsylvania': 1.0000000700104457, 'Rhode Island': 0.9999893329253294, 'South Carolina': 1.0000003258808434, 'South Dakota': 1.0, 'Tennessee': 1.0000003200588197, 'Texas': 1.000001967476799, 'Vermont': 0.9999947033102434, 'Virginia': 0.9999980695017964, 'West Virginia': 1.0, 'Wisconsin': 0.9999977965055532}}

I see a few issues:

We are scaling state names, not zone names (see previous comment)
We are scaling 'ng', 'nuclear', etc., and I am not sure that we want to do this as part of the clean capacity scaling.
We are scaling several types/zones by almost 1, which suggests that we do not actually want to apply any scaling factors here, but there is some floating point cruft because capacity values are very similar but off by just a bit.
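
One way to deal with the near-1 factors is to drop anything within a small relative tolerance of 1 before building the change table. The sketch below uses a few values from the dictionary above; `drop_near_unity` is a hypothetical helper and the tolerance of 1e-4 is an assumption, not a value agreed on in this PR.

```python
import math


def drop_near_unity(factors, rel_tol=1e-4):
    """Remove factors that are 1 up to rel_tol; drop resources left empty."""
    cleaned = {}
    for resource, by_zone in factors.items():
        kept = {
            zone: factor
            for zone, factor in by_zone.items()
            if not math.isclose(factor, 1.0, rel_tol=rel_tol)
        }
        if kept:
            cleaned[resource] = kept
    return cleaned


# A few entries copied from the dictionary above.
factors = {
    "ng": {"Alabama": 1.0032506021583836, "Georgia": 0.9999999463873118},
    "hydro": {"Maine": 1.0000000000000018},
    "wind": {"Maine": 2.660423026247092},
}
cleaned = drop_near_unity(factors)
print(cleaned)
```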

When I try to apply this ct to a new scenario, scale_plant_capacity() returns without adding anything to the change table dictionary (see https://github.com/intvenlab/PowerSimData/blob/develop/powersimdata/input/change_table.py#L162), because it got at least one zone name that is not present in the grid:

>>> new_scenario = Scenario('')
>>> new_scenario.state.set_builder(['Eastern'])
>>> new_scenario.state.builder.change_table.scale_plant_capacity('solar', zone_name=ct['solar'])
--------------
Possible zones
--------------
Maine
New Hampshire
Vermont
Massachusetts
Rhode Island
Connecticut
New York City
Upstate New York
New Jersey
Pennsylvania Eastern
Pennsylvania Western
Delaware
Maryland
Virginia Mountains
Virginia Tidewater
North Carolina
Western North Carolina
South Carolina
Georgia North
Georgia South
Florida Panhandle
Florida North
Florida South
Alabama
Mississippi
Tennessee
Kentucky
West Virginia
Ohio River
Ohio Lake Erie
Michigan Northern
Michigan Southern
Indiana
Chicago North Illinois
Illinois Downstate
Wisconsin
Minnesota Northern
Minnesota Southern
Iowa
Missouri East
Missouri West
Arkansas
Louisiana
East Texas
Texas Panhandle
New Mexico Eastern
Oklahoma
Kansas
Nebraska
South Dakota
North Dakota
Montana Eastern
>>> new_scenario.state.builder.change_table.ct
{}
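
Since a single unknown zone name is enough for nothing to be added, it may help to split the factors into known and unknown zone names before calling scale_plant_capacity(). The sketch below is plain Python with a hand-picked subset of the zone list above; `split_by_known_zones` is a hypothetical helper, not a PowerSimData function.

```python
def split_by_known_zones(zone_factors, grid_zones):
    """Return (factors for zones in the grid, sorted list of unknown names)."""
    known = {zone: f for zone, f in zone_factors.items() if zone in grid_zones}
    unknown = sorted(set(zone_factors) - set(grid_zones))
    return known, unknown


# Subset of the 'Possible zones' printed above.
grid_zones = {
    "Maine", "North Carolina", "Western North Carolina",
    "Florida Panhandle", "Florida North", "Florida South",
    "East Texas", "Texas Panhandle",
}
# Solar factors for four regions, copied from the dictionary in this comment.
solar = {
    "Maine": 253.82044754986876,
    "North Carolina": 1.9987328221954552,
    "Florida": 1.150572306925926,
    "Texas": 1.0,
}
known, unknown = split_by_known_zones(solar, grid_zones)
print(unknown)  # ['Florida', 'Texas']
```

Passing only `known` would avoid the silent no-op, but the unknown names still indicate state-level factors that need to be mapped to zones rather than discarded.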

dmuldrew · 2020-04-27T23:03:32Z

So it sounds like we only need to output scaling factors for the wind and solar resources at this point?

danielolsen · 2020-04-27T23:39:21Z

Any changes to the scaling of generators besides solar and wind must be coming from either scaling that occurred in the scenario loaded into ScenarioInfo, floating point errors, or some bug in capacity scaling. Either way, I do not think we want to retrieve these values from the clean capacity scaling process. @BainanXia, do you agree?

@dmuldrew could you please run an integration test to demonstrate that this now works as expected? It does not need to be anything fancy, just copying/pasting from a terminal to show success will be enough.

BainanXia · 2020-04-28T00:01:36Z

If the change table output of the capacity scaling framework only contains wind and solar, we will have to go through the same procedure as we did previously, using the target capacity Excel sheet to load target capacities for other generators, which makes it meaningless. I was expecting the change table output here to be a complete one that covers all the scalings coming from the target capacities. After running this, we just need to add other entries to the change table when creating the next scenario.

danielolsen · 2020-04-28T00:12:03Z

@BainanXia I see your point. That brings up two questions:

Where should the desired capacities for other types of generators come from? Is it the optional input gen_capacity to create_scale_factor_table()? Side note: the docstring for that parameter is unclear as to whether that input should be capacities (MW) or scaling factors (unitless).
If we do not pass anything to gen_capacity in the call to create_scale_factor_table(), should we get no scaling factors for any other generator types (besides wind and solar), or should we get scaling factors that will produce the capacities from the scenario loaded into ScenarioInfo, when they are applied to a base grid?

dmuldrew · 2020-04-28T00:18:13Z

I updated the capacity planning demo notebook:
https://github.com/intvenlab/PowerSimData/blob/capacity_planning_ct_output/powersimdata/design/demo/eastern_clean_capacity_scaling_demo.ipynb
and I'm seeing some scaling on ng in addition to wind and solar with respect to the Eastern base grid. I'm not sure if this is expected... my understanding is that you're tweaking natural gas generation capacity in scenarios as well?

{'ng': {'Alabama': 1.0,
              'Arkansas': 1.0,
              'Connecticut': 0.7286196063433874,
              'Delaware': 1.0,
              'Florida Panhandle': 0.9133782316684993,
              'Florida South': 0.9133782316684993,
              'Florida North': 0.9133782316684993,
              'Georgia South': 1.0,
              'Georgia North': 1.0,
              'Iowa': 0.8317044578649616,
              'Illinois Downstate': 1.0,
              'Chicago North Illinois': 1.0,
              'Indiana': 0.8580860268108403,
              'Kansas': 1.0,
              'Kentucky': 0.8633186974366128,
              'Louisiana': 0.9542411603416739,
              'Massachusetts': 0.9347475432056931,
              'Maryland': 0.6160753429662239,
              'Maine': 1.0,
              'Michigan Northern': 1.0,
              'Michigan Southern': 1.0,
              'Minnesota Southern': 1.0,
              'Minnesota Northern': 1.0,
              'Missouri East': 1.0,
              'Missouri West': 1.0,
              'Mississippi': 1.0,
              'Montana Eastern': 1.0,
              'Western North Carolina': 0.9586522849496901,
              'North Carolina': 0.9586522849496901,
              'North Dakota': 1.0,
              'Nebraska': 1.0,
              'New Hampshire': 1.0,
              'New Jersey': 0.9570253868208743,
              'New Mexico Eastern': 1.0,
              'New York City': 0.9680855631853625,
              'Upstate New York': 0.9680855631853625,
              'Ohio River': 0.7685243368575096,
              'Ohio Lake Erie': 0.7685243368575096,
              'Oklahoma': 0.9616510111695022,
              'Pennsylvania Western': 0.6871816189767775,
              'Pennsylvania Eastern': 0.6871816189767775,
              'Rhode Island': 1.0,
              'South Carolina': 0.8917456125456951,
              'South Dakota': 1.0,
              'Tennessee': 0.8464172686329734,
              'El Paso': 1.0,
              'Far West': 1.0,
              'South': 1.0,
              'North': 1.0,
              'East': 1.0,
              'Texas Panhandle': 1.0,
              'South Central': 1.0,
              'West': 1.0,
              'East Texas': 1.0,
              'Coast': 1.0,
              'North Central': 1.0,
              'Virginia Tidewater': 0.8319038692734475,
              'Virginia Mountains': 0.8319038692734475,
              'Wisconsin': 1.0,
              'West Virginia': 1.0}

danielolsen · 2020-04-28T19:37:55Z

New natural gas generators have been added since scenario 394 (see #118), which is why you are calculating a scaling factor when you use that scenario as a base. Currently in both our 2016 runs and our 2020 runs, we turn off some subset of generators in our grid to reflect that some of them either a) have retired or b) have not yet been installed, so I don't think we can compare total capacities between two grids to get the scaling factors we want for non-renewable types. Doing this automatically would (I think) require adding extra info to the plant table (operating/retirement year), adding another parameter to the algorithm (simulation year), and passing capacity targets to this process (which we can do).

There's another problem with the change table output in the last post: it has ERCOT zones for Texas. That's why it's not adding properly in the demo notebook.

danielolsen · 2020-04-29T22:06:09Z

My notes from our sync-up today:

Project goals
- The ultimate purpose is for the output of capacity scaling to be a change table dictionary consistent with the input to ChangeTable.scale_plant_capacity (keys of zone_name and plant_id) or the data structure specified in the ChangeTable docstring (keys of zone_id and plant_id).
- Use as input: an already-run 'base' scenario, which contains both a) PG results sufficient to calculate capacity factors for renewable generators and b) non-renewable generator capacities as they should be in the new scenario. The framework should calculate scaling factors for non-renewable generators by plant_id and/or zone_id so that the result of applying the scaling factors onto an unscaled grid is the same capacities as in the 'base' scenario (within rounding errors). Ideally scaling factors that are less than some epsilon (0.01%?) should be ignored, but this is a low priority compared to the main goals.
To solve the ERCOT load zone issue: after you derive your list of relevant zones, filter out zones that are not present in the base scenario. This will eliminate the ERCOT Texas zones, and maintain Texas Panhandle and East Texas. Any time your output contains -------------- Possible zones --------------, that means you are passing at least one zone that is not present in the grid being scaled, and scaling factors for zones that are in the grid are potentially being discarded. See Capacity planning ct scaling output #143 (comment).
We do not need to pass an unscaled 'base' grid to the method, since we can use a fresh unscaled grid instantiated with the interconnect(s) of the 'base' scenario.
To check whether outputs look the way they should: using scenario 403 as the 'base', recreate the change tables from scenario 408 (Independent, see RunEastern2030IndependentAnchor_NGCoal_scaling.ipynb in Dropbox/Results/Scenarios/ScenarioRuns/Eastern2030Independent/) and scenario 411 (Collaborative, see RunEastern2030CollaborativeAnchor.ipynb in Dropbox/Results/Scenarios/ScenarioRuns/Eastern2030Collaborative/).
- NOTE: both of these scenarios also have additional zone scaling factors for coal and natural gas, compared to scenario 403! See the notebooks, and specifically where they access Eastern 2030 NG and Coal_v2020to2030.xlsx. So to verify, you can either:
  - make a copy of the notebooks (do not modify the notebooks in /ScenarioRuns/) and relevant files and remove this extra scaling, so the change tables of your uncreated scenario (new_scenario.state.builder.change_table.ct) match between the notebook and the capacity scaling framework, or
  - apply the same additional coal and gas scaling, so that the output of the capacity scaling matches what you get when calling Scenario('408/411').state.get_ct().
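
A minimal sketch of the zone filtering and epsilon rule described above, using plain dictionaries rather than the real Grid objects. The function name, the zone capacities, and the reading of "epsilon" as "a factor within 0.01% of 1" are assumptions made for illustration, not part of the framework.

```python
def zone_scale_factors(target_mw, unscaled_mw, epsilon=1e-4):
    """Return {zone: factor} such that unscaled_mw[zone] * factor == target_mw[zone].

    target_mw: capacities from the 'base' scenario, keyed by zone name.
    unscaled_mw: capacities from a fresh unscaled grid, keyed by zone name.
    epsilon: assumed meaning, factors closer to 1 than this are dropped.
    """
    factors = {}
    for zone, target in target_mw.items():
        # Drop zones that are not in the grid being scaled.
        if zone not in unscaled_mw:
            continue
        existing = unscaled_mw[zone]
        # Zero existing capacity cannot be expressed as a zone factor.
        if existing == 0:
            continue
        factor = target / existing
        # Ignore factors that are indistinguishable from 'no change'.
        if abs(factor - 1.0) < epsilon:
            continue
        factors[zone] = factor
    return factors


# Illustrative numbers only; 'Coast' stands in for a zone outside the grid.
base = {"East Texas": 150.0, "Texas Panhandle": 80.0, "Coast": 500.0}
fresh = {"East Texas": 100.0, "Texas Panhandle": 80.0}
factors = zone_scale_factors(base, fresh)
```

Filtering before computing factors avoids passing a zone that the grid does not know about, which is what triggers the "Possible zones" output.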

If something here is unclear, wrong, or incomplete, please let me know.

danielolsen · 2020-04-30T20:44:46Z

One more thing to be aware of: the 'base' scenario may have fewer generators than exist in a new unscaled grid in develop! For example, if/when we merge in the offshore wind data branch, there will be more plants in Grid(['Eastern']) than there were in scenario 403. In that case, these new generators should be scaled by plant_id down to 0, so that the grid that is generated will be functionally identical to the grid in the 'base' scenario.
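
As a rough sketch of that idea, assuming plant IDs are available as plain iterables (the function name and the IDs below are hypothetical):

```python
def plants_to_switch_off(fresh_plant_ids, base_plant_ids):
    """Return {plant_id: 0.0} for plants in the fresh grid but not in the base scenario."""
    known = set(base_plant_ids)
    return {pid: 0.0 for pid in fresh_plant_ids if pid not in known}


# Hypothetical IDs: plants 13 and 14 were added after the base scenario was run.
new_plant_factors = plants_to_switch_off([10, 11, 12, 13, 14], [10, 11, 12])
```

The resulting mapping could then be fed into the plant_id part of a change table entry for the corresponding resource type.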

BainanXia · 2020-04-30T20:49:31Z

One more thing to be aware of: the 'base' scenario may have fewer generators than exist in a new unscaled grid in develop! For example, if/when we merge in the offshore wind data branch, there will be more plants in Grid(['Eastern']) than there were in scenario 403. In that case, these new generators should be scaled by plant_id down to 0, so that the grid that is generated will be functionally identical to the grid in the 'base' scenario.

I had a similar comment sitting in my draft box when yours popped up ;)

dmuldrew · 2020-05-04T17:37:21Z

The results of validating the new method of generating the change table, which uses the new GridInfo class, are in this notebook:
https://github.com/intvenlab/PowerSimData/blob/capacity_planning_ct_output/powersimdata/design/demo/ScaleFactorComparison.ipynb

In summary, all scaling factors generated by the new method are also among the scaling factors generated by the old method. However, there appears to be a minor bug in the old method: it attempts to scale some states that have no resource capacity and more than one loadzone down to zero. I think it has to do with the initialization of sum_state_ca to 0 before summing:

sum_state_ca = 0
if area in state2loadzone:
    for loadzone in state2loadzone[area]:
        if loadzone in sum_by_type_zone and colname_map[gen_type] in sum_by_type_zone[loadzone]:
            sum_state_ca += sum_by_type_zone[loadzone][colname_map[gen_type]]
    for loadzone in state2loadzone[area]:
        if loadzone in sum_by_type_zone and colname_map[gen_type] in sum_by_type_zone[loadzone]:
            scale_factor[colname_map[gen_type]][loadzone] = (row[gen_type],sum_state_ca)
else:
    if colname_map[gen_type] in sum_by_type_zone[area]:
        sum_state_ca = sum_by_type_zone[area][colname_map[gen_type]]
    scale_factor[colname_map[gen_type]][area] = (row[gen_type],sum_state_ca)

This is later corrected by the change table object validation.

The new method has no such issue, and the change table object has no complaints about the generated scaling factors.
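
For illustration, one way to share a state-level target across its loadzones without emitting factors when the state total is zero. This is a sketch under assumptions (hypothetical function name, made-up capacities), not the code in this PR:

```python
def state_to_loadzone_factors(state_target_mw, loadzone_mw):
    """Give every loadzone of a state the same factor: target / summed capacity.

    loadzone_mw: existing capacity of one resource type, keyed by loadzone.
    Returns {} when the state has no existing capacity, so nothing is
    'scaled to zero'. A non-zero target in that case would have to be
    reported separately by the caller.
    """
    total = sum(loadzone_mw.values())
    if total == 0:
        return {}
    factor = state_target_mw / total
    # Only loadzones that actually hold capacity get an entry.
    return {zone: factor for zone, mw in loadzone_mw.items() if mw > 0}


# Made-up capacities for a two-loadzone state.
florida = state_to_loadzone_factors(300.0, {"Florida North": 100.0, "Florida South": 50.0})
empty = state_to_loadzone_factors(0.0, {"Florida North": 0.0, "Florida South": 0.0})
```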

BainanXia · 2020-05-04T17:54:50Z

@dmuldrew Whenever a non-zero target capacity is designed for a resource with zero existing capacity, we consider that something is wrong and the user needs to be notified. That is not a bug, and we should be aware of this case in the new framework as well (we could add dummy 1 MW generators later to fix it). In the other case you mentioned, where the existing capacity is zero and the target capacity is also zero, this is considered a normal situation, not 'attempting to scale multiple loadzones down to zero'. It can certainly be handled differently; in the new framework these cases are identified as 'no scale', which is fine.
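
The three cases can be sketched as follows. The function name and the labels are made up for illustration and do not correspond to anything in the codebase:

```python
def classify_scaling(existing_mw, target_mw):
    """Return (label, factor) for one resource type in one zone."""
    if existing_mw == 0:
        if target_mw == 0:
            # Nothing exists and nothing is wanted: normal, no scaling needed.
            return ("no_scale", None)
        # Capacity is wanted where none exists: the user must be notified.
        return ("needs_attention", None)
    return ("scale", target_mw / existing_mw)


cases = [classify_scaling(0.0, 0.0), classify_scaling(0.0, 25.0), classify_scaling(50.0, 75.0)]
```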

Thanks for the validation. I've read through the notebook and find the comparison of the change table outputs convincing. I think @danielolsen should also give a pass before we can close this.

dmuldrew · 2020-05-04T18:01:11Z

I think it's validated enough to merge in. Since these new functions are an optional output for our workflow, we can still do some additional comparisons with new scenarios before we rely solely on this new output.

danielolsen · 2020-05-04T18:29:42Z

Thank you for the notebook, the creation of a change table with zone scaling factors appears to be working very well.

Could you please rebase onto develop and run an integration test, starting from Scenario 403 and the csv with the target info, and ending with a change_table? There are a couple of potential issues that I want to check against:

- We want the generators that are turned off in Scenario 403 to be turned off in our change table as well, and we want the scaling factors to result in the right total capacity even when these generators are turned off. This will require scaling some plants by plant_id. Otherwise, I don't think we will get scaling factors that result in the same capacities for the non-renewable generators.
- We want the generators that have been added to the grid since Scenario 403 (i.e. offshore wind) to be turned off as well.

dmuldrew · 2020-05-04T23:26:38Z

@danielolsen I think this is part of what you're looking for?
https://github.com/intvenlab/PowerSimData/blob/capacity_planning_ct_output/powersimdata/design/demo/eastern_clean_capacity_scaling_demo.ipynb

'wind_offshore': {'Connecticut': 0.0,
                  'Delaware': 0.0,
                  'Florida South': 0.0,
                  'Florida North': 0.0,
                  'Georgia South': 0.0,
                  'Massachusetts': 0.0,
                  'Maryland': 0.0,
                  'Maine': 0.0,
                  'North Carolina': 0.0,
                  'New Hampshire': 0.0,
                  'New Jersey': 0.0,
                  'New York City': 0.0,
                  'Rhode Island': 0.0,
                  'South Carolina': 0.0,
                  'Virginia Tidewater': 0.0},

danielolsen · 2020-05-04T23:49:29Z

@dmuldrew yes, the change table output is handling things properly when it needs to scale generators by type/zone, but not when it needs to scale by type & plant_id. In Scenario 403 we specifically turn off individual generators, to represent retirements between 2016 and 2020. We want the change table we generate to also do the same thing.

BainanXia · 2020-05-04T23:50:30Z

@dmuldrew Let me explain the main concern that @danielolsen brings up here. Let's call the grid we get from the loaded scenario ref_grid, and the grid from your local powersimdata base_grid. Once we have calculated the target capacities for the next scenario (the current Excel sheet output), we need to apply those numbers to the base_grid to come up with the corresponding scale factors for the change table.

However, there is an issue here: the base_grid is NOT aligned with the ref_grid, because the ref_grid has been scaled according to the change table in the loaded scenario, i.e. some generators have been turned off. This implies that one should turn off the corresponding generators in the base_grid before calculating the scale factors, so that when the change table is applied, the scaled total capacities match the target capacities without those 'off' generators.

Furthermore, the change table operations are independent of each other, which implies that the generators we found turned off in the ref_grid should also be turned off in the final change table for the next scenario.
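
To make the arithmetic concrete, here is a sketch for a single resource type in a single zone. It assumes that a plant scaled to 0 by plant_id contributes nothing regardless of the zone factor; the function name and the capacities are hypothetical.

```python
def aligned_zone_factor(base_plants_mw, off_plant_ids, target_mw):
    """Compute a zone factor that ignores plants turned off in the ref_grid.

    base_plants_mw: {plant_id: capacity} from the unscaled base_grid.
    off_plant_ids: plants that are off in the ref_grid.
    Returns (zone_factor, {plant_id: 0.0}) for the next change table.
    """
    off = set(off_plant_ids) & set(base_plants_mw)
    # Only the plants that stay on can carry the target capacity.
    remaining = sum(mw for pid, mw in base_plants_mw.items() if pid not in off)
    if remaining == 0:
        raise ValueError("no remaining capacity to scale toward the target")
    zone_factor = target_mw / remaining
    plant_factors = {pid: 0.0 for pid in sorted(off)}
    return zone_factor, plant_factors


# Plant 3 is off in the ref_grid; the other two must reach 300 MW together.
zone_factor, plant_factors = aligned_zone_factor({1: 100.0, 2: 50.0, 3: 50.0}, [3], 300.0)
```

Dividing by the full 200 MW instead would give a factor of 1.5 and only 225 MW once plant 3 is switched off, which is the mismatch described above.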

dmuldrew · 2020-05-05T18:17:36Z

Seems like in this case you can just pass in whatever reference grid you want to scale from as one of the optional inputs of the new functions.

danielolsen · 2020-05-05T18:25:34Z

The reference grid is associated with the 'base' scenario, so we do not need another input from the user side. Passing the base scenario and the list of targets is enough information for all calculations.

dmuldrew added this to the Egg Hunt milestone Apr 13, 2020

dmuldrew requested a review from BainanXia April 13, 2020 19:53

dmuldrew assigned dmuldrew and BainanXia Apr 13, 2020

dmuldrew requested a review from danielolsen April 13, 2020 19:53