feat: add function to assign demand to buses proportional to population #235

Merged
merged 3 commits into from
Nov 2, 2021

Conversation

danielolsen
Contributor

@danielolsen danielolsen commented Oct 29, 2021

Pull Request doc

Purpose

Estimates population per substation, using data from simplemaps.com, and uses this information to add demand to buses. This is a re-implementation of the approach taken by the collaborators. Closes #229.

What the code is doing

We add a new module prereise.gather.griddata.hifld.data_process.demand with a single function assign_demand_to_buses. This function:

  1. Loads population by ZIP and by county from CSVs in the repo.
  2. Sorts substations by connected transmission capacity.
  3. Distributes each ZIP code's population evenly among the N highest-transmission-capacity non-generation substations within that ZIP. N is the number of candidate substations in the ZIP multiplied by the substation_load_share fraction in const.py, rounded, with a minimum of one. ZIP codes containing no substations are skipped. The selected substations are the 'load substations'.
  4. Determines which counties have no substations with ZIP-assigned demand ('load substations'), and for each of these, picks the substation in the county with greatest transmission capacity to add to the set of load substations.
  5. Determines how much of each county's population is not yet assigned, and distributes this to the load substations within the county. Note: if a county has no substations at all, its population is not reassigned elsewhere and is effectively ignored. Less than 5% of the population is ignored this way, so it should not have a large effect on the overall demand distribution.
  6. Translates total population (from ZIP-assignment and county assignment) to demand using an assumption of demand-per-person.
  7. Selects the lowest-voltage bus within each substation to assign demand to.
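
To make steps 2 and 3 concrete, here is a minimal sketch on toy data. It is not the code from demand.py: the column names (`capacity`, `has_generation`), the `SUBSTATION_LOAD_SHARE` value, and all numbers are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical value; the real fraction is substation_load_share in const.py
SUBSTATION_LOAD_SHARE = 0.5

# Toy substations (illustrative column names, not the HIFLD schema)
subs = pd.DataFrame(
    {
        "ZIP": ["98101", "98101", "98101", "98101", "98102", "98103", "98101"],
        "capacity": [400.0, 300.0, 200.0, 100.0, 150.0, 250.0, 50.0],
        "has_generation": [False, False, True, False, False, False, False],
    },
    index=pd.Index([1, 2, 3, 4, 5, 6, 7], name="sub_id"),
)
# ZIP 98104 has population but no substations, so nothing is assigned at ZIP level
zip_population = pd.Series(
    {"98101": 9000.0, "98102": 4000.0, "98103": 2000.0, "98104": 1000.0}
)

# Drop generation substations, then sort by connected transmission capacity
candidates = subs.loc[~subs["has_generation"]].sort_values(
    "capacity", ascending=False
)

# N per ZIP: a share of the candidate count, rounded, but never less than one
subs_per_zip = candidates["ZIP"].value_counts()
n_load = (subs_per_zip * SUBSTATION_LOAD_SHARE).round().clip(lower=1).astype(int)

# Rank candidates within each ZIP (0 = highest capacity) and keep the top N
rank = candidates.groupby("ZIP").cumcount()
load_subs = candidates.loc[rank < candidates["ZIP"].map(n_load)]

# Split each ZIP's population uniformly among its load substations
per_sub = (zip_population / n_load).dropna()
sub_population = load_subs["ZIP"].map(per_sub)

print(sub_population.sort_index())
```

In this toy case, ZIP 98101 has four candidates and keeps two of them (substations 1 and 2, 4500 people each), while the single-substation ZIPs fall back to the minimum of one.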

Besides demand.py, all other changes are data/documentation.

Testing

Tested manually.

Usage Example/Visuals

import pandas as pd
from prereise.gather.griddata.hifld.data_process.demand import assign_demand_to_buses
from prereise.gather.griddata.hifld.data_process.generators import build_plant
from prereise.gather.griddata.hifld.data_process.transmission import (
    build_transmission,
    calculate_branch_mileage,
    create_buses,
    create_transformers,
    estimate_branch_impedance,
    estimate_branch_rating,
)

# Invoking highest-level `data_process` functions
lines, substations = build_transmission(method="line2sub")
bus = create_buses(lines)
generators = build_plant(bus, substations)

# This code has been demonstrated via other PRs, but hasn't been baked into the top-level data process functions yet
lines["type"] = "Line"
lines["length"] = lines.apply(calculate_branch_mileage, axis=1)
transformers = create_transformers(bus)
transformers["type"] = "Transformer"
branch = pd.concat([lines, transformers])
branch["x"] = branch.apply(lambda x: estimate_branch_impedance(x, bus["baseKV"]), axis=1)
branch["rateA"] = branch.apply(lambda x: estimate_branch_rating(x, bus["baseKV"]), axis=1)

# New code
assign_demand_to_buses(substations, branch, generators, bus)

A "Pd" column is added in place to the bus dataframe:

>>> # Estimated 2.01 kW per person within code, 315.8 million people assigned to buses
>>> bus["Pd"].sum() / 2.01e-3
315800868.4704517
>>> bus["Pd"].isna().sum()
0
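
The population check above works because demand is a fixed multiple of assigned population. The sketch below illustrates steps 6 and 7 on toy data, assuming Pd is in MW (so 2.01 kW per person becomes 2.01e-3). The `sub_id` column name and all values are hypothetical, and this is not the implementation in demand.py.

```python
import pandas as pd

# 2.01 kW per person, expressed in MW (value taken from the check above)
DEMAND_PER_PERSON_MW = 2.01e-3

# Toy population already assigned to each load substation (hypothetical IDs)
sub_population = pd.Series({10: 5000.0, 20: 2500.0}, name="population")

# Toy bus table; "sub_id" is an illustrative column name
bus = pd.DataFrame(
    {
        "sub_id": [10, 10, 20, 20, 30],
        "baseKV": [230.0, 115.0, 345.0, 69.0, 138.0],
    },
    index=pd.Index([101, 102, 201, 202, 301], name="bus_id"),
)

# Step 7: find the lowest-voltage bus within each substation
lowest_bus = bus.groupby("sub_id")["baseKV"].idxmin()

# Step 6: translate population to demand
demand = sub_population * DEMAND_PER_PERSON_MW

# Buses that receive no demand get zero, so the column contains no NaN
bus["Pd"] = 0.0
target_buses = lowest_bus.loc[demand.index].to_numpy()
bus.loc[target_buses, "Pd"] = demand.to_numpy()

# Dividing total demand by the per-person figure recovers the population
print(bus["Pd"].sum() / DEMAND_PER_PERSON_MW)
```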

We need the generators to be able to preferentially assign demand to non-generator buses, and we need branch capacities to preferentially assign demand to higher-capacity substations when multiple substations are available within an area (ZIP or county). Generating the transformers and estimating branch impedances and capacities should probably be added to build_transmission as part of fulfillment of #226.

I demonstrate using the "line2sub" method since I have more faith in line coordinates than line substation names, based on some exploration with DC lines: #233 (comment). I suggest that we also switch to this as default as part of #226.

Time estimate

1 hour. The code itself isn't too long, but it's some fairly dense pandas.

@danielolsen danielolsen added the hifld Related to ingestion of the HIFLD data label Oct 29, 2021
@danielolsen danielolsen self-assigned this Oct 29, 2021
@danielolsen danielolsen force-pushed the daniel/hifld_bus_demand branch from 1ea6767 to 748fe4d Compare October 29, 2021 21:47
Comment on lines +27 to +25
filtered_branch = branch.query("SUB_1_ID != SUB_2_ID")
from_cap = filtered_branch.groupby("SUB_1_ID").sum()["rateA"]
to_cap = filtered_branch.groupby("SUB_2_ID").sum()["rateA"]
sub_cap = from_cap.combine(to_cap, lambda x, y: x + y, fill_value=0)
Collaborator

I was thinking about refactoring the calculate_substation_capacity function in powersimdata/design/transmission/substations.py, which currently takes a grid object as input, but I failed to come up with a compatible approach, since the column names are different as well. Let's keep these lines here then.
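
For readers following the thread, the capacity aggregation in the lines under review can be exercised on a toy branch table. This is a sketch only: the data are made up, and it selects the `rateA` column before summing (rather than after) to avoid summing any non-numeric columns.

```python
import pandas as pd

# Toy branch table; the last row connects a substation to itself
# (e.g. a transformer) and should not count toward connected capacity
branch = pd.DataFrame(
    {
        "SUB_1_ID": [1, 1, 2, 3],
        "SUB_2_ID": [2, 3, 3, 3],
        "rateA": [100.0, 200.0, 50.0, 999.0],
    }
)

# Keep only branches that link two different substations
filtered_branch = branch.query("SUB_1_ID != SUB_2_ID")

# Sum capacity seen from each end of the branch
from_cap = filtered_branch.groupby("SUB_1_ID")["rateA"].sum()
to_cap = filtered_branch.groupby("SUB_2_ID")["rateA"].sum()

# Add the two, treating a substation missing from one side as zero
sub_cap = from_cap.combine(to_cap, lambda x, y: x + y, fill_value=0)

print(sub_cap.sort_index())
```

Substation 1 appears only as a "from" end and substation 3 only as a "to" end, which is why `fill_value=0` is needed for the sum to cover every substation.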

subs_per_zip = filtered_subs.value_counts("ZIP")
zip_load_substations = subs_per_zip * const.substation_load_share
zip_load_substations = zip_load_substations.round().clip(lower=1)
zip_assigned_population = (zip_data["population"] / zip_load_substations).dropna()
Collaborator

When I was reading through the reference code from collaborators, I was wondering whether we should distribute load (population) in proportion to the substation capacities instead of uniformly. Do you think that will make a difference?

Contributor Author

It will definitely make some kind of difference, but I'm not sure in which general direction the difference will be. Thinking about a low-to-medium-density area, I would imagine that the population is fairly evenly spread out, so distributing uniformly probably makes sense, since a high-capacity substation may just be a collector rather than a response to a pocket of density. However, in an area that truly has different densities, higher-capacity substations may actually be a response to higher density. Without further information, I'm not sure what the best conclusion is.

If we run a simulation with this approach and find issues, then I think we may need to revisit this question. I think it's also potentially related to #234; we may want to patch the algorithm, or we may alternatively want to patch the data that feeds the algorithm.
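
To illustrate the two options discussed here, the following sketch compares a uniform split with a capacity-proportional split on toy data. The column names and numbers are made up for illustration, and neither variant is taken from the PR's code.

```python
import pandas as pd

# Toy load substations: two in ZIP "A" with unequal capacity, one in ZIP "B"
load_subs = pd.DataFrame(
    {"ZIP": ["A", "A", "B"], "capacity": [300.0, 100.0, 200.0]},
    index=pd.Index([1, 2, 3], name="sub_id"),
)
zip_population = pd.Series({"A": 8000.0, "B": 3000.0})

# Population of the ZIP each substation belongs to
pop = load_subs["ZIP"].map(zip_population)
by_zip = load_subs.groupby("ZIP")["capacity"]

# Uniform: every load substation in a ZIP gets the same share
uniform = pop / by_zip.transform("count")

# Proportional: share weighted by connected transmission capacity
proportional = pop * load_subs["capacity"] / by_zip.transform("sum")

print(pd.DataFrame({"uniform": uniform, "proportional": proportional}))
```

Both variants assign the same total population per ZIP; they differ only in how that total is split when a ZIP has several load substations of unequal capacity.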

Collaborator

@BainanXia BainanXia left a comment

Very nice. Thanks!

@danielolsen danielolsen force-pushed the daniel/hifld_bus_demand branch from ed5b489 to 53d6c00 Compare October 29, 2021 23:05
@danielolsen
Contributor Author

I chatted with @rouille today, and he raised a suggestion: should these CSVs live in the blob storage, rather than the repo?

@BainanXia
Collaborator

> I chatted with @rouille today, and he raised a suggestion: should these CSVs live in the blob storage, rather than the repo?

Good call. The two files are less than 10 MB in total, so I would say it won't hurt to leave them in the repo for now, which makes them easier to move around. But it would also be clean to put them in blob storage if we have a folder structure in mind (we still have a decent number of files in the repo now). Again, your choice.

@danielolsen danielolsen force-pushed the daniel/hifld_bus_demand branch 2 times, most recently from b4eba4e to 1e06ffa Compare November 2, 2021 19:03
@danielolsen
Contributor Author

This feature has been refactored to get the county and ZIP data files from blob storage, rather than from the repo itself. The new call signature (including changes to build_transmission via #237):

from prereise.gather.griddata.hifld.data_process.demand import assign_demand_to_buses
from prereise.gather.griddata.hifld.data_process.generators import build_plant
from prereise.gather.griddata.hifld.data_process.transmission import build_transmission
branch, bus, sub, dcline = build_transmission()
generators = build_plant(bus, sub)
assign_demand_to_buses(sub, branch, generators, bus)

@danielolsen danielolsen force-pushed the daniel/hifld_bus_demand branch from c1448bf to 27d0e67 Compare November 2, 2021 21:05
@danielolsen danielolsen merged commit 6853560 into hifld Nov 2, 2021
@danielolsen danielolsen deleted the daniel/hifld_bus_demand branch November 2, 2021 21:41
danielolsen added a commit that referenced this pull request Dec 8, 2021
feat: add function to assign demand to buses proportional to population
danielolsen added a commit that referenced this pull request Jan 5, 2022
feat: add function to assign demand to buses proportional to population
danielolsen added a commit that referenced this pull request Jan 8, 2022
feat: add function to assign demand to buses proportional to population
danielolsen added a commit that referenced this pull request Jan 31, 2022
feat: add function to assign demand to buses proportional to population
danielolsen added a commit that referenced this pull request Feb 25, 2022
feat: add function to assign demand to buses proportional to population
danielolsen added a commit that referenced this pull request Mar 15, 2022
feat: add function to assign demand to buses proportional to population
danielolsen added a commit that referenced this pull request Apr 1, 2022
feat: add function to assign demand to buses proportional to population
danielolsen added a commit that referenced this pull request Apr 5, 2022
feat: add function to assign demand to buses proportional to population
Labels
hifld Related to ingestion of the HIFLD data

2 participants