Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add Grid validation function #495

Merged
merged 5 commits into from
Jun 4, 2021
Merged

Conversation

danielolsen
Copy link
Contributor

Pull Request doc

Purpose

Since we are increasingly looking to constructing Grid objects from datasets besides the usa_tamu CSVs, we are increasingly in need of a check that the Grid objects that we are constructing are internally consistent. This PR adds that check (closes #481), and also makes a couple of changes to some of the Western offshore wind buses we added, so that their voltages are consistent with their connected onshore voltages (found by running the tests).

What the code is doing

There is one main user-facing check_grid function, which activates nine lower-level functions which check the bullet points from #481. Each of these is pretty self-explanatory, with the exception of the connected components check, for which we need to make a slight modification in the model immutables so that we know that the "USA" interconnect should really have three connected components. I don't see the interconnect_combinations parameter being used anywhere, so I think this will be okay, but maybe @rouille knows more.

Testing

Tests pass on all of our Grids.

Time estimate

15-30 minutes if you want to dig into all of the lower-level tests.

@danielolsen danielolsen self-assigned this Jun 4, 2021
@jenhagg
Copy link
Collaborator

jenhagg commented Jun 4, 2021

Should we add networkx to the setup.py as well?

@BainanXia
Copy link
Collaborator

Should we move the grid modification to a separate PR so that it is more trackable in the future, although these changes on bus voltages are very unlikely to cause any trouble given we created them.

@danielolsen
Copy link
Contributor Author

Should we add networkx to the setup.py as well?

Good catch, yes

Should we move the grid modification to a separate PR so that it is more trackable in the future, although these changes on bus voltages are very unlikely to cause any trouble given we created them.

If you think it is cleaner. It's already a distinct commit, and if we move it to a separate PR then we will need to merge that PR before this one, otherwise the tests in this one will fail.

"""
g = nx.from_pandas_edgelist(grid.branch, "from_bus_id", "to_bus_id")
num_connected_components = len([c for c in nx.connected_components(g)])
if len(grid.interconnect) == 1:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I need to refresh my brain here, the only situation we have len(grid.interconnect) > 1 is TexasWestern, right? If TexasWestern is handled in else block, what does Texas_Western entry do in interconnect_combinations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question. I don't know if/when/how interconnect_combinations was used prior to this PR. We can init a Grid(["Western", "Eastern"]) and then grid.interconnect == ["Western", "Eastern"].

@BainanXia
Copy link
Collaborator

Should we add networkx to the setup.py as well?

Good catch, yes

Should we move the grid modification to a separate PR so that it is more trackable in the future, although these changes on bus voltages are very unlikely to cause any trouble given we created them.

If you think it is cleaner. It's already a distinct commit, and if we move it to a separate PR then we will need to merge that PR before this one, otherwise the tests in this one will fail.

Either way. I don't have strong preference. What do you think @rouille

@BainanXia
Copy link
Collaborator

Given this feature could be useful in many situations especially when we are adopting a new model, do you think it is worthwhile to make all the check run regardless of whether the previous one fails or not, so that the user will be aware of all the problems in one shot?

@danielolsen
Copy link
Contributor Author

Given this feature could be useful in many situations especially when we are adopting a new model, do you think it is worthwhile to make all the check run regardless of whether the previous one fails or not, so that the user will be aware of all the problems in one shot?

In this case, would we need to refactor check_grid so that it collects the exception from each test, then raises one mega-exception at the end?

@BainanXia
Copy link
Collaborator

Given this feature could be useful in many situations especially when we are adopting a new model, do you think it is worthwhile to make all the check run regardless of whether the previous one fails or not, so that the user will be aware of all the problems in one shot?

In this case, would we need to refactor check_grid so that it collects the exception from each test, then raises one mega-exception at the end?

That's what I could think of. Maybe there are better solutions @jon-hagg @rouille ?

@danielolsen danielolsen force-pushed the daniel/grid_check branch 2 times, most recently from c2251df to cbd493a Compare June 4, 2021 20:27
@jenhagg
Copy link
Collaborator

jenhagg commented Jun 4, 2021

Given this feature could be useful in many situations especially when we are adopting a new model, do you think it is worthwhile to make all the check run regardless of whether the previous one fails or not, so that the user will be aware of all the problems in one shot?

In this case, would we need to refactor check_grid so that it collects the exception from each test, then raises one mega-exception at the end?

That's what I could think of. Maybe there are better solutions @jon-hagg @rouille ?

What I'd do is have each function return a list of error messages and collect those, then raise an exception from the top level _check_grid if there are any errors. Or maybe just print them - no need to raise an error unless it's intended to be used to interrupt control flow

@danielolsen
Copy link
Contributor Author

Given this feature could be useful in many situations especially when we are adopting a new model, do you think it is worthwhile to make all the check run regardless of whether the previous one fails or not, so that the user will be aware of all the problems in one shot?

In this case, would we need to refactor check_grid so that it collects the exception from each test, then raises one mega-exception at the end?

That's what I could think of. Maybe there are better solutions @jon-hagg @rouille ?

What I'd do is have each function return a list of error messages and collect those, then raise an exception from the top level _check_grid if there are any errors. Or maybe just print them - no need to raise an error unless it's intended to be used to interrupt control flow

Good call, we don't need to raise exceptions in the lower-level functions, they can return either a string or None and then we can combine any returned strings as necessary for generating the final exception. I think the high-level function should interrupt control flow here if there is a problem: grid integrity is important.

@danielolsen
Copy link
Contributor Author

Mega-Exception logic is added. We get:

ValueError: Problem(s) found with grid: islanded buses detected: {2090023}. indices for bus and bus2sub don't match.

or

ValueError: Problem(s) found with grid: buses present in transmission network but missing from bus table: {2090023}. indices for bus and bus2sub don't match. branch(es) connected across multiple interconnections: Int64Index([104191], dtype='int64', name='branch_id'). line(s) connected across multiple voltages: Int64Index([104191], dtype='int64', name='branch_id').

etc when we muck with the usa_tamu data, otherwise all tests pass.

@danielolsen danielolsen force-pushed the daniel/grid_check branch 4 times, most recently from 9a608a8 to 2b41df1 Compare June 4, 2021 22:39
Copy link
Collaborator

@BainanXia BainanXia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice. This could be very useful. Thanks for the patience and being responsive.

@danielolsen danielolsen force-pushed the daniel/grid_check branch 2 times, most recently from 7768262 to 0b9ad99 Compare June 4, 2021 23:02
@danielolsen
Copy link
Contributor Author

One thing that was not done here, that probably should be: checking for null values in all of the data frames.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add validation function for Grid objects
5 participants