Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Grid equality bugfix (storage) and print improvement #362

Merged
merged 3 commits into from
Dec 21, 2020

Conversation

danielolsen
Copy link
Contributor

Purpose

What the code is doing

  • The ignored_subkeys are defined as a part of an AbstractGrid, and given default values in a TAMU grid. The values are used when producing storage table entries, but are not used after that. When we load a MATReader grid, we load the storage table entries, but we don't load these default values; and we shouldn't load from usa_tamu_model.py, because a MATReader is not necessarily a TAMU grid. So, when we compare Grids, we want to ignore these, since the real data to compare is in the tables, which are compared.
  • We remove the print, and use except Exception: rather than except: on the advice of https://www.flake8rules.com/rules/E722.html (which is the complaint that pops up on a bare except:).

Testing

To be added.

Time estimate

2 minutes.

@@ -142,7 +152,6 @@ def _univ_eq(ref, test):
_univ_eq(self.id2zone, other.id2zone)
_univ_eq(self.bus2sub, other.bus2sub)

except Exception as e:
print(f"ERROR: could not compare grid. {str(e)}")
except Exception:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know!

Copy link
Collaborator

@BainanXia BainanXia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

@rouille
Copy link
Collaborator

rouille commented Dec 19, 2020

I addition to returning False, don't we want to have a print of the error so we know where is the discrepancy between the two grids that we compare?

@danielolsen
Copy link
Contributor Author

I addition to returning False, don't we want to have a print of the error so we know where is the discrepancy between the two grids that we compare?

We could do something like that by wrapping each _univ_eq call in a try/except that flags each non-matching table, and then reports out at the end. As is, the current error printing is not informative, it's things like cannot compare Series with different index.

@BainanXia
Copy link
Collaborator

I addition to returning False, don't we want to have a print of the error so we know where is the discrepancy between the two grids that we compare?

We could do something like that by wrapping each _univ_eq call in a try/except that flags each non-matching table, and then reports out at the end. As is, the current error printing is not informative, it's things like cannot compare Series with different index.

In this way, we can get to know which entry can't be compared instead of returning a general error message saying cannot compare Series with different labels. I think we could either print out the first entry that gives this error then stop and leave the user to fix them one by one sequentially if an equality is expected (this is easier to implement) OR we parse through the entire grid and summarize all different entries and print out at the end (more informative).

@rouille
Copy link
Collaborator

rouille commented Dec 19, 2020

I addition to returning False, don't we want to have a print of the error so we know where is the discrepancy between the two grids that we compare?

We could do something like that by wrapping each _univ_eq call in a try/except that flags each non-matching table, and then reports out at the end. As is, the current error printing is not informative, it's things like cannot compare Series with different index.

In this way, we can get to know which entry can't be compared instead of returning a general error message saying cannot compare Series with different labels. I think we could either print out the first entry that gives this error then stop and leave the user to fix them one by one sequentially if an equality is expected (this is easier to implement) OR we parse through the entire grid and summarize all different entries and print out at the end (more informative).

Yeah, print the first discrepancy encountered and return false would be informative.

@danielolsen
Copy link
Contributor Author

Check the latest commit. Without the fixes to storage and dcline checks in #363, a comparison failure looks like:

>>> from powersimdata.scenario.scenario import Scenario
>>> old_grid = Scenario(1712).state.get_grid()
Transferring ScenarioList.csv from server
100%|#######################################| 236k/236k [00:00<00:00, 1.22Mb/s]
Transferring ExecuteList.csv from server
100%|######################################| 21.1k/21.1k [00:00<00:00, 270kb/s]
SCENARIO: test | new_bus

--> State
analyze
--> Loading grid
Loading bus
Loading plant
Loading heat_rate_curve
Loading gencost_before
Loading gencost_after
Loading branch
Loading sub
Loading bus2sub
--> Loading ct
>>> new_scenario = Scenario('')
>>> new_scenario.state.set_builder(["Texas"])
Reading bus.csv
Reading plant.csv
Reading gencost.csv
Reading branch.csv
Reading dcline.csv
Reading sub.csv
Reading bus2sub.csv
Reading zone.csv
Transferring ScenarioList.csv from server
100%|#######################################| 236k/236k [00:00<00:00, 1.56Mb/s]
--> Summary
# Existing study
test | base | Anchor | Julia
# Available profiles
demand: ercot
hydro: v1 | v2
solar: v2 | v4.1
wind: v1 | v2 | v5.1 | v5
>>> new_scenario.state.builder.change_table.add_bus([{"lat": 30, "lon": -95, "zone_id": 308}])
>>> new_scenario.state.builder.change_table.add_plant([{"type": "wind", "bus_id": 3008161, "Pmax": 400}])
>>> new_scenario.state.builder.change_table.add_branch([{"from_bus_id": 3008160, "to_bus_id": 3008161, "capacity": 300}])
>>> new_scenario.state.builder.change_table.add_storage_capacity({3008161: 100})
>>> old_grid == new_scenario.state.get_grid()
non-matching entries: dcline, storage
False

@danielolsen
Copy link
Contributor Author

Plus one more commit reduces the number of try/except blocks by extending _univ_eq to add to the nonmatching_entries set directly.

assert set(ref.columns) == set(test.columns)
for col in ref.columns:
assert (ref[col] == test[col]).all()
except Exception:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like Exception will always be an AssertionError, no?

Copy link
Contributor Author

@danielolsen danielolsen Dec 21, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we try to compare two dataframes with transposed columns and non-identical indices, then we will get ValueError: Can only compare identically-labeled Series objects when we try to check ref[col] == test[col], before we can call .all() and check the assert. So I think we will always get either AssertionError or ValueError, in case we want to tighten up this last except.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

self_storage_num = len(self.storage["gencost"])
other_storage_num = len(other.storage["gencost"])
if self_storage_num == 0:
_univ_eq(other_storage_num, 0, "storage")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need the if/else for storage. If there is no storage we still have a dict with keys that can be compared including the ones that will have empty data frames as value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With just the changes in this PR, this doesn't work because the storage_template() in abstract_grid.py doesn't provide the same columns, but with the changes from #363 then you are right that we don't need the if/else.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If/else has been removed, as long as we merge this and #363 back-to-back there should be no problems.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah. I reviewed #363 first and had in mind the addition you did to the storage_template function. Sorry about that.

@danielolsen danielolsen force-pushed the daniel/grid_equality_fixes branch 2 times, most recently from b0db982 to f16440a Compare December 21, 2020 20:06
assert set(ref.columns) == set(test.columns)
for col in ref.columns:
assert (ref[col] == test[col]).all()
except (AssertionError, ValueError):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice

Copy link
Collaborator

@rouille rouille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great. I like very much this new grid comparison. Thanks.

@danielolsen danielolsen force-pushed the daniel/grid_equality_fixes branch from f16440a to 64b2ad1 Compare December 21, 2020 20:25
@danielolsen danielolsen merged commit e17454e into develop Dec 21, 2020
@danielolsen danielolsen deleted the daniel/grid_equality_fixes branch December 21, 2020 20:32
@ahurli ahurli mentioned this pull request Mar 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants