Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove reference to deleted constant and slight pandas refactor #365

Merged
merged 7 commits into from
Jan 6, 2021

Conversation

jenhagg
Copy link
Collaborator

@jenhagg jenhagg commented Dec 23, 2020

Purpose

I was playing around with querying the execute list and noticed we still had a reference to server_setup.EXECUTE_LIST which no longer exists. Changed the exception to reflect that but also tweaked the query since it seems a little more readable.

Edit: based on feedback, I expanded the scope to set the index within CsvStore and took the opportunity to move query logic to the relevant data access classes. I think this simplifies the scenario class, making it less concerned with parsing the csv files.

What the code is doing

Nothing new

Testing

Loaded an existing scenario works, and if I delete the entry from the execute list I get the exception raised here.

Time estimate

3 min

@jenhagg jenhagg requested a review from rouille December 23, 2020 19:50
@jenhagg jenhagg self-assigned this Dec 23, 2020
@@ -98,14 +97,12 @@ def _set_status(self):
:raises Exception: if scenario not found in execute list on server.
"""
execute_table = self._execute_list_manager.get_execute_table()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that the CsvStore class is dedicated to reading the ScenarioList/ExecuteList. So I guess the _parse_csv method could return a pandas.DataFrame with id as index by doing l.45:

table = pd.read_csv(file_object, index_col=0) or table = pd.read_csv(file_object, index_col="id") 

That way we don't have to set_index later

@@ -43,6 +43,7 @@ def _parse_csv(self, file_object):
:return: (*pandas.DataFrame*) -- the specified file as a data frame.
"""
table = pd.read_csv(file_object)
table.set_index("id", inplace=True, drop=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason for not dropping the column?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was trying to get it to work without, but the only thing we need it for is returning scenario info as an ordered dict. If we drop the column we'll get an error doing self.info["id"] within a scenario instance, since DataFrame.to_dict only includes columns when the kind is "records"

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see... We could drop it here and use reset_index() l. 142 of scenario_list.py:

return scenario.reset_index().to_dict("records", into=OrderedDict)[0]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do like that a bit better, however it results in the id column being an int type, so at some point downstream it fails during string concatenation. I'll give this some more thought, probably next year.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about return scenario.reset_index().astype({"id": "str"}).to_dict("records", into=OrderedDict)[0]?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And/or can the string concatenation include an extra str() wrap around the id, so that it has less strict assumptions about its input?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added Ben's suggestion in the latest commit. Thinking the strict assumption about types is actually a good thing - makes things simpler for any code using the scenario object. Having it guaranteed to be an int would be fine too, just minimizing changes here. This is kinda reminiscent of Postel's law, which is basically be "flexible on inputs but not outputs" (my interpretation).

Copy link
Collaborator

@rouille rouille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

:return: (*str*) -- scenario status
"""
try:
table = self.get_execute_table()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we expecting a KeyError from this call, or would this be better outside the try?

Copy link
Contributor

@danielolsen danielolsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@jenhagg jenhagg merged commit 1296401 into develop Jan 6, 2021
@jenhagg jenhagg deleted the jon/pd_index branch January 6, 2021 21:16
@ahurli ahurli mentioned this pull request Mar 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants