Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: set up flexibility for multiple versions of MATReader #148

Merged
merged 2 commits into from
Apr 20, 2020

Conversation

danielolsen
Copy link
Contributor

Purpose

Set up flexibility to use one of many different MATReader sub-classes, depending on the engine that was used to run the scenario and produce the input.mat file.

What is the code doing

In mat_reader.py, moving the existing _build_network() code to a new subclass: REISEMATReader.

In profiles.py, changing the return type of get_data() and _read_data() when given a .mat extension. Instead of loading the input.mat file to instantiate and return a Grid, we just return the path to the .mat file (still trying to open it, and downloading it from the server if needed), because we don't know in here how we should interpret the input.mat file to produce a grid.

In analyze.py, adapting get_grid() now that using the InputData method returns a path to an input.mat file. We use our knowledge about the scenario to then instantiate a Grid, passing the engine used as an additional parameter.

In grid.py, adding a new parameter to Grid instantiation: engine, to specify which engine was used to create an input.mat file if we are not creating a Grid from usa_tamu. This engine parameter will determine which type of MATReader subclass to use to read the matfile. Currently only 'REISE' is a valid choice, which points to our new REISEMATReader class, and any other choice will throw an error.

Testing

All unit tests pass:

C:\Users\dolsen\repos\iv\powersimdata> python -m pytest
============================= test session starts =============================
platform win32 -- Python 3.7.4, pytest-5.3.1, py-1.8.0, pluggy-0.13.1
rootdir: C:\Users\dolsen\repos\iv\powersimdata
collected 100 items

powersimdata\design\tests\test_object_persistence.py ...                 [  3%]
powersimdata\design\tests\test_resource_target_manager.py .............. [ 17%]
.                                                                        [ 18%]
powersimdata\design\tests\test_scenario_info.py ........                 [ 26%]
powersimdata\design\tests\test_strategies.py ..............              [ 40%]
powersimdata\design\tests\test_target_manager_input.py ...               [ 43%]
powersimdata\design\tests\test_transmission.py ......................... [ 68%]
............                                                             [ 80%]
powersimdata\input\tests\test_grid.py ...............                    [ 95%]
powersimdata\utility\tests\test_transfer_data.py .....                   [100%]

============================== warnings summary ===============================
C:\Python37\lib\site-packages\win32\lib\pywintypes.py:2
  C:\Python37\lib\site-packages\win32\lib\pywintypes.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp, sys, os

-- Docs: https://docs.pytest.org/en/latest/warnings.html
======================= 100 passed, 1 warning in 11.62s =======================

An integration test passes, whether the file has already been downloaded or not:

>>> from powersimdata.scenario.scenario import Scenario
>>> scenario = Scenario('401')
SCENARIO: test | USABase_2016_test_4

--> State
analyze
100%|#####################################| 7.24k/7.24k [00:00<00:00, 58.5kb/s]
>>> grid = scenario.state.get_grid()
--> Loading grid
401_grid.mat not found in C:\Users\dolsen\ScenarioData\ on local machine
Transferring 401_grid.mat from server
100%|#####################################| 18.5M/18.5M [00:02<00:00, 8.75Mb/s]
Loading bus
Loading plant
Loading heat_rate_curve
Loading gencost_before
Loading gencost_after
Loading branch
Loading dcline
Loading sub
Loading bus2sub
>>> grid = scenario.state.get_grid()
--> Loading grid
Loading bus
Loading plant
Loading heat_rate_curve
Loading gencost_before
Loading gencost_after
Loading branch
Loading dcline
Loading sub
Loading bus2sub

Time to review

One hour. There aren't many lines of code that have changed, but the changes are in several different files/objects that have to do a graceful dance for get_grid() to always work.

@rouille
Copy link
Collaborator

rouille commented Apr 18, 2020

self._build_network() in the MATREADER object (l. 24) is undefined and I believe should not be there. All tests pass without when I comment the line.

@rouille
Copy link
Collaborator

rouille commented Apr 18, 2020

You use f-strings in grid.py and profiles.py. While I like it, it can be problematic when the package is installed on the server where Python 3.5.2 is installed. I had to generate a wind profile on the server this morning and I had to edit prereise/gather/winddata/rap/power_curves to refactor a couple of f-strings.

@danielolsen
Copy link
Contributor Author

@rouille the point of the _build_network method is that every sub-class of MATReader will define this function, and because the subclasses do not re-define the __init__ method, the _build_network call in the MATReader __init__ method will trigger each sub-class's specific network building logic.

Without that line, all tests pass, but the Grid object loaded is an empty DataFrame:

>>> from powersimdata.scenario.scenario import Scenario
>>> scenario = Scenario('401')
SCENARIO: test | USABase_2016_test_4

--> State
analyze
>>> grid = scenario.state.get_grid()
--> Loading grid
>>> grid
<powersimdata.input.grid.Grid object at 0x000001883D2D7CC8>
>>> grid.plant
Empty DataFrame
Columns: []
Index: []
>>> grid['plant']
Empty DataFrame
Columns: []
Index: []

We could make this logic more clear by defining this method for MATReader, and having that method either pass or throw an Exception, with a docstring explaining that this should be re-defined by each subclass that we want to implement.

@danielolsen
Copy link
Contributor Author

The docstring for Grid has been updated, and all f-strings removed.

@danielolsen
Copy link
Contributor Author

Types have been updated for the columns being read from the matfiles (see: #144). There were many places where we had columns full of zeros, but they should have been float zeros rather than int zeros.

@rouille
Copy link
Collaborator

rouille commented Apr 20, 2020

Good point. I would recommend to define the _build_network method in MATReader and pass. Throwing an exception is nice but I cannot imagine someone writing a file reader inheriting from MATReader and not implement the method.

Copy link
Collaborator

@rouille rouille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants