Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use relative paths and mirror server file structure locally #340

Merged
merged 19 commits into from
Dec 9, 2020

Conversation

jenhagg
Copy link
Collaborator

@jenhagg jenhagg commented Nov 20, 2020

Purpose

Remove (most) dependence on server root path by using relative paths to download/upload files, and passing in the source and destination root at instantiation time, so it's reused throughout the object's lifespan. This moves the responsibility of parsing paths, creating folders, etc away from calling code, and the assumption of having two fixed "root" directories simplifies some logic.

What the code is doing

Chronologically, this involved first changing the semantics of copy_from to implement copying a node from a remote tree to a local tree while preserving the structure. This required a lot of path joins and such to be updated. Along the way I removed the data_access constructor parameter from a few classes, introduced a small PathConfig object that simplifies joins where still needed, and created the CommandBuilder class to reuse the logic for building command line invocations. The commit history may be useful to see this trajectory.

Testing

There are some new unit tests for building commands and file transfers, the latter using a semi-mock connection which operates locally on temp directories.

I also ran a texas simulation using this branch and successfully extracted/downloaded files to verify the end to end process.

Where to look

The SSHDataAccess class and server_setup module were the starting points, most everything else flows from that, but does include some opportunistic refactoring. The InputData and OutputData classes are a good illustration of simplified logic, and as usual the unit tests are a (hopefully) good demo.

Time estimate

30 min

"""Constructor"""
self._ssh = None
self._retry_after = 5
self.root = server_setup.DATA_ROOT_DIR if root is None else root
self.dest_root = server_setup.LOCAL_DIR
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be an attribute at the DataAccess level?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing it will, just waiting to generalize until the local implementation starts happening, at which point the duplicated code will be obvious and can be moved to the base class.

@danielolsen
Copy link
Contributor

Do you plan to mirror the server structure in copy_to as well? That would require sub-folders for the local folder: tmp/scenario_1, tmp/scenario_2, etc.

@jenhagg
Copy link
Collaborator Author

jenhagg commented Nov 21, 2020

Do you plan to mirror the server structure in copy_to as well? That would require sub-folders for the local folder: tmp/scenario_1, tmp/scenario_2, etc.

All the usages I see will delete the local file afterward, so was planning to leave that as is. However it does need to be updated to work with relative paths.

Comment on lines -622 to +637
if not os.path.exists(server_setup.LOCAL_DIR):
os.makedirs(server_setup.LOCAL_DIR)
os.makedirs(server_setup.LOCAL_DIR, exist_ok=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware of this call signature, it's very nice!

Comment on lines +23 to +24
def execute_dir(self):
return self._join(EXECUTE_DIR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea of these being functions instead of attributes is that self.root can be changed and that change will propagate?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but to be honest this was mostly an ad hoc way to simplify the code outside SSHDataAccess that was using the root dir for path joins. I feel like overall things could be further simplified - so this may be temporary.

@jenhagg jenhagg force-pushed the jon/tree branch 4 times, most recently from beeb42f to e3d20c9 Compare December 1, 2020 06:40
@jenhagg jenhagg marked this pull request as ready for review December 1, 2020 19:24
@jenhagg jenhagg self-assigned this Dec 1, 2020
@jenhagg jenhagg added this to the Turkey Time milestone Dec 1, 2020
@@ -0,0 +1,58 @@
import pandas as pd
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is just moving tests around

@@ -0,0 +1,41 @@
import pandas as pd
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is just moving tests around

@@ -3,3 +3,4 @@ markers =
integration: marks tests that require external dependencies (deselect with '-m "not integration"')
db: marks tests that connect to a local database
ssh: marks tests that connect to the server over ssh
wip: marks tests undergoing development
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found this useful while writing new tests so figured I'd leave it for future use?

@jenhagg jenhagg requested review from ahurli and ToddG December 2, 2020 20:02
Comment on lines 29 to 32
if self.data_loc == "disk":
self.data_access = SSHDataAccess(server_setup.BACKUP_DATA_ROOT_DIR)
else:
self.data_access = SSHDataAccess(server_setup.DATA_ROOT_DIR)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, we were passing any DataAccess object to the init, now we're hardcoding that we will create a new SSHDataAccess object. Maybe I'm missing something, but this seems like a regression in our abstraction process that will need to get undone as soon as we implement the local access. Could we accomplish the same functionality and keep the abstraction by still accepting a DataAccess object in the init and modifying self.data_access.root based on data_loc?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah this is technically a regression, though I was thinking the fix would be using a factory instead of having the parameter - for example

class Context:
    @staticmethod
    def get_data_access(root):
        if check_environment_variable():
            return LocalDataAccess(root)
        return SSHDataAccess(root)

and in input_data.py:

self.data_access = Context.get_data_access(root)

Open to ideas here.. one might be passing the context as a parameter to allow easy substitution of test implementations. Your suggestion also makes sense.. lemme think about that a bit more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a factory work, with the potential downside that we're creating a new SSH connection for every time we want to access the server, rather than re-using the existing one.

Copy link
Collaborator

@rouille rouille Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a potential issue. I was opening new SSH connection all over the place in the first version of the scenario framework and the server was not happy. That said, I probably did a bad job at closing the connection everytime a new one was created.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually this is a good point.. if we share the same DataAccess instance in multiple places, we should definitely not modify the root attribute. We could still achieve connection reuse though - either cache equivalent DataAccess instances within the factory, or use a static connection (self._ssh property) instead of one per instance.

Copy link
Contributor

@danielolsen danielolsen Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To clarify: we've historically had one active SSH connection per Scenario, shared among all the Scenario methods.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense (and I agree it's slightly preferable to recreating connections). Was just realizing that with these changes, not all SSHDataAccess instances are interchangeable anymore, so sharing those won't make sense (but sharing the connection still makes sense, just need to write the code).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are we on this issue? This seems like a fairly fundamental design point for this PR. Would it be easier (possible?) to separate the path/folder part of this PR from the refactor of how we pass around DataAccess objects?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you referring to the connection reuse? I was thinking if we close connections correctly (seems like we do this), it won't have much impact, and I could handle that in a follow up PR if needed. For the part about passing in the instance, I also plan on implementing the factory in the next PR when I add the local implementation (that's really the only way I see to do it, since we can't reuse the instance anymore). Let me know if this sounds reasonable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After a conversation with Daniel I've added the initial factory layout, hopefully that gives an idea of how it will progress.

@danielolsen
Copy link
Contributor

I deleted my ScenarioData folder and created a new Scenario, prepared it, ran it, and queried the results, and everything looks according to plan. The local files exactly mirror the server structure, and once raw profiles are cached locally they are not re-downloaded.

>>> from powersimdata.scenario.scenario import Scenario
>>> scenario = Scenario('')
>>> scenario.state.set_builder(["Texas"])
Reading bus.csv
Reading plant.csv
Reading gencost.csv
Reading branch.csv
Reading dcline.csv
Reading sub.csv
Reading bus2sub.csv
Reading zone.csv
Transferring ScenarioList.csv from server
100%|########################################| 235k/235k [00:00<00:00, 668kb/s]
--> Summary
# Existing study
base | test | Anchor | Julia
# Available profiles
demand: ercot
hydro: v1 | v2
solar: v2 | v4.1
wind: v1 | v2 | v5.1 | v5
>>> scenario.state.builder.set_base_profile("demand", "ercot")
>>> scenario.state.builder.set_base_profile("hydro", "v2")
>>> scenario.state.builder.set_base_profile("solar", "v4.1")
>>> scenario.state.builder.set_base_profile("wind", "v5.1")
>>> scenario.state.builder.set_name("test", "daniel_tests_jons_tree")
>>> scenario.state.builder.set_time("2016-01-01 00:00:00", "2016-01-03 23:00:00", "24H")
>>> scenario.state.builder.change_table.scale_plant_capacity("solar", zone_name={"West": 2})
>>> scenario.state.builder.change_table.scale_plant_capacity("wind", zone_name={"West": 2})
>>> scenario.state.builder.change_table.scale_plant_capacity("hydro", zone_name={"West": 2})
>>> scenario.state.builder.change_table.scale_demand(zone_name={"West": 2})
>>> scenario.state.create_scenario()
CREATING SCENARIO: test | daniel_tests_jons_tree

--> Generating scenario id
--> Adding entry in ScenarioList.csv on server
--> Writing change table on local machine
Writing C:\Users\DanielOlsen\ScenarioData\1722_ct.pkl
Transferring C:\Users\DanielOlsen\ScenarioData\1722_ct.pkl to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_ct.pkl on local machine
--> Adding entry in execute table on server
SCENARIO SUCCESSFULLY CREATED WITH ID #1722
State switching: create --> execute
>>> scenario.state.prepare_simulation_input()
Transferring ExecuteList.csv from server
100%|######################################| 20.8k/20.8k [00:00<00:00, 128kb/s]
---------------------------
PREPARING SIMULATION INPUTS
---------------------------
--> Creating temporary folder on server for simulation inputs
machine
Transferring texas_demand_ercot.csv from server
100%|#######################################| 959k/959k [00:00<00:00, 1.50Mb/s]
Multiply demand in West (#303) by 2.00
Writing scaled demand profile in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring C:\Users\DanielOlsen\ScenarioData\1722_demand.csv to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_demand.csv on local machine
--> Loading hydro
texas_hydro_v2.csv not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring texas_hydro_v2.csv from server
100%|#####################################| 2.62M/2.62M [00:01<00:00, 2.22Mb/s]
Writing scaled hydro profile in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring C:\Users\DanielOlsen\ScenarioData\1722_hydro.csv to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_hydro.csv on local machine
--> Loading solar
texas_solar_v4.1.csv not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring texas_solar_v4.1.csv from server
100%|#####################################| 3.61M/3.61M [00:01<00:00, 2.55Mb/s]
Writing scaled solar profile in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring C:\Users\DanielOlsen\ScenarioData\1722_solar.csv to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_solar.csv on local machine
--> Loading wind
texas_wind_v5.1.csv not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring texas_wind_v5.1.csv from server
100%|#####################################| 19.0M/19.0M [00:07<00:00, 2.68Mb/s]
Writing scaled wind profile in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring C:\Users\DanielOlsen\ScenarioData\1722_wind.csv to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_wind.csv on local machine
--> Preparing MPC file
Scaling grid
Building MPC file
Transferring C:\Users\DanielOlsen\ScenarioData\1722_case.mat to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_case.mat on local machine
--> Updating status in execute table on server
>>> scenario.state.launch_simulation(threads=8)
--> Launching simulation on server
PID: 25388
<subprocess.Popen object at 0x00000147D59C4E50>
>>> quit()
PS C:\Users\DanielOlsen\repos\bes\PowerSimData>
PS C:\Users\DanielOlsen\repos\bes\PowerSimData>
PS C:\Users\DanielOlsen\repos\bes\PowerSimData> python
Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from powersimdata.scenario.scenario import Scenario
>>> scenario = Scenario(1722)
Transferring ScenarioList.csv from server
100%|########################################| 235k/235k [00:00<00:00, 701kb/s]
Transferring ExecuteList.csv from server
100%|######################################| 20.8k/20.8k [00:00<00:00, 169kb/s]
SCENARIO: test | daniel_tests_jons_tree

--> State
analyze
--> Loading grid
1722_grid.mat not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring 1722_grid.mat from server
100%|########################################| 191k/191k [00:00<00:00, 894kb/s]
Loading bus
Loading plant
Loading heat_rate_curve
Loading gencost_before
Loading gencost_after
Loading branch
Loading sub
Loading bus2sub
--> Loading ct
1722_ct.pkl not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring 1722_ct.pkl from server
100%|#########################################| 206/206 [00:00<00:00, 1.45kb/s]
>>> scenario.state.get_lmp()
--> Loading LMP
C:\Users\DanielOlsen\ScenarioData\data/output\1722_LMP.pkl not found on local machine
Transferring 1722_LMP.pkl from server
100%|#######################################| 610k/610k [00:00<00:00, 2.20Mb/s]
                       3001001    3001002  ...    3008159    3008160
UTC                                        ...
2016-01-01 00:00:00  24.806667  24.629782  ...  22.240349  22.243629
2016-01-01 01:00:00  24.739674  24.497225  ...  22.276453  22.279398
2016-01-01 02:00:00  24.489302  24.256945  ...  22.253412  22.256344
2016-01-01 03:00:00  24.783640  24.563858  ...  22.210369  22.213171
2016-01-01 04:00:00  25.037502  24.828499  ...  22.126270  22.128239
...                        ...        ...  ...        ...        ...
2016-01-03 19:00:00  23.253880   9.772205  ...  22.168762  22.172525
2016-01-03 20:00:00  23.256464   9.773940  ...  22.156517  22.160339
2016-01-03 21:00:00  23.889296  11.042955  ...  22.136557  22.140299
2016-01-03 22:00:00  22.933260  12.123874  ...  22.121342  22.124964
2016-01-03 23:00:00  22.650866  11.272489  ...  21.966211  21.968102

[72 rows x 2000 columns]
>>> scenario.state.get_pg()
--> Loading PG
C:\Users\DanielOlsen\ScenarioData\data/output\1722_PG.pkl not found on local machine
Transferring 1722_PG.pkl from server
100%|########################################| 183k/183k [00:00<00:00, 774kb/s]
                          12875      12876  ...      14048       14049
UTC                                         ...
2016-01-01 00:00:00   93.895042  17.221796  ...   1.084221   40.035355
2016-01-01 01:00:00  113.781754  18.496275  ...   2.580470   68.812187
2016-01-01 02:00:00  117.523766  19.679838  ...   5.696044   72.946083
2016-01-01 03:00:00  117.236267  16.149055  ...  14.590879   75.538712
2016-01-01 04:00:00  117.423836  14.734890  ...  73.820221   66.276161
...                         ...        ...  ...        ...         ...
2016-01-03 19:00:00    5.199273   0.194389  ...  23.013470   18.864305
2016-01-03 20:00:00    3.665761   0.460093  ...  19.755426   13.048021
2016-01-03 21:00:00   17.572971   1.049806  ...  16.927418   14.015634
2016-01-03 22:00:00   16.981102   2.830077  ...  14.350667    6.151488
2016-01-03 23:00:00    0.000000   4.079185  ...  39.642094  117.152267

[72 rows x 598 columns]
>>> scenario.state.get_pf()
--> Loading PF
C:\Users\DanielOlsen\ScenarioData\data/output\1722_PF.pkl not found on local machine
Transferring 1722_PF.pkl from server
100%|#######################################| 976k/976k [00:00<00:00, 1.39Mb/s]
                         100915      100916  ...     104119     104120
UTC                                          ...
2016-01-01 00:00:00  103.781952  103.781952  ... -49.854210 -49.854210
2016-01-01 01:00:00   99.933151   99.933151  ... -54.308800 -54.308800
2016-01-01 02:00:00   93.480057   93.480057  ... -50.943272 -50.943272
2016-01-01 03:00:00  105.065758  105.065758  ... -48.895115 -48.895115
2016-01-01 04:00:00  107.981354  107.981354  ... -49.638092 -49.638092
...                         ...         ...  ...        ...        ...
2016-01-03 19:00:00  109.480530  109.480530  ... -40.344730 -40.344730
2016-01-03 20:00:00  106.598625  106.598625  ... -39.177395 -39.177395
2016-01-03 21:00:00  104.269455  104.269455  ... -38.680443 -38.680443
2016-01-03 22:00:00   99.945633   99.945633  ... -37.980453 -37.980453
2016-01-03 23:00:00   81.679192   81.679192  ... -38.790810 -38.790810

[72 rows x 3206 columns]
>>> scenario.state.get_demand()
Reading bus.csv
Reading plant.csv
Reading gencost.csv
Reading branch.csv
Reading dcline.csv
Reading sub.csv
Reading bus2sub.csv
Reading zone.csv
--> Loading demand
Multiply demand in West (#303) by 2.00
                             301         302  ...           307          308
UTC                                           ...
2016-01-01 00:00:00  2173.808470  868.636393  ...  10703.575700  1341.750663
2016-01-01 01:00:00  2206.959175  890.950782  ...  10743.720990  1449.531975
2016-01-01 02:00:00  2200.797530  873.920708  ...  10436.555930  1355.505824
2016-01-01 03:00:00  2179.300089  860.342391  ...  10117.139850  1314.445489
2016-01-01 04:00:00  2154.405686  840.608568  ...   9836.114934  1330.866082
...                          ...         ...  ...           ...          ...
2016-12-31 19:00:00  2047.459060  690.662724  ...   9651.340702  1142.804232
2016-12-31 20:00:00  2035.036805  675.302580  ...   9639.117255  1125.266863
2016-12-31 21:00:00  2026.887979  668.133067  ...   9637.841629  1124.054410
2016-12-31 22:00:00  2023.906935  674.449005  ...   9683.466597  1118.782875
2016-12-31 23:00:00  2032.573815  705.708943  ...  10066.856620  1176.701108

[8784 rows x 8 columns]
>>> scenario.state.get_hydro()
--> Loading hydro
                         12977      12978  ...     13463      13464
UTC                                        ...
2016-01-01 00:00:00  14.526031   6.038912  ...  3.607026   7.834264
2016-01-01 01:00:00  12.650153   5.259052  ...  3.141218   6.822554
2016-01-01 02:00:00   6.301027   2.619528  ...  1.564637   3.398307
2016-01-01 03:00:00   2.821834   1.173122  ...  0.700703   1.521888
2016-01-01 04:00:00   2.292740   0.953162  ...  0.569321   1.236534
...                        ...        ...  ...       ...        ...
2016-12-31 19:00:00  10.123868   4.208799  ...  2.513904   5.460064
2016-12-31 20:00:00  10.159403   4.223572  ...  2.522728   5.479228
2016-12-31 21:00:00  10.290649   4.278135  ...  2.555319   5.550013
2016-12-31 22:00:00  15.623852   6.495309  ...  3.879631   8.426347
2016-12-31 23:00:00  24.787511  10.304920  ...  6.155101  13.368545

[8784 rows x 22 columns]
>>> scenario.state.get_solar()
--> Loading solar
                        12878     12896  ...      14005      14006
UTC                                      ...
2016-01-01 00:00:00  0.000000  0.000000  ...   0.000000   0.000000
2016-01-01 01:00:00  0.000000  0.000000  ...   0.000000   0.000000
2016-01-01 02:00:00  0.000000  0.000000  ...   0.000000   0.000000
2016-01-01 03:00:00  0.000000  0.000000  ...   0.000000   0.000000
2016-01-01 04:00:00  0.000000  0.000000  ...   0.000000   0.000000
...                       ...       ...  ...        ...        ...
2016-12-31 19:00:00  4.038380  1.319879  ...  42.380628  15.867405
2016-12-31 20:00:00  3.682028  1.272921  ...  40.524097  13.642190
2016-12-31 21:00:00  3.247971  1.177243  ...  37.351799  10.046580
2016-12-31 22:00:00  2.301117  0.985453  ...  31.177795   4.248332
2016-12-31 23:00:00  0.932313  0.466155  ...  15.401827   0.205514

[8784 rows x 36 columns]

@rouille
Copy link
Collaborator

rouille commented Dec 9, 2020

Should we rename the BACKUP_DATA_ROOT_DIR, DATA_ROOT_DIR and LOCAL_DIR variables in powersimdata.utility.server_setup to SERVER_BACKUP_DATA_ROOT_DIR, SERVER_DATA_ROOT_DIR and ***CLIENT_DATA_ROOT_DIR, respectively?

@danielolsen
Copy link
Contributor

Tests do not pass on Windows:

PS C:\Users\DanielOlsen\repos\bes\PowerSimData> python -m pytest -m "not integration"
============================= test session starts =============================
platform win32 -- Python 3.8.3, pytest-6.1.0, py-1.9.0, pluggy-0.13.1
rootdir: C:\Users\DanielOlsen\repos\bes\PowerSimData, configfile: pytest.ini
collected 240 items / 41 deselected / 199 selected

powersimdata\design\generation\tests\test_clean_capacity_scaling.py .... [  2%]
.                                                                        [  2%]
powersimdata\design\generation\tests\test_cost_curves.py .....           [  5%]
powersimdata\design\tests\test_scenario_info.py ........                 [  9%]
powersimdata\design\transmission\tests\test_mwmiles.py ......            [ 12%]
powersimdata\design\transmission\tests\test_statelines.py .....          [ 14%]
powersimdata\design\transmission\tests\test_upgrade.py ................. [ 23%]
............................................                             [ 45%]
powersimdata\input\tests\test_change_table.py .......................... [ 58%]
........                                                                 [ 62%]
powersimdata\input\tests\test_grid.py ............................       [ 76%]
powersimdata\input\tests\test_transform_grid.py ..................       [ 85%]
powersimdata\tests\test_mocks.py ..........                              [ 90%]
powersimdata\utility\tests\test_distance.py ...                          [ 91%]
powersimdata\utility\tests\test_helpers.py .........                     [ 96%]
powersimdata\utility\tests\test_transfer_data.py .FFFFF.                 [100%]

================================== FAILURES ===================================
_______________________________ test_copy_from ________________________________

mock_data_access = <powersimdata.utility.transfer_data.SSHDataAccess object at 0x00000200E5368550>
temp_fs = (WindowsPath('C:/Users/DanielOlsen/AppData/Local/Temp/pytest-of-DanielOlsen/pytest-0/test_copy_from0/src'), WindowsPath('C:/Users/DanielOlsen/AppData/Local/Temp/pytest-of-DanielOlsen/pytest-0/test_copy_from0/dest'))
make_temp = <function make_temp.<locals>._make_temp at 0x00000200E69490D0>

    def test_copy_from(mock_data_access, temp_fs, make_temp):
        fname = make_temp()
>       mock_data_access.copy_from(fname)

powersimdata\utility\tests\test_transfer_data.py:82:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
powersimdata\utility\transfer_data.py:165: in copy_from
    self.check_file_exists(from_path, should_exist=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <powersimdata.utility.transfer_data.SSHDataAccess object at 0x00000200E5368550>
filepath = 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_from0\\src/tmpr2klwc_d'
should_exist = True

    def check_file_exists(self, filepath, should_exist=True):
        """
        Check that file exists (or not) at the given path

        :param str filepath: the full path to the file
        :param bool should_exist: whether the file is expected to exist
        """
        _, _, stderr = self.execute_command(CommandBuilder.list(filepath))
        compare = operator.ne if should_exist else operator.eq
        if compare(len(stderr.readlines()), 0):
            msg = "not found" if should_exist else "already exists"
>           raise OSError(f"{filepath} {msg} on server")
E           OSError: C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_from0\src/tmpr2klwc_d not found on server

powersimdata\utility\transfer_data.py:68: OSError
---------------------------- Captured stdout call -----------------------------
ls C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_from0\src/tmpr2klwc_d
__________________________ test_copy_from_multi_path __________________________

mock_data_access = <powersimdata.utility.transfer_data.SSHDataAccess object at 0x00000200E3A90EB0>
temp_fs = (WindowsPath('C:/Users/DanielOlsen/AppData/Local/Temp/pytest-of-DanielOlsen/pytest-0/test_copy_from_multi_path0/src'), WindowsPath('C:/Users/DanielOlsen/AppData/Local/Temp/pytest-of-DanielOlsen/pytest-0/test_copy_from_multi_path0/dest'))
make_temp = <function make_temp.<locals>._make_temp at 0x00000200EDD6D790>

    def test_copy_from_multi_path(mock_data_access, temp_fs, make_temp):
        rel_path = Path("foo", "bar")
        src_path = temp_fs[0] / rel_path
        src_path.mkdir(parents=True)
        fname = make_temp(rel_path)
>       mock_data_access.copy_from(fname, rel_path)

powersimdata\utility\tests\test_transfer_data.py:91:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
powersimdata\utility\transfer_data.py:165: in copy_from
    self.check_file_exists(from_path, should_exist=True)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <powersimdata.utility.transfer_data.SSHDataAccess object at 0x00000200E3A90EB0>
filepath = 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_from_multi_path0\\src/foo\\bar/tmpv86rczr4'
should_exist = True

    def check_file_exists(self, filepath, should_exist=True):
        """
        Check that file exists (or not) at the given path

        :param str filepath: the full path to the file
        :param bool should_exist: whether the file is expected to exist
        """
        _, _, stderr = self.execute_command(CommandBuilder.list(filepath))
        compare = operator.ne if should_exist else operator.eq
        if compare(len(stderr.readlines()), 0):
            msg = "not found" if should_exist else "already exists"
>           raise OSError(f"{filepath} {msg} on server")
E           OSError: C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_from_multi_path0\src/foo\bar/tmpv86rczr4 not found on server

powersimdata\utility\transfer_data.py:68: OSError
---------------------------- Captured stdout call -----------------------------
ls C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_from_multi_path0\src/foo\bar/tmpv86rczr4
________________________________ test_copy_to _________________________________

mock_data_access = <powersimdata.utility.transfer_data.SSHDataAccess object at 0x00000200E5368100>
temp_fs = (WindowsPath('C:/Users/DanielOlsen/AppData/Local/Temp/pytest-of-DanielOlsen/pytest-0/test_copy_to0/src'), WindowsPath('C:/Users/DanielOlsen/AppData/Local/Temp/pytest-of-DanielOlsen/pytest-0/test_copy_to0/dest'))
make_temp = <function make_temp.<locals>._make_temp at 0x00000200C4E07CA0>

    def test_copy_to(mock_data_access, temp_fs, make_temp):
        fname = make_temp(remote=False)
>       mock_data_access.copy_to(fname, temp_fs[0])

powersimdata\utility\tests\test_transfer_data.py:97:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
powersimdata\utility\transfer_data.py:195: in copy_to
    sftp.put(from_path, to_path)
powersimdata\tests\mock_ssh.py:15: in put
    shutil.copy(from_path, to_path)
C:\Python38\lib\shutil.py:415: in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

src = 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to0\\dest\\tmp0dqwctmv'
dst = 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to0\\src/C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to0\\src/tmp0dqwctmv'

    def copyfile(src, dst, *, follow_symlinks=True):
        """Copy data from src to dst in the most efficient way possible.

        If follow_symlinks is not set and src is a symbolic link, a new
        symlink will be created instead of copying the file it points to.

        """
        sys.audit("shutil.copyfile", src, dst)

        if _samefile(src, dst):
            raise SameFileError("{!r} and {!r} are the same file".format(src, dst))

        file_size = 0
        for i, fn in enumerate([src, dst]):
            try:
                st = _stat(fn)
            except OSError:
                # File most likely does not exist
                pass
            else:
                # XXX What about other special files? (sockets, devices...)
                if stat.S_ISFIFO(st.st_mode):
                    fn = fn.path if isinstance(fn, os.DirEntry) else fn
                    raise SpecialFileError("`%s` is a named pipe" % fn)
                if _WINDOWS and i == 0:
                    file_size = st.st_size

        if not follow_symlinks and _islink(src):
            os.symlink(os.readlink(src), dst)
        else:
>           with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
E           PermissionError: [Errno 13] Permission denied: 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to0\\dest\\tmp0dqwctmv'

C:\Python38\lib\shutil.py:261: PermissionError
---------------------------- Captured stdout call -----------------------------
ls C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_to0\src/C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_to0\src/tmp0dqwctmv
Transferring C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_to0\dest\tmp0dqwctmv to server
___________________________ test_copy_to_multi_path ___________________________

mock_data_access = <powersimdata.utility.transfer_data.SSHDataAccess object at 0x00000200EDDAE0A0>
temp_fs = (WindowsPath('C:/Users/DanielOlsen/AppData/Local/Temp/pytest-of-DanielOlsen/pytest-0/test_copy_to_multi_path0/src'), WindowsPath('C:/Users/DanielOlsen/AppData/Local/Temp/pytest-of-DanielOlsen/pytest-0/test_copy_to_multi_path0/dest'))
make_temp = <function make_temp.<locals>._make_temp at 0x00000200E52BACA0>

    def test_copy_to_multi_path(mock_data_access, temp_fs, make_temp):
        rel_path = Path("foo", "bar")
        remote_path = mock_data_access.root / rel_path
        remote_path.mkdir(parents=True)
        fname = make_temp(remote=False)
>       mock_data_access.copy_to(fname, rel_path)

powersimdata\utility\tests\test_transfer_data.py:106:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
powersimdata\utility\transfer_data.py:195: in copy_to
    sftp.put(from_path, to_path)
powersimdata\tests\mock_ssh.py:15: in put
    shutil.copy(from_path, to_path)
C:\Python38\lib\shutil.py:415: in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

src = 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to_multi_path0\\dest\\tmpiagozo3q'
dst = 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to_multi_path0\\src/foo\\bar/tmpiagozo3q'

    def copyfile(src, dst, *, follow_symlinks=True):
        """Copy data from src to dst in the most efficient way possible.

        If follow_symlinks is not set and src is a symbolic link, a new
        symlink will be created instead of copying the file it points to.

        """
        sys.audit("shutil.copyfile", src, dst)

        if _samefile(src, dst):
            raise SameFileError("{!r} and {!r} are the same file".format(src, dst))

        file_size = 0
        for i, fn in enumerate([src, dst]):
            try:
                st = _stat(fn)
            except OSError:
                # File most likely does not exist
                pass
            else:
                # XXX What about other special files? (sockets, devices...)
                if stat.S_ISFIFO(st.st_mode):
                    fn = fn.path if isinstance(fn, os.DirEntry) else fn
                    raise SpecialFileError("`%s` is a named pipe" % fn)
                if _WINDOWS and i == 0:
                    file_size = st.st_size

        if not follow_symlinks and _islink(src):
            os.symlink(os.readlink(src), dst)
        else:
>           with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
E           PermissionError: [Errno 13] Permission denied: 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to_multi_path0\\dest\\tmpiagozo3q'

C:\Python38\lib\shutil.py:261: PermissionError
---------------------------- Captured stdout call -----------------------------
ls C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_to_multi_path0\src/foo\bar/tmpiagozo3q
Transferring C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_to_multi_path0\dest\tmpiagozo3q to server
_____________________________ test_copy_to_rename _____________________________

mock_data_access = <powersimdata.utility.transfer_data.SSHDataAccess object at 0x00000200F025A0D0>
make_temp = <function make_temp.<locals>._make_temp at 0x00000200EDD5E4C0>

    def test_copy_to_rename(mock_data_access, make_temp):
        root = mock_data_access.root
        fname = make_temp(remote=False)
        new_fname = "new_fname"
>       mock_data_access.copy_to(fname, root, change_name_to=new_fname)

powersimdata\utility\tests\test_transfer_data.py:114:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
powersimdata\utility\transfer_data.py:195: in copy_to
    sftp.put(from_path, to_path)
powersimdata\tests\mock_ssh.py:15: in put
    shutil.copy(from_path, to_path)
C:\Python38\lib\shutil.py:415: in copy
    copyfile(src, dst, follow_symlinks=follow_symlinks)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

src = 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to_rename0\\dest\\tmpv8_l6qqq'
dst = 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to_rename0\\src/C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to_rename0\\src/new_fname'

    def copyfile(src, dst, *, follow_symlinks=True):
        """Copy data from src to dst in the most efficient way possible.

        If follow_symlinks is not set and src is a symbolic link, a new
        symlink will be created instead of copying the file it points to.

        """
        sys.audit("shutil.copyfile", src, dst)

        if _samefile(src, dst):
            raise SameFileError("{!r} and {!r} are the same file".format(src, dst))

        file_size = 0
        for i, fn in enumerate([src, dst]):
            try:
                st = _stat(fn)
            except OSError:
                # File most likely does not exist
                pass
            else:
                # XXX What about other special files? (sockets, devices...)
                if stat.S_ISFIFO(st.st_mode):
                    fn = fn.path if isinstance(fn, os.DirEntry) else fn
                    raise SpecialFileError("`%s` is a named pipe" % fn)
                if _WINDOWS and i == 0:
                    file_size = st.st_size

        if not follow_symlinks and _islink(src):
            os.symlink(os.readlink(src), dst)
        else:
>           with open(src, 'rb') as fsrc, open(dst, 'wb') as fdst:
E           PermissionError: [Errno 13] Permission denied: 'C:\\Users\\DanielOlsen\\AppData\\Local\\Temp\\pytest-of-DanielOlsen\\pytest-0\\test_copy_to_rename0\\dest\\tmpv8_l6qqq'

C:\Python38\lib\shutil.py:261: PermissionError
---------------------------- Captured stdout call -----------------------------
ls C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_to_rename0\src/C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_to_rename0\src/new_fname
Transferring C:\Users\DanielOlsen\AppData\Local\Temp\pytest-of-DanielOlsen\pytest-0\test_copy_to_rename0\dest\tmpv8_l6qqq to server
============================== warnings summary ===============================
..\..\..\..\..\Python38\lib\site-packages\win32\lib\pywintypes.py:2
  C:\Python38\lib\site-packages\win32\lib\pywintypes.py:2: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp, sys, os

-- Docs: https://docs.pytest.org/en/stable/warnings.html
=========================== short test summary info ===========================
FAILED powersimdata/utility/tests/test_transfer_data.py::test_copy_from - OSE...
FAILED powersimdata/utility/tests/test_transfer_data.py::test_copy_from_multi_path
FAILED powersimdata/utility/tests/test_transfer_data.py::test_copy_to - Permi...
FAILED powersimdata/utility/tests/test_transfer_data.py::test_copy_to_multi_path
FAILED powersimdata/utility/tests/test_transfer_data.py::test_copy_to_rename
========== 5 failed, 194 passed, 41 deselected, 1 warning in 12.69s ===========

@jenhagg
Copy link
Collaborator Author

jenhagg commented Dec 9, 2020

Should we rename the BACKUP_DATA_ROOT_DIR, DATA_ROOT_DIR and LOCAL_DIR variables in powersimdata.utility.server_setup to SERVER_BACKUP_DATA_ROOT_DIR, SERVER_DATA_ROOT_DIR and ***CLIENT_DATA_ROOT_DIR, respectively?

I don't feel a strong need to do so currently, but not opposed to it.

@jenhagg
Copy link
Collaborator Author

jenhagg commented Dec 9, 2020

Tests do not pass on Windows:

Thanks, Windows. Looks like it's due to calling posixpath.join within copy_to which previously would only pertain to paths on the server. I'll see if I can mock that out, or if there is an alternative approach.

Copy link
Contributor

@danielolsen danielolsen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main code looks good and works to create, prepare, and run a scenario, and to load its inputs & outputs from within the scenario framework. Once the testing on windows is worked out I think it is ready to merge.

@jenhagg jenhagg merged commit 86827c1 into develop Dec 9, 2020
@jenhagg jenhagg deleted the jon/tree branch December 9, 2020 21:03
@ahurli ahurli mentioned this pull request Mar 11, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants