Use relative paths and mirror server file structure locally #340
Conversation
"""Constructor""" | ||
self._ssh = None | ||
self._retry_after = 5 | ||
self.root = server_setup.DATA_ROOT_DIR if root is None else root | ||
self.dest_root = server_setup.LOCAL_DIR |
Should this be an attribute at the DataAccess level?
I'm guessing it will, just waiting to generalize until the local implementation starts happening, at which point the duplicated code will be obvious and can be moved to the base class.
Do you plan to mirror the server structure in
All the usages I see will delete the local file afterward, so I was planning to leave that as is. However, it does need to be updated to work with relative paths.
- if not os.path.exists(server_setup.LOCAL_DIR):
-     os.makedirs(server_setup.LOCAL_DIR)
+ os.makedirs(server_setup.LOCAL_DIR, exist_ok=True)
I wasn't aware of this call signature, it's very nice!
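For reference, a standalone illustration of that call signature (not project code): with exist_ok=True, makedirs is idempotent and replaces the racy check-then-create pattern.

```python
import os
import tempfile

# exist_ok=True creates intermediate directories and is a no-op when the
# target already exists, so no os.path.exists() guard is needed.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "a", "b")
    os.makedirs(target, exist_ok=True)  # creates tmp/a and tmp/a/b
    os.makedirs(target, exist_ok=True)  # second call does not raise
    print(os.path.isdir(target))  # True
```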
    def execute_dir(self):
        return self._join(EXECUTE_DIR)
The idea of these being functions instead of attributes is that self.root can be changed and that change will propagate?
Right, but to be honest this was mostly an ad hoc way to simplify the code outside SSHDataAccess that was using the root dir for path joins. I feel like overall things could be further simplified, so this may be temporary.
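The trade-off under discussion can be shown with a toy sketch (names are illustrative, not the real class): computing joins in methods rather than storing them as attributes means a later change to self.root propagates to every derived path.

```python
import posixpath

class PathConfig:
    """Toy sketch: derived paths are recomputed on each call."""

    EXECUTE_DIR = "tmp"  # illustrative subdirectory name

    def __init__(self, root):
        self.root = root

    def execute_dir(self):
        # recomputed every call, so reassigning self.root takes effect
        return posixpath.join(self.root, self.EXECUTE_DIR)

paths = PathConfig("/mnt/bes/pcm")
print(paths.execute_dir())  # /mnt/bes/pcm/tmp
paths.root = "/mnt/backup"
print(paths.execute_dir())  # /mnt/backup/tmp
```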
Force-pushed from beeb42f to e3d20c9
@@ -0,0 +1,58 @@
import pandas as pd
This file is just moving tests around
@@ -0,0 +1,41 @@
import pandas as pd
This file is just moving tests around
@@ -3,3 +3,4 @@ markers =
    integration: marks tests that require external dependencies (deselect with '-m "not integration"')
    db: marks tests that connect to a local database
    ssh: marks tests that connect to the server over ssh
    wip: marks tests undergoing development
I found this useful while writing new tests so figured I'd leave it for future use?
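For reference, a sketch of how such a marker is typically used once registered in pytest.ini (test name is illustrative):

```python
# mark a test as work-in-progress
@pytest.mark.wip
def test_new_feature():
    ...

# run only wip tests:    pytest -m wip
# exclude them:          pytest -m "not wip"
```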
powersimdata/input/input_data.py (outdated)
if self.data_loc == "disk": | ||
self.data_access = SSHDataAccess(server_setup.BACKUP_DATA_ROOT_DIR) | ||
else: | ||
self.data_access = SSHDataAccess(server_setup.DATA_ROOT_DIR) |
Previously, we were passing any DataAccess object to the init; now we're hardcoding that we will create a new SSHDataAccess object. Maybe I'm missing something, but this seems like a regression in our abstraction process that will need to get undone as soon as we implement the local access. Could we accomplish the same functionality and keep the abstraction by still accepting a DataAccess object in the init and modifying self.data_access.root based on data_loc?
Yeah, this is technically a regression, though I was thinking the fix would be using a factory instead of having the parameter - for example:

    class Context:
        @staticmethod
        def get_data_access(root):
            if check_environment_variable():
                return LocalDataAccess(root)
            return SSHDataAccess(root)

and in input_data.py:

    self.data_access = Context.get_data_access(root)
Open to ideas here... one might be passing the context as a parameter to allow easy substitution of test implementations. Your suggestion also makes sense; let me think about that a bit more.
I think a factory would work, with the potential downside that we're creating a new SSH connection every time we want to access the server, rather than re-using the existing one.
This is a potential issue. I was opening new SSH connections all over the place in the first version of the scenario framework and the server was not happy. That said, I probably did a bad job of closing each connection after a new one was created.
Actually this is a good point... if we share the same DataAccess instance in multiple places, we should definitely not modify the root attribute. We could still achieve connection reuse, though - either cache equivalent DataAccess instances within the factory, or use a static connection (the self._ssh property) instead of one per instance.
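The caching-in-the-factory option could look roughly like this (a hypothetical sketch: the stand-in classes and the DATA_ACCESS_LOCAL environment variable are assumptions, not existing project code):

```python
import os

class SSHDataAccess:  # stand-in so the sketch runs on its own
    def __init__(self, root):
        self.root = root

class LocalDataAccess:  # stand-in so the sketch runs on its own
    def __init__(self, root):
        self.root = root

class Context:
    """Sketch of instance caching inside the factory: two requests for
    the same root get the same object, so the underlying SSH connection
    is reused rather than reopened."""

    _cache = {}

    @classmethod
    def get_data_access(cls, root):
        if root not in cls._cache:
            # DATA_ACCESS_LOCAL is an assumed environment variable
            if os.environ.get("DATA_ACCESS_LOCAL"):
                cls._cache[root] = LocalDataAccess(root)
            else:
                cls._cache[root] = SSHDataAccess(root)
        return cls._cache[root]

a = Context.get_data_access("/mnt/bes/pcm")
b = Context.get_data_access("/mnt/bes/pcm")
print(a is b)  # True: the same instance (and connection) is reused
```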
To clarify: we've historically had one active SSH connection per Scenario, shared among all the Scenario methods.
Yeah, that makes sense (and I agree it's slightly preferable to recreating connections). I was just realizing that with these changes, not all SSHDataAccess instances are interchangeable anymore, so sharing those won't make sense (but sharing the connection still makes sense, just need to write the code).
Where are we on this issue? This seems like a fairly fundamental design point for this PR. Would it be easier (possible?) to separate the path/folder part of this PR from the refactor of how we pass around DataAccess objects?
Are you referring to the connection reuse? I was thinking that if we close connections correctly (it seems like we do), it won't have much impact, and I could handle that in a follow-up PR if needed. As for passing in the instance, I also plan to implement the factory in the next PR when I add the local implementation (that's really the only way I see to do it, since we can't reuse the instance anymore). Let me know if this sounds reasonable.
After a conversation with Daniel I've added the initial factory layout, hopefully that gives an idea of how it will progress.
I deleted my ScenarioData folder and created a new Scenario, prepared it, ran it, and queried the results, and everything looks according to plan. The local files exactly mirror the server structure, and once raw profiles are cached locally they are not re-downloaded.
>>> from powersimdata.scenario.scenario import Scenario
>>> scenario = Scenario('')
>>> scenario.state.set_builder(["Texas"])
Reading bus.csv
Reading plant.csv
Reading gencost.csv
Reading branch.csv
Reading dcline.csv
Reading sub.csv
Reading bus2sub.csv
Reading zone.csv
Transferring ScenarioList.csv from server
100%|########################################| 235k/235k [00:00<00:00, 668kb/s]
--> Summary
# Existing study
base | test | Anchor | Julia
# Available profiles
demand: ercot
hydro: v1 | v2
solar: v2 | v4.1
wind: v1 | v2 | v5.1 | v5
>>> scenario.state.builder.set_base_profile("demand", "ercot")
>>> scenario.state.builder.set_base_profile("hydro", "v2")
>>> scenario.state.builder.set_base_profile("solar", "v4.1")
>>> scenario.state.builder.set_base_profile("wind", "v5.1")
>>> scenario.state.builder.set_name("test", "daniel_tests_jons_tree")
>>> scenario.state.builder.set_time("2016-01-01 00:00:00", "2016-01-03 23:00:00", "24H")
>>> scenario.state.builder.change_table.scale_plant_capacity("solar", zone_name={"West": 2})
>>> scenario.state.builder.change_table.scale_plant_capacity("wind", zone_name={"West": 2})
>>> scenario.state.builder.change_table.scale_plant_capacity("hydro", zone_name={"West": 2})
>>> scenario.state.builder.change_table.scale_demand(zone_name={"West": 2})
>>> scenario.state.create_scenario()
CREATING SCENARIO: test | daniel_tests_jons_tree
--> Generating scenario id
--> Adding entry in ScenarioList.csv on server
--> Writing change table on local machine
Writing C:\Users\DanielOlsen\ScenarioData\1722_ct.pkl
Transferring C:\Users\DanielOlsen\ScenarioData\1722_ct.pkl to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_ct.pkl on local machine
--> Adding entry in execute table on server
SCENARIO SUCCESSFULLY CREATED WITH ID #1722
State switching: create --> execute
>>> scenario.state.prepare_simulation_input()
Transferring ExecuteList.csv from server
100%|######################################| 20.8k/20.8k [00:00<00:00, 128kb/s]
---------------------------
PREPARING SIMULATION INPUTS
---------------------------
--> Creating temporary folder on server for simulation inputs
--> Loading demand
texas_demand_ercot.csv not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring texas_demand_ercot.csv from server
100%|#######################################| 959k/959k [00:00<00:00, 1.50Mb/s]
Multiply demand in West (#303) by 2.00
Writing scaled demand profile in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring C:\Users\DanielOlsen\ScenarioData\1722_demand.csv to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_demand.csv on local machine
--> Loading hydro
texas_hydro_v2.csv not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring texas_hydro_v2.csv from server
100%|#####################################| 2.62M/2.62M [00:01<00:00, 2.22Mb/s]
Writing scaled hydro profile in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring C:\Users\DanielOlsen\ScenarioData\1722_hydro.csv to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_hydro.csv on local machine
--> Loading solar
texas_solar_v4.1.csv not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring texas_solar_v4.1.csv from server
100%|#####################################| 3.61M/3.61M [00:01<00:00, 2.55Mb/s]
Writing scaled solar profile in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring C:\Users\DanielOlsen\ScenarioData\1722_solar.csv to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_solar.csv on local machine
--> Loading wind
texas_wind_v5.1.csv not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring texas_wind_v5.1.csv from server
100%|#####################################| 19.0M/19.0M [00:07<00:00, 2.68Mb/s]
Writing scaled wind profile in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring C:\Users\DanielOlsen\ScenarioData\1722_wind.csv to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_wind.csv on local machine
--> Preparing MPC file
Scaling grid
Building MPC file
Transferring C:\Users\DanielOlsen\ScenarioData\1722_case.mat to server
--> Deleting C:\Users\DanielOlsen\ScenarioData\1722_case.mat on local machine
--> Updating status in execute table on server
>>> scenario.state.launch_simulation(threads=8)
--> Launching simulation on server
PID: 25388
<subprocess.Popen object at 0x00000147D59C4E50>
>>> quit()
PS C:\Users\DanielOlsen\repos\bes\PowerSimData>
PS C:\Users\DanielOlsen\repos\bes\PowerSimData>
PS C:\Users\DanielOlsen\repos\bes\PowerSimData> python
Python 3.8.3 (tags/v3.8.3:6f8c832, May 13 2020, 22:37:02) [MSC v.1924 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from powersimdata.scenario.scenario import Scenario
>>> scenario = Scenario(1722)
Transferring ScenarioList.csv from server
100%|########################################| 235k/235k [00:00<00:00, 701kb/s]
Transferring ExecuteList.csv from server
100%|######################################| 20.8k/20.8k [00:00<00:00, 169kb/s]
SCENARIO: test | daniel_tests_jons_tree
--> State
analyze
--> Loading grid
1722_grid.mat not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring 1722_grid.mat from server
100%|########################################| 191k/191k [00:00<00:00, 894kb/s]
Loading bus
Loading plant
Loading heat_rate_curve
Loading gencost_before
Loading gencost_after
Loading branch
Loading sub
Loading bus2sub
--> Loading ct
1722_ct.pkl not found in C:\Users\DanielOlsen\ScenarioData\ on local machine
Transferring 1722_ct.pkl from server
100%|#########################################| 206/206 [00:00<00:00, 1.45kb/s]
>>> scenario.state.get_lmp()
--> Loading LMP
C:\Users\DanielOlsen\ScenarioData\data/output\1722_LMP.pkl not found on local machine
Transferring 1722_LMP.pkl from server
100%|#######################################| 610k/610k [00:00<00:00, 2.20Mb/s]
3001001 3001002 ... 3008159 3008160
UTC ...
2016-01-01 00:00:00 24.806667 24.629782 ... 22.240349 22.243629
2016-01-01 01:00:00 24.739674 24.497225 ... 22.276453 22.279398
2016-01-01 02:00:00 24.489302 24.256945 ... 22.253412 22.256344
2016-01-01 03:00:00 24.783640 24.563858 ... 22.210369 22.213171
2016-01-01 04:00:00 25.037502 24.828499 ... 22.126270 22.128239
... ... ... ... ... ...
2016-01-03 19:00:00 23.253880 9.772205 ... 22.168762 22.172525
2016-01-03 20:00:00 23.256464 9.773940 ... 22.156517 22.160339
2016-01-03 21:00:00 23.889296 11.042955 ... 22.136557 22.140299
2016-01-03 22:00:00 22.933260 12.123874 ... 22.121342 22.124964
2016-01-03 23:00:00 22.650866 11.272489 ... 21.966211 21.968102
[72 rows x 2000 columns]
>>> scenario.state.get_pg()
--> Loading PG
C:\Users\DanielOlsen\ScenarioData\data/output\1722_PG.pkl not found on local machine
Transferring 1722_PG.pkl from server
100%|########################################| 183k/183k [00:00<00:00, 774kb/s]
12875 12876 ... 14048 14049
UTC ...
2016-01-01 00:00:00 93.895042 17.221796 ... 1.084221 40.035355
2016-01-01 01:00:00 113.781754 18.496275 ... 2.580470 68.812187
2016-01-01 02:00:00 117.523766 19.679838 ... 5.696044 72.946083
2016-01-01 03:00:00 117.236267 16.149055 ... 14.590879 75.538712
2016-01-01 04:00:00 117.423836 14.734890 ... 73.820221 66.276161
... ... ... ... ... ...
2016-01-03 19:00:00 5.199273 0.194389 ... 23.013470 18.864305
2016-01-03 20:00:00 3.665761 0.460093 ... 19.755426 13.048021
2016-01-03 21:00:00 17.572971 1.049806 ... 16.927418 14.015634
2016-01-03 22:00:00 16.981102 2.830077 ... 14.350667 6.151488
2016-01-03 23:00:00 0.000000 4.079185 ... 39.642094 117.152267
[72 rows x 598 columns]
>>> scenario.state.get_pf()
--> Loading PF
C:\Users\DanielOlsen\ScenarioData\data/output\1722_PF.pkl not found on local machine
Transferring 1722_PF.pkl from server
100%|#######################################| 976k/976k [00:00<00:00, 1.39Mb/s]
100915 100916 ... 104119 104120
UTC ...
2016-01-01 00:00:00 103.781952 103.781952 ... -49.854210 -49.854210
2016-01-01 01:00:00 99.933151 99.933151 ... -54.308800 -54.308800
2016-01-01 02:00:00 93.480057 93.480057 ... -50.943272 -50.943272
2016-01-01 03:00:00 105.065758 105.065758 ... -48.895115 -48.895115
2016-01-01 04:00:00 107.981354 107.981354 ... -49.638092 -49.638092
... ... ... ... ... ...
2016-01-03 19:00:00 109.480530 109.480530 ... -40.344730 -40.344730
2016-01-03 20:00:00 106.598625 106.598625 ... -39.177395 -39.177395
2016-01-03 21:00:00 104.269455 104.269455 ... -38.680443 -38.680443
2016-01-03 22:00:00 99.945633 99.945633 ... -37.980453 -37.980453
2016-01-03 23:00:00 81.679192 81.679192 ... -38.790810 -38.790810
[72 rows x 3206 columns]
>>> scenario.state.get_demand()
Reading bus.csv
Reading plant.csv
Reading gencost.csv
Reading branch.csv
Reading dcline.csv
Reading sub.csv
Reading bus2sub.csv
Reading zone.csv
--> Loading demand
Multiply demand in West (#303) by 2.00
301 302 ... 307 308
UTC ...
2016-01-01 00:00:00 2173.808470 868.636393 ... 10703.575700 1341.750663
2016-01-01 01:00:00 2206.959175 890.950782 ... 10743.720990 1449.531975
2016-01-01 02:00:00 2200.797530 873.920708 ... 10436.555930 1355.505824
2016-01-01 03:00:00 2179.300089 860.342391 ... 10117.139850 1314.445489
2016-01-01 04:00:00 2154.405686 840.608568 ... 9836.114934 1330.866082
... ... ... ... ... ...
2016-12-31 19:00:00 2047.459060 690.662724 ... 9651.340702 1142.804232
2016-12-31 20:00:00 2035.036805 675.302580 ... 9639.117255 1125.266863
2016-12-31 21:00:00 2026.887979 668.133067 ... 9637.841629 1124.054410
2016-12-31 22:00:00 2023.906935 674.449005 ... 9683.466597 1118.782875
2016-12-31 23:00:00 2032.573815 705.708943 ... 10066.856620 1176.701108
[8784 rows x 8 columns]
>>> scenario.state.get_hydro()
--> Loading hydro
12977 12978 ... 13463 13464
UTC ...
2016-01-01 00:00:00 14.526031 6.038912 ... 3.607026 7.834264
2016-01-01 01:00:00 12.650153 5.259052 ... 3.141218 6.822554
2016-01-01 02:00:00 6.301027 2.619528 ... 1.564637 3.398307
2016-01-01 03:00:00 2.821834 1.173122 ... 0.700703 1.521888
2016-01-01 04:00:00 2.292740 0.953162 ... 0.569321 1.236534
... ... ... ... ... ...
2016-12-31 19:00:00 10.123868 4.208799 ... 2.513904 5.460064
2016-12-31 20:00:00 10.159403 4.223572 ... 2.522728 5.479228
2016-12-31 21:00:00 10.290649 4.278135 ... 2.555319 5.550013
2016-12-31 22:00:00 15.623852 6.495309 ... 3.879631 8.426347
2016-12-31 23:00:00 24.787511 10.304920 ... 6.155101 13.368545
[8784 rows x 22 columns]
>>> scenario.state.get_solar()
--> Loading solar
12878 12896 ... 14005 14006
UTC ...
2016-01-01 00:00:00 0.000000 0.000000 ... 0.000000 0.000000
2016-01-01 01:00:00 0.000000 0.000000 ... 0.000000 0.000000
2016-01-01 02:00:00 0.000000 0.000000 ... 0.000000 0.000000
2016-01-01 03:00:00 0.000000 0.000000 ... 0.000000 0.000000
2016-01-01 04:00:00 0.000000 0.000000 ... 0.000000 0.000000
... ... ... ... ... ...
2016-12-31 19:00:00 4.038380 1.319879 ... 42.380628 15.867405
2016-12-31 20:00:00 3.682028 1.272921 ... 40.524097 13.642190
2016-12-31 21:00:00 3.247971 1.177243 ... 37.351799 10.046580
2016-12-31 22:00:00 2.301117 0.985453 ... 31.177795 4.248332
2016-12-31 23:00:00 0.932313 0.466155 ... 15.401827 0.205514
[8784 rows x 36 columns]
Should we rename the BACKUP_DATA_ROOT_DIR, DATA_ROOT_DIR and LOCAL_DIR variables in
Tests do not pass on Windows:
I don't feel a strong need to do so currently, but I'm not opposed to it.
Thanks, Windows. Looks like it's due to calling
The main code looks good and works to create, prepare, and run a scenario, and to load its inputs & outputs from within the scenario framework. Once the testing on Windows is worked out, I think it is ready to merge.
Purpose
Remove (most) dependence on the server root path by using relative paths to download/upload files and passing the source and destination roots at instantiation time, so they are reused throughout the object's lifespan. This moves the responsibility of parsing paths, creating folders, etc. away from calling code, and the assumption of two fixed "root" directories simplifies some logic.
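The relative-path idea can be illustrated with a toy sketch (assumed names, not the actual copy_from implementation): one relative path is resolved against both roots, so the local tree mirrors the server tree.

```python
import os

def copy_from_sketch(rel_path, src_root, dest_root, transfer):
    """Toy sketch: resolve rel_path against both roots, create the
    mirrored local folders, then hand off to a transfer callable
    (a stand-in for the actual download, e.g. an SFTP get)."""
    src = src_root + "/" + rel_path                       # server side (posix)
    dest = os.path.join(dest_root, *rel_path.split("/"))  # local side
    os.makedirs(os.path.dirname(dest), exist_ok=True)     # mirror the folders
    transfer(src, dest)
    return dest
```

Because the caller only ever hands over a path relative to the root, the same call works unchanged whichever root directory the access object was constructed with.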
What the code is doing
Chronologically, this involved first changing the semantics of copy_from to implement copying a node from a remote tree to a local tree while preserving the structure. This required a lot of path joins and such to be updated. Along the way I removed the data_access constructor parameter from a few classes, introduced a small PathConfig object that simplifies joins where still needed, and created the CommandBuilder class to reuse the logic for building command line invocations. The commit history may be useful to see this trajectory.
Testing
There are some new unit tests for building commands and file transfers, the latter using a semi-mock connection which operates locally on temp directories.
I also ran a texas simulation using this branch and successfully extracted/downloaded files to verify the end to end process.
Where to look
The
SSHDataAccess
class andserver_setup
module were the starting points, most everything else flows from that, but does include some opportunistic refactoring. TheInputData
andOutputData
classes are a good illustration of simplified logic, and as usual the unit tests are a (hopefully) good demo.Time estimate
30 min