-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat, refactor: added new entry point to run call.py without PowerSimData #74
Conversation
) | ||
|
||
|
||
def _insert_in_file(filename, scenario_id, column_number, column_value): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could update the docstring here, since it now also works on the scenario list. I suspect we'd move this to a helper module when refactoring extract_data
since there is a duplicate of this in there, but it can wait, just an fyi if you didn't see that yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! I didn't really make any changes here, so I kept the docstring as is, but I see no reason why not to update it.
As far as the refactoring of insert_in_file out, I still think ultimately we should remove the PowerSim dependency from here. Since a multi-repo PR seemed a little daunting while I'm still learning the workflow of the team, I wrote this so decoupling it would ultimately just mean deleting _get_scenario, _record_scenario, _insert_in_file, the arg definitions, and the if args.scenario_id
statements (i.e. just deleting code but not necessarily writing anything new). My hope is to refactor extract_data
, update PowerSimData
to use this new call, then just delete these functions embedded here and in extract_data
.
) | ||
|
||
parser.add_argument( | ||
"powersim_threads", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to combine the thread arguments and still have compatibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but it causes some weird interactions which I'm not sure are desirable. Specifically, you can pass in 2 arguments defining the number of threads, and the second one will be applied.
E.g.
python -u ./pyreisejl/utility/call.py 88 12 -t 123
will set threads to 123 while python -u ./pyreisejl/utility/call.py -t 123 88 12
sets the threads arg to 12.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's too ugly to support both, adding a complementary PR in PowerSimData to specify threads with a -t
flag should be very easy, and as long as we merge both PRs at the same time it won't disrupt our workflow at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I was thinking it would be nice to only have one flag, but wasn't sure if we'd need to change powersimdata to achieve that. I'm fine with merging this now and doing the complementary PRs as a follow up, or adding onto this, whatever is easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ha, yeah I was trying to avoid a multi-repo PR first thing. (See above comment re: 'multi-repo PR'). Basically, I wrote this such that my plan was to maintain backwards compatibility with code that was easily deletable (i.e. anything referencing PowerSimData is in it's own function that isn't intertwined with the actual parameter parsing for the simulation) so that I could get this through, work on PowerSimData
to use the new flags, then just delete any relevant code in REISE.jl
without having to worry about coordination.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to sequence the changes however you're most comfortable.
As of now, the only people who regularly run scenarios are @BainanXia and myself, and we're pretty familiar with needing to merge/pull multiple repos at the same time to keep them playing nice together. It happens somewhat frequently with PowerSimData/PostREISE, and occasionally with PowerSimData/REISE.jl.
pyreisejl/utility/call.py
Outdated
parser.add_argument( | ||
"-s", | ||
"--start-date", | ||
help="The start date for the simulation in format 'YYYY-MM-DD HH:MM:SS'", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now we are limited to only running simulations at hourly resolution, and only for the year 2016 (see min_ts
and max_ts
in launch_scenario()
). Until we fix that issue, we should document it here.
pyreisejl/utility/call.py
Outdated
:param str end_date: end date of simulation as 'YYYY-MM-DD HH:MM:SS' | ||
:param int interval: length of each interval in hours | ||
:param str input_dir: directory with input data | ||
:param None/str output_dir: directory for output data. None defaults to input directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this accurate? I think None
defaults to an output
subfolder within the input directory (via the Julia part of REISE.jl).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope! I think you're right, and I missed it when trying to make sure my lines weren't too long. This should match the argument help message in line 164. Thank you for catching that!
I believe we are not far from getting rid of the
All three are defined in |
pd.Timestamp("2016-01-01 00:00:00"), | ||
pd.Timestamp("2016-01-01 02:00:00"), | ||
"H", | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we delete the dummy csv file at the end of the test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an in memory file, no? If so, nothing to clean up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I was looking for a way to do it in memory to avoid having to clean it up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All good then
pyreisejl/utility/call.py
Outdated
""" | ||
# extract time limits from 'demand.csv' | ||
profile = open(os.path.join(input_dir, "demand.csv")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's generally good to call open
with a context manager:
with open(os.path.join(input_dir, "demand.csv")) as profile:
min_ts, max_ts, freq = extract_date_limits(profile)
dates = pd.date_range(start=min_ts, end=max_ts, freq=freq)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
:return: (*pandas.Timestamp*) -- the valid date as a pandas timestamp | ||
""" | ||
regex = r"^\d{4}-\d{1,2}-\d{1,2}( (?P<hour>\d{1,2})(:\d{1,2})?(:\d{1,2})?)?$" | ||
match = re.match(regex, date) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always impressed by a good regex. That said, did we consider using either datetime.strptime or pandas.to_datetime (which supports the same format as strptime)? Nothing wrong with using regex, just throwing that out there in case it's useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex is to parse out whether or not an hour was specified, hence the use of the named capture group for that field specifically. This gives us the ability to specify an end date with no hour (e.g. 2016-01-01 to 2016-12-31), and then round it up to the end of the last day (2016-01-01 00:00:00 to 2016-12-31 23:00:00) which hopefully makes it a little more user friendly.
pyreisejl/utility/call.py
Outdated
print(start_ts) | ||
print(end_ts) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these prints here on purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope! Those are leftover from my personal testing.
pyreisejl/utility/call.py
Outdated
print(args.start_date) | ||
print(args.end_date) | ||
print(args.interval) | ||
print(args.input_dir) | ||
print(args.output_dir) | ||
print(args.threads) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these prints still here on purpose?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope! Those are leftover from my personal testing.
valid_date += pd.Timedelta(hours=23) | ||
|
||
else: | ||
err_str = f"'{date}' is an invalid date. It needs to be in the form YYYY-MM-DD." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we also say that HH:MM:SS
can be passed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not sure what's the best way to say that those are valid but optional. I didn't want an overly-wordy error string. Any suggestions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
f"'{date}' is an invalid date. YYYY-MM-DD is required, HH:MM:SS are optional."
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for refactoring this! I will follow your lead when I add a command line argument for the number of segments to linearize to.
b76b10f
to
ce55ef5
Compare
Purpose
Remove dependency of call.py on PowerSimData, adding another entry point to call.py that is not a scenario ID.
What is the code doing?
This code now runs independently of PowerSim data by passing in flags to define the required and options parameters to run a simulation. The flags are documented below, which is the result of running
python call.py -h
orpython call.py --help
.To maintain backwards compatibility (for now), I have been able to successfully run the following two commands in my local environment and get the Julia engine to run locally (though not complete because I'm impatient) after copying scenario 1220 data to my local and updating
const.py
accordingly:python -u ./pyreisejl/utility/call.py 1220
python -u ./pyreisejl/utility/call.py 1220 5
As an example of how the code can now be run:
python ./pyreisejl/utility/call.py -s '2016-01-01 00:00:00' -e '2016-01-07 23:00:00' -int 24 -i /Users/annahurlimann/ScenarioData/test
I've also included some additional checks/input validation:
python -u ./pyreisejl/utility/call.py -s '2016-01-01 00:00:00'
returns an WrongNumberOfArguments error).Where to Look
The code changed is all in
pyreise.utility.call
.Time to Review
15 min.