-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update config file path resolution #716
Update config file path resolution #716
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #716 +/- ##
===========================================
- Coverage 94.72% 94.69% -0.04%
===========================================
Files 73 73
Lines 4932 4937 +5
===========================================
+ Hits 4672 4675 +3
- Misses 260 262 +2 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Just had a minor documentation suggestion
One last detail of model configurations is that you may want to allow users to specify a | ||
file path to configure your model. This should not be used for loading the data used to | ||
run the model but might be used to load further details. If you do want to include a | ||
file path in your congfiguration, you need to pick a configuration key name that ends in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worth giving an example here, as it's pretty technical language which I think becomes much more obvious when you give an example of what a non-data
data file might contain
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. I have to admit that without the explanation in the PR description, I would not know what this is suppose to be doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dalonsoa Yup - I've updated both this section and the _resolve_config_paths
docstring. Is that now clearer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the part I find difficult is "specify a file path to configure your model", in that I think you can only understand this if you already have a pretty good idea of how the model configuration system works. I guess something expanded on what this means/why you would want to configure the models with additional non-data files would be useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh FFS. I hadn't pushed my update to that section. I've pushed it and added a concrete example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh yeah the new text is much clearer!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code looks OK, but I think the explanation on what this is for and how to use it needs some improvement. As it is now, it is very much unclear.
I've also flagged one potential issue, although that might be better addressed in a separate PR.
One last detail of model configurations is that you may want to allow users to specify a | ||
file path to configure your model. This should not be used for loading the data used to | ||
run the model but might be used to load further details. If you do want to include a | ||
file path in your congfiguration, you need to pick a configuration key name that ends in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. I have to admit that without the explanation in the PR description, I would not know what this is suppose to be doing.
@@ -97,36 +97,34 @@ def config_merge( | |||
return dest, conflicts | |||
|
|||
|
|||
def _resolve_config_paths(config_dir: Path, params: dict[str, Any]) -> None: | |||
def _resolve_config_paths(config_dir: Path, config_dict: dict[str, Any]) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't checked this back in the day, so I mention it here now. This will work totally find in Mac and Linux (bugs aside) but will fail in Windows if any of the paths is in a different unit than the configuration file. Let's say your config file is in C:
but you have all you data files in a separate drive, eg. D:
. In this case, there will be an error as it is not possible to stablish relative paths between locations in different units. It's a very specific case, but I mention it because I learn it the hard way!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dalonsoa Just to check I understand this. If we have a config file on Windows, then the file could be C:/config.toml
but contain references to D:/data/data1.nc
. Within the single config file, does that data path have to be absolute, because there is no way in Windows to get a relative path onto the D:/
drive from C:/
?
If that is the case, then we just need to make sure those D:/
references don't get tampered with. I think this is another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep. In Windows, you cannot stablish relative paths between locations in different units. It just does not let you. At least, pathlib
cannot resolve that situation. This is what I get when I tested it a few minutes ago:
>>> from pathlib import Path
>>> a = Path.cwd()
>>> a
WindowsPath('C:/Users/dalonsoa')
>>> b = Path(r"D:/agdp.exe")
>>> b
WindowsPath('D:/agdp.exe')
>>> b.exists()
True
>>> b.relative_to(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\dalonsoa\.pyenv\pyenv-win\versions\3.12.4\Lib\pathlib.py", line 682, in relative_to
raise ValueError(f"{str(self)!r} is not in the subpath of {str(other)!r}")
ValueError: 'D:\\agdp.exe' is not in the subpath of 'C:\\Users\\dalonsoa'
>>> a.relative_to(b)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\dalonsoa\.pyenv\pyenv-win\versions\3.12.4\Lib\pathlib.py", line 682, in relative_to
raise ValueError(f"{str(self)!r} is not in the subpath of {str(other)!r}")
ValueError: 'C:\\Users\\dalonsoa' is not in the subpath of 'D:\\agdp.exe'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much clearer now!
Co-authored-by: Diego Alonso Álvarez <[email protected]>
Description
This PR:
Updates
core.config._resolve_config_paths
to recursively search a config file payload for values stored under keys ending in_path
and then resolves those values relative to the config path. The function previously only did this tofile
entries in[core.data.variable]
tables and recent changes in model configurations now need options to resolve other file paths within the config tree.Extends the test suite to check path manipulation and error handling work as expected
Updates the core model schema to change
file
tofile_path
and then rolls out the change in the[core.data.variable]
schema specs across example files and tests to get this to apply. This is a bit of a sweeping change but I think it is long term easier to maintain the pattern "xyz_path
needs resolving" than to have to maintain a list that "these specific key paths need resolving".Fixes # 715
Type of change
Key checklist
pre-commit
checks:$ pre-commit run -a
$ poetry run pytest
Further checks