fix/normalize storage paths #2384

Merged: 11 commits, Oct 16, 2024

Conversation

d-v-b
Contributor

@d-v-b d-v-b commented Oct 15, 2024

This PR ports some of the path normalization logic from zarr v2 to v3. I largely copied the v2 function as-is, with a few tweaks. That function is not terribly strict about path parsing, so I'm open to requests to make it stricter. The main thing to note is that normalize_path (the new name for the function) strips leading / characters from strings, which ensures that paths are relative.
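
As a rough illustration of the behavior described above (not the actual zarr implementation), a normalizer that guarantees relative paths might look like the sketch below. The name `normalize_path_sketch` is hypothetical. Only the stripping of leading `/` characters comes from this PR description; the backslash conversion, the collapsing of repeated slashes, and the rejection of `.` and `..` segments are assumptions about what a v2-style function could do.

```python
from __future__ import annotations


def normalize_path_sketch(path: str | bytes | None) -> str:
    """Hypothetical sketch of path normalization; not zarr's real normalize_path."""
    if path is None:
        # Assumption: None is treated the same as the empty (root) path.
        return ""
    if isinstance(path, bytes):
        path = path.decode("ascii")
    # Assumption: backslashes are treated as path separators.
    path = path.replace("\\", "/")
    # Dropping empty segments strips leading and trailing slashes and
    # collapses repeated slashes, so the result is always relative.
    segments = [segment for segment in path.split("/") if segment]
    # Assumption: '.' and '..' segments are rejected rather than resolved.
    if any(segment in {".", ".."} for segment in segments):
        raise ValueError(f"path segments '.' and '..' are not allowed: {path!r}")
    return "/".join(segments)
```

With this sketch, `normalize_path_sketch("/foo/bar")` and `normalize_path_sketch("foo/bar")` produce the same relative path, which is the property the PR cares about.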

I wired this function into make_store_path, which previously didn't take path as a parameter (now it does). This opened up some nice code deletion opportunities in our various creation routines.

Something that always bugged me in zarr v2 was the use of None as a default for types that already have a natural default value. For strings, the empty string '' is a perfectly good default, so allowing None is just noise IMO. I was tempted to remove None as a valid path argument in this PR, but I held back. If people agree that None is a silly default for a stringy value like path when we can just use '' instead, I can make that change as well.
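
To make the point about defaults concrete, here is a small sketch contrasting the two signatures. Both functions are hypothetical and exist only for illustration; neither is part of zarr, and the joining logic is an assumption chosen to keep the example short.

```python
from __future__ import annotations


def join_with_none_default(root: str, path: str | None = None) -> str:
    # None has to be translated to '' before the value can be used as a string.
    if path is None:
        path = ""
    return f"{root}/{path}" if path else root


def join_with_empty_default(root: str, path: str = "") -> str:
    # '' already means "no sub-path", so no extra branch (or wider type) is needed.
    return f"{root}/{path}" if path else root
```

Both versions return the same result for every string input; the second simply has a narrower type and one fewer code path to test.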

Fixes #2357

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/tutorial.rst
  • Changes documented in docs/release.rst
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

Member

@jhamman jhamman left a comment

Good stuff @d-v-b - I'm glad we're moving path into make_store_path.

Contributor

@TomAugspurger TomAugspurger left a comment

Thanks, a couple questions about the implementation.

One meta-comment I've been thinking about, probably for a future PR: we could define PathLike as a NewType. The user-facing API could accept str | PathLike, and internally we could use just PathLike. Then we can verify with mypy that we've normalized all paths provided by the user.

@d-v-b d-v-b merged commit 29246d6 into main Oct 16, 2024
25 checks passed
@d-v-b d-v-b deleted the fix/normalize-storage-paths branch October 16, 2024 17:04
d-v-b added a commit to d-v-b/zarr-python that referenced this pull request Oct 18, 2024
* bring in path normalization function from v2, and add a failing test

* rephrase comment

* simplify storepath creation

* Update tests/v3/test_api.py

Co-authored-by: Joe Hamman <[email protected]>

* refactor: remove redundant zarr format fixture

* replace assertion with an informative error message

* fix incorrect path concatenation in make_store_path, and refactor store_path tests

* remove upath import because we don't need it

* apply suggestions from code review

---------

Co-authored-by: Joe Hamman <[email protected]>
d-v-b added a commit that referenced this pull request Oct 18, 2024
* move v3/tests to tests and fix various mypy issues

* test(ci): change branch name in v3 workflows (#2368)

* Use lazy % formatting in logging functions (#2366)

* Use lazy % formatting in logging functions

* f-string should be more efficient

* Space before unit symbol

From "SI Unit rules and style conventions":
https://physics.nist.gov/cuu/Units/checklist.html

	There is a space between the numerical value and unit symbol,
	even when the value is used in an adjectival sense, except in
	the case of superscript units for plane angle.

* Enforce ruff/flake8-logging-format rules (G)

---------

Co-authored-by: Joe Hamman <[email protected]>

* Move roadmap and v3-design documument to docs (#2354)

* move roadmap to docs

* formatting and minor copy editing

* Multiple imports for an import name (#2367)

Co-authored-by: Joe Hamman <[email protected]>

* Enforce ruff/pycodestyle warnings (W) (#2369)

* Apply ruff/pycodestyle rule W291

W291 Trailing whitespace

* Enforce ruff/pycodestyle warnings (W)

It looks like `ruff format` does not catch all trailing spaces.

---------

Co-authored-by: Joe Hamman <[email protected]>

* Apply ruff/pycodestyle preview rule E262 (#2370)

E262 Inline comment should start with `# `

Co-authored-by: Joe Hamman <[email protected]>

* Fix typo (#2382)

Co-authored-by: Joe Hamman <[email protected]>

* Imported name is not used anywhere in the module (#2379)

* Missing mandatory keyword argument `shape` (#2376)

* Update ruff rules to ignore (#2374)

Co-authored-by: Joe Hamman <[email protected]>

* Docstrings for arraymodule (#2276)

* start to docstrings for arraymodule

* incorporating toms edits, overriding mypy error...

* fix attrs

* Update src/zarr/core/array.py

Co-authored-by: Sanket Verma <[email protected]>

* fix store -> storage

* remove properties from asyncarray docstring

---------

Co-authored-by: Sanket Verma <[email protected]>
Co-authored-by: Joe Hamman <[email protected]>

* fix/normalize storage paths (#2384)

* bring in path normalization function from v2, and add a failing test

* rephrase comment

* simplify storepath creation

* Update tests/v3/test_api.py

Co-authored-by: Joe Hamman <[email protected]>

* refactor: remove redundant zarr format fixture

* replace assertion with an informative error message

* fix incorrect path concatenation in make_store_path, and refactor store_path tests

* remove upath import because we don't need it

* apply suggestions from code review

---------

Co-authored-by: Joe Hamman <[email protected]>

* Enforce ruff/flake8-pyi rule PYI013 (#2389)

PYI013 Non-empty class body must not contain `...`

Note that documentation is enough to fill the class body.

* deps: remove fasteners from list of dependencies (#2386)

* Enforce ruff/flake8-annotations rule ANN003 (#2388)

ANN003 Missing type annotation

Co-authored-by: Joe Hamman <[email protected]>

* Enforce ruff/Perflint rules (PERF) (#2372)

* Apply ruff/Perflint rule PERF401

PERF401 Use a list comprehension to create a transformed list

* Enforce ruff/Perflint rules (PERF)

* chore: update package maintainers (#2387)

* chore: update package maintainers

* Update pyproject.toml

Co-authored-by: David Stansby <[email protected]>

---------

Co-authored-by: David Stansby <[email protected]>

* Fixed consolidated Group getitem with multi-part key (#2363)

* Fixed consolidated Group getitem with multi-part key

This fixes `Group.__getitem__` when indexing with a key
like 'subgroup/array'. The basic idea is to rewrite the indexing
operation as `group['subgroup']['array']` by splitting the key
and doing each operation independently.

Closes #2358

---------

Co-authored-by: Joe Hamman <[email protected]>

* chore: add python 3.13 to ci / pyproject.toml (#2385)

* chore: add python 3.13 to ci / pyproject.toml

* update hatch matrix

* remove references to dead test dir in pyproject.toml

* remove v3 reference in test

---------

Co-authored-by: Joe Hamman <[email protected]>
Co-authored-by: Dimitri Papadopoulos Orfanos <[email protected]>
Co-authored-by: Emma Marshall <[email protected]>
Co-authored-by: Sanket Verma <[email protected]>
Co-authored-by: David Stansby <[email protected]>
Co-authored-by: Tom Augspurger <[email protected]>
Development

Successfully merging this pull request may close these issues.

Paths with leading slashes do bad things
3 participants