Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

data_kind: Refactor the if-else statements into match-case statements #3481

Merged
merged 8 commits into from
Oct 7, 2024

Conversation

seisman
Copy link
Member

@seisman seisman commented Oct 3, 2024

Description of proposed changes

Follow-up PR after PR #3480.

This PR refactors the data_kind function, to use match-case statements instead of if-else statements.

This PR also updates the docstrings.

@seisman seisman added maintenance Boring but important stuff for the core devs skip-changelog Skip adding Pull Request to changelog needs review This PR has higher priority and needs review. labels Oct 3, 2024
@seisman seisman added this to the 0.14.0 milestone Oct 3, 2024
@@ -187,23 +187,26 @@ def _check_encoding(
return "ISOLatin1+"


def data_kind(
def data_kind( # noqa: PLR0911
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PLR0911 means "too-many-return-statements".

https://docs.astral.sh/ruff/rules/too-many-return-statements/

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR refactors the data_kind function, to use if-return statements instead of if-else statements. The new if-return statements are easier to read and maintain.

I don't see much of an improvement in readability/maintainability between the previous if-elif-else statements and the if-return style in this PR. Looking at https://stackoverflow.com/questions/72024229/how-to-solve-pylint-too-many-return-statements-elegantly, I'm thinking if we could either keep the if-elif-else syntax, or use match-case instead.

If using match-case, would it be more readable to do the isinstance() checks first, and then have a nested if-elif for case _ for the more complex cases?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel match-case would make it more complicated:

match data:
    case x if isinstance(x, io.StringIO):
        return "stringio"
    case x if isinstance(x, xr.DataArray):
        return "image" if len(x.dims) == 3 else "grid"
    case x if isinstance(x, bool | int | float) or (x is None and not required):
        return "arg"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using match-case also has the too-many-return-statements warning, and using if-elif-else means we need to type hints the kind variable like:

kind: Literal["arg", "file", "geojson", "grid", "image", "matrix", "stringio", "vectors"]

Copy link
Member

@weiji14 weiji14 Oct 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Haven't run tests on this, but we can remove some of the isinstance() calls like so using class patterns (xref https://stackoverflow.com/questions/67524641/convert-multiple-isinstance-checks-to-structural-pattern-matching/67524642#67524642)

    match data:
        case str() | pathlib.PurePath():
            kind = "file"
        case list() | tuple() if all(
            isinstance(_file, str | pathlib.PurePath) for _file in data
        ):
            kind = "file"
        case io.StringIO():
            kind = "stringio"
        case (bool() | int() | float()) | None if not required:
            kind = "arg"
        case xr.DataArray():
            kind = "image" if len(data.dims) == 3 else "grid"
        case x if hasattr(x, "__geo_interface__"):
            kind = "geojson"
        case x if x is not None:
            kind = "matrix"
        case _:
            kind = "vectors"
    return kind

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The match-case statements look more compact. Done in 33051ac.

The mypy warning is also suppressed.

@@ -287,30 +290,36 @@ def data_kind(
>>> data_kind(data=None)
'vectors'
"""
kind: Literal[
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type hint is no longer needed.

def data_kind(
data: Any = None, required: bool = True
def data_kind( # noqa: PLR0911
data: Any, required: bool = True
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making data a required parameter here.

@seisman seisman requested a review from a team October 5, 2024 12:34
@seisman seisman added final review call This PR requires final review and approval from a second reviewer and removed needs review This PR has higher priority and needs review. labels Oct 6, 2024
@seisman
Copy link
Member Author

seisman commented Oct 6, 2024

Will merge in 24 hours.

- ``"file"``: a string or a :class:`pathlib.PurePath` object or a sequence of them,
representing one or more file names
- ``"geojson"``: a geo-like Python object that implements ``__geo_interface__``
(e.g., geopandas.GeoDataFrame or shapely.geometry)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this highlighting exists also for shapely.geometry?

Suggested change
(e.g., geopandas.GeoDataFrame or shapely.geometry)
(e.g., :class:`geopandas.GeoDataFrame` or shapely.geometry)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is not documented in the API documentation page, so linking shapely.geometry or not makes little difference.

@seisman seisman changed the title data_kind: Refactor the if-else statements into if-return statements data_kind: Refactor the if-else statements into match-case statements Oct 7, 2024
@seisman seisman merged commit dd26601 into main Oct 7, 2024
21 checks passed
@seisman seisman deleted the data_kind/return branch October 7, 2024 05:05
@seisman seisman removed the final review call This PR requires final review and approval from a second reviewer label Oct 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Boring but important stuff for the core devs skip-changelog Skip adding Pull Request to changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants