Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve type signatures of pipe methods to enable type checkers to flag erroneous usages #9997

Open
chuckwondo opened this issue Jan 29, 2025 · 6 comments

Comments

@chuckwondo
Copy link

chuckwondo commented Jan 29, 2025

What is your issue?

The type annotations of xarray.Dataset.pipe, xarray.DataArray.pipe, and xarray.DataTree.pipe should be enhanced to address the following shortcomings in terms of what type checkers cannot flag due to the imprecise typing currently in place:

  1. Incorrect number of arguments
  2. Incorrect argument types
  3. Code suggestions for chaining after the pipe (only an issue for Datatree.pipe)

Note that in the case of specifying a tuple (function + keyword) as the first argument to pipe, only item 3 can be addressed because there's no way to fully specify the function's type signature when the function does not take the dataset/array/tree as its first argument.

In this case, this can be addressed by using a lambda instead. In essence, my recommendation would be for users of pipe not to use the tuple form so that more robust type checking can be performed.

NOTE: I plan to submit a PR for this issue.

@chuckwondo chuckwondo added the needs triage Issue that has not been reviewed by xarray team member label Jan 29, 2025
Copy link

welcome bot commented Jan 29, 2025

Thanks for opening your first issue here at xarray! Be sure to follow the issue template!
If you have an idea for a solution, we would really welcome a Pull Request with proposed changes.
See the Contributing Guide for more.
It may take us a while to respond here, but we really value your contribution. Contributors like you help make xarray better.
Thank you!

@TomNicholas TomNicholas added topic-typing and removed needs triage Issue that has not been reviewed by xarray team member labels Jan 29, 2025
@TomNicholas
Copy link
Member

That sounds good, but for someone unfamiliar with ParamSpec can you elaborate on how that would improve things? Or just show us in a PR 😁

@max-sixty
Copy link
Collaborator

FWIW unless someone is actively working on it or seeking input, I would vote to not leave issues open for things like this — there is some coordination cost of having hundreds of open issues — and most typing work (though not all tbc) isn't that ambiguous such that we can have useful discussions before seeing code contributions.

So I would propose leaving this open for a week or two and then closing with "contributions welcome"

(nothing directed at you @chuckwondo , thank you for opening it, and if you'd be up for submitting a PR with some implementation that would be very welcome!)

@chuckwondo
Copy link
Author

That sounds good, but for someone unfamiliar with ParamSpec can you elaborate on how that would improve things? Or just show us in a PR 😁

Absolutely. I'm already working on a PR. I should have noted in the initial comment that I plan to submit a PR to address it.

I suspect I won't put up a PR until next week due to my current commitments. I am also adding a pytest plugin that supports testing type annotations via mypy, so the type tests that I'll include will show clarifying examples. (Note: this is different from the existing task that runs code through mypy. Rather, the pytest plugin supports testing that your own type annotations behave as intended.)

In short, here are the shortcomings of the current pipe type annotations in terms of what type checkers cannot flag due to the imprecise typing currently in place:

  1. Incorrect number of arguments
  2. Incorrect argument types
  3. Code suggestions for chaining after the pipe (only an issue for Datatree.pipe)

Note that in the case of specifying a tuple (function + keyword) as the first argument to pipe, only item 3 can be addressed because there's no way to fully specify the function's type signature when the function does not take the dataset/array/tree as its first argument. In this case, this can be addressed by using a lambda instead. In essence, my recommendation would be for users of pipe not to use the tuple form so that more robust type checking can be performed.

(nothing directed at you @chuckwondo , thank you for opening it, and if you'd be up for submitting a PR with some implementation that would be very welcome!)

No worries. I realize there are much more pressing issues to address.

(Note: I may have an unhealthy obsession with type annotations, so whenever I open an issue regarding the need for correcting or improving any type annotations, it means I also plan to open a PR to address the issue, but I will try to remember to explicitly indicate this whenever I open such an issue.)

@max-sixty
Copy link
Collaborator

(Note: I may have an unhealthy obsession with type annotations, so whenever I open an issue regarding the need for correcting or improving any type annotations, it means I also plan to open a PR to address the issue, but I will try to remember to explicitly indicate this whenever I open such an issue.)

Amazing, that's great, thank you!

@chuckwondo chuckwondo changed the title Improve type signatures of pipe methods with ParamSpec Improve type signatures of pipe methods to enable type checkers to flag erroneous usages Jan 30, 2025
@chuckwondo
Copy link
Author

I enhanced the initial comment with some details from my previous comment, and I also updated the issue title to be more descriptive rather than prescriptive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants