TYP: get_indexer #40612

jbrockmendel · 2021-03-24T15:42:31Z

@simonjayhawkins I'm at a loss as to why I'm getting a mypy complaint:

error: Overloaded function signatures 1 and 2 overlap with incompatible return types  [misc]

AFAICT they should be compatible. Am I missing something obvious?

jbrockmendel · 2021-04-01T16:30:59Z

@simonjayhawkins gentle ping

jreback · 2021-04-02T02:23:06Z

can you rebase

jbrockmendel · 2021-04-02T03:02:13Z

cc @WillAyd any idea what's going on here? My confusion only deepens when I try to do the same thing for Index.join

jbrockmendel · 2021-04-02T22:02:41Z

rebased, still no idea what's going on with mypy

jbrockmendel · 2021-04-05T18:49:41Z

@MarcoGorelli #40197 makes me think you have a better grasp of @overload than I do. Can you tell what I'm doing wrong here?

MarcoGorelli

Hey @jbrockmendel,

Yeah, I'm embarrassed by how long I spent making sense of this in order to do #40200 and #40197 (maybe I should write a blog post* about it)

I explain it a bit more clearly in #40200 (comment), but in essence, you need an overload for each combination of keyword arguments that precede the one you're overloading

Here, there are no keyword arguments preceding unique, so you'll just need overloads for:

Literal[True] = ...
Literal[False]
bool = ...

What's confusing here is that the bool overload is needed - IMO, it shouldn't be, as the union of Literal[True] and Literal[False] should be regarded as the same as bool, but I opened an issue about that in mypy and have had no response: python/mypy#10194

*just put something down while it's still fresh in my mind - and in case it's useful to you: https://medium.com/@m.e.gorelli/making-sense-of-typing-overload-437e6deecade

MarcoGorelli · 2021-04-05T20:52:42Z

Indeed, this seems to work:

    @overload
    def _get_indexer_non_comparable(
        self, target: Index, method, unique: Literal[True] = ...
    ) -> np.ndarray:
        # returned ndarray is np.intp
        ...

    @overload
    def _get_indexer_non_comparable(
        self, target: Index, method, unique: Literal[False]
    ) -> tuple[np.ndarray, np.ndarray]:
        # both returned ndarrays are np.intp
        ...

    @overload
    def _get_indexer_non_comparable(
        self, target: Index, method, unique: bool = ...
    ) -> Union[tuple[np.ndarray, np.ndarray], np.ndarray]:
        ...
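A self-contained sketch of the same three-overload shape, runnable without pandas. The function `find_positions` and its toy body are hypothetical stand-ins, not the pandas implementation; only the layout of the signatures (default on the `Literal[True]` overload, no default on `Literal[False]`, a `bool` fallback) mirrors the snippet above.

```python
from typing import List, Literal, Tuple, Union, overload


@overload
def find_positions(
    values: List[int], target: List[int], unique: Literal[True] = ...
) -> List[int]:
    ...


@overload
def find_positions(
    values: List[int], target: List[int], unique: Literal[False]
) -> Tuple[List[int], List[int]]:
    ...


@overload
def find_positions(
    values: List[int], target: List[int], unique: bool = ...
) -> Union[List[int], Tuple[List[int], List[int]]]:
    ...


def find_positions(values, target, unique=True):
    # Position of each target element in values, or -1 when it is absent.
    indexer = [values.index(t) if t in values else -1 for t in target]
    if unique:
        return indexer
    # Positions within target that had no match in values.
    missing = [i for i, pos in enumerate(indexer) if pos == -1]
    return indexer, missing
```

Calling `find_positions([10, 20, 30], [20, 40])` returns `[1, -1]`, and passing `unique=False` returns `([1, -1], [1])`. A type checker should pick the first overload for the call without `unique`, the second for an explicit `False`, and the third when `unique` is only known to be a `bool`.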

jbrockmendel · 2021-04-05T21:36:13Z

> Indeed, this seems to work:

It does, but AFAICT it isn't the 3rd @overload that is making a difference, it's the fact that in the second one you wrote unique: Literal[False] instead of unique: Literal[False] = ...

jbrockmendel · 2021-04-06T01:43:00Z

> It does, but AFAICT it isn't the 3rd @overload that is making a difference, it's the fact that in the second one you wrote unique: Literal[False] instead of unique: Literal[False] = ...

related: what if the argument for which we need to omit the = ... comes after another keyword-only argument?

update: it looks like just pretending, in the overload signature, that the previous argument is not keyword-only works. But that feels very sketchy

MarcoGorelli · 2021-04-06T07:41:11Z

> It does, but AFAICT it isn't the 3rd @overload that is making a difference, it's the fact that in the second one you wrote unique: Literal[False] instead of unique: Literal[False] = ...

You need to omit = ... for Literal[False] because the default is True, which only the Literal[True] overload should accept.

Otherwise, consider what happens if you call _get_indexer_non_comparable(target, method) without passing unique. Which overload should mypy use? Literal[False] = ... and Literal[True] = ... both match, and their return types are incompatible, so mypy throws an error. On the other hand, if you type the overloads as Literal[True] = ... and Literal[False], then when you don't specify the value of unique, mypy will know to use the one which accepts defaults, i.e. the Literal[True] overload.

Check the mypy docs on this:

mypy will detect and prohibit inherently unsafely overlapping overloads on a best-effort basis.
Two variants are considered unsafely overlapping when both of the following are true:
    - All of the arguments of the first variant are compatible with the second.
    - The return type of the first variant is not compatible with (e.g. is not a subtype of) the second.
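To make the overlap rule concrete, here is a minimal sketch of the failing shape, assuming a hypothetical toy function `lookup` (not part of pandas). Both overloads carry a default, so a call that omits `unique` matches either one. Based on the error quoted at the top of this thread, mypy is expected to reject the pair; the code still runs, because overloads are ignored at runtime.

```python
from typing import List, Literal, Tuple, overload


@overload
def lookup(keys: List[str], unique: Literal[True] = ...) -> List[int]:
    ...


# Expected mypy complaint for this pair (both overloads accept lookup(keys)):
#   Overloaded function signatures 1 and 2 overlap with
#   incompatible return types  [misc]
# Dropping "= ..." from this overload removes the ambiguity.
@overload
def lookup(
    keys: List[str], unique: Literal[False] = ...
) -> Tuple[List[int], List[int]]:
    ...


def lookup(keys, unique=True):
    # Toy body: every key is "found" at its own position.
    indexer = list(range(len(keys)))
    if unique:
        return indexer
    return indexer, []
```

With the default removed from the second overload, `lookup(keys)` can only match the first signature, so the return type is unambiguous.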

> related: what if the argument for which we need to omit the = ... comes after another keyword-only argument?

See what I did for set_axis - you need to provide overloads (without =...) for each combination of keyword arguments which comes before the one you're overloading
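A hedged sketch of that pattern, using a hypothetical `add_suffix` function rather than the actual `set_axis` signatures. The optional argument `sep` precedes `inplace`, so the `Literal[True]` case needs one overload where `sep` is passed and one keyword-only overload where it is omitted. Neither gives `inplace` a default.

```python
from typing import Dict, Literal, Optional, overload


@overload
def add_suffix(
    data: Dict[str, int], suffix: str, sep: str = ..., inplace: Literal[False] = ...
) -> Dict[str, int]:
    ...


@overload
def add_suffix(
    data: Dict[str, int], suffix: str, sep: str, inplace: Literal[True]
) -> None:
    ...


@overload
def add_suffix(
    data: Dict[str, int], suffix: str, *, inplace: Literal[True]
) -> None:
    ...


@overload
def add_suffix(
    data: Dict[str, int], suffix: str, sep: str = ..., inplace: bool = ...
) -> Optional[Dict[str, int]]:
    ...


def add_suffix(data, suffix, sep="_", inplace=False):
    # Build the renamed mapping first so the original keys are still available.
    renamed = {f"{key}{sep}{suffix}": value for key, value in data.items()}
    if inplace:
        data.clear()
        data.update(renamed)
        return None
    return renamed
```

Here `add_suffix({"a": 1}, "x")` returns a new dict, while `add_suffix(d, "x", inplace=True)` mutates `d` and returns `None`. Each additional optional argument before `inplace` would multiply the number of `Literal[True]` overloads, which is the verbosity the linked mypy issue discusses.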

In case you haven't already, see this open issue in mypy about this topic: python/mypy#6580

jbrockmendel · 2021-04-07T20:27:36Z

Are any of the remaining comments actionable?

jbrockmendel · 2021-04-19T15:34:33Z

gentle ping @MarcoGorelli

MarcoGorelli

As far as I can tell, this is correct (though probably I don't know this part of the codebase well enough to merge)

The only part here I'm not totally comfortable commenting on is the change from missing to ensure_platform_int(missing), as that involves making sense of pxi.in files, which I've not (yet) done. If that change is correct, then I'm pretty sure the rest is fine

MarcoGorelli · 2021-04-19T17:49:15Z

pandas/core/indexes/base.py

@@ -5217,7 +5221,7 @@ def get_indexer_non_unique(self, target):
        tgt_values = target._get_engine_target()

        indexer, missing = self._engine.get_indexer_non_unique(tgt_values)
-        return ensure_platform_int(indexer), missing
+        return ensure_platform_int(indexer), ensure_platform_int(missing)


perhaps you could just comment on why this change is necessary, and what it does?

jbrockmendel · 2021-04-19

I think this was from before a separate PR ensured that self._engine.get_indexer_non_unique already returned ndarray[intp], but this makes it a little bit more explicit (and is cheap)

MarcoGorelli · 2021-04-19T18:54:33Z

Thanks for explaining - thanks @jbrockmendel

* TYP: get_indexer
* update per discussion in pandas-dev#40612
* one more overload
* pre-commit fixup

TYP: get_indexer

f47c4b2

jbrockmendel added the Typing type annotations, mypy/pyright type checking label Mar 24, 2021

jreback added this to the 1.3 milestone Apr 2, 2021

Merge branch 'master' into typ-get_indexer

9734962

Merge branch 'master' into typ-get_indexer

5b815ac

MarcoGorelli self-requested a review April 5, 2021 19:37

MarcoGorelli requested changes Apr 5, 2021

Merge branch 'master' into typ-get_indexer

abb1187

update per discussion in pandas-dev#40612

a115d0d

MarcoGorelli reviewed Apr 6, 2021

jbrockmendel added 2 commits April 6, 2021 09:44

Merge branch 'master' into typ-get_indexer

4dc8836

Merge branch 'master' into typ-get_indexer

1c21432

MarcoGorelli requested changes Apr 6, 2021

jbrockmendel added 3 commits April 6, 2021 15:26

Merge branch 'master' into typ-get_indexer

6182535

one more overload

0e7a168

pre-commit fixup

407aeb3

MarcoGorelli reviewed Apr 7, 2021

jbrockmendel added 3 commits April 13, 2021 10:27

Merge branch 'master' into typ-get_indexer

9dc2907

Merge branch 'master' into typ-get_indexer

d6491c9

Merge branch 'master' into typ-get_indexer

262b28d

MarcoGorelli approved these changes Apr 19, 2021

MarcoGorelli reviewed Apr 19, 2021

MarcoGorelli merged commit e78fd92 into pandas-dev:master Apr 19, 2021

jbrockmendel deleted the typ-get_indexer branch April 19, 2021 19:00
