-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: missing __getitem__
type fixes
#1963
Conversation
) -> Series[Any]: ... | ||
|
||
@overload | ||
def __getitem__( | ||
self: Self, | ||
key: ( | ||
item: ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
silly mistake (I blame copilot) π
I wonder why it was not picked up though
This comment was marked as resolved.
This comment was marked as resolved.
you do now π feel free to make improvements, thanks for all your help + evangelism of Narwhals |
Oh hello π |
Maybe this is too much for the PR. # Annotations for `__getitem__` methods
SingleIndexSelector: TypeAlias = int
MultiIndexSelector: TypeAlias = Union[
slice,
range,
Sequence[int],
"Series",
"np.ndarray[Any, Any]",
]
SingleNameSelector: TypeAlias = str
MultiNameSelector: TypeAlias = Union[
slice,
Sequence[str],
"Series",
"np.ndarray[Any, Any]",
]
BooleanMask: TypeAlias = Union[
Sequence[bool],
"Series",
"np.ndarray[Any, Any]",
]
SingleColSelector: TypeAlias = Union[SingleIndexSelector, SingleNameSelector]
MultiColSelector: TypeAlias = Union[MultiIndexSelector, MultiNameSelector, BooleanMask] Still ends up pretty complex in the # `str` overlaps with `Sequence[str]`
# We can ignore this but we must keep this overload ordering
@overload
def __getitem__(
self, key: tuple[SingleIndexSelector, SingleColSelector]
) -> Any: ...
@overload
def __getitem__( # type: ignore[overload-overlap]
self, key: str | tuple[MultiIndexSelector, SingleColSelector]
) -> Series: ...
@overload
def __getitem__(
self,
key: (
SingleIndexSelector
| MultiIndexSelector
| MultiColSelector
| tuple[SingleIndexSelector, MultiColSelector]
| tuple[MultiIndexSelector, MultiColSelector]
),
) -> DataFrame: ...
def __getitem__(
self,
key: (
SingleIndexSelector
| SingleColSelector
| MultiColSelector
| MultiIndexSelector
| tuple[SingleIndexSelector, SingleColSelector]
| tuple[SingleIndexSelector, MultiColSelector]
| tuple[MultiIndexSelector, SingleColSelector]
| tuple[MultiIndexSelector, MultiColSelector]
),
) -> DataFrame | Series | Any: @MarcoGorelli do you see these aliases as a desirable thing to steal (re-implement) from |
key: ( | ||
slice | ||
item: ( | ||
int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @dangotbanned thanks for fixing! super quick I didn't have the chance to see the comments :D
Anyway I think that int
is a bit of a special case.
The narwhals rule at the moment is that int
in __getitem__
should not be fully supported (https://narwhals-dev.github.io/narwhals/pandas_like_concepts/column_names/) or encouraged .
I think we shouldn't add it in typing here, but ignore the int
slicing in the tests if a typing error occurs.
Having said that, it is weird that after my changes mypy
wants me to remove the # ignore
in the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EdAbati interesting π€
So should this line in the doc be rewritten?
narwhals/narwhals/dataframe.py
Line 928 in 733ab52
- Integers are always interpreted as positions |
I do appreciate that what you're saying is documented, but writing always reads to me as something that belongs in the annotation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm oh yeah actually good catch, you are right! then it makes a lot of sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EdAbati marking as unresolved just to show another option if a reviewer wanted it:
diff --git a/narwhals/dataframe.py b/narwhals/dataframe.py
index 6b673a80..207ef7b2 100644
--- a/narwhals/dataframe.py
+++ b/narwhals/dataframe.py
@@ -859,6 +859,8 @@ class DataFrame(BaseFrame[DataFrameT]):
"""
return self._compliant_frame.estimated_size(unit=unit) # type: ignore[no-any-return]
+ @overload
+ def __getitem__(self: DataFrame[pd.DataFrame], item: int) -> Any: ...
@overload
def __getitem__( # type: ignore[overload-overlap]
self: Self,
diff --git a/narwhals/stable/v1/__init__.py b/narwhals/stable/v1/__init__.py
index 5aefefe2..af8aed47 100644
--- a/narwhals/stable/v1/__init__.py
+++ b/narwhals/stable/v1/__init__.py
@@ -84,6 +84,7 @@ if TYPE_CHECKING:
from types import ModuleType
import numpy as np
+ import pandas as pd
from typing_extensions import Self
from narwhals.dtypes import DType
@@ -131,6 +132,8 @@ class DataFrame(NwDataFrame[IntoDataFrameT]):
def _lazyframe(self: Self) -> type[LazyFrame[Any]]:
return LazyFrame
+ @overload
+ def __getitem__(self: DataFrame[pd.DataFrame], item: int) -> Any: ...
@overload
def __getitem__( # type: ignore[overload-overlap]
self: Self,
I think this describes the pandas
edge case?
I agree, my plan was actually to add these in this or follow-up PR. I find them easier to read and follow |
assert_equal_data(result, {"a": [2], "b": [12]}) | ||
|
||
|
||
def test_slice_fails(constructor_eager: ConstructorEager) -> None: | ||
class Foo: ... | ||
|
||
with pytest.raises(TypeError, match="Expected str or slice, got:"): | ||
nw.from_native(constructor_eager(data), eager_only=True)[Foo()] # type: ignore[call-overload] | ||
nw.from_native(constructor_eager(data), eager_only=True)[Foo()] # type: ignore[call-overload, unused-ignore] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
π€ ignoring unused-ignore seems a bit strange?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarcoGorelli see #1963 (comment)
It is to ignore pre-commit
not understanding the call is invalid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What type of PR is this? (check all applicable)
Checklist
If you have comments or can explain your changes, please do so below
I am so sorry @MarcoGorelli , I was a bit to quick to press "Ready for Review" π
This is a follow up of #1958