-
Notifications
You must be signed in to change notification settings - Fork 122
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: Add a hypothesis test for __getitem__ #1098
test: Add a hypothesis test for __getitem__ #1098
Conversation
2520ec0
to
e31110c
Compare
Compares polars with pandas & pyarrow.
e31110c
to
ddf1cbe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is awesome, thanks @sjdenny !
looks like there's a failure in the "random versions" CI job (specifically, pandas==2.0.3)
if we can address that, then I think we can merge this can then gradually address the rest
Thanks @MarcoGorelli !
Looks like this was a flake instead (at least, I was able to reproduce on pandas==2.2.3 too, by running through more examples). I've bumped the number of samples up to 10k in db69469: this does (did) reliably find the failing case, but the downside is it's a bit of a slow test now. Which end of the tradeoff scale do you want to go for? Related to this, I saw Hypothesis has settings profiles, which look neat for supporting faster settings locally & more thorough sampling in CI. |
Ah, coverage now. Looking π |
awesome stuff, thanks! there's a couple of really minor things i wanted to address, but i'll try to get this in shortly, really appreciate your contribution here π |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @sjdenny !
again, so sorry I was really slow with this one - this is really valuable work you've done here, well done π
Compares polars
__getitem__
calls with pandas & pyarrow.What type of PR is this? (check all applicable)
Related issues
DataFrame.__getitem__
Β #1008Checklist
Tests addedDocumented the changesIf you have comments or can explain your changes, please do so below.
Putting this up for initial feedback - there's a bunch of cases which pyarrow doesn't support (documented in the test). Do we want to support these cases, or should be instead tighten the test (on pandas as well) and declare these unsupported for now?
Cases:
df[0:1, 'a':'b']
(trivial pairs such asdf[:, 'a':'b']
where one slice is:
are an exception).df[[], "a":]
fails.df[..., ::step]
is unsupported, ieslice(None, None, <something>)