API: Use object dtype for empty Series #29405
Meant to open a draft PR - please don't review this for now. I assume quite a few more warnings and broken tests will need to be fixed.
Mmm sorry. We need to have a discussion about whether this is worth the change. At this point, I'm not sure... This can be made backwards compatible, right? When the data is empty and no dtype is specified, we can emit a warning saying "pass dtype=float to keep the old behavior, or dtype=object for the new behavior."
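To make the proposal concrete, here is a minimal sketch of such a backward-compatible shim. It is not pandas' actual implementation: `resolve_empty_dtype` is a hypothetical helper, the dtype names are plain strings, and the `DeprecationWarning` category is an assumption based on the later remark about turning the warning into a `FutureWarning`.

```python
import warnings


def resolve_empty_dtype(data=None, dtype=None):
    """Hypothetical helper: pick the dtype name for a Series-like constructor."""
    if dtype is not None:
        # An explicit dtype is always honoured and never triggers the warning.
        return dtype
    if data is None or len(data) == 0:
        # Empty data and no dtype: keep the old default but announce the change.
        warnings.warn(
            "pass dtype=float to keep the old behavior, "
            "or dtype=object for the new behavior",
            DeprecationWarning,  # assumed category, see lead-in
            stacklevel=2,  # attribute the warning to the caller's line
        )
        return "float64"
    # Non-empty data: placeholder for the usual dtype inference.
    return "inferred"
```

With this shape, only callers that pass neither data nor a dtype see the warning; everyone else is unaffected.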
Do you mean if we want to change it at all? No strong opinion about it, but it seems reasonable to want consistent behaviour across Series, DataFrame and Index for this. Any particular reason why we wouldn't want that?
Why would we want to stay backwards compatible here? A big release like 1.0.0 seems like a pretty good opportunity to break this behaviour. We could obviously add this warning, but I wonder if it's really meaningful. Or do you mean we should add this warning before implementing the actual change, i.e. going through a full deprecation cycle?
@SaturnFromTitan We generally don't like to make non-backward-compatible changes if we can help it, and if we can't avoid one, we provide a deprecation period. This change would likely break lots of code that is out there, so it's better to provide some notice before we actually change it.
@jreback
yes only a warning for now
Agreed that a warning is the best path forward.
Phew... I changed the dtype change to a warning, but now realise this triggers an immense amount of warnings that clutter all test output. Silencing all of them feels like a huge undertaking... Is there an easier way around this @jreback @TomAugspurger?
They'll all need to be updated. We don't allow unhandled warnings in our test suite.
We'll also need a new test ensuring that not specifying the dtype or data results in a warning.
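A test along those lines could look roughly like the sketch below. It uses only the standard library and a hypothetical stand-in, `empty_series_dtype`, in place of the real constructor; the `DeprecationWarning` category and the message text are assumptions for illustration, not taken from the patch.

```python
import unittest
import warnings


def empty_series_dtype(dtype=None):
    """Hypothetical stand-in for constructing an empty Series and reading its dtype."""
    if dtype is None:
        warnings.warn(
            "no dtype given for empty data; the default will change",
            DeprecationWarning,  # assumed category
            stacklevel=2,
        )
        return "float64"
    return dtype


class TestEmptySeriesDtype(unittest.TestCase):
    def test_warns_without_dtype(self):
        # Omitting both data and dtype must produce the warning.
        with self.assertWarns(DeprecationWarning):
            result = empty_series_dtype()
        self.assertEqual(result, "float64")

    def test_explicit_dtype_is_silent(self):
        # Turn every warning into an exception so that an unhandled
        # warning fails the test instead of cluttering the output.
        with warnings.catch_warnings():
            warnings.simplefilter("error")
            self.assertEqual(empty_series_dtype(dtype="object"), "object")
```

The second test mirrors the "no unhandled warnings" rule: call sites that pass an explicit dtype stay quiet, so they need no warning-catching wrapper.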
As Tom indicates, the best way here is to correct the expected results in the tests.
@jreback @TomAugspurger I understand how I can resolve the warnings; the problem I see is the amount of work. This code change now raises >4000 warnings. Changing all those tests would not only require a lot of work but would also result in a gigantic PR :D Now the question: Do you think there is a good way to work around this, or should we rather close this PR as "Won't fix" because the value it brings doesn't justify the effort it demands?
There shouldn't be nearly this many warnings. The change is only for a very small subset of things, e.g. empty Series with no dtype. Your patch is catching other cases; haven't looked in depth but likely
See my comments; you should have very, very few catch-warnings blocks at all. If you have a bare Series(), just use Series(dtype=object) instead. That will silence the warning and be correct.
@jreback Thanks for taking a look at the PR already! I think I still need to rework some of the changes and silence more warnings, but it's already very helpful.
merge master as well. i have to fully review.
Thanks for the updates!
@jreback @TomAugspurger @jorisvandenbossche Thanks for your comments! I think I addressed all of them. Please let me know if you think there's more to do. Note: CI is failing, but that seems unrelated.
Looks all good to me now! @SaturnFromTitan for future reference: if you could push only new commits instead of squashing with previous commits (and thus force pushing), that would be preferred. It makes it easier to see what has changed since a previous review with the GitHub interface.
@jreback it's what is written in https://dev.pandas.io/docs/development/policies.html (#28415). If you want to reopen that discussion (or seek further clarification), maybe start a mailing list discussion.
…sts and docs so that no unnecessary warnings are thrown
This is green as well @jreback
lgtm. some minor comments. ping on green.
I'm not too sure about the types I assigned for
@jreback merge conflicts resolved and CI is green ✅
thanks @SaturnFromTitan great patch! this was a biggie. do we have an issue to turn this into a FutureWarning? happy to accept patches for things like typing (or anything else)!
Yes, here's the issue for turning it into a FutureWarning: #30017
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff