pd.Timestamp constructor ignores missing arguments #31930
Comments
Definite +1. Moreover I think the
I intentionally left out the following args that I think can be deprecated/removed:
|
The constructor argument-handling behavior is a problem. I, however, disagree that constructing timezone-aware Timestamps should happen in two steps unless building from components. It would no doubt be much easier for us to support, but it looks like we are cutting functionality because refactoring is painful. Also, we should consider that this is how
value = tz_localize_to_utc(np.array([self.value], dtype='i8'), tz,
                           ambiguous=ambiguous,
                           nonexistent=nonexistent)[0]
return Timestamp(value, tz=tz, freq=self.freq)
It just converts to time since epoch and calls the constructor. |
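As a rough standard-library analogy of the pattern quoted above (convert the wall time to an epoch value, then pass that value and the zone to the constructor), here is a minimal sketch. The helper name `localize_via_epoch` is hypothetical, it only handles fixed-offset zones, and it does not attempt the `ambiguous`/`nonexistent` handling that the pandas code performs.

```python
from datetime import datetime, timedelta, timezone

EPOCH_NAIVE = datetime(1970, 1, 1)


def localize_via_epoch(naive, tz):
    """Attach a fixed-offset zone to a naive wall time via its epoch value.

    Hypothetical helper for illustration only; not pandas code.
    """
    # Seconds since the epoch, reading the naive wall time as if it were UTC.
    wall_seconds = (naive - EPOCH_NAIVE).total_seconds()
    # Subtract the zone's offset to obtain the true UTC epoch value.
    utc_seconds = wall_seconds - tz.utcoffset(naive).total_seconds()
    # Hand the epoch value and the zone to the ordinary constructor path.
    return datetime.fromtimestamp(utc_seconds, tz=tz)


eastern = timezone(timedelta(hours=-5))
localized = localize_via_epoch(datetime(2009, 1, 10, 12, 0), eastern)
```

The wall-clock fields are unchanged by the round trip; only the zone information is added.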
I tend to be more in the camp of Refactoring/transition is painful, but I think it's worth getting to a more straightforward state. |
It's a matter of opinion, I suppose. I believe that if something is a valid object belonging to a class, we should be able to make this object with the constructor as easily as possible unless there is a very good reason not to have this functionality. We could argue that the user can just parse whatever input they have into components themselves and then build with our constructor, but I don't think that's an excuse. Especially because it's very common (at least in my experience) to read dates as strings from text files and then build datetime-like from that, whether the result needs to be tz-aware or not. A bit off-topic, but since we are quoting that principle from Zen of Python. Currently, If we'd like to refactor the missing arguments behavior, we could benefit from discussing PR scope in advance, otherwise we risk pulling the entire class into scope. |
Please don't massively refactor in this PR. If we want to merge this with a little more review and then potentially do a refactor (likely in multiple steps), then I'm OK with that. We have quite a large test suite for Timestamp, so refactors should be easily possible. |
@mroeschke @pganssle Now that I reread the discussion, looks like that's what you intend too. |
@jreback I think you are responding on the wrong issue, this is not the "add |
Hello all. Just a small notification, that
doesn't work as a workaround and produces an error
:) |
Hello all. The cause of the issue is that tz was the third positional argument, so it received the value intended for the day. To prevent this behavior, tz was made a keyword-only argument. Please review the PR I've submitted. Regards, |
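The mechanism described in this comment can be sketched in plain Python. Both functions below are hypothetical stand-ins, not the pandas signature: the only detail taken from the thread is that tz sat in the third positional slot, and the other parameter names are assumptions made for illustration.

```python
def old_style(ts_input=None, freq=None, tz=None, year=None, month=None, day=None):
    """Hypothetical signature with tz as the third positional parameter."""
    return {"ts_input": ts_input, "freq": freq, "tz": tz, "day": day}


def keyword_only_tz(year, month, day, *, tz=None):
    """Hypothetical signature where tz can only be passed by keyword."""
    return {"year": year, "month": month, "day": day, "tz": tz}


# The third positional value silently lands in tz instead of day.
bound = old_style(2009, 1, 10)

# Adding tz by keyword then collides with the positional value.
try:
    old_style(2009, 1, 10, tz="UTC")
except TypeError as exc:
    collision_message = str(exc)

# With a keyword-only tz, the positional values go where the caller expects.
fixed = keyword_only_tz(2009, 1, 10, tz="UTC")
```

Placing tz after a bare `*` means a stray positional value can never be bound to it; passing a fourth positional argument to `keyword_only_tz` raises a TypeError instead of being silently misread.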
FWIW - with version 1.4.2 of pandas the following two options work nicely:
ts1 = pd.Timestamp("2009-01-10", tz="UTC")
ts2 = pd.Timestamp(year=2009, month=1, day=10, tz='UTC')
and pd.Timestamp(2009, 1, 10, tz='UTC') raises an error:
TypeError Traceback (most recent call last)
/var/folders/hc/m_f707sd7t3ghfvxn5013sfm0000gp/T/ipykernel_39707/770468094.py in <module>
----> 1 pd.Timestamp(2009,1, 10, tz='UTC')
pandas/_libs/tslibs/timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.__new__()
TypeError: __new__() got multiple values for keyword argument 'tz'
I'm totally not sure whether this issue is the relevant one, but I got here thanks to #31930 (comment) by @ArtyomKaltovich. |
As part of the discussions in #31563, I came across these strange semantics in pd.Timestamp, where it is apparently legal to over-specify a pd.Timestamp by passing both a datetime (or another Timestamp) and the by-component construction values, and any irrelevant arguments are ignored:
The signature for the function is:
There's actually a decent amount of redundant information in there, because pd.Timestamp is attempting to have its own constructor in addition to being constructable like a datetime. Properly, there are two overloaded constructors here (note that I'm not sure if nanosecond belongs to both or just one):
I think that ideally the correct behavior would be to throw an error if you mix and match between the two, which is at least done in the case of specifying both tz and tzinfo:
Though confusingly this also fails if you specify tzinfo at all in the "by-component" constructor. I have filed a separate bug for that at #31929.
Recommendation
I think that the behavior of pandas.Timestamp should probably be brought at least mostly in line with the concept of two overloaded constructors (possibly with tz and tzinfo being mutually-exclusive aliases for one another). Any other combination, particularly combinations where the values passed are ignored, should raise an exception.
This may be a breaking change, since it will start raising exceptions in code that didn't raise exceptions before (though I am not sure I can think of any situation where silently ignoring the values is a desirable condition), so it may be a good idea to have a deprecation period where a warning rather than an exception is raised.
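One way the recommendation could look in practice is sketched below with the standard library only. Everything here is an assumption made for illustration: `build_timestamp` and its `strict` flag are hypothetical names, plain `datetime` stands in for `pd.Timestamp`, and the choice of `FutureWarning` for the deprecation period is a guess rather than anything pandas has committed to.

```python
import warnings
from datetime import datetime, timezone

_COMPONENTS = ("year", "month", "day", "hour", "minute", "second", "microsecond")


def build_timestamp(ts_input=None, *, tz=None, tzinfo=None, strict=False, **components):
    """Hypothetical two-constructor dispatcher (illustration only, not pandas)."""
    unknown = set(components) - set(_COMPONENTS)
    if unknown:
        raise TypeError(f"unexpected arguments: {sorted(unknown)}")
    # tz and tzinfo are treated as mutually exclusive aliases.
    if tz is not None and tzinfo is not None:
        raise ValueError("pass at most one of tz and tzinfo")
    zone = tz if tz is not None else tzinfo

    if ts_input is not None:
        # "From an existing datetime" constructor: components must not be mixed in.
        if components:
            message = f"components {sorted(components)} are ignored when ts_input is given"
            if strict:
                raise ValueError(message)
            # Deprecation period: warn instead of raising.
            warnings.warn(message, FutureWarning, stacklevel=2)
        if zone is None:
            return ts_input
        if ts_input.tzinfo is None:
            return ts_input.replace(tzinfo=zone)
        return ts_input.astimezone(zone)

    # "By component" constructor: datetime itself rejects missing year/month/day.
    return datetime(**components, tzinfo=zone)


by_component = build_timestamp(year=2009, month=1, day=10, tz=timezone.utc)
```

With `strict=False` the over-specified call still returns the input but emits a warning; flipping `strict` to `True` models the state after the deprecation period, where the same call raises.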