Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_KnowsLocal methods and properties exclusiveness #190

Closed
sasanjac opened this issue Dec 16, 2024 · 29 comments
Closed

_KnowsLocal methods and properties exclusiveness #190

sasanjac opened this issue Dec 16, 2024 · 29 comments

Comments

@sasanjac
Copy link

Why are some methods and properties exclusive to _KnowsLocal, e.g. minute, second, replace()etc?

It's possible to create an Instant at an arbitrary point in time, so why is it not possible to set its components to an arbitrary value afterwards?

Similarly, why are its components not accessible?

@ariebovenberg
Copy link
Owner

Hi @sasanjac, thanks for opening this issue! I agree that this can be a surprise at first.

The confusion—I suspect—lies in this contradiction:

  • Instant is an "abstract" way of indicating time. Similar to a UNIX timestamp, it doesn't have a year, month, etc. It just indicates a moment in time—unambiguously. There is no "local" time. This may seem limiting, but is actually very liberating. No need to take into account the particularities of calendars and clock time. All popular modern datetime libraries have a similar type (often also named Instant)
  • For clarity, the repr of Instant displays the instant in UTC (Instant(2024-12-16 09:43:28.806464Z)). However, this is purely for human understandability during debugging, since it's much more descriptive than something like Instant(epoch=1734342279.237856). However, the downside is that it gives the impression that Instant does have year, month, day, hour, etc. components—which it doesn't (for reasons mentioned above)

This issue comes up often, so I'll probably add a FAQ item.

Let me know if this does (not) answer your question. I'm always happy to look at ways to make things more intuitive 😄

@sasanjac
Copy link
Author

Yeah ok, that's actually what I guessed. I just got confused, because sometimes it seems to behave as a UTC Datetime and sometimes it just doesn't.

@sasanjac
Copy link
Author

But I think maybe there should be another Layer (e.g. UTCDateTime) between Instant and ZonedDateTime for when you only care about the moment but still need to do time manipulation.

@ariebovenberg
Copy link
Owner

ariebovenberg commented Dec 16, 2024

Considering the recurrence of this topic, I'd be interested to hear more about your use case for manipulating individual components of UTC. Persuaded by other libraries and my experience, I came to the conclusion that you rarely ever need to think of UTC in these terms. But it could just be my lack of imagination 😄

In my experience, when I deal in UTC, it's only to perform operations like arithmetic (add, subtract, difference, and comparison operators). The actual year/month/day/hour don't matter and are essentially arbitrary. A user of Instant needs to be confronted with the fact that to get the "date" or "clock time", you need to pick a timezone:

>>> i = Instant.now()
i.date()  # there is no such thing as the "current date", it's different throughout the world!
>>> i.to_tz("Asia/Tokyo").date()
Date(2024-12-17)
>>> i.to_tz("US/Eastern").date()
Date(2024-12-16)

But I think maybe there should be another Layer (e.g. UTCDateTime) between Instant and ZonedDateTime for when you only care about the moment but still need to do time manipulation.

Have you tried OffsetDateTime with offset=0? Although it'd still have ergonomics issues because you'd still have to specify ignore_dst=True for replace().

edit: clarifications

@sasanjac
Copy link
Author

For the last part: Yes that's how I implemented it currently.

I think the main goal is to manage real time data from various sources spanning different time zones. The data affects each other so time zones are irrelevant thus I like to use something like Instant that indicates exactly that.

But then, maybe you need to do periodic operations on that data e.g. calculating some values aggregated over set timeframes for which you need to calculate the start and end points. For example calculating a mean value since the start of the last quarter hour.

I think most real time systems, where timezone does not matter in regards to algorithms is set up in a similar way and therefore it might be nice to have like a specialized OffsetDateTime which assumes UTC and thus removes DST logic etc.

@ariebovenberg
Copy link
Owner

ariebovenberg commented Dec 16, 2024

Thanks for sharing. Your use case makes sense.

But then, maybe you need to do periodic operations on that data e.g. calculating some values aggregated over set timeframes for which you need to calculate the start and end points. For example calculating a mean value since the start of the last quarter hour.

Aggregation makes sense, although note that any reference point is essentially arbitrary. The "last quarter hour" only works because every timezone on earth currently defines an offset in 15-minute increments.

My initial thoughts are to:

  1. Just call timestamp() and do your aggregation calculations in int or float
  2. Choose a reference point and use TimeDelta arithmetic:
>>> from whenever import Instant, minutes
>>> ref_point = Instant.from_utc(2024, 12, 16, hour=12)  # arbitrary!
>>> data = Instant.now()
>>> # put the data point into 15-minute aggregated buckets
>>> data_bucket = int(
...     (data - ref_point)  # returns a TimeDelta
...     / minutes(15)   # dividing by another TimeDelta returns a float (i.e. how many times one fits in the other)
... )

@sasanjac
Copy link
Author

sasanjac commented Dec 17, 2024

I can't create the arbitrary reference point because it is dependent on the current point in time. I agree with your argument that it is up to the current time definition.

I basically implemented it like this:

now = whenever.ZonedDateTime.now("UTC")
minutes = now.minute % 15
start = now.replace(second=0, nanosecond=0, disambiguate="compatible") - whenever.TimeDelta(minutes=minutes)

With a designated UTCDateTime it could become something like this:

now = whenever.UTCDateTime.now()
minutes = now.minute % 15
start = now.replace(second=0, nanosecond=0) - whenever.TimeDelta(minutes=minutes)

Semantically, this would be in between ZonedDateTime and Instant because it completely strips away the concept of time zones but keeps the definition of time units (which probably won't ever change anyway).

@ariebovenberg
Copy link
Owner

ariebovenberg commented Dec 17, 2024

Looking at your case, it looks like you could use something similar to the rounding API in Temporal

In whenever, it'd look something like this:

# this API doesn't exist (yet)
now = OffsetDateTime.now(0, ignore_dst=True)
now.round(unit="minute", increment=15, mode="floor")

@ariebovenberg
Copy link
Owner

Semantically, this would be in between ZonedDateTime and Instant because it completely strips away the concept of time zones but keeps the definition of time units (which probably won't ever change anyway).

In previous releases, Instant used to be just this: UTCDateTime. Experience from other libraries showed me the value of Instant though, so it was natural to replace UTCDateTime.

Aside from adding the rounding API to all ...DateTime classes, I can imagine the following options going forward:

  1. Add UTCDateTime next to Instant. Downside: there are now 6 classes...
  2. Move Instant back to UTCDateTime again. Downside: easier for devs to make incorrect assumptions, for example: naively doing calendar arithmetic or assuming time-of-day.
  3. Design OffsetDateTime to be easier to use as UTC. Perhaps removing ignore_dst=True requirement or allowing special case offset=None. Downside: I'm not sure this would result in a good API.
  4. Add round_utc() method to Instant to allow rounding based on the UTC date and time-of-day. Downside: an awkward compromise, more of such methods might be needed.

The core dilemma comes to this:

  • on the one hand: UTC date and time-of-day components don't have any meaning. Exposing them risks users making incorrect assumptions. If you do need UTC explicitly, OffsetDateTime will work
  • on the other hand: you could argue that by virtue of being a standard, UTC date and time-of-day do have meaning. People may want to round or aggregate by UTC date or time-of-day. Using OffsetDateTime isn't ideal, because it forces you to account for the offset being variable—which it isn't in the case of just using UTC.

Right now I'm leaning towards option (4). What are your thoughts?

@BurntSushi
Copy link

@ariebovenberg Sorry to butt in here (I'm the author of Jiff), but following along here, it kind of looks like the work around to just used a ZonedDateTime in UTC works pretty well? Is there a specific problem that comes up with that approach?

@ariebovenberg
Copy link
Owner

ariebovenberg commented Dec 17, 2024

@BurntSushi It'd work for sure—maybe I'm overthinking it. What irks me though is the required disambiguation in replace(). I'm still believe (somewhat against the current) that this explicitness is a good idea (#173) and helps in the general case, but in the UTC case it sticks out like a sore thumb.

edit: clarification

@ariebovenberg
Copy link
Owner

@BurntSushi butting in is welcome 😄 , would be curious for your thoughts on #173

@BurntSushi
Copy link

Ah yeah, in Jiff I just make compatible the default everywhere, with no explicit opt-in. It's what Temporal does as well.

@sasanjac
Copy link
Author

sasanjac commented Dec 17, 2024

I've thought about this a while now and I think we basically have what we need already: OffsetDateTime, which UTCDateTime would be a very special instance of.
What I don't really understand is the need for handling DST. Isn't that what ZonedDateTime is for? I mean OffsetDateTime has a fixed offset which would change when considering DST.
So basically, if OffsetDateTime removed DST handling altogether and maybe even had an offset=0 by default, the problem would be solved.

To address your comments:

  1. Add UTCDateTime next to Instant. Downside: there are now 6 classes...

I think UTCDateTime is a special (the most simple) instance of OffsetDateTime.

  1. Move Instant back to UTCDateTime again. Downside: easier for devs to make incorrect assumptions, for example: naively doing calendar arithmetic or assuming time-of-day.

I think the Instant concept is pretty good. I wouldn't want to change that.

  1. Design OffsetDateTime to be easier to use as UTC. Perhaps removing ignore_dst=True requirement or allowing special case offset=None. Downside: I'm not sure this would result in a good API.

As above, why is there even a need for that in the first place? But maybe I really don't understand it.

  1. Add round_utc() method to Instant to allow rounding based on the UTC date and time-of-day. Downside: an awkward compromise, more of such methods might be needed.

I think that is an already too specific implementation for a specific problem.

@ariebovenberg
Copy link
Owner

ariebovenberg commented Dec 18, 2024

So basically, if OffsetDateTime removed DST handling altogether and maybe even had an offset=0 by default, the problem would be solved.

As above, why is there even a need for that in the first place? But maybe I really don't understand it.

I'm happy to explain. ignore_dst a similarly controversial as disambiguate and I'm looking for opinions 🙂

The dilemma is that we either:

  1. Allow convenient arithmetic on OffsetDateTime, which may breed DST bugs. For example, the result of adding 24 hours to 2024-03-09 13:00:00-07:00 is different whether the offset corresponds to Denver or Phoenix. In returning 2024-03-10 13:00:00-07:00 you're effectively assuming there is no DST transition. This is a very easy and common mistake!
  2. Or, make it explicit where DST bugs may be introduced. This is the current approach. ignore_dst=True is a signal "yes, I accept potential DST bugs here because I ignore timezone changes". There is no ignore_dst=False since OffsetDateTime cannot account for it.
  3. Disable DST-sensitive arithmetic on OffsetDateTime entirely

Solutions (2) and (3) make OffsetDateTime an awkward choice modeling UTC (only), since the whole idea is to avoid DST in the first place. As suggested, ZonedDateTime with tz="UTC" may be better, especially if disambiguate becomes optional (very likely now).

here are the docs on the topic. I'm always eager for feedback on its clarity 👂

edit: add UTC remark

@sasanjac
Copy link
Author

I really don't see the dilemma. As I understand it, 2024-03-09 13:00:00-07:00 is neither Denver or Phoenix time zone but UTC-7. The time zone information is missing so there should be no way to do DST-sensitive arithmetic on OffsetDateTime. So for my understanding, adding 24 hours should result in 2024-03-10 13:00:00-07:00.

However, if I had the information whether UTC-7 is Denver or Phoenix, it should be required to use ZonedDateTime in order to do DST-sensitive arithmetic.

@ariebovenberg
Copy link
Owner

ariebovenberg commented Dec 18, 2024

As I understand it, 2024-03-09 13:00:00-07:00 is neither Denver or Phoenix time zone but UTC-7. The time zone information is missing so there should be no way to do DST-sensitive arithmetic on OffsetDateTime. So for my understanding, adding 24 hours should result in 2024-03-10 13:00:00-07:00.

Yes you are right if you are disciplined enough to only use OffsetDateTime for truly fixed offsets (e.g. commercial oceanic shipping). However, it's super common for systems to use fixed offset timestamps for stuff that is actually timezoned. Imagine you're taking delivery of chemicals that need to 'rest' for 24 hours before opening:

# Taking delivery of a package in Denver. The system records a ISO8601 timestamp
delivery = OffsetDateTime.parse_common_iso("2024-03-09T12:00:00-07:00")
# this calculation *seems* safe, but it isn't!
can_open = delivery.add(hours=24)  # "2024-03-10 12:00:00-07:00"
print('IMPORTANT: don't open package until {can_open.local()}')  # prints 12:00, but should be 13:00

Having to type delivery.add(hours=24, ignore_dst=True) may be a hassle for the few users that really use only fixed offsets, but it's a life-saver for most.

edit: typos

@BurntSushi
Copy link

To add, this is also why Temporal.ZonedDateTime will refuse to parse something like 2024-03-09T12:00:00-07:00:

> zdt = Temporal.ZonedDateTime.from("2024-03-09T12:00:00-07:00")
Uncaught RangeError: Temporal.ZonedDateTime requires a time zone ID in brackets
    ParseTemporalZonedDateTimeString ecmascript.mjs:496
    ToTemporalZonedDateTime ecmascript.mjs:1511
    from zoneddatetime.mjs:478
    <anonymous> debugger eval code:1
[ecmascript.mjs:496:34](https://tc39.es/polyfill/lib/ecmascript.mjs)

Temporal (and Jiff) doesn't have a dedicated OffsetDateTime. But you can get one by providing a ZonedDateTime with a "time zone" that is a fixed offset:

> instant = Temporal.Instant.from("2024-03-09T12:00:00-07:00")
Object { … }
> instant.toString()
"2024-03-09T19:00:00Z"
> zdt = instant.toZonedDateTimeISO("-07:00")
Object { … }
> zdt.toString()
"2024-03-09T12:00:00-07:00[-07:00]"

@sasanjac
Copy link
Author

As I understand it, 2024-03-09 13:00:00-07:00 is neither Denver or Phoenix time zone but UTC-7. The time zone information is missing so there should be no way to do DST-sensitive arithmetic on OffsetDateTime. So for my understanding, adding 24 hours should result in 2024-03-10 13:00:00-07:00.

Yes you are right if you are disciplined enough to only use OffsetDateTime for truly fixed offsets (e.g. commercial oceanic shipping). However, it's super common for systems to use fixed offset timestamps for stuff that is actually timezoned. Imagine you're taking delivery of chemicals that need to 'rest' for 24 hours before opening:

# Taking delivery of a package in Denver. The system records a ISO8601 timestamp
delivery = OffsetDateTime.parse_common_iso("2024-03-09T12:00:00-07:00")
# this calculation *seems* safe, but it isn't!
can_open = delivery.add(hours=24)  # "2024-03-10 12:00:00-07:00"
print('IMPORTANT: don't open package until {can_open.local()}')  # prints 12:00, but should be 13:00

Having to type delivery.add(hours=24, ignore_dst=True) may be a hassle for the few users that really use only fixed offsets, but it's a life-saver for most.

edit: typos

I understand your argument and I can totally understand if you keep the logic as it is currently, but in my opinion, that's a user mistake.
I believe the package would benefit from a change in logic, as it would create a clear distinction between the five classes, leaving the user with only one option per use case and the difference between ZonedDateTime and OffsetDateTime would not be so blurred. But thats entirely your decision of course.

@sasanjac
Copy link
Author

Also, if you keep the logic as is, maybe use ignore_dst as an instance variable of OffsetDateTime which defaults to False and if True lets the use omit setting ignore_dst=True on every operation.

@ariebovenberg
Copy link
Owner

I understand your argument and I can totally understand if you keep the logic as it is currently, but in my opinion, that's a user mistake.

Absolutely a user mistake, agree. As @BurntSushi mentions it's indeed a hard balancing act to create an API that's "easy to use, hard to misuse".

I believe the package would benefit from a change in logic, as it would create a clear distinction between the five classes, leaving the user with only one option per use case and the difference between ZonedDateTime and OffsetDateTime would not be so blurred. But thats entirely your decision of course.

Could you elaborate on what you mean with "change in logic" and how this would create a clear distinction?

@sasanjac
Copy link
Author

Could you elaborate on what you mean with "change in logic" and how this would create a clear distinction?

Allowing arithmetic operations without setting ignore_dst=True but on second thought my last comment might be enough and doesn't shift the balance of the API.

@ariebovenberg
Copy link
Owner

@sasanjac in the latest release, 0.6.16, the explicit disambiguation parameter is now optional.

Pending implementation of rounding (#152), your use case becomes a bit cleaner:

now = whenever.ZonedDateTime.now("UTC")
minutes = now.minute % 15
start = now.replace(second=0, nanosecond=0) - whenever.TimeDelta(minutes=minutes)

@sasanjac
Copy link
Author

perfect, thanks :)

@sasanjac
Copy link
Author

And how would you feel about a dedicated .utcnow() method? 😅
If you are open to something like that, I can create a separate issue for that.

@ariebovenberg
Copy link
Owner

Quite the bold idea, considering the utcnow() in the standard library is already so controversial 😁.

What might be an option is having "UTC" as a default value for tz—but even that I think isn't a good idea.

Just for completeness, functools.partial is great for this:

utcnow = partial(ZonedDateTime.now, "UTC")  # creates a new callable
utcnow()  # same as calling ZonedDateTime.now("UTC")

What I would be open to is having tz="UTC" be the default for Instant.to_tz()—but even then it'd have to be carefully considered.

@sasanjac
Copy link
Author

My motivation is not really avoiding being verbose but ensuring being type safe and string literals are somewhat tricky. I would think that using enums for time zones is unfeasible due to the maintenance needed.
functools.partial is unusable unfortunately when needing proper typing atm.

@ariebovenberg
Copy link
Owner

@sasanjac FYI, the latest release as added rounding—also to Instant.

so you can now do:

>>> now = Instant.now()
Instant(2025-02-21 08:54:21Z)
>>> now.round("minute", increment=15, mode="floor")
Instant(2025-02-21 08:45:00Z)

See the docs for more info

@sasanjac
Copy link
Author

Uhh that's cool! I'll check it out, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants