[ESQL] Support autocasting to resolve types between nanosecond dates and millisecond dates #110009

Open
Tracked by #109352
not-napoleon opened this issue Jun 20, 2024 · 2 comments · May be fixed by #123678
Labels
:Analytics/ES|QL AKA ESQL >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) team-discuss

Comments

@not-napoleon
Member

Description

Discussion ticket for how mixed operations between millisecond dates and nanosecond dates should work. This covers binary comparisons and arithmetic operations between the two types, and possibly some functions (TBD). This is complicated, and getting it wrong could obligate us to support a bad choice for a long time for backwards compatibility (BWC).

Option 1 - Cast to millisecond date automatically

  • This has the benefit that it will never overflow, but it loses precision. Effectively, equality becomes a range operation: a millisecond value matches every nanosecond value that falls within that millisecond.
  • Changing this down the road would be a breaking change
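
To make the precision loss concrete, here is a minimal Python sketch of what Option 1 implies. It assumes the cast truncates nanoseconds to milliseconds with floor division; the helper name `nanos_to_millis` is hypothetical and is not taken from the ES|QL code base.

```python
NANOS_PER_MILLI = 1_000_000


def nanos_to_millis(nanos: int) -> int:
    """Truncate an epoch-nanosecond timestamp to epoch milliseconds (assumed behavior)."""
    return nanos // NANOS_PER_MILLI


# 2024-06-20T00:00:00Z expressed in epoch milliseconds.
millis_value = 1_718_841_600_000

# Two distinct nanosecond timestamps inside the same millisecond.
nanos_first = millis_value * NANOS_PER_MILLI + 1
nanos_last = millis_value * NANOS_PER_MILLI + 999_999

# After the cast, both compare equal to the millisecond value, so "==" behaves
# like the range check [millis, millis + 1ms).
both_equal = (
    nanos_to_millis(nanos_first) == millis_value
    and nanos_to_millis(nanos_last) == millis_value
)

# The first nanosecond of the next millisecond no longer matches.
next_milli_equal = nanos_to_millis((millis_value + 1) * NANOS_PER_MILLI) == millis_value
```

Under these assumptions `both_equal` is true and `next_milli_equal` is false: one millisecond value matches a million different nanosecond values.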

Option 2 - Cast to nanosecond date automatically

  • This can overflow, so an otherwise sensible comparison might just start returning nulls
  • Equality gets weird, since we'll just be multiplying the milliseconds by 1,000,000, so the vast majority of nanosecond values will never equal any millisecond value.
  • Changing this down the road would be a breaking change
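
Both concerns with Option 2 can be sketched in a few lines of Python. This assumes timestamps are stored as signed 64-bit integers and that an overflowing cast yields null (modeled here as `None`); the helper `millis_to_nanos` is hypothetical, and the sketch only models 64-bit overflow, not any other range restriction.

```python
from typing import Optional

NANOS_PER_MILLI = 1_000_000
LONG_MIN = -(2**63)
LONG_MAX = 2**63 - 1


def millis_to_nanos(millis: int) -> Optional[int]:
    """Scale epoch milliseconds to nanoseconds; return None on 64-bit overflow (assumed)."""
    nanos = millis * NANOS_PER_MILLI
    if nanos < LONG_MIN or nanos > LONG_MAX:
        return None  # stands in for the null a query would see
    return nanos


# A present-day timestamp converts without trouble.
in_range = millis_to_nanos(1_718_841_600_000)

# 10^13 ms becomes 10^19 ns, which exceeds 2^63 - 1, so the cast overflows.
overflowed = millis_to_nanos(10_000_000_000_000)

# Equality: the scaled value always ends in six zeros, so a nanosecond
# timestamp with any sub-millisecond component can never match it.
nanos_with_fraction = 1_718_841_600_000_000_001
never_equal = millis_to_nanos(1_718_841_600_000) == nanos_with_fraction
```

Here `overflowed` is `None` and `never_equal` is false: only one nanosecond value in every million can ever equal a scaled millisecond value.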

Option 3 - Don't auto cast anything, make the user pick the behavior they want

  • Explicit is better than implicit (i.e. Zen of Python compliant)
  • We could later add a default cast without it being breaking, as the explicit casts would still work
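
As a rough illustration of Option 3, the Python sketch below rejects mixed-resolution comparisons until the caller casts one side. The types `MillisDate` and `NanosDate` and the function `dates_equal` are hypothetical stand-ins, not ES|QL names. The point is only that the caller chooses which trade-off to accept.

```python
from dataclasses import dataclass

NANOS_PER_MILLI = 1_000_000


@dataclass(frozen=True)
class MillisDate:
    value: int  # epoch milliseconds

    def to_nanos(self) -> "NanosDate":
        # Overflow handling is omitted in this sketch.
        return NanosDate(self.value * NANOS_PER_MILLI)


@dataclass(frozen=True)
class NanosDate:
    value: int  # epoch nanoseconds

    def to_millis(self) -> MillisDate:
        # Truncates sub-millisecond precision.
        return MillisDate(self.value // NANOS_PER_MILLI)


def dates_equal(left, right) -> bool:
    """Compare two dates, refusing to guess when their resolutions differ."""
    if type(left) is not type(right):
        raise TypeError("mixed date resolutions: cast one side explicitly")
    return left.value == right.value


m = MillisDate(1_718_841_600_000)
n = NanosDate(1_718_841_600_000_000_001)

# The caller picks the semantics: coarse (millisecond) or exact (nanosecond).
coarse_match = dates_equal(m, n.to_millis())
exact_match = dates_equal(m.to_nanos(), n)
```

With these values `coarse_match` is true and `exact_match` is false. Calling `dates_equal(m, n)` directly raises `TypeError`. Adding a default cast later would only change that error path, so existing explicit casts would keep their meaning.
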
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@felixbarny
Member

To add my 2c, I think it is important to offer a path for changing the mapping of @timestamp in a data stream from date to date_nanos in a backwards-compatible way. We want to change the default for OTel data streams and possibly all LogsDB data streams. IIUC, missing support for autocasting is a blocker for that, as queries that span both old and new backing indices would fail.

Option 1 seems like the safest to me as there's no risk of overflows. The precision loss will also just be a temporary thing as old backing indices age out.

Having said that, the fact that some backing indices use date_nanos as a field type is probably a good indication that the data stream contains dates that are in range of what's supported by date_nanos. From that standpoint, option 2 may also be viable. But maybe that's not something that generalizes well across all use cases, for example, when querying with a wide index pattern or for non-observability type use cases.

@fang-xing-esql fang-xing-esql linked a pull request Feb 28, 2025 that will close this issue
12 tasks