Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(expr): refactor timezone-related expr impl #14313

Closed
wants to merge 11 commits into from

Conversation

KeXiangWang
Copy link
Contributor

@KeXiangWang KeXiangWang commented Jan 3, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR is tring to take place of the current rewrite mechanism, using captured context (introduced in #13702) to impl expr related to timezone.

The detailed backgroud and motivation can be found here.

This PR introduce a new session variable: enable_timezone_rewriting. If enabled, frontend will rewriting function all to provide timezone. Otherwise, captured context will be used to offer timezone.
This PR may block the const folding optimization of some functions in some secnario.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added test labels as necessary. See details.
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Release note

If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.

@KeXiangWang KeXiangWang marked this pull request as draft January 3, 2024 01:55
@KeXiangWang KeXiangWang force-pushed the wkx/refactor-time-zone branch from ecd2fe7 to 02c4915 Compare January 11, 2024 21:47
@KeXiangWang KeXiangWang force-pushed the wkx/refactor-time-zone branch from 542246f to 704b20b Compare January 16, 2024 04:33
@KeXiangWang KeXiangWang marked this pull request as ready for review January 16, 2024 05:28
@xiangjinwu
Copy link
Contributor

Looks like there were some misunderstandings of the previous rewriting based approach. Let me share some backgrounds of these expressions.

Some expressions (in AST) do not depend on the implicit session timezone. They take an explicit parameter of timezone (usually last one):

  • timestamp AT TIME ZONE varchar -> timestamptz
  • timestamptz AT TIME ZONE varchar -> timestamp
  • date_trunc(varchar, timestamptz, varchar) -> timestamptz
  • date_add(timestamptz, interval, varchar) -> timestamptz (new in PG 16, not exposed in RW yet)
  • date_subtract(timestamptz, interval, varchar) -> timestamptz (new in PG 16, not exposed in RW yet)
  • make_timestamptz(int, int, int, int, int, double, varchar) -> timestamptz

They served as the basis of rewriting. The impure variant (e.g. when last timezone parameter is optional and absent) is rewritten into the corresponding pure variant. However, they would still exist even if we used context based approach since day 0. This makes them different from the following group of imaginary implementations, which were introduced solely for the purpose of being rewriting targets:

  • cast_with_time_zone(timestamptz, varchar) -> varchar
  • cast_with_time_zone(varchar, varchar) -> timestamptz
  • extract(varchar, timestamptz, varchar) -> decimal
  • date_part(varchar, timestamptz, varchar) -> double
  • to_timestamptz(varchar, varchar, varchar) -> timestamptz
  • to_char(timestamptz, varchar, varchar) -> varchar

As for the expressions (in AST) that depend on implicit session timezone:

  • There should be no at_time_zone(timestamptz) -> timestamp or at_time_zone(timestamp) -> timestamptz. The timezone parameter is required for AT TIME ZONE.
  • There should be no cast_with_time_zone(timestamptz) -> varchar or cast_with_time_zone(varchar) -> timestamptz. The expression without rewriting is cast(timestamptz) -> varchar or cast(varchar) -> timestamptz.
  • Similarly, without rewriting, the following signatures shall have a context-based implementation on generic ExprType:
    • cast(date/timestamp) -> timestamptz or cast(timestamptz) -> date/time/timestamp
    • equal/not_equal/less_than/less_than_or_equal/greater_than/greater_than_or_equal/is_distinct_from/is_not_distinct_from between timestamptz and date/timestamp
    • interval + timestamptz -> timestamptz / timestamptz + interval -> timestamptz / timestamptz - interval -> timestamptz
  • add_with_time_zone and subtract_with_time_zone can be kept for PG 16 date_add and date_subtract with 3 parameters. But their 2-parameter variant can just be add / subtract. That is, avoid add_with_time_zone(timestamptz, interval) -> timestamptz in favor of add(timestamptz, interval) -> timestamptz.
  • To confirm, extract / date_part / to_timestamp / to_char shall have 2-parameter context-based implementation and 3-parameter backward-compatible implementation.
  • As a result, rw_enable_timezone_rewriting = false is supposed to make SessionTimezone as ExprRewriter completely no-op, rather than rewriting to different output (last parameter absent vs present) as in the PR right now. This is what it really means to disable rewriting completely.

However, I just come up with a risk of providing context-based implementation on existing common ExprType cast/equal/add/ etc. Our ImpureAnalyzer today only look at ExprType, while in PostgreSQL pg_proc.provolatile is defined per signature. date < date is immutable but date < timestamptz is stable (1 2). It is easy to fix (or no fix needed if stable is still considered "pure") ImpureAnalyzer itself, but having separate ExprTypes does it more foolproof. The separation is a side effect of SessionTimezone as ExprRewriter right now, and would go away if the suggestions above are followed. Comments welcomed.

Copy link
Contributor

This PR has been open for 60 days with no activity. Could you please update the status? Feel free to ping a reviewer if you are waiting for review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants