feat(frontend): cast/cmp with session timezone #7243
…into jon/implicit_cast_with_session_timezone
Not sure if this approach is the best. Preferably, the timezone can become a context that the expression framework can reference rather than embedding in an For instance, it's hard to support string to timestamptz casting fully, and also implicit casting does not seem to be caught by the binder.
All casts (explicit, assign, implicit) can be handled at a single place: As mentioned above, including timezone as context in the expression framework has the advantage of supporting more expressions. It also takes a large effort to refactor, and cannot provide the notice at binding phase. To get a notice when time zone is used implicitly, we still need to (1) update all indirect callers of Just sharing some thoughts. I do not have a preference over any one.
Seems the session information including session timezone will be lost during recovery test, as it seems the session is restarted from scratch. Any ideas @yezizp2012? Perhaps we should remove the
Yes, the frontend will be killed randomly during the recovery test, which causes the session to restart and resets all config.
Yes,
Codecov Report
@@ Coverage Diff @@
## main #7243 +/- ##
==========================================
- Coverage 73.05% 73.01% -0.04%
==========================================
Files 1068 1069 +1
Lines 170827 171099 +272
==========================================
+ Hits 124799 124935 +136
- Misses 46028 46164 +136
LGTM. Clever solution!
@@ -62,6 +65,39 @@ pub fn timestamp_at_time_zone(input: NaiveDateTimeWrapper, time_zone: &str) -> R
    Ok(usec)
}

pub fn timestamptz_to_string(elem: i64, time_zone: &str, writer: &mut dyn Write) -> Result<()> {
    let time_zone = lookup_time_zone(time_zone)?;
So the time_zone string will be looked up many times. Is there any way to optimize it? In most cases this will just be a literal. (We could even reject the non-literal cases, I think.)
The lookup is a hash table lookup, with hashing cost linear to string length. The bigger cost is actually our way of evaluating literal expr, which would create a column full of duplicated values.
The ideal case, of course, is to pass a single session timezone as task context from frontend to compute node, and do a single lookup. But I guess we can defer that refactor until there is a more useful case for expr eval context.
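The difference between a per-row lookup and a single lookup per batch can be sketched with the standard library only. Everything here is an assumption for illustration: `zone_table`, `shift_per_row` and `shift_with_literal` are hypothetical, and this `lookup_time_zone` is a stand-in over a small `HashMap`, not the function touched by this PR. The table maps names to fixed offsets only to keep the sketch short.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a time zone table: name -> UTC offset in seconds.
/// Real zones are not fixed offsets; this only models the cost of the lookup.
fn zone_table() -> HashMap<&'static str, i32> {
    HashMap::from([("UTC", 0), ("Asia/Tokyo", 9 * 3600)])
}

/// Hypothetical lookup: one hash table probe, hashing cost linear in `name.len()`.
fn lookup_time_zone(table: &HashMap<&'static str, i32>, name: &str) -> Result<i32, String> {
    table
        .get(name)
        .copied()
        .ok_or_else(|| format!("unknown time zone: {}", name))
}

/// Per-row evaluation: the zone arrives as a column, so every row pays a lookup,
/// even when the column is a literal repeated for each row.
fn shift_per_row(
    table: &HashMap<&'static str, i32>,
    rows: &[i64],
    zones: &[&str],
) -> Result<Vec<i64>, String> {
    rows.iter()
        .zip(zones)
        .map(|(&ts, &zone)| lookup_time_zone(table, zone).map(|off| ts + i64::from(off)))
        .collect()
}

/// Per-batch evaluation: the zone is known to be a single value, so it is
/// resolved once and the resolved offset is reused for all rows.
fn shift_with_literal(
    table: &HashMap<&'static str, i32>,
    rows: &[i64],
    zone: &str,
) -> Result<Vec<i64>, String> {
    let offset = i64::from(lookup_time_zone(table, zone)?);
    Ok(rows.iter().map(|&ts| ts + offset).collect())
}
```

Both functions return the same values for a literal zone; the second one simply hoists the lookup out of the loop, which is what a task-level session timezone context would allow.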
It does seem reasonable if the timezone string is replaced by an i32 fixed offset that is looked up by front-end... cloning i32 array is probably much cheaper than string array.
But it is not a fixed offset. The offset can vary at different dates.
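To make this concrete, here is a minimal sketch of how the offset of a single named zone depends on the instant. It hard-codes only the two 2022 daylight saving transitions of US/Pacific and is not how chrono_tz resolves offsets; the function name `us_pacific_offset_2022` is hypothetical.

```rust
/// UTC offset in seconds for US/Pacific at the given UTC instant (epoch seconds).
/// Assumption for illustration: only the 2022 transitions are modelled, so the
/// result is meaningful for instants in 2022 only.
fn us_pacific_offset_2022(utc_secs: i64) -> i32 {
    // 2022-03-13 10:00:00 UTC: clocks move from 02:00 PST to 03:00 PDT.
    const DST_START: i64 = 1_647_165_600;
    // 2022-11-06 09:00:00 UTC: clocks move from 02:00 PDT back to 01:00 PST.
    const DST_END: i64 = 1_667_725_200;

    if (DST_START..DST_END).contains(&utc_secs) {
        -7 * 3600 // PDT
    } else {
        -8 * 3600 // PST
    }
}
```

A single `i32` offset computed in the frontend would therefore be wrong for any column that spans a transition, which is why the zone itself, not its offset, has to reach the executor.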
Hmm, indeed. We could represent it by the chrono_tz enum value, and represent that in our type system as i32?
https://docs.rs/chrono-tz/latest/chrono_tz/static.TZ_VARIANTS.html#
It may be a bit brittle. But an i32 at the RHS of AT TIME ZONE and CAST_WITH_TIMEZONE will always be interpreted as a chrono_tz timezone.
I think the binder should not look it up. We can perform the lookup at the same place as looking up session_timezone. That way, it will still be human readable until it is turned into a proto plan.
Btw, we don't support strings such as UTC+8 via chrono_tz. Do you think we should support these fixed offsets as well? We can reserve some space in our i32 encoding for such strings, and look them up separately.
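One possible shape for such an encoding, as a sketch only: non-negative small values index a table of named zones, and a tag bit marks codes that carry a fixed offset in minutes. The type `ZoneRef`, the constant `FIXED_TAG` and the bit layout are all hypothetical; nothing here is taken from this PR or from chrono_tz.

```rust
/// Hypothetical reference to a time zone, before it is packed into an i32.
#[derive(Debug, PartialEq, Clone, Copy)]
enum ZoneRef {
    /// Index into a table of named zones (for example a position in a variant list).
    Named(u16),
    /// Fixed offset from UTC in minutes, e.g. 480 for eight hours east of UTC.
    FixedMinutes(i16),
}

/// Codes with this bit set carry a fixed offset instead of a table index.
const FIXED_TAG: i32 = 1 << 30;

fn encode(zone: ZoneRef) -> i32 {
    match zone {
        ZoneRef::Named(idx) => i32::from(idx),
        // Keep the offset's 16 bits (two's complement) in the low bits of the code.
        ZoneRef::FixedMinutes(minutes) => FIXED_TAG | i32::from(minutes as u16),
    }
}

/// Returns `None` for codes outside the two reserved ranges.
fn decode(code: i32) -> Option<ZoneRef> {
    if code < 0 {
        return None;
    }
    let max_low = i32::from(u16::MAX);
    if code & FIXED_TAG != 0 {
        let low = code & !FIXED_TAG;
        if low > max_low {
            return None;
        }
        // Reinterpret the low 16 bits as a signed minute count.
        Some(ZoneRef::FixedMinutes(low as u16 as i16))
    } else if code <= max_low {
        Some(ZoneRef::Named(code as u16))
    } else {
        None
    }
}
```

The brittleness mentioned above still applies: a named-zone index is only stable as long as the underlying table keeps its order, so the encoding would have to be tied to a table version.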
created issue: #7335
src/frontend/src/expr/mod.rs
Outdated
let input_type = input.return_type();
match (input_type.clone(), return_type.clone()) {
    (DataType::Timestamptz, DataType::Varchar)
    | (DataType::Varchar, DataType::Timestamptz) => {
Just to make it clear (rather than to suggest any actions), this may be overly noisy even when all input strings already contain a time zone (2022-01-01 12:00:00-08:00) and the session timezone is actually unused. This cannot be known in the frontend, only during execution.
As a special case, when the input is a literal string, we can do the parsing in frontend and avoid casting expr at all. This would improve the user experience. Opened #7320 to track it.
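For the literal case, the frontend check could look roughly like the sketch below: decide whether a timestamp literal already carries an explicit UTC offset, so that the session timezone (and the notice) is only involved when it does not. The function `has_explicit_offset` is hypothetical and deliberately simple; it recognises a trailing `Z` and numeric offsets such as `-08:00`, and does not handle named zones inside the string.

```rust
/// Returns true if a timestamp literal ends with an explicit UTC offset:
/// a trailing "Z"/"z", or "+HH", "-HH", "+HH:MM", "-HH:MM" after the time of day.
/// Assumption: named zones inside the literal are not recognised.
fn has_explicit_offset(literal: &str) -> bool {
    let s = literal.trim();
    if s.ends_with('Z') || s.ends_with('z') {
        return true;
    }
    // Only look after the first ':' so that the '-' separators of the date
    // part are never mistaken for an offset sign.
    let colon = match s.find(':') {
        Some(pos) => pos,
        None => return false, // no time of day, hence no offset
    };
    let tail = &s[colon..];
    match tail.rfind(|c: char| c == '+' || c == '-') {
        Some(pos) => {
            let digits = &tail[pos + 1..];
            !digits.is_empty() && digits.chars().all(|c| c.is_ascii_digit() || c == ':')
        }
        None => false,
    }
}
```

With a check like this, `'2022-01-01 12:00:00-08:00'` could be folded into a constant without consulting the session timezone, while `'2022-10-01 12:00:00'` would still need it.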
LGTM! Great!
statement ok
set timezone = "us/pacific";

# Cast date to timestamptz
query T
select '2022-01-01'::date::timestamp with time zone;
----
2022-01-01 08:00:00+00:00

# Cast timestamp to timestamptz
query T
select '2022-01-01 00:00:00'::timestamp::timestamp with time zone;
----
2022-01-01 08:00:00+00:00
Should we display 2022-01-01 00:00:00-08:00 or 2022-01-01 08:00:00+00:00? They are equal, though.
Yes, this will be in an upcoming PR (depending on this PR). It will format Pg rows with the session timezone.
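The two renderings discussed here are the same instant formatted with different offsets. A minimal std-only sketch, assuming the offset for the instant is already known (resolving it from a zone name is out of scope); `civil_from_days` and `format_timestamptz` are hypothetical helpers, not code from this PR:

```rust
/// Converts days since 1970-01-01 into (year, month, day) in the proleptic
/// Gregorian calendar, using the well-known era-based civil calendar algorithm.
fn civil_from_days(days: i64) -> (i64, i64, i64) {
    let z = days + 719_468; // shift the epoch to 0000-03-01
    let era = z.div_euclid(146_097); // 400-year cycles
    let doe = z.rem_euclid(146_097); // day of era, 0..=146_096
    let yoe = (doe - doe / 1_460 + doe / 36_524 - doe / 146_096) / 365; // year of era
    let doy = doe - (365 * yoe + yoe / 4 - yoe / 100); // day of year, March-based
    let mp = (5 * doy + 2) / 153; // month index, 0 = March
    let day = doy - (153 * mp + 2) / 5 + 1;
    let month = if mp < 10 { mp + 3 } else { mp - 9 };
    let year = yoe + era * 400 + if month <= 2 { 1 } else { 0 };
    (year, month, day)
}

/// Formats a UTC instant (epoch seconds) as local time at `offset_secs`,
/// followed by the offset itself, e.g. "2022-01-01 00:00:00-08:00".
fn format_timestamptz(utc_secs: i64, offset_secs: i32) -> String {
    let local = utc_secs + i64::from(offset_secs);
    let (year, month, day) = civil_from_days(local.div_euclid(86_400));
    let sod = local.rem_euclid(86_400); // second of day
    let sign = if offset_secs < 0 { '-' } else { '+' };
    let abs = offset_secs.abs();
    format!(
        "{:04}-{:02}-{:02} {:02}:{:02}:{:02}{}{:02}:{:02}",
        year,
        month,
        day,
        sod / 3600,
        sod % 3600 / 60,
        sod % 60,
        sign,
        abs / 3600,
        abs % 3600 / 60
    )
}
```

Calling `format_timestamptz(1_641_024_000, 0)` gives the UTC rendering from the test above, and the same instant with an offset of `-8 * 3600` gives the Pacific rendering.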
license-eye has totally checked 2644 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
1258 | 1 | 1385 | 0 |
Click to see the invalid file list
- src/frontend/src/expr/session_timezone.rs
rest lgtm
I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.
What's changed and what's your intention?
Support:
We have implemented this to always use UTC because it is the basis of many things. ('2022-10-01 12:00:00'::timestamptz)
Missing:
(interval '1' day + '2022-03-13 09:00:00Z'::timestamptz)
When doing so, we will emit the PgResponse NOTICE:
Checklist
./risedev check (or alias, ./risedev c)
Documentation
Refer to a related PR or issue link (optional)
#5826