Make from_unixtime aware of execution timezone #12892

Open
niebayes opened this issue Oct 12, 2024 · 4 comments
Labels
enhancement New feature or request

Comments

@niebayes
Contributor

Is your feature request related to a problem or challenge?

No response

Describe the solution you'd like

Apply the execution timezone's offset to the result after from_unixtime casts the Unix time to a timestamp.

Describe alternatives you've considered

No response

Additional context

DataFusion's from_unixtime is not timezone-aware. The following datafusion-cli session demonstrates the behavior.

> set datafusion.execution.time_zone = '+08:00';
0 row(s) fetched. 
Elapsed 0.000 seconds.

> select to_unixtime('2024-09-01 10:00:00+08:00');
+------------------------------------------------+
| to_unixtime(Utf8("2024-09-01 10:00:00+08:00")) |
+------------------------------------------------+
| 1725156000                                     |
+------------------------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

> select from_unixtime(1725156000);
+----------------------------------+
| from_unixtime(Int64(1725156000)) |
+----------------------------------+
| 2024-09-01T02:00:00              |
+----------------------------------+
1 row(s) fetched. 
Elapsed 0.001 seconds.

Specifically, when converting a timestamp string to Unix time with the to_unixtime function, we can include a timezone offset in the input and the result reflects it. However, when converting the Unix time back to a timestamp, DataFusion is not aware of the timezone. The result is correct, but the timezone information is discarded and only a timestamp in UTC+0 is returned.
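The round trip in the session above can be reproduced outside DataFusion. The following is a minimal sketch in Python's standard library, used here only to check the arithmetic; it is not DataFusion code, and the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Parse the same input string as the to_unixtime call; the +08:00 offset is honored.
local = datetime.fromisoformat("2024-09-01 10:00:00+08:00")
unix_seconds = int(local.timestamp())

# Converting back in UTC gives the wall-clock time that from_unixtime printed.
as_utc = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)

# Applying the +08:00 offset to the same instant recovers the original wall-clock time.
as_plus8 = as_utc.astimezone(timezone(timedelta(hours=8)))

print(unix_seconds)          # 1725156000
print(as_utc.isoformat())    # 2024-09-01T02:00:00+00:00
print(as_plus8.isoformat())  # 2024-09-01T10:00:00+08:00
```

Both timestamps denote the same instant; only the displayed offset differs, which is the information from_unixtime currently drops.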

I wonder if we can make the from_unixtime function timezone-aware. For example, when the execution timezone is set through the datafusion.execution.time_zone configuration, from_unixtime could apply that offset to the cast result so that the returned timestamp carries the timezone information.

@niebayes added the enhancement label Oct 12, 2024
@Omega359
Contributor

Omega359 commented Oct 12, 2024

UDFs currently do not have access to the DataFusion context, which would make implementing this through configuration a bit difficult. What could easily be supported is an additional 'tz' argument, which would be used to generate the timestamp in a timezone-aware fashion.
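To make the optional-argument idea concrete, here is a rough sketch of the intended semantics in Python. It is an assumption about how such an overload might behave, not DataFusion's implementation: the helper parse_fixed_offset is hypothetical, and only fixed offsets such as '+08:00' are handled.

```python
import re
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_fixed_offset(tz: str) -> timezone:
    """Hypothetical helper: parse a fixed offset such as '+08:00' or '-05:30'."""
    match = re.fullmatch(r"([+-])(\d{2}):(\d{2})", tz)
    if match is None:
        raise ValueError(f"unsupported timezone: {tz!r}")
    sign = 1 if match.group(1) == "+" else -1
    delta = timedelta(hours=int(match.group(2)), minutes=int(match.group(3)))
    return timezone(sign * delta)


def from_unixtime(seconds: int, tz: Optional[str] = None) -> datetime:
    """Sketch of the proposed overload: the tz argument is optional."""
    utc = datetime.fromtimestamp(seconds, tz=timezone.utc)
    if tz is None:
        # Without tz, mimic the existing behavior: a UTC timestamp with no zone attached.
        return utc.replace(tzinfo=None)
    # With tz, return the same instant expressed in the requested offset.
    return utc.astimezone(parse_fixed_offset(tz))


print(from_unixtime(1725156000).isoformat())            # 2024-09-01T02:00:00
print(from_unixtime(1725156000, "+08:00").isoformat())  # 2024-09-01T10:00:00+08:00
```

Keeping the single-argument form unchanged means existing queries would return the same result, while callers who want a zoned timestamp opt in explicitly.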

@buraksenn
Contributor

@Omega359 would that be desirable? I can start working on overloading the from_unixtime function with an additional time_zone parameter, as you've mentioned.

@Omega359
Contributor

Sounds good. I would still like to think about what it would take to give UDFs access to the config, though.

@alamb
Contributor

alamb commented Nov 21, 2024

This may now be feasible after
