[ESQL] Date Trunc function should support date nanos #110008

not-napoleon · 2024-06-20T18:57:39Z

Description

I've labeled this team discuss because it's unclear to me what this support should look like. Do we need to support buckets smaller than one millisecond? Does that have any resource constraints on the number of buckets, or do we need to build some kind of spill-to-disk thing before we do this?

elasticsearchmachine · 2024-06-20T18:58:02Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

While working on #110008 I discovered that the Date Trunc tests were only running in folding mode, because the interval types are marked as not representable. The correct way to test this is to set the forceLiteral flag for those fields, which will (as the name suggests) force them to be literals even in non-folding tests. Doing that turned up errors in the evaluatorToString tests, which I fixed.

There are two big changes here. First, the second parameter to the evaluator is a Rounding instance, not the actual interval. Since Rounding includes some information about the specific rounding in the toString results, I am just using a starts with matcher to validate the majority of the string, rather than trying to reconstruct the expected rounding string. Second, passing in a literal null for the interval parameter folds the whole expression to null, and thus a completely different toString. I added a clause in AnyNullIsNull to account for this.

While I was in there, I moved some specific test cases to a different file. I know moving code is something we're trying to minimize right now, but this seemed worth it. The tests in question do not depend on the parameters of the test case, but all methods in the class get run for every set of parameters. This was causing these tests to be run many times with the same values, which bloats our test run time and test count. Moving them to a distinct class means they'll only be executed once per test run. I feel like this benefit outweighs the cost of git history complexity.
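For readers unfamiliar with the "starts with" approach mentioned above, here is a rough, standalone sketch of the idea in plain Java. The class name, method name, and the example strings are all hypothetical placeholders and are not the real evaluator output or the real test helpers; the actual tests presumably use the project's own matcher utilities.

```java
public class EvaluatorToStringCheck {

    /**
     * Returns true when the evaluator's toString begins with the part we can
     * predict. The trailing part (the Rounding description) is deliberately
     * not compared, because it depends on the specific rounding chosen.
     */
    static boolean matchesStablePrefix(String actualToString, String stablePrefix) {
        return actualToString != null && actualToString.startsWith(stablePrefix);
    }
}
```

With a made-up string such as `Evaluator[field=Attribute[channel=0], rounding=Rounding[...]]`, checking against the prefix `Evaluator[field=Attribute[channel=0], rounding=Rounding[` passes regardless of what the rounding details are, which is the trade-off described: most of the string is validated without reconstructing the rounding text.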

Resolves #110008

As discussed elsewhere, this does NOT allow for truncating to a value smaller than a millisecond. Our timespan literal syntax doesn't allow specifying less than a millisecond, and the rounding infrastructure also does not support it.

We also had a discussion regarding the return type, and decided that it made sense to keep the type as date_nanos, even though the truncation will always produce a millisecond-rounded (or higher) value.

---------

Co-authored-by: Elastic Machine <[email protected]>
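To make the behaviour described above concrete, here is a minimal sketch of truncating a nanosecond-resolution timestamp to a fixed interval of at least one millisecond while keeping the result in nanoseconds. This is an illustration only: the class and method names are hypothetical, it handles only fixed-length UTC intervals, and it does not reflect how the actual Rounding-based implementation handles time zones or calendar intervals.

```java
import java.time.Duration;

public class DateNanosTruncSketch {

    private static final long NANOS_PER_MILLI = 1_000_000L;

    /**
     * Truncates a nanoseconds-since-epoch value down to the nearest multiple
     * of the interval. The interval must be a whole number of milliseconds
     * and at least one millisecond, mirroring the restriction described above.
     * The result is still expressed in nanoseconds (the date_nanos resolution).
     */
    static long truncate(long epochNanos, Duration interval) {
        long intervalNanos = interval.toNanos();
        if (intervalNanos < NANOS_PER_MILLI || intervalNanos % NANOS_PER_MILLI != 0) {
            throw new IllegalArgumentException(
                "interval must be a whole number of milliseconds, got " + interval);
        }
        // floorDiv keeps the result at or below the input, also for negative values.
        return Math.floorDiv(epochNanos, intervalNanos) * intervalNanos;
    }
}
```

For example, under these assumptions `truncate(1_700_000_000_123_456_789L, Duration.ofSeconds(1))` gives `1_700_000_000_000_000_000L`: the sub-second digits are dropped, but the value is still a nanosecond timestamp, which is why keeping date_nanos as the return type is reasonable.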

not-napoleon added >enhancement team-discuss :Analytics/ES|QL AKA ESQL labels Jun 20, 2024

not-napoleon mentioned this issue Jun 20, 2024

[ES|QL] Support date_nanos field type #109352

Open

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jun 20, 2024

not-napoleon self-assigned this Oct 29, 2024

not-napoleon mentioned this issue Nov 1, 2024

[ESQL] clean up date trunc tests #116111

Merged

not-napoleon mentioned this issue Nov 6, 2024

[ESQL] Add support for date trunc on date nanos type #116354

Merged

not-napoleon closed this as completed in #116354 Nov 7, 2024

not-napoleon mentioned this issue Nov 7, 2024

[8.x] [ESQL] Add support for date trunc on date nanos type #116354 #116422

Merged

not-napoleon mentioned this issue Dec 4, 2024

[ESQL] Support date nanos on Bucket function #118031

Closed
