Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(type): fix parsing array literal and printing struct value #13229

Merged
merged 22 commits into from
Nov 15, 2023

Conversation

wangrunji0408
Copy link
Contributor

@wangrunji0408 wangrunji0408 commented Nov 3, 2023

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

Fix parsing array literal

fix #4769

  • We don't handle double-quotes and null values correctly
  select '{null, "null"}' :: varchar[];
- {"null","\"null\""}
+ {NULL,"null"}

This PR rewrites the parser using recursive descent approach.

Fix row to text

  • Null values should appear empty
  • Double quotes are needed if
    • it is an empty string
    • it has leading or tailing whitespaces
    • it contains any of the characters(),"\
      • " is escaped to ""
      • \ is escaped to \\
  select Row(''), Row('"'), Row(' a '), Row('()'), Row(','), Row(NULL), Row(NULL, NULL);
-        ()       (")       ( a )       (())       (,)       (NULL)     (NULL,NULL)
+        ("")     ("""")    (" a ")     ("()")     (",")     ()         (,)

Move parsing logic to common

This PR also moves the parsing logic to a method ListValue::from_str in the common crate. The cast function no longer needs to build sub-expression on every call. As a result, the runtime of cast(varchar) -> int8[] function decreased from 952.10 µs to 185.21 µs (-80%).

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • I have added fuzzing tests or opened an issue to track them. (Optional, recommended for new SQL features Sqlsmith: Sql feature generation #7934).
  • My PR contains breaking changes. (If it deprecates some features, please create a tracking issue to remove them in the future).
  • All checks passed in ./risedev check (or alias, ./risedev c)
  • My PR changes performance-critical code. (Please run macro/micro-benchmarks and show the results.)
  • My PR contains critical fixes that are necessary to be merged into the latest release. (Please check out the details)

Documentation

  • My PR needs documentation updates. (Please use the Release note section below to summarize the impact on users)

Copy link
Contributor

@xiangjinwu xiangjinwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://www.postgresql.org/docs/16/arrays.html#ARRAYS-IO
https://www.postgresql.org/docs/16/rowtypes.html#ROWTYPES-IO-SYNTAX
#4769

There are many edge cases. It is okay if we cannot resolve them all at once.

The wrong output may have been used in a lot of existing e2e slt tests 😢

elems.push(match trimmed[start..i].trim() {
"" => return Err("Unexpected \"}\" character.".into()),
s if s.eq_ignore_ascii_case("null") => None,
s => Some(ScalarImpl::from_literal(&unquote(s), elem_type)?),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ScalarImpl::from_literal and ScalarImpl::from_text shall be consolidate to a FromText (opposite of ToText). For now I am okay with using either one.

Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
Signed-off-by: Runji Wang <[email protected]>
@wangrunji0408 wangrunji0408 force-pushed the wrj/fix-parsing-array-literal branch 4 times, most recently from 96ac71e to 48d7649 Compare November 7, 2023 09:36
Signed-off-by: Runji Wang <[email protected]>
@wangrunji0408 wangrunji0408 force-pushed the wrj/fix-parsing-array-literal branch from 48d7649 to 47c8a4b Compare November 7, 2023 09:44
Signed-off-by: Runji Wang <[email protected]>
@wangrunji0408 wangrunji0408 force-pushed the wrj/fix-parsing-array-literal branch from 1e8e994 to 87e68e2 Compare November 7, 2023 14:52
Copy link

codecov bot commented Nov 7, 2023

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

Comparison is base (094a071) 67.62% compared to head (e78129b) 67.65%.
Report is 10 commits behind head on main.

Files Patch % Lines
src/common/src/array/struct_array.rs 84.09% 7 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #13229      +/-   ##
==========================================
+ Coverage   67.62%   67.65%   +0.03%     
==========================================
  Files        1527     1527              
  Lines      260464   260714     +250     
==========================================
+ Hits       176132   176381     +249     
- Misses      84332    84333       +1     
Flag Coverage Δ
rust 67.65% <97.81%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Contributor

@xiangjinwu xiangjinwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very close ... Thank you for the effort!

Copy link
Contributor

@xiangjinwu xiangjinwu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not construct any further problematic cases, but left two suggestions that hopefully improves the readability.

Btw there may be new tests added in main that uses the old output format, and we would need to fix them after merge-main.

@wangrunji0408 wangrunji0408 force-pushed the wrj/fix-parsing-array-literal branch from 0431ff2 to e78129b Compare November 15, 2023 08:07
@wangrunji0408 wangrunji0408 added this pull request to the merge queue Nov 15, 2023
@wangrunji0408
Copy link
Contributor Author

@xiangjinwu Thanks for your patient review. 🥹

Merged via the queue into main with commit 2628460 Nov 15, 2023
@wangrunji0408 wangrunji0408 deleted the wrj/fix-parsing-array-literal branch November 15, 2023 09:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/fix Bug fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Different display format for array/struct with pg
2 participants