Fix regex parser for parsing functions having SQL body with language sql (PG 15 feature) #2201
Changes from 4 commits
860e99a
f4e04a9
80de2d6
ff7fb07
6df033f
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -134,6 +134,7 @@ var ( | |
parserIssueDetector = queryissue.NewParserIssueDetector() | ||
multiRegex = regexp.MustCompile(`([a-zA-Z0-9_\.]+[,|;])`) | ||
dollarQuoteRegex = regexp.MustCompile(`(\$.*\$)`) | ||
sqlBodyBeginRegex = re("BEGIN", "ATOMIC") | ||
//TODO: optional but replace every possible space or new line char with [\s\n]+ in all regexs | ||
viewWithCheckRegex = re("VIEW", capture(ident), anything, "WITH", opt(commonClause), "CHECK", "OPTION") | ||
rangeRegex = re("PRECEDING", "and", anything, ":float") | ||
|
@@ -912,7 +913,6 @@ sqlParsingLoop: | |
|
||
stmt += currLine + " " | ||
formattedStmt += currLine + "\n" | ||
|
||
// Assuming that both the dollar quote strings will not be in same line | ||
switch dollarQuoteFlag { | ||
case CODE_BLOCK_NOT_STARTED: | ||
|
@@ -921,9 +921,14 @@ sqlParsingLoop: | |
} else if matches := dollarQuoteRegex.FindStringSubmatch(currLine); matches != nil { | ||
dollarQuoteFlag = 1 //denotes start of the code/body part | ||
codeBlockDelimiter = matches[0] | ||
} else if matches := sqlBodyBeginRegex.FindStringSubmatch(currLine); matches != nil { | ||
dollarQuoteFlag = 1 //denotes start of the sql body part https://www.postgresql.org/docs/15/sql-createfunction.html#:~:text=a%20new%20session.-,sql_body,-The%20body%20of | ||
codeBlockDelimiter = "END" //SQL body to determine the end of BEGIN ATOMIC ... END; sql body | ||
} | ||
case CODE_BLOCK_STARTED: | ||
if strings.Contains(currLine, codeBlockDelimiter) { | ||
if strings.Contains(currLine, codeBlockDelimiter) || | ||
strings.Contains(currLine, strings.ToLower(codeBlockDelimiter)) { | ||
My concern here is that we don't know the variety of cases possible with plpgsql, since the body can contain almost anything. This change makes the condition match either the delimiter as written or its lowercase variant. Should we handle the SQL-body function case separately (that is, with a different if condition for it) and leave the existing conditions untouched?
the same or lowercase variant is only of the
But okay, I get your point about the unknowns with a lowercase variant in the dollarQuoteRegex case. I can do that in a separate if condition for just the END case.
done
||
//TODO: anyways we should be using pg-parser: but for now for the END sql body delimiter checking the UPPER and LOWER both | ||
dollarQuoteFlag = 2 //denotes end of code/body part | ||
if isEndOfSqlStmt(currLine) { | ||
break sqlParsingLoop | ||
|
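The review thread in this hunk settles on checking the `END` delimiter in its own condition and leaving dollar-quote matching untouched. A minimal sketch of that separation follows. The names `blockKind`, `dollarQuoted`, `sqlBody`, and `closesBlock` are hypothetical and do not come from the PR. Unlike the diff, which checks only the upper- and lowercase forms, this sketch upper-cases the line, so it also accepts mixed case such as `End`.

```go
package main

import (
	"fmt"
	"strings"
)

// blockKind is a hypothetical marker for how the function body was opened.
type blockKind int

const (
	dollarQuoted blockKind = iota // body opened with $$ or $tag$
	sqlBody                       // body opened with BEGIN ATOMIC
)

// closesBlock reports whether line contains the delimiter that ends the body.
// Dollar-quote tags are compared exactly because PostgreSQL treats them as
// case-sensitive; the END keyword is compared case-insensitively.
// Like the diff, this is a substring check, so a word such as "pending"
// would also match; a real parser would be stricter.
func closesBlock(line, delimiter string, kind blockKind) bool {
	if kind == sqlBody {
		return strings.Contains(strings.ToUpper(line), "END")
	}
	return strings.Contains(line, delimiter)
}

func main() {
	fmt.Println(closesBlock("end;", "END", sqlBody))            // SQL body, lowercase keyword
	fmt.Println(closesBlock("$body$;", "$BODY$", dollarQuoted)) // tag differs in case
	fmt.Println(closesBlock("$BODY$;", "$BODY$", dollarQuoted)) // exact tag
}
```

Keeping the two cases in separate branches means the existing plpgsql handling cannot change behaviour as a side effect of the SQL-body fix.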
@@ -972,6 +977,9 @@ func isEndOfSqlStmt(line string) bool { | |
line = line[0:cmtStartIdx] // ignore comment | ||
line = strings.TrimRight(line, " ") | ||
} | ||
if len(line) == 0 { | ||
return false | ||
} | ||
return line[len(line)-1] == ';' | ||
} | ||
|
||
|
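The last hunk adds a length guard before indexing the final byte of the line. The sketch below shows why the guard matters: once the comment has been cut off, a comment-only line becomes an empty string, and `line[len(line)-1]` would panic. `endsStatement` is a simplified, hypothetical stand-in for `isEndOfSqlStmt`; the assumption that a comment starts at the first `--` is mine, since the full function is not shown in the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// endsStatement is a simplified stand-in for isEndOfSqlStmt.
// Assumption: a trailing comment starts at the first "--" on the line.
func endsStatement(line string) bool {
	if idx := strings.Index(line, "--"); idx != -1 {
		line = line[:idx] // ignore comment
	}
	line = strings.TrimRight(line, " ")
	if len(line) == 0 {
		// Without this guard, line[len(line)-1] would index out of range.
		return false
	}
	return line[len(line)-1] == ';'
}

func main() {
	fmt.Println(endsStatement("END;"))              // true
	fmt.Println(endsStatement("SELECT 1; -- done")) // true
	fmt.Println(endsStatement("-- only a comment")) // false
	fmt.Println(endsStatement(""))                  // false
}
```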
Should we put a `^` in the regex to ensure that `BEGIN ATOMIC` is at the start of the line? Do we need to ensure that?
I don't think we need to ensure that, since it could appear anywhere, not necessarily at the start of the line, e.g.
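As a hypothetical illustration of the anchoring question (this is not the example from the thread), the sketch below compares an unanchored pattern with a `^`-anchored one. Both patterns are assumed approximations written with the standard `regexp` package; what the repository's `re("BEGIN", "ATOMIC")` helper actually compiles to is not shown in the diff.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Assumed approximations of the pattern; not the PR's actual regex.
	unanchored = regexp.MustCompile(`(?i)BEGIN\s+ATOMIC`)
	anchored   = regexp.MustCompile(`(?i)^\s*BEGIN\s+ATOMIC`)
)

func main() {
	// BEGIN ATOMIC may follow other clauses on the same line.
	sameLine := "CREATE FUNCTION add(a int, b int) RETURNS int LANGUAGE sql BEGIN ATOMIC"
	// Or it may sit on a line of its own.
	ownLine := "    BEGIN ATOMIC"

	fmt.Println(unanchored.MatchString(sameLine), anchored.MatchString(sameLine)) // true false
	fmt.Println(unanchored.MatchString(ownLine), anchored.MatchString(ownLine))   // true true
}
```

With the anchor, the line-by-line parser would miss a body that opens on the same line as the preceding clauses, which is the case the reply describes.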