Fix for string values #58
Changes from 9 commits
d0d3412
2a54d28
460b748
e53022a
984fe66
7b85e7d
e241322
64990f8
43f7b28
4a1635c
39f92b5
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -542,11 +542,11 @@ func (t *tokenizer) readString() (string, error) { | |
if err != nil { | ||
return "", err | ||
} | ||
|
||
switch c { | ||
case -1, '\n': | ||
if isProhibitedControlChar(c) || c == '\n' { | ||
return "", t.invalidChar(c) | ||
} | ||
|
||
switch c { | ||
case '"': | ||
return ret.String(), nil | ||
|
||
|
@@ -582,20 +582,24 @@ func (t *tokenizer) readLongString() (string, error) { | |
if err != nil { | ||
return "", err | ||
} | ||
|
||
switch c { | ||
case -1: | ||
if isProhibitedControlChar(c) { | ||
return "", t.invalidChar(c) | ||
} | ||
|
||
switch c { | ||
case '\'': | ||
startPosition := t.pos | ||
ok, err := t.skipEndOfLongString(t.skipCommentsHandler) | ||
if err != nil { | ||
return "", err | ||
} | ||
if ok { | ||
return ret.String(), nil | ||
} | ||
|
||
if startPosition == t.pos { | ||
It looks like
    wasConsumed, err := t.skipEndOfLongString(t.skipCommentsHandler)
    if err != nil {
        return "", err
    }
    if wasConsumed {
        return ret.String(), nil
    } else {
        // The ' was not part of a long string ending.
        ret.WriteByte(byte(c))
    }

It'll also return false if it skipped over a

Yes, I almost refactored #61 to change |
||
// No character has been consumed. It is single '. | ||
ret.WriteByte(byte(c)) | ||
} | ||
case '\\': | ||
c, err = t.peek() | ||
if err != nil { | ||
|
@@ -1263,3 +1267,25 @@ func (t *tokenizer) unread(c int) { | |
t.pos-- | ||
t.buffer = append(t.buffer, c) | ||
} | ||
|
||
func isProhibitedControlChar(c int) bool { | ||
// Values lower than this are non-displayable ASCII characters; except for new line and white space characters. | ||
if c > 0x1F { | ||
return false | ||
} | ||
if isStringWhitespace(c) || isNewLineChar(c) { | ||
return false | ||
} | ||
return true | ||
} | ||
|
||
func isStringWhitespace(c int) bool { | ||
return c == 0x09 || //horizontal tab | ||
c == 0x0B || //vertical tab | ||
c == 0x0C // form feed | ||
} | ||
|
||
func isNewLineChar(c int) bool { | ||
return c == 0x0A || //new line | ||
c == 0x0D //carriage return | ||
} |
I think we should explicitly handle the EOF/-1 case here and below. isProhibitedControlChar will catch it, but the method name makes me think it's only looking for ASCII control characters, which doesn't include -1.

I see your point, but that means going back to this comment.
We can either change the function name to something more generic like invalidStringCharacter or restore the logic from before this commit.

There's a subtle difference here. We've gone from:
to
I think isProhibitedControlChar is a more precise/communicative name, so I'd like to keep it. But that means we need the explicit EOF check:
I think this is especially helpful since someone new to the codebase could reasonably expect EOF to be handled by the
above, since err is how the standard library reports EOF.

Cool. Updated the pull request. Thanks.