-
-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sgf-parsing: Add individual tests of escaping/whitespace behaviour #1889
Conversation
diff --git a/exercises/sgf-parsing/verify.rb b/exercises/sgf-parsing/verify.rb
index 45d41aca..f85c5044 100644
--- a/exercises/sgf-parsing/verify.rb
+++ b/exercises/sgf-parsing/verify.rb
@@ -97,12 +97,8 @@ class SGF
if c == ?\\ && !escape
escape = true
else
- val << ?\\ if escape && c == ?n
- if escape && c == ?t
- val << ' '
- else
- val << c
- end
+ to_insert = c == "\n" && escape ? '' : c == "\t" ? ' ' : c
+ val << to_insert
escape = false
end
pos += 1
@@ -114,6 +110,8 @@ end
json = JSON.parse(File.read(File.join(__dir__, 'canonical-data.json')))
+json['cases'].reject! { |v| v['uuid'] == '11c36323-93fc-495d-bb23-c88ee5844b8c' }
+
verify(json['cases'], property: 'parse') { |i, c|
pos, node = SGF.new(i['encoded']).node(0, paren_required: true)
raise 'Incomplete parse' unless pos == i['encoded'].size
|
"expected": { | ||
"properties": { | ||
"A": ["x[y]z"], | ||
"B": ["foo"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you may ask. why the foo
and bar
? Just to detect the case where a parser silently breaks in a way that finishes the current property value but halts all further parsing. Having additional material to parse after that, and verifying that that was parsed, avoids that case.
"encoded": "(;A[\\]b\nc\\\nd\t\te\\\\ \\\n\\]])" | ||
}, | ||
"expected": { | ||
"properties": { | ||
"A": ["]b\ncd e\\ ]"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be very clear: In this case, and in all other cases in this PR as well, it's important to keep in mind the difference between the string as it appears in JSON and the string as it is presented to the SGF parser.
The string "(;A[\\]b\nc\\\nd\t\te\\\\ \\\n\\]])"
in JSON means that this string is presented to the SGF parser:
(;A[\]b
c\
d e\\ \
\]])
Everything between the [
and the ]
is the property value, and it contains these characters:
\]
, escaped closing bracket: insert closing bracket in property valueb
: insert as-is.- unescaped newline: insert as-is.
c
: insert as-is.- escaped newline: insert nothing.
d
: insert as-is- two tabs: insert two spaces
e
: insert as-is\\
, escaped backslash: insert backslash- space: insert as-is.
- escaped newline: insert nothing
\]
, escaped closing bracket: insert closing bracket
The property value is thus:
]b
cd e\ ]
which is written in JSON as it is on line 347.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it perhaps be worth it to add this comment to the test case? It looks to be very useful in getting people to understand what they're working with.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a bad idea. Added.
224e01c
to
1a8d89a
Compare
previous case "escaped property value" crams too much into one test case. Split it up into multiple ones. The reimplemented case is mostly as it was when the exercise was originally implemented in exercism/exercism@7a5075b Note that the original case also got it wrong in that newlines should remain newlines; this is corrected in the new case. exercism/problem-specifications#1889
previous case "escaped property value" crams too much into one test case. Split it up into multiple ones. The reimplemented case is mostly as it was when the exercise was originally implemented in exercism/exercism@7a5075b Note that the original case also got it wrong in that newlines should remain newlines; this is corrected in the new case. exercism/problem-specifications#1889
"encoded": "(;A[\\]b\nc\\\nd\t\te\\\\ \\\n\\]])" | ||
}, | ||
"expected": { | ||
"properties": { | ||
"A": ["]b\ncd e\\ ]"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it perhaps be worth it to add this comment to the test case? It looks to be very useful in getting people to understand what they're working with.
ff82413
to
03d447f
Compare
"1. the string as it is represented in a string literal", | ||
"2. the string as it will be presented to the SGF parser", | ||
"In particular, the SGF parser will see a property (between the square brackets) with these characters:", | ||
"Escaped closing bracket (a single backslash followed by a closing bracket): Insert closing bracket in property value", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is verbose, but I am thinking it may be a good idea to avoid backslashes in the comments field if avoiding backslashes will help prevent further confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Absolutely.
CC @exercism/reviewers |
3c6c50b
to
427d136
Compare
Please understand that it was necessary for me to rebase this because #1917 reformatted this exercise. My intent is that the only thing changed is whitespace. Believing this to be true (observe how |
395ad7a
to
a8f7c0e
Compare
previous case "escaped property value" crams too much into one test case. Split it up into multiple ones. The reimplemented case is mostly as it was when the exercise was originally implemented in exercism/exercism@7a5075b Note that the original case also got it wrong in that newlines should remain newlines; this is corrected in the new case. exercism/problem-specifications#1889
previous case "escaped property value" crams too much into one test case. Split it up into multiple ones. The reimplemented case is mostly as it was when the exercise was originally implemented in exercism/exercism@7a5075b Note that the original case also got it wrong in that newlines should remain newlines; this is corrected in the new case. exercism/problem-specifications#1889
previous case "escaped property value" crams too much into one test case. Split it up into multiple ones. The reimplemented case is mostly as it was when the exercise was originally implemented in exercism/exercism@7a5075b Note that the original case also got it wrong in that newlines should remain newlines; this is corrected in the new case. exercism/problem-specifications#1889 exercism#1025
previous case "escaped property value" crams too much into one test case. Split it up into multiple ones. The reimplemented case is mostly as it was when the exercise was originally implemented in exercism/exercism@7a5075b Note that the original case also got it wrong in that newlines should remain newlines; this is corrected in the new case. exercism/problem-specifications#1889 #1025
11c36323-93fc-495d-bb23-c88ee5844b8c crams too much into one test case. Split it up into multiple ones. In addition, 11c36323-93fc-495d-bb23-c88ee5844b8c has behaviour that violates the specification. The violation is that `\t` and `\n` (written as `"\\t"` and `"\\n"` in the JSON string, respectively), do not hold any sort of special significance in SGF, according to the specification: https://www.red-bean.com/sgf/sgf4.html Reimplement in 08e4b8ba-bb07-4431-a3d9-b1f4cdea6dab. The reimplemented case is mostly as it was when the exercise was originally implemented in exercism/exercism@7a5075b and the reimplemented case is in accordance with the specification. Note that the original case also got it wrong in that newlines should remain newlines; this is corrected in 08e4b8ba-bb07-4431-a3d9-b1f4cdea6dab.
11c36323-93fc-495d-bb23-c88ee5844b8c crams too much into one test case.
Split it up into multiple ones.
In addition, 11c36323-93fc-495d-bb23-c88ee5844b8c has behaviour that
violates the specification. The violation is that
\t
and\n
(writtenas
"\\t"
and"\\n"
in the JSON string, respectively), do not holdany sort of special significance in SGF, according to the specification:
https://www.red-bean.com/sgf/sgf4.html
Reimplement in 08e4b8ba-bb07-4431-a3d9-b1f4cdea6dab.
The reimplemented case is mostly as it was when the exercise was
originally implemented in
exercism/exercism@7a5075b
and the reimplemented case is in accordance with the specification.
Note that the original case also got it wrong in that newlines should
remain newlines; this is corrected in
08e4b8ba-bb07-4431-a3d9-b1f4cdea6dab.