sgf-parsing: Add individual tests of escaping/whitespace behaviour #1889

Merged
merged 1 commit into from
Oct 6, 2022

petertseng
Member

11c36323-93fc-495d-bb23-c88ee5844b8c crams too much into one test case.
Split it up into multiple ones.

In addition, 11c36323-93fc-495d-bb23-c88ee5844b8c has behaviour that
violates the specification: it treats \t and \n (written as "\\t" and
"\\n" in the JSON string, respectively) as special, but a backslash
followed by the letter t or n does not hold any sort of special
significance in SGF, according to the specification:
https://www.red-bean.com/sgf/sgf4.html

Reimplement in 08e4b8ba-bb07-4431-a3d9-b1f4cdea6dab.

The reimplemented case is mostly as it was when the exercise was
originally implemented in
exercism/exercism@7a5075b
and the reimplemented case is in accordance with the specification.

Note that the original case also got it wrong in that newlines should
remain newlines; this is corrected in
08e4b8ba-bb07-4431-a3d9-b1f4cdea6dab.

@petertseng
Member Author

Diff to https://github.com/petertseng/exercism-problem-specifications/blob/verify/exercises/sgf-parsing/verify.rb:

diff --git a/exercises/sgf-parsing/verify.rb b/exercises/sgf-parsing/verify.rb
index 45d41aca..f85c5044 100644
--- a/exercises/sgf-parsing/verify.rb
+++ b/exercises/sgf-parsing/verify.rb
@@ -97,12 +97,8 @@ class SGF
       if c == ?\\ && !escape
         escape = true
       else
-        val << ?\\ if escape && c == ?n
-        if escape && c == ?t
-          val << ' '
-        else
-          val << c
-        end
+        to_insert = c == "\n" && escape ? '' : c == "\t" ? ' ' : c
+        val << to_insert
         escape = false
       end
       pos += 1
@@ -114,6 +110,8 @@ end
 
 json = JSON.parse(File.read(File.join(__dir__, 'canonical-data.json')))
 
+json['cases'].reject! { |v| v['uuid'] == '11c36323-93fc-495d-bb23-c88ee5844b8c' }
+
 verify(json['cases'], property: 'parse') { |i, c|
   pos, node = SGF.new(i['encoded']).node(0, paren_required: true)
   raise 'Incomplete parse' unless pos == i['encoded'].size

"expected": {
"properties": {
"A": ["x[y]z"],
"B": ["foo"]
Member Author

You may ask: why the foo and bar? They are there to detect a parser that silently breaks in a way that finishes the current property value but halts all further parsing. Having additional material to parse after the tricky value, and verifying that it was parsed, catches that failure.

Comment on lines 329 to 365
"encoded": "(;A[\\]b\nc\\\nd\t\te\\\\ \\\n\\]])"
},
"expected": {
"properties": {
"A": ["]b\ncd  e\\ ]"]
Member Author

@petertseng petertseng Nov 28, 2021

To be very clear: In this case, and in all other cases in this PR as well, it's important to keep in mind the difference between the string as it appears in JSON and the string as it is presented to the SGF parser.

The string "(;A[\\]b\nc\\\nd\t\te\\\\ \\\n\\]])" in JSON means that this string is presented to the SGF parser:

(;A[\]b
c\
d		e\\ \
\]])
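
One way to check the string shown above is to decode the JSON literal and look at what comes out. This is a minimal Ruby sketch, not part of the PR; the variable names `doc` and `sgf_input` are illustrative assumptions, and the JSON object is just the `encoded` field from the test case wrapped on its own.

```ruby
require 'json'

# The "encoded" field exactly as it appears in canonical-data.json.
# The raw heredoc (<<~'JSON') stops Ruby from interpreting the backslashes,
# so the JSON parser is the only thing that decodes them.
doc = <<~'JSON'
  {"encoded": "(;A[\\]b\nc\\\nd\t\te\\\\ \\\n\\]])"}
JSON

# This is the string that is actually presented to the SGF parser.
sgf_input = JSON.parse(doc)['encoded']

puts sgf_input            # prints the decoded text, newlines and tabs included
puts sgf_input.inspect    # prints it re-escaped as a Ruby string literal
```

After JSON decoding, every `\\` in the file has become a single backslash, and `\n` and `\t` have become real newline and tab characters.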

Everything between the [ and the ] is the property value, and it contains these characters:

  • \], escaped closing bracket: insert closing bracket in property value
  • b: insert as-is.
  • unescaped newline: insert as-is.
  • c: insert as-is.
  • escaped newline: insert nothing.
  • d: insert as-is
  • two tabs: insert two spaces
  • e: insert as-is
  • \\, escaped backslash: insert backslash
  • space: insert as-is.
  • escaped newline: insert nothing
  • \], escaped closing bracket: insert closing bracket

The property value is thus:

]b
cd  e\ ]

which is written in JSON as it is on line 347.
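
The character-by-character rules listed above can be sketched as a small Ruby function. This is only an illustration modelled on the loop in the verify.rb diff, not the code from the PR; the name `unescape_sgf_value` is hypothetical, and it assumes the caller has already extracted the raw text between the square brackets.

```ruby
# Hypothetical helper: turns the raw text between [ and ] of one SGF
# property into the property value, following the rules in the list above.
def unescape_sgf_value(raw)
  out = +''
  escaped = false
  raw.each_char do |c|
    if c == '\\' && !escaped
      escaped = true # remember the backslash; emit nothing yet
      next
    end
    piece =
      if escaped && c == "\n"
        ''           # escaped newline: insert nothing
      elsif c == "\t"
        ' '          # tab: insert a space
      else
        c            # anything else, escaped or not: insert as-is
      end
    out << piece
    escaped = false
  end
  out
end

# The property value from the test case, written as a Ruby string literal.
raw = "\\]b\nc\\\nd\t\te\\\\ \\\n\\]"
p unescape_sgf_value(raw) # => "]b\ncd  e\\ ]"
```

Under these rules a backslash followed by the letter t or n simply yields that letter, which is the behaviour the reimplemented case expects.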

Member

Would it perhaps be worth it to add this comment to the test case? It looks to be very useful in getting people to understand what they're working with.

Member Author

Not a bad idea. Added.

@petertseng petertseng force-pushed the sgf branch 2 times, most recently from 224e01c to 1a8d89a Compare November 28, 2021 16:04
petertseng added a commit to petertseng/exercism-haskell that referenced this pull request Nov 28, 2021
previous case "escaped property value" crams too much into one test
case.  Split it up into multiple ones.

The reimplemented case is mostly as it was when the exercise was
originally implemented in
exercism/exercism@7a5075b

Note that the original case also got it wrong in that newlines should
remain newlines; this is corrected in the new case.

exercism/problem-specifications#1889
@petertseng petertseng force-pushed the sgf branch 2 times, most recently from ff82413 to 03d447f Compare December 9, 2021 09:29
"1. the string as it is represented in a string literal",
"2. the string as it will be presented to the SGF parser",
"In particular, the SGF parser will see a property (between the square brackets) with these characters:",
"Escaped closing bracket (a single backslash followed by a closing bracket): Insert closing bracket in property value",
Member Author

@petertseng petertseng Dec 9, 2021

This is verbose, but I think it may be a good idea to avoid backslashes in the comments field if doing so helps prevent further confusion.

Member

Absolutely.

@ErikSchierboom
Member

CC @exercism/reviewers

@petertseng petertseng force-pushed the sgf branch 2 times, most recently from 3c6c50b to 427d136 Compare February 1, 2022 18:05
@petertseng
Member Author

petertseng commented Feb 1, 2022

Please understand that it was necessary for me to rebase this because #1917 reformatted this exercise. My intent is that the only thing changed is whitespace. Believing this to be true (observe how `git diff -w 03d447f 427d136 exercises/sgf-parsing` is empty), I believe this PR remains reviewable.

@petertseng petertseng force-pushed the sgf branch 2 times, most recently from 395ad7a to a8f7c0e Compare April 19, 2022 22:25
petertseng added a commit to petertseng/exercism-haskell that referenced this pull request Jul 21, 2022
previous case "escaped property value" crams too much into one test
case.  Split it up into multiple ones.

The reimplemented case is mostly as it was when the exercise was
originally implemented in
exercism/exercism@7a5075b

Note that the original case also got it wrong in that newlines should
remain newlines; this is corrected in the new case.

exercism/problem-specifications#1889
petertseng added a commit to petertseng/exercism-haskell that referenced this pull request Jul 22, 2022
previous case "escaped property value" crams too much into one test
case.  Split it up into multiple ones.

The reimplemented case is mostly as it was when the exercise was
originally implemented in
exercism/exercism@7a5075b

Note that the original case also got it wrong in that newlines should
remain newlines; this is corrected in the new case.

exercism/problem-specifications#1889

exercism#1025
petertseng added a commit to exercism/haskell that referenced this pull request Jul 22, 2022
previous case "escaped property value" crams too much into one test
case.  Split it up into multiple ones.

The reimplemented case is mostly as it was when the exercise was
originally implemented in
exercism/exercism@7a5075b

Note that the original case also got it wrong in that newlines should
remain newlines; this is corrected in the new case.

exercism/problem-specifications#1889

#1025
@kytrinyx kytrinyx merged commit 3c33c24 into exercism:main Oct 6, 2022
@petertseng petertseng deleted the sgf branch October 6, 2022 18:56