-
-
Notifications
You must be signed in to change notification settings - Fork 550
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
series: Improve documentation #1020
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think moving to use a string representation is actually a very sensible thing to do. Overall a great effort, with some small nits. Thanks for working on this!
exercises/series/canonical-data.json
Outdated
"description": "slices of one", | ||
"property": "slices", | ||
"series": "01234", | ||
"slice_length": 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are in the process of enforcing that all keys should be lowerCamelCase
(see #987). Could you rename this property to sliceLength
(and also in the other cases)?
exercises/series/canonical-data.json
Outdated
{ | ||
"description": "slices of one", | ||
"property": "slices", | ||
"series": "01234", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are in the processing of ensuring that all input properties are gathered in a single input
key (see #996). Could you put the series
and slice_length
properties in an input
field (and also in the other cases)?
exercises/series/canonical-data.json
Outdated
"expected": ["123", "231", "312", "123"] | ||
}, | ||
{ | ||
"description": "overly long slice", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to "too long slice"?
exercises/series/canonical-data.json
Outdated
} | ||
}, | ||
{ | ||
"description": "overly short slice", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe rename to "slice length cannot be zero" or something like that?
8510ed3
to
97f42dc
Compare
Thanks for the feedback @ErikSchierboom! I've made the suggested changes. I've also changed the indentation level to 2-spaces instead of 4 for consistency with the rest of the repo. |
Looking much better already! I think the next step is to think about the various test cases. Ideally, each subsequent test case should introduce one new concept, that the previous tests didn't cover. And also in ascending difficulty. As an example, consider the following two test cases: {
"description": "slices of three",
"property": "slices",
"input": {
"series": "01234",
"sliceLength": 3
},
"expected": ["012", "123", "234"]
},
...
{
"description": "order preserved with descending numbers",
"property": "slices",
"input": {
"series": "43210",
"sliceLength": 3
},
"expected": ["432", "321", "210"]
}, Does the second test case actually introduce a new concept that the first test doesn't cover? I don't think it does in this case (the preservation of the ordering is also tested in the first test case). So perhaps we can try to come up with a set of test cases that gradually introduce more complex concepts not covered previously? |
Looks good. Should there be a test for a series made up of a single repeating digit? |
One way to come up with test cases is to look at the description. E.g. the description notes: "the digits need not be numerically consecutive." So maybe one test could be with consecutive numbers, whereas a later test could test with non-consecutive numbers. |
Hello. It was me who kicked off this review of this exercise. Why ? Because I looked at the description and saw that it said "substrings" and was miffed because it did not occur to me that whoever wrote the test cases for Python must have read a version that read "lists of integers". I read the description and it said nothing about handling duplicates (not tested), nothing about the order in which results were to be returned and nothing about returning a list. Ergo: the test cases were expecting a set containing results in no particular order with no duplicates. Perhaps the choice of return type is, or should be, track specific. I would suggest that all 'generated' READMEs add a track specific note of "what to return unless the generic read me specifies otherwise". |
It is but on its own does not prove the solution preserves ordering in the general case ... a pass for one test may simply be a coincidence. I suspect somewhere in the TDD bible it is written that each test case should test only one requirement. Amen. That is not the same as each requirement should have only one test case. What one test case does is test your assumptions about the solution's implementation. I would suggest that a good test suite makes no such assumptions. Even two test cases don't prove ordering, they simply reduce the chance of a buggy solution passing all tests. Consider again the 'set'. Contents may be ordered (I don't assume so). An ordered set converted back to a list might pass one of the two tests case but not both. It is possible that converting a list (without duplicates) to an unordered set and back again preserves order (because nothing happened to alter the initial order) . This would pass all the original tests. For an unordered set. In some languages, a duplicate is discarded, in others it replaces the original. That, I expect, would make a difference to the order of results. The general principle for all systems (road networks for example) including software systems and also test suites for software is that optimisation tends to eliminate redundancy. The corollary of this is that is makes the system 'brittle'. Road networks experience gridlock, test suites pass buggy code. Redundancy is necessary for robust and, ultimately, secure systems. |
I suspect this exercise was imported from somewhere and something in the description was lost in the process. After all, there must be a reason the input consists of only digits. |
Sorry for the delay in getting back to this - I should have some time to look at this again at the weekend. |
This PR is changing two files. I think the |
@N-Parsons Small bump. Are you still working on this? |
Sorry for the delay, @ErikSchierboom - I've been quite ill recently. I'll take another look at this in the next few days. |
@N-Parsons Sorry to hear that! I hope you have recovered fully. Good luck continuing the work on this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall I think this is starting to look very, very nice! The one remaining thing I feel is selecting the correct set of test cases. Ideally, the test cases have increasing difficulty, which I think you handle quite well. Another thing we aim for in the canonical data is to have each test case introduce a new concept/difficulty. I think that the current canonical data doesn't completely adhere to this. I'm hoping for some feedback of @petertseng or @rpottsoh, which often have great insight in these matters.
exercises/series/canonical-data.json
Outdated
} | ||
}, | ||
{ | ||
"description": "empty string is invalid even if slice 0", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically a combination of the "empty string is invalid" and "slice length cannot be zero" test cases, so I think this can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically a combination of the "empty string is invalid" and "slice length cannot be zero" test cases, so I think this can be removed.
Done.
exercises/series/canonical-data.json
Outdated
"sliceLength": 1 | ||
}, | ||
"expected": { | ||
"error": "string cannot be empty" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe instead of "string cannot be empty" we could use "series cannot be empty'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe instead of "string cannot be empty" we could use "series cannot be empty'?
Done.
exercises/series/canonical-data.json
Outdated
"expected": ["123", "231", "312", "123"] | ||
}, | ||
{ | ||
"description": "strings can have repeating digits", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this test case add anything to the "slices can include duplicates" test case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These two cases are essentially the same in my opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this test case add anything to the "slices can include duplicates" test case?
It seemed to be duplicating efforts. I combined "slices can include duplicates"
and "strings can have repeating digits"
.
In the original README the input is described as a string of digits and the output appears to be in integer form. 0540d50 saw fit to redefine the output as strings. What was the reasoning behind this? I see from the initial comment that some tracks seem to have implemented the exercise as described and other tracks didn't. I understand the comment regarding the word I don't like the word I am not saying that the change to the README should be reversed. I have not completed the exercise myself nor have I implemented it. I think requiring integer output over strings is more difficult (interesting maybe), but only in a trivial way. If we are talking about only strings, why limit the input to only digits? Since the exercise is only interested in digits, why does the input have to be a string? Enough rambling and babbling from me for now. I'll try to take some time over the weekend and read through the JSON. Thanks for the vote of confidence @ErikSchierboom 👍 |
"sliceLength": 2 | ||
}, | ||
"expected": ["35"] | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are dealing with characters and not numbers / digits how is "slices of two" any different than "slices of two with nonsequential numbers"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this test case adds anything over the previous one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ErikSchierboom I agree. Would a slice of 3 from 3 be worth adding to replace one of these two, or am I just beating a dead horse? I am fine with just removing one of the two instances and not adding in a replacement case.
This captures my thoughts nicely. The current exercise seems to be leaning on two ideas:
Although I think the second case is slightly harder (and thus more interesting), I would actually be fine with just going with the first option: capturing a slice of characters from a string. In that case, the test cases can probably be significantly reduced and should also include not just digits but also alphabetic characters (and even other characters like spaces, punctuation, etc.). |
I wouldn't object to straying from https://github.com/exercism/problem-specifications/blob/master/exercises/series/metadata.yml because we shouldn't force people on how to solve largest-series-product anyway, so that's fine. On the other hand, that also means I'm not at all interested in this problem standalone.
Agreed. If the description.md changes are NOT moved to a separate PR, I only am comfortable with the following arrangement of commits:
The reason for this:
I would object to anything else, including without limitation:
Feel free to delay rearranging commits until the very end, of course. |
The only choices to really make are:
substitute I don't particularly care which choice is made in any of the above cases |
@N-Parsons What do you think about the above comments made by @petertseng? |
I have decided to justify each alternative I present. The only choices to really make are:
substitute I don't particularly care which justifiable alternative is chosen in any of the above decisions. |
To me, all three scenarios presented above should result in an error. |
I disagree: exceptions should be exceptional, whether they're implemented
as throws or Result monads or whatever. If there exists a justifiable
non-error alternative, it is to be preferred over an error. GIGO applies as
always, but it's the caller's responsibility to watch for that.
In the Rust track, this exercise is implemented such that the function
cannot return an error: unsigned inputs mean that there can be no negative
slice length, and in the other cases the slice-based return value is used.
I believe that this is fundamentally more elegant than designing it to
return an error. If we specify here that the function should return errors
in these scenarios, that means that the Rust track's implementation of this
exercise is not in compliance, and should be changed to a fundamentally
worse design in order to conform with the canonical data.
I believe that the case testing negative slice length should return an
error, as there is no other justifiable return for that case, and languages
which support unsigned ints can simply discard the test case. The other two
cases should return the slices as described by @petertseng.
…On Mon, Mar 5, 2018 at 8:10 AM, Erik Schierboom ***@***.***> wrote:
To me, all three scenarios presented above should result in an error.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#1020 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AHdeTgDdqDDDziq34f00KLoYnySuceVoks5tbOTOgaJpZM4Ql6hv>
.
|
Can we perhaps come to a decision here? I'm fine with any option. |
As am I. @N-Parsons what do you think? Have you been able to follow all of the discussion that has been happening since the beginning of February? |
@rpottsoh @petertseng @coriolinus There has not been a response for 24 days. I prefer not to have long standing PR's open. Any ideas on how to proceed? |
My stance remains:
|
@coriolinus Okay, clear. After considering those changes, I'm fine with them. However, there being no response for a while, begs the question if we want to wait for someone to work on this PR, or should someone create a new PR? |
I'm fine leaving this open and waiting for someone with interest to come in and implement the requested changes. |
I am not able to offer my time to make any changes personally. In general, from my point of view, I have no qualms about leaving any PR open indefinitely as long as:
If some means of clearly indicating that an interested party may work on this PR would make this more attractive to those interested parties, I support applying those means. For example, that might be to either: Label, comment, or close this PR. I don't feel a need to express a preference between these options at this time. |
c4b4d02
to
41c9c6e
Compare
I had an hour this evening so I have donated it here. I am not sure that how I have committed my changes is kosher. If there is another / better way I should have gone about contributing to this PR / issue please let me know. |
- 4914 | ||
- 9142 | ||
- "4914" | ||
- "9142" | ||
|
||
And if you ask for a 6-digit series from a 5-digit string, you deserve | ||
whatever you get. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I interpret this last statement to mean that each track can decide how to deal with potentially erroneous input. I know that has been debated to an extent in this PR. If the consent were to treat erroneous input a certain way then this statement should probably be reworded to reflect that decision.
At present there are three different error messages related to |
Personally, I prefer specific feedback as it makes it easier to immediately see what the current issue is. If I am playing a game, and I violate a rule, it is not useful for another player who has observed the violation to simply hand me the rule book and tell me "you can't do that" (ignoring the fact this is in itself is a rule in some games). It is much better to cite the violated rule and reference the rules (or, to bring the example back home, the README) for further clarification. |
Thanks for the feedback @cmccandless. |
"sliceLength": 2 | ||
}, | ||
"expected": ["35"] | ||
}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this test case adds anything over the previous one.
I think this is looking great, with just one single nit. Thanks for doing this @rpottsoh! |
5cbb311
to
048c7a7
Compare
048c7a7
to
01a7bb3
Compare
@petertseng Could you perhaps also review this PR, as you have also given it some thought previously. |
Please do not wait for my review, other than I can vouch for the fact that the following code verifies the proposed JSON file: require 'json'
require_relative '../../verify'
json = JSON.parse(File.read(File.join(__dir__, 'canonical-data.json')))
verify(json['cases'], property: 'slices') { |i, _|
i['series'].each_char.each_cons(i['sliceLength']).map(&:join).tap { |x|
raise 'Nope, empty is bad' if x.empty?
}
} I have no additional commentary to add at this time. |
Then I find this ready to be merged. Thanks a lot @N-Parsons for starting with this and @rpottsoh for finishing it! |
Resolves #584 by implementing canonical data for
series
.I've noticed that lots of implementations expect the output to be a list of lists of integers rather than a list of strings:
However, I would argue that a list of strings is more reasonable, particularly given the current description, which mentions substrings. Some tracks have implemented it this way:
As a result, this implementation of the canonical data expects the output to be a list of strings.