Discussion: Testing 'invalid' input. #902
Thanks for opening a dedicated discussion. :-) I understand your point:
I really understand this. Sometimes I want to do a puzzle for the puzzle's sake and don't care about edge cases that might never occur. Now, let me lay some groundwork first. Here are some quotes from the exercism homepage:
First: "Level up your programming skills" Second: "solve practice problems" Third: "[get] feedback" Fourth: "For code newbies and experienced programmers" And one last thing:
OK, now that we are on the same page, let me provide my point of view: Exercism uses TDD in a broad sense. It is not TDD by the book, because we provide the tests, one at a time, and the user completes the cycle by making them green and refactoring. That's fine. TDD is generally considered a good practice nowadays and is gaining more and more traction. However, I'd wager that most seasoned programmers, and nearly all new programmers, are still inexperienced when it comes to writing particularly good tests, especially so when it comes to writing tests first. For example, it is an ongoing and everlasting topic at almost every developer conference. There are reasons for this: on the one hand, the benefits to a project, its architecture, the craft and the business are well established. This means:
Now, we provide programming exercises for new and experienced programmers. This means, at a minimum, providing interesting challenges that let users improve their problem-solving skills. As I already said, TDD, or writing tests in general, is a notoriously underutilized method for new and experienced programmers alike. So providing good examples and well-defined test specifications is the least we can do to teach programmers of all levels about it. So now the core question: Well, just look at some exercise submissions at random and you will see something interesting:
But I said "usually". From experience in production, I know that the implementation of such checks greatly depends on the dev's mood and the depth of the current implementation level. The deeper a method is buried, the higher the confidence grows that a null check is probably not necessary. Conversely, the more complex a project is, the more the fear of hitting a null pointer exception grows, because "who knows if this method already checks for it, so we'd better check it here as well...". This does not happen with well-documented pieces of software that state exactly what they expect, what they can handle and what they return. Everyone who has ever read such a Javadoc, for example, is grateful to have this information. So I urge you to make these constraints explicit. I have two suggestions:
To be clear:
And if so, just let them throw an error for simplicity's sake. E.g. for the nucleotide-count exercise: DNA strands and RNA strands differ in one nucleotide. We want to check DNA strands only, so an RNA strand should throw an error, as should all other inputs that are not exactly A, a, C, c, G, g or T, t. You don't need tests or checks for special Unicode characters in particular, because these are usually invalid anyway, but it does not hurt to have them either. Such test cases are usually so immensely fast that it does not matter whether you execute them or not. However, string handling, proper escaping of forms, date and time handling etc. should be part of a separate exercise.
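To make the suggestion concrete, an error-throwing check for nucleotide-count might look like this. This is a hypothetical Python sketch; the function name and the choice of `ValueError` are mine, not part of any track's actual stub:

```python
# Sketch: count nucleotides in a DNA strand, rejecting anything that
# is not A/C/G/T (case-insensitive). RNA's U, punctuation, Unicode
# oddities etc. all fall through to the same error path.
def count_nucleotides(strand):
    counts = {"A": 0, "C": 0, "G": 0, "T": 0}
    for ch in strand.upper():
        if ch not in counts:
            raise ValueError("invalid nucleotide: " + repr(ch))
        counts[ch] += 1
    return counts
```

Note that one generic rejection branch covers both the RNA case and the "wacky input" cases, which is why a single error test per kind of input is usually enough.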
Unfortunately the only guidance I found on this was in https://github.com/exercism/docs/blob/master/you-can-help/improve-exercise-metadata.md#extracting-canonical-test-data
So, if we need some better documentation on what is or is not essential, that would be appropriate. I was once told not to add a case in #287 (comment). I know that if I am told a function has a documented precondition as its contract, I feel no obligation to test what happens if the precondition is violated. If our problem descriptions need more documented preconditions, that would be helpful in any case, since #869 alleges (and I agree) that the number of descriptions that are lacking is non-zero.
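As an illustration of that kind of contract, a documented precondition might read like this (a hypothetical Python sketch; the function name and docstring wording are mine, not from any track):

```python
def count_nucleotide(strand, nucleotide):
    """Count occurrences of `nucleotide` in `strand`.

    Precondition: `strand` consists only of the characters
    'A', 'C', 'G', 'T', and `nucleotide` is one of those four.
    Behaviour for any other input is unspecified; callers are
    responsible for validating first.
    """
    return strand.count(nucleotide)
```

With the precondition stated in the contract, tests for violating input are explicitly out of scope.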
Interesting.
and the read from @derifatives is also a nice one. To summarize, we have two, maybe three, problems:
First things first: I'm tired as hell and will only read the full thread after I've had some hours of sleep, but I want to give a quick shot right now: In general I do consider error handling a good thing which should be tested, but in the given example the purpose of the exercise is counting occurrences of a letter in a string, where the string is only allowed to contain four different characters. So either we restrict ourselves to valid input only, or we check only for strings which do not contain those simple characters.
In some languages this is not possible at all, and in my opinion we should assume some safety net in the early languages. Also, those languages that constantly deal with null errors are free to add such checks as necessary.
We agreed to restrict ourselves to 7-bit US-ASCII a while ago, because all the exercises that dealt with characters outside this range were the subject of questions and problems due to difficulties getting the encoding right in the editor/language/library. We considered 7-bit ASCII a safe subset of the most commonly used encodings. We also decided that we could create exercises that deal with Unicode and/or i18n and/or l10n later on as the need arises.
In my opinion, a single test for an invalid DNA strand and one for an invalid nucleotide would be sufficient.
That's the question. :-)
I am not aware of how many languages lack a null concept or something similar, but I think only a few do. Following this logic we would also have to remove all the "expected: error" specs, because not all languages have a concept of throwing exceptions. Generalization is good for canonical data, but hard to comply with fully. If a canonical test is not applicable to a very few tracks, those tracks can simply omit it, can't they?
Correct, that's out of the question.
Unless otherwise specified, I read ASCII as 7-bit, since that is what it originally was. The 8-bit version is called "extended" for a reason.
Haskell, Rust, OCaml, Erlang, Elixir, maybe others. Even those languages that do have a null value treat it differently in their idioms… In Go it seems as if … In other languages I've seen it as "please insert your default value here" when calling a function, or as the presence of a computation error/argument error when returned from a function. In other languages we do not even have strict typing to enforce an input string; shall we therefore create a canonical test that throws when given an integer where a string is expected? In some languages we could even decide to use the type system of the language to keep out invalid input and create a datatype … Therefore, as I said, the canonical data should only contain a small, limited set of test data which deals with correct input and expectations, while handling erroneous input should be the responsibility of the track. Another reason why it should be the track's responsibility is the differing idioms and possibilities of error signaling. In Go we have multi-return and return errors as a value when they occur, or …
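The point about differing error idioms can be illustrated even inside a single language. The following hypothetical Python sketch shows the same invalid input signalled two ways: by raising (as exception-based tracks would) and by returning a Go-style `(value, error)` pair. Names and error strings are illustrative only:

```python
from typing import Optional, Tuple

VALID = set("ACGT")

def count_raising(strand: str) -> dict:
    # Idiom 1: signal invalid input by raising an exception.
    if not set(strand) <= VALID:
        raise ValueError("invalid strand")
    return {n: strand.count(n) for n in "ACGT"}

def count_returning(strand: str) -> Tuple[Optional[dict], Optional[str]]:
    # Idiom 2: signal invalid input as a returned error value,
    # in the style of Go's multi-return.
    if not set(strand) <= VALID:
        return None, "invalid strand"
    return {n: strand.count(n) for n in "ACGT"}, None
```

A canonical "expects an error" case can only say *that* an error occurs; *how* it is surfaced is exactly the part each track must decide for itself.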
So you are saying we should indeed test for invalid inputs, correct?
@NobbZ wrote:
Well said, I agree with this. @Vankog wrote:
(incorrect) I did not interpret what Nobbz said this way.
It's not about delegating: the tracks choose to use the canonical data to aid the writing of their own test suites. They are also free not to use the canonical data at all, and to add and/or change tests that are present in the canonical data.
I think you misunderstood me ^^
means in my eyes exactly what I inferred here (see annotations):
The word "delegate" was not particularly well chosen.
That's what I thought, too. However, I was asked to change the canonical tests first when I proposed to change the track tests. Only language-specific tests should be the exception.
I have no expertise to discuss the programming part of the problem, but I do want to point out something not quite as obvious and almost as unimportant to the problem as it can be... DNA is double-stranded. If you are trying to count the number of nucleotides in DNA, you must consider that each nucleotide exists only in a pair. Each time you have T, you also have A on the opposing strand.
@Dysp thanks for pointing this out.
I think the problem is OK as is. But that doesn't mean anything; I might just be feeling lazy. 😃 If you have not already, take a look at its description.md and see what you think. If you think the wording could be improved, we would all be very grateful for any changes you might suggest. A PR would be the best way to go about this. Let us know if you think there could be improvements made to the wording and if you need any help with the PR process.
I created a PR: #913
* Add error object
* Remove error tests per #902
-- nucleotide-count description
This issue is a discussion about what kind of invalid test input it is appropriate to have as part of the canonical-data for an exercise.
Part of the PR "nucleotide-count: refactoring tests (discussion)" (#895) proposed adding tests for various "error" input cases: null input, strings that contain non-ASCII characters, and other tests that involve strings containing various other non-DNA characters. In the discussion that arose around that, @Vankog wrote: #895 (comment)
I agree that writing good tests is hard. One of the traps people often fall into is testing too much; the skill is in knowing where to draw the line between what is important to test and what is not.
Part of the challenge of programming is designing clean and self contained abstractions so that you have to worry about as few things as possible at any one time.
The exercises on Exercism help by taking away a lot of the ambiguity and uncertainty by providing small, self contained and well defined exercises, for which the 'interesting' part of the problem is implementing the algorithm to solve a specific problem, rather than working out what the problem is and splitting it up.
In this case, where the problem is "counting nucleotides in a DNA string" it is OK to assume that all the input to the function will be valid DNA strings, and testing against things that are NOT valid DNA strings is inappropriate.
Think of it as if there were another step prior to every problem that ensures that the input is valid. In this case:
We should be able to trust that the DNA Parser is doing its job and turning whatever it gets as input into a valid DNA string before passing it to nucleotide-count.
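That two-step pipeline could be pictured like this. A hypothetical Python sketch: the parser stands in for the imagined upstream step and is deliberately not part of the exercise itself:

```python
def parse_dna(raw: str) -> str:
    """Imagined upstream step: normalise raw input and reject
    anything that is not a valid DNA strand."""
    strand = raw.strip().upper()
    if not set(strand) <= set("ACGT"):
        raise ValueError("not a DNA strand")
    return strand

def count_nucleotides(strand: str) -> dict:
    """The exercise itself: trusts that `strand` is already a
    valid DNA string, so it needs no error handling or error tests."""
    return {n: strand.count(n) for n in "ACGT"}
```

Because the counter only ever receives the parser's output, validation tests belong with the parser, not with the counting exercise.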
Input validation and String parsing are legitimate problems on their own, but they should be split out rather than mixed in to every problem.
I encourage the creation of a 'string-cleaning' exercise that takes all kinds of wacky input values and ensures that the result is a clean string.
But these tests do not belong in nucleotide-count, or any of the other exercises that happen to take strings as input. See also #428, where we discussed whether it was appropriate for every test that handled strings to also have to deal with strings that contained non-ASCII characters.