JSON Schema proposal: All inputs should appear under the `input` key #996

petertseng · 2017-11-08T04:26:06Z

In our schema, we know that description, expected, and property are fixed, but the inputs to each test case depend on the problem!
That means we can't know ahead of time what key(s) constitute the inputs; we have to examine the JSON in order to do so.

According to those test generators listed at exercism/discussions#155 the current solutions to this problem take on one of two forms:

- Look for any key that isn't description, expected, property, or comments. That key (or those keys) must be the input, by process of elimination.
- Per-exercise configuration or code describing what keys hold the input for each exercise.
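
For illustration, the process-of-elimination approach might look roughly like the sketch below. It is not taken from any track's generator: `RESERVED_KEYS` and `guess_inputs` are hypothetical names, and the sample case assumes the pre-proposal layout of change, with coins and target as top-level keys.

```ruby
require 'json'

# Keys with a fixed meaning in the schema; anything else is assumed to be
# an input. (Hypothetical names, for illustration only.)
RESERVED_KEYS = %w[description expected property comments].freeze

# Returns the key/value pairs that are left once the known keys are removed.
def guess_inputs(test_case)
  test_case.reject { |key, _value| RESERVED_KEYS.include?(key) }
end

old_style_case = JSON.parse(<<~JSON)
  {
    "description": "single coin change",
    "property": "findFewestCoins",
    "coins": [1, 5, 10, 25, 100],
    "target": 25,
    "expected": [25]
  }
JSON

p guess_inputs(old_style_case)
```

The weakness is visible in the constant: any new non-input key added to the schema later would silently be treated as an input until every generator's list is updated.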

Well, what if we could do better?

Let's look at https://github.com/exercism/problem-specifications/blob/master/exercises/change/canonical-data.json where each case has two inputs: coins and target. Let's say that instead a case looks like:

{
  "description": "single coin change",
  "property": "findFewestCoins",
  "input": {
    "coins": [1, 5, 10, 25, 100],
    "target": 25
  },
  "expected": [25]
}

As you can see, the key input contains an object whose keys and values represent the inputs.
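
With that layout, a generator no longer needs to guess. A minimal sketch, assuming the case has already been parsed into a Hash (the variable names are illustrative only):

```ruby
require 'json'

new_style_case = JSON.parse(<<~JSON)
  {
    "description": "single coin change",
    "property": "findFewestCoins",
    "input": {
      "coins": [1, 5, 10, 25, 100],
      "target": 25
    },
    "expected": [25]
  }
JSON

# fetch raises KeyError if a case has no "input" key, which surfaces
# unconverted files early instead of producing an empty argument list.
inputs = new_style_case.fetch('input')

inputs.each { |name, value| puts "#{name} = #{value.inspect}" }
```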

And what about https://github.com/exercism/problem-specifications/blob/master/exercises/leap/canonical-data.json where each case has only one input? One might consider two possible ways to do this:

The first way is actually how the JSON is already: The key input contains a scalar, the one and only input.

{
  "description": "year not divisible by 4: common year",
  "property": "leapYear",
  "input": 2015,
  "expected": false
}

The second way we might consider is that it will also be an object, for consistency. So then:

{
  "description": "year not divisible by 4: common year",
  "property": "leapYear",
  "input": {
    "year": 2015
  },
  "expected": false
}

One could consider making that a one-liner ("input": {"year": 2015}) if vertical space savings are desired.
This has the advantage of consistency, of course: input always contains an object, but the file also becomes more verbose.
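
To make the trade-off concrete, here is a sketch of what a consumer would need if both forms were allowed. `normalize_input` and its `default_name` parameter are hypothetical, not part of any existing tool.

```ruby
# Hypothetical helper: wrap a scalar input in a Hash so that callers can
# always iterate over name/value pairs.
def normalize_input(input, default_name: 'input')
  input.is_a?(Hash) ? input : { default_name => input }
end

p normalize_input(2015)               # scalar form: no meaningful name available
p normalize_input({ 'year' => 2015 }) # object form: passed through unchanged
```

Two costs of the scalar form show up here: the generator has to invent a name for the argument, and a single input whose value is itself an object cannot be told apart from the object form. Always using an object avoids both.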

Were one of the two forms of this proposal to be implemented, would this benefit you and/or your track? Would it instead cause harm?

Alternative proposals to solve the problem of "we don't know the names of the keys that hold the input(s) ahead of time"?


petertseng · 2017-11-08T04:29:41Z

I thought about trying to do this, but my current reaction to "would this benefit you and/or your track" is "meh" since I don't maintain any track that uses generators, and the humans reading the JSON files seem to have no trouble understanding which keys represent the inputs. But hey, it's here if people want it, and maybe it'll be good to have.

Insti · 2017-11-08T08:05:04Z

Having input as a key with an object containing the well-named arguments is my preference.

ErikSchierboom · 2017-11-08T08:55:00Z

> Were one of the two forms of this proposal to be implemented, would this benefit you and/or your track?

Absolutely! It would make processing the JSON simpler and less error-prone.

> Having input as a key with an object containing the well-named arguments is my preference.

Same here.

NobbZ · 2017-11-08T09:29:45Z

I would be happy about consolidating those keys under input as well!

petertseng · 2017-11-08T22:52:32Z

I see a comment https://github.com/exercism/problem-specifications/blob/master/canonical-schema.json#L6-L10 in the schema file that explains why input is not a key: To allow for tests that are not example-based (for example, property-based).

I allege that, since the vast majority of our specifications are example-based, we can proceed with input. In general, it is hard to see how to encode a property-based test in JSON, and to date we have not made an attempt at doing so.

So I think all we will ask is that this schema does not prevent a future expansion where we might add the ability to encode a property-based test.

I think such an expansion can be added in another key that is not input. So that is achieved.

petertseng · 2017-11-08T23:09:41Z

If we take action on this (I would wait until a week has passed since the filing of the proposal), plan what to do with the version.

Assume that an exercise that is converted to the new schema is currently versioned x.y.z.

- Leave current version scheme in place, which would require a major version bump to (x+1).0.0.
- Adopt the two-component version proposed in Versioning guidelines #938, which would require a minor version bump to x.(y+1).
- Adopt the one-component version proposed in Versioning guidelines #938 (comment), which would require simply dropping y and z, resulting in version x.
- Remove the version number completely.
- Your suggestion here.

Adopting a different version schema simultaneously allows us to take the interesting shortcut of "an exercise is converted to the new input schema if and only if it has an N-component version", so consider it an incentive to adopt a different version schema simultaneously.
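
As a rough illustration of the first three options and of the "N-component" shortcut, assuming versions are plain dot-separated integers (all method names here are hypothetical):

```ruby
# Option 1: keep three components and bump the major version.
def major_bump(version)
  x, = version.split('.').map(&:to_i)
  "#{x + 1}.0.0"
end

# Option 2: move to two components and bump the minor version.
def two_component_bump(version)
  x, y = version.split('.').map(&:to_i)
  "#{x}.#{y + 1}"
end

# Option 3: move to one component by dropping y and z.
def one_component(version)
  version.split('.').first
end

# The shortcut: an exercise counts as converted if and only if its version
# has exactly n components.
def converted?(version, n)
  version.split('.').length == n
end

puts major_bump('1.2.3')
puts two_component_bump('1.2.3')
puts one_component('1.2.3')
puts converted?('1.3', 2)
```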

My preferences are as I noted earlier. Single-component version > No version at all > Two-component version with CI enforcement that some version component changes ≈ Two-component version without CI enforcement > current version scheme.

coriolinus · 2017-11-09T01:35:13Z

I think that for exercises that have their canonical-data.json restructured by this, it would be appropriate to leave the current version scheme in place, but increment the version: (x+1).0.0. It's possible that a flexibly-written test generator will generate identical output after the fact, but many (including mine for Rust) will generate something new after the fact. That's a breaking change, appropriate for a major version upgrade.

The proposals to reduce the version number to a two- or one-component number discard clarity for simplicity. I can't deny that they're simpler, but I believe that knowledge of standard three-component semantic versioning is fairly widespread in the programming community at this point.

The proposal to abandon versioning in favor of git SHAs or other schemes has the drawback that it becomes harder for humans to tell what's going on at a glance, and impossible for them to estimate the difference between two files at a glance.

Insti · 2017-11-09T05:53:38Z

Can we try and keep the versioning discussions in #938 so they will be more easily discoverable to future generations?

petertseng · 2017-11-15T04:25:05Z

Well, to my surprise, in 24 * 7 hours there were no objections and there was also nonzero support. I suppose we can call it open season to convert JSON files to this, then. #998 will be able to tell us which exercises were converted or not. I would make a big checklist like #625 did but since my allotted time is up now, I can't.

snahor · 2017-11-30T11:23:09Z

exercism#996 Updated the schema for canonical data. So I am updating the book-store exercise's json to reflect the change.

petertseng · 2018-02-05T02:10:45Z

This makes some progress on the concerns listed in #336 which were:

- We don't know which keys are inputs (now we will! They're all the key/value pairs under input!)
- Are we able to have meaningful names for the inputs, so that they are easy for humans to read? (We are able!)

In fact, having an input object was proposed there. So this was merely a revival of it, 15 months later. Thanks to those who originally proposed it (#336 (comment), #336 (comment), #336 (comment))

The remaining questions to resolve would probably be:

We don't know what order the inputs come in. (canonical-data.json standardisation discussion (was: Malformed data?) #336 (comment))

This proposal deliberately chose not to solve that problem because it chooses to take a small step at a time. If you were to predict that I will put forward a proposal to solve that problem, I advise you that that is an extremely unwise prediction to make.

petertseng · 2018-02-10T12:12:37Z

Everyone is reminded that, when submitting a PR that purports to move inputs to the input object, you will need to remove the USE_OLD_SCHEMA file in that exercise's directory because I just merged #1074. The CI has got your back in case you forget to do that (I mean the CI will fail if you forget).

ErikSchierboom · 2018-02-10T12:13:10Z

Great work @petertseng!

petertseng · 2018-02-12T08:55:24Z

Since this proposal was accepted and is now being checked by CI, I believe it's safe to close the issue. Tracking of progress is done by counting the number of USE_OLD_SCHEMA files. Even though it's a policy, there's no need to keep it open, since policies can be found with https://github.com/exercism/problem-specifications/issues?utf8=%E2%9C%93&q=label%3Apolicy
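
Counting the marker files could be done along these lines. This is a sketch, not the repository's actual tooling: it assumes the exercises/<slug>/USE_OLD_SCHEMA layout implied above, `remaining_old_schema` is a hypothetical name, and the demo runs against a throwaway directory with made-up contents.

```ruby
require 'tmpdir'
require 'fileutils'

# Lists the slugs of exercises that still carry a USE_OLD_SCHEMA marker,
# i.e. exercises not yet converted to the input object.
def remaining_old_schema(repo_root)
  Dir.glob(File.join(repo_root, 'exercises', '*', 'USE_OLD_SCHEMA'))
     .map { |path| File.basename(File.dirname(path)) }
     .sort
end

# Demo: three exercise directories, one of which is still unconverted.
Dir.mktmpdir do |root|
  %w[change grains leap].each do |slug|
    FileUtils.mkdir_p(File.join(root, 'exercises', slug))
  end
  FileUtils.touch(File.join(root, 'exercises', 'grains', 'USE_OLD_SCHEMA'))

  remaining = remaining_old_schema(root)
  puts "#{remaining.length} exercise(s) left: #{remaining.join(', ')}"
end
```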

petertseng · 2018-02-12T22:42:29Z

Note that in react (#1130) and circular-buffer (#1186), to accommodate the schema (which requires expected), we had to add expected: {} to each case, because the expectations are already encoded in each operation under operations.

If there is desire and demand for it, consider allowing operations to be an alternative to expected. Remember that bank-account will want the same treatment (#554).

grains 1.1.0

As proposed and accepted in #996

```ruby
have_input = false

ARGF.each_line { |l|
  # Bump the minor version and reset the patch version.
  if l.include?('version')
    ver = l.split(?")[3]
    ver_components = ver.split(?.).map(&:to_i)
    ver_components[1] += 1
    ver_components[2] = 0
    l[ver] = ver_components.join(?.)
  end

  # A new case starts at its description, so forget any input seen earlier.
  have_input &&= !l.include?('"description"')

  first_non_space = l.index(/\S/)

  # Wrap the scalar input in an object keyed by "square".
  if l.include?('"input"')
    have_input = true
    input = l.split(?:).last.to_i
    puts ' ' * first_non_space + '"input": {'
    puts ' ' * first_non_space + ' "square": ' + input.to_s
    puts ' ' * first_non_space + '},'
    next
  end

  # Cases without an input get an empty input object just before "expected".
  puts ' ' * first_non_space + '"input": {},' if l.include?('"expected"') && !have_input

  puts l
}
```

petertseng · 2018-02-16T02:18:46Z

> The remaining questions to resolve would probably be:
>
> We don't know what order the inputs come in. (canonical-data.json standardisation discussion (was: Malformed data?) #336 (comment))

My advice to the person who will resolve this question: You will need to be able to account for language differences. Take https://github.com/exercism/problem-specifications/blob/master/exercises/accumulate/description.md for example. This is the higher-order function map. Consider how a few existing languages would order the inputs:

Haskell: Object being acted on goes last, so that function composition works well.
- https://wiki.haskell.org/Parameter_order
- Thus, map has the function before the list.
Elixir: Object being acted on goes first, so that pipelining works well.
- https://hexdocs.pm/elixir/Kernel.html#%7C%3E/2
- Thus, Enum.map/2 has the list before the function.

The solution to this problem will need to take into account the possibility that different languages have different conventions.
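
One shape such a solution could take is a per-track ordering table applied to the named inputs. The sketch below is only an illustration of that idea: `ARGUMENT_ORDER` and `ordered_arguments` are hypothetical, and the key names "list" and "function" are made up rather than taken from accumulate's canonical data.

```ruby
# Hypothetical per-track conventions for the order of the named inputs.
ARGUMENT_ORDER = {
  'haskell' => %w[function list], # object being acted on goes last
  'elixir'  => %w[list function]  # object being acted on goes first
}.freeze

# Returns the input values in the order the given track expects.
# fetch raises KeyError for an unknown track or a missing input name.
def ordered_arguments(input, track)
  ARGUMENT_ORDER.fetch(track).map { |name| input.fetch(name) }
end

input = { 'list' => [1, 2, 3], 'function' => '(x) => x * x' }

p ordered_arguments(input, 'haskell')
p ordered_arguments(input, 'elixir')
```

Because the inputs are named, the ordering can live with each track instead of in the shared JSON, which is one way to accommodate differing conventions.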

Whoops, I just typed this up and realized it's already been said before. #336 (comment). I'm sorry that I failed to add anything new to the discussion.

This reverts commit a866c43. Observe that there are no more USE_OLD_SCHEMA files in this repository. This indicates that henceforth all canonical-data.json files should use the schema proposed in #996 and defined in #1074.

Insti mentioned this issue Nov 14, 2017

meetup: correct example in description.md #1006

Closed

Insti added the policy label Nov 15, 2017

Insti mentioned this issue Nov 15, 2017

documentation: Update README.md to reflect the "input key as an object" policy. #1007

Closed

ErikSchierboom mentioned this issue Nov 21, 2017

series: Improve documentation #1020

Merged

cmccandless mentioned this issue Nov 21, 2017

Add test generator exercism/python#1093

Closed

devkabiir added a commit to devkabiir/problem-specifications that referenced this issue Nov 25, 2017

update input schema as per exercism#996

7c13fb8

devkabiir mentioned this issue Dec 6, 2017

binary-search-tree: implement canonical-data.json #940

Merged

Stargator mentioned this issue Dec 10, 2017

Revise tool create-exercise to handle new policy exercism/dart#88

Closed

rpottsoh mentioned this issue Dec 10, 2017

readme.md: Update example to reflect "input key as an object" policy #1030

Merged

rpottsoh added a commit to rpottsoh/exercism-problem-specifications that referenced this issue Dec 15, 2017

book-store: Update json for new "input" policy

5c326cb

exercism#996 Updated the schema for canonical data. So I am updating the book-store exercise's json to reflect the change.

rpottsoh mentioned this issue Dec 15, 2017

book-store: Update json for new "input" policy #1037

Merged

This was referenced Dec 16, 2017

beer-song: Update json for new input policy #1038

Merged

binary: Update json for new input policy and fix typo #1039

Merged

binary-search: Update json for new input policy #1040

Merged

rpottsoh mentioned this issue Dec 21, 2017

transpose: add test case for mixed line length #1047

Merged

This was referenced Jan 29, 2018

robot-simulator: apply "input" policy #1163

Merged

run-length-encoding: apply "input" policy #1164

Merged

tournament: apply "input" policy #1165

Merged

bdw429s mentioned this issue Feb 5, 2018

bob: Update to clarify ambiguity regarding shouted questions exercism/cfml#74

Closed

petertseng closed this as completed Feb 12, 2018

petertseng mentioned this issue Feb 15, 2018

grains: Move input (square) to input object #1191

Merged

petertseng mentioned this issue Feb 16, 2018

Revert "travis: Choose schema based on USE_OLD_SCHEMA file" #1193

Merged

This was referenced Feb 24, 2018

All tracks that have test generators exercism/discussions#155

Closed

no-op declare compliance with problem-specifications for any input object change exercism/rust#435

Merged

petertseng mentioned this issue Mar 7, 2018

README: example has non-object inputs; change them to objects #1203

Closed

rpottsoh added a commit to rpottsoh/exercism-problem-specifications that referenced this issue Mar 7, 2018

Apply exercism#996 to README

383f29a

rpottsoh added a commit to rpottsoh/exercism-problem-specifications that referenced this issue Mar 8, 2018

Apply exercism#996 to README

633bc18
