All tracks that have test generators #155

petertseng · 2017-06-05T00:46:09Z

Welcome to another issue of "All tracks that have X".
Today's issue is about test generators.

A test generator is anything that uses the canonical-data.json file from x-common to generate a test suite to be delivered to students of a given track.

If your track has these, I would be interested to hear about it.

I hope this can help tracks that don't have generators evaluate whether to have them, and allow tracks that already have generators to learn from each other.

Questions I would like to ask:

How much additional code must you write to generate tests for each new exercise?
- On one extreme, zero additional code is needed: A single generator can generate code for every single exercise.
- On the other extreme, maximal additional code is needed: No code at all is shared between generators of any two exercises.
- Where on this spectrum is your track currently?
- Where on this spectrum would you like your track to be, ideally?
How do you deal with the fact that the keys/values of a test case are dependent on the exercise? The output is always found under expected, but the input values take on many different names.
- Vimscript: Take the first key that isn't any of comments, description, expected, property (TBD: tests that take multiple inputs). See Add lib/generate.vim vimscript#32.
- Various other tracks: Additional code required per exercise that specifies what key(s) contain(s) the input.
Statically typed languages: How do you deal with the fact that you cannot determine what types the keys/values of a test case will have until you read the value at the property key?
- C# and Scala: Parse into a map/dictionary of string -> any type.
- Go: Union all possible keys/values.
- Go proposed (clock: restore type-safety for different types of cases go#677): Delay full parsing until property is read, then parse into different structs depending on what property is.
Are there any possible changes to the canonical JSON schema that would make generation easier?

This issue may be closed when, in the issue-closer's opinion, there has been enough discussion to get an idea of how some tracks are answering these questions. Of course, even after it is closed, please feel free to comment with any additional answers.

If as a result there are any proposed changes to the schema, an appropriate issue can be created for that.

To give us a head start, here is what I know of some languages' generators.
Please forgive me for being greedy and filling in information for tracks that I am unfamiliar with.
Please correct these or add any additional tracks I missed.
In alphabetical order:

C#

https://github.com/exercism/xcsharp/tree/master/generators
per-exercise data at https://github.com/exercism/xcsharp/tree/master/generators/Exercises
- simple example: leap https://github.com/exercism/xcsharp/blob/master/generators/Exercises/LeapExercise.cs
- beer-song deals with multiple property: https://github.com/exercism/xcsharp/blob/master/generators/Exercises/BeerSongExercise.cs
https://github.com/exercism/xcsharp/blob/master/generators/Data/CanonicalDataCase.cs#L22 is IDictionary<string, object>, supporting values of any type.

ColdFusion

exercism/cfml@ef2544b
Probably zero additional config per exercise, given single-input exercises. I did not read carefully enough to verify that this statement is true.

Factor

https://github.com/catb0t/exercism.factor/blob/master/exercism/autogen-exercises/autogen-exercises.factor

Go

A .meta/gen.go in each exercise directory defines the structure that the file is expected to have.
- https://github.com/exercism/xgo/blob/master/exercises/leap/.meta/gen.go is a simple example for tests that have one property.
- https://github.com/exercism/xgo/blob/master/exercises/clock/.meta/gen.go and https://github.com/exercism/xgo/blob/master/exercises/custom-set/.meta/gen.go examples for tests that have more than one property.
- Currently unions all key/value pairs. Parsing property first, before the other key/value pairs, is under consideration.
Common code at https://github.com/exercism/xgo/blob/master/gen/gen.go

JavaScript

As I understand it, a single generator: https://github.com/exercism/xjavascript/blob/master/exercises/custom-set/example-gen.js

OCaml

Common code in https://github.com/exercism/xocaml/tree/master/tools/test-generator/src
a little additional for each exercise in https://github.com/exercism/xocaml/blob/master/tools/test-generator/src/ocaml_special_cases.ml
templates for each exercise in https://github.com/exercism/xocaml/tree/master/tools/test-generator/templates/ocaml
- leap is simple, only one property: https://github.com/exercism/xocaml/blob/master/tools/test-generator/templates/ocaml/leap/template.ml
- run-length-encoding with multiple properties: https://github.com/exercism/xocaml/blob/master/tools/test-generator/templates/ocaml/run-length-encoding/template.ml

Perl 6

Template in https://github.com/exercism/xperl6/blob/master/templates/test.mustache
Explanations in https://github.com/exercism/xperl6/tree/master/bin
YAML file for each exercise, then JSON file from x-common simply gets embedded into the test file
- leap: https://github.com/exercism/xperl6/blob/master/exercises/leap/example.yaml and https://github.com/exercism/xperl6/blob/master/exercises/leap/leap.t
- clock (multiple property, hard code indices of cases): https://github.com/exercism/xperl6/blob/master/exercises/clock/example.yaml and https://github.com/exercism/xperl6/blob/master/exercises/clock/clock.t

Ruby

A small script in the bin directory: https://github.com/exercism/xruby/blob/master/bin/generate
A library in lib/generator providing common-case parsing: https://github.com/exercism/xruby/tree/master/lib/generator
A .meta/generator in each exercise directory containing a *_case.rb file. If necessary, a custom test_template.erb can be provided. Most tests use a common default template.
- Simple example with a custom template: https://github.com/exercism/xruby/blob/master/exercises/leap/.meta/generator/leap_case.rb and https://github.com/exercism/xruby/blob/master/exercises/leap/.meta/generator/test_template.erb
- Example with more in generator: https://github.com/exercism/xruby/blob/master/exercises/bowling/.meta/generator/bowling_case.rb
- Notice how e.g. bracket-push needs no custom template: https://github.com/exercism/xruby/tree/master/exercises/bracket-push/.meta/generator

Edit: Most tests use a common default template.

Rust

https://github.com/exercism/rust/blob/master/bin/init_exercise.py

Scala

Common code in https://github.com/exercism/xscala/tree/master/testgen/src/main/scala/testgen
https://github.com/exercism/xscala/blob/master/testgen/src/main/scala/testgen/CanonicalDataParser.scala#L12 allows any type in the parse result.
Per-exercise portion in https://github.com/exercism/xscala/tree/master/testgen/src/main/scala
- https://github.com/exercism/xscala/blob/master/testgen/src/main/scala/BookStoreTestGenerator.scala looks quite simple, nice.
- https://github.com/exercism/xscala/blob/master/testgen/src/main/scala/BeerSongTestGenerator.scala deals with multiple properties. The list of inputs is defined per-property.

Vimscript

https://github.com/exercism/xvimscript/blob/master/lib/generate.vim
As far as I can tell, zero additional config per exercise, given single-input exercises.


mhinz · 2017-06-05T11:52:24Z

How do you deal with the fact that the keys/values of a test case are dependent on the exercise?

This was my biggest pain point when I wrote the generator for Vimscript. As you said, I'm essentially guessing. I remove the keys comments, description, expected, property and hope that only one key will be left. Then I use its value.

It's easy to see that this approach is very fragile, nonetheless it works in a lot of cases.
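
The heuristic described above can be sketched in Go (not the Vimscript implementation itself). This is a minimal sketch under stated assumptions: the function name inputKeys and the sample case are hypothetical, and the case is decoded into a generic map. Rather than silently taking the first leftover key, the sketch returns all of them so the caller can notice when the guess is ambiguous.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// knownKeys are the keys that do not depend on the exercise.
var knownKeys = map[string]bool{
	"comments":    true,
	"description": true,
	"expected":    true,
	"property":    true,
}

// inputKeys (hypothetical helper) returns every key of a test case that is
// not a known key, sorted so the result is deterministic.
func inputKeys(testCase map[string]interface{}) []string {
	var keys []string
	for k := range testCase {
		if !knownKeys[k] {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	return keys
}

func main() {
	// Hypothetical single-input case; the key name "input" is an assumption.
	raw := []byte(`{"description": "leap year", "property": "leapYear", "input": 1996, "expected": true}`)
	var testCase map[string]interface{}
	if err := json.Unmarshal(raw, &testCase); err != nil {
		panic(err)
	}
	keys := inputKeys(testCase)
	if len(keys) != 1 {
		// More than one leftover key: the guess would be ambiguous.
		fmt.Println("cannot guess the input key; candidates:", keys)
		return
	}
	fmt.Printf("input key %q has value %v\n", keys[0], testCase[keys[0]])
}
```

With a multi-input case such as the clock cases, inputKeys returns more than one key, which is exactly the situation where the guess breaks down.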

I intended to open an issue for this for x-common, because we need a proper "standard" that describes what canonical data should look like. One should never have to guess.

But you also raised other points I didn't encounter, e.g. type issues, so I hope we can compile a list of typical issues here and use those to create a standard for canonical data. This would make generators less complex and more correct.

kotp · 2017-06-05T20:48:27Z

The VimScript approach in that regard so far is brilliant... remove the known keys, use the unknown.

petertseng · 2017-06-06T01:08:55Z

Statically typed languages: How do you deal with the fact that you cannot determine what types the keys/values of a test case will have until you read the value at the property key?

I need to clarify this and why I am interested in the answer to this question.

In some JSON parsers in statically-typed languages, you must declare the types of all key/value pairs in a JSON object before you can parse it. This poses a challenge for, say, the clock data in https://github.com/exercism/x-common/blob/master/exercises/clock/canonical-data.json that has multiple property but differing types for keys/values depending on property.

An excerpt:

  "cases": [
    {
      "description": "Create a new clock with an initial time",
      "cases": [
        {
          "description": "on the hour",
          "property": "create",
          "hour": 8,
          "minute": 0,
          "expected": "08:00"
        }
      ]
    },
    {
      "description": "Add minutes",
      "cases": [
        {
          "description": "add minutes",
          "property": "add",
          "hour": 10,
          "minute": 0,
          "add": 3,
          "expected": "10:03"
        }
      ]
    },
    {
      "description": "Compare two clocks for equality",
      "cases": [
        {
          "description": "clocks with same time",
          "property": "equal",
          "clock1": {
            "hour": 15,
            "minute": 37
          },
          "clock2": {
            "hour": 15,
            "minute": 37
          },
          "expected": true
        }
      ]
    }
  ]

Well, since the property is the only way we can tell these apart (without any other prior knowledge! more on prior knowledge later!), we are challenged to find a type that describes the possible objects that may lie in cases.

There will now be some examples in Go, but you might imagine how you would do it in your statically-typed language of choice.

The current solution at https://github.com/exercism/xgo/blob/master/exercises/clock/.meta/gen.go is to union all the keys/values.

type js struct {
	Groups TestGroups `json:"Cases"`
}

type TestGroups []struct {
	Description string
	Cases       []OneCase
}

type OneCase struct {
	Description string
	Property    string
	Hour        int // "create"/"add" cases
	Minute      int // "create"/"add" cases
	Add         int // "add" cases only

	Clock1   struct{ Hour, Minute int } // "equal" cases only
	Clock2   struct{ Hour, Minute int } // "equal" cases only
	Expected interface{}                // string or bool
}

How does this compare to the state of the world before property? Before then, the clock tests looked something like https://github.com/exercism/x-common/blob/cda8f9800a33d997f8c6146a10b8caf66e25ec4b/exercises/clock/canonical-data.json:

   "create": {
      "description": [
         "Test creating a new clock with an initial time."
      ],
      "cases": [
         {
            "description": "on the hour",
            "hour": 8,
            "minute": 0,
            "expected": "08:00"
         }
      ]
   },
   "add": {
      "description": [
         "Test adding and subtracting minutes."
      ],
      "cases": [
         {
            "description": "add minutes",
            "hour": 10,
            "minute": 0,
            "add": 3,
            "expected": "10:03"
         }
      ]
   },
   "equal": {
      "description": [
         "Construct two separate clocks, set times, test if they are equal."
      ],
      "cases": [
         {
            "description": "clocks with same time",
            "clock1": {
               "hour": 15,
               "minute": 37
            },
            "clock2": {
               "hour": 15,
               "minute": 37
            },
            "expected": true
         }
      ]
   }

To this, it is possible to use the structure at https://github.com/exercism/xgo/blob/d8dbcece4b6bbdd8f82099645c4defa02daca2c0/exercises/clock/.meta/gen.go

type js struct {
  Create struct {
    Description []string
    Cases       []struct {
      Description  string
      Hour, Minute int
      Expected     string
    }
  }
  Add struct {
    Description []string
    Cases       []struct {
      Description       string
      Hour, Minute, Add int
      Expected          string
    }
  }
  Equal struct {
    Description []string
    Cases       []struct {
      Description    string
      Clock1, Clock2 struct{ Hour, Minute int }
      Expected       bool
    }
  }
}

Let's remind everyone of why we moved away from this approach: exercism/problem-specifications#336 (comment) :

In most of the test suites with more than one type of test, the test's type is encoded
in a property-key, in an object describing a test group.

There are two problems with that approach:

It mixes two different concepts regarding the tests: grouping and identification

It doesn't allow nesting of test groups. That would be nice to have, but is not really needed.

It doesn't allow grouping of test of different types, which would be really great.

Moving the test type near the test data, we solved all the above problem easily. It is theoretically sound and adds functionality.

So what do we do? If you think this problem is insurmountable, then you might think to propose a schema change.

Should it go back to the way it once was, with the schema supporting keys on the top level instead of using property? Then the create, add, and equal keys are also exercise-dependent. We would need a strong reason to move back to this way.
Should the schema be changed in some other way? Your suggestion is welcome of course since I have not thought of one yet.

There are of course various choices NOT involving schema changes! If every language wanting to parse the JSON finds at least one of these choices satisfactory, the schema doesn't need to change for this reason (it might change for other reasons).

- Parse the JSON object into a generic map of string -> value-of-unknown-type (however this is done in your language). This seems to make it difficult to tell what keys each individual property has, and maybe you don't like to use your language's unknown-type type. The C# and Scala tracks are known to use this solution, and I imagine they are satisfied with it. The resulting generators look clean, which is why I make that guess.
- Union the keys/values. A bit ugly, hard to tell what keys each individual property has, and tricky if the same key can have different types (this problem is not unique to multiple-property exercises though, so I guess that does not detract from this solution).
- Delay parsing of the JSON object until you have read the property, then parse it into the correct type based on the property. Seems reasonable, and the generators look understandable enough.
- Use prior knowledge not encoded in the schema. For clock, we know that currently cases[0] has all create cases, currently cases[1] and cases[2] have all add cases, and currently cases[3] has all equal cases. We can use this prior knowledge to parse while being able to declare types. But tracks have to be careful when adopting that approach, because x-common could change in such a way that invalidates this prior knowledge.

m-dango · 2017-06-06T12:45:01Z

The use of prior knowledge has already required me to make changes to the test the last time clock was updated. I would not recommend it unless it's a necessity.

The approach I intend to take instead is a for loop with a switch case for each property. For example with clock:

for @($c-data<cases>) {
  for @(.<cases>) -> $case {
    given $case<property> {
      when 'create' {
        ...
      }
      when 'add' {
        ...
      }
      when 'equal' {
        ...
      }
    }
  }
}

Stargator · 2017-09-13T15:39:52Z

@petertseng is it a goal for all tracks to implement test generators?

petertseng · 2017-09-13T19:38:27Z

is it a goal for all tracks to implement test generators?

I won't presume to speak for Exercism (and I shouldn't answer at all because you aren't asking me, you are asking peter seng, but I will answer anyway), but I can say for sure it's not one of my goals.

Each track can do as its maintainers please. Whatever makes their lives easier.

petertseng · 2018-02-24T18:38:00Z

This issue may be closed when, in the issue-closer's opinion, there has been enough discussion to get an idea of how some tracks are answering these questions

Way overdue.

Reminder that exercism/problem-specifications#996 should make it easy to determine what the inputs are.

petertseng added concerning/open-source kind/request-for-comments labels Jun 5, 2017

kytrinyx added the todo/data-collection label Jun 5, 2017

snahor mentioned this issue Jul 10, 2017

Write a generator exercism/sml#46

Closed

This was referenced Nov 8, 2017

Init exercise script exercism/rust#389

Merged

JSON Schema proposal: All inputs should appear under the input key exercism/problem-specifications#996

Closed

petertseng closed this as completed Feb 24, 2018

petertseng mentioned this issue Dec 1, 2018

list of tracks with test generators exercism/problem-specifications#1411

Closed

sshine mentioned this issue Mar 29, 2019

High-Scores: Add immutability test exercism/problem-specifications#1486

Merged
