
Support lower / higher bounds in range loop #56

Closed

gmile opened this issue Aug 15, 2017 · 15 comments

@gmile
Contributor

gmile commented Aug 15, 2017

Right now the following template will always produce a sequence of 5 numbers:

echo '{{ range Loop 5 }}{{ Int 0 9 }}{{ end }}' | fakedata -l 5
36513
62260
38785
18363
82136

It'd be great to have the ability to specify lower and higher bounds, so that range Loop x y would generate a sequence of random length:

echo '{{ range Loop 2 7 }}{{ Int 0 9 }}{{ end }}' | fakedata -l 5
365
6226013
3878
183635
82

On each iteration specified by -l 5, {{ range Loop x y }} would produce a random number between x and y, inclusive, and loop that many times.
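
As a rough illustration (not fakedata's actual code; the function names and signatures are assumptions), a Loop helper with optional bounds could be wired into Go's text/template like this:

package main

import (
	"math/rand"
	"os"
	"text/template"
)

// loop returns a slice whose length is picked at random between the given
// bounds (inclusive); ranging over it repeats the template body that many times.
func loop(bounds ...int) []int {
	min, max := 1, 1
	switch len(bounds) {
	case 1:
		min, max = bounds[0], bounds[0]
	case 2:
		min, max = bounds[0], bounds[1]
	}
	return make([]int, min+rand.Intn(max-min+1))
}

func main() {
	funcs := template.FuncMap{
		"Loop": loop,
		"Int":  func(min, max int) int { return min + rand.Intn(max-min+1) },
	}
	t := template.Must(template.New("demo").Funcs(funcs).Parse(
		"{{ range Loop 2 7 }}{{ Int 0 9 }}{{ end }}\n"))
	if err := t.Execute(os.Stdout, nil); err != nil {
		panic(err)
	}
}

Run a few times, this prints between 2 and 7 random digits per execution, which is the behaviour described above.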

@KevinGimbel
Collaborator

I think a Substring-like function may be a better fit here. Something along the lines of this demo: https://play.golang.org/p/vl69PmPTXx

The usage would be something like:

{{ range Loop 7 }}
{{ Substring (Int 9999999) 2 7 }}
{{ end }}

2 would be the minimum and 7 the maximum number of characters; Int 9999999 would generate a big random number on each run, which Substring would cut down to something 2 to 7 characters in length.

Not sure yet how to implement this, just a first thought.
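
A tiny sketch of that idea (purely illustrative; substring and its signature are made up here, not fakedata's API):

package main

import (
	"fmt"
	"math/rand"
	"strconv"
)

// substring cuts s down to a random length between min and max characters,
// never longer than s itself.
func substring(s string, min, max int) string {
	n := min + rand.Intn(max-min+1)
	if n > len(s) {
		n = len(s)
	}
	return s[:n]
}

func main() {
	big := strconv.Itoa(rand.Intn(9999999)) // a big random number as a string
	fmt.Println(substring(big, 2, 7))       // keep 2 to 7 characters of it
}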

@gmile
Contributor Author

gmile commented Aug 19, 2017

@lucapette I think {{ range Loop 2 7 }} may have more generic usage and cover more cases, please correct me if I'm wrong.

In my case I'm generating a JSON object in which a key points to an array value. That array may contain one or more items, so I need to generate an array of random size.

Here's how I think this could look:

{
  "people": [
    {{ range Loop 1 5 }}
      {
        "name": "{{ NameFirst }} {{ NameLast }}",
        "phone": "{{ Occupation }}"
      },
    {{ end }}
  ]
}

Also, it follows the same pattern as Int does in {{ Int 15 20 }}.

@KevinGimbel
Collaborator

@gmile Indeed, it does follow the pattern of {{ Int }}. I think this is the right direction to go if we want to support "randomized" looping.

@gmile
Contributor Author

gmile commented Aug 19, 2017

@KevinGimbel oops, I just realized I should have referred to you in my reply, but instead I mentioned Luca's GitHub name. Sorry about that!

@KevinGimbel
Collaborator

@gmile No worries, I subscribe to all issues and read almost everything 😀

I'll see about implementing a demo/prototype if I get the time. :)

@gmile
Contributor Author

gmile commented Aug 21, 2017

@KevinGimbel I've opened #61 with a minimal patch to get this working. I don't know Go very well, so my code is likely far from idiomatic. I'd love some guidance on how to make it better.

My main question, though, is how to add tests for this. I figure the current implementation of range Loop isn't covered by tests? I was unable to find any.

@KevinGimbel
Collaborator

@gmile thank you! The PR looks good at first sight.

Tests in Go are written in a file ending in _test.go. For template.go you'd write your tests into template_test.go, but we don't have template tests at the moment (not sure why, @lucapette may know more).


For fakedata we use Table-Driven Tests, which look like this:

func TestSeparatorFormatter(t *testing.T) {
	tests := []struct {
		name string
		sep  string
		want string
	}{
		{"default", " ", "Grace Hopper example.com"},
		{"csv", ",", "Grace Hopper,example.com"},
		{"tab", "\t", "Grace Hopper\texample.com"},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			f := &fakedata.SeparatorFormatter{Separator: tt.sep}
			if got := f.Format(columns, values); !reflect.DeepEqual(got, tt.want) {
				t.Errorf("SeparatorFormatter.Format() = %v, want %v", got, tt.want)
			}
		})
	}
}

For the tests we create a slice of structs (tests := []struct). These structs have fields, in this case three: name, sep, and want. Directly following the struct definition we declare the tests, each represented by one struct (sorry if this sounds weird :D). Since we have a slice of tests we can then iterate over it using

for _, tt := range tests {}

range is basically like foreach($array as $value) {} in PHP or array.forEach(function(value) {}) in JavaScript. With t.Run(tt.name, func(t *testing.T) {}) we declare a function that's run for every test and that determines whether the test is a success or a failure.

In the above test this function has a body of

  f := &fakedata.SeparatorFormatter{Separator: tt.sep}
  if got := f.Format(columns, values); !reflect.DeepEqual(got, tt.want) {
      t.Errorf("SeparatorFormatter.Format() = %v, want %v", got, tt.want)
  }

Which means:

f := &fakedata.SeparatorFormatter{Separator: tt.sep}
Create an instance of &fakedata.SeparatorFormatter with the Separator from the current test.

if got := f.Format(columns, values); and !reflect.DeepEqual(got, tt.want)
Get the result from running it against the input and check if it's what we wanted

If it's not what we wanted we use t.Errorf("SeparatorFormatter.Format() = %v, want %v", got, tt.want) to report an error. If everything's good the test simply passes without any further notice.
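
To connect the pattern back to this issue, here is a hedged sketch of what a table-driven test for a bounded-loop helper might look like (loopLen is a made-up stand-in, not the real implementation):

package fakedata_test

import "testing"

// loopLen is a stand-in for the helper under discussion; for determinism
// this fake version simply returns the lower bound.
func loopLen(min, max int) int { return min }

func TestLoopLen(t *testing.T) {
	tests := []struct {
		name     string
		min, max int
		want     int
	}{
		{"fixed length", 5, 5, 5},
		{"lower bound", 2, 7, 2},
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := loopLen(tt.min, tt.max); got != tt.want {
				t.Errorf("loopLen(%d, %d) = %d, want %d", tt.min, tt.max, got, tt.want)
			}
		})
	}
}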


I hope this makes sense. For now I'd say you can ignore the tests, but make sure the code is formatted correctly with gofmt. Tip: if you use Atom there's a great Go plugin: https://atom.io/packages/go-plus

@gmile
Contributor Author

gmile commented Aug 21, 2017

@KevinGimbel wow, thanks for such a thorough response! I'll have to read it carefully later today, but after a quick glance it all makes sense.

From what I understand, you describe a case where expected_result and actual_result are given upfront, as an array of respective structs. Then by applying a function under test, in your case it's fakedata.SeparatorFormatter (?), we check that equality between a particular pair of expected_result and actual_result stands.

So that's cool. I'm used to writing tests in Ruby (where mocking would be heavily involved) and these days Elixir (almost no mocking; things are somewhat similar to what you have described).

Thanks for all the go implementation details!


I've fixed the PR by running the code through go fmt; the build is green now.

@KevinGimbel
Collaborator

@gmile You're more than welcome! I'm fairly new to Go myself, but I like sharing the knowledge I've gained whenever I can. Luca helped me a lot with tests and Table-Driven Tests :)

From what I understand, you describe a case where expected_result and actual_result are given upfront, as an array of respective structs. Then by applying a function under test, in your case it's fakedata.SeparatorFormatter (?), we check that equality between a particular pair of expected_result and actual_result stands.

That's correct. We have a slice which is the set of tests we define. They have a name, an input, and a want result (e.g. a bool or specific output). The name is only used to know which tests failed or passed, the input (like sep) is passed to the function we want to test, and want is the output we expect.

So that's cool. I'm used to writing tests in Ruby (where mocking would be heavily involved)

These slices of tests are like mocking test data. We supply the function we want to test with a set of inputs - I try to have some that fail and some that pass so that I can see my function handles correct input as well as wrong input.

@lucapette
Owner

@gmile I wrote about the kind of technique we use for integration tests on fakedata; you can read it here: http://lucapette.me/writing-integration-tests-for-a-go-cli-application

The problem with testing fakedata is that it's not so easy to do integration testing because of the nature of the tool itself. It's hard to come up with a good assertion given the actual output is supposed to be different at every run. It's a nice problem though.

@KevinGimbel thank you so much for your work. You're doing a stellar job and fakedata couldn't be luckier to have you!

@gmile
Contributor Author

gmile commented Aug 21, 2017

The problem with testing fakedata is that it's not so easy to do integration testing because of the nature of the tool itself. It's hard to come up with a good assertion given the actual output is supposed to be different at every run. It's a nice problem though.

As an idea, would it be possible to make a pluggable random generator, and inject it as a dependency into all functions that rely on randomness?

When building fakedata, the normal Go random generator would be plugged in. For tests, though, a simplified random generator would be used.

By simplified random generator I mean a function that just iterates values within a given type, starting from some "zero" value. So randomness becomes "controlled randomness" during tests.

A few examples:

  • emoji:

    test (values are taken one by one, in the order defined in emoji.go, so every run produces the same sequence):

    $ fakedata emoji -l 5
    🀄
    🃏
    🅰
    🅱
    🅾

    normal build (random values, different on every run):

    $ fakedata emoji -l 5
    🚃
    🅰
    😀
    🍑
    💓
    
  • int:

    test:

    $ fakedata int -l 5
    0
    1
    2
    3
    4

    normal build:

    $ fakedata int -l 5
    444
    839
    711
    136
    617
    
  • enum:

    test:

    $ fakedata enum:2,4,8,16 -l 5
    2
    4
    8
    16
    2
    

    normal build:

    $ fakedata enum:2,4,8,16 -l 5
    2
    8
    4
    8
    4
    

And so on.
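
Speculatively, the pluggable generator described above could look something like this in Go (the type and function names are invented for illustration, not fakedata's actual code): generators take a source of randomness instead of calling math/rand directly, and tests swap in a deterministic one.

package main

import (
	"fmt"
	"math/rand"
)

// Rand is the assumed seam: anything that can pick an index below n.
type Rand interface {
	Intn(n int) int
}

// realRand delegates to math/rand for normal builds.
type realRand struct{}

func (realRand) Intn(n int) int { return rand.Intn(n) }

// seqRand just counts upwards, giving "controlled randomness" in tests.
type seqRand struct{ next int }

func (s *seqRand) Intn(n int) int {
	v := s.next % n
	s.next++
	return v
}

// pick chooses a value from an enum using whatever Rand it is given.
func pick(r Rand, values []string) string {
	return values[r.Intn(len(values))]
}

func main() {
	values := []string{"2", "4", "8", "16"}
	test := &seqRand{}
	for i := 0; i < 5; i++ {
		fmt.Println(pick(test, values)) // 2, 4, 8, 16, 2: identical on every run
	}
	fmt.Println(pick(realRand{}, values)) // normal build: genuinely random
}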

@lucapette
Owner

@gmile that wouldn't really test the only thing we don't test already. If you look at how the test suite is organized, we do test that things are glued together correctly. I think it gets a bit philosophical at this point, but I don't mind :)

The thing is that I'm no big fan of unit-level testing (where unit means testing "one thing with no dependencies") because in reality a green test of this kind can still mean the system doesn't work. My opinion is somewhat unpopular though, as there are a lot of people advocating for injecting dependencies for the sake of testability. I find the technique helpful at times, especially when dealing with units that make heavy use of time, timezone, and date related operations. So I'm not saying it's a no-go in general.

My point is that I prefer (and therefore always try first) tests that are green only when the system exhibits the correct behaviour and red when it doesn't. I do understand no test can give that guarantee, but I've definitely seen that property more often in what we generally call "integration tests".

So, to go back to fakedata: I appreciate the suggestion. I've personally thought about what you're suggesting multiple times, but I always ended up not doing it because I couldn't justify the effort given the little benefit I've seen in the approach. Of course I'm open to giving it a try if there's a benefit I'm not seeing :)

Sorry for the rather philosophical and pretty long answer!

@gmile
Contributor Author

gmile commented Aug 23, 2017

@lucapette thank you for laying out your vision in a long form, I don't mind that at all! This is indeed a philosophical topic.

we do test that things are glued together correctly

My vision regarding tests aligns with yours in that integration tests are vital and absolutely essential; I see no replacement for them. I prefer having only a handful of critical integration tests that cover the big happy and negative cases. That ensures the main code paths are working, and that the functions those paths are composed of are glued together correctly.

in reality a green test of this kind can still mean the system doesn't work

I absolutely agree. Unit tests alone won't tell whether the system is working or not. Still, I definitely see a place for unit tests, specifically to verify that a function behaves as expected for particular edge-case inputs. In the description of #61 I've listed some of them.

Given the above, for example, I'd add 2 integration tests to #61:

  • check that the following renders the expected output:

    echo '{{ range Loop 5 10 }}{{ Int 5 }}{{ end }}' | fakedata -l 1

  • check that the following renders the expected error:

    echo '{{ range Loop "a" 3 }}{{ Int 5 }}{{ end }}' | fakedata -l 1
    

I'd probably even go all-in with integration testing, by running a compiled binary and checking stdout using regexps, how crazy is that? ;)
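
For what it's worth, here is a rough sketch of that binary-plus-regexp idea as a Go test; the binary path, the shell invocation, and the template are assumptions for illustration:

package main_test

import (
	"os/exec"
	"regexp"
	"testing"
)

func TestLoopBoundsOutput(t *testing.T) {
	// Pipe a template through an already-built ./fakedata binary.
	cmd := exec.Command("sh", "-c",
		`echo '{{ range Loop 5 10 }}{{ Int 0 9 }}{{ end }}' | ./fakedata -l 1`)
	out, err := cmd.Output()
	if err != nil {
		t.Fatal(err)
	}
	// The exact digits change on every run, but there should be 5 to 10 of them.
	if !regexp.MustCompile(`^\d{5,10}\s*$`).Match(out) {
		t.Errorf("unexpected output: %q", out)
	}
}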

For all other cases I'd have a bunch of unit tests, to check that a function behaves as expected (either it returns an error or it returns some sort of expected good value). Since asserting on an expected value is hard due to randomness, this is where I believe extracting randomness into some form of interface / adapter would make sense.

Unit tests would benefit from a pluggable randomness interface / adapter by getting "controlled randomness", while integration tests could continue to run on "native randomness".


Even though I do have my own preferences, I don't feel strongly about testing in fakedata, particularly because of my poor knowledge of the Go language and its testing idioms, and my resulting inability to propose a concrete, coded solution to the testing problem.

@gmile
Contributor Author

gmile commented Aug 23, 2017

@lucapette what do you think can be done about #61? What tests should I cover it with, if any?

@lucapette
Owner

@gmile sorry for getting back to you so late! It was a busy week.

I'm enjoying this conversation a lot, and I feel we have a similar way of approaching tests. Your suggestion is indeed exactly what I would cover in the integration tests for #61. The short-term solution would be to use range Loop 1 1 so that the output is stable. We already use the same trick for the rest of the integration tests. While on the surface they don't seem so helpful, they do test the full produced binary, which guarantees "things are glued together" correctly. I like that aspect a lot.

While reviewing this issue, and your arguments, I started considering that we should indeed make the "unit test level" at least possible (using the pluggable randomness you're talking about) so that, in the long term, we can cover all the aspects of a specific feature. The result would be:

  • small unit tests for "how generator X works"
  • integrations tests for "does this feature really have this API?"

I like that, but I think we shouldn't tie it to #61, so I suggest I create issues/milestones for the testing infrastructure of fakedata.

About your question on what specifically to do with #61, I would add the two tests you suggested at the integration level. It should be enough to extend this table. If you don't feel comfortable doing that, don't worry about it. Just say so and I'll merge #61 and take care of the tests on my own!

Thank you very much for everything (the conversation was very stimulating, and the contribution in terms of code and food for thought was really awesome!)
