internal/testrunner/runners/pipeline: unmarshal test results using UseNumber #717
I understand your concerns, but I'm not sure if we need such precision here. I'd rather depend on the original JSON package implementation than introduce workarounds or enable non-standard marshaling mode.
Maybe add some test cases to this PR, so we can discuss it.
This uses the standard encoding/json package, using the method that was introduced to that package exactly for this purpose in go1.1. The change here is included to fix a real problem that can be seen in the related issue: the low bits of longs are incorrectly truncated. In cases where flags are stored in a long, this could result in regressions due to a need to gloss over this kind of behaviour (not to mention the confusion that seeing this kind of apparent data corruption causes during development).
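As a minimal sketch of the flag case mentioned above (not code from this PR), the following decodes a document whose `flags` field holds `1<<62 | 1`. The helper names `flagsViaFloat` and `flagsViaNumber` and the field name are hypothetical and exist only for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// flagsViaFloat decodes the hypothetical "flags" field the default way:
// numbers in an untyped map become float64.
func flagsViaFloat(doc []byte) (int64, error) {
	var m map[string]interface{}
	if err := json.Unmarshal(doc, &m); err != nil {
		return 0, err
	}
	f, ok := m["flags"].(float64)
	if !ok {
		return 0, fmt.Errorf("flags is %T, not float64", m["flags"])
	}
	return int64(f), nil
}

// flagsViaNumber decodes the same field with UseNumber, so the digits are
// kept as a json.Number and converted to int64 without passing through float64.
func flagsViaNumber(doc []byte) (int64, error) {
	dec := json.NewDecoder(bytes.NewReader(doc))
	dec.UseNumber()
	var m map[string]interface{}
	if err := dec.Decode(&m); err != nil {
		return 0, err
	}
	n, ok := m["flags"].(json.Number)
	if !ok {
		return 0, fmt.Errorf("flags is %T, not json.Number", m["flags"])
	}
	return n.Int64()
}

func main() {
	doc := []byte(`{"flags": 4611686018427387905}`) // 1<<62 | 1
	f, _ := flagsViaFloat(doc)
	n, _ := flagsViaNumber(doc)
	// Lowest flag bit after each decode path.
	fmt.Println(f&1, n&1) // prints "0 1"
}
```

The float64 path silently drops the lowest flag bit, while the json.Number path keeps it.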
@efd6 I agree with the need for this change if there are cases where required precision is lost, but I am also a bit concerned about the use of a non-default mode. I see this is required in several places, and we can easily forget about this if we refactor this code, or if we add something else there that needs unmarshaling.
Yes, I'm happy to add a test case. I'm wondering how to hook that in. I'll think about how to make sure regressions are caught and add one after the weekend. |
@jsoriano Apologies for the complexity of the testing code; the level of coupling here made this necessary and I figured adding test infra derived from the compiler suite would be better than a significant refactor. On the plus side, if additional tests are required, they are easy to add.
/test |
I wanted to avoid the involvement of ES in order to isolate the responsibilities; allowing ES to participate in the discussion moves the complexity that exists here elsewhere and makes it more opaque. Note that the complexity is really only a gutting of |
I understand that you preferred to remove Elasticsearch from the equation, but it's natural that Elasticsearch takes part in pipeline tests. If we don't have to overengineer the test framework and achieve similar results with what we have now (for example, an extra package added to What's your preference on this, @jsoriano? |
I'm not entirely sure that it's possible to express two of the tests here in the packages tests; from what I can see, it would be able to express the first test here The test code here could be made simpler at the cost of having to spin up the stack. If this were done, the actual |
Based on what you shared with us, I understood that it's easy to reproduce this case with pipeline tests and that's the kind of sample test package I suggest preparing. Could you please clarify where is the tricky part? Is it flaky or does it depend on the runtime Go environment? |
OK, if you are OK with modifying this so that you can conditionally express "should fail" (what this does) and "does not modify *-expected.json when run with -g" (what this does) then it is possible, though IMHO icky. The second of those tests is fine with doing a diff, though that then needs to be wrapped up in an xUnit document to capture failures, but the former needs special consideration since the failure is wanted, so the xUnit doc needs to be constructed to explain that a thing was wanted not to be but is (take a look at the failure stacktraces here to see how the testing proposed here could present that; not with go2xunit, since it does not correctly handle test output).
Yes, but that's not how the JUnit is supposed to work and patching like that seems to be a hack for me. Let me wear my code owner hat and propose a different approach to making this PR concise and not another test
The idea is to focus on delivering the improvement instead of exploring ad-hoc testing extensions to what we have here. Keeping it consistent help future maintainers with understanding how the codebase and not spending time on analyzing the |
I'm sorry, you've lost me here.
…eNumber By default, numbers are unmarshaled as float64 resulting in low bits truncation in longs that are above 53 bits wide, so use a decoder and UseNumber to ensure results are not corrupted.
The positive test doesn't test anything that's added here. I can guarantee this because the old code did not fail the positive test I have here.
Correct me if I'm wrong, but I understood that without this change, there might be problem with a particular pull request. I thought to copy those potentially affected/flaky/risky tests here to make sure that this PR solves the problem.
// jsonUnmarshalUsingNumber is a drop-in replacement for json.Unmarshal that
// does not default to unmarshaling numeric values to float64.
Please adjust this comment to be more verbose: what kind of numbers can lose precision, when we can expect it (samples). We talked about these cases in the PR, I see a value in adding them here.
Added.
if err != nil {
	return
}
if !reflect.DeepEqual(got, test.want) {
Hmm.. Isn't there any replacement for assert.DeepEqual in testify/assert, so we need to use reflection here?
assert.Equal calls reflect.DeepEqual in a roundabout way. I try to avoid testing frameworks as much as possible unless they add value.
Yes, in this case, it's about consistency with the rest of the codebase, so kindly please adjust to it.
I have changed the tests so that they are testing the concern of the change here, which is that the values are properly sent through an unmarshal/marshal round-trip. This also has the advantage that any kind of deep equality check becomes unnecessary. I hope this is OK. I have also added test cases that show exactly where the cutover in behaviour happens, with notes explaining how to check this, for the curious.
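A rough sketch of what such a round-trip check could look like, assuming a hypothetical helper `roundTrip` (this is not the test code from the PR). It decodes a bare JSON number into an untyped value, marshals it back, and compares the text on either side of the `1<<53` boundary.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// roundTrip unmarshals doc into an untyped value and marshals it back to text.
// With useNumber set, numbers are held as json.Number instead of float64.
func roundTrip(doc string, useNumber bool) (string, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	if useNumber {
		dec.UseNumber()
	}
	var v interface{}
	if err := dec.Decode(&v); err != nil {
		return "", err
	}
	out, err := json.Marshal(v)
	return string(out), err
}

func main() {
	for _, doc := range []string{
		"9007199254740992", // 1<<53, still exactly representable as float64
		"9007199254740993", // 1<<53 + 1, the first integer that is not
	} {
		plain, _ := roundTrip(doc, false)
		exact, _ := roundTrip(doc, true)
		fmt.Printf("%s: float64 gives %s, json.Number gives %s\n", doc, plain, exact)
	}
}
```

Because the comparison is made on the marshaled text rather than on decoded values, no deep equality check is involved.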
Those were tested in my previous unacceptable approach. They can not be tested in an automated way without making the changes that I made or suggested. Honestly, I would not have written 200 lines of testing infrastructure if I didn't think that it was necessary to address the concerns that were expressed. |
I'm afraid that writing tons of code without explaining the context doesn't make it easier to follow or to catch mistakes.
Well, so far you didn't provide any sample test packages as we asked at the beginning of this conversation, and also here.
The solution you provided before was partially mocking the elastic-package behavior to set specific conditions. You mentioned that it was necessary to provide infra code, but didn't show any practical, real use case where this bugfix PR solves the problem. I'd like to see in the Github description, fragments of a pipeline, package, event, etc. when the conversion can possibly go wrong and model it here. |
@efd6 sorry for jumping here quite late but is there any particular bug or behaviour that should be fixed thanks to this PR? I do not find anything in the original description. |
@jlind23 Yes, the link to elastic/integrations#2758 provides context. The relevant part is this:
An example of the issue can be seen here:
Note here that What this PR does is replace the calls to The case that provoked the change here is something that cannot be tested with the current testing infrastructure; I spent quite some time attempting to fit in a test that would have made use of the code that exists, but what we want to see is that a long > 1<<53 comes through the pipeline runner unscathed, and can then be written to disk as the original long value. Because all of the code for testing here depends on dynamically typed value comparison there is no way to ensure that the looser equality check of I hope that clarifies the issue. |
@efd6 thanks for providing more context. I see that some TODOs have been added for tests, and that there have been some discussions about additional test code, but I am not sure about the code being discussed, it seems that there were some force pushes. Please don't force push to ongoing reviews 🙏 it is difficult to keep track of the conversation. I still think that we need some test here to avoid regressions on the fixed use case. I would suggest to black-box test the whole
remains, err := io.ReadAll(dec.Buffered())
if err != nil {
	return err
}
for _, b := range remains {
	if b > ' ' || (b != ' ' && b != '\t' && b != '\r' && b != '\n') {
		// Mimic encoding/json error for this case, but without rigmarole.
		return fmt.Errorf("invalid character %q after top-level value", b)
Wouldn't it be enough to check whether there is something else there with dec.More()?
-remains, err := io.ReadAll(dec.Buffered())
-if err != nil {
-	return err
-}
-for _, b := range remains {
-	if b > ' ' || (b != ' ' && b != '\t' && b != '\r' && b != '\n') {
-		// Mimic encoding/json error for this case, but without rigmarole.
-		return fmt.Errorf("invalid character %q after top-level value", b)
+if dec.More() {
+	return fmt.Errorf("unexpected characters found after unmarshaling value")
+}
No, this won't work. Some trailing bytes are acceptable, others are not; a JSON stream can have arbitrarily large quantities of whitespace outside literals and this extends to trailing and leading data.
dec.More() ignores whitespace; are there other acceptable bytes? The code I propose passes the test cases added in this PR (after adapting the error strings).
Or something like this could be done to keep this error message:
_, err = dec.Token()
if err != nil && !errors.Is(err, io.EOF) {
	r, ok := dec.Buffered().(io.ByteReader)
	if !ok {
		return err
	}
	b, err := r.ReadByte()
	if err != nil {
		return err
	}
	// Mimic encoding/json error for this case, but without rigmarole.
	return fmt.Errorf("invalid character %q after top-level value", b)
}
dec.More() ignores whitespaces
Nice. I didn't notice that. To retain the error parity we'll need something like the second option or what I have. I'm not convinced that the assertion to io.ByteReader is simpler or lighter though. We expect that in the happy case the remaining data will be ~0, and in the unhappy case it will still be small (it must have fit in memory), but we cannot guarantee that dec.Buffered() will return an io.ByteReader, so we may miss cases.
If we are happy to lose error parity, then dec.More() would be fine.
What I would like to avoid is doing parsing here (the if b > ' ' || (b != ' ' && b != '\t' && b != '\r' && b != '\n') line), and to rely on the decoder for this instead.
Regarding the error message, I don't have a strong opinion, but at this level I think that it is ok if we don't show the found character.
OK. This will complicate the test code. I will do it next week.
No need to complicate test code, actually I think that we could simplify it, and only check that it returns an error when expected, without checking the error message.
No worries. You can see the changes that were being discussed (with the same shas) at #721 where I was using them to test an alternative to go2xunit (you can ignore the changes to the root Makefile and the added j2xunit.go file — they are not relevant here). The second and third options that you put forward, I think, are essentially what I implemented. Please correct me if I'm wrong. I don't think the approach of manually watching for contamination of the repo (what I understand you are suggesting in the first option) is a good idea. Without complaint from CI I can imagine this just being accidentally waved through with non-zero probability. It could be watched for by a test as I suggested above, but that seems ugly. |
The tests I see in this PR focus on testing I hadn't paid much of attention to the related tests in #721, sorry. This is more aligned to what I am proposing in the second option, yes. But it looks to me that
I don't think we would need a negative example here. If we add a new package with a pipeline that generates big integers we should be able to reproduce the situation. If some code change breaks the test for this package, it would call attention in the PR introducing the change. I think this would be enough to avoid regressions in this use case, without adding complexity to tests. |
Yeah, this is not quite true. It tests the loading and verification steps performed by (misclick)
Yes, I was unhappy with the need to mock it substantially, but it was necessary due to coupling as I said. This could be reduced by taking closure arguments to construct the parts that are not being tested here, but that is moot given the later parts of your comment.
I need to think about this. |
scripts/test-check-packages.sh
if [ "$(basename $d)" == "long_integers" ]; then
  # Ensure that any change in unmarshaling behaviour is noticed; this will result in a dirty
  # git state on exit if an inappropriate use of encoding/json.Unmarshal has been made.
  elastic-package test -v -g --report-format xUnit --report-output file --defer-cleanup 1s --test-coverage
Is the -g necessary here? Shouldn't it be caught due to differences compared to the expected.json file?
+1, I think this if is not needed; elastic-package test should fail if there are regressions in unmarshalling logic. If it doesn't fail, this is a different bug to be addressed in a different PR.
No, it won't be caught. This was the reason for the additional code previously. The issue is that it is not possible for a machine to measure the accuracy of something that is more accurate than it is, and this is what that would be asking it to do; if the code is regressed, then it becomes a less accurate machine and so becomes inadequate to test that the test cases (which are more accurate) are satisfied. This is all theoretically sound, but can be practically confirmed by running the tests at 9b93a10 with this change:
diff --git a/internal/testrunner/runners/pipeline/test_result.go b/internal/testrunner/runners/pipeline/test_result.go
index d67853c..38916f8 100644
--- a/internal/testrunner/runners/pipeline/test_result.go
+++ b/internal/testrunner/runners/pipeline/test_result.go
@@ -228,6 +228,7 @@ func unmarshalTestResult(body []byte) (*testResult, error) {
 // prevent low bit truncation of values greater than 1<<53.
 // See https://golang.org/cl/6202068 for details.
 func jsonUnmarshalUsingNumber(data []byte, v interface{}) error {
+	return json.Unmarshal(data, v)
 	dec := json.NewDecoder(bytes.NewReader(data))
 	dec.UseNumber()
 	err := dec.Decode(v)
They will pass.
Alternatively, this also demonstrates the situation: https://play.golang.com/p/CMsFSSjcd0f.
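The argument can also be sketched locally. This is an illustration under stated assumptions, not the contents of the playground link: the helpers `decode` and `sameAfterDecode` and the field name `n` are hypothetical. It compares an expected document with a result whose long has lost its lowest bit, first decoding both sides as float64 and then decoding both with UseNumber.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
	"strings"
)

// decode parses doc into an untyped value, optionally keeping numbers as json.Number.
func decode(doc string, useNumber bool) (interface{}, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	if useNumber {
		dec.UseNumber()
	}
	var v interface{}
	err := dec.Decode(&v)
	return v, err
}

// sameAfterDecode reports whether two documents compare equal once both have
// been decoded the same way, which is how a dynamically typed check sees them.
func sameAfterDecode(want, got string, useNumber bool) (bool, error) {
	w, err := decode(want, useNumber)
	if err != nil {
		return false, err
	}
	g, err := decode(got, useNumber)
	if err != nil {
		return false, err
	}
	return reflect.DeepEqual(w, g), nil
}

func main() {
	want := `{"n": 9007199254740993}` // expected value, 1<<53 + 1
	got := `{"n": 9007199254740992}`  // hypothetical truncated result

	viaFloat, _ := sameAfterDecode(want, got, false)
	viaNumber, _ := sameAfterDecode(want, got, true)
	fmt.Println("equal when decoded as float64:", viaFloat)
	fmt.Println("equal when decoded with UseNumber:", viaNumber)
}
```

When both sides pass through float64, the expected value is rounded in the same way as the corrupted one, so the comparison cannot tell them apart.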
Hmm.. to be honest, I'd rather accept that risk than introduce a sneaky workaround. It sounds to me like an edge case.
Should I remove the tests? They essentially don't do anything without this.
Let me challenge this. In this case, we're talking about the nature of the Go stdlib, which unfortunately has this behavior. It means that we need a test to detect a possible regression. I'm wondering if a unit test which calls jsonUnmarshalUsingNumber, then marshals back to []byte using the right json.Marshaller, can detect this issue. WDYT?
EDIT:
https://play.golang.com/p/8xH6zmearh4 (it's similar/same? to what you have covered with your tests).
The question that has to be asked is what does that test? Depending where it's put that could adequately test for all regressions, but as the code stands you will end up mocking essentially all of (*runner).run in order to achieve that.
There are many moving parts here, breaking some of them would be detected, but breaking others will not. The fact that breaking some of them will be is a good reason to keep the package here. I'm at the end of an extremely long day, so I don't have the cognitive capacity to sort out which are visible and which are not right now.
Ok, I see, you make a good point with the comparison of floats.
Ok, I think this is going in the good direction, but I think we don't need to modify the test script.
@efd6 I have discussed this PR with Marcin offline; we are fine with going ahead with it so that we unblock this issue. We will follow up with unit tests for the runner.
The only changes we would like to see before merging the PR are:
- Please revert the changes in test-check-packages.sh, but leave the sample package here. As you mentioned, this won't be enough to detect regressions, but it illustrates the issue. We will try to improve detection of this kind of issue in follow-ups.
- Remove the "parsing" done when checking for invalid syntax after the message; rely on the decoder for that, using More() and/or Token().
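One way the second request could be met is sketched below. This is an assumption about the shape of the helper, not the code that was merged: `unmarshalUsingNumber` is a hypothetical stand-in for the PR's function, and the error text is invented. It leaves all whitespace handling to the decoder by asking for one more token after the top-level value.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"io"
)

// unmarshalUsingNumber is a hypothetical stand-in for the helper under review.
// It decodes one top-level value with UseNumber and rejects trailing data.
func unmarshalUsingNumber(data []byte, v interface{}) error {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.UseNumber()
	if err := dec.Decode(v); err != nil {
		return err
	}
	// Token skips whitespace itself and returns io.EOF when nothing else is
	// left, so no byte-level checks are needed here.
	if _, err := dec.Token(); !errors.Is(err, io.EOF) {
		return errors.New("unexpected data after top-level value")
	}
	return nil
}

func main() {
	for _, doc := range []string{
		"{\"n\": 9007199254740993} \n", // trailing whitespace is accepted
		`{"n": 1} x`,                    // trailing garbage is rejected
		`{"n": 1}}`,                     // a stray closing brace is rejected
	} {
		var v map[string]interface{}
		err := unmarshalUsingNumber([]byte(doc), &v)
		fmt.Printf("%q: value=%v err=%v\n", doc, v, err)
	}
}
```

Token is used on its own here because More is documented as reporting whether there is another element in the current array or object, and as far as I can tell it returns false when the next byte is `]` or `}`, so a stray closing bracket could slip past a More-only check.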
This reverts commit 86af80f.
💚 Build Succeeded
By default, numbers are unmarshaled as float64, resulting in low bits truncation in longs that are above 53 bits wide, so use a decoder and UseNumber to ensure results are not corrupted.
See behaviour before and after.
Relates elastic/integrations#2758.