Regression versus failure #834

Closed · lionel- opened this issue Dec 28, 2018 · 9 comments · Fixed by #906
Labels: feature (a feature request or enhancement), wip (work in progress)

Comments

lionel- (Member) commented Dec 28, 2018

This is about better support for tests that should not cause R CMD check failures because they test output that is not entirely under the developer's control. For instance, rlang has a lot of expect_known_output() tests for backtraces. These outputs sometimes change across R versions, e.g. because eval() produces a different backtrace. Another case is vdiffr tests, because the appearance of plots is sensitive to upstream code such as the computation of margins or spacing between elements.

Such tests should only fail on platforms where the CI envvar is set (Travis, Appveyor) or where the NOT_CRAN envvar is set (tests run locally), not on CRAN. This allows the developer to monitor and assess regressions during development.

@hadley suggested calling these tests regression tests. They could be implemented with a new expectation class expectation_regression.
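To make this concrete, such a test currently looks roughly like this (print_backtrace() and the reference file are placeholders, not actual rlang or testthat objects):

test_that("backtraces print as expected", {
  # The recorded output can legitimately drift across R versions, so a
  # mismatch here is a regression to review rather than necessarily a bug.
  expect_known_output(print_backtrace(), "backtrace.txt")
})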

jimhester (Member) commented Dec 28, 2018

I don't really see the advantage of what you describe over using skip_on_cran() for these blocks.
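With the existing helper, the placeholder test above would simply become:

test_that("backtraces print as expected", {
  # skip_on_cran() skips the whole block unless NOT_CRAN is set, so CRAN
  # never sees failures from drifting output.
  skip_on_cran()
  expect_known_output(print_backtrace(), "backtrace.txt")
})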

lionel- (Member, Author) commented Dec 28, 2018

This is what I planned to do initially, but Hadley suggested using a special expectation. Perhaps the point is to give more structure to this class of expectation?

Is NOT_CRAN set on Appveyor as well (I'm assuming it is set on Travis since you suggest it)?

hadley (Member) commented Dec 28, 2018

I think there's a class of expectations that should be ignored on CRAN by default (i.e. regression tests). I don't have any strong feelings about how it should be implemented.

lionel- (Member, Author) commented Dec 28, 2018

If these tests are mostly about outputs, it would be useful to signal a structured expectation with before/after fields. We could then provide tools to review and validate regressions, similar to what we have in vdiffr.
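A rough sketch of such a structured condition (hypothetical constructor, not an existing testthat API):

# Carries the expected ("before") and observed ("after") output so that
# review tooling could later display a diff, as vdiffr does for plots.
expectation_regression <- function(message, before, after) {
  structure(
    list(message = message, before = before, after = after),
    class = c("expectation_regression", "expectation", "condition")
  )
}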

jimhester (Member) commented:
Would it make sense to use the standard testing machinery, but name the test files differently? e.g. reg-xyz.R, and these tests would only get run when NOT_CRAN is set, or when some other regression-specific switch is enabled?
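One way to approximate this today from tests/testthat.R (a sketch only: the regressions/ directory and the mypackage name are hypothetical, and testthat still only discovers files named test*.R inside that directory):

library(testthat)
library(mypackage)

# The regular suite always runs, including on CRAN.
test_check("mypackage")

# The regression-only suite runs only when NOT_CRAN is set.
if (nzchar(Sys.getenv("NOT_CRAN"))) {
  test_dir("regressions", stop_on_failure = TRUE)
}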

lionel- (Member, Author) commented Dec 28, 2018

Interesting idea. Advantages:

  • It works with all expectations.
  • It would be easy to move a test from one file to the other to change its semantics.

Disadvantages:

  • Most users would probably just keep using the test- files, even for things that should clearly be in reg- files.
  • This would multiply the number of files in the test folders. That could be a problem in ggplot2, which already has many files, though I guess that's an extreme case. It could also be mitigated by supporting setup, helper, teardown, and reg folders.

About NOT_CRAN, I think a different skip that checks for NOT_CRAN || CI would be better, because r-appveyor is documented not to set NOT_CRAN by default.
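Such a skip could look roughly like this (hypothetical name, simplified envvar handling):

# Skip unless we are in a development environment, i.e. when neither
# NOT_CRAN nor CI is set (as on CRAN itself).
skip_unless_dev <- function() {
  if (nzchar(Sys.getenv("NOT_CRAN")) || nzchar(Sys.getenv("CI"))) {
    return(invisible(TRUE))
  }
  testthat::skip("Regression tests only run when NOT_CRAN or CI is set")
}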

lionel- (Member, Author) commented Dec 30, 2018

How about using a special block instead of a different file?

see_that("this thing looks like this", {
  expect_true(...)

  expect_known_output(...)

  vdiffr::expect_doppelganger(...)
})

hadley (Member) commented Dec 31, 2018

Oh yeah, I like that idea too. Maybe check_that()?

lionel- (Member, Author) commented Jan 6, 2019

skip_on_cran() (or similar) might not be right because we still need the code to be evaluated. If the code throws an unexpected error, this should be a hard failure.

The motivation for this is that while we cannot always control how an exact output evolves over time, the code generating the output should be robust and always work.
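A rough sketch of those semantics (a hypothetical check_that() along the lines discussed above, not the eventual implementation; envvar handling simplified):

check_that <- function(desc, code) {
  on_dev <- nzchar(Sys.getenv("NOT_CRAN")) || nzchar(Sys.getenv("CI"))
  if (on_dev) {
    # During development the block behaves exactly like test_that().
    test_that(desc, code)
  } else {
    # Elsewhere (e.g. on CRAN) the code is still evaluated, so genuine
    # errors fail hard, but expectation failures are downgraded to skips.
    test_that(desc, {
      tryCatch(
        code,
        expectation_failure = function(cnd) skip("Regression ignored outside development")
      )
    })
  }
}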

hadley added the feature label on Mar 28, 2019
hadley added a commit that referenced this issue on Jul 18, 2019
hadley added the wip label on Jul 18, 2019
hadley added a commit that referenced this issue on Jul 19, 2019