Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip BWC tests in checkPart1 and checkPart2 #38730

Merged
merged 4 commits into from
Feb 14, 2019

Conversation

alpar-t
Copy link
Contributor

@alpar-t alpar-t commented Feb 11, 2019

This implies that we will add bwcTestsSamples as a PR check.

For reference elasticsearch-ci/1 takes ~92 minutes and elasticsearch-ci 120 minutes right now. These checks run in parallel, so it still takes ~2 hours for PR checks to run.

By splitting out bwc Tests in their own tests, the expected times will be:
~70 min for /1, 87 min for 2 and 55m for bwc.
So the overall check time would come down by ~ 3/4 of an hour.
I also believe that splitting along these lines adds additional value.

`one can use -Dtests.bwc.enabled=false` to disable BWC tests.

This is usefull for being able to break out BWC tests into their own PR
check.
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@alpar-t alpar-t requested a review from mark-vieira February 11, 2019 14:59
Copy link
Contributor

@mark-vieira mark-vieira left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For reference elasticsearch-ci/1 takes ~92 minutes and elasticsearch-ci 120 minutes right now. These checks run in parallel, so it still takes ~2 hours for PR checks to run.

Yikes.

This is useful for being able to break out BWC tests into their own PR
check.

How so? This PR seems to add the ability to disable BWC tests via property. How does this help facilitate being able to run only BWC tests?

I also believe that splitting along these lines adds additional value.

I suspect there are a lot of dimensions still available to chop up this build.

FWIW, it's generally best practice to limit things like magic build properties to influencing how the build does its work and now what work it does. There already exists a first-class construct for the latter, they're called tasks.

NOTE: When I say "best practice" that may very well be in conflict to agreed upon patterns and practices that exist here. At this point I'm just trying to provide some third-party insight.

I'm sure there's reasoning behind this, and perhaps we can discuss offline, but it looks like we have a load of projects with a bwcTest task, which acts as a lifecycle task. When bwc_tests_enabled is true we attach some dependencies to that lifecycle task. Why the additional complexity? In what scenario would a build run bwcTest but not want to run the BWC tests? I find this a bit confusing, and I suspect others do too.

So in summary, we already seem to have a set of tasks whose purpose is to run these tests bwcTest. We then have some magic flags that make those tasks effectively do nothing. Why both?

@alpar-t
Copy link
Contributor Author

alpar-t commented Feb 12, 2019

Thanks for the review @mark-vieira ! Those are valid points.
Let me provide some context.
We want to have check run all our tests as a rule, so it's the single thing one has to call.
We don't want to have lots of properties to change the build behavior, I agree we should reduce these and they need to be reserved for advanced uses such as debugging etc,
There are some exceptions to our check rule.
Backwards comparability tests are one of them ( bwc for short ).
These tests assert that backwards comparability is preserved, they do things like upgrade from each compatible version to the current one, this can take an awful lot of time when we are at a point in the development cycle that there are many of those, so we don't make all bwc tests part of check and have a dedicated CI job for them instead.
We do make a subset of bwc tests run as check, the task for it is called bwcTestSnapsots and it was added recently.
We call unreleased versions snapshots. The build checks out the tips of the unreleased versions to also run bwc checks against those and this is what gets ran as part of check.

We used to run check on PRs also, so we don't diverge from the rule, and it used to take 2 and a half hours. In interest of speeding it up we added the checkPart1 and checkPart2 tasks that just depend on the check from specific projects.
This is done so because we felt strongly about implementing these in a way which makes it really hard to have a test that runs as part of check but not as one of the Part versions.

The newly added parameter is meant to be used in CI to turn off bwc testing for checkPart1 and checkPart2 so these will run in a new check we add that runs bwcTestSample.

We could make a change that would automatically turn off bwc tests if one of those tasks are specified on the command line to avoid the use of the property to toggle this.
I initially taught that a property would be useful because there are some changes that go into multiple versions and would only pass in the bwc tests if we could push them at the same time. Since that's not possible we turn off the bwc tests, do the change in the older versions, wait for a good build and then re-enable them ( and we have a CI job that makes sure we don't forget to do so ). The risk there is that one could branch when the tests are disabled and the checks on the branch will skip bwc tests, so one could realize those are broken only after merge. I taught the property would help address this better I realize that's not really the case. I'll make a change to get rid of it.

@alpar-t alpar-t changed the title Allow use of property to disable BWC tests. Skip BWC tests in checkPart1 and checkPart2 Feb 12, 2019
@alpar-t
Copy link
Contributor Author

alpar-t commented Feb 12, 2019

@elasticmachine run elasticsearch-ci/default-distro

@mark-vieira
Copy link
Contributor

We want to have check run all our tests as a rule, so it's the single thing one has to call.

I assume realistically folks only run this for specific subprojects as running ./gradlew check locally would take a huge amount of time. Building on that, I assume it's not expected that folks run BWC tests locally pre-commit?

We do make a subset of bwc tests run as check, the task for it is called bwcTestSnapsots and it was added recently.

So if folks are running :some-project:check these tests will run. Again, as a matter of process, are folks regularly doing this locally?

We used to run check on PRs also, so we don't diverge from the rule, and it used to take 2 and a half hours. In interest of speeding it up we added the checkPart1 and checkPart2 tasks that just depend on the check from specific projects.

How are the BWC tests run now if check doesn't execute them? As far as I can tell, one must explicitly run ./gradlew bwcTest to run the full BWC test suite. Is there a CI job for this?

The newly added parameter is meant to be used in CI to turn off bwc testing for checkPart1 and checkPart2 so these will run in a new check we add that runs bwcTestSample.

We could make a change that would automatically turn off bwc tests if one of those tasks are specified on the command line to avoid the use of the property to toggle this.

I'm not sure why this is necessary. It very much seems like BWC tests are a special beast that we only want to run in explicit scenarios. Adding more magic to disable them in certain circumstances doesn't seem right. I think all the magic flag stuff should probably go and be replaced instead with a) tasks that invoke the specific set of tests we want for whatever the given scenario is (pre-commit check, CI, etc) and b) specific CI jobs configured to run those tasks.

The goal of "check runs everything" is of little value if we never actually run ./gradlew check with everything enabled. In fact, I see the build evolving in a much different way, where we have very specific test tasks, such that we can more effectively split up the build in CI. Implementing this splitting through special flags and logic that looks at requested task names is going to get out of hand really quick when we have more than a handful of these groups.

I think this is worth a larger discussion perhaps out of band. Let's make time to discuss this in the core-infra weekly tomorrow.

@@ -170,6 +170,17 @@ if (bwc_tests_enabled == false) {
println "See ${bwc_tests_disabled_issue}"
println "==========================================================="
}
if (project.gradle.startParameter.taskNames.find { it.startsWith("checkPart") } != null) {
Copy link
Contributor

@mark-vieira mark-vieira Feb 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of thing tends to be quite brittle. One major issue is that "tasks that will execute" != "tasks you asked to execute". For example, if I introduce a task checkAllParts which dependsOn checkPart1, checkPart1 then this breaks down.

If we don't want tasks to run as part of a build we should define new tasks with the correct dependencies. Introducing logic like this into the build makes it very difficult to reason about why a given task was executed (or not) during a particular build invocation. In the end we should strive for more specific lifecycle tasks, not more dynamic logic. If I don't want task foo to run when I run checkPart, then checkPart shouldn't depend on foo. If making this so means adding more tasks and tweaking the existing dependency graph then we should do that instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Splitting the build up is not (yet) something we want to do, so we wouldn't want to embed this into every project. In the end, setting the bwc tests to false does prevent the dependencies to be added to check,
I'm not overly concerned about tasks that would depend on these partial checks as these are done specifically for PR checks in CI, maybe we should even reflect that in the name of the task to make it clear. I would be much more concerned if it would be possible to accidentally run fewer tests, running more than one intended isn't as bad.

@alpar-t
Copy link
Contributor Author

alpar-t commented Feb 12, 2019

People do run check locally in their personal CI setups before opening PRs, and we do run it in most CI jobs. We have specific jobs for bwcTests that run trough all of them, but it's important to note again that some are part of check also. We resisted breaking up the build and tests in multiple life-cycles because more state means more complexity, and it's much harder to explain to contributors what sequence of tests they have to run in what order. That's also why checkPart1 and checkPart2 are dynamic, and more like an exception to the rule. We do have plans to better group tests of various types together to make implementing multiple PR checks easier, take a look at the testing conventions task. The PRs that added it also have some context. That hover is a disruptive effort that involves renaming classes and moving tests around that we must do slowly. I agree with the shortcomings of the approach that you highlight, we resisted initially breaking up the PR checks for the same reasons, but it found it to be working well without the shortcomings getting in the way. I'm not too worried about adding too many of these. Coming up with meaningful splits is not trivial.

@mark-vieira
Copy link
Contributor

mark-vieira commented Feb 12, 2019

We resisted breaking up the build and tests in multiple life-cycles because more state means more complexity, and it's much harder to explain to contributors what sequence of tests they have to run in what order.

These goals aren't mutually exclusive. We can have a partitioned build and simple lifecycle tasks that contributors run to do the full test suite. I certainly am not arguing against some kind of "test everything" task.

We do have plans to better group tests of various types together to make implementing multiple PR checks easier, take a look at the testing conventions task.

I have strong opinions on relying on naming conventions to convey test groups you may not want to hear ;)

I'm not too worried about adding too many of these. Coming up with meaningful splits is not trivial.

I'm not sure what's meant by "meaningful" splits. The goal should be to split the build in a way that makes sense execution wise. If we need to introduce logical groups on top of that to convey status to contributors that's something we can do. We shouldn't limit how we execute the build for best performance. There are abstractions we can introduce to aggregate build results.

To make things more concrete, let me propose a solution here:

  1. Running check runs all tests, and bwcTestSnapsots, essentially as it does today
  2. We introduce a task like quickCheck for all projects which is the same as check but does not depend on any BWC tests
  3. The full BWC test suite is run via bwcTest
  4. The existing checkPart1 and checkPart2 tasks then depend on quickCheck and a new CI job is added for running bwcTest
  5. Existing toggles and flags are removed

I am most definitely missing something here. If there are other scenarios I'm missing, we should aim to model them with new explicit tasks, rather than modifying the behavior of existing ones with toggles.

@alpar-t
Copy link
Contributor Author

alpar-t commented Feb 13, 2019

I'm not sure what's meant by "meaningful" splits. The goal should be to split the build in a way that makes sense execution wise. If we need to introduce logical groups on top of that to convey status to contributors that's something we can do. We shouldn't limit how we execute the build for best performance. There are abstractions we can introduce to aggregate build results.

I do mean logical. I'm not arguing against having splits that make sense execution wise underneath that, but we are not there yet. We wouldn't want to expose execution splits to developers, that's why doing just the logical ones first is easier because we get the if that failed this doesn't need to be ran logic for free.

To make things more concrete, let me propose a solution here:

  1. Running check runs all tests, and bwcTestSnapsots, essentially as it does today
  2. We introduce a task like quickCheck for all projects which is the same as check but does not depend on any BWC tests
  3. The full BWC test suite is run via bwcTest
  4. The existing checkPart1 and checkPart2 tasks then depend on quickCheck and a new CI job is added for running bwcTest
    We do have a CI job that runs becTests currently.
  5. Existing toggles and flags are removed

The quickCheck task would have to be added to all projects, that's what I am trying to avoid. My long term vision is that we would have tasks such as unitTest, integTest, internalClusterTest, bwcTest etc. And have testing conventions make sure all tests are within these. We can then have checkPart1 and checkPart2 depend on the specific
ones they need to, and these tasks will server as our execution level splits as some of them have test level parallelism and others will benefit from project level more. It will make it easy to implement things like don't run integ test if unit fails.
quickCheck is deceiving, as it doesn't tell you what it's doing, but sounds like a really appealing one to run.

I wouldn't want to wait to get that before we split out bwc tests, but I can assure you I view this as a temporary state. I share all your concerns and we will make sure to address them eventually.

I am most definitely missing something here. If there are other scenarios I'm missing, we should aim to model them with new explicit tasks, rather than modifying the behavior of existing ones with toggles.

@mark-vieira
Copy link
Contributor

As discussed today, 👍 acknowledging that this is a temporary measure until we agree on a better structure for splitting these PR checks.

@alpar-t alpar-t merged commit 8e85d13 into elastic:master Feb 14, 2019
@alpar-t alpar-t deleted the configurable-bwc-test-enabled branch February 14, 2019 15:19
alpar-t added a commit that referenced this pull request Feb 14, 2019
Don't run bwc tests for check part 1 and 2
alpar-t added a commit that referenced this pull request Feb 14, 2019
Don't run bwc tests for check part 1 and 2
alpar-t added a commit that referenced this pull request Feb 14, 2019
Don't run bwc tests for check part 1 and 2
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Feb 14, 2019
* elastic/master:
  Remove immediate operation retry after mapping update (elastic#38873)
  Remove mentioning of types from bulk API docs (elastic#38896)
  SQL: change JDBC setup URL in the documentation (elastic#38564)
  Skip BWC tests in checkPart1 and checkPart2 (elastic#38730)
  Enable silent FollowersCheckerTest (elastic#38851)
  Update TESTING.asciidoc with platform specific instructions (elastic#38802)
  Use consistent view of realms for authentication (elastic#38815)
  Stabilize RareClusterState (elastic#38671)
  Increase Timeout in UnicastZenPingTests (elastic#38893)
  Do not recommend installing vagrant-winrm elastic#38887
  _cat/indices with Security, hide names when wildcard (elastic#38824)
  SQL: fall back to using the field name for column label (elastic#38842)
  Fix LocalIndexFollowingIT#testRemoveRemoteConnection() test (elastic#38709)
  Remove joda time mentions in documentation (elastic#38720)
  Add enabled status for token and api key service (elastic#38687)
alpar-t added a commit that referenced this pull request Feb 15, 2019
Don't run bwc tests for check part 1 and 2
@mark-vieira mark-vieira added the Team:Delivery Meta label for Delivery team label Nov 11, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants