Improve End-to-End Test Management and Upgrade/Downgrade Testing #16365

systay · 2024-07-11T06:29:24Z

Introduction

Currently, our process for handling bug fixes and end-to-end tests faces several challenges, particularly with version-specific test skipping and backport management. Although GitHub contains all necessary information about issues, fixes, and backports, our existing system does not efficiently leverage this data, leading to persistent test skipping issues and inadequate upgrade/downgrade testing coverage. This proposal outlines the current issues and suggests improvements to enhance our testing process.

Current System and Identified Issues

In the current system, bug fixes are created against the main branch. Each bug fix should include at least one end-to-end test that should fail without the fix and pass with it. For upgrade/downgrade testing, tests must stop running if the vtgate component in the system is older and doesn't have the fix. This is managed using the utils.SkipIfBinaryIsBelowVersion(t, 21, "vtgate") call in Go. When a bug fix PR is deemed important, a backport label is added, and the bot backports it to all relevant branches.

However, there are three main issues with this approach:

First, the backported PR still contains the SkipIfBinaryIsBelowVersion call, which now references the incorrect version, causing inaccurate skipping of tests. Second, both the main and other branches will continue to skip the test because the SkipIfBinaryIsBelowVersion remains, and no mechanism updates the branches once fixes are merged. Third, upgrade/downgrade testing only tests against the latest release of the branch rather than the current release branch, potentially leading to unexpected issues post-release. In other words - we only test against the release tags, not against the release branches.

To improve the system, consider the following preferable solutions:

End-to-end tests should track which issues they are fixing, potentially through code annotations. By leveraging this tracking information together with the existing data in GitHub about issues, fixes, and backports, we can accurately calculate which tests need to be run. This approach would eliminate the need for any branch to keep track other branches a fix has been ported to, thereby resolving the issues of incorrect test skipping, persistent skips after merges, and inadequate upgrade/downgrade testing coverage.

By implementing these changes, the issues of incorrect test skipping, persistent skips after merges, and inadequate upgrade/downgrade testing coverage can be resolved, leading to a more reliable and maintainable testing process.

The text was updated successfully, but these errors were encountered:

shlomi-noach · 2024-07-11T10:09:34Z

This sounds good. The one thing I'd say is that we need to validate that both the linter (Extracting the annotation) as well as GitHub API - are producing the right result (or some result). So there needs to eb a test that validates that we got something.

frouioui · 2024-07-11T20:06:57Z

I like the idea of code annotations, it can even make a test more maintainable as we will, in some ways, link it to a GH Issue/PR.

One thing to take into consideration is that the test will now need to issue, at least, a GET command to GH's API or something else, which can slow down or impact the experience for some users.

Something worth noting, in GitHub Actions, the GitHub API rate limit is set to 1000 requests per hour across all actions. It will be hard to reach that number if only a small amount of tests execute API calls, even in an hour, but something to keep in mind. Link to GH docs here.

deepthi · 2024-07-17T21:26:44Z

There seem to be 2 separate issues

We are skipping some tests in upgrade/downgrade testing depending on the version
Upgrade/downgrade testing is based on the tag of older releases versus the latest commit on an older release branch.

For the first issue

is it mostly vtgate tests where we have the problems with skipping tests?
why isn't it enough to make sure to run a version of the test that matches the vtgate version? For example, if we are running a test on release-20.0 that is testing downgrading to v19, why should we run the release-20.0 version of a test, which then requires us to conditionalize new tests? why not run the release-19.0 version of the test?
so far this example only looks at vtgate. we probably have similar problems with other components that also need to be solved. I'm offering up this as an idea to be explored because it might enable us to remove a large number of if conditions from test code. we'll need to see how applicable it is to other components like vtctld.

frouioui · 2024-07-22T23:16:21Z

First Issue

I think it would be fine to run a version of the test that matches the vtgate version. For instance, when downgrading to vtgate N-1 we would checkout to the SHA we are downgrading to and run the tests located on that tag. As stated in #7344 the purpose of the query-serving upgrade/downgrade tests is primarily to make sure the cluster can operate correctly as quoted below, not necessarily that every query have to be compatible between N-1 and N. Which means that by running the same test version as the vtgate version we run would still respect this criteria.

Ability for a cluster to operate correctly in the midst of an upgrade / downgrade. E.g. vtctld and vttablets are still running but vtgate is restarted with the new version.

Applying this new strategy where we checkout to the proper SHA would save us from having to use the Skip methods. And from what I am seeing, we mostly skip tests in vtgate's end-to-end tests (both queries and schema tracking). With the exception of the PRS and Backup tests where we have different assertions based on the version of vttablet and vtctl, and these different assestion can even be removed today as they are based on v16 vs v17.

I am okay with this approach.

Second Issue

For the second issue, from what I am seeing we originally always used the previous release tag, even in the very first PR that added non-automated upgrade/downgrade testing: https://github.com/vitessio/vitess/pull/7294/files#diff-48dcda0e0b100dde692989e67a1ad23e97c3da464035698f53bd60528cf1420c. Which was then carried on by #8471, followed by #9300 and then #9745. We talked about this issue in August 2022 in an offline discussion, snippet below.

- Pros of running with patch release as previous release is that this is the most common use case and we want to exercise that use case

- Cons are that if we break something on release-14.0 branch the downgrade test on main won’t catch it

- Manan is proposing that we also add an upgrade test on 13.0 which tests upgrading to 14.0. That way if we add something on 13.0 that breaks upgrade we’ll find it. Similarly for 14.0 to main

We have implemented what manan proposed, but it still does not solve the issue where we don't know if the next version on release-20.0 will be compatible with the next version on main. I agree it is debatable as we should not introduce breaking changes on release branches.

systay added Type: RFC Request For Comment Type: Testing and removed Type: Testing labels Jul 11, 2024

frouioui mentioned this issue Jul 16, 2024

Use release branches instead of git tags for upgrade/downgrade #16414

Closed

frouioui mentioned this issue Jul 30, 2024

Improve the queries upgrade/downgrade CI workflow by using same test code version as binary #16494

Merged

frouioui closed this as completed in #16494 Jul 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve End-to-End Test Management and Upgrade/Downgrade Testing #16365

Improve End-to-End Test Management and Upgrade/Downgrade Testing #16365

systay commented Jul 11, 2024 •

edited

Loading

shlomi-noach commented Jul 11, 2024

frouioui commented Jul 11, 2024

deepthi commented Jul 17, 2024 •

edited

Loading

frouioui commented Jul 22, 2024

Improve End-to-End Test Management and Upgrade/Downgrade Testing #16365

Improve End-to-End Test Management and Upgrade/Downgrade Testing #16365

Comments

systay commented Jul 11, 2024 • edited Loading

Introduction

Current System and Identified Issues

shlomi-noach commented Jul 11, 2024

frouioui commented Jul 11, 2024

deepthi commented Jul 17, 2024 • edited Loading

frouioui commented Jul 22, 2024

First Issue

Second Issue

systay commented Jul 11, 2024 •

edited

Loading

deepthi commented Jul 17, 2024 •

edited

Loading