backupccl: write new backup roachtest driver #99787

Closed
msbutler opened this issue Mar 28, 2023 · 1 comment · Fixed by #102821
Labels
A-disaster-recovery, C-enhancement (Solution expected to add code/behavior + preserve backward-compat; pg compat issues are exception), T-disaster-recovery

Comments


msbutler commented Mar 28, 2023

#94143 introduced a new roachtest driver for restore that makes it easy to add new restore roachtests for a given fixture and cluster topology, increasing our overall coverage. We ought to create a similar framework for our backup roachtests to increase test coverage as well. In addition, a new backup roachtest driver could significantly reduce the time it takes to create a new restore roachtest fixture: instead of using this clunky set of bash scripts, we could simply run a backup roachtest that creates the fixture for us.

Jira issue: CRDB-26080

msbutler added the C-enhancement and T-disaster-recovery labels Mar 28, 2023
msbutler self-assigned this Mar 28, 2023

blathers-crl bot commented Mar 28, 2023

cc @cockroachdb/disaster-recovery

msbutler added a commit to msbutler/cockroach that referenced this issue Mar 28, 2023
Currently, the tpce workload is used in 2 roachtests in tpce.go. This patch
makes it easier for other roachtests to init and run a tpce workload by
creating a tpce driver that abstracts away the actual cmds required to init and
run a tpce workload. This will make it easier to address cockroachdb#99787, for example.

Informs cockroachdb#99787

Release note: none
msbutler added a commit to msbutler/cockroach that referenced this issue Apr 6, 2023
craig bot pushed a commit that referenced this issue Apr 6, 2023
99810: roachtest: create simple tpce workload driver r=renatolabs a=msbutler

Currently, the tpce workload is used in 2 roachtests in tpce.go. This patch makes it easier for other roachtests to init and run a tpce workload by creating a tpce driver that abstracts away the actual cmds required to init and run a tpce workload. This will make it easier to address #99787, for example.

Informs #99787

Release note: none

100784: release: use in-source version for Red Hat publishing r=jlinder a=rail

Previously, we passed the version we want to publish via TeamCity. With the version available in `pkg/build/version.txt`, we can start using it to simplify our automation.

This PR copies the existing publishing script and adjusts the way we read the version.

Epic: none
Release note: None

Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Rail Aliiev <[email protected]>
msbutler added a commit to msbutler/cockroach that referenced this issue May 5, 2023
Previously, we either created backup fixtures by manually creating and running
workloads on roachprod or using unwieldy bash scripts. This patch introduces a
framework that makes it easy to generate a backup fixture via the roachtest
api. Once the fixture writer specifies a foreground workload (e.g. tpce) and a
scheduled backup specification, a single `roachtest run` invocation will create
the fixture in a cloud bucket that can be easily fetched by restore roachtests.

The fixture creator can initialize a foreground workload using a `workload
init` cmd or by restoring from an old fixture.

Note that the vast majority of the test specifications are "skipped" so they
are not run in the nightly roachtest suite. Creating large fixtures is
expensive, and they only need to be recreated once per major release. This
patch creates 5 new "roachtests":
- backupFixture/tpce/15GB/aws [disaster-recovery]
- backupFixture/tpce/32TB/aws [disaster-recovery] (skipped)
- backupFixture/tpce/400GB/aws [disaster-recovery] (skipped)
- backupFixture/tpce/400GB/gce [disaster-recovery] (skipped)
- backupFixture/tpce/8TB/aws [disaster-recovery] (skipped)

In the future, this framework should be extended to make it easier to write
backup-restore roundtrip tests as well.

Fixes cockroachdb#99787

Release note: None
msbutler added a commit to msbutler/cockroach that referenced this issue May 19, 2023
Previously, we either created backup fixtures by manually creating and running
workloads on roachprod or using unwieldy bash scripts. This patch introduces a
framework that makes it easy to generate a backup fixture via the roachtest
api. Once the fixture writer specifies a foreground workload (e.g. tpce) and a
scheduled backup specification, a single `roachtest run` invocation will create
the fixture in a cloud bucket that can be easily fetched by restore roachtests.

The fixture creator can initialize a foreground workload using a `workload
init` cmd or by restoring from an old fixture.

Note that the vast majority of the test specifications are "skipped" so they
are not run in the nightly roachtest suite. Creating large fixtures is
expensive, and they only need to be recreated once per major release. This
patch creates 5 new "roachtests":
- backupFixture/tpce/15GB/aws [disaster-recovery]
- backupFixture/tpce/32TB/aws [disaster-recovery] (skipped)
- backupFixture/tpce/400GB/aws [disaster-recovery] (skipped)
- backupFixture/tpce/400GB/gce [disaster-recovery] (skipped)
- backupFixture/tpce/8TB/aws [disaster-recovery] (skipped)

In the future, this framework should be extended to make it easier to write
backup-restore roundtrip tests as well.

Fixes cockroachdb#99787

Release note: None
craig bot pushed a commit that referenced this issue Jun 14, 2023
102821: backupccl: introduce backup fixture generator framework r=rhu713,renatolabs a=msbutler

Previously, we either created backup fixtures by manually creating and running workloads on roachprod or using unwieldy bash scripts. This patch introduces a framework that makes it easy to generate a backup fixture via the roachtest api. Once the fixture writer specifies a foreground workload (e.g. tpce) and a scheduled backup specification, a single `roachtest run` invocation will create the fixture in a cloud bucket that can be easily fetched by restore roachtests.

The fixture creator can initialize a foreground workload using a `workload init` cmd or by restoring from an old fixture.
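
As a rough illustration of what a "scheduled backup specification" might boil down to, the sketch below renders a `CREATE SCHEDULE ... FOR BACKUP` statement from a small struct. The type `scheduledBackupSpec`, its fields, the helper `quoteLit`, and the bucket URI are all hypothetical names chosen for this example; they are not taken from the patch, which may structure this quite differently.

```go
package main

import (
	"fmt"
	"strings"
)

// scheduledBackupSpec is a hypothetical description of the backup schedule a
// fixture run would install. Field names are illustrative, not from the patch.
type scheduledBackupSpec struct {
	label           string // schedule label, assumed to be a plain identifier
	collectionURI   string // cloud bucket URI the backup collection is written to
	incrementalCron string // crontab (or alias) for incremental backups
	fullCron        string // crontab (or alias) for full backups
}

// quoteLit wraps s in single quotes, doubling any embedded single quote.
func quoteLit(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// createScheduleStmt renders the SQL statement that would create the schedule.
func (s scheduledBackupSpec) createScheduleStmt() string {
	return fmt.Sprintf(
		"CREATE SCHEDULE %s FOR BACKUP INTO %s RECURRING %s FULL BACKUP %s WITH SCHEDULE OPTIONS first_run = 'now'",
		s.label, quoteLit(s.collectionURI), quoteLit(s.incrementalCron), quoteLit(s.fullCron))
}

func main() {
	// The bucket name below is a placeholder, not a real fixture location.
	spec := scheduledBackupSpec{
		label:           "tpce_fixture",
		collectionURI:   "gs://example-bucket/tpce-15GB?AUTH=implicit",
		incrementalCron: "@daily",
		fullCron:        "@weekly",
	}
	fmt.Println(spec.createScheduleStmt())
}
```

Keeping the schedule as data rather than as a hand-written SQL string would let each fixture size choose its own backup cadence, which is one plausible reason to model it as a spec.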

Note that the vast majority of the test specifications are "skipped" so they are not run in the nightly roachtest suite. Creating large fixtures is expensive, and they only need to be recreated once per major release. This patch creates 5 new "roachtests":
- backupFixture/tpce/15GB/aws [disaster-recovery]
- backupFixture/tpce/32TB/aws [disaster-recovery] (skipped)
- backupFixture/tpce/400GB/aws [disaster-recovery] (skipped)
- backupFixture/tpce/400GB/gce [disaster-recovery] (skipped)
- backupFixture/tpce/8TB/aws [disaster-recovery] (skipped)
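
The test names listed above appear to follow a `backupFixture/<workload>/<size>/<cloud>` pattern, with a skip flag keeping the large fixtures out of the nightly suite. The sketch below models that idea only; `fixtureTestSpec`, `name`, and `nightly` are hypothetical names for illustration and are not the roachtest registry API.

```go
package main

import (
	"fmt"
	"strings"
)

// fixtureTestSpec is a hypothetical stand-in for one registered fixture test.
type fixtureTestSpec struct {
	workload string // foreground workload, e.g. "tpce"
	size     string // approximate fixture size, e.g. "15GB"
	cloud    string // cloud the test runs on, e.g. "aws"
	skip     bool   // true if the test is excluded from nightly runs
}

// name joins the spec fields into the backupFixture/<workload>/<size>/<cloud> form.
func (s fixtureTestSpec) name() string {
	return strings.Join([]string{"backupFixture", s.workload, s.size, s.cloud}, "/")
}

// nightly returns the names of the specs that are not skipped.
func nightly(specs []fixtureTestSpec) []string {
	var names []string
	for _, s := range specs {
		if !s.skip {
			names = append(names, s.name())
		}
	}
	return names
}

func main() {
	specs := []fixtureTestSpec{
		{workload: "tpce", size: "15GB", cloud: "aws"},
		{workload: "tpce", size: "32TB", cloud: "aws", skip: true},
		{workload: "tpce", size: "400GB", cloud: "aws", skip: true},
		{workload: "tpce", size: "400GB", cloud: "gce", skip: true},
		{workload: "tpce", size: "8TB", cloud: "aws", skip: true},
	}
	for _, s := range specs {
		suffix := ""
		if s.skip {
			suffix = " (skipped)"
		}
		fmt.Println(s.name() + suffix)
	}
	fmt.Println("nightly:", nightly(specs))
}
```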

In the future, this framework should be extended to make it easier to write backup-restore roundtrip tests as well.

Fixes #99787

Release note: None

104318: cmdccl: test utility for running a /dev/null c2c stream r=lidorcarmel a=stevendanna

A small utility for reading a tenant replication stream and doing nothing with it. Useful for manual testing.

Epic: none

Release note: None

104877: multiregionccl: reenable regional_by_table test r=chengxiong-ruan a=chengxiong-ruan

Informs: #98020

I wasn't able to reproduce the error mentioned in #98020 in any of the 3 tests over 2k stress runs, so I'm assuming something has changed and we're fine now. I'm re-enabling one test at a time in case it still fails in CI for some reason.

Release note: None

Co-authored-by: Michael Butler <[email protected]>
Co-authored-by: Lidor Carmel <[email protected]>
Co-authored-by: Chengxiong Ruan <[email protected]>
@craig craig bot closed this as completed in b5bcdaa Jun 14, 2023