Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: validate-system-schema works in shared-process deployments #129485

Merged

Conversation

renatolabs
Copy link
Contributor

@renatolabs renatolabs commented Aug 22, 2024

This commit updates the
acceptance/validate-system-schema-after-version-upgrade roachtest so
that it can now work on multi-tenant (shared-process) deployments.

In multi-tenant deployments, we also capture the system schema on the
tenant and make the same assertion that the schema should match
whether we bootstrapped on a version or upgraded to it. In addition,
we also verify that the system schema is the same on the system and
non-system tenants.

Informs: #127378

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@renatolabs
Copy link
Contributor Author

This is failing on some shared-process runs. Some system tables are missing the exclude_data_from_backup bit on secondary tenants. This seems related to #120144.

The failure can be reproduced with seed -4961013970065068305 on this branch. Output:

After upgrading, `USE system; SHOW CREATE ALL TABLES;` does not match expected output after version upgrade for secondary tenant.
Diff:
@@ -89,11 +89,11 @@
        version INT8 NOT NULL,
        sql_instance_id INT8 NOT NULL,
        session_id BYTES NOT NULL,
        crdb_region BYTES NOT NULL,
        CONSTRAINT "primary" PRIMARY KEY (crdb_region ASC, desc_id ASC, version ASC, session_id ASC)
-) WITH (exclude_data_from_backup = true);
+);
 CREATE TABLE public.locations (
        "localityKey" STRING NOT NULL,
        "localityValue" STRING NOT NULL,
        latitude DECIMAL(18,15) NOT NULL,
        longitude DECIMAL(18,15) NOT NULL,
@@ -184,11 +184,11 @@
        config STRING NOT NULL,
        report_id INT8 NOT NULL,
        violation_start TIMESTAMPTZ NULL,
        violating_ranges INT8 NOT NULL,
        CONSTRAINT "primary" PRIMARY KEY (zone_id ASC, subzone_id ASC, type ASC, config ASC)
-) WITH (exclude_data_from_backup = true);
+);
 CREATE TABLE public.replication_critical_localities (
        zone_id INT8 NOT NULL,
        subzone_id INT8 NOT NULL,
        locality STRING NOT NULL,
        report_id INT8 NOT NULL,
@@ -202,11 +202,11 @@
        total_ranges INT8 NOT NULL,
        unavailable_ranges INT8 NOT NULL,
        under_replicated_ranges INT8 NOT NULL,
        over_replicated_ranges INT8 NOT NULL,
        CONSTRAINT "primary" PRIMARY KEY (zone_id ASC, subzone_id ASC)
-) WITH (exclude_data_from_backup = true);
+);
 CREATE TABLE public.reports_meta (
        id INT8 NOT NULL,
        "generated" TIMESTAMPTZ NOT NULL,
        CONSTRAINT "primary" PRIMARY KEY (id ASC)
 );

@cockroachdb/sql-foundations Is this test expected to pass on secondary tenants?

@rafiss
Copy link
Collaborator

rafiss commented Aug 22, 2024

Yes, it is expected to pass, so I think your change surfaced a bug. Luckily, it's low impact, since it just means that certain system tables for shared-process tenants will not be excluded from backup. They were excluded in #120144 just as an optimization to avoid setting protected timestamps on them.

The mistake in #120144 was that the following block only appeared in the pkg/upgrade/upgrades/v24_1_system_database.go upgrade gated behind a if deps.Codec.ForSystemTenant() condition. It should have been done unconditionally.

	// Mark the tables with low gc.ttl as excluded from backup.
		if err := deps.DB.DescsTxn(ctx, func(ctx context.Context, txn descs.Txn) error {
			for _, tabName := range []catconstants.SystemTableName{
				catconstants.ReplicationConstraintStatsTableName,
				catconstants.ReplicationStatsTableName,
				catconstants.TenantUsageTableName,
				catconstants.LeaseTableName,
				catconstants.SpanConfigurationsTableName,
			} {
				if _, err := txn.ExecEx(ctx,
					"mark-table-excluded-from-backup",
					txn.KV(),
					sessiondata.NodeUserSessionDataOverride,
					"ALTER TABLE system.public."+string(tabName)+" SET (exclude_data_from_backup = true)",
				); err != nil {
					return err
				}
			}
			return nil
		}); err != nil {
			return err
		}

To fix it, we can add another upgrade that does the above for non-system tenants.

@renatolabs renatolabs force-pushed the rc/validate-system-schema-shared-process branch from 0ffbe7f to 4683452 Compare August 26, 2024 19:02
This commit updates the
`acceptance/validate-system-schema-after-version-upgrade` roachtest so
that it can now work on multi-tenant (shared-process) deployments.

In multi-tenant deployments, we also capture the system schema on the
tenant and make the same assertion that the schema should match
whether we bootstrapped on a version or upgraded to it. In addition,
we also verify that the system schema is the same on the system and
non-system tenants.

Informs: cockroachdb#127378

Release note: None
@renatolabs renatolabs force-pushed the rc/validate-system-schema-shared-process branch from 4683452 to 4c0d94c Compare August 27, 2024 19:46
@renatolabs renatolabs marked this pull request as ready for review August 27, 2024 19:47
@renatolabs renatolabs requested a review from a team as a code owner August 27, 2024 19:47
@renatolabs renatolabs requested review from nameisbhaskar, vidit-bhat and a team and removed request for a team August 27, 2024 19:47
}

if err := diff(systemComparison.upgraded, tenantComparison.upgraded); err != nil {
t.Fatal(fmt.Errorf("comparing system schema of system and tenant: %w", err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@renatolabs renatolabs added backport-23.2.x Flags PRs that need to be backported to 23.2. backport-24.1.x Flags PRs that need to be backported to 24.1. backport-24.2.x Flags PRs that need to be backported to 24.2 labels Aug 28, 2024
@renatolabs
Copy link
Contributor Author

TFTR!

bors r=srosenberg

Copy link

blathers-crl bot commented Aug 28, 2024

Encountered an error creating backports. Some common things that can go wrong:

  1. The backport branch might have already existed.
  2. There was a merge conflict.
  3. The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.


error setting reviewers, but backport branch blathers/backport-release-23.2-129485 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/129799/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. []

Backport to branch 23.2.x failed. See errors above.


error setting reviewers, but backport branch blathers/backport-release-24.1-129485 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/129800/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. []

Backport to branch 24.1.x failed. See errors above.


error setting reviewers, but backport branch blathers/backport-release-24.2-129485 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/129801/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. []

Backport to branch 24.2.x failed. See errors above.


🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@renatolabs renatolabs deleted the rc/validate-system-schema-shared-process branch August 28, 2024 17:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-23.2.x Flags PRs that need to be backported to 23.2. backport-24.1.x Flags PRs that need to be backported to 24.1. backport-24.2.x Flags PRs that need to be backported to 24.2
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants