[DBZ-PGYB][yugabyte/yugabyte-db#24200] Execute snapshot in chunks #161

Merged: 15 commits, Oct 17, 2024

Conversation

Collaborator

@vaibhav-yb vaibhav-yb commented Sep 27, 2024

Problem

For very large tables, the default SELECT * query can take a long time to complete, which lengthens the snapshot.

Solution

This PR implements snapshotting a table in parallel, using the built-in function yb_hash_code so that each query runs only for a given hash range. It introduces the following two configuration properties:

  1. A new snapshot.mode called parallel. It behaves exactly like initial_only, but with the ability to launch multiple tasks.
  2. primary.key.hash.columns, which takes a comma-separated list of the hash components of the table's primary key.

Note: When snapshot.mode is set to parallel, regex is not supported in the table.include.list property; the user must specify the full name of the table. Additionally, only one table is allowed in table.include.list when snapshot.mode is parallel.
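To make the hash-range idea concrete, here is a minimal sketch of how a per-task query could be assembled. The class name, the method signature, and the sample table and column names are hypothetical and are not taken from the connector; only the shape of the WHERE clause follows the format string shown in the diff under review.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not the connector's actual code.
final class ParallelSnapshotQuery {

    // Builds a SELECT restricted to rows whose hash code falls in
    // [lowerBound, upperBound], both bounds inclusive.
    static String build(String table, List<String> hashColumns, long lowerBound, long upperBound) {
        if (hashColumns.isEmpty()) {
            throw new IllegalArgumentException("at least one hash column is required");
        }
        if (lowerBound > upperBound) {
            throw new IllegalArgumentException("lowerBound must not exceed upperBound");
        }
        // yb_hash_code takes the hash components of the primary key as arguments.
        String columns = String.join(", ", hashColumns);
        return String.format(Locale.ROOT,
                "SELECT * FROM %s WHERE yb_hash_code(%s) >= %d AND yb_hash_code(%s) <= %d",
                table, columns, lowerBound, columns, upperBound);
    }

    public static void main(String[] args) {
        // Assumed sample values: table public.orders with hash column id.
        System.out.println(build("public.orders", List.of("id"), 0, 32767));
    }
}
```

Each task would run one such query over its own range, so the ranges handed to the tasks must not overlap and must together cover the whole hash space.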

@vaibhav-yb vaibhav-yb self-assigned this Sep 27, 2024
@Sumukh-Phalgaonkar Sumukh-Phalgaonkar self-requested a review October 16, 2024 11:28
@Sumukh-Phalgaonkar

Also, could you add some unit tests for this:

  1. Snapshot of a sufficiently large table (data present in every hash partition) with parallel mode.
  2. Negative tests where the deployment would fail due to validations, for example, multiple tables in the include list.
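As a rough illustration of what such negative cases would exercise, the sketch below rejects an include list with more than one entry or with anything other than a plain schema.table name. The class, the method, and the name pattern are assumptions made for this example; the connector's real validation may differ.

```java
import java.util.regex.Pattern;

// Hypothetical validation sketch; not the connector's actual implementation.
final class ParallelSnapshotValidation {

    // Assumption: a full table name is schema.table, using only letters,
    // digits, and underscores. Anything else is treated as a regex.
    private static final Pattern PLAIN_TABLE_NAME =
            Pattern.compile("[A-Za-z_][A-Za-z0-9_]*\\.[A-Za-z_][A-Za-z0-9_]*");

    static void validateSingleTable(String tableIncludeList) {
        if (tableIncludeList == null || tableIncludeList.isBlank()) {
            throw new IllegalArgumentException("table.include.list must be set for parallel snapshot");
        }
        // Limit -1 keeps trailing empty entries, so "a.b," counts as two entries.
        String[] entries = tableIncludeList.split(",", -1);
        if (entries.length != 1) {
            throw new IllegalArgumentException("only one table is allowed for parallel snapshot");
        }
        if (!PLAIN_TABLE_NAME.matcher(entries[0].trim()).matches()) {
            throw new IllegalArgumentException("regex is not supported; provide the full table name");
        }
    }

    public static void main(String[] args) {
        validateSingleTable("public.orders"); // accepted
        try {
            validateSingleTable("public.orders,public.customers");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```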

@Sumukh-Phalgaonkar

In the run with the custom image, the task-id was not printed in the snapshot logs. Can it be added to those logs, or would that require a larger effort?

}

protected String getQueryForParallelSnapshotSelect(long lowerBound, long upperBound) {
return String.format("SELECT * FROM %s WHERE yb_hash_code(%s) >= %d AND yb_hash_code(%s) <= %s",

What will happen if a user has only a range-partitioned primary key and deploys the parallel snapshot mode? What if the primary key is a composite like (id HASH, v1 ASC)? Will this still work, or should we add another validation that only hash-partitioned columns are provided in the primary.keys field?

@vaibhav-yb vaibhav-yb changed the title [DBZ-PGYB][WIP] Execute snapshot in chunks [DBZ-PGYB][yugabyte/yugabyte-db#24200] Execute snapshot in chunks Oct 17, 2024
.withDisplayName("Comma separated primary key fields")
.withType(Type.STRING)
.withImportance(Importance.LOW)
.withDescription("A comma separated value having all the primary key components")

Should we update this description to say HASH components of the primary key?

// Perform basic validations.
validateSingleTableProvidedForParallelSnapshot(tableIncludeList);

// Publication auto create mode should not be for all tables.

Can also move this into a separate function - validatePublicationForParallelSnapshot

taskProps.put(PostgresConnectorConfig.TASK_ID.name(), String.valueOf(i));

long lowerBound = i * rangeSize;
long upperBound = (i == maxTasks - 1) ? upperBoundExclusive - 1 : (lowerBound + rangeSize - 1);

Can add a comment explaining this special handling for the last task.
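For context on why the last task is treated differently, the following sketch reproduces the bound computation under two assumptions that are not visible in the diff: that upperBoundExclusive is 65536 (yb_hash_code values span 0 to 65535) and that rangeSize is the integer division of that value by maxTasks. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of splitting the hash space across tasks.
final class HashRangeSplitter {

    // Assumption: hash codes lie in [0, 65535], so the exclusive bound is 65536.
    static final long UPPER_BOUND_EXCLUSIVE = 65536L;

    // Returns one inclusive {lowerBound, upperBound} pair per task.
    static List<long[]> split(int maxTasks) {
        if (maxTasks < 1 || maxTasks > UPPER_BOUND_EXCLUSIVE) {
            throw new IllegalArgumentException("maxTasks must be between 1 and 65536");
        }
        // Assumption: integer division, so a remainder may be left over.
        long rangeSize = UPPER_BOUND_EXCLUSIVE / maxTasks;
        List<long[]> ranges = new ArrayList<>();
        for (int i = 0; i < maxTasks; i++) {
            long lowerBound = i * rangeSize;
            // The last task absorbs the remainder left by integer division;
            // otherwise the highest hash codes would belong to no task.
            long upperBound = (i == maxTasks - 1)
                    ? UPPER_BOUND_EXCLUSIVE - 1
                    : lowerBound + rangeSize - 1;
            ranges.add(new long[] { lowerBound, upperBound });
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (long[] range : split(3)) {
            System.out.println(range[0] + " to " + range[1]);
        }
    }
}
```

With three tasks, rangeSize is 21845 and 3 × 21845 = 65535, so without the special case the last range would end at 65534 and rows with hash code 65535 would never be snapshotted.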

@@ -1116,6 +1116,46 @@ public void shouldHaveBeforeImageOfUpdatedRow() throws InterruptedException {
assertThat(updateRecordValue.getStruct(Envelope.FieldName.AFTER).getStruct("aa").getInt32("value")).isEqualTo(404);
}

Can you add a test for primary.keys config?

@vaibhav-yb vaibhav-yb merged commit 2cda9b7 into ybdb-debezium-2.5.2 Oct 17, 2024
1 check passed
3 participants