-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[DBZ-PGYB] Support for consistent snapshot for an existing slot #113
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm my understanding after this PR - if a user creates a replication slot and starts the connector with snapshot.mode=never
and after streaming if the user restarts the connector in initial
mode, he will only get the records snapshotted which would be there if the snapshot was taken in the first place itself.
And in this case, what would happen once the snapshot is completed? Will the records be streamed from the point they were streamed before connector restart?
debezium-connector-postgres/src/main/java/io/debezium/connector/postgresql/spi/SlotState.java
Show resolved
Hide resolved
...m-connector-postgres/src/test/java/io/debezium/connector/postgresql/PostgresConnectorIT.java
Show resolved
Hide resolved
...m-connector-postgres/src/test/java/io/debezium/connector/postgresql/PostgresConnectorIT.java
Outdated
Show resolved
Hide resolved
...m-connector-postgres/src/test/java/io/debezium/connector/postgresql/PostgresConnectorIT.java
Show resolved
Hide resolved
The OffsetState in Kafka would indicate that Streaming has started. So, even when user restarts the connector in INITIAL mode, the streaming will continue from the LSN corresponding to the offsetState stored in Kafka In the case of the newly added test - initialSnapshotWithExistingSlot - I was relying on the fact that the OffsetState would not be saved in Kafka as I am stopping the connector almost immediately after starting it. In this case, as there is no previous offset in Kafka, a snapshot would be taken and the streaming will start from the consistent_point |
Summary
This PR is to support consistent snapshot in the case of an existing slot.
In this case, the consistent_point hybrid time is determined from the pg_replication_slots view, specifically from the yb_restart_commit_ht column.
There is an assumption here that this slot has not been used for streaming till this point. If this holds, then the history retention barrier will be in place as of the consistent snapshot time (consistent_point). The snapshot query will be run as of the consistent_point and subsequent streaming will start from the consistent_point of the slot.
Test Plan
Added new test
mvn -Dtest=PostgresConnectorIT#initialSnapshotWithExistingSlot test