[YSQL] Fail ADD / DROP PK if table is part of XCluster replication or CDC #16625

tverona1 · 2023-03-30T19:58:18Z

Jira Link: DB-6014

Description

ADD / DROP PK is currently implemented as a table rewrite (rename original table + create new table + copy from original table + drop original table). This is not compatible with XCluster or CDC, as original relationship will be lost when recreating the table.

We should raise a clear error message when ADD / DROP PK is attempted on a table which has xcluster or CDC stream associated with it.

Warning: Please confirm that this issue does not contain any sensitive information

I confirm this issue does not contain any sensitive information.

…ble is a part of xCluster / CDC Summary: `ALTER TABLE ADD PRIMARY KEY`, `ALTER TABLE ... DROP <pkey>`, `ALTER TABLE ... ALTER COLUMN TYPE` currently perform a table rewrite under the hood (a new table is created and renamed to the original table, and the original table is dropped). This is not compatible with xCluster/CDC. Therefore, we should disallow these operations and a clear error message should be thrown. Jira: DB-6014 Test Plan: `./yb_build.sh --cxx-test xcluster_ysql-test --gtest_filter XClusterYsqlTest.TestAlterOperationTableRewrite` `./yb_build.sh --cxx-test cdcsdk_ysql-test --gtest_filter CDCSDKYsqlTest.TestAlterOperationTableRewrite` Reviewers: xCluster, mlillibridge, jason, skumar, abharadwaj, hsunder, #xcluster-project Reviewed By: abharadwaj, hsunder Subscribers: dsrinivasan, hsunder, abharadwaj, yql, mlillibridge, bogdan Differential Revision: https://phorge.dev.yugabyte.com/D25712

Summary: Background: In YB, we currently support some table rewrite operations like ALTER TYPE and ADD/DROP primary key. These rewrite operations drop and re-create the relation and all its dependent objects. Therefore, the OIDs of the relation (and all its dependent objects) change. PG implements table rewrite in a more efficient way by leveraging the pg_class.relfilenode field. In PG, relfilenode refers to the name of the on-disk file of the relation. When a relation is initially created, its relfilenode is the same as its OID. Changes introduced by this diff: This diff serves to add support for an alternative rewrite approach: the new approach only drops and re-creates the DocDB tables associated with the table and its indexes, and leverages the relfilenode field to map a PG table OID to the relevant DocDB table: a PG "relfilenode" maps to a YB DocDB table. More specifically, a table rewrite with the new approach would entail the following: 1. Let’s say we have a relation test: oid 12, relfilenode 12 2. Create a new transient relation: oid 15, relfilenode 15. - Note: the naming convention for the new transient relation is `pg_temp_<table_oid>`, i.e., `pg_temp_12` in this example. However, the DocDB table is given the same name as the original relation (`test`). So while the table rewrite operation is executing, there will be 2 DocDB tables with the name `test`. When the transaction commits/aborts, the stale one will be dropped. 3. Load the data into the new relation. 4. Swap the pg_class.relfilenode of the altered relation and the new relation: - test: oid 12, relfilenode 15 - new transient relation: oid 15, relfilenode 12 5. Reindex the altered relation (points to the new DocDB table that corresponds to relfilenode 15) - Note: a REINDEX is a table rewrite on the index. So the DocDB tables for the indexes will also be dropped and re-created. 6. Drop the new relation (points to the old DocDB table that corresponds to relfilenode 12). - Note: this step ONLY drops the old DocDB table. None of the dependent objects of the altered relation are dropped. Note: - The new rewrite mechanism breaks the assumption that the table OID directly corresponds to the DocDB table ID. Instead, it is the relfilenode OID that maps to the DocDB table. - Therefore, any communication from PG code to pggate (and beyond) should utilize the relfilenode OID of the table (utility function `YbGetRelfileNodeId()` retrieves a relation’s relfilenode id). - Conversely, if the table’s PG oid needs to be inferred inside DocDB/pggate, the table’s info should be looked up. (`pg_table_id` is stored in `TableInfo`, `PgTableDesc`) Specific code changes: - When we rewrite a table, the PG table ID explicitly needs to be sent to the DocDB layer (so that we can store it in metadata). - The relfilenode-swap rewrite mechanism is already used for `REFRESH MATERIALIZED VIEW`. Therefore, the proto field `matview_pg_table_id` which was used for this purpose has simply been renamed to `pg_table_id`. - The pg_table_id is stored in `SysTablesEntryPB` (metadata on master), `TableInfoPB` (metadata on tserver), `PgTableDesc` (table cache in pg_session). - In addition to PG table ID , CreateTable requests sent as part of table rewrite will also relay the old relfilenode id to the DocDB layer. This will allow us to detect if the old DocDB table was a part of xCluster/CDC, and error out on the rewrite (until we support table rewrite + xCluster/CDC – see GH issue #16625). - `YBCCreateTable()` (function for creating a DocDB table) explicitly takes the table’s name as an argument – this prevents us from doing an additional DocDB level table rename during a table rewrite. - The primary key for a rewritten table is copied into the `CreateStmt` that is sent to `YBCCreateTable()`. This is not needed for the existing use case of refreshes on materialized views, but is required for generalized table rewrite. - This diff does not enable the new rewrite approach for any command. It simply adds support for the new approach. Upgrade/Rollback: - This diff is upgrade/rollback **safe**. - The `matview_pg_table_id` field in `SysTablesEntryPB` is simply renamed to `pg_table_id`. This is not a new proto field. - There is a new persisted proto field (`pg_table_id`) in `TableInfoPB`. If a `REFRESH MATERIALIZED VIEW` is executed during upgrade, we may run into a scenario where some tablets have the `pg_table_id`, while others don't. However, this doesn't make a functional difference because the only use case for `TableInfoPB.pg_table_id` is for the `pg_locks` view, and row locking isn't permitted for materialized views anyway. - Subsequent diffs that enable new table rewrite commands on regular tables will have to be guarded with a LocalPersisted autoflag, as we want the tablet metadata for rewritten tables to have `pg_table_id`. Jira: DB-2394 Test Plan: `Jenkins Reviewers: dsrinivasan, myang, hsunder, jhe Reviewed By: myang, jhe Subscribers: jason, esheng, yql, ybase Differential Revision: https://phorge.dev.yugabyte.com/D29785

…s for datatypes (#144) * This PR adds a test class `YBRecordsStreamProducerIT` which is essentially a copy of `RecordsStreamProducerIT`. * The tests are modified in order to support the structure generated by the plugin `yboutput`. * Some tests are disabled owing to the fact that the underlying feature is not yet supported. * Altering not allowed for column under replication --> yugabyte/yugabyte-db#16625 * Altering the replica identity is not allowed

tverona1 added area/ysql Yugabyte SQL (YSQL) status/awaiting-triage Issue awaiting triage labels Mar 30, 2023

yugabyte-ci added kind/bug This issue is a bug priority/medium Medium priority issue and removed status/awaiting-triage Issue awaiting triage labels Mar 30, 2023

yugabyte-ci assigned mislam-yb Apr 11, 2023

yugabyte-ci assigned fizaaluthra and unassigned mislam-yb Apr 27, 2023

yugabyte-ci closed this as completed Jul 10, 2023

This was referenced Jul 12, 2024

[DBZ-PGYB] Fixed flakiness for the tests in PostgresConnectorIT yugabyte/debezium#143

Merged

[DBZ-PGYB] Add new test class YBRecordsStreamProducerIT to add tests for datatypes yugabyte/debezium#144

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[YSQL] Fail ADD / DROP PK if table is part of XCluster replication or CDC #16625

[YSQL] Fail ADD / DROP PK if table is part of XCluster replication or CDC #16625

tverona1 commented Mar 30, 2023 •

edited by jira bot

Loading

[YSQL] Fail ADD / DROP PK if table is part of XCluster replication or CDC #16625

[YSQL] Fail ADD / DROP PK if table is part of XCluster replication or CDC #16625

Comments

tverona1 commented Mar 30, 2023 • edited by jira bot Loading

Description

Warning: Please confirm that this issue does not contain any sensitive information

tverona1 commented Mar 30, 2023 •

edited by jira bot

Loading