Batch indexes for BackfillIndexRequest #4785

Closed
jaki opened this issue Jun 15, 2020 · 1 comment
Labels
area/docdb YugabyteDB core features duplicate

Comments

jaki (Contributor) commented Jun 15, 2020

The lower layers of the backfill index request have support for backfilling multiple indexes on the same table in one request. This is an optimization because the backfill only needs to do a single read (scan indexed table) for multiple writes (backfill index tables).
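To illustrate why batching helps, here is a minimal Python sketch of one scan feeding several index writes. It is not YugabyteDB code: `Table`, `backfill_batched`, and the idea of modeling an index as a function from a row to its key are all hypothetical stand-ins for this example.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]
# Hypothetical model: an index is a function mapping a row to its index key.
IndexKeyFn = Callable[[Row], Tuple]


class Table:
    """Toy indexed table that counts how many full scans were started."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self.rows: List[Row] = list(rows)
        self.scans = 0

    def scan(self) -> Iterator[Row]:
        self.scans += 1
        return iter(self.rows)


def backfill_batched(
    table: Table, indexes: Dict[str, IndexKeyFn]
) -> Dict[str, List[Tuple]]:
    """Scan the indexed table once and collect one write per index per row."""
    writes: Dict[str, List[Tuple]] = {name: [] for name in indexes}
    for row in table.scan():  # the single read pass
        for name, key_fn in indexes.items():
            writes[name].append(key_fn(row))  # one write per index table
    return writes
```

Backfilling N indexes one request at a time would call `scan()` N times in this model, whereas the batched version calls it once regardless of N.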

Everything on the tserver side (YCQL, not YSQL) seems to support multiple indexes, and BackfillTable, BackfillTablet, and BackfillChunk support it as well. However, MultiStageAlterTable::StartBackfillingData always sends only one index for backfill. The fix is to detect which indexes on the same table are ready to move on to the backfilling stage and batch them together into one BackfillIndexRequest.
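A rough sketch of the batching step, in Python rather than the actual C++ master code. `IndexInfo`, `READY_STATE`, and `batch_ready_indexes` are hypothetical names invented for this example, and the real index state names may differ.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List

# Hypothetical label for "ready to move on to the backfilling stage".
READY_STATE = "READY_FOR_BACKFILL"


@dataclass(frozen=True)
class IndexInfo:
    index_id: str
    indexed_table_id: str
    state: str


def batch_ready_indexes(indexes: Iterable[IndexInfo]) -> Dict[str, List[str]]:
    """Group ready indexes by indexed table: one backfill request per table."""
    batches: Dict[str, List[str]] = defaultdict(list)
    for idx in indexes:
        if idx.state == READY_STATE:
            batches[idx.indexed_table_id].append(idx.index_id)
    return dict(batches)
```

Each entry of the returned mapping would correspond to a single request carrying all of that table's ready indexes, instead of one request per index.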

Also, have YSQL handle multiple index backfill.

jaki added the area/docdb YugabyteDB core features label Jun 15, 2020
jaki added a commit that referenced this issue Jun 29, 2020
Summary:

Implement core functionality for the backfill part of YSQL multi-stage
create index.  Do the following checked items:

- [x] Add `BACKFILL INDEX` grammar for postgres
- [x] Establish basic communication from tserver to postgres
- [x] Use ancient write time for inserting rows for backfill
- [x] Use supplied read time for selecting rows to backfill
- [ ] Establish connection when `yugabyte` role is password protected
- [ ] Handle errors anywhere in the schema migration process
- [ ] Handle multiple indexes backfilling at same time (issue #4785)
- [ ] Have postgres respect master to tserver RPC deadline
- [ ] Support create unique index (issue #4899)
- [ ] Support nested DDL create index (issue #4786)
- [ ] Work on multi-stage drop index

Implement it as follows:

1. Pass database name from master to tserver on `BackfillIndex` request
1. Link libpq to tablet in order to send libpq request from tserver
1. Add `BACKFILL INDEX <index_oids> READ TIME <read_time> PARTITION
   <partition_key> [ FROM <row_key_start> [ TO <row_key_end> ] ]`
   grammar
1. Wire it down a similar path as `index_build`, but pass down read time
   and partition key (don't handle row keys yet) through exec params
1. Pass down hard-coded ancient write time
1. Read from the indexed table's tablet for the specified partition key at
   the specified read time
1. Non-transactionally write to index table with specified write time

For now, explicitly error on unique index creation and nested DDL index
creation because they are unstable.  They can later be enabled and wired
to use the fast path (no multi-stage).  Eventually, after some work, we
want to enable them with backfill (multi-stage).

Also, remove support for collecting `reltuples` stats on indexes when
using backfill.  We don't really use this stat, and we don't even
collect it for non-index tables, so it shouldn't be a big deal for now.

This is part 4 of the effort of bringing index backfill to YSQL.

Keep #2301 open.

Depends on D8368

Depends on D8578

Test Plan:

`./yb_build.sh --cxx-test pgwrapper_pg_libpq-test --gtest_filter 'PgLibPqTest.Backfill*'`

Reviewers: amitanand, neil, mihnea

Reviewed By: mihnea

Subscribers: yql, bogdan

Differential Revision: https://phabricator.dev.yugabyte.com/D8487
deeps1991 pushed a commit to deeps1991/yugabyte-db that referenced this issue Jul 22, 2020

jaki (Contributor, Author) commented Apr 23, 2021

Besides the YSQL part, this is mostly covered in the new issue #8069. Marking this as duplicate. Another issue should eventually be created for YSQL batching.

jaki closed this as completed Apr 23, 2021
jaki added the duplicate label Apr 23, 2021