sql: add option to use lookup join for db-scope virtual tables #90116

ajwerner · 2022-10-18T04:51:22Z

information_schema and pg_catalog are always scoped to a database. Before this change, we would always look up all descriptors to materialize virtual tables, even if we only care about descriptors in the current database. The new approach: scan the namespace table to find the relevant descriptors and then only resolve those.

On my local machine where I loaded up a few megabytes of descriptors into 6k databases, a simple query like SELECT * FROM information_schema.sequences takes 1.2s without the setting enabled and 1ms with it.

Fixes #90115

Epic: CRDB-20865

Release note (performance improvement): Tables in pg_catalog and
information_schema (when not explicitly referenced as
"".information_schema) may now be much faster if the current database has
a small number of relations relative to the total number in the cluster.

cockroach-teamcity · 2022-10-18T04:51:31Z

This change is

chengxiong-ruan

Looking good! Do we want to backport this?

chengxiong-ruan · 2022-10-24T18:02:41Z

pkg/sql/catalog/descs/collection.go

+	ids.Add(db.GetID())
+	_ = ns.ForEachNamespaceEntry(func(e nstree.NamespaceEntry) error {
+		if !strings.HasPrefix(e.GetName(), catconstants.PgTempSchemaName) &&
+			e.GetID() != 29 {


29 -> catconstants.PublicSchemaID

chengxiong-ruan · 2022-10-24T18:24:21Z

pkg/sql/schema_resolver.go

@@ -300,12 +300,15 @@ func (sr *schemaResolver) GetTypeDescriptor(
 	ctx context.Context, id descpb.ID,
 ) (tree.TypeName, catalog.TypeDescriptor, error) {
 	tc := sr.descCollection
-	desc, err := tc.GetImmutableTypeByID(ctx, sr.txn, id, tree.ObjectLookupFlags{})
+	flags := sr.CommonLookupFlagsRequired()
+	flags.ParentID = sr.typeResolutionDbID


Thanks for doing this, I'd have forgotten about doing this for types if it's me.

ajwerner

Looking good! Do we want to backport this?

I'd like to try. There's a customer who would like it. I don't think 22.1 will be clean.

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan)

pkg/sql/catalog/descs/collection.go line 382 at r3 (raw file):

Previously, chengxiong-ruan (Chengxiong Ruan) wrote…

29 -> catconstants.PublicSchemaID

Done.

chengxiong-ruan

LGTM

chengxiong-ruan · 2022-10-24T21:57:55Z

Looking good! Do we want to backport this?

I'd like to try. There's a customer who would like it. I don't think 22.1 will be clean.

Yeah...I bet the backport would be annoying...Thanks again for working on this, huge.

ajwerner · 2022-10-24T22:31:23Z

Yeah, I'm giving up some hope on 22.1. 22.2 should be fine.

ajwerner · 2022-10-24T22:39:29Z

@rafiss do you want to give this a look?

rafiss

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @chengxiong-ruan)

-- commits line 4 at r15:
not entirely accurate: we support queries like select * from "".information_schema.tables

.ijwb/.idea/modules.xml line 1 at r13 (raw file):

<?xml version="1.0" encoding="UTF-8"?>

was this committed on purpose?

pkg/sql/crdb_internal.go line 517 at r14 (raw file):

		},
	},
	populate: func(ctx context.Context, p *planner, db catalog.DatabaseDescriptor, addRow func(...tree.Datum) error) error {

it used to use a generator but now it uses populate. is that on purpose? i thought generators were preferred

pkg/sql/crdb_internal.go line 518 at r14 (raw file):

	},
	populate: func(ctx context.Context, p *planner, db catalog.DatabaseDescriptor, addRow func(...tree.Datum) error) error {
		all, err := p.Descriptors().GetAllDescriptors(ctx, p.txn)

should this be using the new cluster setting to only lookup descriptors from the current DB if possible?

pkg/sql/crdb_internal.go line 582 at r14 (raw file):

	ctx context.Context, p *planner, db catalog.DatabaseDescriptor, addRow func(...tree.Datum) error,
) (bool, error) {
	dbDescs, err := p.Descriptors().GetAllDescriptorsForDatabase(ctx, p.Txn(), db)

does this need to be guarded by the cluster setting?

pkg/sql/resolver.go line 152 at r13 (raw file):

	}
	flags := p.ObjectLookupFlags(true /* required */, false /* requireMutable */)
	flags.ParentID = dbDesc.GetID()

i didn't realize that we had to do this. why do we?

pkg/sql/resolver.go line 174 at r13 (raw file):

	}
	if schemaDesc.SchemaKind() != catalog.SchemaVirtual {
		if dbDesc.GetID() != tableDesc.GetParentID() {

is this even possible, based on the lookup flags above?

pkg/sql/scheduledlogging/captured_index_usage_stats.go line 219 at r15 (raw file):

			sessiondata.InternalExecutorOverride{
				User:     username.RootUserName(),
				Database: string(databaseName),

FWIW, i think making the query use databaseName.pg_catalog.pg_class (or databaseName.crdb_internal.tables as it did before) would achieve the same result, so i'm not convinced we need this last commit. did you observe differently?

rafiss

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @chengxiong-ruan)

pkg/sql/crdb_internal.go line 518 at r14 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

should this be using the new cluster setting to only lookup descriptors from the current DB if possible?

specifically, if db != nil

ajwerner

TFTR

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @rafiss)

-- commits line 4 at r15:

Previously, rafiss (Rafi Shamim) wrote…

not entirely accurate: we support queries like select * from "".information_schema.tables

Good point. We only support it for information_schema, not pg_catalog.

[email protected]:26257/movr> select * from "".pg_catalog.pg_class;
ERROR: cannot access virtual schema in anonymous database
SQLSTATE: 42704
HINT: verify that the current database is set

.ijwb/.idea/modules.xml line 1 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

was this committed on purpose?

no

pkg/sql/crdb_internal.go line 517 at r14 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

it used to use a generator but now it uses populate. is that on purpose? i thought generators were preferred

I thought populate was preferred. There's some logic which prevents use of the index if you use generator here:

cockroach/pkg/sql/virtual_schema.go

Line 561 in 72735b8

    
           if def.generator != nil && !def.preferIndexOverGenerator(p, index, idxConstraint) {

I don't really get it, but an index join will happen if you use populate and not generator. Neither are commented as being preferred over the other. Who knows about this stuff?

pkg/sql/crdb_internal.go line 518 at r14 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

specifically, if db != nil

It can't, sadly. This table includes dropped tables which cannot be found by scanning namespace.

pkg/sql/crdb_internal.go line 582 at r14 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

does this need to be guarded by the cluster setting?

Yeah, I suppose so. Another option I can do is disable the index or something.

pkg/sql/resolver.go line 152 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

i didn't realize that we had to do this. why do we?

I'll add commentary to the code. This flag is new. The essence of it is when we had all descriptors, there's logic in the descs.Collection which says that if you ask for an ID and it is not known, then we know it doesn't exist. When I first wrote this change, it made the ORM queries in rttanalysis actually much slower because they resolve a bunch of hashed OIDs which don't exist. We now had to do a lookup for each one. This now allows us to avoid that.

pkg/sql/resolver.go line 174 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

is this even possible, based on the lookup flags above?

It ought not be. I worry about cases where we're mixing cached and uncached descriptors. I'll remove it.

pkg/sql/scheduledlogging/captured_index_usage_stats.go line 219 at r15 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

FWIW, i think making the query use databaseName.pg_catalog.pg_class (or databaseName.crdb_internal.tables as it did before) would achieve the same result, so i'm not convinced we need this last commit. did you observe differently?

databaseName.crdb_internal.tables ignores databaseName and shows all tables in all databases. I assumed that that was well-known. It's totally horrific and something I don't think we can change.

rafiss

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @chengxiong-ruan)

pkg/sql/logictest/logictestbase/logictestbase.go line 454 at r13 (raw file):

	},
	{
		Name:                "local-use-db-lookup",

oh missed this earlier. instead of this, we could make the cluster setting default be a metamorphic constant. we do that for a few others. (idea would be to save a bit of test overhead)

pkg/sql/scheduledlogging/captured_index_usage_stats.go line 219 at r15 (raw file):

Previously, ajwerner wrote…

databaseName.crdb_internal.tables ignores databaseName and shows all tables in all databases. I assumed that that was well-known. It's totally horrific and something I don't think we can change.

i didn't know or i forgot. so sad.

pkg/bench/rttanalysis/testdata/benchmark_expectations line 72 at r13 (raw file):

4,ORMQueries/information_schema._pg_index_position
3,ORMQueries/pg_attribute
3,ORMQueries/pg_class

is it surprising that some of these increased? is it worth knowing why?

ajwerner

Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @chengxiong-ruan and @rafiss)

pkg/bench/rttanalysis/testdata/benchmark_expectations line 72 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

is it surprising that some of these increased? is it worth knowing why?

It's completely expected. We're now doing an index-join -- a scan on the namespace table followed by a batch of point lookups against descriptor. Before we were just scanning all of descriptor.

ajwerner · 2022-10-26T20:14:06Z

In theory, IIRC, the generator callback has been carefully designed to have 0 true concurrency.

This is great to hear!

Flakes were #90731 and #90730

bors r+

craig · 2022-10-26T21:00:37Z

Build failed (retrying...):

Bazel Essential CI (Cockroach)

ajwerner · 2022-10-26T21:07:06Z

bors r-

craig · 2022-10-26T21:07:09Z

Canceled.

rafiss · 2022-10-28T18:40:21Z

for this and for #90597, could we add some rttanalysis benchmark that shows fewer lookups happening?

ajwerner · 2022-10-30T23:22:24Z

for this and for #90597, could we add some rttanalysis benchmark that shows fewer lookups happening?

We cannot show that fewer lookups happen because more lookups happen in terms of round-trips :). The problem is that before we were doing a scan of all descriptors. I think one possibility is to extend the rttanalysis framework to include some sort of counter for number descriptors added to the collection or something. I filed #90960

ajwerner · 2022-10-31T04:01:08Z

The first commit is #90962. I added a new, important second commit. before the second commit was added, partial predicates on virtual indexes were totally ignored. This is now fixed. The testing for it really comes in the 4th commit. Somebody from @cockroachdb/sql-queries ought to sign off on that small change before I merge this.

ajwerner · 2022-10-31T04:45:26Z

Sadly my simplified attempt to add the partial index predicates for virtual indexes does not quite work because we hide the fact that we might fall back to the primary index underneath the execbuilder. We need a way to detect this fallback and then further filter the rows based on the predicate value. In theory, this shouldn't be too hard. We'll want to do it around here:

cockroach/pkg/sql/virtual_schema.go

Lines 713 to 720 in 27a65cc

    
           // NB: If we allow virtualSchemaTables with generator to perform a constrained scan, 
        
           // we then need to ensure that we don't call populate without checking, as it may be nil. 
        
           if def.populate == nil { 
        
           	return errors.AssertionFailedf( 
        
           		"programming error: can't fall back to unconstrained scan on generated vtables") 
        
           } 
        
           return def.populate(ctx, p, dbDesc, addRowIfPassesFilter(&newConstraint))

I don't quite know how to properly plumb the row values in so that we can evaluate the constraint reasonably.

ajwerner · 2022-10-31T12:27:26Z

Alright, I have a fixup commit which does what I suggested and it seems to work, but might be ugly and under-tested.

mgartner

The sql,opt: add support for partial indexes on virtual tables commit LGTM.

mgartner · 2022-10-31T19:26:26Z

pkg/sql/opt_catalog.go

@@ -2429,6 +2429,10 @@ func (oi *optVirtualIndex) InvertedColumn() cat.IndexColumn {

 // Predicate is part of the cat.Index interface.
 func (oi *optVirtualIndex) Predicate() (string, bool) {
+	if oi.idx != nil {


nit: putting the if oid.idx == nil { ... } case first would match the other methods and be easier to read IMO

mgartner · 2022-10-31T19:34:14Z

pkg/sql/virtual_schema.go

@@ -698,16 +743,72 @@ func (e *virtualDefEntry) makeConstrainedRowsGenerator(
 			newConstraint.Spans.Append(idxConstraint.Spans.Get(currentSpan))
 		}

+		// We'd want to walk the expression, figure out the columns we need, then


nit: end of the sentence is missing.

Cleaned this commentary up.

mgartner · 2022-10-31T19:40:36Z

pkg/sql/virtual_schema.go

+		return nil, nil
+	}
+	publicColumns := e.desc.PublicColumns()
+	exprs, _, err := schemaexpr.MakePartialIndexExprs(


nit: make a function that can make a single partial index expression, or a leave a TODO for me here

Release note: None

information_schema and pg_catalog are always scoped to a database. Before this change, we would always look up all descriptors to materialize virtual tables, even if we only care about descriptors in the current database. This PR adds a new cluster setting (so that backports can be made kosher) to enable a new approach: scan the namespace table to find the relevant descriptors and then only resolve those. On my local machine where I loaded up a few megabytes of descriptors into 6k databases, a simple query like `SELECT * FROM information_schema.sequences` takes 1.2s without the setting enabled and 1ms with it. Release note (performance improvement): Tables in `pg_catalog` and `information_schema` (when not explicitly referenced as `"".information_schema`) may now be much faster if the current database has a small number of relations relative to the total number in the cluster.

This adds some virtual indexes to `crdb_internal.tables` so that it performs an index join when looking up the tables in a database. Release note: None

This commit modifies the query to use database-scoped virtual tables. In doing this, we now avoid reading all descriptors in the cluster. On a local cluster which has a large number of small databases, this query was using lots of the CPU. Release note: None

A partial index already means something. This means something different. Release note: None

The index would never be used. This eliminates a footgun. Release note: None

ajwerner

I reworked the schemaexpr stuff to have a cleaner single-expression function and to share logic. @mgartner can you give the last revision a quick peek before I bors it?

ajwerner · 2022-10-31T20:14:09Z

pkg/sql/virtual_schema.go

+		return nil, nil
+	}
+	publicColumns := e.desc.PublicColumns()
+	exprs, _, err := schemaexpr.MakePartialIndexExprs(


ajwerner · 2022-10-31T20:27:03Z

pkg/sql/virtual_schema.go

@@ -698,16 +743,72 @@ func (e *virtualDefEntry) makeConstrainedRowsGenerator(
 			newConstraint.Spans.Append(idxConstraint.Spans.Get(currentSpan))
 		}

+		// We'd want to walk the expression, figure out the columns we need, then


Cleaned this commentary up.

ajwerner · 2022-10-31T20:27:15Z

pkg/sql/opt_catalog.go

@@ -2429,6 +2429,10 @@ func (oi *optVirtualIndex) InvertedColumn() cat.IndexColumn {

 // Predicate is part of the cat.Index interface.
 func (oi *optVirtualIndex) Predicate() (string, bool) {
+	if oi.idx != nil {


ajwerner · 2022-11-01T04:22:30Z

bors r+

craig · 2022-11-01T05:08:37Z

Build succeeded:

Bazel Essential CI (Cockroach)

blathers-crl · 2022-11-01T05:08:49Z

Encountered an error creating backports. Some common things that can go wrong:

The backport branch might have already existed.
There was a merge conflict.
The backport branch contained merge commits.

You might need to create your backport manually using the backport tool.

error creating merge commit from f588ab6 to blathers/backport-release-22.2-90116: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict []

you may need to manually resolve merge conflicts with the backport tool.

Backport to branch 22.2.x failed. See errors above.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.}

ajwerner force-pushed the ajwerner/faster-virtual-tables branch 7 times, most recently from 04f894d to 1dd2396 Compare October 24, 2022 17:29

ajwerner marked this pull request as ready for review October 24, 2022 17:29

ajwerner requested review from a team October 24, 2022 17:29

ajwerner requested a review from a team as a code owner October 24, 2022 17:29

chengxiong-ruan reviewed Oct 24, 2022

View reviewed changes

ajwerner force-pushed the ajwerner/faster-virtual-tables branch 2 times, most recently from cc7367c to 86e803a Compare October 24, 2022 21:23

ajwerner commented Oct 24, 2022

View reviewed changes

ajwerner force-pushed the ajwerner/faster-virtual-tables branch from 86e803a to a9d286d Compare October 24, 2022 21:46

chengxiong-ruan approved these changes Oct 24, 2022

View reviewed changes

ajwerner force-pushed the ajwerner/faster-virtual-tables branch from a9d286d to cd48384 Compare October 24, 2022 22:39

ajwerner force-pushed the ajwerner/faster-virtual-tables branch 2 times, most recently from 2a651fa to 77a99e8 Compare October 24, 2022 23:48

ajwerner added the backport-22.2.x label Oct 25, 2022

rafiss requested changes Oct 25, 2022

View reviewed changes

ajwerner commented Oct 25, 2022

View reviewed changes

rafiss requested changes Oct 25, 2022

View reviewed changes

ajwerner commented Oct 25, 2022

View reviewed changes

ajwerner mentioned this pull request Oct 30, 2022

rttanalysis: extend framework to track number of descriptors fetched #90960

Open

ajwerner force-pushed the ajwerner/faster-virtual-tables branch 2 times, most recently from 0b1ad30 to 27a65cc Compare October 31, 2022 03:27

ajwerner requested a review from a team as a code owner October 31, 2022 03:27

ajwerner force-pushed the ajwerner/faster-virtual-tables branch from 89bdb29 to 95a770e Compare October 31, 2022 14:30

mgartner approved these changes Oct 31, 2022

View reviewed changes

ajwerner added 6 commits October 31, 2022 16:45

sql,opt: add support for partial indexes on virtual tables

f588ab6

Release note: None

sql: optimize SHOW TABLES and crdb_internal.tables

88ccc92

This adds some virtual indexes to `crdb_internal.tables` so that it performs an index join when looking up the tables in a database. Release note: None

sql: rename virtualIndex.partial to incomplete

8899e63

A partial index already means something. This means something different. Release note: None

sql: ensure that generator isn't used with an incomplete virtual index

81e66ce

The index would never be used. This eliminates a footgun. Release note: None

ajwerner force-pushed the ajwerner/faster-virtual-tables branch from 95a770e to 81e66ce Compare October 31, 2022 20:46

ajwerner commented Oct 31, 2022

View reviewed changes

craig bot merged commit 3506349 into cockroachdb:master Nov 1, 2022

cockroach-teamcity mentioned this pull request Nov 1, 2022

PR #90116 - sql: add option to use lookup join for db-scope virtual tables cockroachdb/docs#15522

Open

ajwerner mentioned this pull request Nov 1, 2022

release-22.2: sql,opt: improve SHOW TABLES and virtual tables with many databases #91054

Merged

sql: add option to use lookup join for db-scope virtual tables #90116

sql: add option to use lookup join for db-scope virtual tables #90116

Conversation

ajwerner commented Oct 18, 2022 • edited Loading

cockroach-teamcity commented Oct 18, 2022

chengxiong-ruan left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ajwerner left a comment

Choose a reason for hiding this comment

chengxiong-ruan left a comment

Choose a reason for hiding this comment

chengxiong-ruan commented Oct 24, 2022

ajwerner commented Oct 24, 2022

ajwerner commented Oct 24, 2022

rafiss left a comment

Choose a reason for hiding this comment

rafiss left a comment

Choose a reason for hiding this comment

ajwerner left a comment

Choose a reason for hiding this comment

rafiss left a comment

Choose a reason for hiding this comment

ajwerner left a comment

Choose a reason for hiding this comment

ajwerner commented Oct 26, 2022

craig bot commented Oct 26, 2022

ajwerner commented Oct 26, 2022

craig bot commented Oct 26, 2022

rafiss commented Oct 28, 2022

ajwerner commented Oct 30, 2022

ajwerner commented Oct 31, 2022

ajwerner commented Oct 31, 2022

ajwerner commented Oct 31, 2022

mgartner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ajwerner left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ajwerner commented Nov 1, 2022

craig bot commented Nov 1, 2022

blathers-crl bot commented Nov 1, 2022

ajwerner commented Oct 18, 2022 •

edited

Loading