Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: increase cost for table descriptor fetch during virtual scan #56345

Merged
merged 1 commit into from
Nov 12, 2020

Conversation

rytaft
Copy link
Collaborator

@rytaft rytaft commented Nov 5, 2020

This commit bumps the cost of each virtual scan to 25*randIOCostFactor
from its previous value of 10*randIOCostFactor. This new value threads
the needle so that a lookup join will still be chosen if the predicate
is very selective, but the plan for the PGJDBC query identified in #55140
no longer includes lookup joins.

Fixes #55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.

This commit bumps the cost of each virtual scan to 25*randIOCostFactor
from its previous value of 10*randIOCostFactor. This new value threads
the needle so that a lookup join will still be chosen if the predicate
is very selective, but the plan for the PGJDBC query identified in cockroachdb#55140
no longer includes lookup joins.

Fixes cockroachdb#55140

Release note (performance improvement): Adjusted the cost model in
the optimizer so that the optimizer is less likely to plan a lookup
join into a virtual table. Performing a lookup join into a virtual
table is expensive, so this change will generally result in better
performance for queries involving joins with virtual tables.
@rytaft rytaft requested a review from a team as a code owner November 5, 2020 23:40
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@rytaft
Copy link
Collaborator Author

rytaft commented Nov 5, 2020

I ran the sample workload in #55140 and confirmed that the new plan is much faster (I think I was confused before since I didn't realize how much caching affects performance for this query). On my single-node cluster on my laptop, the old plan executes in 3.565s and the new plan executes in 275ms.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @RaduBerinde, @rafiss, and @rytaft)


pkg/sql/opt/exec/execbuilder/testdata/virtual, line 106 at r1 (raw file):

EXPLAIN SELECT * FROM pg_constraint INNER LOOKUP JOIN pg_index on true

# Test that a gnarly ORM query from ActiveRecord uses lookup joins for speed.

Looks like this was specifically testing that we use lookup join

@rytaft
Copy link
Collaborator Author

rytaft commented Nov 6, 2020


pkg/sql/opt/exec/execbuilder/testdata/virtual, line 106 at r1 (raw file):

Previously, RaduBerinde wrote…

Looks like this was specifically testing that we use lookup join

We are still doing two lookup joins...

Looks like the relevant issue is #48178. My sense from the issue description is that we definitely want to use at least some lookup joins, but it's not clear to me that it has to be this exact plan (and the existing plan does still have some hash/merge joins).

I tried running this query on my laptop to see if there was a performance difference, but I couldn't see any difference. I may not have the right set of tables/settings to see any difference, though. @rafiss do you know how I might be able to test this?

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner, @RaduBerinde, @rafiss, and @rytaft)


pkg/sql/opt/exec/execbuilder/testdata/virtual, line 106 at r1 (raw file):

Previously, rytaft (Rebecca Taft) wrote…

We are still doing two lookup joins...

Looks like the relevant issue is #48178. My sense from the issue description is that we definitely want to use at least some lookup joins, but it's not clear to me that it has to be this exact plan (and the existing plan does still have some hash/merge joins).

I tried running this query on my laptop to see if there was a performance difference, but I couldn't see any difference. I may not have the right set of tables/settings to see any difference, though. @rafiss do you know how I might be able to test this?

A good way to test this is to try running it on a cluster that has ~100 tables with multiple foreign keys per table.

Copy link
Collaborator

@mgartner mgartner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @RaduBerinde and @rytaft)

@rytaft
Copy link
Collaborator Author

rytaft commented Nov 12, 2020


pkg/sql/opt/exec/execbuilder/testdata/virtual, line 106 at r1 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

A good way to test this is to try running it on a cluster that has ~100 tables with multiple foreign keys per table.

I did this experiment and found that the times are nearly identical (~120 ms in each case).

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome, thanks for this improvement!

lgtm!

@rytaft
Copy link
Collaborator Author

rytaft commented Nov 12, 2020

TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Nov 12, 2020

This PR was included in a batch that was canceled, it will be automatically retried

@craig
Copy link
Contributor

craig bot commented Nov 12, 2020

Build succeeded:

@craig craig bot merged commit a5e67b6 into cockroachdb:master Nov 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: 100x performance degradation in pgjdbc query to fetch columns
5 participants