stats: include system.jobs in auto stats collection #102213
Comments
GH is claiming I removed those labels but I don't think I did? Maybe some stale browser state or something. Putting them back.
That said, I'm not sure this is a GA-blocker: all the queries we are running now are apparently running well enough, and we've manually index-hinted the one that wasn't, so while I do suspect that in the long term we want stats on this table, I don't feel strongly that we do in 23.1.
[yahor] We split off the most active/contentious part of the jobs table, so we may be able to reconsider collecting stats for the jobs table to ensure good index selection. One concern is that the auto-stats query that checks for existing auto-stats checks the jobs table. Removing the GA-blocker label. Plan is to enable stats on the jobs table for master, then consider back-porting to 23.1.1.
Fixes cockroachdb#102213 Release note (performance improvement): We now automatically collect table statistics on the system.jobs table, which will enable the optimizer to produce better query plans for internal queries that access the system.jobs table. This may result in better performance of the system.
102594: sql/stats: collect automatic table statistics on the system.jobs table r=rytaft a=rytaft Fixes #102213 Release note (performance improvement): We now automatically collect table statistics on the `system.jobs` table, which will enable the optimizer to produce better query plans for internal queries that access the `system.jobs` table. This may result in better performance of the system. Co-authored-by: Rebecca Taft <[email protected]>
102637: release-23.1: sql/stats: collect automatic table statistics on the system.jobs table r=rytaft a=blathers-crl[bot] Backport 1/1 commits from #102594 on behalf of `@rytaft.` /cc `@cockroachdb/release` ---- Fixes #102213 Release note (performance improvement): We now automatically collect table statistics on the `system.jobs` table, which will enable the optimizer to produce better query plans for internal queries that access the `system.jobs` table. This may result in better performance of the system. ---- Release justification: with changes to the jobs infrastructure, it's more important to have accurate stats on the jobs table. Co-authored-by: Rebecca Taft <[email protected]>
In #80887 we excluded system.jobs from automatic stats collection, but this is a system table that has 5 different indexes and is used by a wide variety of queries coming from different places -- the jobs system itself, inspection tools, virtual tables, joins with other jobs-info, etc. -- with a variety of different predicates. Choosing the best plan and index for each of these is obviously going to depend on the predicate, the join, and the data distribution, and this is exactly the kind of information that gathered statistics would provide to the optimizer.
The lack of statistics here was recently implicated in an incident involving a query on system.jobs that had two predicates -- one on jobs.job_type, that it be one of x values, and one on jobs.status, that it be one of y values. The chosen plan used the job_type index instead of the job status index, even though there were 130,000 jobs of the matching type and only ~10 of the matching status.
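To make the failure mode concrete, here is a toy sketch of selectivity-based index choice, using the row counts from the incident above. It is not CockroachDB's actual cost model: the index names (`job_type_idx`, `status_idx`), the `choose_index` helper, and the placeholder estimate used when no statistics exist are all hypothetical, chosen only to illustrate why missing stats can leave the choice effectively arbitrary.

```python
from typing import Dict, Optional

# Hypothetical placeholder used when no statistics exist for a predicate.
# This is an assumption for illustration, not CockroachDB's real default.
DEFAULT_ESTIMATE = 1000


def choose_index(candidates: Dict[str, Optional[int]]) -> str:
    """Return the index with the lowest estimated number of matching rows.

    `candidates` maps an index name to the estimated number of rows its
    predicate matches, or None when no statistics are available.
    """
    def estimate(name: str) -> int:
        rows = candidates[name]
        return DEFAULT_ESTIMATE if rows is None else rows

    # min() returns the first minimal item, so ties go to the earliest candidate.
    return min(candidates, key=estimate)


# Without statistics both predicates look equally selective, so the tie is
# broken by candidate order and the job_type index can win.
without_stats = choose_index({"job_type_idx": None, "status_idx": None})

# With statistics, the ~10-row status predicate clearly beats the
# 130,000-row job_type predicate.
with_stats = choose_index({"job_type_idx": 130_000, "status_idx": 10})

print(without_stats, with_stats)
```

In this toy model the only thing that changes between the two calls is whether per-predicate row estimates are available, which is the information automatic stats collection on system.jobs would supply.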
Jira issue: CRDB-27344