Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: improve the behavior when hitting the limit on the number of open FDs for disk spilling #91784

Open
yuzefovich opened this issue Nov 12, 2022 · 0 comments
Labels
C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. T-sql-queries SQL Queries Team

Comments

@yuzefovich
Copy link
Member

yuzefovich commented Nov 12, 2022

As of #84398, we now might hit a timeout when trying to acquire the file descriptors when spilling to disk. We should try a bit harder by lowering the desired FD count - e.g. the external sort needs 3 FDs to make progress, but it can be asking for more (at the moment for 1/16 of the node-wide limit), and if it doesn't get it, the sort errors out. We should be smarter here.

Jira issue: CRDB-21425

@yuzefovich yuzefovich added the C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. label Nov 12, 2022
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Nov 12, 2022
craig bot pushed a commit that referenced this issue Nov 15, 2022
91785: colexecdisk: limit number of FDs used by a single disk-backed op r=yuzefovich a=yuzefovich

This commit fixes an oversight in how we're computing the number of file descriptors that a single disk-backed operator can use. Previously, we would start the calculation at 1/16 of the node-wide limit, regardless of what that comes down to. If an operator cannot acquire the FDs, we have a hint suggesting to increase that node-wide limit, but then it wouldn't actually do much - the node would still support the same number of concurrent disk-spilling queries. This is now fixed by capping the number at 16 (which is the number that an operator gets with the default limit value). I don't think there is much upside in going higher - it seems more important to support higher concurrency of disk-spilling queries rather than to speed up a smaller number of queries by a percent.

Addresses: #91784.

Epic: CRDB-20535

Release note: None

91861: sql: add tracing span on client side of SetupFlow RPC r=yuzefovich a=yuzefovich

This commit restores the creation of the tracing span for the clients side of the SetupFlow RPC when it is issued in parallel (by the DistSQL worker). This was lost in a recent refactor.

Epic: None

Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
@mgartner mgartner moved this to Backlog (DO NOT ADD NEW ISSUES) in SQL Queries Jul 24, 2023
@mgartner mgartner moved this from Backlog (DO NOT ADD NEW ISSUES) to New Backlog in SQL Queries Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-cleanup Tech debt, refactors, loose ends, etc. Solution not expected to significantly change behavior. T-sql-queries SQL Queries Team
Projects
Status: Backlog
Development

No branches or pull requests

1 participant