jobs: Clear out claim columns when pausing jobs #82698

Closed
miretskiy opened this issue Jun 9, 2022 · 0 comments
Labels
A-jobs C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) sync-me-8 T-jobs

miretskiy commented Jun 9, 2022

If a job is paused, it retains its claim_session_id and claim_instance_id columns.
Thus, if it is subsequently resumed, the same node will reclaim the job.
We should clear out both columns when pausing the job.

Currently, pausing and resuming a job is the only mechanism we have to try to
"rebalance" job resumer assignment.
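
To make the proposal concrete, here is a minimal in-memory sketch of the behavior being asked for. It is not the jobs registry code: the `Job` struct, the `Pause`, `Resume`, and `ClaimJobs` functions, and the string status values are all hypothetical names chosen for illustration. Only the two claim column names come from the issue. The sketch assumes a job counts as unclaimed when its session ID is empty.

```go
package main

import "fmt"

// Job is a hypothetical, simplified stand-in for a row in the jobs table.
// Only the two claim fields mirror column names mentioned in the issue.
type Job struct {
	ID              int
	Status          string // "running" or "paused" (illustrative values)
	ClaimSessionID  string // empty means unclaimed (assumption of this sketch)
	ClaimInstanceID int    // 0 means unclaimed (assumption of this sketch)
}

// Pause marks the job paused and clears both claim fields, so that the
// previous claimant has no standing claim once the job is resumed.
func Pause(j *Job) {
	j.Status = "paused"
	j.ClaimSessionID = ""
	j.ClaimInstanceID = 0
}

// Resume marks the job runnable again. It deliberately leaves the claim
// fields untouched; whichever node claims next becomes the owner.
func Resume(j *Job) {
	j.Status = "running"
}

// ClaimJobs lets one node claim up to limit unclaimed running jobs and
// returns how many it claimed. The limit parameter stands in for the
// per-registry claim limit discussed in the issue.
func ClaimJobs(jobs []*Job, sessionID string, instanceID, limit int) int {
	claimed := 0
	for _, j := range jobs {
		if claimed >= limit {
			break
		}
		if j.Status == "running" && j.ClaimSessionID == "" {
			j.ClaimSessionID = sessionID
			j.ClaimInstanceID = instanceID
			claimed++
		}
	}
	return claimed
}

func main() {
	// A job currently claimed by instance 1.
	j := &Job{ID: 1, Status: "running", ClaimSessionID: "session-a", ClaimInstanceID: 1}

	// Pausing clears the claim, so after resuming, a different node can pick it up.
	Pause(j)
	Resume(j)
	n := ClaimJobs([]*Job{j}, "session-b", 2, 1)
	fmt.Println(n, j.ClaimSessionID, j.ClaimInstanceID)
}
```

If `Pause` left the claim fields in place, the `ClaimJobs` call from instance 2 would skip the job, which is the "same node will reclaim the job" behavior described above.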

Also consider lowering the number of jobs a registry can claim at a time.
Bonus: consider moving this limit from an env var to a setting.

Jira issue: CRDB-16553

Epic CRDB-15034

miretskiy added the C-enhancement (Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception)), A-jobs, and T-jobs labels on Jun 9, 2022
craig bot pushed a commit that referenced this issue Sep 30, 2022
89014: jobs: Clear out claim info when pausing r=miretskiy a=miretskiy

Clear out job claim information when a job is paused. Clearing the claim information is beneficial because it lets an operator pause and resume a job to try to move the job coordinator to another node.

Addresses #82698

Release note: none

89026: kvserver: add `SmallEngineBlocks` testing knob and metamorphic params r=erikgrinaker a=erikgrinaker

`@cockroachdb/repl-prs` to do the main review, tagging other teams for visibility/review of metamorphic test params.

Resolves #86648.

---

**kvserver: add `SmallEngineBlocks` testing knob**

This patch adds a store testing knob `SmallEngineBlocks` which
configures Pebble with a block size of 1 byte. This will store every key
in a separate block, which can provoke bugs in time-bound iterators.

Release note: None
  
**sql/logictest: add metamorphic test param for small engine blocks**

Uses a Pebble block size of 1 byte, to provoke bugs in time-bound
iterators.

Release note: None
  
**kvserver/rangefeed: add metamorphic test param for small engine blocks**

Uses a Pebble block size of 1 byte, to provoke bugs in time-bound
iterators.

Release note: None
  
**kvserver/gc: add metamorphic test param for small engine blocks**

Uses a Pebble block size of 1 byte, to provoke bugs in time-bound
iterators.

Release note: None

  
**backupccl: add metamorphic test param for small engine blocks**

Uses a Pebble block size of 1 byte, to provoke bugs in time-bound
iterators.

Release note: None

89030: codeowners: add test-eng to owners of pkg/workload r=srosenberg a=srosenberg

Add test-eng as a code owner/watcher for pkg/workload.

In light of recent and future improvements [1], [2], TestEng would prefer to be in sync with all changes to the workload code. Over time, the team plans to build expertise in this area.

[1] #88362 [2] #88566

Release note: None
Release justification: test only change

Co-authored-by: Yevgeniy Miretskiy <[email protected]>
Co-authored-by: Erik Grinaker <[email protected]>
Co-authored-by: Stan Rosenberg <[email protected]>