jobs: remove FOR UPDATE clause when updating job #67660
pkg/jobs/update.go
@@ -119,7 +119,7 @@ func (j *Job) Update(ctx context.Context, txn *kv.Txn, updateFn UpdateFn) error
 	var payload *jobspb.Payload
 	var progress *jobspb.Progress
 	if err := j.runInTxn(ctx, txn, func(ctx context.Context, txn *kv.Txn) error {
-		stmt := "SELECT status, payload, progress FROM system.jobs WHERE id = $1 FOR UPDATE"
+		stmt := "SELECT status, payload, progress FROM system.jobs WHERE id = $1"
 		if j.sessionID != "" {
 			stmt = "SELECT status, payload, progress, claim_session_id FROM system." +
 				"jobs WHERE id = $1 FOR UPDATE"
Do we want to remove the FOR UPDATE locking clause from this statement as well?
Yes, whoops.
pkg/jobs/update.go
@@ -119,7 +119,7 @@ func (j *Job) Update(ctx context.Context, txn *kv.Txn, updateFn UpdateFn) error
 	var payload *jobspb.Payload
 	var progress *jobspb.Progress
 	if err := j.runInTxn(ctx, txn, func(ctx context.Context, txn *kv.Txn) error {
-		stmt := "SELECT status, payload, progress FROM system.jobs WHERE id = $1 FOR UPDATE"
+		stmt := "SELECT status, payload, progress FROM system.jobs WHERE id = $1"
Would it be worth retaining the FOR UPDATE clause for calls to (*Job).Update that do intend to update the system.jobs row? Can we distinguish those from the cases that are just using this function to read by checking whether updateFn is nil (after replacing no-op functions with nil functions)?
I don't think that's the tool we want to use. In almost all cases the job is either updated by its coordinator or from pause or cancel. In those cases, I'd actually prefer a pause or cancel to be able to overwrite the status during a long-running update of a job. The only case I think we want locking is this one right here:
cockroach/pkg/sql/row/expr_walker.go, line 346 in 0cea9dc:
	err := j.Registry.UpdateJobWithTxn(ctx, j.JobID, txn, resolveChunkFunc)
I think it's the only place where we try to modify the job from multiple nodes concurrently during normal interaction. I'm going to do some plumbing to retain the locking in that call.
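The plumbing described here could look roughly like the following sketch. It is not the code from this PR: fakeJobs, its methods, and the simplified UpdateFn signature are hypothetical stand-ins, and the in-memory map replaces system.jobs. The only point is that the common update path fixes the lock flag to false while one dedicated entry point opts in.

```go
package main

import "fmt"

// UpdateFn is a simplified, hypothetical stand-in for the jobs package's
// UpdateFn; the real signature differs.
type UpdateFn func(status string) (newStatus string)

// fakeJobs is a hypothetical in-memory stand-in for system.jobs. It counts
// how many reads asked for a lock so the plumbing can be observed.
type fakeJobs struct {
	status      map[int64]string
	lockedReads int
}

// update is the shared implementation; the caller decides useReadLock.
func (f *fakeJobs) update(jobID int64, useReadLock bool, fn UpdateFn) {
	if useReadLock {
		// Real code would add FOR UPDATE to the SELECT at this point.
		f.lockedReads++
	}
	f.status[jobID] = fn(f.status[jobID])
}

// Update is the common path (coordinator checkpoints, pause, cancel).
// It never takes the lock.
func (f *fakeJobs) Update(jobID int64, fn UpdateFn) {
	const useReadLock = false
	f.update(jobID, useReadLock, fn)
}

// UpdateWithReadLock is for the one caller that modifies the job from
// several nodes concurrently and therefore keeps the lock.
func (f *fakeJobs) UpdateWithReadLock(jobID int64, fn UpdateFn) {
	const useReadLock = true
	f.update(jobID, useReadLock, fn)
}

func main() {
	jobs := &fakeJobs{status: map[int64]string{1: "running"}}
	jobs.Update(1, func(string) string { return "paused" })
	jobs.UpdateWithReadLock(1, func(s string) string { return s })
	fmt.Println(jobs.status[1], jobs.lockedReads)
}
```

Keeping the flag out of the common Update signature means existing callers do not change, and the locking call site stays easy to find.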
e589ffe to 970813d
@nvanbenschoten, @adityamaru, see how this makes you feel.
Reviewed 1 of 3 files at r2.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner and @sajjadrizvi)
pkg/jobs/registry.go, line 557 at r2 (raw file):
// that a txn will be automatically created.
func (r *Registry) UpdateJobWithTxn(
	ctx context.Context, jobID jobspb.JobID, txn *kv.Txn, useReadLock bool, updateFunc UpdateFn,
This new param could use a comment.
pkg/jobs/update.go, line 119 at r2 (raw file):
// defined in jobs.go.
func (j *Job) Update(ctx context.Context, txn *kv.Txn, updateFn UpdateFn) error {
	const useForUpdate = false
nit: we use three different names for this: useReadLock, useForUpdate, and useForUpdateReadLock. Consider consolidating.
pkg/jobs/update.go, line 270 at r2 (raw file):
switch {
case hasSessionID && !useForUpdate:
	return "SELECT " + columnsWithSession + from
This is fine, though I'd imagine the logic would be easier to read like:
	cols := columnsWithoutSession
	if hasSessionID {
		cols = columnsWithSession
	}
	sfu := ""
	if useForUpdate {
		sfu = " FOR UPDATE"
	}
	return "SELECT " + cols + from + sfu
Was the intention to avoid runtime string concatenation costs?
Reviewed 1 of 3 files at r2.
Reviewable status: complete! 2 of 0 LGTMs obtained (waiting on @ajwerner)
In cockroachdb currently, the `FOR UPDATE` lock is an exclusive lock. That means that both clients trying to inspect jobs and the job adoption loops will scan the table and encounter these locks. For the most part, we don't really update the job from the leaves of a distsql flow. There is one exception, which is IMPORT incrementing a sequence. In that case, which motivated the initial locking addition, we'll leave the locking. The other exception is pausing or canceling jobs. I think that in that case we prefer to invalidate the work of the transaction, as our intention is to cancel it.
If cockroach implemented UPGRADE locks (cockroachdb#49684), then this FOR UPDATE would not be a problem.
Release note (performance improvement): Jobs no longer hold exclusive locks for the duration of their checkpointing transactions, which could result in long wait times when trying to run SHOW JOBS.
970813d to 269bf63
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @nvanbenschoten and @sajjadrizvi)
pkg/jobs/registry.go, line 557 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
This new param could use a comment.
Done.
pkg/jobs/update.go, line 119 at r2 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
nit: we use three different names for this: useReadLock, useForUpdate, and useForUpdateReadLock. Consider consolidating.
Done.
pkg/jobs/update.go, line 270 at r2 (raw file):
Was the intention to avoid runtime string concatenation costs?
Yes, that was in my head when writing this. Probably silly. I reworked it a bit for readability, but it still returns a constant. I think the new version is better than the old one and is good enough.
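A minimal sketch of one way to return a constant while keeping the two decisions separate, for anyone following along. It is not the code that landed in r3: selectStmt and the stmt* constant names are made up for illustration, and the SQL text is copied from the statements in the diff above. Go evaluates concatenations of string constants at compile time, so no string is built at runtime.

```go
package main

import "fmt"

const (
	columnsWithoutSession = "status, payload, progress"
	columnsWithSession    = columnsWithoutSession + ", claim_session_id"
	from                  = " FROM system.jobs WHERE id = $1"
	forUpdate             = " FOR UPDATE"

	// These are constant expressions, so the compiler folds them into
	// four fixed strings.
	stmtNoSession          = "SELECT " + columnsWithoutSession + from
	stmtNoSessionForUpdate = stmtNoSession + forUpdate
	stmtSession            = "SELECT " + columnsWithSession + from
	stmtSessionForUpdate   = stmtSession + forUpdate
)

// selectStmt (hypothetical name) picks one of the four precomputed
// statements based on the two flags.
func selectStmt(hasSessionID, useReadLock bool) string {
	if hasSessionID {
		if useReadLock {
			return stmtSessionForUpdate
		}
		return stmtSession
	}
	if useReadLock {
		return stmtNoSessionForUpdate
	}
	return stmtNoSession
}

func main() {
	fmt.Println(selectStmt(false, false))
	fmt.Println(selectStmt(true, true))
}
```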
Reviewed 2 of 2 files at r3.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @ajwerner)
TFTR! bors r+
Build succeeded.
In cockroachdb currently, the FOR UPDATE lock is an exclusive lock. That
means that both clients trying to inspect jobs and the job adoption loops will
scan the table and encounter these locks. For the most part, we
don't really update the job from the leaves of a distsql flow. There is one
exception, which is IMPORT incrementing a sequence. Nevertheless, the retry
behavior there seems sound. The other exception is pausing or canceling jobs.
I think that in that case we prefer to invalidate the work of the transaction,
as our intention is to cancel it.
If cockroach implemented UPGRADE locks (#49684), then this FOR UPDATE would
not be a problem.
Release note (performance improvement): Jobs no longer hold exclusive locks
for the duration of their checkpointing transactions, which could result in
long wait times when trying to run SHOW JOBS.