sql: reduce contention on SessionRegistry mutex #95553
Conversation
```go
		session.cancelSession()
		return &serverpb.CancelSessionResponse{Canceled: true}, nil
	}
	session, ok := r.getSessionByID(sessionID)
```
This now uses the map to look up the session instead of scanning.
nice!
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ecwall and @yuzefovich)
pkg/server/status.go
line 223 at r1 (raw file):
```go
b.sessionRegistry.RLock()
defer b.sessionRegistry.RUnlock()
sessions = b.sessionRegistry.SerializeAll(false /* lock */)
```
it looks like this is the only instance where `false` is passed to `SerializeAll`, and the lock is just acquired above. So can we just get rid of the boolean parameter and always do the locking inside of `SerializeAll`?
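The suggestion above can be sketched as follows. This is a minimal, hypothetical stand-in (the real `SessionRegistry`, `registrySession`, and `serverpb.Session` types are richer, and CockroachDB uses `syncutil.RWMutex` rather than `sync.RWMutex`): the mutex is owned by the registry, so `SerializeAll` always locks internally and the boolean parameter disappears from the signature.

```go
package main

import (
	"fmt"
	"sync"
)

// Hypothetical stand-ins for the real CockroachDB types.
type serializedSession struct{ id int }

type registrySession struct{ id int }

func (s registrySession) serialize() serializedSession { return serializedSession{id: s.id} }

// SessionRegistry sketch: the mutex lives inside the registry, so
// SerializeAll can lock unconditionally and callers never touch it.
type SessionRegistry struct {
	mu       sync.RWMutex
	sessions map[int]registrySession
}

// SerializeAll snapshots all sessions under a read lock.
func (r *SessionRegistry) SerializeAll() []serializedSession {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]serializedSession, 0, len(r.sessions))
	for _, s := range r.sessions {
		out = append(out, s.serialize())
	}
	return out
}

func main() {
	r := &SessionRegistry{sessions: map[int]registrySession{1: {id: 1}, 2: {id: 2}}}
	fmt.Println(len(r.SerializeAll()))
}
```

With this shape, the caller in `status.go` no longer needs its own `RLock`/`RUnlock` pair around the call.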
pkg/sql/exec_util.go
line 2100 at r1 (raw file):
```go
sessions := make([]registrySession, len(r.sessions))
i := 0
for _, session := range r.sessions {
```
nit: this could be `for i, session := range r.session` so you don't need to increment `i` manually
Nice finds!
Reviewed 4 of 5 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ecwall and @rafiss)
pkg/server/status.go
line 223 at r1 (raw file):
> Previously, rafiss (Rafi Shamim) wrote…
> it looks like this is the only instance where `false` is passed to `SerializeAll`, and the lock is just acquired above. So can we just get rid of the boolean parameter and always do the locking inside of `SerializeAll`?
+1 - I think we'll just want to RLock and RUnlock in `getSessions` unconditionally, and everything else should just work.
pkg/sql/exec_util.go
line 2100 at r1 (raw file):
> Previously, rafiss (Rafi Shamim) wrote…
> nit: this could be `for i, session := range r.session` so you don't need to increment `i` manually
I don't think it would work since `i` would be a key in the map (i.e. `clusterunique.ID`) and not an integer index.
We could instead change the allocation of `sessions` to `make([]registrySession, 0, len(r.sessions))` and then use `append` in order to avoid the index altogether.
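The point above can be demonstrated in isolation. In Go, ranging over a map yields the key (here a string, analogous to `clusterunique.ID`), not an integer position, so the two-variable range form cannot replace the manual index; allocating with zero length and the map's size as capacity, then using `append`, avoids the index entirely. This is a generic sketch, not the actual CockroachDB code:

```go
package main

import "fmt"

func main() {
	// Keys are strings here, standing in for clusterunique.ID.
	sessions := map[string]int{"a": 1, "b": 2}

	// `for k, v := range sessions` yields the map key k, not an
	// integer index, so it cannot index into a slice directly.
	// make([]T, 0, n) allocates capacity n with length 0, so append
	// fills the slice without manual index bookkeeping.
	out := make([]int, 0, len(sessions))
	for _, v := range sessions {
		out = append(out, v)
	}
	fmt.Println(len(out))
}
```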
pkg/sql/exec_util.go
line 2186 at r2 (raw file):
```go
}

// SerializeAll returns a slice of all sessions in the registry, converted to serverpb.Sessions.
```
nit: wrap comments at 80 characters.
pkg/sql/exec_util.go
line 2192 at r2 (raw file):
```go
i := 0
for _, s := range sessions {
	response[i] = s.serialize()
```
Currently there is no increment of `i`, so this seems broken - we should probably just keep the `append`ing behavior.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @yuzefovich)
pkg/server/status.go
line 223 at r1 (raw file):
> Previously, yuzefovich (Yahor Yuzefovich) wrote…
> +1 - I think we'll just want to RLock and RUnlock in `getSessions` unconditionally, and everything else should just work.

The difference is that `b.closedSessionCache.GetSerializedSessions()` is also being locked without being unlocked in between. Why does that not need to be included in the lock? Or are you suggesting to lock here -> lock in getSessions -> unlock in getSessions -> unlock here?
pkg/sql/exec_util.go
line 2100 at r1 (raw file):
> Previously, yuzefovich (Yahor Yuzefovich) wrote…
> I don't think it would work since `i` would be a key in the map (i.e. `clusterunique.ID`) and not an integer index. We could instead change the allocation of `sessions` to `make([]registrySession, 0, len(r.sessions))` and then use `append` in order to avoid the index altogether.

Changed to `append` + `make([]registrySession, 0, len(r.sessions))`
pkg/sql/exec_util.go
line 2186 at r2 (raw file):
> Previously, yuzefovich (Yahor Yuzefovich) wrote…
> nit: wrap comments at 80 characters.
Fixed.
pkg/sql/exec_util.go
line 2192 at r2 (raw file):
> Previously, yuzefovich (Yahor Yuzefovich) wrote…
> Currently there is no increment of `i`, so this seems broken - we should probably just keep the `append`ing behavior.
Changed to append.
Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ecwall and @rafiss)
pkg/server/status.go
line 223 at r1 (raw file):
> Previously, ecwall (Evan Wall) wrote…
> The difference is that `b.closedSessionCache.GetSerializedSessions()` is also being locked without being unlocked in between. Why does that not need to be included in the lock? Or are you suggesting to lock here -> lock in getSessions -> unlock in getSessions -> unlock here?
But it's a different lock, right? So we'll have `b.sessionRegistry.RLock()` and `b.sessionRegistry.RUnlock()` inside of `b.sessionRegistry.SerializeAll()`; once that method returns, we do `b.closedSessionCache.mu.Lock()` and `b.closedSessionCache.mu.Unlock()` inside of `GetSerializedSessions`. The two locks seem independent of each other, or am I missing something?
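The locking pattern being described can be sketched with two toy types (hypothetical stand-ins for the registry and the closed-session cache, using plain `sync` mutexes): each structure guards its own state with its own mutex, each lock is acquired and released entirely inside its own method, and the two critical sections run one after the other rather than nested.

```go
package main

import (
	"fmt"
	"sync"
)

// registry stands in for SessionRegistry: its read lock is held only
// inside SerializeAll.
type registry struct {
	mu   sync.RWMutex
	open []string
}

func (r *registry) SerializeAll() []string {
	r.mu.RLock()
	defer r.mu.RUnlock() // released before the closed cache is consulted
	return append([]string(nil), r.open...)
}

// closedCache stands in for the closed-session cache: a separate,
// independent mutex.
type closedCache struct {
	mu     sync.Mutex
	closed []string
}

func (c *closedCache) GetSerializedSessions() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return append([]string(nil), c.closed...)
}

func main() {
	r := &registry{open: []string{"s1"}}
	c := &closedCache{closed: []string{"s2"}}
	// Sequential, non-overlapping critical sections on independent locks.
	all := append(r.SerializeAll(), c.GetSerializedSessions()...)
	fmt.Println(len(all))
}
```

Because neither lock is held while the other is acquired, there is no lock ordering to maintain between them.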
pkg/sql/exec_util.go
line 2065 at r3 (raw file):
```go
// Use register() and deregister() to modify this registry.
type SessionRegistry struct {
	syncutil.RWMutex
```
nit: it's probably better to name this mutex, i.e. `mu syncutil.RWMutex`, so that only the `SessionRegistry` object itself would lock / unlock it as necessary. Also it'd probably be even nicer to move the other two fields into an anonymous struct:
```go
type SessionRegistry struct {
	mu struct {
		syncutil.RWMutex
		sessionsByID ...
		sessionsByCancelKey ...
	}
}
```
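As a rough, self-contained illustration of that layout (field types and method names here are hypothetical; the real registry keys on `clusterunique.ID` and uses `syncutil.RWMutex`): embedding the `RWMutex` in an anonymous `mu` struct alongside the maps it guards documents exactly which state the lock protects, and keeps all locking internal to the type's methods.

```go
package main

import (
	"fmt"
	"sync"
)

// SessionRegistry sketch with the suggested anonymous mu struct.
type SessionRegistry struct {
	mu struct {
		sync.RWMutex
		sessionsByID        map[int]string
		sessionsByCancelKey map[int]string
	}
}

func NewSessionRegistry() *SessionRegistry {
	r := &SessionRegistry{}
	r.mu.sessionsByID = make(map[int]string)
	r.mu.sessionsByCancelKey = make(map[int]string)
	return r
}

// register takes the write lock only around the map mutation.
func (r *SessionRegistry) register(id, cancelKey int, name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.mu.sessionsByID[id] = name
	r.mu.sessionsByCancelKey[cancelKey] = name
}

// getSessionByID takes only the read lock.
func (r *SessionRegistry) getSessionByID(id int) (string, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	s, ok := r.mu.sessionsByID[id]
	return s, ok
}

func main() {
	r := NewSessionRegistry()
	r.register(1, 100, "conn-1")
	s, ok := r.getSessionByID(1)
	fmt.Println(s, ok)
}
```

Accessing the maps as `r.mu.sessionsByID` makes any unlocked access stand out in review.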
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @yuzefovich)
pkg/server/status.go
line 223 at r1 (raw file):
> Previously, yuzefovich (Yahor Yuzefovich) wrote…
> But it's a different lock, right? So we'll have `b.sessionRegistry.RLock()` and `b.sessionRegistry.RUnlock()` inside of `b.sessionRegistry.SerializeAll()`; once that method returns, we do `b.closedSessionCache.mu.Lock()` and `b.closedSessionCache.mu.Unlock()` inside of `GetSerializedSessions`. The two locks seem independent of each other, or am I missing something?

The way it is special cased, I assumed that the `sessionRegistry.Lock()` was required over the whole operation to prevent duplicates somehow, but I didn't look into it to verify that, so I'll do that next. I would like to simplify it like you suggested if possible.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss and @yuzefovich)
pkg/server/status.go
line 223 at r1 (raw file):
> Previously, ecwall (Evan Wall) wrote…
> The way it is special cased, I assumed that the `sessionRegistry.Lock()` was required over the whole operation to prevent duplicates somehow, but I didn't look into it to verify that, so I'll do that next. I would like to simplify it like you suggested if possible.

It looks like it was just to make sure that the closed sessions did not overlap with the open ones. I changed it to not lock over the whole check and to just prefer closed sessions to open ones when populating `userSessions`.
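The deduplication described above can be sketched as a simple merge (names like `combine` and the `session` struct are hypothetical, not the actual `status.go` code): when the same session ID appears in both the closed-session cache and the open registry, the closed copy wins, so no lock needs to span both reads.

```go
package main

import "fmt"

// session is a toy stand-in for a serialized session.
type session struct {
	id     int
	status string
}

// combine merges closed and open sessions, preferring the closed copy
// when an ID appears in both lists.
func combine(closed, open []session) []session {
	seen := make(map[int]bool, len(closed))
	out := make([]session, 0, len(closed)+len(open))
	for _, s := range closed {
		seen[s.id] = true
		out = append(out, s) // closed sessions take precedence
	}
	for _, s := range open {
		if !seen[s.id] {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	closed := []session{{id: 1, status: "closed"}}
	open := []session{{id: 1, status: "open"}, {id: 2, status: "open"}}
	fmt.Println(len(combine(closed, open)))
}
```

Session 1 appears once (the closed copy), so duplicates are avoided without holding the registry lock across the closed-cache read.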
pkg/sql/exec_util.go
line 2065 at r3 (raw file):
> Previously, yuzefovich (Yahor Yuzefovich) wrote…
> nit: it's probably better to name this mutex, i.e. `mu syncutil.RWMutex`, so that only the `SessionRegistry` object itself would lock / unlock it as necessary. Also it'd probably be even nicer to move the other two fields into an anonymous struct:
>
> ```go
> type SessionRegistry struct {
> 	mu struct {
> 		syncutil.RWMutex
> 		sessionsByID ...
> 		sessionsByCancelKey ...
> 	}
> }
> ```
Done.
Nice! But probably worth waiting for Rafi to take a look too. I'm assuming that we'll want to backport this, right? Consider adding the corresponding labels.
Reviewed 4 of 5 files at r4, 1 of 1 files at r5, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @rafiss)
pkg/server/status.go
line 223 at r1 (raw file):
> Previously, ecwall (Evan Wall) wrote…
> It looks like it was just to make sure that the closed sessions did not overlap with the open ones. I changed it to not lock over the whole check and to just prefer closed sessions to open ones when populating `userSessions`.
Oh, I initially missed that comment about deduplicating sessions.
Fixes #95535

This reduces `SessionRegistry` contention in 2 ways:
1) Change the `SessionRegistry` `syncutil.Mutex` to a `syncutil.RWMutex` and use `RLock()` in place of `Lock()` when not modifying the registry.
2) Only lock reading/writing the `SessionRegistry` maps with the mutex instead of including the session operations in the lock.

Release note (bug fix): Reduce register session, deregister session, and session cancel query contention.
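Point (2) of the change can be illustrated with a small sketch (hypothetical types and method names, plain `sync` instead of `syncutil`): the read lock covers only the map lookup, and the potentially slow per-session operation runs after the lock is released, so session work no longer serializes the whole registry.

```go
package main

import (
	"fmt"
	"sync"
)

type session struct{ id int }

// cancel stands in for the real per-session cancellation work, which
// may be slow and must not run under the registry lock.
func (s *session) cancel() {}

type registry struct {
	mu           sync.RWMutex
	sessionsByID map[int]*session
}

// CancelSession holds the read lock only for the map lookup; the
// session operation itself happens outside the critical section.
func (r *registry) CancelSession(id int) bool {
	r.mu.RLock()
	s, ok := r.sessionsByID[id]
	r.mu.RUnlock() // released before the session operation
	if !ok {
		return false
	}
	s.cancel()
	return true
}

func main() {
	r := &registry{sessionsByID: map[int]*session{7: {id: 7}}}
	fmt.Println(r.CancelSession(7), r.CancelSession(8))
}
```

Multiple cancellations can now proceed concurrently under `RLock`, and writers only contend for the brief map update.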
lgtm! thanks for improving this
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @yuzefovich)
bors r=yuzefovich,rafiss

Build failed (retrying...)

Build succeeded
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool.
error creating merge commit from 56d3687 to blathers/backport-release-22.1-95553: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict - you may need to manually resolve merge conflicts with the backport tool.
Backport to branch 22.1.x failed. See errors above.
🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.