-
Notifications
You must be signed in to change notification settings - Fork 14
feat: refactor and enhance session creation logic #1343
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1343 +/- ##
==========================================
+ Coverage 94.65% 94.68% +0.02%
==========================================
Files 174 175 +1
Lines 13644 13661 +17
==========================================
+ Hits 12915 12935 +20
+ Misses 729 726 -3
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 4 files at r1.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @devbww, @devjgm, and @mr-salty)
google/cloud/spanner/internal/session.h, line 39 at r1 (raw file):
public: Session(std::string session_name, std::shared_ptr<SpannerStub> stub, int* channel_session_counter)
FYI: passing pointers like this feels icky, and I consider it a code smell. Is there an abstraction we are missing? Maybe an API to say "call here to decrease the session counter"? Or maybe the "dissociate_from_pool" should be a std::function
or abstract class that gets called when the session is removed, and sometimes we initialize it with a noop, and sometimes with a full object that decrements counters and so on?
google/cloud/spanner/internal/session.h, line 58 at r1 (raw file):
private: friend class SessionPool; // for access to stub() and channel_info_index()
nit: s/channel_info_index()/channel_session_counter()/ ?
google/cloud/spanner/internal/session_pool.cc, line 134 at r1 (raw file):
/** * Grow the session pool by creating up to `sessions_to_create` sessions and * adding them to the pool. Note that `lk` may be dropped in the process.
nit: s/dropped/released/
google/cloud/spanner/internal/session_pool.cc, line 175 at r1 (raw file):
std::vector<std::pair<ChannelInfo*, int>> create_counts; for (auto& channel : channels_by_count) { int target = (sessions_remaining / channels_remaining) +
@devbww taught me this trick: (sessions_remaining + channels_remaining - 1) / channels_remaining)
google/cloud/spanner/internal/session_pool.cc, line 204 at r1 (raw file):
} } // TODO(#1172) in the async case this needs to happen when the calls finish.
FYI: consider returning a google::cloud::future<void>
that represents "calls finished", then the client can block or not, at their preference.
google/cloud/spanner/internal/session_pool.cc, line 219 at r1 (raw file):
sessions_.pop_back(); if (dissociate_from_pool) { --*session->channel_session_counter();
Ugh. What do you think of adding a member function called decrease_channel_sessions()
instead?
google/cloud/spanner/internal/session.h, line 39 at r1 (raw file): Previously, coryan (Carlos O'Ryan) wrote…
yeah, I mentioned this in the description, it's not ideal but I also don't think adding another layer of abstraction makes it any better. Let me split this out to another PR like I should have done in the first place. |
*/ | ||
Status SessionPool::Grow(std::unique_lock<std::mutex>& lk, | ||
int sessions_to_create, bool async) { | ||
create_in_progress_ = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Could this simply be moved down to avoid the = false
in the "already at max" case? It seems to be something that should be closely bound to lk
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @coryan, @devjgm, and @mr-salty)
google/cloud/spanner/internal/session.h, line 39 at r1 (raw file):
Previously, mr-salty (Todd Derr) wrote…
yeah, I mentioned this in the description, it's not ideal but I also don't think adding another layer of abstraction makes it any better. Let me split this out to another PR like I should have done in the first place.
ok, I moved the counter stuff to #1346 and I think that's a cleaner approach... I'll resolve those comments on this PR and address the other comments.
google/cloud/spanner/internal/session_pool.cc, line 219 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
Ugh. What do you think of adding a member function called
decrease_channel_sessions()
instead?
Done. (moved to another PR)
Centralize all Session creation in the `Grow()` method, which computes the number of Sessions to create on each channel to balance the load across channels. Also introduces an async mode (not implemented here) which will eventually become the only mode. Also fixes a bug where the per-channel session counts were not being updated properly when `Sessions` were marked bad or disassociated. Maybe this part should be its own PR, and I don't really like how it obfuscates the locking requirements so I'm open to suggestions on how to do this in a cleaner way. Part of #1172
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL - I rebased and addressed comments
Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @coryan, @devbww, and @devjgm)
google/cloud/spanner/internal/session_pool.cc, line 134 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
nit: s/dropped/released/
Done
google/cloud/spanner/internal/session_pool.cc, line 141 at r1 (raw file):
Previously, devbww (Bradley White) wrote…
Nit: Could this simply be moved down to avoid the
= false
in the "already at max" case? It seems to be something that should be closely bound tolk
.
Done
google/cloud/spanner/internal/session_pool.cc, line 175 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
@devbww taught me this trick:
(sessions_remaining + channels_remaining - 1) / channels_remaining)
Done.
google/cloud/spanner/internal/session_pool.cc, line 204 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
FYI: consider returning a
google::cloud::future<void>
that represents "calls finished", then the client can block or not, at their preference.
Ack - my goal is to get rid of synchronous create entirely, which will happen in another PR or two. It's not necessary aside from the issues I had getting tests to pass (which may indicate some other latent issue, or maybe I need to add a testonly method to wait for completion).
In the real world, Allocate()
could end up blocking waiting for Session creation to complete, but that's accomplished via condition variables (because another thread returning a Session
to the pool can also unblock waiters). Otherwise we should never have to actually wait.
(std::min)(total_sessions_ + sessions_to_create, session_limit); | ||
|
||
// Sort the channels in *descending* order of session count. | ||
std::vector<std::shared_ptr<Channel>> channels_by_count; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this just equivalent to
auto chanenls_by_count = channels_;
// to delete sessions just to make the counts equal, so do the best we | ||
// can within those constraints. | ||
int target_total_sessions = | ||
(std::min)(total_sessions_ + sessions_to_create, session_limit); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fwiw, I don't think the (std::min)
trick is needed in .cc files, because we know what our includes are and if our build works, then it works. This differes from .h
files where callers may include us after including the offending Windows include that defines the macro for min
.
* asynchronously. The main obstacle is making existing tests pass. | ||
*/ | ||
Status SessionPool::Grow(std::unique_lock<std::mutex>& lk, | ||
int sessions_to_create, bool async) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bool params like this lead to unclear call sites, hence the need to comment /*async=*/
at call sites. Consider using an enum to make the code itself perfectly clear even without the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some thoughts for you, please also address the questions from @devjgm
Reviewed 2 of 4 files at r2.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @devjgm and @mr-salty)
google/cloud/spanner/internal/session_pool.cc, line 204 at r1 (raw file):
Previously, mr-salty (Todd Derr) wrote…
Ack - my goal is to get rid of synchronous create entirely, which will happen in another PR or two. It's not necessary aside from the issues I had getting tests to pass (which may indicate some other latent issue, or maybe I need to add a testonly method to wait for completion).
In the real world,
Allocate()
could end up blocking waiting for Session creation to complete, but that's accomplished via condition variables (because another thread returning aSession
to the pool can also unblock waiters). Otherwise we should never have to actually wait.
You could return a satisfied future: return google::cloud::make_ready_future()
, which would stabilize your API and remove that unsightly bool that we are giving you so much grief about 😁
google/cloud/spanner/internal/session_pool.cc, line 125 at r2 (raw file):
deleting
This suggests the function should be Resize()
, not Grow
?
google/cloud/spanner/internal/session_pool.cc, line 158 at r2 (raw file):
Previously, devjgm (Greg Miller) wrote…
fwiw, I don't think the
(std::min)
trick is needed in .cc files, because we know what our includes are and if our build works, then it works. This differes from.h
files where callers may include us after including the offending Windows include that defines the macro formin
.
Yeah, but I hate to think.
google/cloud/spanner/internal/session_pool.cc, line 161 at r2 (raw file):
Previously, devjgm (Greg Miller) wrote…
Is this just equivalent to
auto chanenls_by_count = channels_;
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @coryan and @devjgm)
google/cloud/spanner/internal/session_pool.cc, line 204 at r1 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
You could return a satisfied future:
return google::cloud::make_ready_future()
, which would stabilize your API and remove that unsightly bool that we are giving you so much grief about 😁
that doesn't solve the problem for this CL because I call a different RPC based on async
(now wait
). Eventually I'll only use the async RPC (and never have to wait for it, as noted) but I wanted to keep all of that complication out of this CL. And, Status
is the return type I'll eventually need (unless I can get away with void
.)
google/cloud/spanner/internal/session_pool.cc, line 125 at r2 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
deleting
This suggests the function should be
Resize()
, notGrow
?
sorry, that is the comment for MaintainPoolSize
, I just put it in the wrong place. Grow()
only grows.
google/cloud/spanner/internal/session_pool.cc, line 141 at r2 (raw file):
Previously, devjgm (Greg Miller) wrote…
bool params like this lead to unclear call sites, hence the need to comment
/*async=*/
at call sites. Consider using an enum to make the code itself perfectly clear even without the comment.
Done. As I noted below this parameter will be short-lived and is just a concession to splitting the changes into smaller CLs. It is a private method so that seemed like a reasonable step.
google/cloud/spanner/internal/session_pool.cc, line 158 at r2 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
Yeah, but I hate to think.
I'll leave it as-is, I know this has bitten me at least once before... and maybe experiment to see if I can get rid of this in another branch.
google/cloud/spanner/internal/session_pool.cc, line 161 at r2 (raw file):
Previously, coryan (Carlos O'Ryan) wrote…
+1
done - good catch, those weren't originally the same type.
…cloud-cpp-spanner#1343) * feat: refactor and enhance session creation logic Centralize all Session creation in the `Grow()` method, which computes the number of Sessions to create on each channel to balance the load across channels. Also introduces an async mode (not implemented here) which will eventually become the only mode. Also fixes a bug where the per-channel session counts were not being updated properly when `Sessions` were marked bad or disassociated. Maybe this part should be its own PR, and I don't really like how it obfuscates the locking requirements so I'm open to suggestions on how to do this in a cleaner way. Part of googleapis/google-cloud-cpp-spanner#1172 * address review comments * address review comments
Centralize all Session creation in the
Grow()
method, which computesthe number of Sessions to create on each channel to balance the load
across channels. Also introduces an async mode (not implemented here)
which will eventually become the only mode.
Also fixes a bug where the per-channel session counts were not being
updated properly when
Sessions
were marked bad or disassociated.Maybe this part should be its own PR, and I don't really like how it
obfuscates the locking requirements so I'm open to suggestions on how
to do this in a cleaner way.
Part of #1172
This change is![Reviewable](https://camo.githubusercontent.com/1541c4039185914e83657d3683ec25920c672c6c5c7ab4240ee7bff601adec0b/68747470733a2f2f72657669657761626c652e696f2f7265766965775f627574746f6e2e737667)