Skip to content
This repository has been archived by the owner on Dec 8, 2021. It is now read-only.

feat: refactor and enhance session creation logic #1343

Merged
merged 3 commits into from
Mar 10, 2020
Merged

feat: refactor and enhance session creation logic #1343

merged 3 commits into from
Mar 10, 2020

Conversation

mr-salty
Copy link
Contributor

@mr-salty mr-salty commented Mar 9, 2020

Centralize all Session creation in the Grow() method, which computes
the number of Sessions to create on each channel to balance the load
across channels. Also introduces an async mode (not implemented here)
which will eventually become the only mode.

Also fixes a bug where the per-channel session counts were not being
updated properly when Sessions were marked bad or disassociated.
Maybe this part should be its own PR, and I don't really like how it
obfuscates the locking requirements so I'm open to suggestions on how
to do this in a cleaner way.

Part of #1172


This change is Reviewable

@mr-salty mr-salty requested review from devjgm and devbww March 9, 2020 08:24
@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Mar 9, 2020
@codecov
Copy link

codecov bot commented Mar 9, 2020

Codecov Report

Merging #1343 into master will increase coverage by 0.02%.
The diff coverage is 91.11%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1343      +/-   ##
==========================================
+ Coverage   94.65%   94.68%   +0.02%     
==========================================
  Files         174      175       +1     
  Lines       13644    13661      +17     
==========================================
+ Hits        12915    12935      +20     
+ Misses        729      726       -3
Impacted Files Coverage Δ
google/cloud/spanner/internal/session_pool.h 0% <ø> (ø) ⬆️
google/cloud/spanner/internal/session_pool.cc 95.75% <91.11%> (-0.82%) ⬇️
google/cloud/spanner/internal/log_wrapper.h 71.42% <0%> (-13.19%) ⬇️
google/cloud/spanner/internal/spanner_stub.cc 65.65% <0%> (-2.03%) ⬇️
...anner/integration_tests/client_integration_test.cc 98.49% <0%> (-0.02%) ⬇️
google/cloud/spanner/keys.h 100% <0%> (ø) ⬆️
google/cloud/spanner/internal/session.h 100% <0%> (ø) ⬆️
google/cloud/spanner/session_pool_options.h 100% <0%> (ø) ⬆️
.../spanner/benchmarks/multiple_rows_cpu_benchmark.cc 91.97% <0%> (ø) ⬆️
...oogle/cloud/spanner/internal/database_admin_stub.h
... and 4 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4f2e66e...df1353e. Read the comment docs.

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @devbww, @devjgm, and @mr-salty)


google/cloud/spanner/internal/session.h, line 39 at r1 (raw file):

 public:
  Session(std::string session_name, std::shared_ptr<SpannerStub> stub,
          int* channel_session_counter)

FYI: passing pointers like this feels icky, and I consider it a code smell. Is there an abstraction we are missing? Maybe an API to say "call here to decrease the session counter"? Or maybe the "dissociate_from_pool" should be a std::function or abstract class that gets called when the session is removed, and sometimes we initialize it with a noop, and sometimes with a full object that decrements counters and so on?


google/cloud/spanner/internal/session.h, line 58 at r1 (raw file):

 private:
  friend class SessionPool;  // for access to stub() and channel_info_index()

nit: s/channel_info_index()/channel_session_counter()/ ?


google/cloud/spanner/internal/session_pool.cc, line 134 at r1 (raw file):

/**
 * Grow the session pool by creating up to `sessions_to_create` sessions and
 * adding them to the pool.  Note that `lk` may be dropped in the process.

nit: s/dropped/released/


google/cloud/spanner/internal/session_pool.cc, line 175 at r1 (raw file):

  std::vector<std::pair<ChannelInfo*, int>> create_counts;
  for (auto& channel : channels_by_count) {
    int target = (sessions_remaining / channels_remaining) +

@devbww taught me this trick: (sessions_remaining + channels_remaining - 1) / channels_remaining)


google/cloud/spanner/internal/session_pool.cc, line 204 at r1 (raw file):

    }
  }
  // TODO(#1172) in the async case this needs to happen when the calls finish.

FYI: consider returning a google::cloud::future<void> that represents "calls finished", then the client can block or not, at their preference.


google/cloud/spanner/internal/session_pool.cc, line 219 at r1 (raw file):

      sessions_.pop_back();
      if (dissociate_from_pool) {
        --*session->channel_session_counter();

Ugh. What do you think of adding a member function called decrease_channel_sessions() instead?

@mr-salty
Copy link
Contributor Author

mr-salty commented Mar 9, 2020


google/cloud/spanner/internal/session.h, line 39 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

FYI: passing pointers like this feels icky, and I consider it a code smell. Is there an abstraction we are missing? Maybe an API to say "call here to decrease the session counter"? Or maybe the "dissociate_from_pool" should be a std::function or abstract class that gets called when the session is removed, and sometimes we initialize it with a noop, and sometimes with a full object that decrements counters and so on?

yeah, I mentioned this in the description, it's not ideal but I also don't think adding another layer of abstraction makes it any better. Let me split this out to another PR like I should have done in the first place.

*/
Status SessionPool::Grow(std::unique_lock<std::mutex>& lk,
int sessions_to_create, bool async) {
create_in_progress_ = true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Could this simply be moved down to avoid the = false in the "already at max" case? It seems to be something that should be closely bound to lk.

Copy link
Contributor Author

@mr-salty mr-salty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @coryan, @devjgm, and @mr-salty)


google/cloud/spanner/internal/session.h, line 39 at r1 (raw file):

Previously, mr-salty (Todd Derr) wrote…

yeah, I mentioned this in the description, it's not ideal but I also don't think adding another layer of abstraction makes it any better. Let me split this out to another PR like I should have done in the first place.

ok, I moved the counter stuff to #1346 and I think that's a cleaner approach... I'll resolve those comments on this PR and address the other comments.


google/cloud/spanner/internal/session_pool.cc, line 219 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

Ugh. What do you think of adding a member function called decrease_channel_sessions() instead?

Done. (moved to another PR)

mr-salty added 2 commits March 9, 2020 18:37
Centralize all Session creation in the `Grow()` method, which computes
the number of Sessions to create on each channel to balance the load
across channels. Also introduces an async mode (not implemented here)
which will eventually become the only mode.

Also fixes a bug where the per-channel session counts were not being
updated properly when `Sessions` were marked bad or disassociated.
Maybe this part should be its own PR, and I don't really like how it
obfuscates the locking requirements so I'm open to suggestions on how
to do this in a cleaner way.

Part of #1172
Copy link
Contributor Author

@mr-salty mr-salty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PTAL - I rebased and addressed comments

Reviewable status: 0 of 2 files reviewed, 2 unresolved discussions (waiting on @coryan, @devbww, and @devjgm)


google/cloud/spanner/internal/session_pool.cc, line 134 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

nit: s/dropped/released/

Done


google/cloud/spanner/internal/session_pool.cc, line 141 at r1 (raw file):

Previously, devbww (Bradley White) wrote…

Nit: Could this simply be moved down to avoid the = false in the "already at max" case? It seems to be something that should be closely bound to lk.

Done


google/cloud/spanner/internal/session_pool.cc, line 175 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

@devbww taught me this trick: (sessions_remaining + channels_remaining - 1) / channels_remaining)

Done.


google/cloud/spanner/internal/session_pool.cc, line 204 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

FYI: consider returning a google::cloud::future<void> that represents "calls finished", then the client can block or not, at their preference.

Ack - my goal is to get rid of synchronous create entirely, which will happen in another PR or two. It's not necessary aside from the issues I had getting tests to pass (which may indicate some other latent issue, or maybe I need to add a testonly method to wait for completion).

In the real world, Allocate() could end up blocking waiting for Session creation to complete, but that's accomplished via condition variables (because another thread returning a Session to the pool can also unblock waiters). Otherwise we should never have to actually wait.

(std::min)(total_sessions_ + sessions_to_create, session_limit);

// Sort the channels in *descending* order of session count.
std::vector<std::shared_ptr<Channel>> channels_by_count;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this just equivalent to

auto chanenls_by_count = channels_;

// to delete sessions just to make the counts equal, so do the best we
// can within those constraints.
int target_total_sessions =
(std::min)(total_sessions_ + sessions_to_create, session_limit);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fwiw, I don't think the (std::min) trick is needed in .cc files, because we know what our includes are and if our build works, then it works. This differes from .h files where callers may include us after including the offending Windows include that defines the macro for min.

* asynchronously. The main obstacle is making existing tests pass.
*/
Status SessionPool::Grow(std::unique_lock<std::mutex>& lk,
int sessions_to_create, bool async) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bool params like this lead to unclear call sites, hence the need to comment /*async=*/ at call sites. Consider using an enum to make the code itself perfectly clear even without the comment.

Copy link
Contributor

@coryan coryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts for you, please also address the questions from @devjgm

Reviewed 2 of 4 files at r2.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @devjgm and @mr-salty)


google/cloud/spanner/internal/session_pool.cc, line 204 at r1 (raw file):

Previously, mr-salty (Todd Derr) wrote…

Ack - my goal is to get rid of synchronous create entirely, which will happen in another PR or two. It's not necessary aside from the issues I had getting tests to pass (which may indicate some other latent issue, or maybe I need to add a testonly method to wait for completion).

In the real world, Allocate() could end up blocking waiting for Session creation to complete, but that's accomplished via condition variables (because another thread returning a Session to the pool can also unblock waiters). Otherwise we should never have to actually wait.

You could return a satisfied future: return google::cloud::make_ready_future(), which would stabilize your API and remove that unsightly bool that we are giving you so much grief about 😁


google/cloud/spanner/internal/session_pool.cc, line 125 at r2 (raw file):

deleting

This suggests the function should be Resize(), not Grow?


google/cloud/spanner/internal/session_pool.cc, line 158 at r2 (raw file):

Previously, devjgm (Greg Miller) wrote…

fwiw, I don't think the (std::min) trick is needed in .cc files, because we know what our includes are and if our build works, then it works. This differes from .h files where callers may include us after including the offending Windows include that defines the macro for min.

Yeah, but I hate to think.


google/cloud/spanner/internal/session_pool.cc, line 161 at r2 (raw file):

Previously, devjgm (Greg Miller) wrote…

Is this just equivalent to

auto chanenls_by_count = channels_;

+1

Copy link
Contributor Author

@mr-salty mr-salty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @coryan and @devjgm)


google/cloud/spanner/internal/session_pool.cc, line 204 at r1 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

You could return a satisfied future: return google::cloud::make_ready_future(), which would stabilize your API and remove that unsightly bool that we are giving you so much grief about 😁

that doesn't solve the problem for this CL because I call a different RPC based on async (now wait). Eventually I'll only use the async RPC (and never have to wait for it, as noted) but I wanted to keep all of that complication out of this CL. And, Status is the return type I'll eventually need (unless I can get away with void.)


google/cloud/spanner/internal/session_pool.cc, line 125 at r2 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…
deleting

This suggests the function should be Resize(), not Grow?

sorry, that is the comment for MaintainPoolSize, I just put it in the wrong place. Grow() only grows.


google/cloud/spanner/internal/session_pool.cc, line 141 at r2 (raw file):

Previously, devjgm (Greg Miller) wrote…

bool params like this lead to unclear call sites, hence the need to comment /*async=*/ at call sites. Consider using an enum to make the code itself perfectly clear even without the comment.

Done. As I noted below this parameter will be short-lived and is just a concession to splitting the changes into smaller CLs. It is a private method so that seemed like a reasonable step.


google/cloud/spanner/internal/session_pool.cc, line 158 at r2 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

Yeah, but I hate to think.

I'll leave it as-is, I know this has bitten me at least once before... and maybe experiment to see if I can get rid of this in another branch.


google/cloud/spanner/internal/session_pool.cc, line 161 at r2 (raw file):

Previously, coryan (Carlos O'Ryan) wrote…

+1

done - good catch, those weren't originally the same type.

@mr-salty mr-salty merged commit 2d10fc2 into googleapis:master Mar 10, 2020
@mr-salty mr-salty deleted the poolsize branch March 10, 2020 18:11
devjgm pushed a commit to devjgm/google-cloud-cpp that referenced this pull request May 7, 2020
…cloud-cpp-spanner#1343)

* feat: refactor and enhance session creation logic

Centralize all Session creation in the `Grow()` method, which computes
the number of Sessions to create on each channel to balance the load
across channels. Also introduces an async mode (not implemented here)
which will eventually become the only mode.

Also fixes a bug where the per-channel session counts were not being
updated properly when `Sessions` were marked bad or disassociated.
Maybe this part should be its own PR, and I don't really like how it
obfuscates the locking requirements so I'm open to suggestions on how
to do this in a cleaner way.

Part of googleapis/google-cloud-cpp-spanner#1172

* address review comments

* address review comments
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants