-
Notifications
You must be signed in to change notification settings - Fork 4.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new Resolver and Balancer APIs (gRFC L9) #1408
Conversation
ab7246e
to
6aaa555
Compare
balancer/balancer.go
Outdated
|
||
var ( | ||
// m is a map from name to balancer builder. | ||
m map[string]Builder |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
m = make(map[string]Builder)
to save the line in init()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
balancer/balancer.go
Outdated
// Get returns the resolver builder registered with the given name. | ||
// If no builder is register with the name, the default pickfirst will | ||
// be used. | ||
func Get(name string) (b Builder, ok bool) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is "ok" necessary? Or can the user check b == nil?
or even:
var defaultBuilder Builder = &pickfirst.Builder{}
-- now "!ok" is not possible (once implemented).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Return defaultBuilder
if map lookup is not ok.
balancer/balancer.go
Outdated
// SubConnection represents a gRPC sub connection. | ||
// Each sub connection contains a list of addresses. gRPC will | ||
// try to connect to them (in sequence), and stop trying the | ||
// remainings if one connection was successful. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"// remainder once one connection is successful"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
balancer/balancer.go
Outdated
// The reconnect backoff will be applied on the list, not a single address. | ||
// For example, try_on_all_addresses -> backoff -> try_on_all_addresses. | ||
// | ||
// All SubConnection starts in IDLE, and will not try to connect. To trigger |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"All SubConnections start"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
balancer/balancer.go
Outdated
// All SubConnection starts in IDLE, and will not try to connect. To trigger | ||
// the connecting, Balancers must call Connect. | ||
// When the connection encounters an error, it will reconnect immediately. | ||
// When the connection becomes IDLE, it will not reconnect unless Connect is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we document when this could happen (that a connection goes to IDLE)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be documented in the connectivity semantics doc. Also, it's not clear when a SubConn should go IDLE now...
|
||
// Package resolver defines APIs for name resolution in gRPC. | ||
// All APIs in this package are experimental. | ||
package resolver |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many of the comments in the balancer package apply here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
// SetDefaultScheme sets the default scheme that will be used. | ||
// The default default scheme is "dns". | ||
func SetDefaultScheme(scheme string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: balancer comments, how about setting the default scheme to a Builder instead of a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The resolver is a bit different from balancer I think.
For balancer, users cannot override the default balancer. The default will always be pickfirst.
For resolver, the default is whatever the users set here.
If we read the map in SetDefaultScheme
function, we need to make sure the builder was already registered.
We could make this SetDefaultBuilder(b Builder)
.
Then if users set the default resolver to one with scheme "blah", and then dial with "blah:///", we should find the correct resolver to use. So we need to register the default Builder as well.
resolver/resolver.go
Outdated
// of resolved addresses. | ||
// The address list should be the complete list of new addresses. | ||
// Two consecutive call with the same addresses will be noop. It's preferred | ||
// that resolver doesn't call this consecutively with non-updated addresses. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we even need this sentence. Like, if the resolver refreshes every 30 minutes and calls this, it's not a problem. The idea was to avoid a busy-loop situation, right? I'm not sure who would implement a resolver to continually call NewAddress, or what line of thinking might even lead to that.
// Resolver watches for the updates on the specified target. | ||
// Updates include address updates and service config updates. | ||
type Resolver interface { | ||
// ResolveNow will be called by gRPC to try to resolve the target name again. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For further discussion: can we avoid the need for this by having grpc start up a new Resolver for the same target?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Starting up a new resolver means we need to support switching resolver.
This sounds more complicated than just calling a function on the existing resolver...
clientconn.go
Outdated
} | ||
|
||
// ConnectivityState indicates the state of a client connection. | ||
type ConnectivityState int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes will break anyone using the connectivity state API (it's new, but, still).
Maybe we can push a PR with only that change in it ASAP while we iterate on this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in #1430
8997c7e
to
d3ded16
Compare
Add and use connectivity package for states (grpc#1430) * Add and use connectivity package * Mark cc state APIs as experimental Add balancer, resolver and connectivity package. add balancer_v1_wrapper.go and remove grpclb.go all test passed, how? end2end passed, nil pointer failure, ac.wait return nil transport fix ac.wait return nil transport, races not fixed (accessing cc.balancer) end2end passed, TODO grpclb all test passed add grpclb back move building balancer out from a goroutine to avoid race Otherwise, Dial() and Close() may race. Mark new APIs experimental split resetAddrConn into newAddrConn and connect add acBalancerWrapper rename file to balancer_conn_wrappers.go remove grpclog.Print make TODO(bar) remove Print comments fixes add missing license fix build failure after merge fix race in grpclb
d3ded16
to
b4e43a0
Compare
Implement UpdateAddresses to fix TestPickFirstOrderAllServerUp.
clientconn.go
Outdated
dopts dialOptions | ||
events trace.EventLog | ||
acbw *acBalancerWrapper |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a SubConn type instead? We only need it so we can notify the balancer that the ac's state changed, IIUC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
balancer_conn_wrappers.go
Outdated
if !ok { | ||
return | ||
} | ||
ccb.cc.removeAddrConn(acbw.getAddrConn(), errConnClosing) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be errConnDrain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
balancer/balancer.go
Outdated
// If it's in the list, the connection will be kept. | ||
// If it's not in the list, the connection will gracefully closed, and | ||
// a new connection will be created. | ||
UpdateAddresses([]resolver.Address) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What SubConn state changes could result from this call if the current address is not in the new list? It should depend on the current state of the subconn, presumably:
Idle->Idle (Would this trigger a callback?)
Connecting->Connecting (to new address) ? (Would this trigger a callback?)
Ready->TransientFailure->Connecting->Ready ?
TransientFailure->Connecting->Ready
Shutdown->Shutdown
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Idle and Shutdown, it won't trigger a callback.
For the other 3 cases, yes, they can all happen.
We actually can have Ready->Connecting->Ready.
I updated the comment to mention this can trigger state update for the subconn.
I also added a TODO to check all the state transitions.
balancer/balancer.go
Outdated
// The returned put() function will be called once the rpc has finished, with the | ||
// final status of that RPC. | ||
// put may be nil if balancer doesn't care about the RPC status. | ||
Pick(ctx context.Context, opts PickOptions) (conn SubConn, put func(PutInfo), err error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/put/done/g?
Or maybe rpcDone or rpcEnd?
(I would rather this not seem like the opposite operation as Pick().)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
balancer_v1_wrapper.go
Outdated
if oldSC != nil { | ||
// Teardown old sc. | ||
delete(bw.conns, oldA) | ||
delete(bw.connSt, oldSC) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Q. Why is this not protected via bw.mu
as in line 187-188?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I believe this needs to be protected by the mutex. Will fix it soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed. Please take a look.
Thanks for pointing this out!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Np. Thanks!
b766426
to
1f516b1
Compare
This is the first step for #1388.
See gRFC L9 for details: grpc/proposal#30
This PR adds package
balancer
andresolver
.It changes ClientConn internals to new APIs and adds a wrapper for v1 balancer.
It does minimal change to ClientConn's behavior: