Handle etcd connection failures in etcd v3 watch API. #9
@@ -803,43 +807,63 @@ func (et *etcdKV) watchStart(
 	if waitIndex != 0 {
 		opts = append(opts, e.WithRev(int64(waitIndex+1)))
 	}
+	session, err := concurrency.NewSession(et.kvClient, concurrency.WithTTL(defaultSessionTimeout))
Per the following change, it may be better to create a cancel context:
etcd-io/etcd#6699
Also, when it retries, does the connection go to the next available etcd? (Does kvClient need to be refreshed?)
Otherwise looks good!
Yes, I had that cancel context, but cancelling it has no effect on the goroutine that handles the watch responses (watchChan). I will add it back.
Should we refresh the kvClient in the watch API?
The v2 docs say that the client goes to the next etcd: https://github.com/coreos/etcd/tree/master/client#caveat
But I could not find any doc or mention of this for clientv3.
The client is probably fixed in v3.2.0.
I am following up with etcd here: etcd-io/etcd#7941
If that works, we might not need this change at all.
Great. Let me know if you want me to follow up.
With a single-node etcd server, if the node goes down, the watch still hangs and does not return. It looks like we still need this change anyway. I am still running a test to see whether v3.2.0 solves the reconnection issue.
Just to clarify: with this change, on a single-node etcd, the session should terminate, right? (Ignoring the reconnection issue?)
Yes, that is right.
On Jun 15, 2017 09:09, "sangleganesh" wrote:
> just to clarify, with this change in a single node etcd, session should terminate right ? (ignoring the reconnection issue ?)
Let's merge it then!
Add a Session to the Watch API.
The session can be used to detect etcd connection failures.
With the new clientv3, the watch hangs if etcd connectivity is lost.