-
-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track closed connections and reason for closing #692
Conversation
Signed-off-by: Derek Collison <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe that there is an error in ConnInfo.fill() where you don't use the passed nc
. Also, the reason for the close we would want to report may not be the one that is recorded due to error detection lower in the stack.
return ConnOpen, nil | ||
} | ||
switch strings.ToLower(str) { | ||
case "open": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be opened
since you have closed
(or add both open, opened and close, closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think open is correct since it means perpetual vs closed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
func (reason ClosedState) String() string { | ||
switch reason { | ||
case ClientClosed: | ||
return "Client" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be Closed
instead or Client Closed
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had it as Client Closed, but thought it read better looking at the JSON as reason for closed is Client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
|
||
// Fixed sized ringbuffer for closed connections. | ||
type closedRingBuffer struct { | ||
total uint64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would that make more sense to add a mutex in that object and so we don't need server lock for that? When I create a new object like that, I tend to favor that object having its own lock so we can reduce lock contention. Especially this is a singleton, so adding a mutex should not be costly mem wise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that too, but decided its very fast and will only hold for time to copy. I was thinking about preallocating the structs too. I could put a lock around it.
@@ -834,6 +838,35 @@ func (s *Server) createClient(conn net.Conn) *client { | |||
return c | |||
} | |||
|
|||
// This will save off a closed client in a ring buffer such that | |||
// /connz can inspect. Useful for debugging, etc. | |||
func (s *Server) saveClosedClient(c *client, nc net.Conn, reason ClosedState) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a receiving of the ring buffer object (especially if we use a dedicated lock there)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it could. Feels like a server verb though that just uses the ring buffer. We do the time stamp etc.
ci.InMsgs = atomic.LoadInt64(&client.inMsgs) | ||
ci.InBytes = atomic.LoadInt64(&client.inBytes) | ||
|
||
// If the connection is gone, too bad, we won't set TLSVersion and TLSCipher. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe you should use nc
down below, not client.nc
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree will fix.
server/server.go
Outdated
@@ -786,7 +790,7 @@ func (s *Server) createClient(conn net.Conn) *client { | |||
if err := conn.Handshake(); err != nil { | |||
c.Errorf("TLS handshake error: %v", err) | |||
c.sendErr("Secure Connection - TLS Required") | |||
c.closeConnection() | |||
c.closeConnection(TLSHandshakeError) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Humm... I tried to connect a non TLS client to a TLS server and I get Write Error, not TLS Handshake Error. I believe that since we call sendErr(), it is possible that flushOutbound() gets an error because the client will also close the connection (after receiving the error), so the actual close of the connection occurs there, with WriteError. That may then not prove that useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Non TLS client won't try to handshake will it? I will look into it though, agree we want the correct reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No they won't.
Actually, what happened is that I was connecting without TLS, so the client when getting the INFO fails the connect and says that secure connection is required. The server when getting into the Handshake will fail and send the error, but since the connection is closed it will get a write error, so that kind of makes sense.
But, I just tried with a client connection with TLS but without the CA root cert so it won't be able to verify the server certificate, and I still get the Write Error as a reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok I will fix and add a test, thanks.
I also just found a bug with closed connections and optional Subs (and AuthorizedUser). Once set it stays set, will fix in the am. |
Signed-off-by: Derek Collison <[email protected]>
Signed-off-by: Derek Collison <[email protected]>
Signed-off-by: Derek Collison <[email protected]>
Signed-off-by: Derek Collison <[email protected]>
Signed-off-by: Derek Collison <[email protected]>
Signed-off-by: Derek Collison <[email protected]>
Signed-off-by: Derek Collison <[email protected]>
server/closed_conns_test.go
Outdated
@@ -267,7 +267,7 @@ func TestClosedMaxPayload(t *testing.T) { | |||
|
|||
func TestClosedSlowConsumerWriteDeadline(t *testing.T) { | |||
opts := DefaultOptions() | |||
opts.WriteDeadline = 10 * time.Microsecond // Make very small to trip. | |||
opts.WriteDeadline = 10 * time.Millisecond // Make very small to trip. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably not enough. Bumping it to 30ms in TestWriteDeadline for very same reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(it was originally at 1 ms though), so that may be ok...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to pass on travis..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But we can change it..
LGTM |
This PR allows the server to maintain state on closed connections. Uses a fixed size ring buffer that is configurable. Monitoring endpoints now can return closed connections via /connz?state=closed and will report when they closed and why.
Adresses #520 and #663
Signed-off-by: Derek Collison [email protected]
/cc @nats-io/core