services/rpcsrv: Return a new server by pointer #3661
Conversation
Additionally, it seems that rpcsrv.Server doesn't wait for all HTTP handlers to complete (see lines 565 to 566 in 95098d4).
Codecov Report

Attention: Patch coverage is

Additional details and impacted files

@@           Coverage Diff           @@
##           master    #3661   +/-   ##
=======================================
  Coverage   83.09%   83.09%
=======================================
  Files         334      334
  Lines       46576    46576
=======================================
+ Hits        38702    38704    +2
+ Misses       6303     6301    -2
  Partials     1571     1571

☔ View full report in Codecov by Sentry.
Good catch.
it seems that rpcsrv.Server doesn't wait for all HTTP handlers to complete

Why do you think so? rpcServer.Shutdown() must properly wait for all connections to be gracefully closed and all server routines to be finished:
neo-go/pkg/services/rpcsrv/server.go
Lines 457 to 459 in 95098d4
for _, srv := range s.http {
    s.log.Info("shutting down RPC server", zap.String("endpoint", srv.Addr))
    err := srv.Shutdown(context.Background())
And http.Server's Shutdown documentation says that:
// Shutdown gracefully shuts down the server without interrupting any
// active connections. Shutdown works by first closing all open
// listeners, then closing all idle connections, and then waiting
// indefinitely for connections to return to idle and then shut down.
and subscription routines are also properly awaited:
neo-go/pkg/services/rpcsrv/server.go
Lines 482 to 483 in 95098d4
// Wait for handleSubEvents to finish.
<-s.subEventsToExitCh
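For reference, the wait-on-channel pattern used there boils down to something like the following standalone sketch (simplified, with made-up names, not the actual neo-go code): the worker goroutine closes a channel when it returns, and Shutdown blocks on that channel.

package main

import (
    "fmt"
    "time"
)

type server struct {
    quit              chan struct{} // closed to ask the worker to stop
    subEventsToExitCh chan struct{} // closed by the worker when it exits
}

func (s *server) handleSubEvents() {
    defer close(s.subEventsToExitCh)
    for {
        select {
        case <-s.quit:
            return
        case <-time.After(100 * time.Millisecond):
            // process subscription events here
        }
    }
}

func (s *server) Shutdown() {
    close(s.quit)
    // Wait for handleSubEvents to finish, mirroring the excerpt above.
    <-s.subEventsToExitCh
    fmt.Println("subscription routine finished")
}

func main() {
    s := &server{
        quit:              make(chan struct{}),
        subEventsToExitCh: make(chan struct{}),
    }
    go s.handleSubEvents()
    time.Sleep(250 * time.Millisecond)
    s.Shutdown()
}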
Before, a new server was returned by value which could cause a panic `unlock of unlocked mutex` on SIGHUP handling. It's because the new server overwrites a locked mutex of the already existing server.

oct 22 13:51:15 node1 neo-go[1183338]: fatal error: sync: Unlock of unlocked RWMutex
oct 22 13:51:15 node1 neo-go[1183338]: goroutine 538 [running]:
oct 22 13:51:15 node1 neo-go[1183338]: sync.fatal({0xf83d64?, 0xc001085880?})
oct 22 13:51:15 node1 neo-go[1183338]: runtime/panic.go:1007 +0x18
oct 22 13:51:15 node1 neo-go[1183338]: sync.(*RWMutex).Unlock(0xc00019a4c8)
oct 22 13:51:15 node1 neo-go[1183338]: sync/rwmutex.go:208 +0x45
oct 22 13:51:15 node1 neo-go[1183338]: github.com/nspcc-dev/neo-go/pkg/services/rpcsrv.(*Server).dropSubscriber(0xc00019a2c8, 0xc000a77740)
oct 22 13:51:15 node1 neo-go[1183338]: github.com/nspcc-dev/neo-go/pkg/services/rpcsrv/server.go:825 +0xce
oct 22 13:51:15 node1 neo-go[1183338]: github.com/nspcc-dev/neo-go/pkg/services/rpcsrv.(*Server).handleWsReads(0xc00019a2c8, 0xc0034478c0, 0xc000af5f80, 0xc000a77740)
oct 22 13:51:15 node1 neo-go[1183338]: github.com/nspcc-dev/neo-go/pkg/services/rpcsrv/server.go:810 +0x266
oct 22 13:51:15 node1 neo-go[1183338]: github.com/nspcc-dev/neo-go/pkg/services/rpcsrv.(*Server).handleHTTPRequest(0xc00019a2c8, {0x11c3900, 0xc003437dc0}, 0xc0031945a0)
oct 22 13:51:15 node1 neo-go[1183338]: github.com/nspcc-dev/neo-go/pkg/services/rpcsrv/server.go:582 +0x54a
oct 22 13:51:15 node1 neo-go[1183338]: net/http.HandlerFunc.ServeHTTP(0x471779?, {0x11c3900?, 0xc003437dc0?}, 0xc000943b68?)
oct 22 13:51:15 node1 neo-go[1183338]: net/http/server.go:2171 +0x29
oct 22 13:51:15 node1 neo-go[1183338]: net/http.serverHandler.ServeHTTP({0xc000a77680?}, {0x11c3900?, 0xc003437dc0?}, 0x6?)
oct 22 13:51:15 node1 neo-go[1183338]: net/http/server.go:3142 +0x8e
oct 22 13:51:15 node1 neo-go[1183338]: net/http.(*conn).serve(0xc0032030e0, {0x11c5220, 0xc000a76960})
oct 22 13:51:15 node1 neo-go[1183338]: net/http/server.go:2044 +0x5e8
oct 22 13:51:15 node1 neo-go[1183338]: created by net/http.(*Server).Serve in goroutine 534
oct 22 13:51:15 node1 neo-go[1183338]: net/http/server.go:3290 +0x4b4

Signed-off-by: Alexey Savchuk <[email protected]>
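For illustration, here is a minimal standalone sketch (with hypothetical names, not the neo-go code) that reproduces the same class of failure: a struct returned by value is later overwritten while its mutex is held, so the subsequent Unlock trips the runtime check.

package main

import (
    "sync"
    "time"
)

// server mimics the real rpcsrv.Server only in that it embeds a mutex by
// value; everything else here is a made-up illustration.
type server struct {
    mu sync.RWMutex
}

// newServer returns the server by value, like the code did before this change.
func newServer() server { return server{} }

func main() {
    srv := newServer()

    // A "handler" goroutine takes the lock and holds it for a while.
    go func() {
        srv.mu.Lock()
        time.Sleep(time.Second)
        srv.mu.Unlock() // fatal error: sync: Unlock of unlocked RWMutex
    }()

    time.Sleep(100 * time.Millisecond)

    // Reconfiguration (e.g. SIGHUP handling) assigns a fresh server value
    // over the old one, which resets the locked mutex to its zero state.
    srv = newServer()

    time.Sleep(2 * time.Second)
}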
That's a good point. I've thought about it as well.
My bad — I meant not the handler itself but the goroutines it might spawn. I've added an example below. The stack trace I attached shows that the mutex unlock happens in the following call chain: handleHTTPRequest → handleWsReads → dropSubscriber → RWMutex.Unlock.

package main
import (
"context"
"errors"
"log"
"net/http"
"time"
)
func main() {
// Create a server with a default handler
srv := http.Server{
Addr: ":8080",
Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Spawn a goroutine
go func() {
ticker := time.NewTicker(time.Second)
defer ticker.Stop()
for {
<-ticker.C
log.Println("Running")
}
}()
}),
}
// Run the server
go func() {
err := srv.ListenAndServe()
if !errors.Is(err, http.ErrServerClosed) {
exitOnErr(err)
}
log.Println("Server has been closed")
}()
time.Sleep(time.Second)
// Make a request
go func() {
c := http.Client{}
req, err := http.NewRequest("GET", "http://localhost:8080", nil)
exitOnErr(err)
_, err = c.Do(req)
exitOnErr(err)
log.Println("Request has been handled")
}()
time.Sleep(5 * time.Second)
err := srv.Shutdown(context.Background())
exitOnErr(err)
log.Println("Server shutdown is completed")
    // Block forever; the handler's goroutine keeps running.
    select {}
}
func exitOnErr(err error) {
if err != nil {
log.Fatalf("unexpected error: %s\n", err)
}
}
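For completeness, one possible way to make such spawned goroutines awaitable is to track them with a sync.WaitGroup and a stop channel. This is only a hypothetical sketch, not how rpcsrv handles it; here Shutdown is followed by an explicit wait, so the final log line is printed only after every spawned goroutine has returned.

package main

import (
    "context"
    "errors"
    "log"
    "net/http"
    "sync"
    "time"
)

func main() {
    var wg sync.WaitGroup
    done := make(chan struct{})

    srv := http.Server{
        Addr: ":8080",
        Handler: http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            // Register the goroutine before spawning it.
            wg.Add(1)
            go func() {
                defer wg.Done()
                ticker := time.NewTicker(time.Second)
                defer ticker.Stop()
                for {
                    select {
                    case <-done:
                        return
                    case <-ticker.C:
                        log.Println("Running")
                    }
                }
            }()
        }),
    }

    go func() {
        if err := srv.ListenAndServe(); !errors.Is(err, http.ErrServerClosed) {
            log.Fatal(err)
        }
    }()
    time.Sleep(time.Second)

    // Make a request so the handler spawns its goroutine.
    resp, err := http.Get("http://localhost:8080")
    if err != nil {
        log.Fatal(err)
    }
    resp.Body.Close()

    if err := srv.Shutdown(context.Background()); err != nil {
        log.Fatal(err)
    }
    close(done) // ask the spawned goroutines to stop...
    wg.Wait()   // ...and wait until they actually return
    log.Println("Server and all handler goroutines have stopped")
}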
Yes, you're right. I've created a separate issue for that, ref. #3664.
Problem

Before, a new server was returned by value, which could cause a panic `unlock of unlocked mutex` on SIGHUP handling. This is because the new server overwrites a locked mutex of the already existing server.

Solution

Return a new server by pointer.
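A minimal sketch of what this change amounts to (simplified, with a hypothetical API, not the exact neo-go signatures): returning *Server means reconfiguration swaps a pointer instead of copying a new value over the struct, and over its mutex, that running goroutines still reference.

package main

import (
    "fmt"
    "sync"
)

// Server stands in for rpcsrv.Server; only the mutex matters here.
type Server struct {
    mu sync.RWMutex
}

// Before (roughly): func New(...) Server, so callers copied the result over an
// existing variable on SIGHUP, which also reset a possibly locked mutex.
// After: func New(...) *Server, so callers swap a pointer instead, and the old
// instance keeps its own mutex until its goroutines are done with it.
func New() *Server {
    return &Server{}
}

func main() {
    srv := New()
    old := srv
    // Reconfiguration creates a new instance; the old one is untouched.
    srv = New()
    fmt.Println(old != srv) // true: two independent servers, two mutexes
}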