-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding node software version info to Ping metrics. #662
Conversation
@@ -226,62 +228,62 @@ docker-ci-benchmark-team-city: | |||
|
|||
.PHONY: docker-build-collection | |||
docker-build-collection: | |||
docker build -f cmd/Dockerfile --build-arg TARGET=collection --target production \ | |||
docker build -f cmd/Dockerfile --build-arg TARGET=collection --build-arg COMMIT=$(COMMIT) --build-arg VERSION=$(IMAGE_TAG) --target production \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The software version is passed in as linker params (ld flags). The Image_Tag
will be set to something like - v0.16.0-patch1
. This will be the software version reported by the Ping metric.
@@ -7,7 +7,9 @@ COMMIT := $(shell git rev-parse HEAD) | |||
VERSION := $(shell git describe --tags --abbrev=0 --exact-match 2>/dev/null) | |||
|
|||
# Image tag: if image tag is not set, set it with version (or short commit if empty) | |||
ifeq (${IMAGE_TAG},) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
without this, the IMAGE_TAG even if specified is always set to the version
-t "$(CONTAINER_REGISTRY)/collection:latest" -t "$(CONTAINER_REGISTRY)/collection:$(SHORT_COMMIT)" -t "$(CONTAINER_REGISTRY)/collection:$(IMAGE_TAG)" . | ||
|
||
.PHONY: docker-build-collection-debug | ||
docker-build-collection-debug: | ||
docker build -f cmd/Dockerfile --build-arg TARGET=collection --target debug \ | ||
docker build -f cmd/Dockerfile --build-arg TARGET=collection --build-arg COMMIT=$(COMMIT) --build-arg VERSION=$(IMAGE_TAG) --target debug \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The docker-build-
commands for each of the node type now pass-in two params to the dockerfile - COMMIT and VERSION. The docker file then uses is to set the LD flags when building those images.
cmd/Dockerfile
Outdated
@@ -46,7 +46,9 @@ WORKDIR /app | |||
# https://github.com/golang/go/issues/27719#issuecomment-514747274 | |||
RUN --mount=type=cache,target=/go/pkg/mod \ | |||
--mount=type=cache,target=/root/.cache/go-build \ | |||
GO111MODULE=on CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build --tags "relic,netgo" -ldflags "-extldflags -static" -o ./app ./cmd/${TARGET} | |||
GO111MODULE=on CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build --tags "relic,netgo" -ldflags "-extldflags -static \ | |||
-X 'github.com/onflow/flow-go/cmd/bootstrap/build.commit=${COMMIT}' -X 'github.com/onflow/flow-go/cmd/bootstrap/build.semver=${VERSION}'" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
set the build.semver and build.commit. build.semver will be something like v0.16.1-patch1
while build.commit will be the short commit.
Build.commit is currently not being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
github.com/onflow/flow-go/cmd/bootstrap/build
is probably not the best place for this to live, now that it's used across multiple binaries. Could we move this to eg cmd/build
or utils/build
?
cmd/bootstrap/build/version.go
Outdated
@@ -41,3 +41,11 @@ func init() { | |||
commit = undefined | |||
} | |||
} | |||
|
|||
func SetSemver(sv string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two setters to make testing easy (without using ldflags)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would much rather simplify testing another way than by adding these setters. Having these makes it too easy to set the wrong version globally.
Instead, what about injecting a function to PingService
to get the version? It is likely we will want to take a similar approach anyway to include the block height.
@@ -109,7 +109,7 @@ func (e *Engine) pingNode(peer *flow.Identity) { | |||
id := peer.ID() | |||
|
|||
// ping the node | |||
rtt, err := e.middleware.Ping(id) // ping will timeout in libp2p.PingTimeoutSecs seconds | |||
resp, rtt, err := e.middleware.Ping(id) // ping will timeout in libp2p.PingTimeoutSecs seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ping
now returns the response which contains the version field.
@@ -349,5 +349,6 @@ type TransactionMetrics interface { | |||
type PingMetrics interface { | |||
// NodeReachable tracks the round trip time in milliseconds taken to ping a node | |||
// The nodeInfo provides additional information about the node such as the name of the node operator | |||
NodeReachable(node *flow.Identity, nodeInfo string, rtt time.Duration) | |||
// version is the software version the target node is running | |||
NodeReachable(node *flow.Identity, nodeInfo string, rtt time.Duration, version string) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new label - version
@@ -0,0 +1,9 @@ | |||
# go get github.com/gogo/protobuf/protoc-gen-gofast |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makefile to regenerate networking protobufs
@@ -0,0 +1,174 @@ | |||
package p2p | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Loosely based on how libp2p implements it's Ping
protocol - https://github.com/libp2p/go-libp2p/blob/master/p2p/protocol/ping/ping.go
} | ||
|
||
// PingHandler receives the inbound stream for Flow ping protocol and respond back with the PingResponse message | ||
func (ps *PingService) PingHandler(s network.Stream) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
server function that responds to the ping request
} | ||
|
||
// Ping sends a Ping request to the remote node and returns the response, rtt and error if any. | ||
func (ps *PingService) Ping(ctx context.Context, p peer.ID) (message.PingResponse, time.Duration, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
client function that sends a ping request
network/p2p/libp2pNode.go
Outdated
@@ -113,6 +114,9 @@ func NewLibP2PNode(logger zerolog.Logger, | |||
return nil, fmt.Errorf("could not bootstrap libp2p host: %w", err) | |||
} | |||
|
|||
pingLibP2PProtocolID := generateProtocolID(rootBlockID) | |||
pingService := NewPingService(libP2PHost, pingLibP2PProtocolID, logger) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create the PingService to handle incoming ping request and to send ping request to remote nodes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I learned something from reviewing this PR. Thanks for adding the comments to explain the changes. 👏
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👏 Awesome, thanks for setting this up!
Added some suggestions, I also think we should include the semver and commit in the startup log message.
cmd/Dockerfile
Outdated
@@ -46,7 +46,9 @@ WORKDIR /app | |||
# https://github.com/golang/go/issues/27719#issuecomment-514747274 | |||
RUN --mount=type=cache,target=/go/pkg/mod \ | |||
--mount=type=cache,target=/root/.cache/go-build \ | |||
GO111MODULE=on CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build --tags "relic,netgo" -ldflags "-extldflags -static" -o ./app ./cmd/${TARGET} | |||
GO111MODULE=on CGO_ENABLED=1 GOOS=linux GOARCH=amd64 go build --tags "relic,netgo" -ldflags "-extldflags -static \ | |||
-X 'github.com/onflow/flow-go/cmd/bootstrap/build.commit=${COMMIT}' -X 'github.com/onflow/flow-go/cmd/bootstrap/build.semver=${VERSION}'" \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
github.com/onflow/flow-go/cmd/bootstrap/build
is probably not the best place for this to live, now that it's used across multiple binaries. Could we move this to eg cmd/build
or utils/build
?
network/message/ping.proto
Outdated
message PingResponse { | ||
string version = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
message PingResponse { | |
string version = 1; | |
message PingResponse { | |
string version = 1; // node software version |
network/p2p/libp2pUtils.go
Outdated
@@ -163,6 +163,10 @@ func generateProtocolID(rootBlockID string) protocol.ID { | |||
return protocol.ID(FlowLibP2PProtocolIDPrefix + rootBlockID) | |||
} | |||
|
|||
func generateID(rootBlockID string) protocol.ID { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func generateID(rootBlockID string) protocol.ID { | |
func generatePingID(rootBlockID string) protocol.ID { |
Is this only for ping messages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah only for Ping message. I believe my earlier name change got lost somehow. I have added it back.
network/p2p/ping.go
Outdated
case err, ok := <-errCh: | ||
// if and error occur while reposning to a pind request, then reset the stream | ||
if ok { | ||
log.Error().Err(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Error().Err(err) | |
log.Error().Err(err).Msg("error responding to ping request") |
I don't think this line will print anything as-is, believe it needs the Msg
call as well
cmd/bootstrap/build/version.go
Outdated
@@ -41,3 +41,11 @@ func init() { | |||
commit = undefined | |||
} | |||
} | |||
|
|||
func SetSemver(sv string) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would much rather simplify testing another way than by adding these setters. Having these makes it too easy to set the wrong version globally.
Instead, what about injecting a function to PingService
to get the version? It is likely we will want to take a similar approach anyway to include the block height.
network/p2p/ping.go
Outdated
type PingService struct { | ||
host host.Host | ||
logger zerolog.Logger | ||
pingProtocolID protocol.ID | ||
} | ||
|
||
func NewPingService(h host.Host, pingProtocolID protocol.ID, logger zerolog.Logger) *PingService { | ||
ps := &PingService{host: h, logger: logger, pingProtocolID: pingProtocolID} | ||
h.SetStreamHandler(pingProtocolID, ps.PingHandler) | ||
return ps | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggestion for a different approach than SetSemver
:
type PingService struct { | |
host host.Host | |
logger zerolog.Logger | |
pingProtocolID protocol.ID | |
} | |
func NewPingService(h host.Host, pingProtocolID protocol.ID, logger zerolog.Logger) *PingService { | |
ps := &PingService{host: h, logger: logger, pingProtocolID: pingProtocolID} | |
h.SetStreamHandler(pingProtocolID, ps.PingHandler) | |
return ps | |
} | |
type PingService struct { | |
host host.Host | |
logger zerolog.Logger | |
getVersion func() string | |
pingProtocolID protocol.ID | |
} | |
func NewPingService(h host.Host, pingProtocolID protocol.ID, logger zerolog.Logger, getVersion func() string) *PingService { | |
ps := &PingService{host: h, logger: logger, pingProtocolID: pingProtocolID, getVersion: getVersion} | |
h.SetStreamHandler(pingProtocolID, ps.PingHandler) | |
return ps | |
} |
Then we can do:
pingService := NewPingService(..., build.Semver)
And for testing something like:
pingService := NewPingService(..., func() {return "test string"})
Using the function argument directly is simplest and probably fine for now, but if we're adding block height and potentially other things as well, it may be cleaner to create an interface for PingInfoProvider
type interface PingInfoProvider {
SoftwareVersion() string
LatestFinalizedBlockHeight() uint64
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makes sense. for now I am just passing in the semver value as a string to the NewPingService
function since the software version won't change till the node is restarted.
I will define the interface once I start implementing the finalized block height changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, I learned something from reviewing this PR. Thanks for adding the comments to explain the changes. 👏
Thanks @zhangchiqing for taking the time. Appreciate it.
) | ||
|
||
// the Flow Ping protocol prefix | ||
const FlowLibP2PPingPrefix = "/flow/ping/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lib2p uses a concept of Protocols to allow discrete one-to-one communication streams between two nodes. So far all the Flow direct communication has been using the Flow Libp2p protocol ID - /flow/push/<root block ID>
.
Here, a new protocol ID specifically for Ping is being defined. Similar to https://docs.libp2p.io/concepts/protocols/#ping.
Why a new Protocol ID and why not reuse the old /flow/push protocol ID?
- Mainly, to keep core protocol network traffic separate from non-core protocol traffic.
- Mimicing, what libp2p themselves do when it comes to ping.
- Allows a
request-response
type of communication not available with the/flow/push
protocol.
"github.com/onflow/flow-go/cmd/bootstrap/build" | ||
) | ||
|
||
// TestPing tests PingService by creating two libp2p hosts and ping each one from the other |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Copied over from libp2p's own ping test - https://github.com/libp2p/go-libp2p/blob/master/p2p/protocol/ping/ping_test.go
// SetStreamHandler sets the stream handler of libp2p host of the node. | ||
func (n *Node) SetStreamHandler(handler libp2pnet.StreamHandler) { | ||
// SetFlowProtocolStreamHandler sets the stream handler of Flow libp2p Protocol | ||
func (n *Node) SetFlowProtocolStreamHandler(handler libp2pnet.StreamHandler) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename
go.mod
Outdated
@@ -33,6 +33,7 @@ require ( | |||
github.com/libp2p/go-libp2p-pubsub v0.4.1 | |||
github.com/libp2p/go-libp2p-swarm v0.4.0 | |||
github.com/libp2p/go-libp2p-transport-upgrader v0.4.0 | |||
github.com/libp2p/go-msgio v0.0.6 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
package that helps read and write length-delimited slices (https://github.com/libp2p/go-msgio). Use to read and write the ping request and response.
network/message/ping.proto
Outdated
|
||
message PingResponse { | ||
string version = 1; | ||
uint64 blockHeight = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
blockHeight
not being populated for now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯 LGTM
Could we also add the version to a startup log (ie. in scaffold) so operators have a way to access this easily as well?
Added statements to print the version and the commit id at the node startup in |
network/p2p/ping.go
Outdated
// create a new stream to the remote node | ||
s, err := ps.host.NewStream(ctx, p, ps.pingProtocolID) | ||
if err != nil { | ||
return message.PingResponse{}, -1, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return message.PingResponse{}, -1, err | |
return message.PingResponse{}, -1, fmt.Errorf("could not create stream: %w", err) |
network/p2p/ping_test.go
Outdated
|
||
if err != nil { | ||
t.Fatal(err) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you may use require.NoError(t, err)
.
network/p2p/ping.go
Outdated
log.Error().Msg("context timed out on ping to remote node") | ||
case <-done: | ||
} | ||
err := s.Reset() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few questions about this:
- Assume that our context times out and we reach here. Then does it mean that we have only one single shot to reset the stream? why only a single shot?
- Why we do reset the stream upon completion of ping?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great, just a few minor comments/questions.
} | ||
|
||
// if ping succeeded, close the stream else reset the stream | ||
go func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @yhassanzadeh13 - thanks for your question earlier about why Reset
the stream. I got the reset statement from the libp2p's Ping logic. But after I read your question, I did some investigation and changed the code accordingly.
Libp2p's stream.Close()
indeed closes the stream gracefully from both ends - See this issue.
In a nut shell, if you would like to cause an error on the remote node while closing the stream, use stream.Reset()
else use stream.Close()
.
I have modified the code to do that - if the client times out, do a reset and let the remote know that something bad has happened else just close.
Currently, the Ping metrics reports the RTT it takes to ping a remote node and the operator name as part of node info.
This change adds another label -
nodelabel
to the Ping metrics to report the node software version that each of the node in the Flow network is running.The current Ping implementation is using the Libp2p provided Ping functionality which is supported via the special
Ping
protocol - https://docs.libp2p.io/concepts/protocols/#ping. However, this protcol cannot be extended to add in additional information. The other protocols such as identity is also not extensible and mostly designed for libp2p core functionality.Hence, to support the
version
and also theblock height
later, I created something on the same lines as the libp2p Ping protocol calling it theFlow Ping
protocol. The Pings are exchanged on the a special protocol ID/flow/ping/<root block id>
and are handled by thelibp2pNode.go
file.This is how it looks on localnet - notice the additional label -
nodeversion