rpc server metrics impl #3913
Conversation
crates/rpc/rpc-builder/src/lib.rs (Outdated)
@@ -1242,7 +1246,7 @@ impl RpcServerConfig {
            http_cors_domains: Some(http_cors.clone()),
            ws_cors_domains: Some(ws_cors.clone()),
        }
-       .into())
+       .into());
this was done by my IDE's autoformat on save - I'll try to take some time to revert these in the next commit
cool, this is a great draft
we want nightly formatting: cargo +nightly fmt
there should be instructions for VS Code for this somewhere
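For reference, one possible VS Code setup for this is to point rust-analyzer's rustfmt at the nightly toolchain; this is an assumption about the local setup, not the repo's documented instructions:

// .vscode/settings.json (hypothetical snippet; JSONC comments are allowed here)
{
    "rust-analyzer.rustfmt.extraArgs": ["+nightly"]
}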
#[derive(Metrics, Clone)]
#[metrics(dynamic = true)]
pub(crate) struct RpcServerMetrics {
    // TODO: define relevant metric attributes
here's what we want:
calls_started: Counter,
successful_calls: Counter,
failed_calls: Counter,
requests_started: Counter,
requests_finished: Counter,
ws_sessions_opened: Counter,
ws_sessions_closed: Counter
maybe a histogram for call durations that tracks milliseconds, but we could do this separately
the active/open numbers are then derived by comparing these values in the dashboard
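As a rough illustration only (not the code in this PR), a struct with those fields could look like the following, assuming reth's Metrics derive macro and the metrics crate's Counter/Histogram handles; the import paths and the scope name are assumptions:

use metrics::{Counter, Histogram};
use reth_metrics_derive::Metrics;

/// Metrics for the RPC server, mirroring the list above.
#[derive(Metrics, Clone)]
#[metrics(scope = "rpc_server")] // scope name is a guess, not the PR's value
pub(crate) struct RpcServerMetrics {
    /// The number of calls started
    calls_started: Counter,
    /// The number of successful calls
    successful_calls: Counter,
    /// The number of failed calls
    failed_calls: Counter,
    /// The number of requests started
    requests_started: Counter,
    /// The number of requests finished
    requests_finished: Counter,
    /// The number of ws sessions opened
    ws_sessions_opened: Counter,
    /// The number of ws sessions closed
    ws_sessions_closed: Counter,
    /// Latency for a single call, in milliseconds
    call_latency: Histogram,
}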
ah I pushed up my changes before seeing this comment.
I've got:
/// The number of ws requests currently being served
ws_req_count: Gauge,
/// The number of http requests currently being served
http_req_count: Gauge,
/// The number of ws sessions currently active
ws_session_count: Gauge,
/// The number of http connections currently active
http_session_count: Gauge,
/// Latency for a single request/response pair
request_latency: Histogram,
and inline:
    // note that on_call will be called multiple times in case of a batch request
    // increment method call count; use the macro because derive(Metrics) doesn't seem to
    // support dynamically configuring the metric name (?)
    let metric_call_count_name =
        format!("{}{}{}{}{}", METRICS_SCOPE, "_", CALL_COUNT_METRIC, "_", method_name);
    // this could be a gauge since one call here should map to one "result" in on_result
    counter!(metric_call_count_name, 1);
}

fn on_result(
    &self,
    method_name: &str,
    success: bool,
    started_at: Self::Instant,
    _transport: TransportProtocol,
) {
    // capture method call latency; use the macro for the same reason stated in on_call
    let metric_name_call_latency =
        format!("{}{}{}{}{}", METRICS_SCOPE, "_", CALL_LATENCY_METRIC, "_", method_name);
    histogram!(metric_name_call_latency, started_at.elapsed());
    if !success {
        // capture error count for the method call; same reason as in on_call
        let metric_name_call_error_count =
            format!("{}{}{}{}{}", METRICS_SCOPE, "_", CALL_ERROR_METRIC, "_", method_name);
        counter!(metric_name_call_error_count, 1);
    }
let me reconcile that with what you asked for
ah so the active/open sessions and requests I'm calculating with a gauge - incrementing on connect/request and decrementing on disconnect/response
is that OK?
edit: tagging you just to make sure @mattsse
I can go the counter route with an open counter and closed counter like you suggested as well. this seemed to make more sense to me though
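For illustration, the gauge approach described here might look roughly like the sketch below, assuming the metrics crate's register_gauge! macro and Gauge handle (0.21-style API); the surrounding jsonrpsee Logger hooks are elided since their exact signatures depend on the jsonrpsee version, and the type and method names are hypothetical:

use metrics::{register_gauge, Gauge};

/// Gauge-based session tracking (illustrative sketch, not this PR's final code).
#[derive(Clone)]
struct SessionGauges {
    /// The number of ws sessions currently active
    ws_session_count: Gauge,
}

impl SessionGauges {
    fn new() -> Self {
        Self { ws_session_count: register_gauge!("rpc_server_ws_session_count") }
    }

    /// Would be called from the logger's connect hook for a ws transport.
    fn session_opened(&self) {
        self.ws_session_count.increment(1.0);
    }

    /// Would be called from the logger's disconnect hook for a ws transport.
    fn session_closed(&self) {
        self.ws_session_count.decrement(1.0);
    }
}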
alright I went ahead and reconciled what you requested with what I had before, mainly:
- keeping the per-method call metrics I had originally added - I think in production, this level of granularity will help for debugging purposes and isolating issues with a specific method
- modifying the open/closed session counters to separate counters instead of using one gauge for both
cool, this looks great.
I'd like to hold off on anything dynamic for now, but other than that this looks great
// increment call count per method; use the macro because derive(Metrics) doesn't seem to
// support dynamically configuring the metric name (?)
let metric_call_count_name =
    format!("{}{}{}{}{}", METRICS_SCOPE, "_", CALL_COUNT_METRIC, "_", method_name);
counter!(metric_call_count_name, 1);
this is a bit expensive and we don't really need this dynamic value, so this can be removed
for sure makes sense
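One way the non-dynamic replacement could look, assuming the pre-registered calls_started: Counter field from the struct sketch earlier; the helper name is hypothetical and would be invoked from the on_call hook:

impl RpcServerMetrics {
    /// Hypothetical helper called from on_call instead of the dynamically
    /// named counter! macro above.
    fn record_call_started(&self) {
        // pre-registered handle: no per-call string formatting or registry lookup
        self.calls_started.increment(1);
    }
}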
// capture per method call latency
let metric_name_call_latency =
    format!("{}{}{}{}{}", METRICS_SCOPE, "_", CALL_LATENCY_METRIC, "_", method_name);
histogram!(metric_name_call_latency, started_at.elapsed().as_millis());
I like this
cool, I left the generic call latency metric and removed this dynamic per-method-call one
it may make sense to somehow add metrics with a method call dimension back in at some point
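For reference, the non-dynamic result handling might look roughly like this; it mirrors the on_result signature quoted earlier (so it would sit inside the jsonrpsee Logger impl), and assumes a hypothetical self.metrics field holding the successful_calls/failed_calls counters and call_latency histogram from the struct sketch above:

fn on_result(
    &self,
    _method_name: &str,
    success: bool,
    started_at: Self::Instant,
    _transport: TransportProtocol,
) {
    // record latency for every call, without a per-method label
    self.metrics.call_latency.record(started_at.elapsed().as_millis() as f64);
    if success {
        self.metrics.successful_calls.increment(1);
    } else {
        self.metrics.failed_calls.increment(1);
    }
}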
// capture per method call latency
let metric_name_call_latency =
    format!("{}{}{}{}{}", METRICS_SCOPE, "_", CALL_LATENCY_METRIC, "_", method_name);
I need to think about this first,
until then we should remove dynamic labels like this
// capture error count per method call
let metric_name_call_error_count =
    format!("{}{}{}{}{}", METRICS_SCOPE, "_", CALL_ERROR_METRIC, "_", method_name);
same
rebased against mainline and pushed up to pull in the commits that were just merged in
this is great work, tysm
now we can integrate this into the dashboard, @Rjected probably has some ideas for this, will open an issue
skeleton draft PR for #3907, please excuse the formatting changes my IDE's format-on-save rustfmt applied. If need be, I can change to the repo's fmt standard, not sure what's different with my setup.
@mattsse if this looks on the right path I can keep at it - the questions I have right now revolve around:
- does adding the type param RpcServerMetrics to the jsonrpsee::server::Server<Identity, L: ()> in WsHttpServerKind seem OK, or is there a cleaner way to do it? (rough sketch of what I mean below)
- the metrics scope implementation right now changes depending on whether the WS/HTTP servers are combined or not, should it be that way?
- and generally I haven't started thinking through what exact metrics we want to track in RpcServerMetrics
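To make the first question concrete, here is a heavily simplified sketch of what threading the logger type parameter through could look like; the variant names, the Identity import path, and the exact generics of jsonrpsee's Server are assumptions for illustration, not the code in this PR, and RpcServerMetrics refers to the struct sketched earlier:

use jsonrpsee::server::Server;
use tower::layer::util::Identity;

/// Illustrative only: the ws/http server enum carrying the metrics logger as
/// the second type parameter of jsonrpsee's Server.
enum WsHttpServerKind {
    /// Plain server without CORS, parameterized over the RpcServerMetrics logger
    Plain(Server<Identity, RpcServerMetrics>),
    // ... other variants (e.g. with a CORS layer) would thread the same
    // logger type through their Server generics
}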