feat: add canister metrics #107

Merged
merged 48 commits into from
Jan 4, 2024

Collaborator

@rvanasa rvanasa commented Dec 22, 2023

Includes metrics available on the /metrics HTTP endpoint in the standard ic-metrics-encoder format.

Progress:

  • Refactor metric system to represent individual Ethereum RPC methods and service hostnames
  • Use the same HTTP request logic for both JSON-RPC and Candid-RPC endpoints
  • Add traits to resolve metric labels and values based on data type
  • Add getMetrics canister method for simplified programmatic access
  • Test expected behavior
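
As a rough illustration of the first bullet, the sketch below keeps one counter per (RPC method, service hostname) pair and renders it in the Prometheus text exposition format, which is the format that ic-metrics-encoder produces. This is a hand-rolled stand-in using only the standard library, not the crate's API and not the code from this PR; `RequestLabels`, `Metrics`, `record_request`, `encode`, and the metric name `example_requests` are all hypothetical, and label values are assumed to need no escaping.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Hypothetical label set: one counter per (RPC method, service host) pair.
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct RequestLabels {
    method: String,
    host: String,
}

/// Hypothetical in-memory metrics store (deliberately not in stable memory).
#[derive(Default)]
struct Metrics {
    requests: BTreeMap<RequestLabels, u64>,
}

impl Metrics {
    /// Increment the counter for one method/host combination.
    fn record_request(&mut self, method: &str, host: &str) {
        let key = RequestLabels {
            method: method.to_string(),
            host: host.to_string(),
        };
        *self.requests.entry(key).or_insert(0) += 1;
    }

    /// Render all counters as Prometheus text, one line per label set.
    /// Assumes label values contain no quotes, backslashes, or newlines.
    fn encode(&self, timestamp_ms: u64) -> String {
        let mut out = String::new();
        out.push_str("# HELP example_requests Number of RPC requests.\n");
        out.push_str("# TYPE example_requests counter\n");
        for (labels, count) in &self.requests {
            writeln!(
                out,
                "example_requests{{method=\"{}\",host=\"{}\"}} {} {}",
                labels.method, labels.host, count, timestamp_ms
            )
            .expect("writing to a String cannot fail");
        }
        out
    }
}
```

A `/metrics` handler would then return the output of `encode` as the HTTP response body; the hostnames used to exercise it are up to the caller.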

Note that the metrics are not currently stored in stable memory, so redeploying the canister resets them. This is intentional: it allows breaking changes to the metrics over time and makes it quicker to recognize problems with newly deployed canisters. Downstream monitoring services can compute total statistics by summing all increases in each metric. Suggestions are welcome for other ways to approach persistence between upgrades.
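
One way a downstream service might recover totals across resets is sketched below. It assumes the service holds an ordered series of scraped counter values, treats the first sample as the baseline, and interprets any drop as a reset to zero; `total_increase` is a hypothetical helper, not part of this PR.

```rust
/// Sum all increases in an ordered series of counter samples.
/// A sample lower than its predecessor is treated as a counter reset,
/// so the whole new value counts as an increase.
fn total_increase(samples: &[u64]) -> u64 {
    let mut total = 0u64;
    let mut previous: Option<u64> = None;
    for &sample in samples {
        total += match previous {
            // Normal case: the counter grew (or stayed the same).
            Some(p) if sample >= p => sample - p,
            // Reset: the counter restarted from zero after a redeploy.
            Some(_) => sample,
            // First sample is only the baseline.
            None => 0,
        };
        previous = Some(sample);
    }
    total
}
```

For the samples `[0, 5, 9, 2, 4]` this counts 5 + 4 before the reset and 2 + 2 after it. Increases that happen between the last scrape and a reset are lost, so the result is a lower bound.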

Resolves #86.

@rvanasa rvanasa changed the title from "chore: add canister metrics" to "feat: add canister metrics" Dec 22, 2023
@rvanasa rvanasa marked this pull request as ready for review January 3, 2024 23:41
@@ -109,7 +109,7 @@ shared ({ caller = installer }) actor class Main() {
};

let candidRpcCycles = 1_000_000_000_000;
- let ethMainnetSource = #EthMainnet(?[#Ankr, #BlockPi, #Cloudflare, #PublicNode]);
+ let ethMainnetSource = #EthMainnet(?[#Ankr, #Cloudflare, #PublicNode]);
Contributor

Why is BlockPi removed?

Collaborator Author

@rvanasa rvanasa Jan 4, 2024

Was causing intermittent CI issues due to the response size limits. I'll see if there's a way to add it back in.

source: &ResolvedJsonRpcSource,
json_rpc_payload: &str,
payload_size_bytes: usize,
Contributor

Why is payload_size_bytes of type usize but max_response_bytes is u64?

Collaborator Author

Switching payload_size_bytes to u64 causes a lot of unnecessary conversions (since string/vec lengths are always returned as usize), whereas max_response_bytes is always a u64 due to being passed by the developer. It's a bit of an awkward situation, and it bugs me as well.

I'll explore changing this in a separate PR for a cleaner diff.
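
For context, the conversion in question looks roughly like the sketch below: lengths come back as `usize`, while a caller-supplied limit is a `u64`, so one side has to be converted at the comparison. `exceeds_limit` is a hypothetical helper used only for illustration, not a function from this PR.

```rust
use std::convert::TryFrom;

/// Hypothetical check comparing a payload length (`usize`) against a
/// caller-supplied byte limit (`u64`).
fn exceeds_limit(payload: &str, max_bytes: u64) -> bool {
    // `usize` is at most 64 bits wide on current Rust targets, so this
    // conversion does not fail in practice; `try_from` makes that explicit
    // instead of relying on an `as` cast.
    let len = u64::try_from(payload.len()).unwrap_or(u64::MAX);
    len > max_bytes
}
```

Keeping the length as `usize` until the single point of comparison avoids scattering such conversions through the code.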

1000,
);
let estimated_cost_10_extra_bytes = base_cost
+ 10 * (INGRESS_MESSAGE_BYTE_RECEIVED_COST + HTTP_OUTCALL_BYTE_RECEIEVED_COST)
Contributor

HTTP_OUTCALL_BYTE_RECEIEVED_COST -> HTTP_OUTCALL_BYTE_RECEIVED_COST

Collaborator Author

Thanks for catching that. For the record, this typo was in the original canister before I started working on it (along with many others).

+ 10 * (INGRESS_MESSAGE_BYTE_RECEIVED_COST + HTTP_OUTCALL_BYTE_RECEIEVED_COST)
* nodes_in_subnet as u128
/ NODES_IN_DEFAULT_SUBNET as u128;
// Request body with 10 additional bytes should be within 1 cycle of expected cost
Contributor

Can you clarify where the 1-cycle difference comes from?

Collaborator Author

Updated the comment; it's due to rounding since the unit test approximates the difference in cycles.
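
A minimal sketch of where such a 1-cycle gap can come from, assuming the real cost is computed with a single integer division while the test adds a separately divided term to an already rounded base cost. The constants and function names below are hypothetical and chosen only to make the rounding visible; they are not the canister's actual cost parameters.

```rust
// Hypothetical constants, not the canister's real cost parameters.
const FIXED_COST: u128 = 1_000_003;
const PER_BYTE_COST: u128 = 2_401;
const NODES_IN_DEFAULT_SUBNET: u128 = 13;

/// Cost of a request, scaled by subnet size with one integer division.
fn request_cost(payload_bytes: u128, nodes_in_subnet: u128) -> u128 {
    (FIXED_COST + payload_bytes * PER_BYTE_COST) * nodes_in_subnet / NODES_IN_DEFAULT_SUBNET
}

/// Estimate the cost after adding `extra_bytes`, starting from an already
/// rounded `base_cost`. The extra term is divided (and floored) on its own.
fn estimate_with_extra_bytes(base_cost: u128, extra_bytes: u128, nodes_in_subnet: u128) -> u128 {
    base_cost + extra_bytes * PER_BYTE_COST * nodes_in_subnet / NODES_IN_DEFAULT_SUBNET
}

/// Exact cost with 10 extra bytes minus the estimate built from the base cost.
/// Since floor((a + b) / d) >= floor(a / d) + floor(b / d), this never underflows.
fn rounding_gap(payload_bytes: u128, nodes_in_subnet: u128) -> u128 {
    let base = request_cost(payload_bytes, nodes_in_subnet);
    let exact = request_cost(payload_bytes + 10, nodes_in_subnet);
    exact - estimate_with_extra_bytes(base, 10, nodes_in_subnet)
}
```

Flooring two quotients separately can lose at most one unit compared with flooring their sum, so the gap is always 0 or 1 under these assumptions.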

RpcSource::EthMainnet(None),
RpcSource::EthMainnet(Some(vec![
EthMainnetService::Cloudflare,
EthMainnetService::BlockPi,
Contributor

BlockPi is still here! 🙂

Collaborator Author

Added it back for now (will address the occasional CI issues in another PR if necessary).

@rvanasa rvanasa merged commit 9b0c05b into main Jan 4, 2024
3 checks passed
@rvanasa rvanasa deleted the candid-rpc-metrics branch January 5, 2024 17:05
Development

Successfully merging this pull request may close these issues.

Implement canister metrics
2 participants