Implementing metrics #125
Comments
For JSON-RPC endpoints that are served over
@m-Peter also suggested tracking DB size over time, as well as DB query time.
Add metrics for index health. Trace index health depends on trace download success: if a download fails, the index becomes unhealthy. Transaction index health depends on how far the latest ingested event lags behind the latest height on the network: if it is too far behind, the index is unhealthy.
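The two health rules above could be sketched roughly as follows. This is only an illustration: the function names, the `IndexHealth` type, and the `max_lag` threshold of 100 heights are assumptions made for the example, not values taken from this issue or from the codebase.

```python
from dataclasses import dataclass


@dataclass
class IndexHealth:
    healthy: bool
    reason: str


def trace_index_health(failed_downloads: int) -> IndexHealth:
    """The trace index is unhealthy as soon as any trace download has failed."""
    if failed_downloads > 0:
        return IndexHealth(False, f"{failed_downloads} trace download(s) failed")
    return IndexHealth(True, "all trace downloads succeeded")


def transaction_index_health(
    latest_ingested_height: int,
    latest_network_height: int,
    max_lag: int = 100,  # hypothetical threshold, chosen only for this sketch
) -> IndexHealth:
    """The transaction index is unhealthy when ingestion lags too far behind."""
    lag = max(0, latest_network_height - latest_ingested_height)
    if lag > max_lag:
        return IndexHealth(False, f"ingestion is {lag} heights behind (limit {max_lag})")
    return IndexHealth(True, f"ingestion is {lag} heights behind")
```

Each result could then be exported as a 0/1 gauge per index, with the lag itself exported as a separate gauge so that the threshold can be tuned on the dashboard side.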
Another high-priority metric is #384.
The first set of metrics is implemented and a Grafana dashboard has been created: https://flowfoundation.grafana.net/d/PkvVJj4Mz/mainnet-general?from=now-24h&to=now&timezone=America%2FVancouver
We need comprehensive metrics to measure the performance and resource usage of our APIs. This will help us understand the performance of different API methods and track various states and errors.
Performance
Performance should be measured using tracing, so that sub-calls within a request are measured as well. Ideally, every network call and every API call is recorded as a sub-trace. Tracing should be enabled with a flag and be off by default.
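As a rough illustration of flag-gated tracing with nested sub-traces, the sketch below uses only the standard library. The `Tracer` and `Span` classes are hypothetical stand-ins for an OpenTelemetry tracer, the span names are made up for the example, and the sketch is not thread-safe.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Span:
    name: str
    duration_s: float = 0.0
    children: List["Span"] = field(default_factory=list)


class Tracer:
    """Records nested spans only when enabled; disabled by default."""

    def __init__(self, enabled: bool = False) -> None:
        self.enabled = enabled
        self.roots: List[Span] = []
        self._stack: List[Span] = []  # currently open spans, innermost last

    @contextmanager
    def span(self, name: str) -> Iterator[Optional[Span]]:
        if not self.enabled:
            # Tracing is off: run the body without recording anything.
            yield None
            return
        span = Span(name)
        # Attach to the innermost open span, or register as a new root.
        (self._stack[-1].children if self._stack else self.roots).append(span)
        self._stack.append(span)
        start = time.perf_counter()
        try:
            yield span
        finally:
            span.duration_s = time.perf_counter() - start
            self._stack.pop()


# Usage: the network call shows up as a sub-trace of the API call.
tracer = Tracer(enabled=True)  # in a real service this would come from a flag
with tracer.span("api: example_method"):
    with tracer.span("network: example_call"):
        pass
```

With `enabled=False` the same `with` blocks run unchanged and record nothing, which keeps the call sites identical whether or not the flag is set.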
Each API request should also submit a simple metric measuring the time it took to process the request.
Be careful to also include metrics for WebSocket requests and responses.
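A minimal sketch of such a per-request timing metric is shown below. The in-memory `REQUEST_DURATIONS` dictionary stands in for a Prometheus histogram labelled by transport and method. The `timed` decorator, the label names, and the example handlers are assumptions made for illustration only.

```python
import time
from collections import defaultdict
from functools import wraps
from typing import Callable, DefaultDict, List, Tuple

# (transport, method) -> observed processing times in seconds
REQUEST_DURATIONS: DefaultDict[Tuple[str, str], List[float]] = defaultdict(list)


def timed(transport: str, method: str) -> Callable:
    """Record how long each call to the wrapped handler takes."""

    def decorator(handler: Callable) -> Callable:
        @wraps(handler)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return handler(*args, **kwargs)
            finally:
                # Recorded for failed requests too, so errors are not invisible.
                elapsed = time.perf_counter() - start
                REQUEST_DURATIONS[(transport, method)].append(elapsed)

        return wrapper

    return decorator


# Hypothetical handlers: the same method is tracked separately per transport.
@timed("http", "example_method")
def handle_http() -> str:
    return "ok"


@timed("websocket", "example_method")
def handle_websocket() -> str:
    return "ok"
```

Labelling by transport keeps WebSocket traffic visible as its own series instead of being folded into the HTTP numbers.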
State
Ingestion
We should use Prometheus and OpenTelemetry to collect the traces and metrics.