Add basic prometheus exporter #238

Merged

lmendes86

We recently started running Resgate on our platform; thank you for this repo, by the way. However, we still needed basic metrics from it, so we added an optional Prometheus exporter to Resgate to gain insight into how it is operating.
We also upgraded to the latest Golang version, but if required, we can remove those changes from this pull request.
I hope this helps!

lmendes86 and others added 8 commits March 31, 2023 14:45
* Support for traceparent correlation header

When receiving a NATS header containing a traceparent id then we relay
it to subsequent collection requests to allow the server application to
correlate them.
* Setup GitHub Actions

- Build and push Docker image to GitHub Container Registry.
- Run tests and linting.
- Update Dockerfile to use less strict base image.

* Add OCI image source for permissions
@Atomzwieback

@jirenius will this not be merged? Would be cool to have some basic metrics

@jirenius
Collaborator

Sorry for the radio silence. The resgate project has been on ice for a while, even though the gateway has been actively used in other projects led by me. Now I am working on a new release with some further improvements, bugfixes, and updated dependencies.
I wish to include this PR in the release, but due to other changes made, I will merge it into a side branch and handle the conflicts there.

Also, thanks for this PR inspiring other improvements!

@jirenius jirenius changed the base branch from master to feature/gh-255-add-prometheus-exporter June 30, 2024 16:29
@jirenius jirenius merged commit fccb8a2 into resgateio:feature/gh-255-add-prometheus-exporter Jun 30, 2024
@jirenius
Collaborator

jirenius commented Jul 1, 2024

Testing it, I see one issue with the number of dependencies that come with the prometheus package. The compiled file size increased by about 60% (6MB), and the added dependencies make the server more vulnerable to dependency chain abuse.

One option would be to use a dependency-free package (e.g. github.com/bsm/openmetrics) to expose the desired metrics.
The ones I would add would probably be:

  • process_start_time_seconds
  • go_memstats_* (using runtime.MemStats)
  • go_info
    • version
  • resgate_info - Gauge (set to 1)
    • version=<resgate version>,protocol=`
  • resgate_ws_current_connections - Gauge
  • resgate_ws_connections_total - Counter
  • resgate_ws_subscriptions - Gauge
    • type=direct
    • type=indirect
  • resgate_ws_requests_total - Counter
    • method=get
    • method=subscribe
    • method=call
    • method=auth
  • resgate_cached_resources - Gauge
  • resgate_http_requests_total - Counter
    • method=POST
    • method=GET
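
To illustrate how little is needed to expose metrics like the ones listed above without extra dependencies, here is a minimal sketch using only the Go standard library. It is not Resgate's implementation and does not use the API of github.com/bsm/openmetrics; the `Registry` type and its `Add` and `Render` methods are hypothetical names made up for this example. It renders counters and gauges in the Prometheus text exposition format.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// sample is one series: a metric name, an optional label string, and a value.
type sample struct {
	name, labels string
	value        float64
}

// Registry is a hypothetical, minimal metric store for this sketch only.
type Registry struct {
	mu      sync.Mutex
	samples map[string]*sample // keyed by name + "{" + labels + "}"
	types   map[string]string  // metric name -> "gauge" or "counter"
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{samples: map[string]*sample{}, types: map[string]string{}}
}

// Add changes a series by delta, creating it on first use.
// labels is a pre-formatted label list such as `method="get"`, or "".
func (r *Registry) Add(typ, name, labels string, delta float64) {
	r.mu.Lock()
	defer r.mu.Unlock()
	key := name + "{" + labels + "}"
	s, ok := r.samples[key]
	if !ok {
		s = &sample{name: name, labels: labels}
		r.samples[key] = s
		r.types[name] = typ
	}
	s.value += delta
}

// Render returns all series in the Prometheus text exposition format,
// grouped by metric name with one TYPE line per metric.
func (r *Registry) Render() string {
	r.mu.Lock()
	defer r.mu.Unlock()
	list := make([]*sample, 0, len(r.samples))
	for _, s := range r.samples {
		list = append(list, s)
	}
	sort.Slice(list, func(i, j int) bool {
		if list[i].name != list[j].name {
			return list[i].name < list[j].name
		}
		return list[i].labels < list[j].labels
	})
	var b strings.Builder
	last := ""
	for _, s := range list {
		if s.name != last {
			fmt.Fprintf(&b, "# TYPE %s %s\n", s.name, r.types[s.name])
			last = s.name
		}
		if s.labels == "" {
			fmt.Fprintf(&b, "%s %g\n", s.name, s.value)
		} else {
			fmt.Fprintf(&b, "%s{%s} %g\n", s.name, s.labels, s.value)
		}
	}
	return b.String()
}

func main() {
	reg := NewRegistry()
	reg.Add("gauge", "resgate_ws_current_connections", "", 1)
	reg.Add("counter", "resgate_ws_connections_total", "", 1)
	reg.Add("counter", "resgate_ws_requests_total", `method="get"`, 1)
	reg.Add("counter", "resgate_ws_requests_total", `method="subscribe"`, 2)
	fmt.Print(reg.Render())
}
```

In a real server the string returned by `Render` would be written from an HTTP handler on a metrics endpoint. A gauge such as `resgate_ws_current_connections` would be decremented with a negative delta when a connection closes.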

I think I'll skip:

  • resgate_nats_connected - Resgate stops if the connection is closed, so it will always be 1 on a successful scrape.
  • resgate_subscriptions - (or rather, per-resource subscription labels) Some solutions may have many thousands of different subscriptions, causing the metrics response to be huge. Possibly make it opt-in through configuration.

@lmendes86
Author

It's nice to hear that this is being taken care of!
Those metrics look good! We are using resgate_subscriptions, but I understand that it could produce a very large number of series if there are many subscription topics. For us it is quite insightful, so it would be useful to keep it as an opt-in if you think that is possible. Here is an example of a Grafana visualization of our current implementation: (image)
Thanks in advance for the work!

@jirenius
Collaborator

jirenius commented Jul 2, 2024

Ah, that is nice!

For the grouping of resource IDs, Resgate would need some sort of knowledge of patterns. In your branch, you've solved it by detecting {id} and {uuid} parts. But I will try to see if I can come up with a more generic way to solve it. One way would be to provide resgate with resource patterns to track metrics through configuration:

{
   "metrics": {
      "resourcePatterns": [
         "availability.client.*",
         "availability.client.*.user.*",
         "availability.client.*.user.*.device",
         "availability.client.*.user.*.device.*",
         "dashboard.client.*",
         "dashboard.queue.*",
         "usertoken"
      ]
   }
}

It would require you to manually update resgate's configuration with the resource patterns.
So, it might work for some use cases.
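
As a rough sketch of how such configured patterns could be used to group resource IDs under a bounded set of metric labels: the code below assumes that `*` matches exactly one dot-separated part of a resource ID, and that the first matching pattern wins. Both are assumptions for illustration, and `matchPattern` and `labelFor` are hypothetical helper names, not part of Resgate.

```go
package main

import (
	"fmt"
	"strings"
)

// matchPattern reports whether rid matches pattern, where "*" is assumed
// to match exactly one dot-separated part (an assumption of this sketch).
func matchPattern(pattern, rid string) bool {
	pp := strings.Split(pattern, ".")
	rp := strings.Split(rid, ".")
	if len(pp) != len(rp) {
		return false
	}
	for i, p := range pp {
		if p != "*" && p != rp[i] {
			return false
		}
	}
	return true
}

// labelFor returns the first configured pattern that matches rid, to be
// used as the metric label value. ok is false for untracked resources.
func labelFor(patterns []string, rid string) (label string, ok bool) {
	for _, p := range patterns {
		if matchPattern(p, rid) {
			return p, true
		}
	}
	return "", false
}

func main() {
	patterns := []string{
		"availability.client.*",
		"availability.client.*.user.*",
		"usertoken",
	}
	// Count subscriptions per pattern instead of per resource ID, so the
	// number of label values is limited by the configuration.
	counts := map[string]int{}
	for _, rid := range []string{
		"availability.client.42",
		"availability.client.42.user.7",
		"availability.client.43",
		"usertoken",
		"untracked.resource",
	} {
		if label, ok := labelFor(patterns, rid); ok {
			counts[label]++
		}
	}
	fmt.Println(counts)
}
```

Because every resource ID is reduced to one of the configured patterns, the size of the metrics response depends on the number of patterns rather than on the number of subscribed resources.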

Anyway, while I did not merge your PR into develop, since I chose to solve it differently and with a different package, it was still a great inspiration in many ways! Big thanks for it!
