-
Notifications
You must be signed in to change notification settings - Fork 309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Updated the PostgreSQL Prometheus documentation and dashboard #581
Updated the PostgreSQL Prometheus documentation and dashboard #581
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EvanSimpson let's discuss this next week -- of the current wave of exporter doc updates this one involves changing the metric names outright, thereby breaking the dashboards.
It feels wrong to bump the db version, exporter version, update the dashboard to a new working state but everyone who set up the older version of the integration against postgres 14 now has a non-functional dashboard (especially since the integration detection using the characteristic metric pg_settings_max_connections will continue to work against both old and new setups).
@@ -7,7 +7,7 @@ platforms: | |||
exporter_metadata: | |||
name: GKE Prometheus Exporter | |||
doc_url: https://github.com/prometheus-community/postgres_exporter | |||
minimum_supported_version: v0.11.1 | |||
minimum_supported_version: v0.13.1 | |||
default_metrics: | |||
- name: prometheus.googleapis.com/pg_stat_database_numbackends/gauge | |||
prometheus_name: pg_stat_database_numbackends |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that the list of metrics below need to be updated as well
Yeah, this looks like a breaking change for existing users. I've updated the integration guide to include criteria for breaking changes and how to version dashboards and integrations to keep all users in a working state. It looks like this warrants a version split. https://docs.google.com/document/d/16luyU9r9YBPOxTL1F5T-n1m0RiCh22qf-oH_p8K_A34/edit |
After looking into this further, I learned that I am mistaken about the changes to the metric names. PostgreSQL prometheus metrics were flowing into the same project from another system, which created confusion. After stopping that and looking at the release notes for the exporter, I found this in the
Tested PostgreSQL 15.3 against the previous exporter versions (0.12 and 0.11) and the integration/dashboard is working as expected, but with version image: quay.io/prometheuscommunity/postgres-exporter:v0.13.0
args:
- --collector.stat_statements
- --collector.stat_database
- --collector.database I'm thinking either there's an issue with how I'm configuring the collector to run, or perhaps there is something I missed with the new version of PostgreSQL. Based on the change in the release notes, I don't think the names for the metrics did change, so we might not have to split versions. |
Thanks for investigating! Could this be related to prometheus-community/postgres_exporter#818 ? From the comments on that thread, that issue can be spotted via the logs on the exporter. If so, maybe we should hold off on updating the documentation to support 0.13.x for now? |
let me add a logger to it, maybe it'll spit out some errors |
@yqlu I think that for now we should hold off, as you suggested. Would you like this PR closed for now? I left a comment on this thread regarding what I'm seeing for metrics when using PostgreSQL 15.3 and version |
@yqlu It was determined that there is a race condition that causes some collectors to fail in the exporter. The latest information is here. I went ahead and tested out the exporter after letting PostgreSQL start completely and after that, I saw the expected metrics. That means the dashboard doesn't have to be updated and with a fix for the problem on the way I am wondering how we'd like to handle that in the docs, if at all? |
As discussed, let's:
|
Watch for this to be merged and released |
The PR that was mentioned has been merged, but the tag for the container that includes the fix is only on |
Changes
Details
During testing, we learned that
_total
suffix added to them and updates were made to the dashboard json to account for this.