postgresql_exporter 0.14.0 leaks connections when queried simultaneously #921

Closed
btmc opened this issue Sep 21, 2023 · 10 comments · Fixed by #931

Comments

@btmc
btmc commented Sep 21, 2023

What did you do?

I run postgres_exporter in an environment where three vmagents scrape the exporter.

They scrape almost simultaneously every time: tcpdump shows all three HTTP requests arriving before the first response starts to be returned.

On the exporter side I see multiple 'collector failed' errors every scrape round, in random collector modules.

On the postgres side I see the following:

On the first round of scrapes, there are 3 new connections in postgres: two of them have 'select version()' as their last query and stay idle, and one is functional.
On every subsequent round of scrapes, there are 2 additional connections (the previous idle connections remain), which are also idle; the first functional connection continues to be used.

I tried exporter version 0.13.2 in the same vmagent setup and it was fine: there are two connections on the postgres side, and they are reused.

Also, there are no leaks on version 0.14.0 when I make HTTP requests one at a time.

I guess it might be related to the sql.Open call in the instance.setup method, which is called on every incoming request in 0.14.0, but only once, at collector initialization, in 0.13.2.

https://github.com/prometheus-community/postgres_exporter/blob/v0.14.0/collector/instance.go#L46
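
A minimal sketch of how that guess could produce the pattern described above. It assumes (this is not taken from the exporter source) that setup stores a freshly opened *sql.DB on an object shared by all scrapes. The fake driver, the `instance` type and the `leaked` helper are hypothetical names; the driver only counts connections so the example runs without PostgreSQL. Overlapping scrapes are simulated by calling setup for every scrape before any of them cleans up.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync/atomic"
)

// openConns counts connections the fake driver has opened and not yet closed.
var openConns int64

// fakeDriver stands in for a real PostgreSQL driver so the sketch runs offline.
type fakeDriver struct{}

func (fakeDriver) Open(string) (driver.Conn, error) {
	atomic.AddInt64(&openConns, 1)
	return &fakeConn{}, nil
}

type fakeConn struct{}

func (*fakeConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("not supported") }
func (*fakeConn) Begin() (driver.Tx, error)           { return nil, errors.New("not supported") }
func (*fakeConn) Close() error {
	atomic.AddInt64(&openConns, -1)
	return nil
}

func init() { sql.Register("fake", fakeDriver{}) }

// instance mimics a long-lived object that every scrape shares (hypothetical).
type instance struct{ db *sql.DB }

// setup opens a new pool on every call and overwrites the stored handle.
func (i *instance) setup() {
	db, err := sql.Open("fake", "")
	if err != nil {
		panic(err)
	}
	db.SetMaxOpenConns(1)
	db.SetMaxIdleConns(1)
	i.db = db
	// Ping forces a real connection, much like a version query would.
	if err := db.Ping(); err != nil {
		panic(err)
	}
}

// leaked runs n scrapes against one shared instance and reports how many
// connections are still open once every scrape has cleaned up.
func leaked(n int, overlap bool) int64 {
	before := atomic.LoadInt64(&openConns)
	shared := &instance{}
	if overlap {
		for k := 0; k < n; k++ {
			shared.setup() // every scrape starts before any finishes
		}
		for k := 0; k < n; k++ {
			shared.db.Close() // each scrape closes the same, last handle
		}
	} else {
		for k := 0; k < n; k++ {
			shared.setup()
			shared.db.Close()
		}
	}
	return atomic.LoadInt64(&openConns) - before
}

func main() {
	fmt.Println("overlapping scrapes leak:", leaked(3, true))
	fmt.Println("sequential scrapes leak: ", leaked(3, false))
}
```

With three scrapes, the overlapping case leaves two connections open and the sequential case leaves none, which would match both observations above if the assumption holds.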

What did you expect to see?

Postgres connections are correctly handled.

What did you see instead? Under which circumstances?

Postgres connections accumulate until the connection limit is reached.

Environment

  • System information:
Linux 5.10.0-25-amd64 x86_64
  • postgres_exporter version:
postgres_exporter, version 0.14.0 (branch: HEAD, revision: c06e57db4e502696ab4e8b8898bb2a59b7b33a59)
  build user:       root@f2337de13240
  build date:       20230920-01:43:49
  go version:       go1.20.8
  platform:         linux/amd64
  tags:             netgo static_build
  • postgres_exporter flags:

  • PostgreSQL version:

PostgreSQL 16.0 (Debian 16.0-1.pgdg120+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 12.2.0-14) 12.2.0, 64-bit
  • Logs:
ts=2023-09-21T15:09:19.429Z caller=collector.go:199 level=error msg="collector failed" name=database duration_seconds=0.072067468 err="sql: database is closed"
ts=2023-09-21T15:09:19.429Z caller=collector.go:199 level=error msg="collector failed" name=wal duration_seconds=0.055904477 err="sql: database is closed"
ts=2023-09-21T15:09:19.431Z caller=collector.go:199 level=error msg="collector failed" name=database duration_seconds=0.057682597 err="sql: database is closed"
ts=2023-09-21T15:09:29.426Z caller=collector.go:199 level=error msg="collector failed" name=replication_slot duration_seconds=0.067115499 err="sql: database is closed"
ts=2023-09-21T15:09:29.426Z caller=collector.go:199 level=error msg="collector failed" name=locks duration_seconds=0.066662661 err="sql: database is closed"
ts=2023-09-21T15:09:29.429Z caller=collector.go:199 level=error msg="collector failed" name=database duration_seconds=0.069763998 err="sql: database is closed"
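
The 'sql: database is closed' text in these logs is the error Go's database/sql package returns when a *sql.DB is used after Close. A small sketch, assuming two scrapes hold the same handle and one finishes first; `offlineConnector` and `queryAfterClose` are hypothetical names, and no database is contacted:

```go
package main

import (
	"context"
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// offlineConnector is a placeholder connector; the sketch never reaches Connect
// because the handle is already closed when the query is issued.
type offlineConnector struct{}

func (offlineConnector) Connect(context.Context) (driver.Conn, error) {
	return nil, errors.New("no database in this sketch")
}

func (offlineConnector) Driver() driver.Driver { return nil }

// queryAfterClose shows what a still-running scrape sees once another scrape
// has closed the handle they both share.
func queryAfterClose() error {
	shared := sql.OpenDB(offlineConnector{})
	scrapeA, scrapeB := shared, shared // two scrapes, one handle

	if err := scrapeA.Close(); err != nil { // scrape A finishes and cleans up
		return err
	}

	var version string
	return scrapeB.QueryRow("SELECT version()").Scan(&version) // scrape B is still collecting
}

func main() {
	fmt.Println(queryAfterClose())
}
```

If this is what happens in the exporter, it would explain why the failing collector differs from round to round: whichever collector is still running when another scrape closes the handle gets the error.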
@sysadmind
Contributor

Are you using the multi target feature (/probe) or the traditional /metrics endpoint?

To clarify, is this multiple systems scraping the exporter, which is connected to a single postgres server? Or are there multiple postgres servers?

@btmc
Author

btmc commented Sep 22, 2023

Are you using the multi target feature (/probe) or the traditional /metrics endpoint?

I'm using /metrics endpoint.

To clarify, is this multiple systems scraping the exporter, which is connected to a single postgres server? Or are there multiple postgres servers?

Multiple systems are scraping one exporter connected to one postgres server.

@weastur
Contributor

weastur commented Sep 26, 2023

Got the same, after upgrading to 0.14.0

@nicolaiarocci
nicolaiarocci commented Sep 26, 2023

Same here. We went back to 0.13.2 and the open connections are back to normal (we went from 200-ish to 1000-ish as soon as 0.14 went up; yes, we do have many DBs).

@sysadmind
Contributor

I think I see the problem now. The instance{} is shared when using /metrics and it's limited to a single connection. I'm working on a fix to clone the instance for each scrape with a separate connection, but it's a bit trickier to test, so it may take a bit of time to work through that.
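
A rough sketch of that idea, not the actual patch: each scrape copies the shared instance and opens and closes its own handle, so concurrent scrapes cannot close or overwrite each other's connection. The fake driver, the `instance` fields, `copy` and `concurrentScrapes` are hypothetical names chosen for illustration; the driver only counts connections so the example runs without PostgreSQL.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
	"sync"
	"sync/atomic"
)

// openConns counts connections the fake driver has opened and not yet closed.
var openConns int64

type fakeDriver struct{}

func (fakeDriver) Open(string) (driver.Conn, error) {
	atomic.AddInt64(&openConns, 1)
	return &fakeConn{}, nil
}

type fakeConn struct{}

func (*fakeConn) Prepare(string) (driver.Stmt, error) { return nil, errors.New("not supported") }
func (*fakeConn) Begin() (driver.Tx, error)           { return nil, errors.New("not supported") }
func (*fakeConn) Close() error {
	atomic.AddInt64(&openConns, -1)
	return nil
}

func init() { sql.Register("fake", fakeDriver{}) }

type instance struct {
	dsn string
	db  *sql.DB
}

// copy returns a fresh instance with the same DSN but no shared handle.
func (i *instance) copy() *instance { return &instance{dsn: i.dsn} }

func (i *instance) setup() error {
	db, err := sql.Open("fake", i.dsn)
	if err != nil {
		return err
	}
	db.SetMaxOpenConns(1)
	db.SetMaxIdleConns(1)
	i.db = db
	return db.Ping() // forces a real connection
}

func (i *instance) close() error {
	if i.db == nil {
		return nil
	}
	return i.db.Close()
}

// concurrentScrapes runs n overlapping scrapes and reports how many connections
// were open while all of them were running and how many remain afterwards.
func concurrentScrapes(n int) (peak, leaked int64) {
	base := atomic.LoadInt64(&openConns)
	template := &instance{dsn: "fake"}
	var ready, done sync.WaitGroup
	release := make(chan struct{})
	for k := 0; k < n; k++ {
		ready.Add(1)
		done.Add(1)
		go func() {
			defer done.Done()
			inst := template.copy() // private handle for this scrape
			defer inst.close()
			err := inst.setup()
			ready.Done()
			if err != nil {
				return
			}
			<-release // hold the connection until every scrape has started
		}()
	}
	ready.Wait()
	peak = atomic.LoadInt64(&openConns) - base
	close(release)
	done.Wait()
	leaked = atomic.LoadInt64(&openConns) - base
	return peak, leaked
}

func main() {
	peak, leaked := concurrentScrapes(3)
	fmt.Println("open while scraping:", peak, "open afterwards:", leaked)
}
```

The trade-off in this sketch is one short-lived connection per in-flight scrape instead of a single reused one, in exchange for scrapes that no longer interfere with each other.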

@CarpathianUA

Experiencing the same

@b-a-t
b-a-t commented Oct 9, 2023

I think I see the problem now. The instance{} is shared when using /metrics and it's limited to a single connection. I'm working on a fix to clone the instance for each scrape with a separate connection, but it's a bit trickier to test, so it may take a bit of time to work through that.

For a moment I thought @btmc was one of my colleagues, as we also have 3 vmagents scraping the same exporter 😄
But, to complicate the setup even more, we access postgres_exporter through exporter_exporter, which is a dedicated reverse proxy for exporters.

So, in our case, I'm not certain that it's easy to distinguish where the scraping connections are coming from. Well, hopefully, connections from the proxy are distinct enough:

tcp    ESTAB      0      0      127.0.0.1:9187                 127.0.0.1:45690
tcp    ESTAB      0      0      127.0.0.1:45688                127.0.0.1:9187
tcp    ESTAB      0      0      127.0.0.1:45690                127.0.0.1:9187
tcp    ESTAB      0      0      127.0.0.1:9187                 127.0.0.1:45688

sysadmind added a commit to sysadmind/postgres_exporter that referenced this issue Oct 10, 2023
sysadmind added a commit that referenced this issue Oct 10, 2023
@GauntletWizard

This was pretty bad: it brought down one of our DB servers last night. Any chance you can roll a point release?

@Monstrofil

@SuperQ this caused downtime on our servers too. When can we expect a release with the fix?

BupycHuk pushed a commit to percona/postgres_exporter that referenced this issue Nov 15, 2023
BupycHuk pushed a commit to percona/postgres_exporter that referenced this issue Nov 15, 2023
jaimeyh added a commit to sysdiglabs/postgres_exporter that referenced this issue Feb 22, 2024
* Dashboard linting improvements for mixin

Signed-off-by: Ryan J. Geyer <[email protected]>

* Convert pg_stat_database to new collector model

Signed-off-by: Joe Adams <[email protected]>

* Capture usename and application_name for pg_stat_activity

It is necessary to be able to exclude backups from long-running
transaction alerts, as they are to be expected. With the current
pg_stat_activity metric there is no ability to filter out
specific users or application names.

Resolves prometheus-community#668

Signed-off-by: cezmunsta <[email protected]>

* Fixed formatting

Signed-off-by: cezmunsta <[email protected]>

* Update common Prometheus files

Signed-off-by: prombot <[email protected]>

* probe: clean-up database connection after probe to prevent connection leak

Signed-off-by: Kurtis Bass <[email protected]>

* Set gauge to 1 when collector is successful

Signed-off-by: Julien Pivotto <[email protected]>
Signed-off-by: Khiem Doan <[email protected]>

* Add postgres 15 for CI test

Signed-off-by: Khiem Doan <[email protected]>

* Add postgres 15 for CI test

Signed-off-by: Khiem Doan <[email protected]>

* New unit value 64kB

Signed-off-by: Oleksandr Mysyura <[email protected]>

* Update common Prometheus files

Signed-off-by: prombot <[email protected]>

* Update exporter-toolkit

Update to the latest exporter-toolkit
* Enables multi-listener and systemd socket activation.
* Bump Go to 1.19.
* Remove `PG_EXPORTER_WEB_LISTEN_ADDRESS` env var because this is now a
  repeatable flag.

Signed-off-by: SuperQ <[email protected]>

* go fmt

Signed-off-by: SuperQ <[email protected]>

* adding codified functionality for logical replication metrics

Signed-off-by: Zachary Caldarola <[email protected]>

* Bump github.com/prometheus/client_golang from 1.13.0 to 1.14.0

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.13.0 to 1.14.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.13.0...v1.14.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Bump github.com/prometheus/common from 0.37.0 to 0.39.0

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.37.0 to 0.39.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](prometheus/common@v0.37.0...v0.39.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* addressing comments

Signed-off-by: Zachary Caldarola <[email protected]>

* more comments

Signed-off-by: Zachary Caldarola <[email protected]>

* fmt

Signed-off-by: Zachary Caldarola <[email protected]>

* typing

Signed-off-by: Zachary Caldarola <[email protected]>

* fmt

Signed-off-by: Zachary Caldarola <[email protected]>

* send stdout/stderr to syslog

Signed-off-by: Mike <[email protected]>

* Update common Prometheus files

Signed-off-by: prombot <[email protected]>

* Fix exclude-databases for collector package

The pg_database collector was not respecting the --exclude-databases flag and causing problems where databases were not accessible. This now respects the list of databases to exclude.

- Adjusts the Collector create func to take a config struct instead of a logger. This allows more changes like this in the future. I figured we would need to do this at some point but I wasn't sure if we could hold off.
- Split the database size collection to a separate query when database is not excluded.
- Comment some probe code that was not useful/accurate

Signed-off-by: Joe Adams <[email protected]>

* Remove commented code

Signed-off-by: Joe Adams <[email protected]>

* Remove more dead code

Signed-off-by: Joe Adams <[email protected]>

* Update build

* Update Go to 1.20.
* Update golangci-lint.
* Bump modules.
* Update CI orb.
* Fix up use of deprecated ioutil.

Signed-off-by: SuperQ <[email protected]>

* Reduce cardinality of pg_stat_statements

Make the example queries.yaml `pg_stat_statements` query safer.
* Select the top 10% of queries by total query time.
* Only expose the top 100 queries by total query time.
* Keep only the most useful metrics.
* Comment out the example by default.

Fixes: prometheus-community#549

Signed-off-by: SuperQ <[email protected]>

* Update changelog and version for v0.12.0 release

Signed-off-by: Joe Adams <[email protected]>

* Update exporter-toolkit

Updates the exporter-toolkit to the latest version
* Adds new landing page feature.
* Allow metrics path to be on `/`.

Signed-off-by: SuperQ <[email protected]>

* Update common Prometheus files

Signed-off-by: prombot <[email protected]>

* Fix column type for pg_replication_slots

Change the data type of `active` from int64 to bool. The documentation confirms that this is a boolean field.
https://www.postgresql.org/docs/current/view-pg-replication-slots.html

fixes prometheus-community#769

Signed-off-by: Joe Adams <[email protected]>

* Update versions listed in the README

Update the supported versions based on what we actually test in CI.

Signed-off-by: SuperQ <[email protected]>

* Update README cli flags

These have not been kept up to date.

Signed-off-by: Joe Adams <[email protected]>

* Adjust log level for collector startup

Since we support both multi-target and typical direct scrapes, either of these can fail and it is no longer an error.

Signed-off-by: Joe Adams <[email protected]>

* Fix pg_setting different help values

Signed-off-by: GitHub <[email protected]>

* Supports alternate postgres:// prefix in URLs

Adds support for the alternate postgres:// prefix in URLs. It's maybe
not the cleanest approach, but works.  Hoping I can either get some
pointers on a more appropriate patch, or that we could use this in the
interim to unblock this use-case.

Signed-off-by: Jack Wink <[email protected]>

* Bump github.com/lib/pq from 1.10.7 to 1.10.9

Bumps [github.com/lib/pq](https://github.com/lib/pq) from 1.10.7 to 1.10.9.
- [Release notes](https://github.com/lib/pq/releases)
- [Commits](lib/pq@v1.10.7...v1.10.9)

---
updated-dependencies:
- dependency-name: github.com/lib/pq
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>

* Refactor collector descriptors

Use individual collector metric descriptor vars to help avoid
miss-mapped or unused metrics.

Signed-off-by: SuperQ <[email protected]>

* Bump github.com/prometheus/common from 0.42.0 to 0.44.0

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.42.0 to 0.44.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](prometheus/common@v0.42.0...v0.44.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update linting

* Move errcheck exclude list to config file.
* Enable revive linter
* Fix up revive linting issues.

Signed-off-by: SuperQ <[email protected]>

* Bump github.com/prometheus/exporter-toolkit from 0.9.1 to 0.10.0

Bumps [github.com/prometheus/exporter-toolkit](https://github.com/prometheus/exporter-toolkit) from 0.9.1 to 0.10.0.
- [Release notes](https://github.com/prometheus/exporter-toolkit/releases)
- [Changelog](https://github.com/prometheus/exporter-toolkit/blob/master/CHANGELOG.md)
- [Commits](prometheus/exporter-toolkit@v0.9.1...v0.10.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/exporter-toolkit
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Move queries from queries.yaml to collectors (prometheus-community#801)

Signed-off-by: Ben Kochie <[email protected]>

* Fix pg_stat_database collector

The signature for creating a collector changed and CI didn't retrigger. Move metrics out of map and into individual vars.

Signed-off-by: Joe Adams <[email protected]>

* Fix up collector registration (prometheus-community#812)

Use const definitions to make collector registration consistent.
* Use collector subsystem name consistently.
* Fix up replication metric name unit.

Signed-off-by: SuperQ <[email protected]>

* Update release info for v0.12.1

Signed-off-by: Joe Adams <[email protected]>

* Deprecate extend queries feature (prometheus-community#811)

Mark the extend queries feature as deprecated in favor of recommending
the sql_exporter.

Signed-off-by: SuperQ <[email protected]>

* Update common Prometheus files

Signed-off-by: prombot <[email protected]>

* Deprecate additional database features

Now that we have deprecated extended queries we can deprecate related
database features.
* Deprecate flags/functions around auto discover databases.
* Deprecate flags/functions for additional constant labels.

Signed-off-by: SuperQ <[email protected]>

* Release v0.13.0

BREAKING CHANGES:

Please note, the following features are deprecated and may be removed in a future release:
- `auto-discover-databases`
- `extend.query-path`
- `constantLabels`
- `exclude-databases`
- `include-databases`

This exporter is meant to monitor PostgreSQL servers, not the user data/databases. If
you need a generic SQL report exporter https://github.com/burningalchemist/sql_exporter
is recommended.

* [CHANGE] Adjust log level for collector startup prometheus-community#784
* [CHANGE] Move queries from queries.yaml to collectors prometheus-community#801
* [CHANGE] Deprecate extend queries feature prometheus-community#811
* [CHANGE] Deprecate additional database features prometheus-community#815
* [CHANGE] Convert pg_stat_database to new collector prometheus-community#685
* [ENHANCEMENT] Supports alternate postgres:// prefix in URLs prometheus-community#787
* [BUGFIX] Fix pg_setting different help values prometheus-community#771
* [BUGFIX] Fix column type for pg_replication_slots prometheus-community#777
* [BUGFIX] Fix pg_stat_database collector prometheus-community#809

Signed-off-by: SuperQ <[email protected]>

* Add the instance struct to handle connections

The intent is to use the instance struct to hold the connection to the database as well as metadata about the instance. Currently this metadata only includes the version of postgres for the instance which can be used in the collectors to decide what query to run. In the future this could hold more metadata but for now it keeps the Collector interface arguments to a reasonable number.

Signed-off-by: Joe Adams <[email protected]>

* chore: fix a few typos

Signed-off-by: Alex Tymchuk <[email protected]>

* Bug fix: Make collector not fail on null values (prometheus-community#823)

* Make all values nullable

---------

Signed-off-by: Felix Yuan <[email protected]>
Co-authored-by: Ben Kochie <[email protected]>

* Release 0.13.1 (prometheus-community#824)

* [BUGFIX] Make collectors not fail on null values prometheus-community#823

Signed-off-by: SuperQ <[email protected]>

* Fixed replication pgReplicationSlotQuery - now it's working correctly for replica and primary (prometheus-community#825)

Signed-off-by: Vadim Voitenko <[email protected]>
Co-authored-by: Vadim Voitenko <[email protected]>

* Migrate pg_locks to collector package (prometheus-community#817)

Migrate the `pg_locks_count` query from `main` to the `collector`
package.

Signed-off-by: SuperQ <[email protected]>

* Cleanup collectors (prometheus-community#826)

Fix up `replication` and `process_idle` Update input params to match
the rest of the collectors.

Signed-off-by: SuperQ <[email protected]>

* Bug Fix: Fix lingering type issues (prometheus-community#828)

* Fix postmaster type issue
* Disable postmaster collector by default

---------

Signed-off-by: Felix Yuan <[email protected]>

* Update common Prometheus files (prometheus-community#829)

Signed-off-by: prombot <[email protected]>

* Fix replication collector

Signed-off-by: Tom Hughes <[email protected]>

* Add some more escapes to the query sanitizer

Signed-off-by: Tom Hughes <[email protected]>

* Add a collector to gather metrics on WAL size

Signed-off-by: Tom Hughes <[email protected]>

* Bump github.com/prometheus/client_golang from 1.15.1 to 1.16.0 (prometheus-community#853)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.15.1 to 1.16.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.15.1...v1.16.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Fix untyped integer overflows on 32-bit archs (prometheus-community#857)

go-sqlmock's Rows.AddRow() takes values which have a type alias of
"any", and appear to default to untyped ints if not explicitly cast.
When large values are passed which would overflow int32, tests fail.

Signed-off-by: Daniel Swarbrick <[email protected]>

* Bump github.com/smartystreets/goconvey from 1.8.0 to 1.8.1 (prometheus-community#852)

Bumps [github.com/smartystreets/goconvey](https://github.com/smartystreets/goconvey) from 1.8.0 to 1.8.1.
- [Release notes](https://github.com/smartystreets/goconvey/releases)
- [Commits](smartystreets/goconvey@v1.8.0...v1.8.1)

---
updated-dependencies:
- dependency-name: github.com/smartystreets/goconvey
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Unpack postgres arrays for process idle times correctly (prometheus-community#855)

Signed-off-by: Ben Kochie <[email protected]>

* Include all idle processes in the process idle metrics

Signed-off-by: Tom Hughes <[email protected]>

* Improve linting (prometheus-community#861)

* Disable unused-parameter check due to false positives on Collect()
  calls.
* Enable misspell.
* Simplify error returns.

Signed-off-by: SuperQ <[email protected]>

* Update common Prometheus files (prometheus-community#860)

Signed-off-by: prombot <[email protected]>
Co-authored-by: Ben Kochie <[email protected]>

* Update common Prometheus files

Signed-off-by: prombot <[email protected]>

* Gitlab collector: Database wraparound collector and test (prometheus-community#834)

* Database wraparound collector and test

---------

Signed-off-by: Felix Yuan <[email protected]>
Co-authored-by: Joe Adams <[email protected]>

* Add a logger to stat_database collector to get better handle on error
(also clean up some metric validity checks)

Signed-off-by: Felix Yuan <[email protected]>

* Update changelog for release 0.13.2 (prometheus-community#872)

Signed-off-by: Joe Adams <[email protected]>

* Gitlab Collector: Autovacuum collector and test (prometheus-community#840)

* Autovacuum collector and test

Signed-off-by: Felix Yuan <[email protected]>

* Update collector/pg_stat_activity_autovacuum.go

Co-authored-by: Joe Adams <[email protected]>
Signed-off-by: Felix Yuan <[email protected]>

* Update collector/pg_stat_activity_autovacuum.go

Co-authored-by: Joe Adams <[email protected]>
Signed-off-by: Felix Yuan <[email protected]>

* Use timestamp seconds

Signed-off-by: Felix Yuan <[email protected]>

* query formatting

Signed-off-by: Felix Yuan <[email protected]>

* SQL format

Signed-off-by: Felix Yuan <[email protected]>

* Loosen autovacuum query

Signed-off-by: Felix Yuan <[email protected]>

---------

Signed-off-by: Felix Yuan <[email protected]>
Co-authored-by: Joe Adams <[email protected]>

* Gitlab Collector: Wal Receiver Collector and Test (prometheus-community#844)

* Wal Receiver Collector and Test

Signed-off-by: Felix Yuan <[email protected]>

* Add more escapes

Signed-off-by: Felix Yuan <[email protected]>

* Corrections to wal_receiver

Signed-off-by: Felix Yuan <[email protected]>

* Continue on null labels

Signed-off-by: Felix Yuan <[email protected]>

* Skip nulls and log a message

Signed-off-by: Felix Yuan <[email protected]>

* Redundant breaks

Signed-off-by: Felix Yuan <[email protected]>

* Fix up walreceiver

Signed-off-by: Felix Yuan <[email protected]>

* Remove extra label

Signed-off-by: Felix Yuan <[email protected]>

* Update collector/pg_stat_walreceiver.go

Co-authored-by: Ben Kochie <[email protected]>
Signed-off-by: Felix Yuan <[email protected]>

* Clean up the extra assignments

Signed-off-by: Felix Yuan <[email protected]>

* Update collector/pg_stat_walreceiver.go

Co-authored-by: Joe Adams <[email protected]>
Signed-off-by: Felix Yuan <[email protected]>

---------

Signed-off-by: Felix Yuan <[email protected]>
Co-authored-by: Ben Kochie <[email protected]>
Co-authored-by: Joe Adams <[email protected]>

* Gitlab collector: Xlog location collector and test (prometheus-community#849)

* Xlog location collector and test

Signed-off-by: Felix Yuan <[email protected]>

* Add more escapes

Signed-off-by: Felix Yuan <[email protected]>

* Change to Gauge

Signed-off-by: Felix Yuan <[email protected]>

---------

Signed-off-by: Felix Yuan <[email protected]>

* Handle new pg_stat_statements column names (prometheus-community#874)

Update pg_stat_statements collector to handle the new column names in
PostgreSQL 13.

Fixes: prometheus-community#502

Signed-off-by: SuperQ <[email protected]>

* Fixup new pg_stats_statements query (prometheus-community#876)

Fix all renames of `total_time` to `total_exec_time`.

Fixes: prometheus-community#502

Signed-off-by: SuperQ <[email protected]>

* Add a multi-target example config (prometheus-community#890)

Add an example Prometheus scrape config, similar to the
blackbox_exporter's example config.

Fixes: prometheus-community#888

Signed-off-by: SuperQ <[email protected]>

* Delay database connection until scrape (prometheus-community#882)

This no longer returns an error when creating a collector.instance when the database cannot be reached for the version query. This will resolve the entire postgresCollector not being registered for metrics collection when a database is not available. If the version query fails, the scrape will fail.

Resolves prometheus-community#880

Signed-off-by: Joe Adams <[email protected]>

* Bugfix: Make statsreset nullable (prometheus-community#877)

* Stats_reset as null seems to actually be legitimate for new databases,
so don't fail for it

---------

Signed-off-by: Felix Yuan <[email protected]>
Co-authored-by: Ben Kochie <[email protected]>

* Gitlab Collector: User Index io stats collector and test (prometheus-community#845)

* User Index io stats collector and test

---------

Signed-off-by: Felix Yuan <[email protected]>

* Update README to reflect changes made in prometheus-community#828 (prometheus-community#894)

Signed-off-by: Mathis Raguin <[email protected]>

* Gitlab Collector: Long running transactions collector and test (prometheus-community#836)

* Long running transactions collector and test

---------

Signed-off-by: Felix Yuan <[email protected]>
Co-authored-by: Ben Kochie <[email protected]>

* Update common Prometheus files (prometheus-community#900)

Signed-off-by: prombot <[email protected]>

* Fix a connection leak (prometheus-community#902)

The leak was introduced in PR#882

Signed-off-by: Christian Albrecht <[email protected]>
Co-authored-by: Christian Albrecht <[email protected]>

* Fix cross-compilation command in README.md (prometheus-community#903)

Signed-off-by: David Cook <[email protected]>

* fix pg_replication_lag_seconds (prometheus-community#895)

Signed-off-by: Vladimir Luksha <[email protected]>
Co-authored-by: Vladimir Luksha <[email protected]>

* stat_user_tables: Add total size metric (prometheus-community#904)

Signed-off-by: David Cook <[email protected]>

* Fix bugs mentioned in prometheus-community#908 (prometheus-community#910)

* Fix bugs mentioned in prometheus-community#908

These collectors are disabled by default, so unless enabled, they are not tested regularly.

Signed-off-by: Joe Adams <[email protected]>

---------

Signed-off-by: Joe Adams <[email protected]>

* Update common Prometheus files (prometheus-community#913)

Signed-off-by: prombot <[email protected]>

* Add changelog for v0.14 (prometheus-community#906)

* Add changelog for v0.14

- Add changelog entries since v0.13.2
- Update README with new options
- Bump version file

Signed-off-by: Joe Adams <[email protected]>

* Add changelog entry for prometheus-community#904

Signed-off-by: Joe Adams <[email protected]>

---------

Signed-off-by: Joe Adams <[email protected]>

* Adds 1kB and 2kB units (prometheus-community#915)

Signed-off-by: Eric tyrrell <[email protected]>

* Add error log when probe collector creation fails (prometheus-community#918)

Signed-off-by: Joe Adams <[email protected]>

* Fix test build failures on 32-bit arch again (prometheus-community#919)

Another case of untyped integer overflows on 32-bit arch.

Signed-off-by: Daniel Swarbrick <[email protected]>

* Add 32-bit testing to CI (prometheus-community#920)

Run Go tests with 32-bit to validate value overflow.

Signed-off-by: SuperQ <[email protected]>

* Bump github.com/prometheus/client_golang from 1.16.0 to 1.17.0 (prometheus-community#925)

* Bump github.com/prometheus/client_golang from 1.16.0 to 1.17.0

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.16.0 to 1.17.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.16.0...v1.17.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

* Update tests for latest client_golang.

Signed-off-by: SuperQ <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: SuperQ <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: SuperQ <[email protected]>

* Update common Prometheus files (prometheus-community#926)

Signed-off-by: prombot <[email protected]>

* Adjust collector to use separate connection per scrape (prometheus-community#931)

Fixes prometheus-community#921

Signed-off-by: Joe Adams <[email protected]>

* Bump golang.org/x/net from 0.10.0 to 0.17.0 (prometheus-community#936)

Bumps [golang.org/x/net](https://github.com/golang/net) from 0.10.0 to 0.17.0.
- [Commits](golang/net@v0.10.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/net
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Release v0.15.0 (prometheus-community#944)

* [ENHANCEMENT] Add 1kB and 2kB units prometheus-community#915
* [BUGFIX] Add error log when probe collector creation fails prometheus-community#918
* [BUGFIX] Fix test build failures on 32-bit arch prometheus-community#919
* [BUGFIX] Adjust collector to use separate connection per scrape prometheus-community#931

Signed-off-by: SuperQ <[email protected]>

* Update common Prometheus files (prometheus-community#951)

Signed-off-by: prombot <[email protected]>

* Update common Prometheus files (prometheus-community#963)

Signed-off-by: prombot <[email protected]>

* pg_replication_slot: add slot type label (prometheus-community#960)

Signed-off-by: Alex Simenduev <[email protected]>

* Bump github.com/prometheus/common from 0.44.0 to 0.45.0 (prometheus-community#948)

Bumps [github.com/prometheus/common](https://github.com/prometheus/common) from 0.44.0 to 0.45.0.
- [Release notes](https://github.com/prometheus/common/releases)
- [Commits](prometheus/common@v0.44.0...v0.45.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/common
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/prometheus/client_model (prometheus-community#949)

Bumps [github.com/prometheus/client_model](https://github.com/prometheus/client_model) from 0.4.1-0.20230718164431-9a2bf3000d16 to 0.5.0.
- [Release notes](https://github.com/prometheus/client_model/releases)
- [Commits](https://github.com/prometheus/client_model/commits/v0.5.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_model
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* pg_stat_database: added support for `active_time` counter (prometheus-community#961)

* feat(pg_stat_database): active time metric

---------

Signed-off-by: Jiri Sveceny <[email protected]>

* Bump golang.org/x/crypto from 0.14.0 to 0.17.0 (prometheus-community#988)

Bumps [golang.org/x/crypto](https://github.com/golang/crypto) from 0.14.0 to 0.17.0.
- [Commits](golang/crypto@v0.14.0...v0.17.0)

---
updated-dependencies:
- dependency-name: golang.org/x/crypto
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* Bump github.com/prometheus/client_golang from 1.17.0 to 1.18.0 (prometheus-community#993)

Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.17.0 to 1.18.0.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.17.0...v1.18.0)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* use Info level for excluded databases log message (prometheus-community#1003)

This is the only log message which didn't specify a level in the
postgres_exporter. I am unsure if this log message should be info or
debug, but leaning towards the more important since previously it would
just always log.

The way I validated this was the only non-leveled logger was via grep.
Both of these only returned this callsite previously:

  git grep 'logger\.Log'
  git grep '\.Log(' | grep -v level

Signed-off-by: Keegan Carruthers-Smith <[email protected]>

---------

Signed-off-by: Ryan J. Geyer <[email protected]>
Signed-off-by: Joe Adams <[email protected]>
Signed-off-by: cezmunsta <[email protected]>
Signed-off-by: prombot <[email protected]>
Signed-off-by: Kurtis Bass <[email protected]>
Signed-off-by: Julien Pivotto <[email protected]>
Signed-off-by: Khiem Doan <[email protected]>
Signed-off-by: Oleksandr Mysyura <[email protected]>
Signed-off-by: SuperQ <[email protected]>
Signed-off-by: Zachary Caldarola <[email protected]>
Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Zachary Caldarola <[email protected]>
Signed-off-by: Mike <[email protected]>
Signed-off-by: GitHub <[email protected]>
Signed-off-by: Jack Wink <[email protected]>
Signed-off-by: Ben Kochie <[email protected]>
Signed-off-by: Alex Tymchuk <[email protected]>
Signed-off-by: Felix Yuan <[email protected]>
Signed-off-by: Vadim Voitenko <[email protected]>
Signed-off-by: Tom Hughes <[email protected]>
Signed-off-by: Daniel Swarbrick <[email protected]>
Signed-off-by: Mathis Raguin <[email protected]>
Signed-off-by: Christian Albrecht <[email protected]>
Signed-off-by: David Cook <[email protected]>
Signed-off-by: Vladimir Luksha <[email protected]>
Signed-off-by: Eric tyrrell <[email protected]>
Signed-off-by: Alex Simenduev <[email protected]>
Signed-off-by: Jiri Sveceny <[email protected]>
Signed-off-by: Keegan Carruthers-Smith <[email protected]>
Co-authored-by: Ryan J. Geyer <[email protected]>
Co-authored-by: Joe Adams <[email protected]>
Co-authored-by: cezmunsta <[email protected]>
Co-authored-by: prombot <[email protected]>
Co-authored-by: Kurtis Bass <[email protected]>
Co-authored-by: Julien Pivotto <[email protected]>
Co-authored-by: Khiem Doan <[email protected]>
Co-authored-by: Oleksandr Mysyura <[email protected]>
Co-authored-by: Ben Kochie <[email protected]>
Co-authored-by: Zachary Caldarola <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Zachary Caldarola <[email protected]>
Co-authored-by: Mike <[email protected]>
Co-authored-by: Khaled Khalifa <[email protected]>
Co-authored-by: Jack Wink <[email protected]>
Co-authored-by: Felix Yuan <[email protected]>
Co-authored-by: Alex Tymchuk <[email protected]>
Co-authored-by: Vadim Voitenko <[email protected]>
Co-authored-by: Vadim Voitenko <[email protected]>
Co-authored-by: Tom Hughes <[email protected]>
Co-authored-by: Daniel Swarbrick <[email protected]>
Co-authored-by: Mathis Raguin <[email protected]>
Co-authored-by: Christian Albrecht <[email protected]>
Co-authored-by: Christian Albrecht <[email protected]>
Co-authored-by: David Cook <[email protected]>
Co-authored-by: Vladimir Luksha <[email protected]>
Co-authored-by: Vladimir Luksha <[email protected]>
Co-authored-by: David Cook <[email protected]>
Co-authored-by: Eric Tyrrell <[email protected]>
Co-authored-by: Alex Simenduev <[email protected]>
Co-authored-by: Jiri Sveceny <[email protected]>
Co-authored-by: Keegan Carruthers-Smith <[email protected]>

stepanselyuk commented Feb 29, 2024

I'm not sure whether this is fixed, or why it happened twice on our systems, but we run version 0.15 and the exporter took 100 connections and then, after we raised the connection limit, 500.

[Image: postgres_connections_from_127.0.0.1]

The scrape interval is 15 seconds.

Docker images in use: docker.io/bitnami/postgres-exporter:0.15.0-debian-11-r7 and docker.io/bitnami/postgres-exporter:0.15.0-debian-12-r13 (deployed via the bitnami/postgresql Helm chart).

[Image: betindex-pg-number-active-connections]

During those incidents the exporter stopped emitting metrics, which may be significant. Both times it stopped at a "round" time: 22:00 UTC and 23:00 UTC.

@sysadmind maybe open a new issue for this?
