Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using custom queries is deprecated in postgres_exporter #81

Closed
jkuester opened this issue Jul 27, 2023 · 6 comments
Closed

Using custom queries is deprecated in postgres_exporter #81

jkuester opened this issue Jul 27, 2023 · 6 comments
Assignees

Comments

@jkuester
Copy link
Collaborator

jkuester commented Jul 27, 2023

intro

So, apparently the custom queries functionality in postgres_exporter which allows us to collect metrics based on the data in the postgres database (aka pretty much the only reason we are using the postgres_exporter) has been deprecated. It seems that the maintainers of postgres_exporter view the main purpose of the project as to provide metrics specific to the inner workings of the Postgres instance. They recommend using a different exporter for collecting metrics from the actual data in the Postgres database.

Test setup

pinning these to the top of the ticket, @mrjones-plip to keep up to date:

  1. check out these repos
  2. using script/docker-helper-4.x directory in CHT Core repo, start a docker helper instance of CHT core - note the URL and Port.
  3. assuming docker helper gave you a URL and port of 192-168-68-17.local-ip.medicmobile.org:10464 - start your couch2pg instance by cd into the cht-couch2pg directory and running:
    COUCH2PG_SLEEP_MINS=0.1 \
       COUCHDB_URL=https://medic:[email protected]:10464/medic \
       docker compose up -d
  4. you can optionally connect with a postgres client to localhost:5432 with username cht_couch2pg , password cht_couch2pg_password to database cht to ensure connection is working
  5. cd into watchdog repo directory and check out 81-sql-exporter repo
  6. still in watchdog repo, update your cht-instances.ylm to have the URL from step 1. In this example it's 192-168-68-17.local-ip.medicmobile.org:10464
  7. copy exporters/postgres/sql_servers_example.yml to exporters/postgres/sql_servers.yml. Update the value in 172-17-0-1.local-ip.medicmobile.org:10464 to match the CHT Core URL from step 2 above.
  8. still in the top level cht-watchdog directory, run the restart script:
     ./development/kill.start.ips.sh

Test steps

  1. From step 8's output above, look for the one called *-sql_exporter-* and go to that URL (http://172.30.0.4:9399/metrics in this case).
     Services:
    
     cht-watchdog-grafana-1          http://172.30.0.3:3000
     cht-watchdog-prometheus-1       http://172.30.0.5:9090/targets?search=
     cht-watchdog-json-exporter-1    http://172.30.0.2:7979/metrics
     cht-watchdog-sql_exporter-1     http://172.30.0.4:9399/metrics
  2. ensure the web page looks like this:
     # HELP couch2pg couch2pg backlog.
     # TYPE couch2pg gauge
     couch2pg{db="_users",job="db_targets",target="local-cht"} 1
     couch2pg{db="medic",job="db_targets",target="local-cht"} 186
     couch2pg{db="medic-logs",job="db_targets",target="local-cht"} 10
     couch2pg{db="medic-sentinel",job="db_targets",target="local-cht"} 79
     couch2pg{db="medic-users-meta",job="db_targets",target="local-cht"} 3
     # HELP scrape_duration_seconds How long it took to scrape the target in seconds
     # TYPE scrape_duration_seconds gauge
     scrape_duration_seconds{job="db_targets",target="local-cht"} 0.008943289
     # HELP up 1 if the target is reachable, or 0 if the scrape failed
     # TYPE up gauge
     up{job="db_targets",target="local-cht"} 1
    
  3. log into the dev watchdog instance at http://localhost:3000 (user medic password password) and go to the main "admin overview" dashboard. ensure the "Couch2pg Backlog" panel is working. It should show a backlog of 0`:
    image
  4. using the name of the container on step 1 above in the section, stop the couch2pg container. (eg docker stop cht-couch2pg-cht-couch2pg-1).
  5. add a household to your cht instance. after a few min you should see a backlog greater than zero
  6. start the couch2pg container. (eg docker start cht-couch2pg-cht-couch2pg-1) and you should see the backlog go back to 0
@jkuester
Copy link
Collaborator Author

When addressing this, it would be a good idea to consider how we will support for getting metrics from a db populated by cht-sync vs couch2pg. Both should have first-class support and will probably use very similar exporters/configurations.

(Also worth calling out that switching exporters may result in a breaking change that needs a major version bump of Watchdog (unless the format of the DB connection config files happens to match....))

@andrablaj
Copy link
Member

andrablaj commented Jul 31, 2023

@lorerod FYI about the cht-sync v& couch2pg support

@mrjones-plip
Copy link
Contributor

mrjones-plip commented Apr 17, 2024

@jkuester - thanks again for the call earlier! Here's a status update that might be easier to digest.

cc @eljhkrr

Demo steps

  1. set up SSH tunnel to RDBMs to expose postgres on localhost (I often bind it to the shared docker interface at 172.17.0.1 as then both my local workstation and docker containers can access it: ssh -L 172.17.0.1:5432:localhost:5432 [email protected] -p 34796)
  2. use the 81-sql-exporter branch per my PR
  3. copy the sql_servers_example.yml file to sql_servers.yml and add two targets: cmmb-kenya-app and moh_mali_chw
  4. run the compose to start the server from the root of cht-watchdog directory: docker compose -f docker-compose.yml -f exporters/postgres/compose.yml up -d
  5. do a curl on the /metrics endpoint on the now running exporter. note that the result is: metric (1) * databases (5) * instances (2) = total metrics (10).
  6. Final result is this HTML that is ready for prometheus to scrape (not yet implemented)

Demo HTML (er "HTML")

 # HELP couch2pg couch2pg backlog.
 # TYPE couch2pg gauge
 couch2pg{db="_users",job="db_targets",target="cmmb-kenya-app"} 1329
 couch2pg{db="_users",job="db_targets",target="moh_mali_chw"} 3414
 couch2pg{db="medic",job="db_targets",target="cmmb-kenya-app"} 1.44244e+06
 couch2pg{db="medic",job="db_targets",target="moh_mali_chw"} 3.308697e+06
 couch2pg{db="medic-logs",job="db_targets",target="cmmb-kenya-app"} 0
 couch2pg{db="medic-logs",job="db_targets",target="moh_mali_chw"} 91042
 couch2pg{db="medic-sentinel",job="db_targets",target="cmmb-kenya-app"} 1.94765e+06
 couch2pg{db="medic-sentinel",job="db_targets",target="moh_mali_chw"} 6.566893e+06
 couch2pg{db="medic-users-meta",job="db_targets",target="cmmb-kenya-app"} 7018
 couch2pg{db="medic-users-meta",job="db_targets",target="moh_mali_chw"} 12386
 # HELP scrape_duration_seconds How long it took to scrape the target in seconds
 # TYPE scrape_duration_seconds gauge
 scrape_duration_seconds{job="db_targets",target="cmmb-kenya-app"} 0.297466424
 scrape_duration_seconds{job="db_targets",target="moh_mali_chw"} 0.299140411
 # HELP up 1 if the target is reachable, or 0 if the scrape failed
 # TYPE up gauge
 up{job="db_targets",target="cmmb-kenya-app"} 1
 up{job="db_targets",target="moh_mali_chw"} 1

Demo video

postgres-exporter-mini-demo.mp4

@mrjones-plip mrjones-plip moved this from This Week's commitments to Todo in Product Team Activities Apr 17, 2024
@mrjones-plip mrjones-plip moved this from Todo to This Week's commitments in Product Team Activities Apr 17, 2024
mrjones-plip added a commit that referenced this issue Apr 25, 2024
)

* Replace deprecated SQL exporter with non-deprecated one per #81

* remove comments to clean up yaml file

* default to local cht-couch2pg db values in example

* set example yaml file to actually work with default couch2pg settings

* set min_interval 60s which means calls happening <60s hit cache instead DB directly

* remove config file that shouldn't have been committed

* update git ignore, place holder for prom scrape

* omg get entry command and file name correct

* got prometheus scrape config working \o/

* adding scrape config

* finalize dashboard, update example sql, remove comments from scrape, change gauge -> counter

* add dev restart script, update detail dashboard json

* start to get dev and test environs to work

* normalize passwords for sql and couch2pg

* feat(na): path to new compose file in CI, update readme how to add new sql

* feat(na): fix path to ci test files

* feat(na): update git ignore to include test fake file

* fix(na): add set -e and fake cht to restart script per feedback
@mrjones-plip
Copy link
Contributor

once medic/cht-docs#1367 is merged I'll close this ticket

@lorerod
Copy link

lorerod commented May 2, 2024

Tested this locally.
MacOS Sonoma
Docker desktop 4.29.0
Docker compose v1.29.2

I followed this steps with a couple of tweaks for Mac.

  1. Followed this steps and changed the dev script first line to #!/usr/local/bin/bash

  2. Get the IP for my computer on the WiFi and then use that everywhere the 172.17.0.1 IP was used in the test steps.

  3. http://127.0.0.1:9399/metrics looks like this:

# HELP couch2pg_progress_sequence couch2pg backlog.
# TYPE couch2pg_progress_sequence counter
couch2pg_progress_sequence{db="_users",job="db_targets",target="192-168-100-62.local-ip.medicmobile.org:10456"} 1
couch2pg_progress_sequence{db="medic",job="db_targets",target="192-168-100-62.local-ip.medicmobile.org:10456"} 182
couch2pg_progress_sequence{db="medic-logs",job="db_targets",target="192-168-100-62.local-ip.medicmobile.org:10456"} 10
couch2pg_progress_sequence{db="medic-sentinel",job="db_targets",target="192-168-100-62.local-ip.medicmobile.org:10456"} 79
couch2pg_progress_sequence{db="medic-users-meta",job="db_targets",target="192-168-100-62.local-ip.medicmobile.org:10456"} 3
# HELP scrape_duration_seconds How long it took to scrape the target in seconds
# TYPE scrape_duration_seconds gauge
scrape_duration_seconds{job="db_targets",target="192-168-100-62.local-ip.medicmobile.org:10456"} 0.0011491000000000001
# HELP up 1 if the target is reachable, or 0 if the scrape failed
# TYPE up gauge
up{job="db_targets",target="192-168-100-62.local-ip.medicmobile.org:10456"} 1

Thanks!

@mrjones-plip
Copy link
Contributor

Thanks @lorerod for all the testing! Much appreciated.

Now that your testing went well and the docs PR is merged, closing out this ticket.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: Done
Development

No branches or pull requests

4 participants