prometheus: use older node_exporter
v1.3.1, the most recent release, has a bug that inflates the reported
bytes written by ~8x for NVMe drives (which notably include the default
drives on our GCE roachprod machines). The root cause is that these
devices use a 4K sector size, whereas the kernel always reports sector
counts based on a 512B sector size.
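
The ~8x figure follows from the ratio of the two sector sizes (4096 / 512). A minimal sketch of that arithmetic, assuming the inflation comes from multiplying the kernel's sector counter by the device's 4K sector size instead of the fixed 512B unit; the function name and the counter value are hypothetical and are not taken from node_exporter:

```go
package main

import "fmt"

// kernelSectorSize is the fixed unit (in bytes) that the Linux kernel uses
// for its sector counters, independent of the device's own sector size.
const kernelSectorSize = 512

// bytesWritten converts a sectors-written counter into bytes using the
// given per-sector size. Hypothetical helper, for illustration only.
func bytesWritten(sectors, sectorSize uint64) uint64 {
	return sectors * sectorSize
}

func main() {
	const sectors = 1000          // hypothetical counter value
	const deviceSectorSize = 4096 // 4K sector size, as on the NVMe drives

	correct := bytesWritten(sectors, kernelSectorSize)  // counter * 512
	inflated := bytesWritten(sectors, deviceSectorSize) // counter * 4096 (assumed buggy path)

	// The ratio is 4096/512 = 8, regardless of the counter value.
	fmt.Println(correct, inflated, inflated/correct)
}
```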

This took us a while to figure out. To avoid repeating this exercise
periodically, downgrade node_exporter to 1.2.2, which predates the
refactor that introduced the regression.

See: prometheus/node_exporter#2310

Release note: None
tbg committed Jul 7, 2022
1 parent 3b22cdd commit b6d9372
Showing 1 changed file with 4 additions and 1 deletion.
5 changes: 4 additions & 1 deletion pkg/roachprod/prometheus/prometheus.go
@@ -192,12 +192,15 @@ func Init(
 	ctx context.Context, l *logger.Logger, c *install.SyncedCluster, cfg Config,
 ) (_ *Prometheus, _ error) {
 	if len(cfg.NodeExporter) > 0 {
+		// NB: when upgrading here, make sure to target a version that picks up this PR:
+		// https://github.com/prometheus/node_exporter/pull/2311
+		// At time of writing, there hasn't been a release in over half a year.
 		if err := c.RepeatRun(ctx, l, os.Stdout, os.Stderr, cfg.NodeExporter,
 			"download node exporter",
 			`
 (sudo systemctl stop node_exporter || true) &&
 rm -rf node_exporter && mkdir -p node_exporter && curl -fsSL \
-https://github.com/prometheus/node_exporter/releases/download/v1.3.1/node_exporter-1.3.1.linux-amd64.tar.gz |
+https://github.com/prometheus/node_exporter/releases/download/v1.2.2/node_exporter-1.2.2.linux-amd64.tar.gz |
 tar zxv --strip-components 1 -C node_exporter
 `); err != nil {
 			return nil, err
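
The version string appears twice in the download URL, so a later upgrade has to change both occurrences together. One way to keep them in sync is to derive the URL from a single pinned constant. This is only a sketch: `nodeExporterVersion` and `nodeExporterURL` are hypothetical names and are not part of this commit or of roachprod.

```go
package main

import "fmt"

// nodeExporterVersion is pinned to 1.2.2, the version this commit downgrades to.
// Hypothetical constant, introduced here for illustration only.
const nodeExporterVersion = "1.2.2"

// nodeExporterURL builds the release tarball URL for the given version,
// following the URL pattern used in the diff above. Hypothetical helper.
func nodeExporterURL(version string) string {
	// %[1]s reuses the first argument so the version is written only once.
	return fmt.Sprintf(
		"https://github.com/prometheus/node_exporter/releases/download/v%[1]s/node_exporter-%[1]s.linux-amd64.tar.gz",
		version,
	)
}

func main() {
	fmt.Println(nodeExporterURL(nodeExporterVersion))
}
```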