Can't start the exporter prometheus #236

Closed

faguayot opened this issue Jun 24, 2021 · 12 comments
@faguayot

Describe the bug
We can't start the collection of data with Prometheus on the new Harvest releases 21.05.2 and the pre-release 21.05.3. On the previous version, the collector ran without errors.
Environment

  • Harvest version: harvest version 21.05.2-1 (commit ce091de) (build date 2021-06-14T20:31:09+0530) linux/amd64 or harvest version 21.05.3-1 (commit 63f0b11) (build date 2021-06-23T22:11:00+0530) linux/amd64

  • Command line arguments used: bin/harvest start --config two.yml

  • OS: RHEL 8.2

  • Install method: yum

  • ONTAP Version: 9.5 and 9.7

To Reproduce
Here is the output of the command running in foreground mode; it shows the behaviour of the three different versions. 21.05.1 runs correctly, while the other two do not.

====================================
HARVEST 21.05.1
====================================

[root@harvest20 harvest]# bin/harvest -v
harvest version 21.05.1-1 (commit 2211c00) (build date 2021-05-21T01:28:12+0530) linux/amd64
[root@harvest20 harvest]# bin/harvest start --config two.yml --foreground
set debug mode ON (starting poller in foreground otherwise is unsafe)
starting in foreground, enter CTRL+C or close terminal to stop poller
2021/06/24 01:18:42 (info ) : options config: two.yml
2021/06/24 01:18:42 (info ) (poller) (sces1p1_01): started in foreground [pid=1792]
2021/06/24 01:18:45 (info ) (poller) (sces1p1_01): poller start-up complete
2021/06/24 01:18:45 (info ) (poller) (sces1p1_01): updated status, up collectors: 41 (of 41), up exporters: 1 (of 1)
2021/06/24 01:18:45 (info ) (collector) (Zapi:SnapMirror): no [SnapMirror] instances on system, entering standby mode
2021/06/24 01:18:45 (info ) (collector) (Zapi:SnapMirror): no [SnapMirror] instances on system, entering standby mode
2021/06/24 01:18:45 (info ) (collector) (ZapiPerf:Path): no [Path] instances on system, entering standby mode
2021/06/24 01:18:45 (info ) (collector) (ZapiPerf:Path): recovered from standby mode, back to normal schedule
2021/06/24 01:18:45 (warning) (collector) (ZapiPerf:Path): lagging behind schedule 60.63µs
2021/06/24 01:18:46 (info ) (collector) (Zapi:Lun): no [Lun] instances on system, entering standby mode
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:WAFLSizer): no [WAFLSizer] instances on system, entering standby mode
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:WAFLSizer): recovered from standby mode, back to normal schedule
2021/06/24 01:18:46 (warning) (collector) (ZapiPerf:WAFLSizer): lagging behind schedule 61.529µs
^X2021/06/24 01:18:46 (error ) (collector) (ZapiPerf:FCVI): instance request: api request rejected => Counter collection is disabled
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:FCVI): no [FCVI] instances on system, entering standby mode
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:FCVI): recovered from standby mode, back to normal schedule
2021/06/24 01:18:46 (warning) (collector) (ZapiPerf:FCVI): lagging behind schedule 74.465µs
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:WAFLCompBin): no [WAFLCompBin] instances on system, entering standby mode
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:WAFLCompBin): recovered from standby mode, back to normal schedule
2021/06/24 01:18:46 (warning) (collector) (ZapiPerf:WAFLCompBin): lagging behind schedule 75.41µs
2021/06/24 01:18:46 (info ) (collector) (Zapi:Lun): no [Lun] instances on system, entering standby mode
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:WAFLAggr): no [WAFLAggr] instances on system, entering standby mode
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:WAFLAggr): recovered from standby mode, back to normal schedule
2021/06/24 01:18:46 (warning) (collector) (ZapiPerf:WAFLAggr): lagging behind schedule 56.449µs
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:CIFSvserver): no [CIFSvserver] instances on system, entering standby mode
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:CIFSvserver): recovered from standby mode, back to normal schedule
2021/06/24 01:18:46 (warning) (collector) (ZapiPerf:CIFSvserver): lagging behind schedule 185.021µs
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:ObjectStoreClient): no [ObjectStoreClient] instances on system, entering standby mode
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:ObjectStoreClient): recovered from standby mode, back to normal schedule
2021/06/24 01:18:46 (warning) (collector) (ZapiPerf:ObjectStoreClient): lagging behind schedule 59.669µs
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:ISCSI): no [ISCSI] instances on system, entering standby mode
2021/06/24 01:18:46 (info ) (collector) (ZapiPerf:ISCSI): recovered from standby mode, back to normal schedule
2021/06/24 01:18:46 (warning) (collector) (ZapiPerf:ISCSI): lagging behind schedule 64.995µs
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv3Node): no [NFSv3Node] instances on system, entering standby mode
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv3Node): recovered from standby mode, back to normal schedule
2021/06/24 01:18:47 (warning) (collector) (ZapiPerf:NFSv3Node): lagging behind schedule 354.255µs
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv4Node): no [NFSv4Node] instances on system, entering standby mode
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv4Node): recovered from standby mode, back to normal schedule
2021/06/24 01:18:47 (warning) (collector) (ZapiPerf:NFSv4Node): lagging behind schedule 46.863µs
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:ExtCacheObj): no [ExtCacheObj] instances on system, entering standby mode
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:ExtCacheObj): recovered from standby mode, back to normal schedule
2021/06/24 01:18:47 (warning) (collector) (ZapiPerf:ExtCacheObj): lagging behind schedule 188.952µs
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv3): no [NFSv3] instances on system, entering standby mode
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv3): recovered from standby mode, back to normal schedule
2021/06/24 01:18:47 (warning) (collector) (ZapiPerf:NFSv3): lagging behind schedule 73.488µs
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:CIFSNode): no [CIFSNode] instances on system, entering standby mode
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:CIFSNode): recovered from standby mode, back to normal schedule
2021/06/24 01:18:47 (warning) (collector) (ZapiPerf:CIFSNode): lagging behind schedule 57.672µs
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv41Node): no [NFSv41Node] instances on system, entering standby mode
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv41Node): recovered from standby mode, back to normal schedule
2021/06/24 01:18:47 (warning) (collector) (ZapiPerf:NFSv41Node): lagging behind schedule 62.548µs
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv4): no [NFSv4] instances on system, entering standby mode
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv4): recovered from standby mode, back to normal schedule
2021/06/24 01:18:47 (warning) (collector) (ZapiPerf:NFSv4): lagging behind schedule 64.687µs
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv41): no [NFSv41] instances on system, entering standby mode
2021/06/24 01:18:47 (info ) (collector) (ZapiPerf:NFSv41): recovered from standby mode, back to normal schedule
2021/06/24 01:18:47 (warning) (collector) (ZapiPerf:NFSv41): lagging behind schedule 82.519µs

 ====================================
 HARVEST 21.05.2
 ====================================

[root@harvest20 harvest]# bin/harvest -v
harvest version 21.05.2-1 (commit ce091de) (build date 2021-06-14T20:31:09+0530) linux/amd64
[root@harvest20 harvest]# bin/harvest start --config two.yml --foreground
set debug mode ON (starting poller in foreground otherwise is unsafe)
starting in foreground, enter CTRL+C or close terminal to stop poller
1:20AM INF command-line-arguments/poller.go:159 > log level used: info Poller=sces1p1_01
1:20AM INF command-line-arguments/poller.go:160 > options config: two.yml Poller=sces1p1_01
1:20AM INF command-line-arguments/poller.go:191 > started in foreground [pid=1895] Poller=sces1p1_01
1:20AM INF command-line-arguments/poller.go:293 > poller start-up complete Poller=sces1p1_01
1:20AM INF command-line-arguments/poller.go:417 > updated status, up collectors: 41 (of 41), up exporters: 1 (of 1) Poller=sces1p1_01
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [SnapMirror] instances on system, entering standby mode Poller=sces1p1_01 collector=Zapi:SnapMirror
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [Path] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:Path
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:Path
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 336.887µs Poller=sces1p1_01 collector=ZapiPerf:Path
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [SnapMirror] instances on system, entering standby mode Poller=sces1p1_01 collector=Zapi:SnapMirror
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [WAFLSizer] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:WAFLSizer
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:WAFLSizer
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 262.112µs Poller=sces1p1_01 collector=ZapiPerf:WAFLSizer
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [WAFLCompBin] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:WAFLCompBin
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:WAFLCompBin
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 81.755µs Poller=sces1p1_01 collector=ZapiPerf:WAFLCompBin
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [Lun] instances on system, entering standby mode Poller=sces1p1_01 collector=Zapi:Lun
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [ObjectStoreClient] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:ObjectStoreClient
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:ObjectStoreClient
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 171.365µs Poller=sces1p1_01 collector=ZapiPerf:ObjectStoreClient
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [ISCSI] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:ISCSI
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:ISCSI
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 112.106µs Poller=sces1p1_01 collector=ZapiPerf:ISCSI
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [CIFSvserver] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:CIFSvserver
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:CIFSvserver
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 275.845µs Poller=sces1p1_01 collector=ZapiPerf:CIFSvserver
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [CIFSNode] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:CIFSNode
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:CIFSNode
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 114.169µs Poller=sces1p1_01 collector=ZapiPerf:CIFSNode
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [NFSv3] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv3
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv3
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 92.334µs Poller=sces1p1_01 collector=ZapiPerf:NFSv3
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [WAFLAggr] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:WAFLAggr
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:WAFLAggr
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 85.435µs Poller=sces1p1_01 collector=ZapiPerf:WAFLAggr
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [Lun] instances on system, entering standby mode Poller=sces1p1_01 collector=Zapi:Lun
1:20AM ERR goharvest2/cmd/collectors/zapiperf/zapiperf.go:1163 > instance request error="api request rejected => Counter collection is disabled" Poller=sces1p1_01 collector=ZapiPerf:FCVI stack=[{"func":"New","line":"35","source":"errors.go"},{"func":"(*Client).invoke","line":"402","source":"client.go"},{"func":"(*Client).InvokeBatchWithTimers","line":"280","source":"client.go"},{"func":"(*Client).InvokeBatchRequest","line":"253","source":"client.go"},{"func":"(*ZapiPerf).PollInstance","line":"1162","source":"zapiperf.go"},{"func":"(*task).Run","line":"60","source":"schedule.go"},{"func":"(*AbstractCollector).Start","line":"270","source":"collector.go"},{"func":"goexit","line":"1371","source":"asm_amd64.s"}]
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [FCVI] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:FCVI
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:FCVI
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 95.872µs Poller=sces1p1_01 collector=ZapiPerf:FCVI
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [NFSv4] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv4
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv4
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 139.563µs Poller=sces1p1_01 collector=ZapiPerf:NFSv4
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [NFSv3Node] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv3Node
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv3Node
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 130.995µs Poller=sces1p1_01 collector=ZapiPerf:NFSv3Node
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [NFSv41Node] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv41Node
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv41Node
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 121.992µs Poller=sces1p1_01 collector=ZapiPerf:NFSv41Node
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [NFSv4Node] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv4Node
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv4Node
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 138.336µs Poller=sces1p1_01 collector=ZapiPerf:NFSv4Node
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [ExtCacheObj] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:ExtCacheObj
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:ExtCacheObj
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 144.945µs Poller=sces1p1_01 collector=ZapiPerf:ExtCacheObj
1:20AM INF goharvest2/cmd/poller/collector/collector.go:296 > no [NFSv41] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv41
1:20AM INF goharvest2/cmd/poller/collector/collector.go:318 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv41
1:20AM WRN goharvest2/cmd/poller/collector/collector.go:387 > lagging behind schedule 96.71µs Poller=sces1p1_01 collector=ZapiPerf:NFSv41

====================================
HARVEST 21.05.3
====================================

[root@harvest20 harvest]# bin/harvest -v
harvest version 21.05.3-1 (commit 63f0b11) (build date 2021-06-23T22:11:00+0530) linux/amd64
[root@harvest20 harvest]# bin/harvest start --config two.yml --foreground
set debug mode ON (starting poller in foreground otherwise is unsafe)
configuration error => Poller does not exist sces1p1_01
starting in foreground, enter CTRL+C or close terminal to stop poller
1:21AM INF command-line-arguments/poller.go:154 > log level used: info Poller=sces1p1_01
1:21AM INF command-line-arguments/poller.go:155 > options config: two.yml Poller=sces1p1_01
1:21AM INF command-line-arguments/poller.go:180 > started in foreground [pid=2017] Poller=sces1p1_01
1:21AM INF command-line-arguments/poller.go:282 > poller start-up complete Poller=sces1p1_01
1:21AM INF command-line-arguments/poller.go:406 > updated status, up collectors: 41 (of 41), up exporters: 1 (of 1) Poller=sces1p1_01
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [SnapMirror] instances on system, entering standby mode Poller=sces1p1_01 collector=Zapi:SnapMirror
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [SnapMirror] instances on system, entering standby mode Poller=sces1p1_01 collector=Zapi:SnapMirror
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [Path] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:Path
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:Path
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 99.318µs Poller=sces1p1_01 collector=ZapiPerf:Path
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [WAFLSizer] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:WAFLSizer
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:WAFLSizer
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 134.251µs Poller=sces1p1_01 collector=ZapiPerf:WAFLSizer
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [WAFLCompBin] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:WAFLCompBin
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:WAFLCompBin
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 118.276µs Poller=sces1p1_01 collector=ZapiPerf:WAFLCompBin
1:21AM ERR goharvest2/cmd/collectors/zapiperf/zapiperf.go:1170 > instance request error="api request rejected => Counter collection is disabled" Poller=sces1p1_01 collector=ZapiPerf:FCVI stack=[{"func":"New","line":"35","source":"errors.go"},{"func":"(*Client).invoke","line":"403","source":"client.go"},{"func":"(*Client).InvokeBatchWithTimers","line":"281","source":"client.go"},{"func":"(*Client).InvokeBatchRequest","line":"254","source":"client.go"},{"func":"(*ZapiPerf).PollInstance","line":"1169","source":"zapiperf.go"},{"func":"(*task).Run","line":"60","source":"schedule.go"},{"func":"(*AbstractCollector).Start","line":"269","source":"collector.go"},{"func":"goexit","line":"1371","source":"asm_amd64.s"}]
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [FCVI] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:FCVI
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:FCVI
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 75.559µs Poller=sces1p1_01 collector=ZapiPerf:FCVI
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [WAFLAggr] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:WAFLAggr
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:WAFLAggr
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 81.203µs Poller=sces1p1_01 collector=ZapiPerf:WAFLAggr
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [Lun] instances on system, entering standby mode Poller=sces1p1_01 collector=Zapi:Lun
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [ObjectStoreClient] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:ObjectStoreClient
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:ObjectStoreClient
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 133.832µs Poller=sces1p1_01 collector=ZapiPerf:ObjectStoreClient
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [CIFSvserver] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:CIFSvserver
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:CIFSvserver
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 5.773295ms Poller=sces1p1_01 collector=ZapiPerf:CIFSvserver
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [ISCSI] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:ISCSI
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:ISCSI
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 83.647µs Poller=sces1p1_01 collector=ZapiPerf:ISCSI
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [ExtCacheObj] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:ExtCacheObj
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:ExtCacheObj
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 115.696µs Poller=sces1p1_01 collector=ZapiPerf:ExtCacheObj
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [CIFSNode] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:CIFSNode
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:CIFSNode
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 156.771µs Poller=sces1p1_01 collector=ZapiPerf:CIFSNode
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [Lun] instances on system, entering standby mode Poller=sces1p1_01 collector=Zapi:Lun
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [NFSv4Node] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv4Node
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv4Node
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 80.038µs Poller=sces1p1_01 collector=ZapiPerf:NFSv4Node
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [NFSv3Node] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv3Node
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv3Node
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 686.415µs Poller=sces1p1_01 collector=ZapiPerf:NFSv3Node
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [NFSv41Node] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv41Node
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv41Node
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 190.802µs Poller=sces1p1_01 collector=ZapiPerf:NFSv41Node
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [NFSv3] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv3
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv3
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 327.095µs Poller=sces1p1_01 collector=ZapiPerf:NFSv3
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [NFSv4] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv4
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv4
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 81.35µs Poller=sces1p1_01 collector=ZapiPerf:NFSv4
1:21AM INF goharvest2/cmd/poller/collector/collector.go:295 > no [NFSv41] instances on system, entering standby mode Poller=sces1p1_01 collector=ZapiPerf:NFSv41
1:21AM INF goharvest2/cmd/poller/collector/collector.go:317 > recovered from standby mode, back to normal schedule Poller=sces1p1_01 collector=ZapiPerf:NFSv41
1:21AM WRN goharvest2/cmd/poller/collector/collector.go:386 > lagging behind schedule 120.266µs Poller=sces1p1_01 collector=ZapiPerf:NFSv41

@cgrinds
Collaborator

cgrinds commented Jun 24, 2021

@faguayot appreciate all the details - let's focus on 21.05.3 for now.
When starting Harvest in foreground mode, Harvest also enables debug=true,
and when debug is enabled, the Prometheus exporter won't export data.

Let's try:

  1. clear logs for sces1p1_01 via rm /var/log/poller_sces1p1_01.log
  2. cd to /opt/harvest
  3. Run that one poller bin/harvest --config two.yml start sces1p1_01
  4. Paste the results of ps aux | grep poller
  5. Paste or attach the log output of /var/log/poller_sces1p1_01.log
  6. Paste the results of bin/harvest --config two.yml status
    This will show you the PromPort - wait a couple of minutes, and then use that port to curl the metrics endpoint like so:
    curl -s 'http://127.0.0.1:<PORT>/metrics'

@faguayot
Author

Hello Chris,

Here is an image with the status about 6-7 minutes after I started the process.
[screenshot]

This image shows the poller process running.
[screenshot]

Since I upgraded Harvest from 21.05.1, I've seen these traces in the logs showing that the exporter isn't running.

{"level":"info","Poller":"sces1p1_01","caller":"command-line-arguments/poller.go:406","time":"2021-06-25T09:50:32+02:00","message":"updated status, up collectors: 41 (of 41), up exporters: 0 (of 0)"}

Here is the poller log file.
poller_sces1p1_01.log

This is our configuration for the pollers, exporters, and the defaults. I've changed the extension from yml to log because otherwise I can't upload it.
two.log

@rahulguptajss
Contributor

@faguayot your configuration in two.yml is not correct with respect to the port: the prometheus_port setting is not supported.
You can refer to https://github.com/NetApp/harvest/blob/main/harvest.yml for port configuration.
Alternatively, 21.05.3 adds another way of configuring ports (#172). Documentation: https://github.com/NetApp/harvest/blob/release/21.05.3/cmd/exporters/prometheus/README.md. The documentation in the main branch will be updated today.
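For example, a Prometheus exporter section in harvest.yml looks roughly like this (a sketch; the exporter names and port values below are placeholders, not your exact config):

Exporters:
  prometheus:
    exporter: Prometheus
    port: 12990              # one fixed port for this exporter

  # or, starting with 21.05.3, let Harvest assign each poller a free port from a range:
  prometheus-range:
    exporter: Prometheus
    port_range: 2000-2030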

@cgrinds
Collaborator

cgrinds commented Jun 25, 2021

@faguayot as @rahulguptajss mentioned, prometheus_port is not a supported key/value. Take a look at port_range and see if that's a better fit.

@faguayot
Author

Thanks @rahulguptajss. I was looking for something like the port_range: 2000-2030 parameter because I knew you would implement it in the future, but I couldn't find it, and I didn't know that prometheus_port was no longer supported.

Another thing: in this new version it seems I have to set the exporter on every poller, instead of defining the exporter once in the Defaults configuration and having every poller take that value by default. When the exporter is configured both in Defaults and in the specific cluster's configuration, the more specific cluster configuration should take precedence. That is how the Defaults configuration behaves now (if I am not wrong), and it was the same for Harvest 1.6.
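To illustrate what I mean, roughly (a sketch; the poller name, address, and exporter name are placeholders):

Defaults:
  collectors:
    - Zapi
    - ZapiPerf
  exporters:
    - prometheus           # inherited by every poller that doesn't define its own

Pollers:
  sces1p1_01:
    datacenter: dc-01
    addr: 10.0.0.1
    exporters:             # more specific: should override the Defaults value for this poller
      - prometheus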

I think it is important to know which configuration parameters can be used in each version and which parameters will be removed, because if something is removed we don't know what happened until we spend time troubleshooting, and we drive ourselves crazy trying to find the problem, as happened to me.

Thanks to both.

@faguayot
Author

In the release notes for version 21.05.2, it says:

Add workload counters to ZapiPerf #9

Do we need to do something to see these counters? Maybe something needs to change in the ZapiPerf configuration.

Could you help me with this? I moved to the 21.05.2 and 21.05.3 versions to get these counters.

@rahulguptajss
Contributor

@faguayot
Author

Perfect, after I uncommented those lines the workload metrics are written to Prometheus. Is there documentation about what exactly each metric is?
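For reference, the lines I uncommented are the workload object templates in conf/zapiperf/default.yaml; roughly (an excerpt as far as I recall, other objects elided):

objects:
  # ... other objects ...
  Workload:               workload.yaml
  WorkloadDetail:         workload_detail.yaml
  WorkloadVolume:         workload_volume.yaml
  WorkloadDetailVolume:   workload_detail_volume.yaml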

Thanks.

@cgrinds
Collaborator

cgrinds commented Jun 29, 2021

Glad that's working for you @faguayot.

Documentation is a great question - ZAPIs are somewhat self-documenting and Harvest includes some rudimentary tools to help surface the ZAPI metadata as well as actual data. Here's an example relevant to your workload metrics question. And yes, this needs improvement - in the meantime, hopefully this will give you a bit more knowledge on how to dig deeper.

bin/harvest zapi
...
Examples:
  harvest zapi -p infinity show apis                             Query cluster infinity for available APIs
  harvest zapi -p infinity show attrs --api volume-get-iter      Query cluster infinity for volume-get-iter metrics
                                                                 Typically APIs suffixed with 'get-iter' have interesting metrics 
  harvest zapi -p infinity show data --api volume-get-iter       Query cluster infinity and print attribute tree of volume-get-iter

Let's use workload_detail_volume as an example.

The output of the command below shows us the metadata (including a description) of each of the counters listed in workload_detail_volume.yaml. For example, we can see below that service_time is "The workload's average service time per visit to the service center."

# example
bin/harvest --config harvest.openlab.yml zapi -p umeng_aff300 show counters --object workload_detail_volume
connected to umeng-aff300-05-06 (NetApp Release 9.7P7: Thu Aug 27 20:57:05 UTC 2020)
[counters]                            -                                   *
  [counter-info]                      -                                   *
    [desc]                            - Determines whether or not service center-based statistics are in the latency path.
    [is-deprecated]                   -                               false
    [name]                            -                     in_latency_path
    [privilege-level]                 -                            advanced
    [properties]                      -                  raw,no-zero-values
    [unit]                            -                                none
  [counter-info]                      -                                   *
    [desc]                            - Name of the workload_detail_volume instance
    [is-deprecated]                   -                               false
    [is-key]                          -                                true
    [name]                            -                       instance_name
    [privilege-level]                 -                            advanced
    [properties]                      -                   string,no-display
    [unit]                            -                                none
  [counter-info]                      -                                   *
    [desc]                            - UUID for the workload_detail_volume instance
    [is-deprecated]                   -                               false
    [name]                            -                       instance_uuid
    [privilege-level]                 -                            advanced
    [properties]                      -                   string,no-display
    [unit]                            -                                none
  [counter-info]                      -                                   *
    [desc]                            -                    System node name
    [is-deprecated]                   -                               false
    [is-key]                          -                                true
    [name]                            -                           node_name
    [privilege-level]                 -                            advanced
    [properties]                      -                   string,no-display
    [unit]                            -                                none
  [counter-info]                      -                                   *
    [desc]                            -                      System node id
    [is-deprecated]                   -                               false
    [name]                            -                           node_uuid
    [privilege-level]                 -                            advanced
    [properties]                      -                   string,no-display
    [unit]                            -                                none
  [counter-info]                      -                                   *
    [desc]                            - Ontap process that provided this instance
    [is-deprecated]                   -                               false
    [is-key]                          -                                true
    [name]                            -                        process_name
    [privilege-level]                 -                                diag
    [properties]                      -                              string
    [unit]                            -                                none
  [counter-info]                      -                                   *
    [desc]                            -    Name of the associated resource.
    [is-deprecated]                   -                               false
    [name]                            -                       resource_name
    [privilege-level]                 -                            advanced
    [properties]                      -                              string
    [unit]                            -                                none
  [counter-info]                      -                                   *
    [base-counter]                    -                              visits
    [desc]                            - The workload's average service time per visit to the service center.
    [is-deprecated]                   -                               false
    [name]                            -                        service_time
    [privilege-level]                 -                            advanced
    [properties]                      -              average,no-zero-values
    [unit]                            -                            microsec
  [counter-info]                      -                                   *
    [desc]                            - The number of visits that the workload made to the service center; measured in visits per second.
    [is-deprecated]                   -                               false
    [name]                            -                              visits
    [privilege-level]                 -                            advanced
    [properties]                      -                 rate,no-zero-values
    [unit]                            -                             per_sec
  [counter-info]                      -                                   *
    [base-counter]                    -                              visits
    [desc]                            - The workload's average wait time per visit to the service center.
    [is-deprecated]                   -                               false
    [name]                            -                           wait_time
    [privilege-level]                 -                            advanced
    [properties]                      -              average,no-zero-values
    [unit]                            -                            microsec

Other ZAPI documentation

@faguayot
Author

Hello @cgrinds,

Thanks for this info; it is useful for understanding what each counter is. The zapi tool's help doesn't show the "show counters" command, so we didn't know about it.

I've upgraded Harvest to the latest release and I've seen some errors for the Volume and Disk ZAPI objects.

harvest version 21.05.3-2 (commit b482aff) (build date 2021-06-28T21:08:05+0530) linux/amd64
Here are the errors:
[screenshot of errors]

[screenshot of errors]

Do you know why this could be happening?

@cgrinds
Collaborator

cgrinds commented Jun 30, 2021

@faguayot glad it helped - the zapi tool needs to make its arguments more obvious. counters is there, but it's hard to see in the show line:

[screenshot]

I'll take a look at the errors - are they persistent or transient? Do they go away on the next poll? Do you see anything earlier in the logs about context deadline exceeded (Client.Timeout or context cancellation while reading...)?

There's a similar conversation in Slack about the same errors; there, the problem was a client timeout. Because the ZAPI times out, you get these skip-instance messages.

You can try increasing the client_timeout by editing conf/zapi/default.yaml and adding this at line 9:

client_timeout: 60

This increases the ZAPI timeout from 10s to 60s.
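The top of the file would then look roughly like this (a sketch; other keys elided, only client_timeout is added):

# conf/zapi/default.yaml
collector: Zapi
client_timeout: 60     # raises the ZAPI client timeout from the 10s default to 60s
# schedule, objects, etc. continue unchanged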

@faguayot
Author

faguayot commented Jul 8, 2021

Hello @cgrinds

Regarding the context deadline exceeded (Client.Timeout or context cancellation while reading...) issue, I increased the
client_timeout from 30 to 60. We had this problem in the past and @vgratian recommended we increase the timeout to 30s.

With this change, the problem seems to be solved.

Thanks.
Best regards!

@cgrinds cgrinds closed this as completed Jul 12, 2021