Add a collector for ZFS, currently focussed on ARC stats. #213

Closed
wants to merge 24 commits into from
Changes from 23 commits
0a8832e
Add a collector for ZFS, currently focussed on ARC stats.
problame Feb 25, 2016
538183f
Incorporate Feedback from brian-brazil.
problame Mar 12, 2016
62e4a7d
Change behavior on errors.
problame Mar 13, 2016
e92e9a2
Enable ZFS exporter by default and update README.
problame Mar 13, 2016
6c13943
Style fixes.
problame Mar 13, 2016
4d15f0b
Comments should end in periods.
problame Mar 13, 2016
fdfe4a6
Build system fixes.
problame Mar 13, 2016
3ae58ba
Remove noop zfsInitialize().
problame Mar 13, 2016
ffc3455
Use struct instantiation instead of constructor for zfsMetrics.
problame Mar 13, 2016
394305a
Fix unreachable code.
problame Mar 13, 2016
e828e04
Remove race-condition in access on zfsMetricProvider in Update()
problame Mar 13, 2016
45e989d
Extract arcstat procfs file parsing into separate method.
problame Mar 13, 2016
3c33e7e
Add unit test for arcstats parsing.
problame Mar 13, 2016
66f4eeb
Fix out-of-bounds on empty lines at the end of the procfs file.
problame Mar 13, 2016
3a6a8a5
Style fixes.
problame Mar 13, 2016
22bbb85
Remove log statement.
problame Mar 13, 2016
735effc
Only compile on supported platforms.
problame Mar 13, 2016
6bdd416
Restructure implementation without zfsMetricProvider.
problame Mar 13, 2016
60aa697
Rename zfs_arc subsystem.
problame Mar 15, 2016
2f99eb6
Expose all available ARC metrics on FreeBSD and Linux.
problame Mar 15, 2016
006cf29
Rename PrepareUpdate() to make intentions behind method clear.
problame Mar 15, 2016
3bb5356
Expose zpool metrics on FreeBSD.
problame Mar 16, 2016
a102306
Update ZFS collector description.
problame Mar 16, 2016
e8d45a2
Stylistic fixes.
problame Mar 28, 2016
2 changes: 1 addition & 1 deletion README.md
@@ -32,7 +32,7 @@ textfile | Exposes statistics read from local disk. The `--collector.textfile.di
time | Exposes the current system time. | _any_
vmstat | Exposes statistics from `/proc/vmstat`. | Linux
version | Exposes node\_exporter version. | _any_

zfs | Exposes [ZFS](http://open-zfs.org/) performance statistics.<br/> FreeBSD (ARC, zpool), Linux (ARC) | [FreeBSD](https://www.freebsd.org/doc/handbook/zfs.html), [Linux](http://zfsonlinux.org/)
Member
I'm not sure ZFS is used widely enough to be enabled by default. It should be explicitly enabled instead.

Contributor
The general rule is that if it behaves sanely when zfs isn't loaded, then it's okay to have it on by default. We should aim for as much as is sane to work out of the box.
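
One way to get the "behaves sanely when zfs isn't loaded" property is to probe for the kstat file and map a missing file to a sentinel error that the caller treats as "skip quietly". A minimal sketch, assuming the Linux path implied by the fixture directory in this PR; `errZFSNotAvailable` and the standalone `zfsAvailable(path)` signature are illustrative names, not the PR's actual implementation:

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errZFSNotAvailable is a hypothetical sentinel meaning "ZFS statistics
// cannot be read on this host", so the collector can skip without failing.
var errZFSNotAvailable = errors.New("ZFS / ZFS statistics are not available")

// zfsAvailable maps a missing kstat file to the sentinel error and passes
// any other error (for example a permission problem) through unchanged.
func zfsAvailable(path string) error {
	_, err := os.Stat(path)
	if os.IsNotExist(err) {
		return errZFSNotAvailable
	}
	return err
}

func main() {
	// Path assumed from collector/fixtures/proc/spl/kstat/zfs/arcstats.
	err := zfsAvailable("/proc/spl/kstat/zfs/arcstats")
	switch {
	case err == errZFSNotAvailable:
		fmt.Println("zfs not loaded, skipping collector")
	case err != nil:
		fmt.Println("unexpected error:", err)
	default:
		fmt.Println("zfs statistics available")
	}
}
```

With this shape, a host without the ZFS module produces no metrics and no scrape error, which is the condition given above for enabling a collector by default.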


### Disabled by default

93 changes: 93 additions & 0 deletions collector/fixtures/proc/spl/kstat/zfs/arcstats
@@ -0,0 +1,93 @@
6 1 0x01 91 4368 5266997922 97951858082072
name type data
hits 4 8772612
misses 4 604635
demand_data_hits 4 7221032
demand_data_misses 4 73300
demand_metadata_hits 4 1464353
demand_metadata_misses 4 498170
prefetch_data_hits 4 3615
prefetch_data_misses 4 17094
prefetch_metadata_hits 4 83612
prefetch_metadata_misses 4 16071
mru_hits 4 855535
mru_ghost_hits 4 21100
mfu_hits 4 7829854
mfu_ghost_hits 4 821
deleted 4 60403
mutex_miss 4 2
evict_skip 4 2265729
evict_not_enough 4 680
evict_l2_cached 4 0
evict_l2_eligible 4 8992514560
evict_l2_ineligible 4 992552448
evict_l2_skip 4 0
hash_elements 4 42359
hash_elements_max 4 88245
hash_collisions 4 50564
hash_chains 4 412
hash_chain_max 4 3
p 4 516395305
c 4 1643208777
c_min 4 33554432
c_max 4 8367976448
size 4 1603939792
hdr_size 4 16361080
data_size 4 1295836160
metadata_size 4 175298560
other_size 4 116443992
anon_size 4 1917440
anon_evictable_data 4 0
anon_evictable_metadata 4 0
mru_size 4 402593792
mru_evictable_data 4 278091264
mru_evictable_metadata 4 18606592
mru_ghost_size 4 999728128
mru_ghost_evictable_data 4 883765248
mru_ghost_evictable_metadata 4 115962880
mfu_size 4 1066623488
mfu_evictable_data 4 1017613824
mfu_evictable_metadata 4 9163776
mfu_ghost_size 4 104936448
mfu_ghost_evictable_data 4 96731136
mfu_ghost_evictable_metadata 4 8205312
l2_hits 4 0
l2_misses 4 0
l2_feeds 4 0
l2_rw_clash 4 0
l2_read_bytes 4 0
l2_write_bytes 4 0
l2_writes_sent 4 0
l2_writes_done 4 0
l2_writes_error 4 0
l2_writes_lock_retry 4 0
l2_evict_lock_retry 4 0
l2_evict_reading 4 0
l2_evict_l1cached 4 0
l2_free_on_write 4 0
l2_cdata_free_on_write 4 0
l2_abort_lowmem 4 0
l2_cksum_bad 4 0
l2_io_error 4 0
l2_size 4 0
l2_asize 4 0
l2_hdr_size 4 0
l2_compress_successes 4 0
l2_compress_zeros 4 0
l2_compress_failures 4 0
memory_throttle_count 4 0
duplicate_buffers 4 0
duplicate_buffers_size 4 0
duplicate_reads 4 0
memory_direct_count 4 542
memory_indirect_count 4 3006
arc_no_grow 4 0
arc_tempreserve 4 0
arc_loaned_bytes 4 0
arc_prune 4 0
arc_meta_used 4 308103632
arc_meta_limit 4 6275982336
arc_meta_max 4 449286096
arc_meta_min 4 16777216
arc_need_free 4 0
arc_sys_free 4 261496832
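
The fixture above has two header lines followed by `name type data` rows. A hedged sketch of how such a file could be parsed, including the empty-trailing-line case mentioned in the commit list; `parseArcstats` and its string-based signature are assumptions for illustration, not the parsing method added in this PR:

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseArcstats reads kstat-style text: two header lines, then rows of
// "name type data". Rows that do not have exactly three fields (such as
// empty lines at the end of the file) are skipped instead of indexed.
func parseArcstats(data string) (map[string]uint64, error) {
	stats := make(map[string]uint64)
	scanner := bufio.NewScanner(strings.NewReader(data))
	for line := 0; scanner.Scan(); line++ {
		if line < 2 {
			continue // kstat header line and the "name type data" titles
		}
		fields := strings.Fields(scanner.Text())
		if len(fields) != 3 {
			continue
		}
		v, err := strconv.ParseUint(fields[2], 10, 64)
		if err != nil {
			return nil, fmt.Errorf("arcstats %q: %v", fields[0], err)
		}
		stats[fields[0]] = v
	}
	return stats, scanner.Err()
}

func main() {
	sample := "6 1 0x01 91 4368 5266997922 97951858082072\n" +
		"name type data\n" +
		"hits 4 8772612\n" +
		"misses 4 604635\n\n"
	stats, err := parseArcstats(sample)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(stats["hits"], stats["misses"])
}
```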
78 changes: 78 additions & 0 deletions collector/fixtures/sysctl/freebsd/kstat.zfs.misc.arcstats.txt
@@ -0,0 +1,78 @@
kstat.zfs.misc.arcstats.arc_meta_max: 1503210048
kstat.zfs.misc.arcstats.arc_meta_limit: 393216000
kstat.zfs.misc.arcstats.arc_meta_used: 392649848
kstat.zfs.misc.arcstats.duplicate_reads: 0
kstat.zfs.misc.arcstats.duplicate_buffers_size: 0
kstat.zfs.misc.arcstats.duplicate_buffers: 0
kstat.zfs.misc.arcstats.memory_throttle_count: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_null_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_list_iter: 0
kstat.zfs.misc.arcstats.l2_write_buffer_bytes_scanned: 0
kstat.zfs.misc.arcstats.l2_write_pios: 0
kstat.zfs.misc.arcstats.l2_write_buffer_iter: 0
kstat.zfs.misc.arcstats.l2_write_full: 0
kstat.zfs.misc.arcstats.l2_write_not_cacheable: 29425
kstat.zfs.misc.arcstats.l2_write_io_in_progress: 0
kstat.zfs.misc.arcstats.l2_write_in_l2: 0
kstat.zfs.misc.arcstats.l2_write_spa_mismatch: 0
kstat.zfs.misc.arcstats.l2_write_passed_headroom: 0
kstat.zfs.misc.arcstats.l2_write_trylock_fail: 0
kstat.zfs.misc.arcstats.l2_compress_failures: 0
kstat.zfs.misc.arcstats.l2_compress_zeros: 0
kstat.zfs.misc.arcstats.l2_compress_successes: 0
kstat.zfs.misc.arcstats.l2_hdr_size: 0
kstat.zfs.misc.arcstats.l2_asize: 0
kstat.zfs.misc.arcstats.l2_size: 0
kstat.zfs.misc.arcstats.l2_io_error: 0
kstat.zfs.misc.arcstats.l2_cksum_bad: 0
kstat.zfs.misc.arcstats.l2_abort_lowmem: 0
kstat.zfs.misc.arcstats.l2_cdata_free_on_write: 0
kstat.zfs.misc.arcstats.l2_free_on_write: 0
kstat.zfs.misc.arcstats.l2_evict_reading: 0
kstat.zfs.misc.arcstats.l2_evict_lock_retry: 0
kstat.zfs.misc.arcstats.l2_writes_hdr_miss: 0
kstat.zfs.misc.arcstats.l2_writes_error: 0
kstat.zfs.misc.arcstats.l2_writes_done: 0
kstat.zfs.misc.arcstats.l2_writes_sent: 0
kstat.zfs.misc.arcstats.l2_write_bytes: 0
kstat.zfs.misc.arcstats.l2_read_bytes: 0
kstat.zfs.misc.arcstats.l2_rw_clash: 0
kstat.zfs.misc.arcstats.l2_feeds: 0
kstat.zfs.misc.arcstats.l2_misses: 0
kstat.zfs.misc.arcstats.l2_hits: 0
kstat.zfs.misc.arcstats.other_size: 166832272
kstat.zfs.misc.arcstats.data_size: 1200779776
kstat.zfs.misc.arcstats.hdr_size: 27244008
kstat.zfs.misc.arcstats.size: 1394856056
kstat.zfs.misc.arcstats.c_max: 1572864000
kstat.zfs.misc.arcstats.c_min: 196608000
kstat.zfs.misc.arcstats.c: 1470553736
kstat.zfs.misc.arcstats.p: 665524427
kstat.zfs.misc.arcstats.hash_chain_max: 7
kstat.zfs.misc.arcstats.hash_chains: 14180
kstat.zfs.misc.arcstats.hash_collisions: 2180398
kstat.zfs.misc.arcstats.hash_elements_max: 238188
kstat.zfs.misc.arcstats.hash_elements: 111458
kstat.zfs.misc.arcstats.evict_l2_ineligible: 60262400
kstat.zfs.misc.arcstats.evict_l2_eligible: 35702978560
kstat.zfs.misc.arcstats.evict_l2_cached: 0
kstat.zfs.misc.arcstats.evict_skip: 21716568
kstat.zfs.misc.arcstats.mutex_miss: 873
kstat.zfs.misc.arcstats.recycle_miss: 5018771
kstat.zfs.misc.arcstats.stolen: 1327563
kstat.zfs.misc.arcstats.deleted: 1187256
kstat.zfs.misc.arcstats.allocated: 10150518
kstat.zfs.misc.arcstats.mfu_ghost_hits: 1408986
kstat.zfs.misc.arcstats.mfu_hits: 51952454
kstat.zfs.misc.arcstats.mru_ghost_hits: 696819
kstat.zfs.misc.arcstats.mru_hits: 11115835
kstat.zfs.misc.arcstats.prefetch_metadata_misses: 32
kstat.zfs.misc.arcstats.prefetch_metadata_hits: 2
kstat.zfs.misc.arcstats.prefetch_data_misses: 0
kstat.zfs.misc.arcstats.prefetch_data_hits: 0
kstat.zfs.misc.arcstats.demand_metadata_misses: 9231542
kstat.zfs.misc.arcstats.demand_metadata_hits: 40650947
kstat.zfs.misc.arcstats.demand_data_misses: 75230
kstat.zfs.misc.arcstats.demand_data_hits: 22417340
kstat.zfs.misc.arcstats.misses: 9306804
kstat.zfs.misc.arcstats.hits: 63068289
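
Each line of this fixture is a fully qualified sysctl name, a colon, and an integer. As a rough illustration of turning one line into a short metric name and a value (the same last-component rule as `metricName()` in `collector/zfs.go`), here is a sketch; `parseSysctlLine` is a hypothetical helper, and how the collector actually reads sysctls on FreeBSD is not shown in this part of the diff:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseSysctlLine splits "a.b.c.name: 123" into the last name component
// and the unsigned integer value.
func parseSysctlLine(line string) (name string, value uint64, err error) {
	parts := strings.SplitN(line, ": ", 2)
	if len(parts) != 2 {
		return "", 0, fmt.Errorf("malformed sysctl line %q", line)
	}
	value, err = strconv.ParseUint(strings.TrimSpace(parts[1]), 10, 64)
	if err != nil {
		return "", 0, err
	}
	// LastIndex returns -1 when there is no dot, so the whole key is kept.
	name = parts[0][strings.LastIndex(parts[0], ".")+1:]
	return name, value, nil
}

func main() {
	name, value, err := parseSysctlLine("kstat.zfs.misc.arcstats.hits: 63068289")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(name, value)
}
```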
12 changes: 12 additions & 0 deletions collector/fixtures/zfs/zpool_stats_stdout.txt
@@ -0,0 +1,12 @@
trout size 4294967296 -
trout free 1040117248 -
trout allocated 70144 -
trout capacity 0% -
trout dedupratio 1.00x -
trout fragmentation 0% -
zroot size 118111600640 -
zroot free 3990917120 -
zroot allocated 114120683520 -
zroot capacity 50% -
zroot dedupratio 1.00x -
zroot fragmentation 67% -
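
The rows above are whitespace-separated pool, property, value and source columns, and some values carry a unit suffix (`50%`, `1.00x`). A sketch of one way to turn a row into a number, assuming the suffix is simply stripped; `zpoolStat` and `parseZpoolLine` are hypothetical names, and whether percentages should be exported as 0-100 or as a 0-1 ratio is a design choice this sketch does not settle:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// zpoolStat is one parsed row: pool name, property name, numeric value.
type zpoolStat struct {
	pool     string
	property string
	value    float64
}

// parseZpoolLine parses "pool property value source", dropping a trailing
// '%' or 'x' from the value before converting it to a float.
func parseZpoolLine(line string) (zpoolStat, error) {
	fields := strings.Fields(line)
	if len(fields) < 3 {
		return zpoolStat{}, fmt.Errorf("malformed zpool line %q", line)
	}
	raw := strings.TrimRight(fields[2], "%x")
	v, err := strconv.ParseFloat(raw, 64)
	if err != nil {
		return zpoolStat{}, fmt.Errorf("zpool %s/%s: %v", fields[0], fields[1], err)
	}
	return zpoolStat{pool: fields[0], property: fields[1], value: v}, nil
}

func main() {
	for _, line := range []string{
		"trout size 4294967296 -",
		"zroot capacity 50% -",
		"zroot dedupratio 1.00x -",
	} {
		stat, err := parseZpoolLine(line)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s %s %g\n", stat.pool, stat.property, stat.value)
	}
}
```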
114 changes: 114 additions & 0 deletions collector/zfs.go
@@ -0,0 +1,114 @@
package collector
Member
Please include the same copyright/license header here as in all the other Node Exporter files. Same for the other Go files here.


// +build linux freebsd
// +build !nozfs
Contributor
This should also be restricted to build only on Linux and FreeBSD, as it won't compile on other platforms.


import (
	"errors"
	"strings"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/common/log"
)

type zfsMetricValue int

const zfsErrorValue = zfsMetricValue(-1)

var zfsNotAvailableError = errors.New("ZFS / ZFS statistics are not available")

type zfsSysctl string
type zfsSubsystemName string
Member
This is maybe overdoing the special typing a bit, I'd just keep those as normal strings.


const (
	arc            = zfsSubsystemName("zfsArc")
	zpoolSubsystem = zfsSubsystemName("zfsPool")
)

// Metrics

type zfsMetric struct {
	subsystem zfsSubsystemName // The Prometheus subsystem name.
	name      string           // The Prometheus name of the metric.
	sysctl    zfsSysctl        // The sysctl of the ZFS metric.
}

type datasetMetric struct {
Contributor
This is unused

Member
This type is (still) unused.

	subsystem zfsSubsystemName
	name      string
}

// Collector

func init() {
	Factories["zfs"] = NewZFSCollector
}

type zfsCollector struct {
	zfsMetrics []zfsMetric
}

func NewZFSCollector() (Collector, error) {
	return &zfsCollector{}, nil
}

func (c *zfsCollector) Update(ch chan<- prometheus.Metric) (err error) {

Contributor
I've noticed you sometimes add extra whitespace at the beginning or end of function blocks. To me, it looks a little bit messy. However, I'm not a member of the Prometheus project, so I will defer to their stylistic preferences.

Member
I agree.

	err = c.zfsAvailable()
	switch {
	case err == zfsNotAvailableError:
		log.Debug(err)
		return nil
	case err != nil:
		return err
	}

	// Arcstats
	err = c.updateArcstats(ch)
	if err != nil {
		return err
Contributor
This is unreachable.

	}

	// Pool stats
	err = c.updatePoolStats(ch)
Contributor
Use `return c.updatePoolStats(ch)` here and remove the next few lines.

	if err != nil {
		return err
	}

	return err
}
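
The review suggestion on the tail of `Update()` amounts to returning the result of the last step directly instead of checking and re-returning `err`. A sketch of that control flow with stubbed update steps; the `collector` type, its fields and `errPool` are hypothetical stand-ins so the snippet runs without the Prometheus client library:

```go
package main

import (
	"errors"
	"fmt"
)

// errPool is a hypothetical error used to exercise the failure path.
var errPool = errors.New("zpool: command failed")

// collector stubs out the two update steps with configurable results.
type collector struct {
	arcErr  error
	poolErr error
}

func (c *collector) updateArcstats() error  { return c.arcErr }
func (c *collector) updatePoolStats() error { return c.poolErr }

// update stops at the first failing step and otherwise returns whatever
// the final step returns, with no trailing error check.
func (c *collector) update() error {
	if err := c.updateArcstats(); err != nil {
		return err
	}
	return c.updatePoolStats()
}

func main() {
	fmt.Println((&collector{}).update())
	fmt.Println((&collector{poolErr: errPool}).update())
}
```

Scoping `err` to the `if` statement also removes the need for the named return value in the signature.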

func (s zfsSysctl) metricName() string {
	parts := strings.Split(string(s), ".")
	return parts[len(parts)-1]
}

func (c *zfsCollector) ConstSysctlMetric(subsystem zfsSubsystemName, sysctl zfsSysctl, value zfsMetricValue) prometheus.Metric {
Member
Lower-case this and the function below please, as it's not used outside of this package.


	metricName := sysctl.metricName()

	return prometheus.MustNewConstMetric(
		prometheus.NewDesc(
			prometheus.BuildFQName(Namespace, string(subsystem), metricName),
			string(sysctl),
			nil,
			nil,
		),
		prometheus.UntypedValue,
		float64(value),
	)
}

func (c *zfsCollector) ConstZpoolMetric(pool, name string, value float64) prometheus.Metric {
Member
This and the method above are defined as methods on *zfsCollector, but they don't actually use anything of that type at all. Seems like they should either be standalone functions or maybe the upper one should be a method on zfsSysctl instead?

	return prometheus.MustNewConstMetric(
		prometheus.NewDesc(
			prometheus.BuildFQName(Namespace, string(zpoolSubsystem), name),
			name,
			[]string{"pool"},
			nil,
		),
		prometheus.UntypedValue,
		float64(value),
		pool,
	)
}