This repository has been archived by the owner on Mar 27, 2024. It is now read-only.

hadoop hdfs (#269)
* hdfs module
ilyam8 authored Sep 20, 2019
1 parent ebbe8dd commit 97f9712
Showing 13 changed files with 1,854 additions and 1 deletion.
1 change: 1 addition & 0 deletions cmd/godplugin/main.go
@@ -23,6 +23,7 @@ import (
_ "github.com/netdata/go.d.plugin/modules/example"
_ "github.com/netdata/go.d.plugin/modules/fluentd"
_ "github.com/netdata/go.d.plugin/modules/freeradius"
_ "github.com/netdata/go.d.plugin/modules/hdfs"
_ "github.com/netdata/go.d.plugin/modules/httpcheck"
_ "github.com/netdata/go.d.plugin/modules/k8s_kubelet"
_ "github.com/netdata/go.d.plugin/modules/k8s_kubeproxy"
160 changes: 160 additions & 0 deletions config/go.d/hdfs.conf
@@ -0,0 +1,160 @@
# netdata go.d.plugin configuration for hadoop hdfs
#
# This file is in YAML format. Generally the format is:
#
# name: value
#
# There are 2 sections:
# - GLOBAL
# - JOBS
#
#
# [ GLOBAL ]
# These variables set the defaults for all JOBs, however each JOB may define its own, overriding the defaults.
#
# The GLOBAL section format:
# param1: value1
# param2: value2
#
# Currently supported global parameters:
# - update_every
# Data collection frequency in seconds. Default: 1.
#
# - autodetection_retry
# Re-check interval in seconds. Attempts to start the job are made once every interval.
# Zero disables re-checking. Default: 0.
#
# - priority
# Priority is the relative priority of the charts as rendered on the web page,
# lower numbers make the charts appear before the ones with higher numbers. Default: 70000.
#
#
# [ JOBS ]
# JOBS allow you to collect values from multiple sources.
# Each source will have its own set of charts.
#
# IMPORTANT:
# - Parameter 'name' is mandatory.
# - Jobs with the same name are mutually exclusive. Only one of them is allowed to run at any time.
#
# This allows autodetection to try several alternatives and pick the one that works.
# Any number of jobs is supported.
#
# The JOBS section format:
#
# jobs:
#   - name: job1
#     param1: value1
#     param2: value2
#
#   - name: job2
#     param1: value1
#     param2: value2
#
#   - name: job2
#     param1: value1
#
#
# [ List of JOB specific parameters ]:
# - url
# Server URL.
# Syntax:
# url: http://localhost:80
#
# - username
# Username for basic HTTP authentication.
# Syntax:
# username: tony
#
# - password
# Password for basic HTTP authentication.
# Syntax:
# password: stark
#
# - proxy_url
# Proxy URL.
# Syntax:
# proxy_url: http://localhost:3128
#
# - proxy_username
# Username for proxy basic HTTP authentication.
# Syntax:
# proxy_username: bruce
#
# - proxy_password
# Password for proxy basic HTTP authentication.
# Syntax:
# proxy_password: wayne
#
# - timeout
# HTTP response timeout.
# Syntax:
# timeout: 1
#
# - method
# HTTP request method.
# Syntax:
# method: GET
#
# - body
# HTTP request body.
# Syntax:
# body: '{fake: data}'
#
# - headers
# HTTP request headers.
# Syntax:
# headers:
#   X-API-Key: key
#
# - not_follow_redirects
# Whether to not follow redirects from the server.
# Syntax:
# not_follow_redirects: yes/no
#
# - tls_skip_verify
# Whether to skip verifying server's certificate chain and hostname.
# Syntax:
# tls_skip_verify: yes/no
#
# - tls_ca
# Certificate authority that the client uses when verifying server certificates.
# Syntax:
# tls_ca: path/to/ca.pem
#
# - tls_cert
# Client TLS certificate.
# Syntax:
# tls_cert: path/to/cert.pem
#
# - tls_key
# Client TLS key.
# Syntax:
# tls_key: path/to/key.pem
#
#
# [ JOB defaults ]:
# timeout: 1
# method: GET
# not_follow_redirects: no
# tls_skip_verify: no
#
#
# [ JOB mandatory parameters ]:
# - name
# - url
#
# ------------------------------------------------MODULE-CONFIGURATION--------------------------------------------------
# [ GLOBAL ]
# update_every: 1
# autodetection_retry: 0
# priority: 70000
#
#
# [ JOBS ]
#jobs:
#  - name: master
#    url: http://127.0.0.1:9870/jmx
#
#  - name: slave
#    url: http://127.0.0.1:9864/jmx
67 changes: 67 additions & 0 deletions modules/hdfs/README.md
@@ -0,0 +1,67 @@
# hdfs

This module monitors one or more [`Hadoop Distributed File System`](https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html) (HDFS) nodes, depending on configuration.

It accesses HDFS metrics over `Java Management Extensions` (JMX) through the web interface of an HDFS daemon.

**Requirements:**
* `hdfs` node with accessible `/jmx` endpoint

It produces the following charts for `namenode`:
- Heap Memory in `MiB`
- GC Events in `events/s`
- GC Time in `ms`
- Number of Times That the GC Threshold is Exceeded in `events/s`
- Number of Threads in `num`
- Number of Logs in `logs/s`
- RPC Bandwidth in `kilobits/s`
- RPC Calls in `calls/s`
- RPC Open Connections in `connections`
- RPC Call Queue Length in `num`
- RPC Avg Queue Time in `ms`
- RPC Avg Processing Time in `ms`
- Capacity Across All Datanodes in `KiB`
- Used Capacity Across All Datanodes in `KiB`
- Number of Concurrent File Accesses (read/write) Across All DataNodes in `load`
- Number of Volume Failures Across All Datanodes in `events/s`
- Number of Tracked Files in `num`
- Number of Allocated Blocks in the System in `num`
- Number of Problem Blocks (can point to an unhealthy cluster) in `num`
- Number of Data Nodes By Status in `num`

It produces the following charts for `datanode`:
- Heap Memory in `MiB`
- GC Events in `events/s`
- GC Time in `ms`
- Number of Times That the GC Threshold is Exceeded in `events/s`
- Number of Threads in `num`
- Number of Logs in `logs/s`
- RPC Bandwidth in `kilobits/s`
- RPC Calls in `calls/s`
- RPC Open Connections in `connections`
- RPC Call Queue Length in `num`
- RPC Avg Queue Time in `ms`
- RPC Avg Processing Time in `ms`
- Capacity in `KiB`
- Used Capacity in `KiB`
- Bandwidth in `KiB/s`


### configuration

Only the `url` of the server's `/jmx` endpoint is required.

Here is an example for 2 servers:

```yaml
jobs:
  - name : master
    url  : http://127.0.0.1:9870/jmx

  - name : slave
    url  : http://127.0.0.1:9864/jmx
```
For all available options, please see the module [configuration file](https://github.com/netdata/go.d.plugin/blob/master/config/go.d/hdfs.conf).
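
The configuration file lists `name` and `url` as mandatory job parameters and documents defaults of `timeout: 1` and `method: GET`. The Go sketch below shows one way such rules could be applied to a job entry. It is only an illustration under those assumptions: the `job` struct, `withDefaults` and `validate` are hypothetical and are not taken from the plugin.

```go
package main

import (
	"errors"
	"fmt"
)

// job is a hypothetical, trimmed-down view of one entry under `jobs:`.
type job struct {
	Name    string
	URL     string
	Timeout int // seconds
	Method  string
}

// withDefaults fills unset optional fields with the documented JOB defaults
// (timeout: 1, method: GET).
func withDefaults(j job) job {
	if j.Timeout == 0 {
		j.Timeout = 1
	}
	if j.Method == "" {
		j.Method = "GET"
	}
	return j
}

// validate enforces the two mandatory parameters, name and url.
func validate(j job) error {
	if j.Name == "" {
		return errors.New("'name' is mandatory")
	}
	if j.URL == "" {
		return errors.New("'url' is mandatory")
	}
	return nil
}

func main() {
	jobs := []job{
		{Name: "master", URL: "http://127.0.0.1:9870/jmx"},
		{Name: "slave"}, // url deliberately left out to show the rejection path
	}
	for _, j := range jobs {
		j = withDefaults(j)
		if err := validate(j); err != nil {
			fmt.Printf("job %q rejected: %v\n", j.Name, err)
			continue
		}
		fmt.Printf("job %q: %s %s (timeout %ds)\n", j.Name, j.Method, j.URL, j.Timeout)
	}
}
```

Applying defaults before validation keeps the check focused on the parameters that have no sensible fallback.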
---