The agent was first written for Kubernetes and is relatively easy to set up in a cluster. The agent is intended to run on each node and monitors services running on those same nodes to minimize cross-node traffic.
See the documentation on Monitoring Kubernetes for more information on how to use the UI components in the SignalFx webapp once you are set up.
Follow these instructions to install the SignalFx agent on your Kubernetes cluster and configure it to auto-discover SignalFx-supported integrations to monitor.
1. Store your organization's Access Token as a key named `access-token` in a Kubernetes secret named `signalfx-agent`:

   $ kubectl create secret generic --from-literal access-token=MY_ACCESS_TOKEN signalfx-agent
2. If you use Helm, you can use our chart in the stable Helm chart repository. Otherwise, download the following files from SignalFx's GitHub repository to the machine you usually run `kubectl` from, and modify them as indicated:

   - daemonset.yaml: Kubernetes daemon set configuration
   - configmap.yaml: SignalFx agent configuration

   Then make the following changes:

   - Using a text editor, replace the default value `MY-CLUSTER` with the desired name for your cluster. This will appear in the dimension called `kubernetes_cluster` in SignalFx.
   - If the agent will be sending data via a proxy, see proxy support.
   - If Docker and cAdvisor metrics are not necessary for certain containers, see filtering.
   - If you have RBAC enabled in your cluster, look at the other k8s resources in the agent repo to see what is required for the agent pod to have the proper permissions; an illustrative sketch follows this list.
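As an illustration only, such RBAC resources typically pair a ClusterRole granting read access to cluster state with a ClusterRoleBinding for the agent's service account. The resource names and rules below are assumptions made for this sketch; the authoritative definitions are the RBAC resources in the agent repo:

# Hypothetical sketch; see the agent repo for the actual rules required.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: signalfx-agent
rules:
- apiGroups: [""]
  resources: ["pods", "namespaces", "nodes"]
  verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: signalfx-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: signalfx-agent
subjects:
- kind: ServiceAccount
  name: signalfx-agent
  namespace: default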
- If you are using Rancher for your Kubernetes deployment, complete the instructions in Rancher before proceeding with the next step.
- If you are using AWS Elastic Container Service for Kubernetes (EKS), complete the instructions in AWS Elastic Container Service for Kubernetes (EKS) before proceeding with the next step.
- If you are using Pivotal Container Service (PKS), complete the instructions in Pivotal Container Service (PKS) before proceeding with the next step.
- If you are using Google Container Engine (GKE), complete the instructions in Google Container Engine (GKE) before proceeding with the next step.
- If you are using OpenShift 3.0+, complete the instructions in OpenShift before proceeding with the next step.
3. Run the following commands on your Kubernetes cluster to install the agent with the default configuration. Include the path to each .yaml file you downloaded in step #2:

   $ kubectl create -f configmap.yaml \
       -f daemonset.yaml
4. Data will begin streaming into SignalFx. After a few minutes, verify that data from Kubernetes has arrived using the Infrastructure page. If you don't see data arriving, check the logs on one of the agent containers for errors. You can also exec the command `signalfx-agent status` in any of the agent pods to get diagnostic output from the agent, as shown below.
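For example, assuming one of your agent pods is named signalfx-agent-abcd1 (a placeholder; list the actual pod names with kubectl get pods):

$ kubectl exec -it signalfx-agent-abcd1 -- signalfx-agent status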
The SignalFx agent for Kubernetes comes pre-configured with most of the integrations that SignalFx supports out of the box. Using customizable rules based on the container image name and service port, you can automatically start monitoring the microservices running in your containers. Each integration has a default configuration that you can customize for your environment by creating a new integration configuration file.
For more information, see Auto Discovery.
Our provided agent DaemonSet includes a set of tolerations for master nodes that should work across multiple K8s versions. If your master node does not use the taints included in the provided daemonset, you should replace the tolerations with your cluster's master taint so that the agent will run on the master node(s).
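For example, if your masters use the common node-role.kubernetes.io/master taint (an assumption; check the actual taints with kubectl describe node), the tolerations in the DaemonSet's pod spec would look like this sketch:

tolerations:
- key: node-role.kubernetes.io/master
  operator: Exists
  effect: NoSchedule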
Observers are what discover services running in the environment. Our agent is set up to monitor only services running on the same K8s node as the agent itself.
For Kubernetes, there are two observers that you can use: the k8s-api observer, which discovers pods via the Kubernetes API server, and the k8s-kubelet observer, which discovers pods via the local Kubelet API.
We recommend the k8s-api observer, since the Kubelet API is technically undocumented.
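Observers are enabled in the agent config. As a minimal sketch, enabling the recommended API observer is a single entry in the observers list:

observers:
- type: k8s-api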
Monitors are what collect metrics from the environment or services. See Monitor Config for more information on specific monitors that we support. All of these work the same in Kubernetes.
Of particular relevance to Kubernetes are the following monitors:
- Kubernetes Cluster - Gets cluster level metrics from the K8s API
- cAdvisor - Gets container metrics directly from cAdvisor exposed on the same node (most likely won't work in newer K8s versions that don't expose cAdvisor's port in the Kubelet)
- Kubelet Stats - Gets cAdvisor metrics through the Kubelet `/stats` endpoint. This is much more robust, as it uses the same interface that Heapster uses.
- Prometheus Exporter - Gets Prometheus metrics directly from exporters. This is especially useful if you already have exporters deployed in your cluster because you currently use Prometheus.
If you want to pull metrics from kube-state-metrics, you can use a config similar to the following (assuming the kube-state-metrics instance runs once in your cluster, in a container whose image name contains the string "kube-state-metrics"):
- type: prometheus-exporter
  discoveryRule: container_image =~ "kube-state-metrics"
  disableHostDimensions: true
  disableEndpointDimensions: true
  extraDimensions:
    metric_source: kube-state-metrics
This uses the prometheus-exporter monitor to pull metrics from the service. It also disables a number of host- and endpoint-specific dimensions that are irrelevant to cluster-level metrics. Note that many of the metrics exposed by kube-state-metrics overlap with our own kubernetes-cluster monitor, so you probably don't want to enable both unless you are using heavy filtering.
When using the k8s-api observer, you can use Kubernetes pod annotations to tell the agent how to monitor your services. There are several annotations that the k8s-api observer recognizes:
- `agent.signalfx.com/monitorType.<port>: "<monitor type>"` - Specifies the monitor type to use when monitoring the specified port. If this value is present, any agent config will be ignored, so you must fully specify any non-default config values you want to use in annotations. If this annotation is missing for a port but other config is present, you must have discovery rules or manually configured endpoints in your agent config to monitor this port; the other annotation config values will be merged into the agent config.
- `agent.signalfx.com/config.<port>.<configKey>: "<configValue>"` - Specifies a config option for the monitor that will monitor this endpoint. The options are the same as specified in the monitor config reference. Lists may be specified with the syntax `[a, b, c]` (YAML compact list), which will be deserialized to a list that is provided to the monitor. Boolean values are the annotation string values `true` or `false`. Integers can also be specified; they must be strings as the annotation value, but they will be interpreted as an integer if they don't contain any non-number characters.
- `agent.signalfx.com/configFromEnv.<port>.<configKey>: "<env var name>"` - Specifies a config option that will be pulled from an environment variable on the same container as the port being monitored.
- `agent.signalfx.com/configFromSecret.<port>.<configKey>: "<secretName>/<secretKey>"` - Maps the value of a secret to a config option. The `<secretKey>` is the key of the secret value within the `data` object of the actual K8s Secret resource. Note that this requires the agent's service account to have the correct permissions to read the specified secret.
In all of the above, the `<port>` field can be either the port number of the endpoint you want to monitor or its assigned name. The config is specific to a single port, which allows you to monitor multiple ports in a single pod and container just by specifying annotations with different ports.
The following K8s pod spec and agent YAML configuration accomplish the same thing:
K8s pod spec:
metadata:
  annotations:
    agent.signalfx.com/monitorType.jmx: "collectd/cassandra"
    agent.signalfx.com/config.jmx.intervalSeconds: "20"
    agent.signalfx.com/config.jmx.mBeansToCollect: "[cassandra-client-read-latency, threading]"
  labels:
    app: my-app
spec:
  containers:
  - name: cassandra
    ports:
    - containerPort: 7199
      name: jmx
      protocol: TCP
    ...
Agent config:
monitors:
- type: collectd/cassandra
  intervalSeconds: 20
  mBeansToCollect:
  - cassandra-client-read-latency
  - threading
If a pod has the `agent.signalfx.com/monitorType.*` annotation on it, that pod will be excluded from the auto-discovery mechanism and will be monitored only with the given annotation configuration. If you want to merge configuration from the annotations with agent configuration, you must omit the monitorType annotation and rely on auto discovery to find this endpoint. At that point, config from both sources will be merged together, with pod annotation config taking precedence.
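For illustration, here is a sketch of that merge behavior using a hypothetical Redis container (the image name, port, and interval value are assumptions, not taken from the docs). The pod annotation carries only an override, while the monitor type comes from a discovery rule in the agent config.

K8s pod spec:

metadata:
  annotations:
    agent.signalfx.com/config.6379.intervalSeconds: "30"

Agent config:

monitors:
- type: collectd/redis
  discoveryRule: container_image =~ "redis" && port == 6379

Because the monitorType annotation is omitted, auto discovery finds the endpoint through the discovery rule, and the intervalSeconds value from the annotation takes precedence over anything configured in the agent.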
If you are using Rancher to manage your Kubernetes cluster, perform these steps after you complete step 2 in Installation.
If the Rancher nodes are behind a proxy, ensure that the Docker engine has the proxy configured so that it can pull the signalfx-agent Docker image from quay.io. See the Rancher documentation for details on how to configure the proxy.
Use the following configuration for the cadvisor monitor:
monitors:
- type: cadvisor
  cadvisorURL: http://localhost:9344
In Rancher, cAdvisor runs on port 9344 instead of the standard 4194.
When you have completed these steps, continue with step 3 in Installation.
On EKS, machine IDs are identical across worker nodes, which makes that value useless for identification. Therefore, make the following two changes to the configmap to use the K8s node name instead of the machine ID (a combined sketch of both changes follows the list).
1. In configmap.yaml, change the top-level config option `sendMachineId` to `false`. This will cause the agent to omit the machine_id dimension from all datapoints and instead send the `kubernetes_node` dimension on all datapoints emitted by the agent.
2. Under the kubernetes-cluster monitor configuration, set the option `useNodeName: true`. This will cause that monitor to sync node labels to the `kubernetes_node` dimension instead of the `machine_id` dimension.
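Putting both changes together, the relevant fragment of the agent config in configmap.yaml would look roughly like this (a sketch showing only the affected options; all other config is omitted):

sendMachineId: false
monitors:
- type: kubernetes-cluster
  useNodeName: true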
Note that in EKS there is no concept of a "master" node (at least none that is exposed via the K8s API), so all nodes will be treated as workers.
See AWS Elastic Container Service for Kubernetes (EKS); the setup for PKS is identical because of the similar lack of reliable machine IDs.
On GKE, access to the kubelet is highly restricted and service accounts will not work (at least as of GKE 1.9.4). In those environments, you can use the alternative, non-secure port 10255 on the kubelet in the kubelet-stats monitor to get container metrics. The config for that monitor will look like:
monitors:
- type: kubelet-stats
  kubeletAPI:
    authType: none
    url: http://localhost:10255
As long as you use our standard RBAC resources, this should be the only modification needed to accommodate GKE.
If you are using OpenShift 3.0+ for your Kubernetes deployment, perform these steps after you complete step 2 in Installation.
OpenShift 3.0+ is based on Kubernetes, so most of the above instructions apply. However, OpenShift's more restrictive security policies disallow some of the things the agent needs to be effective, such as running in privileged mode, mounting host filesystems into the agent container, and reading from the Kubelet and Kubernetes API with service accounts.
First we need a service account for the agent (you will need to be a cluster administrator to do the following):
oc create serviceaccount signalfx-agent
We need to make this service account able to read information about the cluster:
oadm policy add-cluster-role-to-user cluster-reader system:serviceaccount:default:signalfx-agent
Next we need to add this service account to the privileged SCC. Run `oc edit scc privileged` and add the signalfx-agent service account at the end of the users list:

users:
...
- system:serviceaccount:default:signalfx-agent
Finally, in the daemonset config for the agent, you need to add the name of the service account created above. Add the following line in the spec section of the agent daemonset (see above for the base daemonset config file):
serviceAccountName: signalfx-agent
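In context, this line belongs in the pod template spec of the DaemonSet; a minimal sketch of where it sits:

spec:
  template:
    spec:
      serviceAccountName: signalfx-agent
      ...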
Now you should be able to follow the instructions above and have the agent running in short order.