
Proposal: Beat "presets" #3190

Closed
sebgl opened this issue Jun 5, 2020 · 14 comments
Assignees
Labels
>enhancement Enhancement of existing functionality

Comments

@sebgl
Contributor

sebgl commented Jun 5, 2020

I'd like to discuss a proposal here to change the logic of the current Beat CRD a bit.

Concerns

I want to address the following concerns:

  • if you want to just grab logs from a mounted /data/logs directory with filebeat, you end up by default with many additional things and permissions you don't need:
    • a dedicated service account whose token is mounted in the pod
    • a cluster role binding for that service account that can access the k8s API
    • HostNetwork = true
    • a security context to run with the root user (id 0)
    • system volumes mounted in:
      • /var/lib/docker/containers
      • /var/log/containers
      • /var/log/pods
  • similarly with metricbeat, if you just want to grab metrics from a prometheus http endpoint, you still end up with:
    • similar RBAC permissions, service account, root user, host network
    • system volumes mounted in:
      • /var/run/docker.sock
      • /proc
      • /sys/fs/cgroup
  • some of those defaults can be overridden in the podTemplate, but some of them are harder to override: for example, the default volumes we create for Filebeat and Metricbeat (e.g. /proc) are hard to remove for users who don't want them
  • with Metricbeat we default to a DaemonSet collecting metrics from all hosts, but we could also have chosen to default to a Deployment collecting kube-state-metrics (both are listed in the Metricbeat docs), or to a Deployment collecting Prometheus endpoints
  • the story around config merging is complicated

Proposal

  • keep defaults very light (no mounted volumes, no privileges, no RBAC, no config) for Filebeat and Metricbeat, similar to what we do for any generic "other Beat", treating all Beats the same. We only set the ES output config if (and only if) elasticsearchRef is specified, but don't set any other config setting:
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: 
spec:
  type: filebeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  daemonSet:
    podTemplate:
      # mount whatever volume you need here and add privileges if needed
  config:
    # whatever you need here, we just add the es output to it
  • Introduce "presets" to cover default podTemplate, RBAC and configuration for a small set of identified use cases

filebeat-k8s-autodiscover: collect hint-based logs from all Pods, autodiscovered from the apiserver (also handles service account, role binding, etc.)

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: 
spec:
  type: filebeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  preset: filebeat-k8s-autodiscover

filebeat-k8s-containers: collect logs from all containers - does not rely on autodiscover from the apiserver

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: 
spec:
  type: filebeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  preset: filebeat-k8s-containers

metricbeat-k8s-hosts: collect metrics from all k8s nodes

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: 
spec:
  type: metricbeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  preset: metricbeat-k8s-hosts

metricbeat-k8s-cluster-metrics: collect metrics from the k8s cluster (in a single metricbeat instance)

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: 
spec:
  type: metricbeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  preset: metricbeat-k8s-cluster-metrics
  • the podTemplate of all those presets can of course be overridden

  • if you want to include a lot of modifications it's probably easier to not use the preset and set your own podTemplate, config, and RBAC stuff

  • it gives us room to include other presets later on (for example, for Packetbeat) while keeping the current Packetbeat defaults unchanged. Otherwise, the day we decide to set a Packetbeat default, we'd break compatibility with existing Packetbeat resources.

  • I think we can tackle the config merging problem when a preset is used by allowing users to override a few identified array items. Keep our current "merge-append" behaviour to be consistent with other CRDs, but allow some specific overrides (typed in the CRD):

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: 
spec:
  type: metricbeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  preset: metricbeat-k8s-hosts
  configOverrides:
    modules:
    # system modules from the preset removed (user-provided ones are appended instead)
    - system
    # kubernetes modules from the preset removed (no user-provided one)
    - kubernetes
    autodiscover.providers.type:
    # kubernetes provider from the preset removed (user-provided ones appended instead)
    - kubernetes
  config:
    # the usual append-merge behaviour
    modules:
    - module: system # replace ours since specified in `configOverrides.modules`
      period: 5s
      metricsets:
      - cpu
      - load
    - module: system # replace ours since specified in `configOverrides.modules`
      period: 1m
      metricsets:
      - filesystem
    # append-merge this new setting
    setup.dashboards.enabled: true
  • in case no preset is used, there's no problem with the config merge-append behaviour, since our config is empty anyway (it just contains the ES output if elasticsearchRef is specified)
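As a sketch of the proposed override semantics, the effective config produced by the example above could look like the following (assuming the `metricbeat-k8s-hosts` preset defines `system` and `kubernetes` modules; the `configOverrides` mechanism itself is hypothetical at this point):

```yaml
# Hypothetical effective config after applying configOverrides (sketch):
# the preset's `system` and `kubernetes` modules are dropped because they are
# listed under configOverrides.modules; user-provided modules take their place.
modules:
- module: system
  period: 5s
  metricsets:
  - cpu
  - load
- module: system
  period: 1m
  metricsets:
  - filesystem
# settings not listed under configOverrides are merge-appended as usual
setup.dashboards.enabled: true
```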
@sebgl sebgl added the >enhancement Enhancement of existing functionality label Jun 5, 2020
@david-kow
Contributor

Thanks for the write up @sebgl! My 2 cents below.

I think the concerns you listed can be divided into two groups:

Defaults

The default config comes with a podTemplate that is not suitable if a different config is used.

To have Beats working there are many moving parts, but I think that's just the nature of the problem. Still, many of the things listed in the issue will be addressed.

If we still feel that's not enough when users want to do the setup themselves, we could provide a flag (in the daemonSet and deployment specs, or at the top level) that would turn off all defaults and cause the plain user podTemplate to be used.

Config merging is complicated

I think this is orthogonal to presets. We can have the proposed configOverrides with or without presets and we can have presets with or without configOverrides. I'd propose to tackle this separately. This solution has similar issues as some other ones we discussed in #3042.


IIUC, the proposal is about providing a number of default configs per Beat to address different use cases. It was considered already (although in a bit different form):

We could have new CRDs that map directly to covered scope items that were defined above, ie. have one CRD for cluster monitoring and another one for application monitoring. This seems to enable us to default Beats configs in a way that doesn’t require a lot customization to work, ie. “almost empty” manifest would default to enable all pieces to achieve Kubernetes cluster monitoring. On the other hand this is a very different route from one we took with existing CRDs. It seems that making this customizable would be difficult too.

I do see a couple more issues:

  • how do we pick the use cases to support, and how do we know in which direction to make them customizable? I'm afraid it will be difficult to come up with configurations that suit everyone, and if they need to be altered in any way, users are again in the territory of looking at the defaults, diffing, and trying to figure out the right merge/override

  • do we know if this is going to be a preferable way of running Beats? We would be making a big promise (that it works and will be supported going forward) and I'm not sure this is something many users want

  • I think maintaining a single default per Beat is already a stretch, and having more will just grow the test matrix. Since this would be a field in the CRD with well-known values, it would be expected to work across all dimensions we already have today. This, it seems to me, could be a big maintainability burden: beats x presets x k8s versions x stack versions x secured/non-secured x OCP/non-OCP. We would be in the business of maintaining a number of Beats configs similar to (larger than?) what the Beats team itself maintains.

Maybe we can do something a little bit different and achieve the same goals. The way I see it, a preset is a beatConfig+podTemplate for a particular Beat type. Which is what Beat CR itself is :)

Could we get similar UX by just providing recipes with Beat CR for different scenarios? This would be similar to current recipes (not production ready, might require hardening, we can change them without figuring out upgrade story, etc.). Users could use it as a starting point and tune it as needed, which is what we expect would happen with presets too.

We can also decide to postpone adding this until we get some feedback. It seems it can be added at any time.

@anyasabo
Contributor

anyasabo commented Jun 5, 2020

I think we can tackle the config merging problem in case a preset is used, by allowing to override a few identified array items

In your proposal I think it might also make sense to make the config and preset fields mutually exclusive. You either use our presets or you provide the whole config (minus the output I suppose).

do we know if this is going to be a preferable way of running Beats? We would be making a big promise (that it works and will be supported going forward) and I'm not sure if this is something desirable by many users

I don't quite understand this issue since we are already committing to this for each beat default in the current proposal.

hostNetwork is arguably a good default - elastic/beats#15013 , fairly easy to switch off if needed

It's not clear to me that this is actually a good default, it seems like using the downward API to get the node name provides the same thing without any of the downsides / permissions issues.
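For reference, the downward API pattern mentioned here exposes the node name to the Pod without requiring hostNetwork; a minimal sketch against the Beat podTemplate (container name and env var name are illustrative):

```yaml
# Sketch: obtain the node name via the standard Kubernetes downward API
# instead of relying on hostNetwork. Field names follow the core PodSpec.
spec:
  daemonSet:
    podTemplate:
      spec:
        containers:
        - name: filebeat        # illustrative container name
          env:
          - name: NODE_NAME     # illustrative env var name
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
```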

@sebgl
Contributor Author

sebgl commented Jun 8, 2020

Summary of a discussion we had with @david-kow @anyasabo @pebrc:

  • presets can be seen as opt-in explicit defaults, as opposed to implicit defaults in the current implementation
  • the main benefit is being able to more easily specify config and podTemplate for use cases completely different from the main default (e.g. Metricbeat polling an HTTP endpoint instead of collecting host-level metrics), in which case opting out of the default could be hard (volume mounts, RBAC, privileges, etc.)
  • we don't want to commit to implement (too) many presets for each beat. 2 presets for metricbeat (daemonset for host metrics & deployment for k8s metrics) and 1 for filebeat (autodiscover pod logs) sounds like a good starting point. The risk is to slowly move towards a helm approach where there are many templates and parameters users can pass.
  • there is still value in users being able to completely override the config for a given preset. For example, with the Metricbeat preset they may be happy to keep the existing RBAC & volume mount stuff, but just want to override one tiny detail in the config.
  • as @david-kow said, the config overrides problem is orthogonal to the presets idea. We may want to implement it at some point, but for now it seems fair that the user either relies on the preset config or provides their own config entirely (no overrides, except the ES output).

@pebrc
Collaborator

pebrc commented Jun 8, 2020

The minimum prerequisite for introducing this feature would be to remove any existing defaults, volume mounts and RBAC settings from #3041 (or a quick follow-up PR).

We can then in a separate decision introduce the presets (or play around with sample configurations external to the controller)

It would also be helpful to ask for feedback for the CRD and this idea in particular from @\elastic/integrations-platforms

@sebgl
Contributor Author

sebgl commented Jun 8, 2020

The minimum prerequisite for introducing this feature would be to remove any existing defaults, volume mounts and RBAC settings from #3041 (or a quick follow-up PR).
We can then in a separate decision introduce the presets (or play around with sample configurations external to the controller)

If we remove this default, isn't the Beat CRD basically useless or very hard to use?
I think I'd rather keep it and have a future preset PR refactor it into an opt-in preset instead.

@pebrc
Collaborator

pebrc commented Jun 8, 2020

If we remove this default, isn't the Beat CRD basically useless or very hard to use?
I think I'd rather keep it and have a future preset PR refactor it into an opt-in preset instead.

You could have example configs to go with it. I think I am OK with removing the defaults together with the preset PR if it is before an initial release. Once we release the Beats CRD even if it is still alpha, removing the defaults would be a breaking change, that we should avoid if possible.

@anyasabo
Contributor

anyasabo commented Jun 8, 2020

I think I agree with pebrc.

I am still wrestling with the value of having the presets built in to the controller, vs starting out with having an example library of common use cases in our docs and potentially adding presets later. My inclination is to start smaller with a basic beat. Adding things is easier than taking them away in the future.

It also makes me wonder if stack monitoring would still make sense as a separate CRD or if it makes sense to include as a preset (either in documentation or codified).

@david-kow
Contributor

I agree on removing the defaults, but I think we should have presets (or some other way?) to specify a working config.

All our other CRDs offer a great startup experience with 10-liners specifying everything needed for a simple quickstart. Meanwhile, Beat will need a fairly complicated, unreadable 50-150 lines of config, which we will probably end up hiding behind kubectl apply -f https://....

Also, having a preset (or any other built-in config mechanism) together with configRef enables a workflow where users start with the default and get an editable Secret created by the operator.
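For illustration, the configRef workflow could look something like this (a sketch; the Secret name is an assumption, and the name field is left empty as in the other examples):

```yaml
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: 
spec:
  type: filebeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  configRef:
    secretName: filebeat-config   # illustrative: a Secret holding the Beat config,
                                  # editable by the user after the operator creates it
```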

@jsoriano
Member

jsoriano commented Jun 9, 2020

Hi, sorry if I am joining late to this discussion as my thoughts are going to be a bit against "presets" 😅 Also I am not familiar with current operator, so I may be missing some basic things that may have already been discussed.

I would like to frame my comments around the deploy manifests we currently maintain in the Beats repo (here) and how they tend to be used: usually as they are, customized as needed following comments in the same YAML file. One way of supporting Beats in the operator could be to move the knowledge in these manifests and their comments into the controller.

For example, one thing we note in the Filebeat manifest is that to use autodiscover, you need to remove the filebeat.inputs configuration and uncomment the filebeat.autodiscover block. I wonder if this could be modeled with a setting:

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
spec:
  type: filebeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  autodiscover: true

Another thing that is usually customized is the logs path; this could be another setting:

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
spec:
  type: filebeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  autodiscover: true
  containers_logs:
    path: "/var/log/containers/"

Another common need is to enable or disable some things when the Beat is deployed in a secured Kubernetes distribution, like OpenShift. For example, Filebeat needs to be privileged to read the logs in some environments. Maybe the operator is able to detect this and set the proper options; if not, a setting for this may be needed:

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
spec:
  type: filebeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  autodiscover: true
  containers_logs:
    path: "/var/log/containers/"
  privileged: true

Covering just the combinations of these three quite popular variations would require lots of presets, but they can be modelled as a few clear options. Support for these options could be added in different iterations.
I see the risk of ending up with too many parameters and templates, but we could support only a small set of well-known, opinionated use cases, and not all possible configurations of Beats. We could even prevent certain error-prone combinations from being used, like autodiscover templates and hints-based autodiscover at the same time.

if you want to just grab logs from a mounted /data/logs directory with filebeat,

A case like this could be covered with options like this:

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
spec:
  type: filebeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  autodiscover: false
  containers_logs:
    enabled: false
  inputs:
    - type: log
      paths:
        - /data/logs

The user could specify the path as it is on the host, and the operator could then mount /data/logs as a volume in the Filebeat container and generate a Filebeat config with a single input pointing at the correct path. It could use the other options to decide whether to grant more or fewer privileges and whether to mount other volumes.
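The volume handling described above could translate into something like the following generated PodSpec fragment (a sketch using standard Kubernetes hostPath volumes; the container and volume names are illustrative):

```yaml
# Sketch of what the operator could generate for a host-mounted logs directory.
spec:
  containers:
  - name: filebeat               # illustrative container name
    volumeMounts:
    - name: data-logs            # illustrative volume name
      mountPath: /data/logs
      readOnly: true
  volumes:
  - name: data-logs
    hostPath:
      path: /data/logs
```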

One particularity of Metricbeat is that we recommend deploying both a DaemonSet and a Deployment for it. The DaemonSet is intended to collect metrics locally on each node, and the Deployment to collect cluster metrics from kube-state-metrics. A config like the following one could generate only a DaemonSet with hints-based autodiscover, and nothing for kube-state-metrics:

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
spec:
  type: metricbeat
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  autodiscover:
    hints: true
  kube_state_metrics: false

Also, something that may have been discussed in the past: I wonder if we have considered having a kind of CRD for each well-supported Beat, so we could have specific options for specific Beats (e.g. inputs only makes sense for Filebeat).

apiVersion: beat.k8s.elastic.co/v1beta1
kind: Filebeat
spec:
  version: 7.6.2
  elasticsearchRef:
    name: elasticsearch-sample
  autodiscover: false
  containers_logs:
    enabled: false
  inputs:
    - type: log
      paths:
        - /data/logs

We could still have the Beat CRD if needed for completely custom configurations. The problem I see with offering completely custom configuration is that users may wonder why they should use an operator if they still need to provide whole configs.

@david-kow
Contributor

Thanks for your input @jsoriano, the timing is perfect :)

I see the value in making Beat config more customizable from the CRD itself. It would allow us to address a certain percentage of common setups with short and simple configs. At the same time, it requires us to draw an arbitrary line between what's easily customizable and what isn't. I'm not sure that's something we should do with Beats; we didn't do it with any of our other CRDs.

For context, having a generic Beat CRD was driven by the desire to support all Beats at once (including community Beats) instead of a (probably limited) number of very similar CRDs/controllers. Furthermore, I've spent some time thinking about "lifting" Beat config options to the CRD spec, and my biggest concerns at the time were:

  • we end up with a big bag of settings that, depending on the Beat type, might be ignored, conflict with each other, or depend on each other. I don't think we could easily make it obvious what works together and what doesn't
  • fields like privileged or container_logs would have two sources of truth: the podTemplate and the CRD. So far we have tried to avoid being in that situation
  • some fields would have no meaning for some Beats, eg:
    -- no autodiscover for journalbeat, packetbeat, auditbeat, heartbeat (at least not documented)
    -- kube_state_metrics only for metricbeat
  • it would require us to make arbitrary choices about what to support. How do we determine what is worth putting in the CRD directly? What would be our answer to 'why is setting x not in the CRD'?
  • expanding support could be a rabbit hole
  • making sure all combinations work as expected across all Beats, all stack versions, all k8s versions, and all k8s flavours seems like a challenge and would likely be very resource-consuming

The above is mostly to give some more context why we chose this path initially :)

The problem I see with offering completely custom configuration is that users may wonder why using an operator if they still need to provide whole configs.

I think this is a very valid question. Even if the user provides the whole Beat config, the operator takes care of:

  • setting the output to ES
  • setting up default Kibana dashboards
  • cert rollover
  • pods rollover in case of config/version change
  • putting Secrets into Beat keystore

And this is on top of having a "configless" startup experience if they choose to use a preset (if we decide to have one).
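The keystore point above could look roughly like the secure-settings mechanism used elsewhere in ECK; a sketch (the secureSettings field name mirrors other ECK CRDs and is assumed here, as is the Secret name):

```yaml
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: 
spec:
  type: filebeat
  version: 7.6.2
  secureSettings:
  - secretName: filebeat-secret-settings   # illustrative: entries of this Secret
                                           # would be loaded into the Beat keystore
```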

I wonder if this "simplified" config approach was ever considered on the Beat config level (simple.autodiscover: true would translate to some default autodiscover setup, etc.).

@jsoriano
Member

jsoriano commented Jun 10, 2020

  • fields like privileged or container_logs would have two sources of truth: podTemplate and the CRD. So far we tried to avoid being in that situation

+1 to avoid multiple sources of truth. If we support a limited number of options in the CRD, we probably shouldn't allow adding custom podTemplates. Maybe a solution for that could be to have a generic Beat CRD that only supports podTemplates, and a set of Beat-specific CRDs that don't support podTemplates but do support the specific higher-level options.

  • we end up with a big bag of settings that, depending on the Beat type, might be ignored, conflict with each other, or depend on each other. I don't think we could easily make it obvious what works together and what doesn't

  • some fields would have no meaning for some Beats, eg:
    -- no autodiscover for journal, packet, audit, heartbeats (at least not documented)
    -- kube_state_metrics only for metricbeat

Wouldn't these points speak in favour of having specific CRDs for each supported Beat? Then, if autodiscover is only supported in Filebeat and Metricbeat, we only support the autodiscover option in the CRDs for those Beats.

  • it would require us to make arbitrary choices about what to support. How do we determine what is worth putting in the CRD directly? What would be our answer to 'why is setting x not in the CRD'?

  • expanding support could be a rabbit hole

  • making sure all combinations work as expected across all Beats, all stack versions, all k8s versions, and all k8s flavours seems like a challenge and would likely be very resource-consuming

Yeah, this can be true, but I guess this comes down to a decision on the use cases we want to support in the operator. I think it can make sense to support only a limited number of options. The space of possible Beats option combinations is wide, and some combinations are error-prone (I am thinking of autodiscover hints vs. templates, for example, but also of which Beats make sense as Deployments, as DaemonSets, or as both). I think we don't need to support all combinations for the Kubernetes use case.

I think this is a very valid question. Even if the user provides the whole Beat config, the operator takes care of:

  • setting the output to ES
  • setting up default Kibana dashboards
  • cert rollover
  • pods rollover in case of config/version change
  • putting Secrets into Beat keystore

Oh, this is great 👍

I wonder if this "simplified" config approach was ever considered on the Beat config level (simple.autodiscover: true would translate to some default autodiscover setup, etc.).

We try to provide defaults covering basic use cases so configuration can be simpler. For example, enabling hints-based autodiscover in Filebeat enables collection of logs for all Pods; this works with the kubernetes and docker providers:

filebeat.autodiscover.providers:
  - type: kubernetes
    hints.enabled: true

This can be done in Filebeat because all logs are collected the same, but it wouldn't work on Metricbeat, where each container is going to need a different way of collecting metrics.

Another example would be the add_*_metadata processors: they are not needed when autodiscover is used, because it adds the same metadata. Also, add_cloud_metadata autodetects the cloud provider, so it doesn't need to be explicitly set.
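For context, the processors mentioned above appear in a typical non-autodiscover Beats config like this (a sketch; with autodiscover enabled, the equivalent metadata is attached automatically and this block can be dropped):

```yaml
# Typical processors block when autodiscover is NOT used.
processors:
- add_kubernetes_metadata: {}
- add_cloud_metadata: {}   # autodetects the cloud provider, no explicit settings needed
```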

@david-kow
Contributor

I think it comes down to how much product-specific complexity we can handle. I think you are right that many of the problems I mentioned can be solved by having a separate kind for each Beat and a generic one for community Beats. But besides those benefits, it also introduces another layer of abstraction, which is fine by itself, but requires users to learn and understand it.

We've looked at all the options (implicit defaults, explicit presets, no defaults) and decided to start with no defaults. We already had some people +1'ing that approach. We will see what the broader feedback is and consider improvements based on that.

You can see the gist for what full Filebeat/Metricbeat examples would look like.

@roncohen

Super late to this, and far from a Kubernetes expert, but: I understand the reasoning for going with no defaults here. I still like the idea of presets. Could we instead implement them just as documentation? "Cookbook" examples of how to get started with the operator and a specific use case, e.g. collecting logs from all containers, supplying an example config file users can just "apply" to get up and running with that use case? In short, a one-step guide to collecting logs from all containers, one step to collecting metrics from hosts, etc.

@pebrc
Copy link
Collaborator

pebrc commented Jun 17, 2020

Could we instead implement them just as documentation?

That's the plan for the time being. We already have a collection of recipes that we want to update for the Beats CRD.
