Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add filtering option by exact device names in system.diskio: diskio.include_devices. #6085

Merged
merged 14 commits into from
Feb 12, 2018
Merged

Add filtering option by exact device names in system.diskio: diskio.include_devices. #6085

merged 14 commits into from
Feb 12, 2018

Conversation

iahmedov
Copy link
Contributor

Per feature request #6083 added system.diskio and system.raid regular expression filters

@elasticmachine
Copy link
Collaborator

Can one of the admins verify this patch?

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few thoughts on the config option. We should find one that is reusable as this becomes a more common pattern. Here a few suggestions:

With the below, the assumption is that we potentially want to have more fields to filter on and regexp can be used for positive / negative:

  #diskio.filter.name: ^sda
  #raid.filter.name: ^md

The below is more specific on what it does. It allows to set a white and blacklist. Regexp are used to also define multiple patterns.

  #diskio.include.name: ^sda
  #diskio.exclude.name: ^sda
  #raid.include.name: ^md
  #raid.exclude.name: ^md

Same as the previous one except that having to define multiple patterns in regexp, an array can be used.

  #diskio.include.name: [^sda]
  #diskio.exclude.name: [^sda]
  #raid.include.name: [^sda]
  #raid.exclude.name: [^sda]

I was also thinking about a more generic option that the field can be defined and what it should match. But looking at it I think it's overkill and harder to document.

  #diskio.includes:
    - field: name
      match: [^sda]
    - field: other
      match: [^sda]

Having written this, I'm asking myself if it's the right way to do this. We have processors and should use them whenever possible to have only 1 implementation. There are some special cases where the metricset is a sum of multiple events where this doesn't work. But that should not be the case here as diskio and raid are multiple events and can be filtered. The main problem it's hard to define a whitelist.

As far as I understand the initial request from @stefws the important part is that the data is already filtered before it's event retrieved (if that is possible). This changes does not seem to do this as far as I can see.

@stefws @iahmedov I would really appreciate to hear your thoughts on this.

@@ -21,14 +24,33 @@ func init() {
type MetricSet struct {
mb.BaseMetricSet
statistics *DiskIOStat
nameRegexp *regexp.Regexp
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should use our own Match package here as it's more efficient then the native regexp implementation: https://github.com/elastic/beats/tree/master/libbeat/common/match

@ruflin ruflin added in progress Pull request is currently in progress. discuss Issue needs further discussion. labels Jan 17, 2018
@iahmedov
Copy link
Contributor Author

iahmedov commented Jan 17, 2018

We should find one that is reusable as this becomes a more common pattern.

We have processors and should use them whenever possible to have only 1 implementation.

I agree with you, we probably need some generic way of filtering. By the way, didn't know about libbeat.Processor, which can be applied here as a way of generic filtering.
Something like this could do the job

- module: system
  metricsets: ["diskio", "raid"]
  processors:
    - include_event:
        regexp: ^sda
        field: diskio.name
    - drop_event:
        regexp: ^md1
        field: raid.name

This is going to be slower, since it works after fetching events, but it can be "ok" for some metricsets like raid where there is not too much data. For performance critical cases, there is always a way to write specific filters inside metricsets. If this is okay we can add regexp filter to processor actions.

As far as I understand the initial request from @stefws the important part is that the data is already filtered before it's event retrieved (if that is possible)

Yes possible, since methods we are using do not support filtering while parsing and if only 2 metricsets require this, we can open pull requests to 3rd party libs we depend on, not sure they can accept such request, since data for diskstats and mdstat most of the time is small and there is no need to add more code to do filtering inside parsers:

Any thoughts on this @ruflin ?

@stefws
Copy link

stefws commented Jan 17, 2018 via email

@iahmedov
Copy link
Contributor Author

Just my point, on own hypervisor nodes, there are hundredes of guest devices, that I’ll like not to pull data from just to drop such event again in processor action.

@stefws now I understand your point, if you have exact names of devices to include, then there is a quicker way, to add it inside diskio metricset, since psutil/disk.IOCounters (which diskio uses) supports filtering by exact match names

@ruflin
Copy link
Contributor

ruflin commented Jan 18, 2018

@stefws @iahmedov Thanks for the input. Moving forward I suggest not to add any post processing features to the metricsets if the processors can be used. If the processors can't do what we need, we should discuss if we should extend the processors or add special functionality.

For the pre processing it really depends if the client library supports it as stated. If the feature is useful we should definitively try to make a contribution. In some cases the libs are even under our control as we already forked it because of other changes that were critical for us.

For this PR I suggest we change it to focus only on pre-processing as @stefws initially request. So if the lib already supports it, lets see if we can use it (didn't dig into it).

@stefws
Copy link

stefws commented Jan 19, 2018

@iahmedov 'add it inside diskio metricset, since psutil/disk.IOCounters (which diskio uses) supports filtering by exact match names'

Howto utilize this for diskio?

@iahmedov
Copy link
Contributor Author

iahmedov commented Jan 19, 2018

@stefws I will change the PR soon (I am on mobile now), so that in config you can give an array of device names to include in events

module: system
  metricsets: ["diskio"]
  diskio.include_devices: ["sda", "sda1", "sda2", "sdb1", .... ] # exact match used
...

@ruflin, @stefws what do you think about this field, is this fine?

As for raid they do not support matching by name yet, should be opened PR there.
Let me correct myself, raid metricset uses elastic/procfs not elastic/gosigar project

@ruflin
Copy link
Contributor

ruflin commented Jan 21, 2018

@iahmedov I like your suggestion. Could you also a CHANGELOG?

Looking at the code I somehow missed where the actual implementation in the call to fetch the data happens. Did I miss something?

@iahmedov
Copy link
Contributor Author

Looking at the code I somehow missed where the actual implementation in the call to fetch the data happens. Did I miss something?

Thankfully you didn't miss anything @ruflin , instead I missed to push changes

@@ -111,6 +111,9 @@ metricbeat.modules:
#socket.reverse_lookup.success_ttl: 60s
#socket.reverse_lookup.failure_ttl: 60s

# Diskio configurations
#diskio.include_devices: ["sda", "sda1", "sda2"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder what we should have as default config. Should we have [ ] to disable it or ["*"] ? Alternatives?

Copy link
Contributor Author

@iahmedov iahmedov Jan 22, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not specifying the config or just empty array disables filtering

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As our config normally follow the logic to have the "default" configure I suggest we have here []. It's good to have the examples in our docs as you did.

@ruflin
Copy link
Contributor

ruflin commented Jan 22, 2018

@iahmedov Good to know. Somehow Github did not published my comments before, so perhaps some are not valid anymore. Will review later again.

@ruflin
Copy link
Contributor

ruflin commented Jan 22, 2018

jenkins, test it

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code wise it looks good to me. I left some minor docs / config comments.

@@ -111,6 +111,9 @@ metricbeat.modules:
#socket.reverse_lookup.success_ttl: 60s
#socket.reverse_lookup.failure_ttl: 60s

# Diskio configurations
#diskio.include_devices: ["sda", "sda1", "sda2"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As our config normally follow the logic to have the "default" configure I suggest we have here []. It's good to have the examples in our docs as you did.

There are no configuration options for this metricset.
*`diskio.include_devices`*:: When the `diskio` metricset is enabled, you can use the
`diskio.include_devices` option to define a list of device names to pre-filter the
devices that are reported. Filters only exact matches
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change this to: This filters only exact matches. (dot at the end).

*`diskio.include_devices`*:: When the `diskio` metricset is enabled, you can use the
`diskio.include_devices` option to define a list of device names to pre-filter the
devices that are reported. Filters only exact matches
If not set, all disk devices are returned
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not set or an empty array is set, all disk devices are returned.

(also missing dot at the end).

}

// New is a mb.MetricSetFactory that returns a new MetricSet.
func New(base mb.BaseMetricSet) (mb.MetricSet, error) {
ms := &MetricSet{
config := struct {
Names []string `config:"diskio.include_devices"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name this variable IncludeDevices. This makes it easier to follow the code I think.

Copy link

@stefws stefws Jan 23, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 would make sense.
Right so this means that this feature isn't already in place as I was lead to believe initially...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stefws Not sure I get your comment above?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin I seconded your comment on the more sensible name for the variable and since this is a code change it means that this feature isn't currently available yet (as in v.5.6.5) which I mistakenly thought was whats iahmedov ment when he said:
'since psutil/disk.IOCounters (which diskio uses) supports filtering by exact match names'

@@ -14,6 +14,7 @@
process.include_top_n:
by_cpu: 5 # include top 5 processes by CPU
by_memory: 5 # include top 5 processes by memory
#diskio.include_devices: ["sda", "sda1", "sda2"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not add this here.

@@ -41,6 +41,7 @@ metricbeat.modules:
process.include_top_n:
by_cpu: 5 # include top 5 processes by CPU
by_memory: 5 # include top 5 processes by memory
#diskio.include_devices: ["sda", "sda1", "sda2"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it should be added here as the above is our "default" config.

@ruflin
Copy link
Contributor

ruflin commented Jan 25, 2018

@iahmedov Can you ping me when you pushed an update. I don't think I get a notification on commits.

@iahmedov
Copy link
Contributor Author

iahmedov commented Jan 26, 2018

@ruflin, its ready, can you please review again, thanks.

@ruflin
Copy link
Contributor

ruflin commented Jan 28, 2018

jenkins, test it

Copy link
Contributor

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM. I left 2 minor comments for the short config files to remove it as it should not be in the short configs. We try to have there only the most common options. Thanks for adding docs.

@@ -14,6 +14,7 @@
process.include_top_n:
by_cpu: 5 # include top 5 processes by CPU
by_memory: 5 # include top 5 processes by memory
#diskio.include_devices: []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you remove this line? We should not have this in our short config. Also run make update afterwards.

@@ -14,6 +14,7 @@
process.include_top_n:
by_cpu: 5 # include top 5 processes by CPU
by_memory: 5 # include top 5 processes by memory
#diskio.include_devices: []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should also be removed here (if make update does not cover it).

@iahmedov
Copy link
Contributor Author

iahmedov commented Feb 3, 2018

@ruflin comments are implemented, could you please review again

@ruflin
Copy link
Contributor

ruflin commented Feb 4, 2018

jenkins, test it

@@ -51,6 +51,8 @@ https://github.com/elastic/beats/compare/v6.0.0-beta2...master[Check the HEAD di
- Rename `golang.heap.system.optained` field to `golang.heap.system.obtained`. {issue}5703[5703]
- Support haproxy stats gathering using http (additionaly to tcp socket). {pull}5819[5819]
- De dot keys in jolokia/jmx metricset to prevent collisions. {pull}5957[5957]
- Support to optionally 'de dot' keys in http/json metricset to prevent collisions. {pull}5957[5957]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this sneaked in during rebasing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out yes, removed entry from changelog which is not related to this pull request (just tagging to inform about change @ruflin )

@ruflin
Copy link
Contributor

ruflin commented Feb 5, 2018

jenkins, test it

@ruflin
Copy link
Contributor

ruflin commented Feb 5, 2018

Thanks @iahmedov for the ping. I just pushed one more commit as there was now a conflict in the changelog. Waiting for green from CI now.

@ruflin
Copy link
Contributor

ruflin commented Feb 6, 2018

@iahmedov Could you run make update in the metricbeat directory? One of the auto generated files seem to be out of sync.

@karmi
Copy link

karmi commented Feb 6, 2018

Hi @iahmedov, we have found your signature in our records, but it seems like you have signed with a different e-mail than the one used in yout Git commit. Can you please add both of these e-mails into your Github profile (they can be hidden), so we can match your e-mails to your Github profile?

@iahmedov
Copy link
Contributor Author

iahmedov commented Feb 6, 2018

@ruflin new commit

@iahmedov
Copy link
Contributor Author

iahmedov commented Feb 6, 2018

@karmi fixed mistake, rebased commit

@ruflin
Copy link
Contributor

ruflin commented Feb 9, 2018

jenkins, test it

@ruflin
Copy link
Contributor

ruflin commented Feb 12, 2018

jenkins, test it

@ruflin ruflin merged commit 539caf2 into elastic:master Feb 12, 2018
@ruflin ruflin changed the title Add diskio and raid filtering using regular expressions Add filtering option by exact device names in system.diskio: diskio.include_devices. Feb 12, 2018
@elasticmachine
Copy link
Collaborator

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

@ruflin
Copy link
Contributor

ruflin commented Feb 12, 2018

@iahmedov Finally merged. Took a bit longer then expected. Thanks a lot for the contribution.

@iahmedov
Copy link
Contributor Author

@ruflin Thanks for thorough and patient review 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Issue needs further discussion. in progress Pull request is currently in progress. Metricbeat Metricbeat module review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants