Processor.parser with GROK cannot fully process fields to tags #5339

Closed
joriws opened this issue Jan 24, 2019 · 5 comments · Fixed by #5533
Labels
bug unexpected problem or unintended behavior
Comments

@joriws

joriws commented Jan 24, 2019

Relevant telegraf.conf:

[agent]
  interval="1s"
  flush_interval="1s"

[[inputs.exec]]
  timeout = "3s"
  data_format = "csv"
  commands = [
    "echo 'date,instance,Incoming Requests,Outgoing Requests,Incoming Answers 2xxx,Outgoing Answers 2xxx,Incoming Answers UTD,Outgoing Answers UTD,Incoming Answers Redirect,Outgoing Answers Redirect,Incoming Answers Other,Outgoing Answers Other,Retransmit Requests,Rejected Requests Filtering,Rejected Requests In-gress Filtering,Rejected Requests Other,Timeout Requests,Discarded Answers\n2019-01-23-00:00,UTC-KEY-RE1-V6 random1.mit.edu orig.realm.net:node.orig.realm.net:dest.realm.net:ans.dest.realm.net,22,22,11,11,33,44,55,66,77,88,99,12,13,14,15,16'"
  ]
  csv_header_row_count = 1
  csv_delimiter = ","
  csv_trim_space = true
  #csv_tag_columns = ["instance"]
  csv_timestamp_column = "date"
  csv_timestamp_format = "2006-01-02-15:04"

[[processors.parser]]
  parse_fields = ["instance"]
  #drop_original = true
  merge = "override"
  data_format = "grok"
  grok_patterns = [ "^%{USERNAME:CHASSIS:tag}-%{USERNAME:INSTANCE:tag} %{HOSTNAME:NODE:tag} %{HOSTNAME:ORIGINREALM:tag}:%{HOSTNAME:ORIGINHOST:tag}:%{HOSTNAME:REALM:tag}:%{DATA:ANSWERHOST}$" ]
  #grok_patterns = [ "^%{USERNAME:CHASSIS:tag}-%{USERNAME:INSTANCE:tag} %{HOSTNAME:NODE:tag} %{HOSTNAME:ORIGINREALM:tag}:%{HOSTNAME:ORIGINHOST:tag}:%{HOSTNAME:REALM:tag}:%{DATA:ANSWERHOST:tag}$" ]

#[[processors.regex]]
#  [[processors.regex.tags]]
#    key = "instance"
#    pattern = "^([a-zA-Z0-9-]*?).+$"
#    replacement = "${1}"
#    result_key = "CHASSIS"

[[outputs.file]]
  files = ["stdout"]

System info:

Telegraf unknown (git: master efbc83c)

  • go get 2019/01/23 and compilation

Steps to reproduce:

I had to remove csv_tag_columns from inputs.exec so that the parser processor would handle the "instance" column, since it appears to process fields only. My requirement is to process tags as well.

With that change the parser reads the field "instance" and I can apply grok_patterns to it. The uncommented pattern works, but the commented-out pattern does not. In my scenario every part of the "instance" column is a tag that I need to set for the InfluxDB output.

Expected behavior:

exec,CHASSIS=UTC-KEY-RE1,INSTANCE=V6,NODE=random1.mit.edu,ORIGINHOST=node.orig.realm.net,ORIGINREALM=orig.realm.net,REALM=dest.realm.net,host=na000teleflow1,ANSWERHOST="ans.dest.realm.net" Rejected\ Requests\ In-gress\ Filtering=13i,Incoming\ Answers\ 2xxx=11i,Rejected\ Requests\ Other=14i,Incoming\ Requests=22i,Incoming\ Answers\ UTD=33i,Outgoing\ Answers\ 2xxx=11i,Retransmit\ Requests=99i,instance="UTC-KEY-RE1-V6 random1.mit.edu orig.realm.net:node.orig.realm.net:dest.realm.net:ans.dest.realm.net",Outgoing\ Answers\ UTD=44i,Incoming\ Answers\ Redirect=55i,Discarded\ Answers=16i,Outgoing\ Answers\ Other=88i,date="2019-01-23-00:00",Outgoing\ Requests=22i,Rejected\ Requests\ Filtering=12i,Outgoing\ Answers\ Redirect=66i,Incoming\ Answers\ Other=77i,Timeout\ Requests=15i 1548201600000000000

Actual behavior:

exec,CHASSIS=UTC-KEY-RE1,INSTANCE=V6,NODE=random1.mit.edu,ORIGINHOST=node.orig.realm.net,ORIGINREALM=orig.realm.net,REALM=dest.realm.net,host=na000teleflow1 Rejected\ Requests\ In-gress\ Filtering=13i,Incoming\ Answers\ 2xxx=11i,Rejected\ Requests\ Other=14i,Incoming\ Requests=22i,Incoming\ Answers\ UTD=33i,Outgoing\ Answers\ 2xxx=11i,Retransmit\ Requests=99i,instance="UTC-KEY-RE1-V6 random1.mit.edu orig.realm.net:node.orig.realm.net:dest.realm.net:ans.dest.realm.net",Outgoing\ Answers\ UTD=44i,Incoming\ Answers\ Redirect=55i,Discarded\ Answers=16i,Outgoing\ Answers\ Other=88i,date="2019-01-23-00:00",Outgoing\ Requests=22i,Rejected\ Requests\ Filtering=12i,Outgoing\ Answers\ Redirect=66i,Incoming\ Answers\ Other=77i,Timeout\ Requests=15i,ANSWERHOST="ans.dest.realm.net" 1548201600000000000

If I try to use the commented-out grok pattern, I get this error message:
2019-01-24T14:16:11Z E! [processors.parser] could not parse field instance: grok: must have one or more fields
exec,host=na000teleflow1 Outgoing\ Requests=22i,Outgoing\ Answers\ Other=88i,Outgoing\ Answers\ 2xxx=11i,Outgoing\ Answers\ Redirect=66i,Retransmit\ Requests=99i,Rejected\ Requests\ In-gress\ Filtering=13i,Timeout\ Requests=15i,date="2019-01-23-00:00",instance="UTC-KEY-RE1-V6 random1.mit.edu orig.realm.net:node.orig.realm.net:dest.realm.net:ans.dest.realm.net",Incoming\ Answers\ Redirect=55i,Outgoing\ Answers\ UTD=44i,Incoming\ Answers\ 2xxx=11i,Incoming\ Requests=22i,Incoming\ Answers\ UTD=33i,Rejected\ Requests\ Other=14i,Incoming\ Answers\ Other=77i,Rejected\ Requests\ Filtering=12i,Discarded\ Answers=16i 1548201600000000000

Additional info:

  • I would like to delete the field "instance" from the output after the parser has processed it; this can probably be done with a follow-up processor (order + 1) that removes the field.
  • ANSWERHOST must be convertible to a tag as well, but the parser does not seem to allow 100% of the field text to be captured with the tag modifier. This is the problem.
  • I tried processors.regex, but it is too limited for extracting multiple tag values from a single tag. It would be nice if I could write a grok-like regex and set multiple tags in a single run.
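
To illustrate the "multiple tags in a single run" idea from the last bullet, here is a minimal Go sketch that pulls all seven values out of the "instance" string with one regular expression and named capture groups. It is not Telegraf code: the function name extractTags and the character classes are assumptions that only approximate the grok USERNAME, HOSTNAME and DATA patterns.

```go
package main

import (
	"fmt"
	"regexp"
)

// instancePattern approximates the grok pattern from the report with plain
// named capture groups. The character classes are simplified assumptions,
// not the exact grok definitions.
var instancePattern = regexp.MustCompile(
	`^(?P<CHASSIS>[A-Za-z0-9._-]+)-(?P<INSTANCE>[A-Za-z0-9._-]+) ` +
		`(?P<NODE>[A-Za-z0-9.-]+) ` +
		`(?P<ORIGINREALM>[A-Za-z0-9.-]+):(?P<ORIGINHOST>[A-Za-z0-9.-]+):` +
		`(?P<REALM>[A-Za-z0-9.-]+):(?P<ANSWERHOST>.*)$`)

// extractTags (hypothetical helper) returns one map entry per named group,
// or false when the input does not match the pattern.
func extractTags(instance string) (map[string]string, bool) {
	m := instancePattern.FindStringSubmatch(instance)
	if m == nil {
		return nil, false
	}
	tags := make(map[string]string)
	for i, name := range instancePattern.SubexpNames() {
		if i == 0 || name == "" {
			continue // skip the whole-match entry and unnamed groups
		}
		tags[name] = m[i]
	}
	return tags, true
}

func main() {
	in := "UTC-KEY-RE1-V6 random1.mit.edu orig.realm.net:node.orig.realm.net:dest.realm.net:ans.dest.realm.net"
	tags, ok := extractTags(in)
	if !ok {
		fmt.Println("no match")
		return
	}
	fmt.Println(tags["CHASSIS"], tags["INSTANCE"], tags["ANSWERHOST"])
}
```

Every capture, including the last one, ends up as a tag candidate here, which is the behavior the commented-out grok pattern was expected to produce.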
@joriws
Author

joriws commented Jan 24, 2019

I see that with processors.converter I could convert the last capture, "ANSWERHOST", to a tag. But I feel I should be able to do this with a single processor.

[[processors.converter]]
  order=2
  fielddrop = [ "instance","date","host" ]
  [processors.converter.fields]
    tag = ["ANSWERHOST"]

@danielnelson danielnelson added the bug unexpected problem or unintended behavior label Jan 24, 2019
@joriws
Author

joriws commented Jan 25, 2019

One part of the solution, I think, would be to allow processors.parser to work with tags as well. According to my experiments and the documentation, it currently works only with fields. Why can't I, as a potential Telegraf user, create new tags from an existing tag? There are numerous cases in my environment, such as separating the cluster name from a hostname, where I want all the results to be tags. Doing this via fields is a dirty trick that requires a lot of unnecessary processing.

@glinton
Contributor

glinton commented Feb 1, 2019

Removing this line resolves this issue, but I'm not sure what the consequences are.

@joriws
Author

joriws commented Feb 1, 2019

I think that, in general, a better option (versus removing this line) would be to allow processors.parser to also be used with

parse_tags = ["instance"]

@danielnelson
Contributor

We should switch that check to log a warning at debug level. Any metrics without fields will be removed at the next stage of processing, but since this metric is being merged it will produce a complete metric with fields.
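
A rough Go sketch of that idea follows, assuming a simplified, hypothetical Metric type and mergeParsed helper rather than Telegraf's real interfaces. A parse result that carries only tags is logged with a debug-style "D!" prefix and then merged, so the combined metric still has the original fields.

```go
package main

import (
	"fmt"
	"log"
)

// Metric is a hypothetical, simplified stand-in for a Telegraf metric.
type Metric struct {
	Name   string
	Tags   map[string]string
	Fields map[string]interface{}
}

// mergeParsed copies tags and fields from parsed into orig, overriding
// existing keys in orig's maps. A parse result without fields is only
// logged, not rejected, because orig already supplies the fields.
func mergeParsed(orig, parsed Metric) Metric {
	if len(parsed.Fields) == 0 {
		log.Printf("D! parsed metric %q has no fields; merging tags only", parsed.Name)
	}
	for k, v := range parsed.Tags {
		orig.Tags[k] = v
	}
	for k, v := range parsed.Fields {
		orig.Fields[k] = v
	}
	return orig
}

func main() {
	orig := Metric{
		Name:   "exec",
		Tags:   map[string]string{"host": "na000teleflow1"},
		Fields: map[string]interface{}{"Incoming Requests": int64(22)},
	}
	// All grok captures were marked as tags, so Fields is empty.
	parsed := Metric{
		Name:   "exec",
		Tags:   map[string]string{"ANSWERHOST": "ans.dest.realm.net"},
		Fields: map[string]interface{}{},
	}
	merged := mergeParsed(orig, parsed)
	fmt.Println(merged.Tags["ANSWERHOST"], len(merged.Fields))
}
```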
