-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update output config docs #2597
Conversation
successfully build docs via: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Left some minor comments.
@@ -106,10 +106,6 @@ The number of workers per configured host publishing events to Elasticsearch. Th | |||
is best used with load balancing mode enabled. Example: If you have 2 hosts and | |||
3 workers, in total 6 workers are started (3 for each host). | |||
|
|||
===== port |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More as a historical remark: port was removed as it would not work will with hosts
.
The index root name to write events to. The default is the Beat name. | ||
For example "{beatname_lc}" generates "[{beatname_lc}-]YYYY.MM.DD" indexes (for example, | ||
"{beatname_lc}-2015.04.26"). | ||
The index name to write events to. The default is "%\{{beatname_lc}-%\{yyyy.MM.dd}" (for example, "{beatname_lc}-2015.04.26"). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the effect of %\
here? Do you have on purpose {{ before beatname_lc?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Array of pipeline selector configurations supporting conditionals, *Format String* | ||
based field access and name mappings. The first rule matching will be used to | ||
set the `pipeline` for the event to be published. If `pipelines` is missing or no | ||
rule matches, the `pipeline` field will be used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar example as with indices would be nice here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too much repetition.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like repetition either, but there's no guarantee with reference info that users will read from top to bottom, so some repetition might be necessary (or at least a pointer to the supporting info).
|
||
*`default`*: Default string value if `mapping` does not find a match. | ||
|
||
*`when`*: Condition which must succeed in order to execute the current rule. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can have a general example for topics, pipelines, indices we can link to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 We should have a usage type topic that describes, for example, how to set the pipeline dynamically. Having the reference content is good enough for Beta, though.
|
||
===== key | ||
|
||
The Kafka event key. The event key must be unique and can be extracted from the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if no key is set? Is it by default auto generated by Kafka?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
keys in kafka are optional. if key is set and reused, the old content will be overwritten. By never changing content, keys can be used for some artificial kind of deduplication.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add a note that the key is optional. From the current description it looks like it is required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
`round_robin`, or `hash`. By default the `hash` partitioner is used. | ||
|
||
*`random.group_events`*: Select a new partition by random every `group_events` | ||
events being published. The default value is 1. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value is 1 which means after each event a new parition is picked randomly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
LGTM |
efbb7d8
to
38945b2
Compare
Update output config docs to match most recent changes in all outputs removing settings not being available, updating docs for changed fields and adding missing fields.
38945b2
to
bd61bdf
Compare
Kafka output broker event partitioning strategy. Must be one of `random`, | ||
`round_robin`, or `hash`. By default the `hash` partitioner is used. | ||
|
||
*`random.group_events`*: Select a new partition by random every `group_events` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't parse this sentence. Maybe there is a missing word?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No missing word. Using group_events
is supposed to use the name as numeric replacement. I want to express that the number configured in group_events
sets how many consecutive published events will be send to the same partition, before the partitioner selects a new partition to use for the next set events being published.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sort of substitution works better when the name of the option clearly indicates what it represents. In this case, it's not clear by the name that group_events
will resolve to a number, so some readers will stumble over how the sentence is constructed. I realize you're trying to avoid using a lot of words, but it's better to be explicit like you've been in your explanation to me. Also, unless you're telling the user to do something, avoid using the imperative. When someone reads "select", they think they need to select something. It might take them a few seconds to realize that you're essentially saying "configure this option so that the software selects." Does that make sense?
events being published. The default value is 1 meaning after each event a new | ||
parition is picked randomly. | ||
|
||
*`round_robin.group_events`*: Select the next partition every `group_events` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as line 599
*`hash.random`*: Randomly distribute events if no hash or key value can be computed. | ||
|
||
All partitioners will try to publish events to all partitions by default. If a | ||
partitions leader becomes unreachable for the beat, the output might block. All |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
partition's leader
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
partitions leader becomes unreachable for the beat, the output might block. All | ||
partitioners support setting `reachable_only` to overwrite this | ||
behavior. If `reachable_only` is set to `true`, events will be published to | ||
available partitions only. Note: publishing to the subset of available partitions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this a proper note by starting it on a separate line and beginning with "NOTE: ". Also, saying "only" makes the sentence grammatically ambiguous in English. I would say something like:
NOTE: Publishing to a subset of available partitions potentially increases resource usage because events may become unevenly distributed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
===== indices | ||
|
||
Array of index selector rules supporting conditionals, *Format String* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description (and the other ones like it in the doc) needs work. It took me a couple of times reading it before I realized that you're describing how to set the index dynamically. I think you're also saying that the index will be set based on the first item in the array that results in a match, but I am not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
copy'n past ;)
I think you're also saying that the index will be set based on the first item in the array that results in a match
correct.
I did try to make it 'short, as these settings are quite 'complicated' in comparison to most other settings. indices
(and the others) is an array of rules executed one after another. The first rule matching (conditionals match and format string doesn't fail) in the array of rules will set index the event is published to. If one rule did match, no other rule in the array will be check to not generate ambiguities in case of 2 rules potentially matching. If no rule did match, the index
field will be evaluated (which is basically another rule). Truth is index
is just another rule appended to the list of rules defined in indices
.
|
||
Rule settings: | ||
|
||
*`index`*: The index *Format String* to use. If the fields used are missing, the rule fails. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Format String shouldn't be capitalized because it's not a product component. Do you plan to point to the reference content that you wrote about format strings? I'm thinking about relocating the content about the config file to the guides for individual Beats, but I won't do that for now, so use the {libbeat}/filename.html[link text]
convention to add links for now. I will clean up the links when I move the content. If you want, I can add links after this PR is merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changing all *Format String*
to format string
.
My idea with documenting all types in "config file format" includes (long-term goal) adding the type to all settings in the reference docs by name + link to the definition of said type. Having some docs in libbeat only and others reused among multiple beats I did struggle with the execution, though.
Update output config docs to match most recent changes in all outputs removing settings not being available, updating docs for changed fields and adding missing fields.
Update output config docs to match most recent changes in all outputs removing settings not being available, updating docs for changed fields and adding missing fields.
Update output config docs to match most recent changes in all outputs removing
settings not being available, updating docs for changed fields and adding
missing fields.
Add output plugin settings for: #2482 and #2074