Support for wildcards and override option for dot_expander processor #74601
Pinging @elastic/es-core-features (Team:Core/Features)
...s/ingest-common/src/test/java/org/elasticsearch/ingest/common/DotExpanderProcessorTests.java
Co-authored-by: Dan Hermann <[email protected]>
Does this address issue #36950?
Yep, it does.
@danhermann I have implemented your suggestions. Could you have another look?
@elasticmachine run elasticsearch-ci/docs
Some minor suggestions below. No need for another round of review if they can be incorporated without issue.
modules/ingest-common/src/main/java/org/elasticsearch/ingest/common/DotExpanderProcessor.java
Co-authored-by: Dan Hermann <[email protected]>
a1d46d6 to aa2a673
Thanks, LGTM. 👍
💚 Backport successful
Adds the ability to `override` conflicting properties instead of converting them to an array.

Also adds support for specifying a wildcard `*` to apply the dot expansion to every property.

Background: This is important for creating a parsing pipeline for ECS JSON logs. For human readability, and because of the way some logging frameworks work (appending chars to a stream rather than maintaining a dictionary for JSON objects), ECS loggers use dotted field names and nested objects interchangeably (`foo.bar: baz`, `foo: { bar: baz }`). As not all fields are known upfront, we need to dedot all fields.

The good news is that we don't have to do that recursively, as a path is either fully dotted or fully nested. That is, we don't have to consider cases like `foo: { bar.baz: qux }`.

The `override` property is important for this use case: `data_stream: { dataset: foo }, data_stream.dataset: bar`. Expanding `data_stream` would lead to `data_stream: { dataset: [foo, bar] }`, but this is not expected to be an array field. Instead, the dotted field should always override the nested field.
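
To make the two behaviors concrete, here is a minimal Python sketch of the semantics described above. It is not the processor's Java implementation: the function names `dot_expand` and `expand_field` are hypothetical, and raising an error when a parent path holds a non-object value is an assumption made for this illustration. Matching the description, only top-level keys are considered (no recursion).

```python
from copy import deepcopy


def expand_field(doc, dotted, override):
    """Move doc[dotted] to the nested path spelled out by its dots (in place)."""
    value = doc.pop(dotted)
    parts = dotted.split(".")
    node = doc
    for part in parts[:-1]:
        if part not in node:
            node[part] = {}
        elif not isinstance(node[part], dict):
            # Assumption for this sketch: a non-object parent is an error.
            raise ValueError(f"cannot expand {dotted!r}: {part!r} is not an object")
        node = node[part]
    leaf = parts[-1]
    if leaf not in node or override:
        # No conflict, or override requested: the dotted value wins.
        node[leaf] = value
    elif isinstance(node[leaf], list):
        node[leaf].append(value)
    else:
        # Conflict without override: collect both values in an array.
        node[leaf] = [node[leaf], value]


def dot_expand(doc, field="*", override=False):
    """Return a copy of doc with `field` (or every dotted top-level key) expanded."""
    result = deepcopy(doc)
    if field == "*":
        targets = [key for key in result if "." in key]
    else:
        targets = [field] if field in result else []
    for key in targets:
        expand_field(result, key, override)
    return result


doc = {"data_stream": {"dataset": "foo"}, "data_stream.dataset": "bar"}
print(dot_expand(doc))                 # {'data_stream': {'dataset': ['foo', 'bar']}}
print(dot_expand(doc, override=True))  # {'data_stream': {'dataset': 'bar'}}
```

Without `override`, the conflicting values end up in an array; with it, the dotted field replaces the nested one, which is the outcome wanted for `data_stream.dataset`.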