Support for wildcards and override option for dot_expander processor #74601

Merged

felixbarny
Member

@felixbarny felixbarny commented Jun 28, 2021

Adds an override option that replaces conflicting properties instead of converting them to an array.
Also adds support for specifying a wildcard (*) as the field, which applies the dot expansion to every property.

Background: This is important for creating a parsing pipeline for ECS JSON logs.
For human readability, and because some logging frameworks build JSON by appending characters to a stream rather than maintaining a dictionary of JSON objects, ECS loggers use dotted field names and nested objects interchangeably (foo.bar: baz, foo: { bar: baz }). As not all fields are known upfront, we need to dedot (expand) all fields.
The good news is that we don't have to do that recursively, as a path is either fully dotted or fully nested. That is, we don't have to consider cases like foo: { bar.baz: qux }.
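To illustrate the wildcard behavior described above, here is a minimal Python sketch. It is not the Elasticsearch implementation (which is written in Java); the function name expand_all_top_level is hypothetical, and the sketch assumes that no dotted key conflicts with an existing nested value.

```python
def expand_all_top_level(doc):
    """Expand every top-level dotted key into nested dicts.

    Illustrative only: nested objects are copied as-is and never
    inspected, mirroring the "no recursion needed" observation.
    Assumes no dotted key conflicts with an existing nested value.
    """
    result = {}
    for key, value in doc.items():
        parts = key.split(".")
        target = result
        # Walk (and create) the intermediate objects for a dotted key.
        for part in parts[:-1]:
            target = target.setdefault(part, {})
        target[parts[-1]] = value
    return result


mixed = {"foo.bar": "baz", "log": {"level": "info"}, "message": "hi"}
expanded = expand_all_top_level(mixed)

# A dotted key inside a nested object is left untouched.
nested_dotted = expand_all_top_level({"foo": {"bar.baz": "qux"}})
```

In this sketch, foo.bar becomes foo: { bar: baz }, while the already nested log object and the plain message field pass through unchanged.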

The override property is important for this use case: data_stream: { dataset: foo }, data_stream.dataset: bar. Without it, expanding the dotted field would lead to data_stream: { dataset: [foo, bar] }, but dataset is not expected to be an array field. Instead, the dotted field should always override the nested field.
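The difference between the two conflict strategies can be sketched in Python as follows. This is an illustration under stated assumptions, not the processor's actual code: the function expand_field and its override parameter are hypothetical names that only model the behavior described in this PR.

```python
import copy


def expand_field(doc, dotted_key, override):
    """Move doc[dotted_key] into its nested position.

    On a conflict with an existing nested value:
      override=False -> combine old and new values into an array
      override=True  -> the dotted field's value wins
    """
    doc = copy.deepcopy(doc)  # leave the caller's document untouched
    value = doc.pop(dotted_key)
    parts = dotted_key.split(".")
    target = doc
    for part in parts[:-1]:
        target = target.setdefault(part, {})
    leaf = parts[-1]
    if leaf in target and not override:
        existing = target[leaf]
        existing_list = existing if isinstance(existing, list) else [existing]
        target[leaf] = existing_list + [value]
    else:
        target[leaf] = value
    return doc


event = {"data_stream": {"dataset": "foo"}, "data_stream.dataset": "bar"}
merged = expand_field(event, "data_stream.dataset", override=False)
overridden = expand_field(event, "data_stream.dataset", override=True)
```

With override=False the sketch yields data_stream: { dataset: [foo, bar] }; with override=True it yields data_stream: { dataset: bar }, which is the result wanted for ECS logs.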

@felixbarny felixbarny added the :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP label Jun 28, 2021
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Jun 28, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-core-features (Team:Core/Features)

@danhermann danhermann self-requested a review June 28, 2021 10:19
@jakelandis
Contributor

Does this address issue #36950?

@felixbarny felixbarny linked an issue Jul 1, 2021 that may be closed by this pull request
@felixbarny felixbarny changed the title Support wildcards override option for dot_expander processor Support for wildcards and override option for dot_expander processor Jul 1, 2021
@felixbarny
Member Author

Yep, it does.

@felixbarny felixbarny requested a review from danhermann July 1, 2021 19:11
@felixbarny felixbarny added >enhancement auto-backport Automatically create backport pull requests when merged v7.15.0 v8.0.0 labels Jul 1, 2021
@felixbarny
Member Author

@danhermann I have implemented your suggestions. Could you have another look?

@danhermann
Contributor

@elasticmachine run elasticsearch-ci/docs

Contributor

@danhermann danhermann left a comment

Some minor suggestions below. No need for another round of review if they can be incorporated without issue.

@felixbarny felixbarny force-pushed the dot_expander-wildcard-override branch from a1d46d6 to aa2a673 Compare July 8, 2021 05:55
@danhermann
Contributor

Thanks, LGTM. 👍

@felixbarny felixbarny merged commit 0a8f725 into elastic:master Jul 8, 2021
@felixbarny felixbarny deleted the dot_expander-wildcard-override branch July 8, 2021 12:40
elasticsearchmachine pushed a commit to elasticsearchmachine/elasticsearch that referenced this pull request Jul 8, 2021
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
7.x

Labels
auto-backport Automatically create backport pull requests when merged :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP >enhancement Team:Data Management Meta label for data/management team v7.15.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expand dotted fields recursively
5 participants