[Meta] Enable seamless ECS log onboarding for all log inputs #1454
Comments
Pinging @elastic/integrations (Team:Integrations)
I did some initial ad-hoc performance tests of the pipeline. Runtime for different types of logs:
Turns out that doing measurements with a local ES instance yields much more consistent and much faster results. I also did some CPU profiling to get more details on what's actually slow. Interestingly, the [...] On the other hand, conditional execution by doing a scripted pre-flight check on the message to detect whether it's ECS JSON is very cheap. There's another source of overhead that comes with any non-empty pipeline: Transforming the IndexRequest to a [...]
Hi! We just realized that we haven't looked into this issue in a while. We're sorry! We're labeling this issue as [...]
To make the ingest of ECS JSON logs more native and seamless, we want a user experience that does not require custom configuration, JSON parsing, or other processing configuration. Instead, we should detect whether a log comes in ECS JSON format and parse it appropriately.
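As a rough illustration of the detect-then-parse idea, here is a minimal Python sketch. It is not the POC pipeline from this issue. The function names are hypothetical, and the heuristic (the message looks like a JSON object and mentions the `@timestamp` and `ecs.version` keys) is an assumption about what a cheap pre-flight check could look for.

```python
import json
from typing import Any, Dict


def looks_like_ecs_json(message: str) -> bool:
    """Cheap pre-flight check: string inspection only, no JSON parsing.

    Assumption: ECS JSON log lines are single JSON objects that contain
    the dotted keys "@timestamp" and "ecs.version".
    """
    stripped = message.strip()
    return (
        stripped.startswith("{")
        and stripped.endswith("}")
        and '"@timestamp"' in stripped
        and '"ecs.version"' in stripped
    )


def parse_log_line(message: str) -> Dict[str, Any]:
    """Parse ECS JSON lines; keep everything else as a plain message."""
    if looks_like_ecs_json(message):
        try:
            parsed = json.loads(message)
        except json.JSONDecodeError:
            parsed = None  # heuristic matched, but the line is not valid JSON
        if isinstance(parsed, dict):
            return parsed
    # Fallback: leave the original line untouched in the message field.
    return {"message": message}
```

The string check runs first so that plain-text lines never pay for a failed JSON parse, and a line that passes the check but fails to parse still falls back to the unmodified message.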
We have identified integrations and Elasticsearch ingest node pipelines to be the best place to automatically detect and parse ECS JSON logs, for the following reason:
I have implemented some improvements to the ES processors that now make it possible to handle ECS JSON logs properly. I've also created a POC for an ES ingest node pipeline:
Click here to see POC ingest pipeline
Eventually, all log-input-type integrations should leverage this pipeline to automatically handle ECS JSON logs.
If the performance hit of the scripts used in the pipeline turns out to be an issue, we can think about implementing a dedicated processor in Elasticsearch with a pure Java implementation.
We may want to add an option to the integration settings for users to opt out of auto-detection of ECS JSON in case they're worried about the potential additional impact. That can work by adding a tag to the events and conditionally executing the ECS pipeline.
Where to start?
The custom logs integration comes to mind first but it seems there are some dependencies and open questions. As there are already integrations that include pipelines for CloudWatch and Azure logs, these might be a good start.
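To make the hand-off from an existing integration pipeline and the opt-out tag idea more concrete, the following Python sketch builds an ingest pipeline definition that calls a shared ECS pipeline only when an opt-out tag is absent. The pipeline name, the tag name, and the helper functions are hypothetical. The sketch assumes the standard `pipeline` processor with an `if` condition, and `should_run_ecs_pipeline` only mirrors that condition in Python for illustration.

```python
import json
from typing import Any, Dict

ECS_PIPELINE_NAME = "logs-ecs-json-pipeline"  # hypothetical pipeline name
OPT_OUT_TAG = "skip_ecs_json_detection"       # hypothetical opt-out tag


def build_integration_pipeline(
    ecs_pipeline: str = ECS_PIPELINE_NAME,
    opt_out_tag: str = OPT_OUT_TAG,
) -> Dict[str, Any]:
    """Return a pipeline body that delegates to the ECS pipeline unless
    the event carries the opt-out tag."""
    # Painless condition evaluated per event by the ingest node.
    condition = f"ctx.tags == null || !ctx.tags.contains('{opt_out_tag}')"
    return {
        "description": "Run ECS JSON auto-detection unless opted out",
        "processors": [
            {"pipeline": {"name": ecs_pipeline, "if": condition}},
        ],
    }


def should_run_ecs_pipeline(event: Dict[str, Any], opt_out_tag: str = OPT_OUT_TAG) -> bool:
    """Python mirror of the Painless condition above, for illustration."""
    tags = event.get("tags")
    return tags is None or opt_out_tag not in tags


if __name__ == "__main__":
    # Print the body that would be sent when creating the pipeline.
    print(json.dumps(build_integration_pipeline(), indent=2))
```

With this shape, opting out costs one tag lookup per event, and integrations that already ship a pipeline would only need one additional processor entry.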
Open questions