[Platform Observability] Create initial PO package for ingesting kibana ECS formatted logs #3622
Conversation
...form-observability/data_stream/audit/elasticsearch/ingest_pipeline/kibana-audit-logs-ecs.yml
packages/platform-observability/data_stream/audit/elasticsearch/ingest_pipeline/default.yml
The first time I made the changes I hit this, which reverted them:
Maybe due to me having run elastic-package on an older version. Thought I'd mention it in case other testers bumped into the same thing.
heh, so far my attempts at starting the stack are failing. Will have to dig:
Interestingly enough, the kibana healthcheck checks for the ES security index, which is red. Go figure 😆
ah!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Success! I also had to do this stuff to get your package to show up https://github.com/elastic/integrations/blob/main/docs/developer_workflow_design_build_test_integration.md#build
The elastic-package stack was surprisingly hard to set up. A full docker disk failed in a way that looked like kibana was at fault, and I had to import & trust the self-signed SSL cert. Either I'm holding it wrong or we have some work to do there. :)
Oh, and I'm not sure where the kibana logo SVG is from. It's a little different from https://brand.elastic.co/302f66895/p/41ff52-solution-logos/b/14d954/i/11013297 - maybe good to re-import?
packages/platform-observability/data_stream/audit/fields/ecs.yml
required: true
show_user: true
default:
  - /tmp/service_logs/kibana.log
I see a /usr/share/kibana/logs directory in the container. Wondering if we should default to that path and update the elastic-package stack bindmounts to go there.
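If we default to that path, the change would just be in the stream var of the data stream manifest. A minimal sketch, assuming the standard logfile input and package-spec var keys (the title and filename here are illustrative, not what the package ships):

```yaml
streams:
  - input: logfile
    vars:
      - name: paths
        type: text
        title: Kibana log paths
        multi: true
        required: true
        show_user: true
        default:
          - /usr/share/kibana/logs/kibana.log
```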
Yeah. Should we also enable Kibana logging by default?
Definitely :)
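For reference, enabling a JSON file appender in kibana.yml for the stack would look roughly like this; a sketch only, where the appender name and path are assumptions:

```yaml
logging:
  appenders:
    json_file:
      type: file
      fileName: /usr/share/kibana/logs/kibana.log
      layout:
        type: json
  root:
    # keep the default console appender and add the file appender
    appenders: [default, json_file]
```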
- {{this}}
{{/each}}
{{#if processors}}
{{processors}}
I'm guessing we can add "filebeat" processors (like https://www.elastic.co/guide/en/beats/filebeat/current/decode-json-fields.html) to the manifest and have them show up here. Or maybe just put them in this file directly.
Not sure what's in fashion for packages today regarding reader-side vs. ES-side processing, but personally I like to push as many simple transformations as I can to the collection point. Maybe @mtojek can advise.
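Something along these lines rendered into the stream config, assuming we go the reader-side route (the option values are illustrative, not a recommendation):

```yaml
processors:
  - decode_json_fields:
      # parse the raw ECS JSON line in "message" and merge it into the event root
      fields: ["message"]
      target: ""
      overwrite_keys: true
      add_error_key: true
      expand_keys: true
```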
I did a quick test and it seems to work. I'll explore it a bit more. It would simplify the pipeline config.
this is all I have for the pipeline:
[
{
"remove": {
"field": "data_stream.dataset",
"ignore_missing": true
}
},
{
"remove": {
"field": "event.dataset",
"ignore_missing": true
}
},
{
"set": {
"description": "Uses event.dataset as a default for data_stream.dataset if the latter is not set.",
"field": "data_stream.dataset",
"copy_from": "event.dataset",
"if": "ctx.event?.dataset instanceof String && ctx.event.dataset.length() > 1",
"override": false
}
},
{
"script": {
"source": "ctx.data_stream.dataset = /[\\/*?\"<>|, #:-]/.matcher(ctx.data_stream.dataset).replaceAll('_')\n",
"if": "ctx.data_stream?.dataset != null"
}
},
{
"script": {
"source": "ctx.data_stream.namespace = /[\\/*?\"<>|, #:]/.matcher(ctx.data_stream.namespace).replaceAll('_')\n",
"if": "ctx.data_stream?.namespace != null"
}
},
{
"set": {
"field": "data_stream.type",
"value": "logs"
}
},
{
"set": {
"field": "data_stream.dataset",
"value": "kibana-logs",
"override": false
}
},
{
"set": {
"field": "data_stream.namespace",
"value": "platform-observability",
"override": false
}
},
{
"set": {
"field": "event.dataset",
"copy_from": "data_stream.dataset"
}
},
{
"rename": {
"field": "message",
"target_field": "event.original",
"ignore_missing": true
}
}
]
A downside of this approach is that the tests are not able to use the filebeat processor to parse the logs.
Another surprise here :) Hoping we can have some integration experts weigh in. Are we maybe not supposed to use processors anymore?
I used to follow this rule of thumb: are you collecting large volumes of observability data, and is the network loaded? If so, limit the data with processors.
We had a Slack chat with @crespocarlos and decided to go with the ingest pipeline due to code simplicity and the convenient use of pipeline tests.
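For anyone following along: the pipeline tests are raw log lines under _dev/test/pipeline/ plus a small config and a generated *-expected.json file. A sketch of what a common config can look like, if I'm reading the elastic-package docs right (the values are illustrative):

```yaml
# _dev/test/pipeline/test-common-config.yml (sketch)
fields:
  # constant fields added to every test event
  data_stream.namespace: default
dynamic_fields:
  # fields that change between runs are matched by regex instead of exact value
  event.ingested: ".*"
```

The expected output files can then be (re)generated with elastic-package test pipeline --generate, if I remember the flag correctly.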
Are you collecting large observability data and is the network loaded?
Interesting. Does the end user get to set processors? I'd expect network load to be a factor of deployment size.
I have changed and polished the PoC as I started to understand a bit better how to create integration packages.

Solutions

For kibana/audit logs, the pipeline is configured to parse the JSON message using filebeat processors. With this approach, I noticed that the tests are not able to properly parse the JSON using the filebeat processor, but Kibana is. For kibana/logs, I kept the configuration from the previous iteration, using a similar approach as in #2972. The question is: which approach is the best?

Field files

Field files can have an impact in some ways. If we follow the approach for ... During my tests, I noticed that ECS fields parsed from a JSON object don't necessarily need to be documented in the package, and only what's documented in ...
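For context on the field files point: in packages, ECS fields are typically referenced from the upstream ECS definitions rather than redefined, so fields/ecs.yml ends up as a list along these lines (which fields to include is exactly the open question; this selection is illustrative):

```yaml
- external: ecs
  name: ecs.version
- external: ecs
  name: event.dataset
- external: ecs
  name: log.level
- external: ecs
  name: message
```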
...-observability/data_stream/audit/_dev/test/pipeline/test-kibana-audit-logs.log-expected.json
🌐 Coverage report
I've started a doc while playing with Integrations and updated it with some findings from this document https://github.com/elastic/observability-dev/pull/2253. Feel free to add anything you feel is relevant :)
Is this implementation a drop-in replacement for the one we have in the filebeat module? If we go the single-package route for SM and PO integration, it'll be interesting to align both implementations. Also, for a single package we'll have to carry over all the legacy ingest pipelines as well, unless we decide to only support starting from v8.
Good point @klacabane. I focused more on ingesting the ECS log as-is, but there are fields in the existing Kibana package that haven't been ported to this new package, and fields from Kibana logs that I didn't remove in this new pipeline. So I'll probably have to align at least the missing fields and/or make sure the new pipeline generates the same expected output. Since Kibana generates ECS-formatted logs starting from v8 (correct me if I'm wrong), I think we'll probably support this new PO package starting from v8. @matschaffer, what's your opinion here?
I wasn't thinking we'd try to support anything prior to 8 with this package for now. That will get handled by existing modules.
packages/platform_observability/data_stream/kibana_log/manifest.yml
packages/platform_observability/data_stream/kibana_audit/manifest.yml
👍
I didn't review the data stream ingestion as I don't have domain-specific knowledge. I will leave that to your team.
I had a couple of final-polish questions. But overall this is looking pretty good.
I guess once we add ES we'll need to rethink the package logo (maybe just use the elastic "cluster" logo), but at least it's an up-to-date kibana logo for now. Thanks for that :)
...es/platform_observability/data_stream/kibana_audit/_dev/test/pipeline/test-common-config.yml
...es/platform_observability/data_stream/kibana_audit/elasticsearch/ingest_pipeline/default.yml
...ages/platform_observability/data_stream/kibana_log/elasticsearch/ingest_pipeline/default.yml
packages/platform_observability/data_stream/kibana_log/fields/ecs.yml
}
},
"kibana": {
"session_id": "ccZ0sbxrmmJwo+/y2Mn1tmGIrKOuZYaF8voUh0SkA/k=",
@jkakavas this feels like it could be risky to log (from a Kibana audit log sample). Do you know if we should chase it down, or with whom? Or maybe you already know it's safe?
Why would it be risky, in your opinion? Looking at the docs, it seems it's an ID associated with the current login.
ah, thanks that helps.
At least this is nothing new so if it's an issue we don't have to fix it in this PR.
I just get a little nervous whenever I see a base64 string in a log stream. Too many occasions when they turned out to be access-providing tokens (like JWT). Hopefully this is just an encoded UUID. 😅
...orm_observability/data_stream/kibana_log/elasticsearch/ingest_pipeline/ecs-logs-pipeline.yml
I think this is probably good to merge, especially since it's 0.0.1 so shouldn't end up in a released UI (I think).
This is the deepest I've gone into integrations/packages so could be good to have @klacabane or @neptunian give a look too. Either in this PR or the next one.
required: true
show_user: true
default:
  - /var/log/kibana/kibana.stdout
Oh, sorry about the late feedback, but it took an hour for the stack to start today. Seems like this should be kibana.log, no? That's what appears in https://www.elastic.co/guide/en/kibana/master/logging-configuration.html anyway.
Good catch, I've also aligned the audit log file name with what's in the Kibana docs.
🎉
What does this PR do?
This PR is the initial work for creating the new Platform Observability package. We will start with ingesting Kibana logs, but its capabilities will be extended over time. This package is heavily based on the solution provided in #2972.
Checklist
- I have added an entry to the package's changelog.yml file.
Author's Checklist
How to test this PR locally
In order to test this, we first need to edit the configuration used by elastic-package:
- ~/.elastic-package/profiles/default/stack/kibana.config.8x.yml ...
- ~/.elastic-package/profiles/default/stack/snapshot.yml, adding "../../../tmp/service_logs:/usr/share/kibana/.kbn-logs" to the list of volumes for the kibana container, to enable the agent to see the logs

Then:
- Go to the packages/platform_observability folder and run elastic-package build
- Run elastic-package stack up -v -d
- Go to Integrations and install Platform Observability with the default configuration, adding it to the Elastic-Agent host
Related issues
Screenshots
Integration page

Kibana logs

Kibana audit logs
