Add file system information to each event #36065

Merged (13 commits) on Jul 17, 2023

rdner
Member

@rdner rdner commented Jul 14, 2023

What does this PR do?

This includes:

  • Unix-like:
    • device (emitted as device_id, see the discussion below)
    • inode
  • Windows:
    • idxlo
    • idxhi
    • vol
  • All:
    • Fingerprint (if the fingerprint mode is enabled)
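
To make these fields a little more concrete, here is a minimal Python sketch of where such values typically come from. It is an illustration, not the Filebeat implementation: the function names are hypothetical, and it assumes that idxhi and idxlo are the high and low 32-bit halves of the Windows file index, which together form a single 64-bit file identifier.

```python
import os
import tempfile


def unix_file_identity(path):
    """Device ID and inode number as reported by stat() on Unix-like systems."""
    st = os.stat(path)
    return {"device_id": st.st_dev, "inode": st.st_ino}


def windows_file_index(idxhi, idxlo):
    """Combine the assumed high and low 32-bit halves into one 64-bit file index."""
    return (idxhi << 32) | idxlo


# Identity of a throwaway file on the current system.
with tempfile.NamedTemporaryFile(suffix=".log") as f:
    print(unix_file_identity(f.name))

# Sample idxhi/idxlo values taken from the Windows event shown later in this PR.
print(windows_file_index(327680, 99308))
```

Two paths that report the same device ID and inode (or the same volume and file index) refer to the same underlying file, which is what makes these values useful for troubleshooting.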

Why is it important?

This was requested by some customers and it makes it easier for us to troubleshoot.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • [ ] I have made corresponding changes to the documentation
  • [ ] I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Author's Checklist

  • Manual Test on Unix
  • Manual Test on Windows

How to test this PR locally

  1. Run Filebeat with the following configuration:
filebeat.inputs:
  - type: filestream
    id: my-filestream-id
    enabled: true
    paths:
      - "/path/to/your/logs/*.log"
    prospector.scanner.check_interval: 5s
    prospector.scanner.fingerprint.enabled: true
path.data: "/path/to/your/data"
logging:
  level: debug
output.console:
  enabled: true

Note that you need to replace the path placeholders.

  2. Create a file suitable for ingestion in your logs folder:
touch some.log && printf 'a%.0s' {1..1024} >> some.log && echo >> some.log

You should see this event printed to the console:

{
  "@timestamp": "2023-07-14T10:12:50.031Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.10.0"
  },
  "log": {
    "offset": 0,
    "file": {
      "inode": 119400347,
      "fingerprint": "2edc986847e209b4016e141a6dc8716d3207350f416969382d431539bf292e4a",
      "path": "/path/to/your/logs/some.log",
      "device_id": 16777234
    }
  },
  "message": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa",
  "input": {
    "type": "filestream"
  },
  "ecs": {
    "version": "8.0.0"
  },
  "host": {
    "name": "MacBook-Pro.localdomain"
  },
  "agent": {
    "name": "MacBook-Pro.localdomain",
    "type": "filebeat",
    "version": "8.10.0",
    "ephemeral_id": "f29a0401-d260-4e5d-8c88-d9bc3e5f3366",
    "id": "7be28e66-2f52-4873-857f-e4b419d4a5b5"
  }
}

On Windows you'd see:

{
  "@timestamp": "2023-07-14T11:10:09.363Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.10.0"
  },
  "input": {
    "type": "filestream"
  },
  "host": {
    "name": "DESKTOP-O7KLH6G"
  },
  "agent": {
    "id": "0495da3e-56f2-4f92-a52e-51fef0e6ecb9",
    "name": "DESKTOP-O7KLH6G",
    "type": "filebeat",
    "version": "8.10.0",
    "ephemeral_id": "e6da3978-33bd-402b-8308-3229f3b76f1d"
  },
  "ecs": {
    "version": "8.0.0"
  },
  "log": {
    "offset": 0,
    "file": {
      "path": "C:\\Users\\Admin\\Desktop\\test\\fs_metadata\\logs\\some.log",
      "idxhi": 327680,
      "idxlo": 99308,
      "vol": 2860917223,
      "fingerprint": "0e9d2f73e97662c9a156111ea556fc536782043f2a1cd69fecbe3b69cab64c9d"
    }
  },
  "message": "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
}

Note the file system metadata in the log.file section of the event.
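
The test file is padded to 1024 characters because, as far as I understand the fingerprint mode, it hashes a fixed-size prefix of the file with SHA-256 and skips files that are too short. The sketch below illustrates that idea in Python. The function name and the default offset of 0 and length of 1024 are assumptions made for the example, not values taken from this PR.

```python
import hashlib
import os
import tempfile


def fingerprint(path, offset=0, length=1024):
    """SHA-256 hex digest of `length` bytes starting at `offset`.

    Returns None when the file is smaller than offset + length.
    """
    if os.path.getsize(path) < offset + length:
        return None
    with open(path, "rb") as f:
        f.seek(offset)
        return hashlib.sha256(f.read(length)).hexdigest()


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "some.log")
    with open(path, "wb") as f:
        # Same content as the test command above: 1024 times "a" plus a newline.
        f.write(b"a" * 1024 + b"\n")
    print(fingerprint(path))
```

Because only the prefix is hashed, appending more lines to the file leaves the fingerprint unchanged.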

Related issues

@rdner added the enhancement, Filebeat, backport-skip, and Team:Elastic-Agent-Data-Plane labels Jul 14, 2023
@rdner self-assigned this Jul 14, 2023
@botelastic added and then removed the needs_team label Jul 14, 2023
@elasticmachine
Collaborator

elasticmachine commented Jul 14, 2023

💚 Build Succeeded


Build stats

  • Start Time: 2023-07-17T13:44:17.030+0000

  • Duration: 84 min 44 sec

Test stats 🧪

Test Results

  • Failed: 0
  • Passed: 27517
  • Skipped: 2029
  • Total: 29546

💚 Flaky test report

Tests succeeded.


@rdner rdner marked this pull request as ready for review July 14, 2023 13:14
@rdner rdner requested a review from a team as a code owner July 14, 2023 13:14
@elasticmachine
Collaborator

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@rdner rdner requested a review from pchila July 14, 2023 13:49
@ycombinator
Contributor

ycombinator commented Jul 14, 2023

"log": {
    "offset": 0,
    "file": {
      "inode": "119400347",
      "fingerprint": "2edc986847e209b4016e141a6dc8716d3207350f416969382d431539bf292e4a",
      "path": "/path/to/your/logs/some.log",
      "device": "16777234"
    }
  }

"log": {
    "offset": 0,
    "file": {
      "path": "C:\\Users\\Admin\\Desktop\\test\\fs_metadata\\logs\\some.log",
      "idxhi": "327680",
      "idxlo": "99308",
      "vol": "2860917223",
      "fingerprint": "0e9d2f73e97662c9a156111ea556fc536782043f2a1cd69fecbe3b69cab64c9d"
    }
  }

Since we're adding what looks like 3-4 string fields to every event, I wonder how this will impact performance. Should we do some benchmarking first? Or put this behind a configuration setting (either enabled by default so we can turn it off in case of any reported performance problems or disabled by default so we can turn it on only for customers that want this information)?

@ycombinator
Contributor

Also, should fields like idxhi and idxlo be string fields? I don't have any domain knowledge about these fields but they seem like fields that might be useful for range queries, so integers might be more appropriate?
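
A small Python sketch of why the type matters for range comparisons. It is only an analogy for how string-typed and numeric fields are ordered, and the range bounds are made up for the example: strings compare character by character, so a numeric range check on string values can give the wrong answer.

```python
# idxlo and idxhi samples from this PR, plus the largest 32-bit value.
values = [99308, 327680, 4294967295]

# Numeric ordering versus lexicographic ordering of the same values.
as_ints = sorted(values)
as_strings = sorted(str(v) for v in values)
print(as_ints)     # [99308, 327680, 4294967295]
print(as_strings)  # ['327680', '4294967295', '99308']

# A range check between 100000 and 500000 (hypothetical bounds).
in_range_int = [v for v in values if 100000 <= v <= 500000]
in_range_str = [s for s in map(str, values) if "100000" <= s <= "500000"]
print(in_range_int)  # [327680]
print(in_range_str)  # ['327680', '4294967295']
```

The string comparison wrongly accepts 4294967295 because "4" sorts before "5", which is the kind of surprise that integer fields avoid.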

@rdner
Member Author

rdner commented Jul 14, 2023

@ycombinator we've already discussed the performance impact in the original issue and agreed the value for us is more important.

The biggest field here is the fingerprint, which is already opt-in: the field is present only if the fingerprint mode is active. The rest are just very short strings.

@rdner
Member Author

rdner commented Jul 14, 2023

@ycombinator also adding to my previous comment: the length of all the added fields is constant and they're attached only to events produced by the filestream input. So, the impact is not global.

@ycombinator
Contributor

ycombinator commented Jul 14, 2023

It makes me a bit uneasy that we don't actually know what the performance impact of these additional fields on every filestream event would be. But I see the points that the sizes of these fields are constant and that, as a percentage of the overall event size, the increase might not be significant.

The only question I have remaining, then, is #36065 (comment).

@rdner
Member Author

rdner commented Jul 14, 2023

@ycombinator don't get me wrong, I'm all for testing the performance here. We can do it once @alexsapran is available. If you think we should hold this PR until then, I'm fine with that.

Regarding the use of integers in #36065 (comment) I made the change here abc9643

@ycombinator
Contributor

ycombinator commented Jul 14, 2023

> @ycombinator don't get me wrong, I'm all for testing the performance here. We can do it once @alexsapran is available. If you think we should hold this PR until then, I'm fine with that.

I'm 99% sure the performance testing is going to agree with our speculation that the performance impact is non-existent or negligible. But I think it's worth waiting for it only because, if we don't have proof, once this is released we won't have a way to undo or disable it in case there are negative consequences.

That being said, we do have another month to 8.10.0 feature freeze so I'd be okay with merging this PR as-is provided we can be assured of doing the performance testing before that deadline. Maybe make an issue and put it in the next sprint?

> Regarding the use of integers in #36065 (comment) I made the change here abc9643

Thanks!

@rdner
Member Author

rdner commented Jul 14, 2023

@pierrehilbert @nimarezainia @cmacknz opinions?

Should we add a new flag to the filestream input configuration to make these fields optional?

The maximum impact from this change is an event size increase by:

  • 141 bytes on Windows ("idxhi": 4294967295,"idxlo": 4294967295,"vol": 4294967295,"fingerprint": "2edc986847e209b4016e141a6dc8716d3207350f416969382d431539bf292e4a",)
  • 144 bytes on Unix ("inode": 18446744073709551615,"device": 18446744073709551615,"fingerprint": "2edc986847e209b4016e141a6dc8716d3207350f416969382d431539bf292e4a",)

I personally think that the input configuration is already too polluted with options, and I'm not sure it's worth adding a new option to save ~145 bytes.
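
The worst-case estimate above can be roughly reproduced with a short Python sketch that serializes the maximum values and counts bytes. The helper name is hypothetical, and the exact count depends on how separators and whitespace are counted, so the result lands within a few bytes of the quoted figures rather than matching them exactly.

```python
import json

FINGERPRINT = "2edc986847e209b4016e141a6dc8716d3207350f416969382d431539bf292e4a"
UINT32_MAX = 2**32 - 1
UINT64_MAX = 2**64 - 1

windows_fields = {
    "idxhi": UINT32_MAX,
    "idxlo": UINT32_MAX,
    "vol": UINT32_MAX,
    "fingerprint": FINGERPRINT,
}
unix_fields = {
    "inode": UINT64_MAX,
    "device": UINT64_MAX,
    "fingerprint": FINGERPRINT,
}


def added_bytes(fields):
    """Size of the serialized fields without the enclosing braces."""
    # Same layout as the fragments quoted above: "," between fields, ": " after keys.
    body = json.dumps(fields, separators=(",", ": "))[1:-1]
    return len(body.encode("utf-8"))


print("Windows:", added_bytes(windows_fields))
print("Unix:", added_bytes(unix_fields))
```

Renaming device to device_id, as discussed later in this thread, adds three more bytes on Unix.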

@ycombinator
Contributor

To be clear, my suggestion for a new configuration flag to enable/disable these fields is only because we haven't (yet) performance-tested the impact of this change — a change that adds ~145 bytes to every filestream event. If we do the performance testing and it proves that the impact is nonexistent or negligible, I don't think we need a flag.

@pierrehilbert
Collaborator

As mentioned in the issue itself when you created it, I agree that we should validate the perf impact of this change to ensure that it's as small as we expect.
@alexsapran can you help us here or, better yet, give us everything we need to run the tests by ourselves?

@rdner
Member Author

rdner commented Jul 16, 2023

@pierrehilbert see my comment above #36065 (comment)

@alexsapran
Contributor

Hi all, sorry for the late reply, I was out sick.

Looking at the discussion I can definitely help, but at the same time, I want to make sure we are talking about the same things when we say performance.
What performance implications do we anticipate this PR to have? Is it only the additional 145 bytes for each event on the output side, or do we also need to calculate the fingerprint and file info for each event?
I can sync with Denis and provide an environment in which we can test this out, but from reading this it's not clear to me which aspect of the performance impact we are talking about.

@alexsapran
Contributor

Discussed it with @rdner, who provided me with the information needed. I will run some verifications and get back to this thread.

@rdner
Member Author

rdner commented Jul 17, 2023

@ycombinator as @alexsapran pointed out to me, in ECS https://www.elastic.co/guide/en/ecs/current/ecs-file.html#field-file-device device is a keyword for some reason.

So, what should we do, keep the integer because it makes sense, or be ECS compliant with a string value?

I suppose we cannot make a change to ECS, can we?

`device` in ECS is not a numerical ID, it's a string name of the device.
@rdner
Member Author

rdner commented Jul 17, 2023

@ycombinator I've just realised that in ECS they mean a name of the device, not its ID, so I renamed my new field to device_id here f8582d6.

@alexsapran
Contributor

Overall it would be nice to port those changes back to ECS so we keep the FB output ECS compliant when possible. I know this is more involved, but it's something to keep in mind for future work, especially when adding new fields.

@rdner
Member Author

rdner commented Jul 17, 2023

@alexsapran well, technically the changes in this PR are compliant. It's just that ECS does not have the additional fields that we have here. I'm not even sure it should.

@alexsapran
Contributor

I ran the benchmarks for two use cases: JSON-formatted logs and unstructured logs.

JSON

The data show that we were able to achieve the same amount of EPS. I noticed a slight increase in the output throughput metrics, from 20.9MB/s to 21.6MB/s, which is expected given that we are increasing the data we ship.

Unstructured RAW

When I was running the benchmark I noticed that the EPS increased after this PR, not including the changes from f8582d6, which were pushed after I started.
Similar to the JSON case, we also saw an increase in the output throughput, from 30MB/s to 33.1MB/s, which is normal given the additional data we send.

All of the above was done with compression level 0, which is the default compression.

I am re-running the unstructured tests and looking into them.

Overall I don't see this PR adding any performance regression on the output side of the process.
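
For reference, the relative throughput increase implied by the numbers above can be worked out with a few lines of Python. This is plain arithmetic on the reported figures and assumes that both unstructured measurements are in MB/s.

```python
def percent_increase(before, after):
    """Relative increase from `before` to `after`, in percent."""
    return (after - before) / before * 100


json_increase = percent_increase(20.9, 21.6)  # JSON logs, MB/s
raw_increase = percent_increase(30.0, 33.1)   # unstructured logs, MB/s

print(f"JSON: +{json_increase:.1f}%")         # JSON: +3.3%
print(f"Unstructured: +{raw_increase:.1f}%")  # Unstructured: +10.3%
```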

Contributor

@ycombinator ycombinator left a comment

LGTM!

@rdner rdner merged commit e4d287f into elastic:main Jul 17, 2023
@rdner rdner deleted the fs-info-in-event branch July 17, 2023 16:45
rdner added a commit to rdner/beats that referenced this pull request Aug 12, 2023
In elastic#36065 a few fields were
renamed in order to clarify their purpose. Unfortunately, this rename
was a part of a new feature PR which was not supposed to be backported
to 7.17.

However, backporting some other changes related to this code
has now become challenging and results in build failures (note the
fixing commits):

- elastic#36095
- elastic#36264

This PR makes the naming consistent with the main branch, so we can
easily backport changes.
Scholar-Li pushed a commit to Scholar-Li/beats that referenced this pull request Feb 5, 2024
Labels: backport-skip, enhancement, Filebeat, Team:Elastic-Agent-Data-Plane
Linked issue: Include device ID and inode in each event ingested from a file
6 participants