message_stop_indicator for file source #1431
Thanks @anton-ryzhov, I was actually just discussing this exact problem on Docker with @MOZGIII. @MOZGIII what do you think about taking this as your next issue?
Right, this is exactly the problem we discussed. I can work on it.
@binarylogic @MOZGIII good to know. The Docker problem turns out to be more complex, so I started with this more universal feature request for a complementary message split. It could be useful beyond Docker. As for Docker itself (which probably deserves a separate ticket), I feel it should be handled at the next level, as a transform. But for that to work, messages need to be split properly at the "source" level. Detecting by Another reason to merge Docker logs at the transform step: docker logs could be read from Is it possible to have a transform that has access to the current and previous events [from the same source] and is able to keep, delete, or replace them? A kind of reducing function plus sliding window.
Sounds good! I was thinking about solving the particular case with Docker where the inner log messages are JSON. In that case, it'd be possible to just use a streaming JSON parser and join across multiple messages until the parser completes or reaches an unexpected token.
I want to share my approach/workaround. It doesn't work as expected because of this ticket, but maybe it will help you come up with a better solution.
Now that #1504 has landed, it should be possible to achieve what you're looking for here via a
When there is a universal So maybe the idea of this issue is outdated now.
It's an ongoing process to determine which sources should support built-in merging, and how. With built-in merges, we are looking for the following:
So, I guess this completes the list of items we consider when deciding whether to implement merging at the source. The current implementation has a design trait: it stashes partial events and won't release them until it can provide a merged one. Configuring this can be error-prone at times, and it can cause the transform to consume all system memory if used incorrectly. We should probably add this to the documentation so people are aware.
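To illustrate the memory concern, here is a minimal sketch of a stash that caps how many partial lines it holds. Everything in it (`BoundedStash`, `max_lines`, `push_partial`, `push_final`) is a hypothetical name made up for this example, not the project's actual implementation or configuration.

```rust
/// Hypothetical stash for partial events with a hard cap on buffered lines.
/// All names here are assumptions made for illustration only.
struct BoundedStash {
    max_lines: usize,
    partial: Vec<String>,
}

impl BoundedStash {
    fn new(max_lines: usize) -> Self {
        BoundedStash { max_lines, partial: Vec::new() }
    }

    /// Stash a partial line. Returns a merged event only when the cap is hit,
    /// so memory use stays bounded even if no final line ever arrives.
    fn push_partial(&mut self, line: &str) -> Option<String> {
        self.partial.push(line.to_string());
        if self.partial.len() >= self.max_lines {
            Some(self.flush())
        } else {
            None
        }
    }

    /// Stash the final line of an event and release the merged result.
    fn push_final(&mut self, line: &str) -> String {
        self.partial.push(line.to_string());
        self.flush()
    }

    /// Join everything stashed so far and empty the stash.
    fn flush(&mut self) -> String {
        let merged = self.partial.join("\n");
        self.partial.clear();
        merged
    }
}
```

Without a cap like `max_lines`, a stream of partial lines that never produces a final line would grow the stash indefinitely, which is the failure mode described above.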
We'll be adding some new parameters to In particular, in addition to
That makes sense to me. While we're adding these options, would it be possible to nest them under a new
[sources.in]
type = "file"
[sources.in.multiline]
start = "..."
continue_through = "..."
continue_past = "..."
halt_before = "..."
halt_with = "..."
What do you think? Also, notice the name is snake case.
I like the nesting! I think it'll map to the internal implementation better than the way we're doing it right now.
Do you like this?
I’m a fan. The ‘mode’ option looks good to me.
But for Don't just copy their config, you can make it much more intuitive.
For
For
Thanks, @anton-ryzhov! I've started working on the implementation, and I too noticed that we can probably do it even better than with just Your point is definitely valid, and it was an oversight in the initial design pass. What you're proposing looks better, and I'll consider modeling the configuration like that, because it also seems to be a better fit for the implementation needs. By the way, I've added a couple of real-world test cases, and I'm looking to add more to make our implementation easier to maintain. For example, for Java stack traces I have this:
let lines = vec![
    "java.lang.Exception",
    " at com.foo.bar(bar.java:123)",
    " at com.foo.baz(baz.java:456)",
];
let config = Config {
    start_pattern: Regex::new("^[^\\s]").unwrap(),
    condition_pattern: Regex::new("^[\\s]+at").unwrap(),
    mode: Mode::ContinueThrough,
    // ...
};
let expected = vec![
    concat!(
        "java.lang.Exception\n",
        " at com.foo.bar(bar.java:123)\n",
        " at com.foo.baz(baz.java:456)",
    ),
];
run_and_assert(&lines, config, &expected);
I also have a use case test for Ruby stack traces. If you'd like to, now would be a good time to share the real-world usage examples that are important to you, and we'll add them as test cases too!
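For readers who want to try the grouping behaviour that the Java stack trace test case above expects, here is a rough standalone sketch using only the standard library. It is not the project's implementation: `continue_through`, `starts_message`, and `is_stack_frame` are hypothetical names, and the two predicates are hand-written stand-ins for the `^[^\s]` and `^[\s]+at` patterns.

```rust
/// Stand-in for the `^[^\s]` start pattern: the first character is not whitespace.
fn starts_message(line: &str) -> bool {
    line.chars().next().map_or(false, |c| !c.is_whitespace())
}

/// Stand-in for the `^[\s]+at` condition pattern: leading whitespace, then "at".
fn is_stack_frame(line: &str) -> bool {
    let trimmed = line.trim_start();
    trimmed.len() < line.len() && trimmed.starts_with("at")
}

/// Hypothetical "continue through" grouping: a line matching `is_start` opens
/// a group, and following lines are appended for as long as they match
/// `is_continuation`. Lines that match neither are passed through unchanged.
fn continue_through<S, C>(lines: &[&str], is_start: S, is_continuation: C) -> Vec<String>
where
    S: Fn(&str) -> bool,
    C: Fn(&str) -> bool,
{
    let mut out = Vec::new();
    let mut current: Option<String> = None;
    for &line in lines {
        if let Some(buf) = current.as_mut() {
            if is_continuation(line) {
                buf.push('\n');
                buf.push_str(line);
                continue;
            }
        }
        // The open group (if any) ends at the first non-continuation line.
        if let Some(done) = current.take() {
            out.push(done);
        }
        if is_start(line) {
            current = Some(line.to_string());
        } else {
            out.push(line.to_string());
        }
    }
    // Flush a group that was still open at end of input.
    if let Some(done) = current.take() {
        out.push(done);
    }
    out
}
```

Calling `continue_through(&lines, starts_message, is_stack_frame)` on the three Java lines yields a single merged string joined with `\n`. Note that this sketch flushes an open group at end of input, whereas a real source would more likely wait for more data or a timeout.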
Hey @anton-ryzhov, a follow-up after we merged the PR! We ended up going with I'll continue with a follow-up issue on improving the configuration fields.
In addition to message_start_indicator, an opposite parameter can be very useful.
For example, the docker json log driver splits long lines at 16k, and only the very last chunk contains \r\n.
By defining message_start_indicator = '\\n"' I'm getting "Second..." merged into the first line rather than into its continuation.
So the only difference is how to combine unmatched lines: with the previous or with the next matched line.
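A minimal sketch of the requested behaviour, under stated assumptions: chunks that do not match the stop condition are held and merged forward into the next chunk that does. `merge_until_stop` and the closure passed to it are hypothetical, and "ends with a newline" is only an assumed stand-in for whatever a `message_stop_indicator` pattern would match.

```rust
/// Hypothetical stop-indicator merge: every chunk is appended to a buffer,
/// and the buffer is emitted as one message once a chunk satisfies `is_stop`.
fn merge_until_stop<F>(chunks: &[&str], is_stop: F) -> Vec<String>
where
    F: Fn(&str) -> bool,
{
    let mut merged = Vec::new();
    let mut buffer = String::new();
    for &chunk in chunks {
        buffer.push_str(chunk);
        if is_stop(chunk) {
            // The stop chunk closes the message; take the buffer and start fresh.
            merged.push(std::mem::take(&mut buffer));
        }
    }
    // Trailing chunks with no stop marker are flushed as-is in this sketch;
    // a real source would more likely hold them until more data arrives.
    if !buffer.is_empty() {
        merged.push(buffer);
    }
    merged
}
```

With a start indicator, an unmatched chunk is attached to the previous matched line. With a stop indicator as sketched here, it is attached to the next one, which is what a 16k-split line needs.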