
message_stop_indicator for file source #1431

Closed
anton-ryzhov opened this issue Dec 24, 2019 · 16 comments · Fixed by #1852
Assignees
Labels
needs: requirements Needs a list of requirements before work can begin · platform: docker Anything `docker` platform related · source: file Anything `file` source related · type: enhancement A value-adding code change that enhances its existing functionality.

Comments

@anton-ryzhov
Contributor

anton-ryzhov commented Dec 24, 2019

In addition to message_start_indicator, the opposite parameter could be very useful.

For example, the docker json log driver splits long lines at 16k, and only the very last chunk contains \r\n.

```
{"log":"First line\r\n","stream":"stdout","time":"2019-12-16T17:25:16.44039104Z"}
{"log":"Second...","stream":"stdout","time":"2019-12-16T17:28:28.448678893Z"}
{"log":"... line\r\n","stream":"stdout","time":"2019-12-16T17:28:28.448700857Z"}
{"log":"Third line\r\n","stream":"stdout","time":"2019-12-16T17:28:28.448700857Z"}
```

By defining message_start_indicator = '\\n"' I get "Second..." merged into the first line rather than into its continuation.

So the only difference is how unmatched lines are combined: with the previous matched line, or with the next one.
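The two semantics can be illustrated with a small sketch (Python here for brevity; a plain substring check stands in for the real regex option):

```python
def group_lines(lines, indicator, stop_mode):
    """Group raw lines into messages.

    stop_mode=False mimics message_start_indicator: unmatched lines are
    appended to the PREVIOUS group.  stop_mode=True mimics the proposed
    message_stop_indicator: unmatched lines are held back and merged into
    the NEXT matching line.
    """
    groups, pending = [], []
    for line in lines:
        hit = indicator in line  # simplified: the real option takes a regex
        if stop_mode:
            pending.append(line)
            if hit:  # a matching line terminates the message
                groups.append("\n".join(pending))
                pending = []
        else:
            if hit and pending:  # a matching line starts a new message
                groups.append("\n".join(pending))
                pending = []
            pending.append(line)
    if pending:
        groups.append("\n".join(pending))
    return groups

# Simplified docker chunks: "|" stands in for the trailing \r\n marker.
chunks = ["First line|", "Second...", "... line|", "Third line|"]
```

With stop_mode=True, "Second..." is merged into its continuation "... line|"; with stop_mode=False it gets wrongly glued onto "First line|", which is exactly the behavior described above.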

@ghost ghost added source: file Anything `file` source related type: enhancement A value-adding code change that enhances its existing functionality. labels Dec 25, 2019
@binarylogic
Contributor

Thanks @anton-ryzhov, I was actually just discussing this exact problem on Docker with @MOZGIII. @MOZGIII what do you think about taking this as your next issue?

@binarylogic binarylogic added the platform: docker Anything `docker` platform related label Dec 26, 2019
@MOZGIII
Contributor

MOZGIII commented Dec 26, 2019

Right, this is exactly the problem we discussed. I can work on it.

@MOZGIII
Contributor

MOZGIII commented Dec 26, 2019

message_start_indicator = '\\n"' is a smart trick; however, the proper fix for this issue should probably be implemented at a different level, to ensure it works with the gelf driver too.

@anton-ryzhov
Contributor Author

@binarylogic @MOZGIII good to know.

The Docker problem is more complex in the end, so I just started with this more universal feature request for a complementary message split. It could be useful beyond Docker.


Speaking of Docker (this probably deserves a separate ticket): I feel it should be handled at the next level, as a transform. But to do that we need to split messages properly at the "source" level, and detecting by '\\n"' looks like a dirty hack to me.

Another reason to merge Docker logs at the transform step: Docker logs can be read from both the journald and docker sources, and the same logic should apply to both. But we can't merge two events at the transform step.

Would it be possible to have a transformation that has access to the current and previous events [from the same source] and can keep/delete/replace them? A kind of reducing function with a sliding window.
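The transform described here is essentially a fold over the event stream. A minimal Python sketch (the `partial` flag and the dict-shaped events are hypothetical, not Vector's actual API):

```python
def sliding_reduce(events, should_merge, combine):
    """Emit events, optionally merging each one into its predecessor.

    should_merge(prev, cur) decides whether cur continues prev;
    combine(prev, cur) replaces the previous event with the merged one.
    """
    out = []
    for event in events:
        if out and should_merge(out[-1], event):
            out[-1] = combine(out[-1], event)  # replace the previous event
        else:
            out.append(event)  # keep as a new event
    return out

# Hypothetical docker-style partial events:
events = [
    {"log": "Second...", "partial": True},
    {"log": "... line", "partial": False},
]
merged = sliding_reduce(
    events,
    should_merge=lambda prev, cur: prev["partial"],
    combine=lambda prev, cur: {"log": prev["log"] + cur["log"],
                               "partial": cur["partial"]},
)
```

Here the two partial events collapse into a single one whose `log` is the concatenation of the chunks.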

@binarylogic binarylogic added the needs: requirements Needs a a list of requirements before work can be begin label Dec 26, 2019
@MOZGIII
Contributor

MOZGIII commented Dec 26, 2019

> So I just started with this more universal feature request of complementary message split.

Sounds good!

I was thinking about solving the particular case with Docker where the inner log messages are JSON: in that case, it'd be possible to just use a streaming JSON parser and join across multiple messages until the parser completes or reaches an unexpected token.
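A minimal version of that idea, assuming the inner payloads are JSON (Python's json module stands in for a real streaming parser; a buffer that happens to parse too early, e.g. a bare number, would fool this sketch):

```python
import json

def merge_json_chunks(chunks):
    """Accumulate string chunks until the buffer parses as complete JSON."""
    merged, buf = [], ""
    for chunk in chunks:
        buf += chunk
        try:
            merged.append(json.loads(buf))
            buf = ""            # complete document: emit and reset
        except json.JSONDecodeError:
            pass                # incomplete: keep accumulating
    return merged

# A message split across two chunks, followed by a whole one:
docs = merge_json_chunks(['{"msg": "Second', ' line"}', '{"msg": "Third"}'])
```

A real implementation would need a parser that reports "incomplete" separately from "invalid", so a malformed chunk doesn't cause unbounded buffering.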
It may be possible to solve this issue more generically, so having a detailed feature request in the first place would be very valuable.

@anton-ryzhov
Contributor Author

I want to share my approach/workaround. It doesn't work as expected because of this ticket, but maybe it will help you come up with a better solution.

```toml
[sources.docker_logs]
  type = "file"
  include = ["containers/*/*.log"]

  message_start_indicator = '\\n"'
  # Should be:
  # message_stop_indicator = '\\n"'

[transforms.newlines_to_array]
  type = "lua"
  inputs = ["docker_logs"]
  source = """
   event["message"] = "{\\"chunks\\": [" .. string.gsub(event["message"], "\\n", ",") .. "]}"
  """

[transforms.json_parser]
  type = "json_parser"
  inputs = ["newlines_to_array"]

[transforms.merge_chunks]
  type = "lua"
  inputs = ["json_parser"]
  source = """
    event["time"] = event["chunks[0].time"]
    event["stream"] = event["chunks[0].stream"]
    event["log"] = ""

    local i = 0
    while event["chunks[" .. i .. "].time"] do
      event["log"] = event["log"] .. event["chunks[" .. i .. "].log"]
      event["chunks[" .. i .. "].time"] = nil
      event["chunks[" .. i .. "].log"] = nil
      event["chunks[" .. i .. "].stream"] = nil
      i = i + 1
    end
  """
```

@binarylogic binarylogic added this to the Improve log continuation strategy milestone Jan 7, 2020
@MOZGIII
Contributor

MOZGIII commented Feb 5, 2020

Now that #1504 has landed, it should be possible to achieve what you're looking for here via a merge transform.
We'll continue to look into this issue. We might integrate the merge transform into the file source to provide a built-in solution.

@anton-ryzhov
Contributor Author

Now that there is a universal merge transform, maybe sources shouldn't do this in general (though it's 100% reasonable for the docker source).

So maybe the idea behind this issue is outdated now.

@MOZGIII
Contributor

MOZGIII commented Feb 7, 2020

It's an ongoing process to determine what sources should support built-in merging, and how.

With built-in merging, we're looking for the following:

  • better defaults where applicable: docker is an obvious example, and the same goes for kubernetes; journald is not so trivial though, we can discuss it at Merge partial events at journald source #1719 if you're interested;
  • better performance: in theory, we can get significant performance wins in some cases if we implement the merge transform at the source level;
  • better internal implementation: this is related to both points above, but a bit different; in essence, when a source can produce multiple streams (like docker or journald), those streams can be accessed individually inside the source internals with less overhead and better reliability than by using stream_discriminant_fields; so, in those cases, we can provide an easier-to-understand configuration interface for end users, even if merging is not enabled by default for a particular source.

So, I guess that covers the list of items we weigh when deciding whether to implement merging at a source.
I am under the impression that we definitely can't cover all possible needs by merging at sources, so we need a merge transform.

The current implementation has a design trait that stashes partial events and won't release them until it can produce a merged one. Configuring this can be error-prone at times and can cause the transform to consume all system memory if used incorrectly.
Another issue is that when a partial event is read from the source and gets stashed at the merge transform, the source is not aware that the event has not been processed, and will not redeliver it if Vector restarts. This is the main problem we see with the design, and the reason we're currently only shipping the transform as a beta.

We probably should add it to the documentation, so people are aware.

@MOZGIII
Contributor

MOZGIII commented Feb 18, 2020

We'll be adding some new parameters to the file source, taking inspiration from https://app.scalyr.com/help/parsing-logs#multiline

In particular, in addition to message_start_indicator, we'll add new parameters, based on how Scalyr does it:

  • continueThrough: all consecutive lines matching this pattern are included in the group. The first line (the line that matched the start pattern) does not need to match the continueThrough pattern. This is useful in cases such as a Java stack trace, where some indicator in the line (such as leading whitespace) indicates that it is an extension of the preceding line.

  • continuePast: all consecutive lines matching this pattern, plus one additional line, are included in the group. This is useful in cases where a log message ends with a continuation marker, such as a backslash, indicating that the following line is part of the same message.

  • haltBefore: all consecutive lines not matching this pattern are included in the group. This is useful where a log line contains a marker indicating that it begins a new message.

  • haltWith: all consecutive lines, up to and including the first line matching this pattern, are included in the group. This is useful where a log line ends with a termination marker, such as a semicolon.
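The four modes can be sketched as a single grouping loop. A simplified Python illustration (grouping is driven by the condition pattern alone here, whereas the Scalyr scheme also involves the start pattern):

```python
import re

def group(lines, condition, mode):
    """Group lines into multi-line messages per the four modes above."""
    cond = re.compile(condition)
    groups, buf = [], []

    def flush():
        if buf:
            groups.append("\n".join(buf))
            buf.clear()

    for line in lines:
        if mode == "continue_through":
            # continuation lines match; anything else begins a new group
            if buf and cond.search(line):
                buf.append(line)
            else:
                flush()
                buf.append(line)
        elif mode == "continue_past":
            # a matching line says "one more line follows"
            buf.append(line)
            if not cond.search(line):
                flush()
        elif mode == "halt_before":
            # a matching line begins a new group
            if buf and not cond.search(line):
                buf.append(line)
            else:
                flush()
                buf.append(line)
        elif mode == "halt_with":
            # a matching line terminates the group
            buf.append(line)
            if cond.search(line):
                flush()
    flush()
    return groups
```

For example, a Java stack trace groups with `group(lines, r"^\s+at", "continue_through")`, and statements terminated by semicolons with `group(lines, r";$", "halt_with")`.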

@binarylogic
Contributor

That makes sense to me. While we're adding these options, would it be possible to nest them under a new multiline table? I would like to keep the current message_start_indicator option, but simply remove it from the docs. Ex:

```toml
[sources.in]
  type = "file"

  [sources.in.multiline]
    start = "..."
    continue_through = "..."
    continue_past = "..."
    halt_before = "..."
    halt_with = "..."
```

What do you think? Also, notice the name is snake case.

@MOZGIII
Contributor

MOZGIII commented Feb 19, 2020

I like the nesting! I think it'll map to the internal implementation better than the way we're doing it right now.
Regarding the exact config option set and names: definitely snake case, but having it as Scalyr does is a bit confusing, since you have to specify a start pattern plus exactly one of the continuation patterns. It'd be easier for everyone if we just had three fields:

  • start
  • continuation
  • mode - an enum of continue_through/continue_past/halt_before/halt_with

Do you like this?

@binarylogic
Contributor

I’m a fan. The ‘mode’ option looks good to me.

@anton-ryzhov
Contributor Author

But for the halt modes it is not a continuation.

Don't just copy their config; you can make it much more intuitive.

continue and halt could be expressed through mutually exclusive pattern definitions, without an explicit mode.

For continueThrough:

```toml
start = "^Error"
through = "^     at"
```

For haltWith:

```toml
start = "^Error"
stop = "Summary line"
```

The haltBefore mode sounds to me like the start pattern of the next message: if you meet start again, just start a new event.

The continuePast mode for the backslash example could be stop = "[^\\]$"

@MOZGIII
Contributor

MOZGIII commented Feb 19, 2020

Thanks, @anton-ryzhov!

I've started working on the implementation, and I too noticed that we can probably do even better than with just mode. I'm working with the mode field for now, and I think better configuration parameters will emerge in the process.

Your point is definitely valid; it was an oversight in the initial design pass. What you're proposing looks better, and I'll consider modeling the configuration like that, since it also seems to be a better fit for the implementation needs.

By the way, I've added a couple of real-world test cases, and I'm looking to add more to make our implementation easier to maintain.

For example, for Java stack traces I have this:

```rust
let lines = vec![
    "java.lang.Exception",
    "    at com.foo.bar(bar.java:123)",
    "    at com.foo.baz(baz.java:456)",
];
let config = Config {
    start_pattern: Regex::new("^[^\\s]").unwrap(),
    condition_pattern: Regex::new("^[\\s]+at").unwrap(),
    mode: Mode::ContinueThrough,
    // ...
};
let expected = vec![
    concat!(
        "java.lang.Exception\n",
        "    at com.foo.bar(bar.java:123)\n",
        "    at com.foo.baz(baz.java:456)",
    ),
];
run_and_assert(&lines, config, &expected);
```

I also have a use case test for Ruby stack traces.

If you'd like to, now would be a good time to share the real-world usage examples that are important to you - and we add them as test cases too!

@MOZGIII
Contributor

MOZGIII commented Feb 23, 2020

Hey @anton-ryzhov, a follow-up after we merged the PR! We ended up going with the start_pattern, condition_pattern, mode, and timeout_ms fields.

I'll continue with a follow-up issue on improving the configuration fields.
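Combined with the nesting proposed earlier, a configuration along these lines should result (a sketch based on the field names above; check the Vector docs for the exact syntax in your version):

```toml
[sources.app_logs]
  type = "file"
  include = ["containers/*/*.log"]

  [sources.app_logs.multiline]
    start_pattern = '^[^\s]'        # a line that begins a new message
    condition_pattern = '^[\s]+at'  # continuation lines (Java stack trace)
    mode = "continue_through"
    timeout_ms = 1000               # flush an incomplete group after 1s
```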
