
Merge back docker logs #1436

Closed
anton-ryzhov opened this issue Dec 26, 2019 · 24 comments
Labels: needs: approval · platform: docker · type: enhancement

@anton-ryzhov (Contributor)

Docker log drivers split messages into 16K chunks (moby/moby#32923).

The json-file log driver splits a long log line into multiple messages, and only the last one carries the trailing newline characters (\r\n).

In journald, all parts but the last carry the CONTAINER_PARTIAL_MESSAGE property.
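
For illustration (the values are made up), a split line lands in the json-file log roughly as two records, where only the second one ends with "\n":

  {"log":"first 16K of a very long line...","stream":"stdout","time":"2019-12-26T10:00:00.000000001Z"}
  {"log":"...the rest of the line\n","stream":"stdout","time":"2019-12-26T10:00:00.000000002Z"}

With the journald driver, the first record would instead carry CONTAINER_PARTIAL_MESSAGE=true and the last one would not.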

These markers can be used to merge the log events back together. But currently that's not possible with Vector, because transforms can't access two events in a row to merge them.

See also #1431 discussion

@LucioFranco (Contributor)

This seems like something the docker source should possibly support. As for transforms, I believe you can return None and move the Event into a transform-local buffer, then later return a new event aggregated across the events in the buffer.

@anton-ryzhov (Contributor, Author)

The docker source doesn't support checkpointing, so its applicability is very limited.
Or do you mean sources of docker logs in general (also file and journald)?

@LucioFranco (Contributor)

I mean that any source which may always receive split logs should, in theory, be able to handle this without needing an extra transform.

binarylogic added the platform: docker, source: docker, and type: enhancement labels on Dec 26, 2019
@binarylogic (Contributor)

@LucioFranco can you share why? We had a good discussion about this in #1431. I'm not sure I agree with you. The fact that you can read docker logs from multiple sources means that we should, in some way, decouple the logic of joining events from sources.

@LucioFranco (Contributor)

Sure, decoupling is fine, but I think that adds a level of indirection; the docker source should just work out of the box for the specific cases that docker produces. This seems like a pathological case where docker will always split at a certain size, and a user who just adds the docker source should have this work automatically. As for checkpointing: in this case you don't have the full log until you send it downstream, so if the docker daemon gets reset you have a half-corrupted log in your buffer, which could affect downstream transforms. If you aggregate at the source, you can ensure that you don't emit corrupted logs.

As for a transform, it should still be possible to aggregate there, though you could hook up multiple sources to it and there isn't really a way to differentiate between them. I think that would be a reasonably obvious thing for users to notice if they start getting tangled logs.

@binarylogic (Contributor)

Thanks.

user just adding the docker source should just have this work

Agree 👍 . I'm just curious how you'd solve reading Docker logs from another source, like journald?

@anton-ryzhov (Contributor, Author)

Special handling of these cases in sources?

[sources.docker_logs]
  type = "file"
  include = ["containers/*/*.log"]
  merge_docker_logs = true

[sources.docker_logs2]
  type = "journald"
  merge_docker_logs = true

And maybe some shared code behind the scenes.

@binarylogic (Contributor)

Nice, that's interesting and not a bad idea. We've discussed similar options for decoding as well, to avoid the need for a json_parser transform.

@MOZGIII (Contributor) commented Dec 28, 2019

I was thinking about this issue recently, and I think I've come up with a very flexible design.

Here is my plan.

  1. Add a notion of incomplete (or partial, to follow the journald vocabulary) log messages. We implement this the following way: for journald we just use the provided metadata; for the docker source we use a heuristic based on the presence of "\n" at the end of the message (until a better way - like adding an explicit flag to the message - is implemented on the docker end, which we can suggest to them, I guess); and we can also implement a transform that allows users to implement custom logic to flip the flag. If we represent this flag simply as part of the message, we don't need the extra transform for flag toggling. We might need to add special support to preserve the information that sequential partial messages coming from a single source belong together and are completed by a non-partial message.

  2. Add a separate transform to merge partial messages. I believe this has to be opt-in, simply because the implications for the sink might be dramatic; for instance, Elasticsearch might start rejecting our data and/or this can cause a cluster collapse, as usually happens with Elasticsearch when documents are too big for the internal cross-node transport to handle. Therefore, I suggest we should require explicit opt-in here. Furthermore, we can provide an option to truncate or drop partial messages instead of merging them - users might want to simply get rid of data that doesn't fit in their sink.

This would allow us to solve a much bigger set of issues for users (i.e. it's a generic solution, not just for the docker source) and doesn't involve special-casing (e.g. a streaming JSON parser that works only with JSON data). A rough sketch of what the configuration could look like follows below.
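
As a sketch only - the transform names and options below are hypothetical, not an existing Vector API - the two-step design could be configured roughly like this:

[sources.docker_logs]
  type = "docker"

# Hypothetical transform that marks events as partial based on the trailing-"\n" heuristic.
[transforms.mark_partial]
  type = "mark_partial"      # illustrative name only
  inputs = ["docker_logs"]
  field = "message"          # field to inspect for the trailing newline
  flag_field = "partial"     # field set to true on partial events

# Hypothetical opt-in transform that merges runs of partial events into one event.
[transforms.merge_partials]
  type = "merge_partials"    # illustrative name only
  inputs = ["mark_partial"]
  flag_field = "partial"
  behavior = "merge"         # could also be "drop" or "truncate", as proposed in (2)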

What do you think?

I missed a couple of messages here, and I see it was discussed before that the docker source should merge messages automatically out of the box. We can achieve this with the design I'm proposing by, in effect, adding the transform from (2) to all configurations by default. Personally, I would vote against merging automatically for the reasons described above (at (2)), but I can see why people might find it useful, so I don't mind it as long as it can be turned off - in some cases I do want to do lower-level log aggregation, for example for docker engine debugging purposes.

The desire to implement this as primitives comes from wanting to
a) share the implementation parts efficiently, and
b) allow users to specify their own rules for marking a message as partial.

The alternative is to implement all the logic described in (1) and (2) within the relevant sources. That would, however, be far less flexible and configurable. The implementation might be simpler - mainly because we wouldn't need to deal with tracking "related messages" - but the primitive-based approach might not be too bad either if we figure out the right abstraction.

@MOZGIII (Contributor) commented Dec 28, 2019

I forgot to mention: flexibility is an important feature of the design.
For example, consider delivering docker logs via a non-docker source. Vector doesn't currently support a GELF source, but GELF can be used to pass logs from docker directly to the log processor without storing them on the filesystem first. This is useful in some cases, for example in certain bare-metal or embedded topologies. If we also implement the "\n" heuristic for docker partial-message detection as a transform, users will be able to trivially reuse it with a GELF source. In fact, it only makes sense to implement partial-message marking inside a source for sources whose data model explicitly represents it - so just journald so far.

@binarylogic (Contributor) commented Dec 29, 2019

Nice writeup! I have a few questions and suggestions before we move forward with this:

Comments

for journald we just use the provided metadata

Journald provides this as part of the message's metadata? Do you mind providing an example or linking to the docs? I wasn't aware of this 😄 .

Add separate transform to merge partial messages. I believe this has to be opt-in, simply because the implications for the sink might be dramatic, for instance, Elasticsearch might start rejecting our data and/or this can cause cluster collapse

This is tangential, but you might be conflating two separate issues here. In my opinion, we should solve the message-continuation problem without regard to downstream side effects and add separate options for those side effects.

For example, the file source already has a max_line_bytes option that could protect against large messages (a minimal sketch follows below), and we could consider a transform that drops large messages, or a max_line_bytes setting on the elasticsearch sink itself, as well as many other options.
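
A minimal sketch of that option on the file source (the include path and value are illustrative):

[sources.app_logs]
  type = "file"
  include = ["/var/log/app/*.log"]
  max_line_bytes = 102400    # maximum size of a line the source will accept; value shown is illustrative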

Factors

There are a few factors worth addressing individually:

  1. User Experience - What is the path of least surprise?

    If a source explicitly flags a message as a continuation (through metadata or a trailing character), we should try to merge them by default. This, in my opinion, is more in line with user expectations and takes the path of least surprise. For example, most users aren't aware Docker is splitting messages until it becomes a problem (lines being dropped, errors in processing, etc.). Separately, the file source offers a message_start_indicator option because there is no explicit continuation flag there - anything goes.

  2. Design & Flexibility - how can we avoid coupling to allow for cases we aren't considering?

    Our strategy should be composable so that users can solve this problem across a matrix of platforms and sources (docker -> gelf, docker -> log driver, docker -> journald, etc.). Decoupling these things lets users solve this in ways we aren't even considering.

  3. Consistency - how do we make solving this problem consistent across all sources?

    For example, the file source already solves this through the message_start_indicator option (sketched below), and whatever solution we take here should be consistent across all sources.
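
For reference, a file source using that option might look roughly like this (the include path and pattern are illustrative):

[sources.app_logs]
  type = "file"
  include = ["/var/log/app/*.log"]
  message_start_indicator = "^\\d{4}-\\d{2}-\\d{2}"   # lines not matching the start pattern are treated as continuations of the previous event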

Final Thoughts

I'm not entirely sure what the best approach is yet, but I'm leaning towards the proposed transform approach, especially given #1447, which makes chaining transforms trivial. We can solve factor #1 (UX) by including this in the documentation examples.


Before we proceed, I want to get buy-in from other members on the team.

binarylogic added the needs: approval label on Dec 29, 2019
@anton-ryzhov (Contributor, Author)

Journald provides this as part of the message's metadata? Do you mind providing an example or linking to the docs? I wasn't aware of this.

It isn't a journald feature or metadata. It's a docker log driver feature - here is an example dump.

For example, the file source already solves this through the message_start_indicator option.

It doesn't solve all cases. Specifically, the docker case can't be handled - see #1431.

@anton-ryzhov (Contributor, Author)

Having this as a separate transform solves this and similar problems in the most generic way.

One thing you should take care of - merging partial messages by a specific key: (source_id, file_name) for the file source, (source_id, CONTAINER_ID) for journald, something like (source_host, file_name) for a centralized setup, etc…

lukesteensen self-assigned this on Dec 30, 2019
@MOZGIII (Contributor) commented Jan 1, 2020

Yeah, sorry, I was under the impression the partial message indicator was a common journald feature.

I like the concept of using message fields to indicate that a message is partial. It's a simple, "no magic" approach compared to storing this data as an internal per-message flag. I.e., to mark a docker source message as partial, a transform would have to peek into the log field (preferably user-configurable, and maybe the transform itself doesn't have to be docker-specific) and simply inject a new field partial: true (also user-configurable).

Then we add another transform that has user-specified parameters for a) the stream identifier fields - as suggested above by @anton-ryzhov, b) the name of the field indicating whether a message is partial (partial == true from the above; users can use multiple different flags for different flows over a single source if needed - for example, it'd be useful if both structured logs and a segfault stack trace are expected from a source), and c) the behavior of the operation - drop the messages, merge them (and if so, how to assemble the individual fields), or truncate, i.e. drop everything except the first partial message.

If we won't be using hidden flags, we should consider how this will interact with other transforms that are not aware of partial messages. It's easier with hidden flags because we'd be able to naturally ignore partial messages. However, if we only allow sequential partial messages from a single source, it might be a non-issue.

Anyhow, both internal partial flags and field-based flags allow implementing the merging for any source, as long as the source or an intermediate transform properly marks messages as partial. It will also be consistent, since the underlying components will be exactly the same, so exactly the same code paths will run. A rough sketch of such a merging transform follows below.
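
To make the parameters concrete - the transform type and option names here are illustrative only, not a finalized API - such a merging transform could be configured roughly as:

[transforms.merge_partials]
  type = "merge_partials"                         # illustrative name only
  inputs = ["docker_logs"]
  partial_flag_field = "partial"                  # b) field marking an event as partial
  stream_discriminant_fields = ["CONTAINER_ID"]   # a) events are merged per stream, as @anton-ryzhov suggested
  merge_fields = ["message"]                      # fields whose values get concatenated on merge
  behavior = "merge"                              # c) or "drop" / "truncate"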

In regards to user experience - exposing a lot of tweakability might be overwhelming, but it is necessary to cover all cases. Therefore I would propose also adding some kind of configuration presets and reusability. We can hard-code some well-known patterns behind easy-to-turn-on flags - for the docker case, we might introduce an option on the docker source that automatically adds a merging transform preconfigured for that source, or something like that.

A quite different idea would be to introduce exchangeable presets, so that instead of us hardcoding them, we let users write the presets and then share and reuse them. Actually, this might be a major leap forward in how configuration is managed for logging agents, so this approach is definitely worth a separate issue.

@MOZGIII (Contributor) commented Jan 1, 2020

This one replaces #1431, so @lukesteensen, would you mind if I take this?

@lukesteensen (Member)

Sure, @MOZGIII, that approach sounds reasonable! It's possible we could generalize the existing line aggregator that we've built into the file source to achieve this.

MOZGIII self-assigned this on Jan 7, 2020
binarylogic added this to the Improve log continuation strategy milestone on Jan 7, 2020
binarylogic mentioned this issue on Jan 7, 2020
@MOZGIII (Contributor) commented Jan 9, 2020

While researching docker, I stumbled upon this:

https://github.com/docker/compose/blob/26f1aeff1505f5be0c4552a7ed7d4d3498cff9d3/compose/utils.py#L50-L79

This is the approach used by docker-compose. In essence, they implement a buffer that takes in the data, joins everything it receives, and then splits the merged content to produce properly split chunks.

This is different from what I've outlined before. Instead of having the notion of partial messages, it joins everything (even non-partial messages) and re-splits it according to user-specified rules.

Compared to what I proposed before, it may be easier to grasp and configure from the user's perspective.

Performance-wise, though, it seems heavier. There would be a lot more copying from buffer to buffer, and performance is also important to users. I'm not sure how much worse it would be, since copying is optimized really well on modern CPUs. So it has pros and cons.

We could implement both, or choose just one approach and go with it.

@anton-ryzhov (Contributor, Author)

I don't get how the referenced function is related to the discussion. It yields parsed JSON objects from a JSON stream.

If you apply this to the outer docker JSON, it'll split exactly as it does now. If you unwrap the outer JSON, you can only handle JSON logs with this approach.

@MOZGIII (Contributor) commented Jan 9, 2020

I don't think that function has anything to do with JSON. It's generic over the splitter, and the default splitter is the line_splitter. The function operates on streams of bytes - notice the stream_as_text.
Given that it is invoked from here, I expect it to work over the log field of the JSON Lines log file that docker keeps at /var/lib/docker/containers/.../...-json.log:
{"log":"my_message\n","stream":"stdout","time":"2020-01-09T15:21:54.124422689Z"}

@MOZGIII (Contributor) commented Jan 9, 2020

Btw, I created a repo to replicate the issue: https://github.com/MOZGIII/vector-merge-test-setup

@MOZGIII (Contributor) commented Jan 9, 2020

I just realized that the approach docker-compose uses doesn't allow us to preserve event fields other than the one we're merging. That's a problem, though we can probably adapt the solution to handle it.

@MOZGIII (Contributor) commented Feb 5, 2020

Closed via #1504

The docker source will now concatenate split messages by default, without any extra configuration 🎉
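
In other words, a bare docker source should now be enough - no merging transform is required (the source name below is just an example):

[sources.docker_logs]
  type = "docker"
  # Messages split at the 16K limit are merged back automatically.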

MOZGIII closed this as completed on Feb 5, 2020
@anton-ryzhov (Contributor, Author)

Looks nice and usable.
I wanted to try it, but unfortunately the current nightly build is from 2020-01-27. Are the nightly builds broken?

@LucioFranco (Contributor)

@anton-ryzhov they should have been fixed via #1661, but maybe they haven't been? I'll reopen that issue.
