Span with lots of Logs #571
Comments
All three are possible solutions, all have pros and cons, there's no single "correct answer". For (a), you might run into the UDP max packet size trying to export that many logs out of process in a single span. Also, indexing all those logs will be quite expensive, so it depends on what you're trying to do with them aside from making them accessible in the UI in a contextual way. Also, the UI might not be happy trying to display that many logs per span. I would be curious to know what your use case is that requires so many log lines per single span. Are you talking about some long-running operation? |
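To make the packet-size concern for (a) concrete, here is a rough back-of-the-envelope sketch. It is not Jaeger code: JSON stands in for the actual Thrift encoding, the 65,000-byte budget is an assumed value (the real limit depends on client and agent configuration), and the log shape and function names are made up for illustration.

```python
import json

# Assumption: 65,000 bytes as the UDP packet budget. The effective limit
# depends on how the client and agent are configured.
MAX_UDP_PACKET_BYTES = 65_000


def estimate_logs_bytes(logs):
    """Rough payload size of a span's logs, using JSON as a stand-in for Thrift."""
    return sum(len(json.dumps(log).encode("utf-8")) for log in logs)


def fits_in_one_packet(logs, limit=MAX_UDP_PACKET_BYTES):
    """True if the estimated log payload stays within the packet budget."""
    return estimate_logs_bytes(logs) <= limit


# Hypothetical log entries, shaped loosely like span logs (timestamp + fields).
logs = [
    {
        "timestamp": 1_500_000_000_000 + i,
        "fields": {"event": "step", "message": f"processed item {i}"},
    }
    for i in range(5000)
]

print(fits_in_one_packet(logs[:10]))  # a handful of logs
print(fits_in_one_packet(logs))       # the full 5000 logs
```

Even with short messages, a few thousand entries of this shape add up to several hundred kilobytes, which is well beyond a single packet of that size.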
One tangential consideration is to support log-levels as a collector configuration. Not addressing the OP's issue, directly, but could be part of concerted effort to enable larger-scale logging. @yurishkuro Are logs indexed? If so, we can expose that in the search UI. |
by default logs are indexed by their fields |
@yurishkuro The use case is a modular processing system in which the processes themselves are also modular and split into steps that can run in series or in parallel. We have an older UI that renders these as a tree panel (known from Sencha's Ext JS). We want to replace this with Jaeger, since the data model would be a perfect fit: the running process is a trace and every step is a span. Sequential and parallel steps can be modeled with the parent relations. There is a log4j appender that controls the generation of the spans and redirects logging calls to the span.log interface. These processes can run for a longer time, such as several minutes (this is why we are also working on jaegertracing/jaeger-client-java#231). Yes, the plan is to index the logs so they are searchable. It wouldn't make much sense to display all 5000 logs at once. In the old UI it was possible to page through the logs. The logs were saved in an Oracle database and had the ID of the process run (in Jaeger terms, a trace) as a foreign key. The Jaeger UI would need pagination and a log search endpoint by span ID and/or trace ID, if we consider this a good feature for Jaeger. Also, in my current understanding a span is the smallest piece of data that is sent via Thrift. I guess we would need to change that in order to send this many logs to a collector, maybe in small sets of logs. But I haven't thought this through, I am just thinking out loud. @tiffon The question is, do we really need log levels? In more recent software I am more used to tags, which allow fine-grained filtering of logs for appenders, but I am curious about your experience with tags so far. I guess the place for this functionality would be the client, which decides to include or drop a log based on (preconfigured) filter rules. |
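The client-side filtering idea at the end of this comment could look roughly like the sketch below. Everything in it is hypothetical (`LogEvent`, `TagFilter`, `forward_to_span`); it is not an existing Jaeger or log4j API, just one way to express "include or drop a log based on preconfigured tag rules" before anything is attached to a span.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """Hypothetical log event carrying free-form tags instead of a level."""
    message: str
    tags: frozenset = frozenset()


@dataclass(frozen=True)
class TagFilter:
    """Preconfigured rules: exclusions win; an empty include set keeps everything."""
    include: frozenset = frozenset()
    exclude: frozenset = frozenset()

    def accepts(self, event):
        if self.exclude & event.tags:
            return False
        return not self.include or bool(self.include & event.tags)


def forward_to_span(events, tag_filter):
    """Return only the events that would be attached to the span."""
    return [event for event in events if tag_filter.accepts(event)]


events = [
    LogEvent("order validated", frozenset({"audit"})),
    LogEvent("cache miss", frozenset({"debug"})),
    LogEvent("retrying write", frozenset({"audit", "debug"})),
    LogEvent("untagged message"),
]

audit_only = TagFilter(include=frozenset({"audit"}), exclude=frozenset({"debug"}))
kept = forward_to_span(events, audit_only)
print([event.message for event in kept])
```

Letting exclusions take precedence is a design choice made for this sketch; a real appender might prefer ordered rules instead.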
My first instinct is that it would make a lot more sense to represent each "step" as a separate span. This indirectly addresses the question of many logs per span, but more importantly it seems like a better representation of the application and its transaction semantics. After all, a span is supposed to represent an operation within the application, which sounds very much like those "steps". The idea is that you should be able to reason easily about what happened within a span, and I don't see how that's possible if the span contains up to 5000 events (logs). And if your "steps" can run in parallel, it makes even more sense to split them into different spans, using adequate span references between them to properly describe the causality relationships, say in order to calculate critical path through a larger transaction. |
Yeah, maybe I was unclear. This is exactly our intention. Every span represents a step of this process. Jaeger is a perfect fit to store and visualize this structure. Even a failing step would be easily visible through the Jaeger mechanics, since a span can be marked as erroneous. Maybe the most important information I forgot: the reason why one span has that many logs is that there are loops inside a step, which generate this mass of logs. |
@yurishkuro |
Sorry, I don't think I have any more insight. Like we said, there are multiple possible solutions, and you need to evaluate the pros and cons in your specific case. Personally I am having a hard time seeing how 5000 logs per span are useful; it sounds like information overload. Do you have a specific question I can help with? Side note: at Uber we collect logs separately, usually with a Kafka/ELK stack, and we have plans to build an integration into the Jaeger UI to pull logs for a given span from an external source. |
@yurishkuro Generating a span for each loop iteration is something we could try, since a span could also hold the context information of every iteration. You mentioned in the v1 release post that it's possible to work with traces containing 50,000 spans and more, so this should be sufficient. Otherwise I would go for (c), and I would be interested to hear whether you have more precise plans for how pulling external logs could look. If we decide to go this route, it would be good to align our implementation with your envisioned structure :) |
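A minimal sketch of the span-per-iteration idea, using a toy in-memory `Span` class rather than a real tracer (all names here, including the `loop.index` tag, are hypothetical). The point is only the shape: the step span stays small, and each loop iteration becomes a child span carrying its own context and a handful of logs.

```python
import itertools
from dataclasses import dataclass, field
from typing import Optional

_span_ids = itertools.count(1)


@dataclass
class Span:
    """Toy stand-in for a tracer span; not the Jaeger client API."""
    operation: str
    parent_id: Optional[int] = None
    span_id: int = field(default_factory=lambda: next(_span_ids))
    tags: dict = field(default_factory=dict)
    logs: list = field(default_factory=list)


def run_step(items, finished):
    """Run one step, emitting a child span per loop iteration."""
    step = Span("step")
    for index, item in enumerate(items):
        iteration = Span(
            "step.iteration",
            parent_id=step.span_id,
            tags={"loop.index": index},  # per-iteration context lives on the child
        )
        iteration.logs.append(f"processing {item}")
        iteration.logs.append(f"done {item}")
        finished.append(iteration)  # each child is reported on its own
    finished.append(step)
    return step


finished = []
step = run_step(["a", "b", "c"], finished)
for span in finished:
    print(span.operation, span.parent_id, len(span.logs))
```

Because every child is reported separately, no single span has to carry thousands of log entries.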
We're currently in the design phase for our internal unified log query service that is meant to abstract away the actual location of the service logs (e.g. Elasticsearch, Kafka, HDFS, even files on the hosts). Once we have that we can look into integrating Jaeger with it (#649). |
+1 for this approach and +11 for #649 |
@phal0r what did you end up doing? Would be nice to have your insights after all this time :) |
@jpkrohling we ended up doing (c) and defined a log format for all our applications. We didn't want to store log messages just as strings, but as more structured records. So we put the data into Elasticsearch and defined a mapping for these indexes. As suggested, spanId and traceId are attributes of the log messages. The first implementation displays these logs in a Grafana table (e.g. filtered by errors). In Grafana it is easy to define an additional column that creates clickable Jaeger links. This is straightforward, as we have the traceId and can just construct these links to the Jaeger UI. So far, we don't have a convenient way to go in the other direction, from trace to logs, but it is easy to copy the traceId and search for it in Kibana, for example. We also had in mind that this leaves the route open for #649 to integrate third-party logs into Jaeger in the future. |
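As an illustration of this setup, the sketch below builds a structured log record that carries `traceId` and `spanId` and derives a Jaeger UI link from it. Only those two attribute names come from the comment above; the other fields, the helper names, and the base URL are assumptions, and the `/trace/<traceId>` path should be checked against the Jaeger UI deployment in use.

```python
import json
from urllib.parse import quote

# Hypothetical base URL of the Jaeger UI.
JAEGER_UI = "https://jaeger.example.com"


def make_log_record(trace_id, span_id, level, message, **fields):
    """Build a structured log record; field names other than traceId/spanId are assumed."""
    record = {
        "traceId": trace_id,
        "spanId": span_id,
        "level": level,
        "message": message,
    }
    record.update(fields)
    return record


def jaeger_trace_link(record, base_url=JAEGER_UI):
    """Construct a link to the trace view from the record's traceId."""
    return f"{base_url.rstrip('/')}/trace/{quote(record['traceId'])}"


record = make_log_record("abc123", "def456", "ERROR", "import failed", step="import")
print(json.dumps(record))          # the document that would be indexed
print(jaeger_trace_link(record))   # the value for a clickable table column
```

The reverse direction (from trace to logs) would be a query on the same `traceId` field in the log store.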
Thanks for your comment. My question was indeed in relation to #649 and your experience might help us shape that feature. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. |
This issue has been automatically closed due to inactivity. |
Hey Guys,
first of all, this is not a feature request, but I would like to hear some more opinions. We want to use Jaeger to centralize the logs of our process executions. The data model is a perfect fit, but we want to store 2,000-5,000 logs in one span, since some steps generate a lot of important logs.
I can think of three ways to solve this problem:
a) this is a concern of jaeger and should be supported in future versions
b) these spans should be split into subspans with fewer logs per span
c) spans should be stored without logs in jaeger and a third party system should store the logs with a reference to the span id
Now I am curious about your thoughts. Thanks in advance for thinking this through :)