[Filebeat] Reduce memory usage of multiline #14068
Using time.After when multiline is enabled and lines don't match the multiline pattern increases memory usage (from 30MB to 1GB). The extra memory is attributed to unexpired timers that the Go runtime allocates internally each time time.After(duration) is called. According to the docs: "If efficiency is a concern, use NewTimer instead and call Timer.Stop if the timer is no longer needed." It's not clear to me why this problem only appears when many lines in the input file don't match the pattern.
cd1d6cc to 7e1b650
Good catch! It looks to me like multiline might still have some issues (Next's handling of r.running looks suspicious), but this is a good step :-)
libbeat/reader/readfile/timeout.go
Outdated
timer := time.NewTimer(r.timeout)
defer func() {
	timer.Stop()
	select {
This extra select looks redundant. Is it still needed to free memory after timer.Stop? If so, please explain in a comment.
You're right. Draining the channel is unnecessary unless the timer is going to be reused.
I took the opportunity to rewrite a little bit so that a single channel is used instead of allocating a new one every time.
libbeat/reader/readfile/timeout.go
Outdated
if r.timer == nil {
	r.timer = time.NewTimer(r.timeout)
} else {
	r.timer.Reset(r.timeout)
Unless we're seeing noticeable memory / CPU overhead from it, I actually like the ephemeral version better (allocate a timer each time). Reusing timers is a lot more subtle -- because Reset can only be called on a timer that has stopped or expired, and <-timer.C can deadlock if it's called in the wrong state... it looks to me like you've handled all the cases correctly here, but it takes considerable thought to make sure.
If this is a significant performance fix over the previous version, then this looks ok, but please comment on these issues (e.g. pointing out that timer.Reset is safe because in every code path the timer is Stopped or expired before this function returns, and that <-r.timer.C can't deadlock because Stop can only be called once per Reset and thus a false return value means there is an expiration signal waiting...)
If this doesn't affect performance much, though, I'd rather just create and deallocate the value every time, to save the next person who reads this function the cognitive load of thinking about state invariants :-)
@@ -28,11 +28,12 @@ var (
 	errTimeout = errors.New("timeout")
 )

-// timeoutProcessor will signal some configurable timeout error if no
+// TimeoutReader will signal some configurable timeout error if no
Thank you for fixing comments!
Great find. We should not use |
The use of time.After when multiline is enabled and lines don't match the multiline pattern increases the memory usage (from 30MB to 1GB). This extra memory is attributed to unexpired timers allocated internally by the Go runtime when time.After(duration) is used. According to the docs: "If efficiency is a concern, use NewTimer instead and call Timer.Stop if the timer is no longer needed.". It's not clear to me why this problem only appears when many lines on the input file don't match the pattern. (cherry picked from commit ce651e0)
#14074) The use of time.After when multiline is enabled and lines don't match the multiline pattern increases the memory usage (from 30MB to 1GB). This extra memory is attributed to unexpired timers allocated internally by the Go runtime when time.After(duration) is used. According to the docs: "If efficiency is a concern, use NewTimer instead and call Timer.Stop if the timer is no longer needed.". It's not clear to me why this problem only appears when many lines on the input file don't match the pattern. (cherry picked from commit ce651e0)
…14073) The use of time.After when multiline is enabled and lines don't match the multiline pattern increases the memory usage (from 30MB to 1GB). This extra memory is attributed to unexpired timers allocated internally by the Go runtime when time.After(duration) is used. According to the docs: "If efficiency is a concern, use NewTimer instead and call Timer.Stop if the timer is no longer needed.". It's not clear to me why this problem only appears when many lines on the input file don't match the pattern. (cherry picked from commit 572ee79)
Using time.After when multiline is enabled and lines don't match the multiline pattern increases memory usage (from 30MB to 1GB).
The extra memory is attributed to unexpired timers that the Go runtime allocates internally each time time.After(duration) is called. According to the docs: "If efficiency is a concern, use NewTimer instead and call Timer.Stop if the timer is no longer needed."
It's not clear to me why this problem only appears when many lines in the input file don't match the pattern.
Reproduced with this config:
and a random input file generated by: