NETOBSERV-613: drop messages when they accumulate in the exporter #63
Conversation
New image: ["quay.io/netobserv/netobserv-ebpf-agent:9c12b1d"]. It will expire after two weeks.
df := c.droppedFlows
if df > 0 {
	c.droppedFlows = 0
	cllog.Warnf("%d flows were dropped during the last %s because the agent is forwarding "+
Maybe it's time to have metrics rather than logs for this kind of thing? Or create a follow-up?
I agree with you, although we could also keep this message as a first-hand notification that could be useful for the user. I'd create a JIRA issue to implement and expose metrics.
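As a hedged sketch of what that follow-up might look like (the metric name, registration, and the `recordDrops` helper are assumptions for illustration, not the project's actual instrumentation), a Prometheus counter could complement or replace the periodic warning log:

```go
package exporter

import "github.com/prometheus/client_golang/prometheus"

// Sketch only: metric name and wiring are assumptions.
var droppedFlowsCounter = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "netobserv_agent_dropped_flows_total",
	Help: "Flows dropped because the exporter could not keep up",
})

func init() {
	prometheus.MustRegister(droppedFlowsCounter)
}

// recordDrops would be called wherever the limiter currently counts drops,
// so operators can alert on the rate instead of scraping logs.
func recordDrops(n int) {
	droppedFlowsCounter.Add(float64(n))
}
```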
	droppedFlows int
}

func (c *CapacityLimiter) Limit(in <-chan []*Record, out chan<- []*Record) {
I'm not sure it's worth it to have a whole stage with in/out buffers for just this simple check, no? I tried benchmarking against another version with the limiting done at the source and got a slightly better time: 15016 ns/op versus 21698 ns/op with this limiter stage (same number of allocs).
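For reference, "limiting at the source" typically amounts to a non-blocking send; a minimal sketch (the `trySend` helper and its parameters are hypothetical, with `Record` being the flow record type from this diff):

```go
// trySend forwards a batch without blocking: if the downstream channel is
// full, the batch is dropped and counted instead of stalling the producer.
func trySend(out chan<- []*Record, batch []*Record, dropped *int) {
	select {
	case out <- batch: // downstream has capacity: forward the batch
	default: // channel full: drop rather than block
		*dropped += len(batch)
	}
}
```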
Yeah, I just wanted to separate each piece of functionality as much as possible to avoid overloading some pipeline stages too much. Also, otherwise we would need to implement this in multiple places: since the deduper is optional, it would have to go in both the FlowAccounter and the MapTracer.
About the performance improvement, I think it is negligible in this case, as the pipeline processes flows in batches and each batch is usually forwarded once per CacheActiveTimeout period (5s by default).
Fair enough, after our discussion I understand it better.
Anyway, we can still revisit later if we find there are optimizations to bring here.
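Putting the snippets from this diff together, a minimal self-contained sketch of such a limiter stage might look like the following. The atomic counter (the diff uses a plain `int`), the 5s reporting cadence, and the empty `Record` placeholder are assumptions; the actual implementation may differ.

```go
package exporter

import (
	"log"
	"sync/atomic"
	"time"
)

// Record stands in for the agent's flow record type (placeholder here).
type Record struct{}

// CapacityLimiter drops batches when the downstream exporter cannot keep up.
type CapacityLimiter struct {
	droppedFlows int64 // int64 so it can be updated atomically in this sketch
}

// Limit forwards batches from in to out, dropping and counting any batch
// that would otherwise block on a full out channel.
func (c *CapacityLimiter) Limit(in <-chan []*Record, out chan<- []*Record) {
	go c.reportDrops() // assumed periodic reporter matching the Warnf in the diff
	for batch := range in {
		select {
		case out <- batch: // downstream keeps up: forward
		default: // downstream saturated: drop and count
			atomic.AddInt64(&c.droppedFlows, int64(len(batch)))
		}
	}
}

// reportDrops periodically logs and resets the drop counter.
func (c *CapacityLimiter) reportDrops() {
	for range time.Tick(5 * time.Second) { // cadence is an assumption
		if df := atomic.SwapInt64(&c.droppedFlows, 0); df > 0 {
			log.Printf("%d flows were dropped during the last 5s because the "+
				"agent is forwarding flows faster than the exporter can process them", df)
		}
	}
}
```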
Codecov Report
@@            Coverage Diff             @@
##             main      #63      +/-   ##
==========================================
+ Coverage   33.04%   33.28%   +0.24%
==========================================
  Files          21       22       +1
  Lines        1392     1433      +41
==========================================
+ Hits          460      477      +17
- Misses        913      936      +23
- Partials       19       20       +1
Flags with carried forward coverage won't be shown.
/lgtm
The following PRs need to be merged first: #61 and #62. Then, this PR needs to be rebased onto master.
This PR adds a flow dropper before the exporter, to prevent memory from growing indefinitely and to keep synchronized channel operations from blocking earlier parts of the system.
It also logs the number of dropped flows, suggesting that the user take corrective action.
The new architecture results in: