Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OVS Detrace #493

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

OVS Detrace #493

wants to merge 15 commits into from

Conversation

amorenoz
Copy link
Contributor

This PR contains a bunch of fixups and preparation patches, including what was discussed in #464 , and then the feature we presented in the OVS Conference: use the unixctl interface to extract flow information from OVS at runtime.

@amorenoz
Copy link
Contributor Author

@vlrpl I think we can start discussing / reviewing this feature. But maybe we should start by looking at retis-org/ovs-unixctl#1

@amorenoz amorenoz force-pushed the ovs-detrace_v2 branch 2 times, most recently from c59af0e to 3a85fc2 Compare February 10, 2025 21:34
@amorenoz amorenoz added the run-functional-tests Request functional tests to be run by CI label Feb 10, 2025
Copy link
Contributor

@atenart atenart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a look mostly at the non-ovs part and it looks good.

nit: fmt_ufid is added in a patch and removed in another, but never used.

@amorenoz
Copy link
Contributor Author

I had a look mostly at the non-ovs part and it looks good.

nit: fmt_ufid is added in a patch and removed in another, but never used.

Yep. The original patch is from @vlrpl and since we first considered merging his part independently I kept my update separate. But I think I can squash both and aim to merge everything together.

vlrpl and others added 8 commits March 7, 2025 10:44
Signed-off-by: Paolo Valerio <[email protected]>
Anyhow has a good way of wrapping errors and add context to them: the
Context trait.

By using it, collectors can yield "collector-level" errors that wrap
even lower errors and they all get printed nicely, e.g:

$ retis -p generic collect -c ovs --ovs-enrich-flows
Applying profile generic
Error: Could not initialize the Ovs collector

Caused by:
    0: Failed to connect to OpenvSwitch
    1: No such file or directory (os error 2)

Signed-off-by: Adrian Moreno <[email protected]>
This is a continuation of discussions that took place during [1].

Some collectors' configuration might affect the way section factories
operate. Pass the section factories to collectors during init() phase so
they can pass such configuration.

In order to make use of factories easier, make it a proper type.

The skb collector is the first user of this functionality.

[1] retis-org#464

Signed-off-by: Adrian Moreno <[email protected]>
While at it, n_{mask,cache}_hit have been included.

Ideally, this should be done with fexit indexing by sw_flow_key, but
given that is missing, it reuses the existing inflight_exec map.

Signed-off-by: Paolo Valerio <[email protected]>
Co-authored-by: Adrian Moreno <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
Signed-off-by: Paolo Valerio <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
The FlowEnricher thread creates a channel and exposes a sender that is
passed to the OvsEventFactory so that it can send new UFIDs for
enrichment.

For now, no much enrichment is done, leaving that to a future patch.

Signed-off-by: Adrian Moreno <[email protected]>
@amorenoz amorenoz changed the title WIP: OVS Detrace OVS Detrace Mar 12, 2025
The main concern here is to balance a between a reasonable enrichment
rate and avoid overloading OVS.

Calls to OVS are throttled and if f events come at a higher pace than
what the enricher thread can handle, old requests are dropped.

In addition, a registry keeps track of the UFIDs that have been
enriched, to avoid doing so twice. For this registry to work sf_acts and
flow pointers are added. This is because a UFID just represents the
key of the flow (the match part). By adding the sf_acts and flow
pointers we make sure the full flow is still the same.

Signed-off-by: Adrian Moreno <[email protected]>
A ctx hook is needed because the skb pointer is stored in a map by the
kprobe.

Signed-off-by: Adrian Moreno <[email protected]>
When sorting events, embed the flow enrichment result into the
FlowLookup events that have the same ufid, sw_acts, and flow.

Signed-off-by: Adrian Moreno <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
Signed-off-by: Adrian Moreno <[email protected]>
Collector initialization is currently done in two phases (init & start).
This is both unecessary (collection cannot be started again after it's
stop so having start/stop has no real value) and cumbersome (it makes us
store the `SectionFactories` object in an `Option` just to pass it from
one function to the next.

Merge them into a common `config` function.

This should not have any functional change.
Copy link
Contributor

@atenart atenart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(More than) partial review, leaving a few comments not to forget about them. Will have another look later.

Comment on lines +604 to +612
/// Check whether the map is empty.
pub(crate) fn is_empty(&self) -> bool {
self.0.is_empty()
}

/// Insert a new factory.
pub(crate) fn insert(&mut self, id: FactoryId, factory: Box<dyn EventSectionFactory>) {
self.0.insert(id, factory);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion only: if you implement the Deref trait those come for free, in addition to all the other inner methods.

Some(f) => Ok(Some(
f.as_any_mut()
.downcast_mut::<T>()
.ok_or_else(|| anyhow!("Failed to downcast"))?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: + "section factory with id {T::FACTORY_ID}" ?


impl FlowEnricher {
pub(crate) fn new(events_factory: Arc<RetisEventsFactory>) -> Result<Self> {
let (sender, receiver) = mpsc::channel::<EnrichRequest>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use a sync_channel instead, to avoid having an infinite growing buffer.

(We use non sync channels in other parts of the code, that's fine in practice for now as their readers are not the bottleneck; I'll fix their use in another PR).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run-functional-tests Request functional tests to be run by CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants