NETOBSERV-1092 Move CRD fields to advanced #467

jpinsonneau · 2023-10-20T09:28:00Z

Description

This PR moves unused API fields to debug sections under each component.
See NETOBSERV-1092 for the list of affected fields.

Dependencies

n/a

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

jpinsonneau · 2023-10-20T09:39:40Z

@msherif1234 Should we consider the eBPF cache fields as debug?

	// `cacheActiveTimeout` is the max period during which the reporter aggregates flows before sending.
	// Increasing `cacheMaxFlows` and `cacheActiveTimeout` can decrease the network traffic overhead and the CPU load,
	// however you can expect higher memory consumption and an increased latency in the flow collection.
	//+kubebuilder:validation:Pattern:=^\d+(ns|ms|s|m)?$
	//+kubebuilder:default:="5s"
	CacheActiveTimeout string `json:"cacheActiveTimeout,omitempty"`

	// `cacheMaxFlows` is the max number of flows in an aggregate; when reached, the reporter sends the flows.
	// Increasing `cacheMaxFlows` and `cacheActiveTimeout` can decrease the network traffic overhead and the CPU load,
	// however you can expect higher memory consumption and an increased latency in the flow collection.
	//+kubebuilder:validation:Minimum=1
	//+kubebuilder:default:=100000
	CacheMaxFlows int32 `json:"cacheMaxFlows,omitempty"`
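
To make the `cacheActiveTimeout` validation marker above concrete, here is a minimal sketch that checks a few values against the same pattern using Go's `regexp` package. The helper name `validTimeout` is hypothetical, and the API server applies its own validation engine, so treat this only as an approximation of what the CRD accepts.

```go
package main

import (
	"fmt"
	"regexp"
)

// timeoutPattern mirrors the kubebuilder validation pattern quoted above.
var timeoutPattern = regexp.MustCompile(`^\d+(ns|ms|s|m)?$`)

// validTimeout (hypothetical helper) reports whether s matches that pattern.
func validTimeout(s string) bool {
	return timeoutPattern.MatchString(s)
}

func main() {
	// The unit suffix is optional, and only ns, ms, s and m are allowed.
	for _, s := range []string{"5s", "100ms", "10", "1h", "5 s"} {
		fmt.Printf("%q valid=%v\n", s, validTimeout(s))
	}
}
```

Under this pattern, hours ("1h") and values containing spaces are rejected, while a bare number such as "10" passes because the unit group is optional.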

msherif1234 · 2023-10-20T11:48:17Z

I don't think so. If a user decides to tune those knobs, it will be because of a resource limitation or excess, and the change will stay in place, so they are more operational config knobs than debug ones, IMHO.
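
For readers unfamiliar with how these two knobs interact, here is a rough sketch of a cache that sends its content when either limit is reached. The `flowCache` type and its methods are hypothetical and are not the agent's actual code; a real agent would also evict on a timer, whereas this simplified version only checks the timeout when a flow is added.

```go
package main

import (
	"fmt"
	"time"
)

// flowCache is a hypothetical, simplified stand-in for the agent's flow cache.
type flowCache struct {
	maxFlows      int           // plays the role of cacheMaxFlows
	activeTimeout time.Duration // plays the role of cacheActiveTimeout
	flows         []string
	lastFlush     time.Time
	flushed       [][]string // batches that were "sent"
}

// add records a flow and flushes when either limit is reached.
func (c *flowCache) add(flow string, now time.Time) {
	c.flows = append(c.flows, flow)
	if len(c.flows) >= c.maxFlows || now.Sub(c.lastFlush) >= c.activeTimeout {
		c.flushed = append(c.flushed, c.flows)
		c.flows = nil
		c.lastFlush = now
	}
}

// runDemo feeds four flows and returns the batch count and the first two batch sizes.
func runDemo() (batches, first, second int) {
	start := time.Unix(0, 0)
	c := &flowCache{maxFlows: 3, activeTimeout: 5 * time.Second, lastFlush: start}
	c.add("a", start)
	c.add("b", start.Add(1*time.Second))
	c.add("c", start.Add(2*time.Second)) // size limit reached: one batch of 3
	c.add("d", start.Add(8*time.Second)) // 6s since last flush: one batch of 1
	return len(c.flushed), len(c.flushed[0]), len(c.flushed[1])
}

func main() {
	batches, first, second := runDemo()
	fmt.Println(batches, first, second) // prints "2 3 1"
}
```

Raising either limit means fewer, larger batches, which matches the field documentation: less network and CPU overhead at the cost of memory and latency.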

jotak · 2023-10-25T07:38:21Z

api/v1beta2/flowcollector_types.go

@@ -635,37 +591,10 @@ type FlowCollectorLoki struct {
 	// Set `enable` to `true` to store flows in Loki. It is required for the OpenShift Console plugin installation.
 	Enable *bool `json:"enable,omitempty"`

-	//+kubebuilder:default:="1s"


Like for the eBPF agent, I think it would be good to keep batchWait and batchSize out of Debug, for performance tuning?
BatchSize especially is something users need to keep consistent with the maximum message size defined on the Loki server side.

Well I based that on the Jira you initially wrote 🤔
If you changed your mind I can revert. Just let me know.

On my side, I would feel better as a user having that elsewhere, but debug is a strong word here.
Maybe we could just consider renaming the debug sections to something more generic, such as advanced?

jotak · 2023-10-25T07:42:19Z

Just a remark that we should probably take back the Loki batch settings; other than that, LGTM!

jpinsonneau · 2023-11-07T11:18:18Z

Sure, I moved the Loki batch wait, batch size and timeout to FLP since these are only related to the processor.
Since we now have 3 Loki-related fields and 4 Kafka ones, I can create subsections under processor. WDYT?

https://github.com/jpinsonneau/network-observability-operator/blob/1092/api/v1beta2/flowcollector_types.go#L392-L425

codecov · 2023-11-07T11:18:43Z

Codecov Report

Attention: 238 lines in your changes are missing coverage. Please review.

Comparison is base (da4bdd3) 56.27% compared to head (83166a8) 57.60%.

Files	Patch %	Lines
.../flowcollector/v1alpha1/zz_generated.conversion.go	0.00%	58 Missing ⚠️
...s/flowcollector/v1beta1/zz_generated.conversion.go	35.06%	50 Missing ⚠️
...pis/flowcollector/v1beta2/zz_generated.deepcopy.go	75.15%	37 Missing and 2 partials ⚠️
pkg/helper/crd.go	77.86%	23 Missing and 6 partials ⚠️
...is/flowcollector/v1alpha1/flowcollector_webhook.go	0.00%	27 Missing ⚠️
...pis/flowcollector/v1beta1/flowcollector_webhook.go	80.00%	18 Missing and 4 partials ⚠️
controllers/consoleplugin/consoleplugin_objects.go	66.66%	2 Missing and 4 partials ⚠️
pkg/helper/flowcollector.go	93.75%	4 Missing and 2 partials ⚠️
controllers/ovs/flowsconfig_ovnk_reconciler.go	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #467      +/-   ##
==========================================
+ Coverage   56.27%   57.60%   +1.32%     
==========================================
  Files          69       70       +1     
  Lines        9104     9446     +342     
==========================================
+ Hits         5123     5441     +318     
- Misses       3648     3668      +20     
- Partials      333      337       +4

Flag	Coverage Δ
unittests	`57.60% <66.75%> (+1.32%)`	⬆️

jotak · 2023-11-08T11:18:01Z

api/v1beta2/flowcollector_types.go

+	//+kubebuilder:default="1s"
+	// `minBackoff` is the initial backoff time for client connection between retries.
+	MinBackoff *metav1.Duration `json:"minBackoff,omitempty"` // Warning: keep as pointer, else default is ignored
+
+	//+kubebuilder:default="5s"
+	// `maxBackoff` is the maximum backoff time for client connection between retries.
+	MaxBackoff *metav1.Duration `json:"maxBackoff,omitempty"` // Warning: keep as pointer, else default is ignored
+
+	//+kubebuilder:validation:Minimum=0
+	//+kubebuilder:default:=2
+	// `maxRetries` is the maximum number of retries for client connections.
+	MaxRetries *int32 `json:"maxRetries,omitempty"`
+
+	//+kubebuilder:default:={"app":"netobserv-flowcollector"}
+	// +optional
+	// `staticLabels` is a map of common labels to set on each flow.
+	StaticLabels map[string]string `json:"staticLabels"`


I'm fine with your suggestion to move the FLP-related Loki config under processor.
For those 4 fields, I guess they should be prefixed with "Loki" like you did for the others.

Renamed while rebasing https://github.com/netobserv/network-observability-operator/compare/3ae36b13d67536bd3a8ca834c04813f9c72aad31..93e67291a0ab7be328dbe506037dd39c8e5b852a

jpinsonneau · 2023-11-21T15:21:30Z

api/v1beta1/flowcollector_webhook_test.go

@@ -29,7 +29,6 @@ func TestBeta1ConversionRoundtrip_Loki(t *testing.T) {
 					Enable:             true,
 					InsecureSkipVerify: true,
 				},
-				BatchSize: 1000,


For the beta1 -> beta2 -> beta1 cycle, we lose the BatchSize value since this field moved from the Loki spec to the Processor spec.

If we want to handle this, we'll need to manually override Convert_v1beta1_FlowCollector_To_v1beta2_FlowCollector and Convert_v1beta2_FlowCollector_To_v1beta1_FlowCollector to copy the fields.

Is it worth the price? @jotak @msherif1234

I think it's necessary to do it, else we lose the ability to set this field, since the conversions always happen between the served version (like v1beta2) and the stored version (v1beta1)?

(and same for all the other fields that are moved)
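
To illustrate what copying a moved field in both directions looks like, here is a minimal sketch with heavily simplified, hypothetical types. The real generated conversion functions have different signatures and operate on the full FlowCollector objects; the `LokiBatchSize` name and the `toV1beta2` / `toV1beta1` helpers are assumptions for illustration only.

```go
package main

import "fmt"

// Hypothetical, simplified v1beta1 shape: BatchSize lives under Loki.
type v1beta1Loki struct{ BatchSize int64 }
type v1beta1Spec struct{ Loki v1beta1Loki }

// Hypothetical, simplified v1beta2 shape: the field moved under Processor.
type v1beta2Processor struct{ LokiBatchSize int64 }
type v1beta2Spec struct{ Processor v1beta2Processor }

// toV1beta2 copies the moved field by hand, since name-based generated
// conversion cannot match fields that changed parent struct.
func toV1beta2(in v1beta1Spec) v1beta2Spec {
	return v1beta2Spec{Processor: v1beta2Processor{LokiBatchSize: in.Loki.BatchSize}}
}

// toV1beta1 is the reverse copy, needed for the roundtrip to be lossless.
func toV1beta1(in v1beta2Spec) v1beta1Spec {
	return v1beta1Spec{Loki: v1beta1Loki{BatchSize: in.Processor.LokiBatchSize}}
}

// roundTrip mimics the beta1 -> beta2 -> beta1 cycle from the test above.
func roundTrip(in v1beta1Spec) v1beta1Spec {
	return toV1beta1(toV1beta2(in))
}

func main() {
	in := v1beta1Spec{Loki: v1beta1Loki{BatchSize: 1000}}
	out := roundTrip(in)
	fmt.Println(out.Loki.BatchSize == in.Loki.BatchSize) // prints "true"
}
```

Without both manual copies, the value would be dropped in one direction and the roundtrip test would have to omit the field, as in the diff above.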

Also, I tested deploying the sample FlowCollector and it doesn't work; I think it still has all the old fields.

Hm, that's annoying: since it always runs the conversion webhook, you end up with debug fields in the YAML all the time, even if you directly apply the v1beta2 sample:

spec:
  agent:
    ebpf:
      ...
      debug: {}
  consolePlugin:
    ...
    debug:
      port: 9001
      register: true
  processor:
    ...
    debug:
      port: 2055
      lokiMaxRetries: 2
      lokiMaxBackoff: 5s
      conversationTerminatingTimeout: 5s
      conversationEndTimeout: 10s
      lokiStaticLabels:
        app: netobserv-flowcollector
      enableKubeProbes: true
      lokiMinBackoff: 1s
      healthPort: 8080
      dropUnusedFields: true
      conversationHeartbeatInterval: 30s

I would prefer to omit these when default.

I agree, but I don't see a perfect way to manage that...
Have in-code defaults, and fill these values only when they don't match the default? Not perfect, because it would force us to keep defaults defined in two places...

It would be nice to have a codegen tool like kubebuilder that automatically generates defaults as go consts from the kubebuilder annotations or from the CRD openAPI .... sounds like a hack'n'hustle project?

I found a way to load defaults from the CRD and show custom values only when set.
That required a bit of refactoring and creating a new CRD helper, but it works very well and the user still has autocompletion in the YAML.

For simplicity, I have created getters in the flowcollector helper that set defaults and load custom values into their related config, so we can rely on these everywhere in the code and don't need to check for nils.
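
The getter idea can be sketched roughly as follows. All names here (`advancedConfig`, `resolvedAdvanced`, `getAdvancedOrDefault`) are hypothetical, and the defaults are hard-coded for brevity, whereas the approach described above reads them from the CRD. The values 2055 and 8080 are taken from the processor debug section shown earlier in this thread.

```go
package main

import "fmt"

// advancedConfig is a hypothetical optional section: nil pointers mean
// "not set by the user", so nothing is written back to the YAML.
type advancedConfig struct {
	Port       *int32
	HealthPort *int32
}

// resolvedAdvanced holds concrete values that callers can use without nil checks.
type resolvedAdvanced struct {
	Port       int32
	HealthPort int32
}

// defaults are hard-coded here; in the PR they are loaded from the CRD schema.
var defaults = resolvedAdvanced{Port: 2055, HealthPort: 8080}

// getAdvancedOrDefault starts from the defaults and overlays any custom values.
func getAdvancedOrDefault(cfg *advancedConfig) resolvedAdvanced {
	out := defaults
	if cfg == nil {
		return out
	}
	if cfg.Port != nil {
		out.Port = *cfg.Port
	}
	if cfg.HealthPort != nil {
		out.HealthPort = *cfg.HealthPort
	}
	return out
}

func main() {
	customPort := int32(9999)
	fmt.Println(getAdvancedOrDefault(nil))
	fmt.Println(getAdvancedOrDefault(&advancedConfig{Port: &customPort}))
}
```

With this shape, the section can stay empty in the stored resource while the rest of the code always sees fully populated values.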

jpinsonneau · 2023-11-22T11:13:14Z

Added v1beta1 <> v1beta2 FlowCollector manual conversion to move Loki / Processor fields
Added webhook tests
Updated v1beta2 sample

https://github.com/netobserv/network-observability-operator/compare/93e67291a0ab7be328dbe506037dd39c8e5b852a..d814d996a12b1345fe8e567acd92e46a3a80867c

jpinsonneau · 2024-01-02T10:34:03Z

Found a bug in the conversion webhook: 04269b8

github-actions · 2024-01-17T13:32:20Z

New images:

quay.io/netobserv/network-observability-operator:8b85581
quay.io/netobserv/network-observability-operator-bundle:v0.0.0-8b85581
quay.io/netobserv/network-observability-operator-catalog:v0.0.0-8b85581

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:8b85581 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-8b85581

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-8b85581
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

openshift-ci · 2024-01-18T11:02:13Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jotak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [jotak]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

memodi · 2024-01-18T22:21:05Z

/test e2e-operator

read defaults from CRD fix merge Feedback on Debug crd sections
- As suggested by Julien, rename Debug to Advanced
- Merge back loki fields into Loki section

- Add webhook tests on Advanced sections
- Fix loki.advanced not being overwritten from annotations on conversion
- StaticLabel isn't a pointer to map anymore, as maps are already nillable
- fix alm sample with renamed fields
- generalize CRD setup for all suite_tests

- Some loki settings, when provided by a v1beta1 CR, were ignored, such as batchWait/Size
- Found a minor day-0 bug in console: when the port setting is changed, console plugin was unreachable because the pod didn't use that port (only service did). It seems nobody ever changed that setting >_<

jpinsonneau · 2024-01-19T16:06:53Z

I've rebased again since some changes were introduced in #501, requiring the 83166a8 update.

nathan-weinberg · 2024-01-24T19:12:18Z

/ok-to-test

jpinsonneau requested review from jotak, msherif1234 and OlivierCazade October 20, 2023 09:28

jpinsonneau added ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. testing-breaking-change labels Oct 20, 2023

jotak reviewed Oct 25, 2023

openshift-merge-robot added the needs-rebase label Nov 1, 2023

jpinsonneau force-pushed the 1092 branch from 2cbe30a to 3ae36b1 Compare November 7, 2023 11:15

openshift-merge-robot removed the needs-rebase label Nov 7, 2023

github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Nov 7, 2023

jotak reviewed Nov 8, 2023

openshift-merge-robot added the needs-rebase label Nov 8, 2023

jpinsonneau force-pushed the 1092 branch from 3ae36b1 to 93e6729 Compare November 21, 2023 15:16

openshift-merge-robot removed the needs-rebase label Nov 21, 2023

jpinsonneau commented Nov 21, 2023

jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Nov 21, 2023

jpinsonneau force-pushed the 1092 branch from 93e6729 to d814d99 Compare November 22, 2023 11:11

github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Nov 22, 2023

jpinsonneau requested a review from jotak November 22, 2023 11:13

jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Nov 24, 2023

openshift-merge-robot added the needs-rebase label Dec 1, 2023

jpinsonneau force-pushed the 1092 branch from d814d99 to e3fe142 Compare December 4, 2023 09:49

openshift-merge-robot removed the needs-rebase label Dec 4, 2023

github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Dec 4, 2023

jpinsonneau force-pushed the 1092 branch from 864bf32 to 1a76dc3 Compare December 14, 2023 12:20

jpinsonneau mentioned this pull request Dec 20, 2023

NETOBSERV-1443 Console plugin loki timeout should be configurable #522

Merged

10 tasks

jpinsonneau mentioned this pull request Jan 5, 2024

OSDOCS-8253: Improved LokiStack integration openshift/openshift-docs#67836

Merged

1 task

openshift-merge-robot added the needs-rebase label Jan 16, 2024

jotak added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 17, 2024

jotak force-pushed the 1092 branch from 3e7be1f to 1d145e8 Compare January 18, 2024 10:00

openshift-merge-robot removed the needs-rebase label Jan 18, 2024

github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Jan 18, 2024

jotak approved these changes Jan 18, 2024

openshift-ci bot assigned jotak Jan 18, 2024

openshift-ci bot added the lgtm label Jan 18, 2024

openshift-ci bot added the approved label Jan 18, 2024

jotak removed the approved label Jan 18, 2024

jpinsonneau changed the title ~~NETOBSERV-1092 Move CRD fields to debug~~ NETOBSERV-1092 Move CRD fields to advanced Jan 18, 2024

jpinsonneau and others added 4 commits January 19, 2024 17:00

move fields to debug

14e7263

read defaults from CRD fix merge Feedback on Debug crd sections - As suggested by Julien, rename Debug to Advanced - Merge back loki fields into Loki section

fix rebase

83166a8

jpinsonneau force-pushed the 1092 branch from b5450c9 to 83166a8 Compare January 19, 2024 16:04

openshift-ci bot added approved and removed lgtm labels Jan 19, 2024

jotak approved these changes Jan 22, 2024

openshift-ci bot added the lgtm label Jan 22, 2024

openshift-merge-bot bot merged commit bf1562e into netobserv:main Jan 22, 2024
12 checks passed
