
perf: [NPM] [LINUX] add NetPols in background #1969

Merged: 88 commits merged into master from hgregory/05-18-netpol-background on Jul 19, 2023

Conversation

@huntergregory (Contributor) commented on May 18, 2023

Reason for Change:
NPM must use nftables on Ubuntu 22 (k8s 1.25+). NPM syscalls into nftables take much longer than syscalls into legacy iptables. Customers with ~100+ NetPols can see:

  • Application timeouts (due to increased latencies while xtables lock is held).
  • Delay in applying NetPols.


Latency Comparison

Ran 2 experiments for each version. Used b-series burstable VMs.

Cluster has 640 services with 2 Pods each.

| Version | Applying 640 NetPols | Max iptables Restore Latency | List all rules (once at bootup) |
| --- | --- | --- | --- |
| Legacy (Ubuntu 18) | 4 minutes | 0.4 seconds | 0.01 seconds |
| nftables (Ubuntu 22) | 9-22 hours | 3-5 minutes | 5-11 minutes |
| This PR (for nftables) | 13-17 minutes | 3-5 minutes | 5-11 minutes (same codepath) |

Fix: Long Term

nftables do not scale for services · Issue #96018 · kubernetes/kubernetes (github.com)

iptables v1.8.8 has scale improvements. However, AKS is on iptables v1.8.4.

Fix: Near Term (this PR)

Optimizing for ADD NetworkPolicy.

Current Design

ADD NetworkPolicy:

  1. Create all IPSets at once via ipset restore.
  2. Create all iptables rules/chains at once via iptables-restore.
[Diagram: Current Design for Linux NPM]
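As a rough illustration of step 2 above, the restore-based approach shells out to iptables-restore with a generated rule payload. The sketch below is a hedged illustration, not the repository's actual code: the function name and payload handling are assumptions; only the use of iptables-restore and the --noflush flag come from this PR's discussion.

```go
// Illustrative sketch of feeding rendered chains/rules to iptables-restore.
package policies

import (
	"bytes"
	"fmt"
	"os/exec"
)

// runIPTablesRestore applies the rendered chains/rules for one or more NetPols in a
// single call. --noflush keeps existing chains intact; -w waits on the xtables lock.
func runIPTablesRestore(payload string) error {
	cmd := exec.Command("iptables-restore", "--noflush", "-w")
	cmd.Stdin = bytes.NewBufferString(payload)
	out, err := cmd.CombinedOutput()
	if err != nil {
		return fmt.Errorf("iptables-restore failed: %v, output: %s", err, out)
	}
	return nil
}
```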

This PR

Since iptables syscalls take so long, add iptables rules for multiple NetworkPolicies at once.
This change only takes effect if nftables is present on the machine and NPM's ConfigMap has NetPolInBackground=true.
[Diagram: Redesign for Linux NPM]
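A minimal sketch of the batching idea just described: AddPolicy enqueues instead of immediately calling iptables-restore, and a background worker flushes either when the batch is full or on a timer, so many NetPols share one restore call. The threshold of 100 matches the defaultMaxPendingNetPols discussed later in this review; the flush interval and all type/method names here are illustrative assumptions, not the PR's actual implementation (flush would stand in for something like addPoliciesWithRetry).

```go
// Simplified sketch of a background NetPol batcher; not the PR's exact code.
package dataplane

import (
	"sync"
	"time"
)

const (
	maxPendingNetPols = 100                    // mirrors defaultMaxPendingNetPols from the review discussion
	flushInterval     = 500 * time.Millisecond // illustrative; the real interval is configurable
)

type backgroundNetPolAdder struct {
	mu      sync.Mutex
	pending int                // NetPols queued for the next combined iptables-restore
	flush   func(reason string) // e.g. a call into addPoliciesWithRetry
}

// addPolicy queues a NetPol and flushes once the batch is full, instead of
// running one iptables-restore per NetPol.
func (b *backgroundNetPolAdder) addPolicy() {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.pending++
	if b.pending >= maxPendingNetPols {
		b.flush("batch full")
		b.pending = 0
	}
}

// run flushes leftover NetPols periodically so small bursts are not delayed forever.
func (b *backgroundNetPolAdder) run(stop <-chan struct{}) {
	ticker := time.NewTicker(flushInterval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			b.mu.Lock()
			if b.pending > 0 {
				b.flush("interval")
				b.pending = 0
			}
			b.mu.Unlock()
		}
	}
}
```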

Other Changes in this PR

Pipelines

  • Stop running pipelines for NPM v1.
  • Enable Linux Scale test.

Metrics

Adds the following metrics:

  1. npm_linux_iptables_restore_latency_seconds
  2. npm_linux_iptables_delete_latency_seconds


# HELP npm_linux_iptables_delete_latency_seconds Latency in seconds to delete an iptables rule
# TYPE npm_linux_iptables_delete_latency_seconds histogram
npm_linux_iptables_delete_latency_seconds_bucket{le="0.016"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="0.032"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="0.064"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="0.128"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="0.256"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="0.512"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="1.024"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="2.048"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="4.096"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="8.192"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="16.384"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="32.768"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="65.536"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="131.072"} 0
npm_linux_iptables_delete_latency_seconds_bucket{le="+Inf"} 0
npm_linux_iptables_delete_latency_seconds_sum 0
npm_linux_iptables_delete_latency_seconds_count 0
# HELP npm_linux_iptables_restore_latency_seconds Latency in seconds to restore iptables rules by operation label (add/delete NetPol)
# TYPE npm_linux_iptables_restore_latency_seconds histogram
npm_linux_iptables_restore_latency_seconds_bucket{operation="create",le="0.016"} 0
...
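The sample output above shows exponential buckets from 0.016s to 131.072s. Below is a hedged sketch of how such histograms could be registered and observed with prometheus/client_golang; the metric names and help strings come from the output above, while the registration approach, helper functions, and package name are assumptions rather than the PR's actual code.

```go
// Illustrative registration of the two new histograms; not the repository's exact code.
package metrics

import (
	"time"

	"github.com/prometheus/client_golang/prometheus"
)

var (
	restoreLatency = prometheus.NewHistogramVec(prometheus.HistogramOpts{
		Name:    "npm_linux_iptables_restore_latency_seconds",
		Help:    "Latency in seconds to restore iptables rules by operation label (add/delete NetPol)",
		Buckets: prometheus.ExponentialBuckets(0.016, 2, 14), // 0.016s .. 131.072s, as in the output above
	}, []string{"operation"})

	deleteLatency = prometheus.NewHistogram(prometheus.HistogramOpts{
		Name:    "npm_linux_iptables_delete_latency_seconds",
		Help:    "Latency in seconds to delete an iptables rule",
		Buckets: prometheus.ExponentialBuckets(0.016, 2, 14),
	})
)

func init() {
	prometheus.MustRegister(restoreLatency, deleteLatency)
}

// RecordIPTablesRestoreLatency would wrap each iptables-restore call.
func RecordIPTablesRestoreLatency(start time.Time, operation string) {
	restoreLatency.WithLabelValues(operation).Observe(time.Since(start).Seconds())
}

// RecordIPTablesDeleteLatency would wrap each foreground jump-rule delete.
func RecordIPTablesDeleteLatency(start time.Time) {
	deleteLatency.Observe(time.Since(start).Seconds())
}
```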

@huntergregory requested a review from vakalapa on May 18, 2023 19:46
@huntergregory force-pushed the hgregory/05-18-netpol-background branch from 21bda59 to 849b570 on May 18, 2023 19:50
@@ -4,6 +4,8 @@ import "github.com/Azure/azure-container-networking/npm/util"

const (
defaultResyncPeriod = 15
defaultNetPolMaxBatches = 100
Contributor:

Any reason we cannot reuse the defaultApplyMaxBatches and defaultApplyInterval?

Contributor Author (@huntergregory):

Discussed offline: we will have separate threads/config for Windows and Linux.

// 1. Delete jump rules from ingress/egress chains to ingress/egress policy chains.
// We ought to delete these jump rules here in the foreground since if we add an NP back after deleting, iptables-restore --noflush can add duplicate jump rules.
deleteErr := pMgr.deleteOldJumpRulesOnRemove(networkPolicy)
if deleteErr != nil {
Contributor Author (@huntergregory):

Need to ignore errors here.

Contributor Author (@huntergregory):

We actually already ignore doesNotExistErrorCode in case the rule doesn't exist.
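For illustration, a minimal Go sketch of the behavior described in this exchange: treating an iptables "rule does not exist" exit code as success when deleting a jump rule. The exit-code constant, command arguments, and function name are assumptions, not the repository's actual identifiers.

```go
// Sketch of tolerating a missing rule when deleting a jump rule; illustrative only.
package policies

import (
	"fmt"
	"os/exec"
)

const doesNotExistErrorCode = 1 // hypothetical value; legacy iptables returns nonzero when -D finds no match

// deleteJumpRule deletes a jump rule and treats "already gone" as success.
func deleteJumpRule(args ...string) error {
	cmd := exec.Command("iptables", append([]string{"-w", "-D"}, args...)...)
	if err := cmd.Run(); err != nil {
		if exitErr, ok := err.(*exec.ExitError); ok && exitErr.ExitCode() == doesNotExistErrorCode {
			// Rule was already absent; nothing to clean up.
			return nil
		}
		return fmt.Errorf("failed to delete jump rule: %w", err)
	}
	return nil
}
```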

@huntergregory (Contributor Author):

TODO: only apply netpol in background if nf-tables is present

@huntergregory changed the title from "perf: [NPM] [LINUX] apply dirty NetPols in background" to "perf: [NPM] [LINUX] add NetPols in background" on Jun 16, 2023
@@ -7,6 +7,8 @@ const (
defaultApplyMaxBatches = 100
defaultApplyInterval = 500
defaultMaxBatchedACLsPerPod = 30
defaultMaxPendingNetPols = 100
Contributor:

Isn't 100 too much? Can we have something like 10?

@huntergregory (Contributor Author) commented on Jun 22, 2023:

Tested with MaxPendingNetPols=10. This doesn't seem to give us much of a performance boost: it took 12 hours to add 640 NetPols. With MaxPendingNetPols=100, the same takes <20 minutes (the table in the PR description has more details).

Contributor Author (@huntergregory):

With MaxPendingNetPols=10, I also saw 1/10 NPM Pods crash with this error after adding 625/640 NetPols, 11.5 hours after NPM came online:

I0622 05:55:28.926162       1 trace.go:219] Trace[1241999867]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169 (22-Jun-2023 05:54:24.710) (total time: 64119ms):
Trace[1241999867]: ---"Objects listed" error:<nil> 64118ms (05:55:28.828)
Trace[1241999867]: [1m4.119544167s] [1m4.119544167s] END
I0622 05:55:29.727548       1 trace.go:219] Trace[257377008]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169 (22-Jun-2023 05:54:24.512) (total time: 65114ms):
Trace[257377008]: ---"Objects listed" error:<nil> 65018ms (05:55:29.531)
Trace[257377008]: [1m5.114375378s] [1m5.114375378s] END


// Prevent netpol in background unless we're in Linux and using nftables.
// This step must be performed after bootupDataplane() because it calls util.DetectIptablesVersion(), which sets the proper value for util.Iptables
dp.netPolInBackground = cfg.NetPolInBackground && !util.IsWindowsDP() && (strings.Contains(util.Iptables, "nft") || dp.debug)
Contributor:

Are we saying we will only make this change for nft? Why not for legacy too?

Contributor Author (@huntergregory):

Discussed offline: we will only do this for nft. We will keep the codepath the same for legacy, which doesn't need the performance boost.

// The caller must lock netPolQueue.
func (dp *DataPlane) addPoliciesWithRetry(context string) {
netPols := dp.netPolQueue.dump()
klog.Infof("[DataPlane] adding policies %+v", netPols)
Contributor:

Can we make this a debug statement?

Contributor Author (@huntergregory):

It seems like we don't have debug logging capability via klog.

// 2. get Endpoints
var err error
var endpointList map[string]string
if !dp.inBootupPhase() {
Contributor:

Is this condition correct? The only condition for wasInBootPhase to be true is if dp.inBootupPhase() returns true.

Here you are checking the opposite along with wasInBootPhase; I think we will never hit this condition, unless I am missing something?

@huntergregory (Contributor Author) commented on Jun 21, 2023:

The idea is to protect against a race:

  1. DP in bootup phase.
  2. dp.incrementBatchAndApplyIfNeeded() is called in the last iteration of the above loop.
  3. dp.FinishBootupPhase() runs in another thread.

A few more details are in the code comment below; a rough illustration of the guard follows.
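The self-contained Go illustration below restates that race with simplified stand-in types; only inBootupPhase, FinishBootupPhase, and incrementBatchAndApplyIfNeeded come from the discussion above, and the field names and endpoint-handling details are assumptions.

```go
// Simplified stand-in for the bootup-phase race guard; not the PR's actual types.
package dataplane

import "sync"

type DataPlane struct {
	mu       sync.Mutex
	inBootup bool
}

func (dp *DataPlane) inBootupPhase() bool {
	dp.mu.Lock()
	defer dp.mu.Unlock()
	return dp.inBootup
}

// FinishBootupPhase may run on another goroutine at any point.
func (dp *DataPlane) FinishBootupPhase() {
	dp.mu.Lock()
	defer dp.mu.Unlock()
	dp.inBootup = false
}

func (dp *DataPlane) addPolicies(netPols []string) {
	wasInBootupPhase := dp.inBootupPhase()
	for range netPols {
		// dp.incrementBatchAndApplyIfNeeded() would run here, and
		// FinishBootupPhase() could fire concurrently from another goroutine.
	}
	// At first glance this looks unreachable (wasInBootupPhase is only true when
	// inBootupPhase() was true), but it fires exactly when bootup ended while the
	// loop above was running; endpoints must then be fetched the normal way.
	if wasInBootupPhase && !dp.inBootupPhase() {
		// re-fetch endpoints via the non-bootup path
	}
}
```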

@huntergregory requested a review from a team as a code owner on June 23, 2023 18:52
@huntergregory force-pushed the hgregory/05-18-netpol-background branch from 3503c1c to 6ad7639 on June 23, 2023 18:57
// addPoliciesWithRetry tries adding all policies. If this fails, it tries adding policies one by one.
// The caller must lock netPolQueue.
func (dp *DataPlane) addPoliciesWithRetry(context string) {
netPols := dp.netPolQueue.dump()
Member:

We're storing the background queue of NetPols as a map, but when applying we convert from the map back to a slice. Any reason not to store the background queue as a slice and avoid this conversion?

Contributor Author (@huntergregory):

Having the map helps with dp.RemovePolicy(), where we remove the NetPol from the queue.
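A small Go sketch of the trade-off in this exchange, assuming an illustrative netPolQueue shape: keying the queue by NetPol key makes removal O(1) for the RemovePolicy path, while dump() pays a map-to-slice conversion once per batch. Field, type, and method names here are simplified, not the repository's exact code.

```go
// Illustrative map-backed pending-NetPol queue; not the PR's actual implementation.
package dataplane

import "sync"

type NPMNetworkPolicy struct {
	PolicyKey string // e.g. "namespace/name"
}

type netPolQueue struct {
	mu      sync.Mutex
	pending map[string]*NPMNetworkPolicy // keyed by NetPol key
}

// enqueue records a NetPol to be added in the next background batch.
func (q *netPolQueue) enqueue(p *NPMNetworkPolicy) {
	q.mu.Lock()
	defer q.mu.Unlock()
	q.pending[p.PolicyKey] = p
}

// delete drops a pending NetPol; this is the RemovePolicy path that motivates the map.
func (q *netPolQueue) delete(policyKey string) {
	q.mu.Lock()
	defer q.mu.Unlock()
	delete(q.pending, policyKey)
}

// dump converts the map to a slice for a batched apply such as addPoliciesWithRetry.
func (q *netPolQueue) dump() []*NPMNetworkPolicy {
	q.mu.Lock()
	defer q.mu.Unlock()
	out := make([]*NPMNetworkPolicy, 0, len(q.pending))
	for _, p := range q.pending {
		out = append(out, p)
	}
	return out
}
```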

@huntergregory (Contributor Author):

/azp run

@azure-pipelines:

No commit pushedDate could be found for PR 1969 in repo Azure/azure-container-networking

@huntergregory (Contributor Author):

/azp run

@azure-pipelines:

Azure Pipelines successfully started running 4 pipeline(s), but failed to run 1 pipeline(s).

@huntergregory (Contributor Author):

/azp run

@azure-pipelines:

Azure Pipelines successfully started running 4 pipeline(s), but failed to run 1 pipeline(s).

@huntergregory (Contributor Author):

/azp run

@azure-pipelines:

Azure Pipelines successfully started running 4 pipeline(s), but failed to run 1 pipeline(s).

@vakalapa merged commit ebddca1 into master on Jul 19, 2023
@vakalapa deleted the hgregory/05-18-netpol-background branch on July 19, 2023 16:13
@xanecs commented on Jul 27, 2023:

@huntergregory why not update the iptables binary that is used in the docker image?

@huntergregory (Contributor Author):

> @huntergregory why not update the iptables binary that is used in the docker image?

Hi @xanecs, the limitation seems to be in the iptables version installed on the AKS node.

Labels: linux, npm (Related to NPM)