perf: [NPM] [LINUX] add NetPols in background #1969

Merged (88 commits) on Jul 19, 2023

Commits
01917c6
wip: apply dirty NetPols every 500ms in Linux
huntergregory May 18, 2023
521417e
only build npm linux image
huntergregory May 18, 2023
c6b0da6
fix: check for empty cache
huntergregory May 18, 2023
2b23891
feat: toggle for netpol interval. default 500 ms
huntergregory May 18, 2023
6f3538c
ci: remove stages "build binaries" and "run windows tests"
huntergregory May 18, 2023
b664e9d
wip: max batched netpols (toggle-specified)
huntergregory May 18, 2023
b84f11b
ci: remove manifest build/push for win npm
huntergregory May 18, 2023
1f9767a
wip: handle ipset deletion properly and max batch for delete too
huntergregory May 19, 2023
ab494e5
fix: correct remove policy
huntergregory May 19, 2023
9d400d3
fix: only remove policy if it was in kernel
huntergregory May 19, 2023
a914945
finalize toggles, allowing ability to turn off iptablesInBackground
huntergregory May 19, 2023
67743b3
ci: conf + cyc use PR's configmaps
huntergregory May 19, 2023
71080f0
fix: lints
huntergregory May 19, 2023
1ec9b81
fix dp toggle: iptablesInBackground
huntergregory May 19, 2023
adb4473
fix lock typo and config logging
huntergregory May 22, 2023
5b3169f
fix background thread. add comments. only add tmp ref when enabled
huntergregory May 22, 2023
ef0efe2
copy pod selector list
huntergregory May 22, 2023
940337d
fix: removepolicy needs namespace too
huntergregory May 22, 2023
9e5e699
rename opInfo to event
huntergregory May 22, 2023
abf3638
fix: fix references and prevent concurrent map read/write
huntergregory May 22, 2023
c27e507
tmp: debug logging
huntergregory May 22, 2023
0336346
fix: missing set references by swap keys and values
huntergregory May 22, 2023
1e8d6af
Revert "tmp: debug logging"
huntergregory May 22, 2023
05ba523
fix: add podSelectorList to fake NetPol
huntergregory May 22, 2023
1da93f2
log: do not print error when failing to delete non-existent nft rule
huntergregory May 22, 2023
546857d
log: verbose iptables bootup
huntergregory May 23, 2023
f7e06f3
log: use fmt.Errorf for clean logging
huntergregory May 23, 2023
c876d86
log: never return error for iptables in background and fix some lints
huntergregory May 23, 2023
8267e80
fix: activate/deactivate azure chain rules
huntergregory May 23, 2023
ed5de88
fix: correctly decrement netpols in kernel
huntergregory May 23, 2023
9a3cd62
ci: run UTs again
huntergregory May 23, 2023
173fa77
ci: update profiles. default to placefirst=false
huntergregory May 23, 2023
d904251
address comment: rename batch to pendingPolicy
huntergregory May 23, 2023
bcf4ee7
refactor: make dirty cache OS-specific
huntergregory May 23, 2023
ed82252
test: UTs
huntergregory May 23, 2023
f0329ba
test: put UT cfg back to placefirst to not break things
huntergregory May 23, 2023
b8de686
ci: update cyclonus workflows
huntergregory May 23, 2023
807f6df
fmt: address comment & lint
huntergregory May 23, 2023
2cc8283
fmt: rename numInKernel to policiesInKernel
huntergregory May 23, 2023
a0b0335
log: switch to fmt.Errorf
huntergregory May 23, 2023
134ec4e
fmt: whitespace
huntergregory May 24, 2023
9dd332f
feat: resiliency to errors while reconciling dirty netpols
huntergregory May 24, 2023
b8a11ca
log: temporarily print everything for ipset restore
huntergregory May 24, 2023
10e3257
fix: remove nomatch from ipset -D for cidr blocks
huntergregory Jun 1, 2023
416d981
test: UTs for non-happy path
huntergregory Jun 2, 2023
ddcfaee
test: fix hns fake
huntergregory Jun 2, 2023
57d7c17
fix: don't change windows. let it delete ipsets when removing policies
huntergregory Jun 2, 2023
1d51247
fix windows lint
huntergregory Jun 2, 2023
23ee4f1
fix: ignore chain doesn't exist errors for iptables -D
huntergregory Jun 2, 2023
8518bda
feat: latency and failure metrics
huntergregory Jun 2, 2023
76b37c7
test: update exit code for UT
huntergregory Jun 2, 2023
ae9d755
metrics: new metrics should go in node-metrics path
huntergregory Jun 2, 2023
8f395ec
Merge branch 'master' into hgregory/05-18-netpol-background
huntergregory Jun 2, 2023
bb70d52
Merge branch 'master' into hgregory/05-18-netpol-background
huntergregory Jun 9, 2023
7df05fa
style: simplify nesting
huntergregory Jun 9, 2023
3700763
style: move identical windows & linux code to shared file
huntergregory Jun 9, 2023
a95bef4
ci: remove v1 conformance and cyclonus
huntergregory Jun 9, 2023
eace676
feat: add NetPols in background from the DP (revert background code i…
huntergregory Jun 9, 2023
d96413d
style: remove "background" from iptables metrics
huntergregory Jun 9, 2023
dec20ed
revert changes in ipsetmanager, const.go, and dp.Remove/UpdatePolicy
huntergregory Jun 12, 2023
a454548
style: whitespace
huntergregory Jun 12, 2023
9502a42
perf: use len() instead of creating slice from map
huntergregory Jun 12, 2023
ca5e7ce
remove verbosity for iptables bootup
huntergregory Jun 12, 2023
c72ccdf
build: add return statement
huntergregory Jun 12, 2023
e952178
style: whitespace
huntergregory Jun 12, 2023
86cce28
build: fix variable shadowing
huntergregory Jun 12, 2023
badd2c7
build: fix more import shadowing
huntergregory Jun 12, 2023
cb2b426
build: windows pointer issue and UT issue
huntergregory Jun 12, 2023
0dc1df0
test: fix UT for iptables error code 2
huntergregory Jun 12, 2023
82e8497
ci: enable linux scale test
huntergregory Jun 12, 2023
6e0dc1b
ci: revert to master pipeline.yaml
huntergregory Jun 12, 2023
695fc1f
revert changes to chain-management. do changes in PR #2012
huntergregory Jun 12, 2023
c89c7f8
Merge branch 'master' into hgregory/05-18-netpol-background
huntergregory Jun 12, 2023
483123e
log: change wording
huntergregory Jun 13, 2023
9e9812e
test: UTs for netpol in background
huntergregory Jun 13, 2023
3ea0bab
log: wording
huntergregory Jun 13, 2023
12b1318
feat: apply ipsets for each netpol individually
huntergregory Jun 13, 2023
f754343
config: rearrange ConfigMap & update capz yaml
huntergregory Jun 13, 2023
43afc64
fix: windows bootup phase logic for addpolicy
huntergregory Jun 16, 2023
ff1dac6
feat: restrict netpol in background to linux + nftables
huntergregory Jun 16, 2023
36da0d7
test: skip nftables check for UT
huntergregory Jun 16, 2023
9a117d2
Merge branch 'master' into hgregory/05-18-netpol-background
vakalapa Jun 20, 2023
f615436
style: netpols[0] instead of loop
huntergregory Jun 21, 2023
33732fd
log: address log comments
huntergregory Jun 21, 2023
6ad7639
Merge branch 'master' into hgregory/05-18-netpol-background
huntergregory Jun 23, 2023
f337aee
style: lint for long line
huntergregory Jun 23, 2023
cd40a02
Merge branch 'master' into hgregory/05-18-netpol-background
huntergregory Jul 17, 2023
35f07f3
Merge branch 'master' into hgregory/05-18-netpol-background
huntergregory Jul 18, 2023
@@ -15,11 +15,10 @@ jobs:
# run cyclonus tests in parallel for NPM with the given ConfigMaps
profile:
[
v1-default.yaml,
v1-place-azure-chain-first.yaml,
v2-default.yaml,
v2-apply-on-need.yaml,
v2-place-azure-after-kube-services.yaml,
v2-background.yaml,
v2-foreground.yaml,
v2-place-first.yaml,
]
steps:
- name: Checkout
8 changes: 7 additions & 1 deletion .github/workflows/cyclonus-netpol-test.yaml
@@ -20,7 +20,13 @@ jobs:
strategy:
matrix:
# run cyclonus tests in parallel for NPM with the given ConfigMaps
profile: [v1-default.yaml, v1-place-azure-chain-first.yaml, v2-default.yaml, v2-apply-on-need.yaml, v2-place-azure-after-kube-services.yaml]
profile:
[
v2-apply-on-need.yaml,
v2-background.yaml,
v2-foreground.yaml,
v2-place-first.yaml,
]
steps:
- name: Checkout
uses: actions/checkout@v3
32 changes: 16 additions & 16 deletions .pipelines/npm/npm-conformance-tests.yaml
@@ -90,21 +90,21 @@ jobs:
displayName: "Run Kubernetes Network Policy Test Suite"
strategy:
matrix:
v1-default:
AZURE_CLUSTER: "conformance-v1-default"
PROFILE: "v1-default"
v2-foreground:
AZURE_CLUSTER: "conformance-v2-foreground"
PROFILE: "v2-foreground"
IS_STRESS_TEST: "false"
v2-default:
AZURE_CLUSTER: "conformance-v2-default"
PROFILE: "v2-default"
v2-background:
AZURE_CLUSTER: "conformance-v2-background"
PROFILE: "v2-background"
IS_STRESS_TEST: "false"
v2-default-ws22:
AZURE_CLUSTER: "conformance-v2-default-ws22"
v2-ws22:
AZURE_CLUSTER: "conformance-v2-ws22"
PROFILE: "v2-default-ws22"
IS_STRESS_TEST: "false"
v2-default-stress:
AZURE_CLUSTER: "conformance-v2-default-stress"
PROFILE: "v2-default"
v2-linux-stress:
AZURE_CLUSTER: "conformance-v2-linux-stress"
PROFILE: "v2-background"
IS_STRESS_TEST: "true"
pool:
name: $(BUILD_POOL_NAME_DEFAULT)
@@ -117,7 +117,7 @@ jobs:
TAG: $[ dependencies.setup.outputs['EnvironmentalVariables.TAG'] ]
FQDN: empty
steps:
- checkout: none
- checkout: self
- download: current
artifact: Test

@@ -200,7 +200,7 @@ jobs:
fi

az aks get-credentials -n $(AZURE_CLUSTER) -g $(RESOURCE_GROUP) --file ./kubeconfig
./kubectl --kubeconfig=./kubeconfig apply -f https://raw.githubusercontent.com/Azure/azure-container-networking/master/npm/examples/windows/azure-npm.yaml
./kubectl --kubeconfig=./kubeconfig apply -f $(Pipeline.Workspace)/s/npm/examples/windows/azure-npm.yaml
./kubectl --kubeconfig=./kubeconfig set image daemonset/azure-npm-win -n kube-system azure-npm=$IMAGE_REGISTRY/azure-npm:windows-amd64-ltsc2022-$(TAG)

else
@@ -219,13 +219,13 @@
az aks get-credentials -n $(AZURE_CLUSTER) -g $(RESOURCE_GROUP) --file ./kubeconfig

# deploy azure-npm
./kubectl --kubeconfig=./kubeconfig apply -f https://raw.githubusercontent.com/Azure/azure-container-networking/master/npm/azure-npm.yaml
./kubectl --kubeconfig=./kubeconfig apply -f $(Pipeline.Workspace)/s/npm/azure-npm.yaml

# swap azure-npm image with one built during run
./kubectl --kubeconfig=./kubeconfig set image daemonset/azure-npm -n kube-system azure-npm=$IMAGE_REGISTRY/azure-npm:linux-amd64-$(TAG)

# swap NPM profile with one specified as parameter
./kubectl --kubeconfig=./kubeconfig apply -f https://raw.githubusercontent.com/Azure/azure-container-networking/master/npm/profiles/$(PROFILE).yaml
./kubectl --kubeconfig=./kubeconfig apply -f $(Pipeline.Workspace)/s/npm/profiles/$(PROFILE).yaml
./kubectl --kubeconfig=./kubeconfig rollout restart ds azure-npm -n kube-system
fi

@@ -437,7 +437,7 @@ jobs:
chmod +x kubectl

# deploy azure-npm
./kubectl --kubeconfig=./kubeconfig apply -f https://raw.githubusercontent.com/Azure/azure-container-networking/master/npm/examples/windows/azure-npm.yaml
./kubectl --kubeconfig=./kubeconfig apply -f $(Pipeline.Workspace)/s/npm/examples/windows/azure-npm.yaml

# swap azure-npm image with one built during run
./kubectl --kubeconfig=./kubeconfig set image daemonset/azure-npm-win -n kube-system azure-npm=$IMAGE_REGISTRY/azure-npm:windows-amd64-ltsc2022-$(TAG)
8 changes: 4 additions & 4 deletions .pipelines/npm/npm-scale-test.yaml
@@ -78,10 +78,10 @@ jobs:
FQDN: empty
strategy:
matrix:
# v2-linux:
# PROFILE: "sc-lin"
# NUM_NETPOLS: 800
# INITIAL_CONNECTIVITY_TIMEOUT: 60
v2-linux:
PROFILE: "sc-lin"
NUM_NETPOLS: 800
INITIAL_CONNECTIVITY_TIMEOUT: 60
ws22:
PROFILE: "sc-ws22"
NUM_NETPOLS: 50
10 changes: 6 additions & 4 deletions network/hnswrapper/hnsv2wrapperfake.go
@@ -150,10 +150,12 @@ func (f Hnsv2wrapperFake) ModifyNetworkSettings(network *hcn.HostComputeNetwork,
if setpol.PolicyType != hcn.SetPolicyTypeIpSet && setpol.Values != "" {
// Check Nested SetPolicy members
members := strings.Split(setpol.Values, ",")
for _, memberID := range members {
_, ok := networkCache.Policies[memberID]
if !ok {
return newErrorFakeHNS(fmt.Sprintf("Member Policy %s not found for hcn.RequestTypeUpdate", memberID))
if setpol.Values != "" {
for _, memberID := range members {
_, ok := networkCache.Policies[memberID]
if !ok {
return newErrorFakeHNS(fmt.Sprintf("Member Policy %s not found for hcn.RequestTypeUpdate", memberID))
}
}
}
}
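The nested-SetPolicy validation above can also be read as a standalone helper. A simplified sketch (hypothetical package and names), mirroring the check the fake performs:

package hnssketch

import (
	"fmt"
	"strings"
)

// validateNestedMembers mirrors the fake-HNS check above: a nested SetPolicy
// stores its member policy IDs as a comma-separated string, and each member
// must already exist in the cached network policies. Simplified sketch.
func validateNestedMembers(values string, policies map[string]struct{}) error {
	if values == "" {
		return nil // nothing nested to validate
	}
	for _, memberID := range strings.Split(values, ",") {
		if _, ok := policies[memberID]; !ok {
			return fmt.Errorf("member policy %s not found for hcn.RequestTypeUpdate", memberID)
		}
	}
	return nil
}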
19 changes: 11 additions & 8 deletions npm/azure-npm.yaml
@@ -148,19 +148,22 @@ metadata:
data:
azure-npm.json: |
{
"ResyncPeriodInMinutes": 15,
"ListeningPort": 10091,
"ListeningAddress": "0.0.0.0",
"ApplyMaxBatches": 100,
"ApplyIntervalInMilliseconds": 500,
"MaxBatchedACLsPerPod": 30,
"ResyncPeriodInMinutes": 15,
"ListeningPort": 10091,
"ListeningAddress": "0.0.0.0",
"ApplyIntervalInMilliseconds": 500,
"ApplyMaxBatches": 100,
"MaxBatchedACLsPerPod": 30,
"NetPolInvervalInMilliseconds": 500,
"MaxPendingNetPols": 100,
"Toggles": {
"EnablePrometheusMetrics": true,
"EnablePprof": true,
"EnableHTTPDebugAPI": true,
"EnableV2NPM": true,
"PlaceAzureChainFirst": true,
"PlaceAzureChainFirst": false,
"ApplyIPSetsOnNeed": false,
"ApplyInBackground": true
"ApplyInBackground": true,
"NetPolInBackground": true
}
}
13 changes: 13 additions & 0 deletions npm/cmd/start.go
@@ -125,6 +125,19 @@ func start(config npmconfig.Config, flags npmconfig.Flags) error {
// update the dataplane config
npmV2DataplaneCfg.MaxBatchedACLsPerPod = config.MaxBatchedACLsPerPod

npmV2DataplaneCfg.NetPolInBackground = config.Toggles.NetPolInBackground
if config.NetPolInvervalInMilliseconds > 0 {
npmV2DataplaneCfg.NetPolInterval = time.Duration(config.NetPolInvervalInMilliseconds * int(time.Millisecond))
} else {
npmV2DataplaneCfg.NetPolInterval = time.Duration(npmconfig.DefaultConfig.NetPolInvervalInMilliseconds * int(time.Millisecond))
}

if config.MaxPendingNetPols > 0 {
npmV2DataplaneCfg.MaxPendingNetPols = config.MaxPendingNetPols
} else {
npmV2DataplaneCfg.MaxPendingNetPols = npmconfig.DefaultConfig.MaxPendingNetPols
}

npmV2DataplaneCfg.ApplyInBackground = config.Toggles.ApplyInBackground
if config.ApplyMaxBatches > 0 {
npmV2DataplaneCfg.ApplyMaxBatches = config.ApplyMaxBatches
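One note on the interval plumbing above: Go stores durations as nanosecond counts, so multiplying the integer millisecond setting by int(time.Millisecond) yields the intended time.Duration. A minimal, self-contained sketch of that conversion (values illustrative):

package main

import (
	"fmt"
	"time"
)

func main() {
	// Same pattern as start.go above: an integer setting from the ConfigMap,
	// interpreted as milliseconds, becomes a time.Duration for the ticker.
	netPolIntervalMs := 500 // illustrative; the shipped default is also 500
	interval := time.Duration(netPolIntervalMs * int(time.Millisecond))
	fmt.Println(interval) // prints "500ms"
}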
20 changes: 16 additions & 4 deletions npm/config/config.go
@@ -7,6 +7,8 @@ const (
defaultApplyMaxBatches = 100
defaultApplyInterval = 500
defaultMaxBatchedACLsPerPod = 30
defaultMaxPendingNetPols = 100
Contributor:
Isn't 100 too much? Can we have something like 10?

huntergregory (Contributor Author), Jun 22, 2023:
Tested with MaxPendingNetPols=10. This doesn't seem to give us much of a performance boost: it took 12 hours to add 640 NetPols. With MaxPendingNetPols=100, the same takes <20 minutes (the table in the PR description has more details).

huntergregory (Contributor Author):
With MaxPendingNetPols=10, I also saw 1/10 NPM Pods crash with this error after adding 625/640 NetPols, 11.5 hours after NPM came online:

I0622 05:55:28.926162       1 trace.go:219] Trace[1241999867]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169 (22-Jun-2023 05:54:24.710) (total time: 64119ms):
Trace[1241999867]: ---"Objects listed" error:<nil> 64118ms (05:55:28.828)
Trace[1241999867]: [1m4.119544167s] [1m4.119544167s] END
I0622 05:55:29.727548       1 trace.go:219] Trace[257377008]: "Reflector ListAndWatch" name:pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:169 (22-Jun-2023 05:54:24.512) (total time: 65114ms):
Trace[257377008]: ---"Objects listed" error:<nil> 65018ms (05:55:29.531)
Trace[257377008]: [1m5.114375378s] [1m5.114375378s] END

defaultNetPolInterval = 500
defaultListeningPort = 10091
defaultGrpcPort = 10092
defaultGrpcServicePort = 9002
@@ -35,14 +37,20 @@ var DefaultConfig = Config{
ApplyIntervalInMilliseconds: defaultApplyInterval,
MaxBatchedACLsPerPod: defaultMaxBatchedACLsPerPod,

MaxPendingNetPols: defaultMaxPendingNetPols,
NetPolInvervalInMilliseconds: defaultNetPolInterval,

Toggles: Toggles{
EnablePrometheusMetrics: true,
EnablePprof: true,
EnableHTTPDebugAPI: true,
EnableV2NPM: true,
PlaceAzureChainFirst: util.PlaceAzureChainFirst,
PlaceAzureChainFirst: util.PlaceAzureChainAfterKubeServices,
ApplyIPSetsOnNeed: false,
ApplyInBackground: true,
// ApplyInBackground is currently used in Windows to apply the following in background: IPSets and NetPols for new/updated Pods
ApplyInBackground: true,
// NetPolInBackground is currently used in Linux to apply NetPol controller Add events in the background
NetPolInBackground: true,
},
}

@@ -69,8 +77,10 @@ type Config struct {
// MaxBatchedACLsPerPod is the maximum number of ACLs that can be added to a Pod at once in Windows.
// The zero value is valid.
// A NetworkPolicy's ACLs are always in the same batch, and there will be at least one NetworkPolicy per batch.
MaxBatchedACLsPerPod int `json:"MaxBatchedACLsPerPod,omitempty"`
Toggles Toggles `json:"Toggles,omitempty"`
MaxBatchedACLsPerPod int `json:"MaxBatchedACLsPerPod,omitempty"`
MaxPendingNetPols int `json:"MaxPendingNetPols,omitempty"`
NetPolInvervalInMilliseconds int `json:"NetPolInvervalInMilliseconds,omitempty"`
Toggles Toggles `json:"Toggles,omitempty"`
}

type Toggles struct {
@@ -82,6 +92,8 @@ type Toggles struct {
ApplyIPSetsOnNeed bool
// ApplyInBackground applies for Windows only
ApplyInBackground bool
// NetPolInBackground
NetPolInBackground bool
}

type Flags struct {
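The NetPolInBackground toggle, MaxPendingNetPols, and NetPolInvervalInMilliseconds settings above gate a background loop in the dataplane; the loop itself is not part of this diff. Purely as an illustration of the pattern the commit messages describe (apply pending NetPols on a timer, capped per batch), a minimal sketch with hypothetical names, not the PR's actual implementation:

package npmsketch

import (
	"fmt"
	"time"
)

// netPolEvent stands in for a queued NetworkPolicy add event (hypothetical type).
type netPolEvent struct{ key string }

// runNetPolLoop drains up to maxPending queued events every interval and
// applies them as one batch. Illustrative sketch only.
func runNetPolLoop(pending <-chan netPolEvent, interval time.Duration, maxPending int) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for range ticker.C {
		batch := make([]netPolEvent, 0, maxPending)
	drain:
		for len(batch) < maxPending {
			select {
			case ev := <-pending:
				batch = append(batch, ev)
			default:
				break drain // queue empty for now
			}
		}
		if len(batch) > 0 {
			// In the real dataplane this is where one iptables-restore call
			// would program rules for the whole batch.
			fmt.Printf("applying %d pending NetPols\n", len(batch))
		}
	}
}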
12 changes: 6 additions & 6 deletions npm/examples/windows/azure-npm-capz.yaml
@@ -145,17 +145,17 @@ data:
"ResyncPeriodInMinutes": 15,
"ListeningPort": 10091,
"ListeningAddress": "0.0.0.0",
"ApplyIntervalInMilliseconds": 500,
"ApplyMaxBatches": 100,
"MaxBatchedACLsPerPod": 30,
"Toggles": {
"EnablePrometheusMetrics": true,
"EnablePprof": true,
"EnableHTTPDebugAPI": true,
"EnableV2NPM": true,
"PlaceAzureChainFirst": true,
"ApplyIPSetsOnNeed": false
"ApplyIPSetsOnNeed": false,
"ApplyInBackground": true,
"NetPolInBackground": false
},
"Transport": {
"Address": "azure-npm.kube-system.svc.cluster.local",
"Port": 10092,
"ServicePort": 9001
}
}
19 changes: 11 additions & 8 deletions npm/examples/windows/azure-npm.yaml
@@ -140,19 +140,22 @@ metadata:
data:
azure-npm.json: |
{
"ResyncPeriodInMinutes": 15,
"ListeningPort": 10091,
"ListeningAddress": "0.0.0.0",
"ApplyMaxBatches": 100,
"ApplyIntervalInMilliseconds": 500,
"MaxBatchedACLsPerPod": 30,
"ResyncPeriodInMinutes": 15,
"ListeningPort": 10091,
"ListeningAddress": "0.0.0.0",
"ApplyIntervalInMilliseconds": 500,
"ApplyMaxBatches": 100,
"MaxBatchedACLsPerPod": 30,
"NetPolInvervalInMilliseconds": 500,
"MaxPendingNetPols": 100,
"Toggles": {
"EnablePrometheusMetrics": true,
"EnablePprof": true,
"EnableHTTPDebugAPI": true,
"EnableV2NPM": true,
"PlaceAzureChainFirst": true,
"PlaceAzureChainFirst": false,
"ApplyIPSetsOnNeed": false,
"ApplyInBackground": true
"ApplyInBackground": true,
"NetPolInBackground": true
}
}
43 changes: 43 additions & 0 deletions npm/metrics/acl_rules_linux.go
@@ -0,0 +1,43 @@
package metrics

import (
"github.com/prometheus/client_golang/prometheus"
)

func RecordIPTablesRestoreLatency(timer *Timer, op OperationKind) {
labels := prometheus.Labels{
operationLabel: string(op),
}
itpablesRestoreLatency.With(labels).Observe(timer.timeElapsed())
}

func RecordIPTablesDeleteLatency(timer *Timer) {
iptablesDeleteLatency.Observe(timer.timeElapsed())
}

func IncIPTablesRestoreFailures(op OperationKind) {
labels := prometheus.Labels{
operationLabel: string(op),
}
iptablesRestoreFailures.With(labels).Inc()
}

func TotalIPTablesRestoreLatencyCalls(op OperationKind) (int, error) {
return histogramVecCount(itpablesRestoreLatency, prometheus.Labels{
operationLabel: string(op),
})
}

func TotalIPTablesDeleteLatencyCalls() (int, error) {
collector, ok := iptablesDeleteLatency.(prometheus.Collector)
if !ok {
return 0, errNotCollector
}
return histogramCount(collector)
}

func TotalIPTablesRestoreFailures(op OperationKind) (int, error) {
return counterValue(iptablesRestoreFailures.With(prometheus.Labels{
operationLabel: string(op),
}))
}
49 changes: 49 additions & 0 deletions npm/metrics/acl_rules_linux_test.go
@@ -0,0 +1,49 @@
package metrics

import (
"testing"
"time"

"github.com/stretchr/testify/require"
)

func TestRecordIPTablesRestoreLatency(t *testing.T) {
timer := StartNewTimer()
time.Sleep(1 * time.Millisecond)
RecordIPTablesRestoreLatency(timer, UpdateOp)
timer = StartNewTimer()
time.Sleep(1 * time.Millisecond)
RecordIPTablesRestoreLatency(timer, CreateOp)

count, err := TotalIPTablesRestoreLatencyCalls(CreateOp)
require.Nil(t, err, "failed to get metric")
require.Equal(t, 1, count, "should have recorded create once")

count, err = TotalIPTablesRestoreLatencyCalls(UpdateOp)
require.Nil(t, err, "failed to get metric")
require.Equal(t, 1, count, "should have recorded update once")
}

func TestRecordIPTablesDeleteLatency(t *testing.T) {
timer := StartNewTimer()
time.Sleep(1 * time.Millisecond)
RecordIPTablesDeleteLatency(timer)

count, err := TotalIPTablesDeleteLatencyCalls()
require.Nil(t, err, "failed to get metric")
require.Equal(t, 1, count, "should have recorded create once")
}

func TestIncIPTablesRestoreFailures(t *testing.T) {
IncIPTablesRestoreFailures(CreateOp)
IncIPTablesRestoreFailures(UpdateOp)
IncIPTablesRestoreFailures(CreateOp)

count, err := TotalIPTablesRestoreFailures(CreateOp)
require.Nil(t, err, "failed to get metric")
require.Equal(t, 2, count, "should have failed to create twice")

count, err = TotalIPTablesRestoreFailures(UpdateOp)
require.Nil(t, err, "failed to get metric")
require.Equal(t, 1, count, "should have failed to update once")
}