Enable disabling TX checksum offload for Antrea host gateway (#6843)
This commit introduces the ability to disable TX checksum offload
for the host gateway interface (default: `antrea-gw0`) by setting the
`disableTXChecksumOffload` option to `true`.

If this option is later set to false, Antrea does nothing to the affected
container network interfaces and the host gateway interface.

Signed-off-by: Hongliang Liu <[email protected]>
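
Because setting the option back to `false` leaves interfaces untouched, it can be useful to check an interface's current state by hand. The following is a minimal Go sketch, assuming the usual `ethtool -k <interface>` text output with a `tx-checksumming: on|off` line. The `txChecksumming` helper and the sample output are illustrative and not part of this commit:

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// txChecksumming reports the value of the "tx-checksumming" feature line in
// the text printed by `ethtool -k <interface>`. The second return value is
// false when no such line is present.
func txChecksumming(ethtoolOutput string) (state string, found bool) {
	scanner := bufio.NewScanner(strings.NewReader(ethtoolOutput))
	for scanner.Scan() {
		key, value, ok := strings.Cut(strings.TrimSpace(scanner.Text()), ":")
		if ok && key == "tx-checksumming" {
			// Keep only the first word so suffixes such as "[fixed]" are dropped.
			fields := strings.Fields(value)
			if len(fields) == 0 {
				return "", false
			}
			return fields[0], true
		}
	}
	return "", false
}

func main() {
	// Illustrative sample text, not captured from a real Node.
	sample := "Features for antrea-gw0:\nrx-checksumming: on\ntx-checksumming: off\n"
	state, found := txChecksumming(sample)
	fmt.Println(state, found)
}
```

On a Node, the input would come from running `ethtool -k antrea-gw0` (or the configured gateway name) rather than from a hard-coded string.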
hongliangl authored Feb 12, 2025
1 parent b7f650d commit 1ee108b
Showing 13 changed files with 196 additions and 79 deletions.
9 changes: 6 additions & 3 deletions build/charts/antrea/conf/antrea-agent.conf
@@ -179,9 +179,12 @@ trafficEncryptionMode: {{ .Values.trafficEncryptionMode | quote }}
# `trafficEncapMode` is `noEncap`, and `noSNAT` is true.
enableBridgingMode: {{ .Values.enableBridgingMode }}

# Disable TX checksum offloading for container network interfaces. It's supposed to be set to true when the
# datapath doesn't support TX checksum offloading, which causes packets to be dropped due to bad checksum.
# It affects Pods running on Linux Nodes only.
# Disable TX checksum offloading for container network interfaces and the host gateway interface (default:
# antrea-gw0). It's supposed to be set to true when the datapath doesn't support TX checksum offloading,
# which causes packets to be dropped due to bad checksum.
# If this option is later set to false, Antrea does nothing to the affected container network interfaces
# and the host gateway interface.
# This option affects Linux Nodes only.
disableTXChecksumOffload: {{ .Values.disableTXChecksumOffload }}

# Default MTU to use for the host gateway interface and the network interface of each Pod.
13 changes: 8 additions & 5 deletions build/yamls/antrea-aks.yml
@@ -4160,9 +4160,12 @@ data:
# `trafficEncapMode` is `noEncap`, and `noSNAT` is true.
enableBridgingMode: false
# Disable TX checksum offloading for container network interfaces. It's supposed to be set to true when the
# datapath doesn't support TX checksum offloading, which causes packets to be dropped due to bad checksum.
# It affects Pods running on Linux Nodes only.
# Disable TX checksum offloading for container network interfaces and the host gateway interface (default:
# antrea-gw0). It's supposed to be set to true when the datapath doesn't support TX checksum offloading,
# which causes packets to be dropped due to bad checksum.
# If this option is later set to false, Antrea does nothing to the affected container network interfaces
# and the host gateway interface.
# This option affects Linux Nodes only.
disableTXChecksumOffload: false
# Default MTU to use for the host gateway interface and the network interface of each Pod.
@@ -5443,7 +5446,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 370890f19fdae1e870e0cf1d5e4c5227fb343efb9538535a0c65f7b7f6a054f5
checksum/config: 9c5fd81219c99e3ac42cdbafe79a80f2462119a30249c4dffc6d8eb969251f4e
labels:
app: antrea
component: antrea-agent
@@ -5687,7 +5690,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 370890f19fdae1e870e0cf1d5e4c5227fb343efb9538535a0c65f7b7f6a054f5
checksum/config: 9c5fd81219c99e3ac42cdbafe79a80f2462119a30249c4dffc6d8eb969251f4e
labels:
app: antrea
component: antrea-controller
13 changes: 8 additions & 5 deletions build/yamls/antrea-eks.yml
@@ -4160,9 +4160,12 @@ data:
# `trafficEncapMode` is `noEncap`, and `noSNAT` is true.
enableBridgingMode: false
# Disable TX checksum offloading for container network interfaces. It's supposed to be set to true when the
# datapath doesn't support TX checksum offloading, which causes packets to be dropped due to bad checksum.
# It affects Pods running on Linux Nodes only.
# Disable TX checksum offloading for container network interfaces and the host gateway interface (default:
# antrea-gw0). It's supposed to be set to true when the datapath doesn't support TX checksum offloading,
# which causes packets to be dropped due to bad checksum.
# If this option is later set to false, Antrea does nothing to the affected container network interfaces
# and the host gateway interface.
# This option affects Linux Nodes only.
disableTXChecksumOffload: false
# Default MTU to use for the host gateway interface and the network interface of each Pod.
@@ -5443,7 +5446,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 370890f19fdae1e870e0cf1d5e4c5227fb343efb9538535a0c65f7b7f6a054f5
checksum/config: 9c5fd81219c99e3ac42cdbafe79a80f2462119a30249c4dffc6d8eb969251f4e
labels:
app: antrea
component: antrea-agent
@@ -5688,7 +5691,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 370890f19fdae1e870e0cf1d5e4c5227fb343efb9538535a0c65f7b7f6a054f5
checksum/config: 9c5fd81219c99e3ac42cdbafe79a80f2462119a30249c4dffc6d8eb969251f4e
labels:
app: antrea
component: antrea-controller
13 changes: 8 additions & 5 deletions build/yamls/antrea-gke.yml
@@ -4160,9 +4160,12 @@ data:
# `trafficEncapMode` is `noEncap`, and `noSNAT` is true.
enableBridgingMode: false
# Disable TX checksum offloading for container network interfaces. It's supposed to be set to true when the
# datapath doesn't support TX checksum offloading, which causes packets to be dropped due to bad checksum.
# It affects Pods running on Linux Nodes only.
# Disable TX checksum offloading for container network interfaces and the host gateway interface (default:
# antrea-gw0). It's supposed to be set to true when the datapath doesn't support TX checksum offloading,
# which causes packets to be dropped due to bad checksum.
# If this option is later set to false, Antrea does nothing to the affected container network interfaces
# and the host gateway interface.
# This option affects Linux Nodes only.
disableTXChecksumOffload: false
# Default MTU to use for the host gateway interface and the network interface of each Pod.
@@ -5443,7 +5446,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 107cf72235dd1aabce91dd716bc3bd62a4b6500ef9d7ed309071a78e68b1ede1
checksum/config: 115af3aa2408672d2f38c5dfd9aae3a4754703158adc807f695b74a8689f1ada
labels:
app: antrea
component: antrea-agent
@@ -5685,7 +5688,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 107cf72235dd1aabce91dd716bc3bd62a4b6500ef9d7ed309071a78e68b1ede1
checksum/config: 115af3aa2408672d2f38c5dfd9aae3a4754703158adc807f695b74a8689f1ada
labels:
app: antrea
component: antrea-controller
13 changes: 8 additions & 5 deletions build/yamls/antrea-ipsec.yml
@@ -4173,9 +4173,12 @@ data:
# `trafficEncapMode` is `noEncap`, and `noSNAT` is true.
enableBridgingMode: false
# Disable TX checksum offloading for container network interfaces. It's supposed to be set to true when the
# datapath doesn't support TX checksum offloading, which causes packets to be dropped due to bad checksum.
# It affects Pods running on Linux Nodes only.
# Disable TX checksum offloading for container network interfaces and the host gateway interface (default:
# antrea-gw0). It's supposed to be set to true when the datapath doesn't support TX checksum offloading,
# which causes packets to be dropped due to bad checksum.
# If this option is later set to false, Antrea does nothing to the affected container network interfaces
# and the host gateway interface.
# This option affects Linux Nodes only.
disableTXChecksumOffload: false
# Default MTU to use for the host gateway interface and the network interface of each Pod.
@@ -5456,7 +5459,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: e40f0f3f4e412b4463e40c1062c1e10e6c66f471d31748e253262499103cb39f
checksum/config: d7a27b42825a5fb89da24f0e2ba23b6672d3c62ac9bba4507722d6a57bfffaca
checksum/ipsec-secret: d0eb9c52d0cd4311b6d252a951126bf9bea27ec05590bed8a394f0f792dcb2a4
labels:
app: antrea
@@ -5744,7 +5747,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: e40f0f3f4e412b4463e40c1062c1e10e6c66f471d31748e253262499103cb39f
checksum/config: d7a27b42825a5fb89da24f0e2ba23b6672d3c62ac9bba4507722d6a57bfffaca
labels:
app: antrea
component: antrea-controller
13 changes: 8 additions & 5 deletions build/yamls/antrea.yml
@@ -4160,9 +4160,12 @@ data:
# `trafficEncapMode` is `noEncap`, and `noSNAT` is true.
enableBridgingMode: false
# Disable TX checksum offloading for container network interfaces. It's supposed to be set to true when the
# datapath doesn't support TX checksum offloading, which causes packets to be dropped due to bad checksum.
# It affects Pods running on Linux Nodes only.
# Disable TX checksum offloading for container network interfaces and the host gateway interface (default:
# antrea-gw0). It's supposed to be set to true when the datapath doesn't support TX checksum offloading,
# which causes packets to be dropped due to bad checksum.
# If this option is later set to false, Antrea does nothing to the affected container network interfaces
# and the host gateway interface.
# This option affects Linux Nodes only.
disableTXChecksumOffload: false
# Default MTU to use for the host gateway interface and the network interface of each Pod.
@@ -5443,7 +5446,7 @@ spec:
kubectl.kubernetes.io/default-container: antrea-agent
# Automatically restart Pods with a RollingUpdate if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 3e4001d4c859dc8db92b7889b13c97a682dd8771ef7cfdf3d04ab70f2cf18879
checksum/config: b5a31ae863dbec89793167ebf4204eed1b5649180295c89064769b8e9526a1d6
labels:
app: antrea
component: antrea-agent
@@ -5685,7 +5688,7 @@ spec:
annotations:
# Automatically restart Pod if the ConfigMap changes
# See https://helm.sh/docs/howto/charts_tips_and_tricks/#automatically-roll-deployments
checksum/config: 3e4001d4c859dc8db92b7889b13c97a682dd8771ef7cfdf3d04ab70f2cf18879
checksum/config: b5a31ae863dbec89793167ebf4204eed1b5649180295c89064769b8e9526a1d6
labels:
app: antrea
component: antrea-controller
3 changes: 2 additions & 1 deletion cmd/antrea-agent/agent.go
@@ -318,7 +318,8 @@ func run(o *Options) error {
connectUplinkToBridge,
o.enableAntreaProxy,
l7NetworkPolicyEnabled,
l7FlowExporterEnabled)
l7FlowExporterEnabled,
o.config.DisableTXChecksumOffload)
err = agentInitializer.Initialize()
if err != nil {
return fmt.Errorf("error initializing agent: %v", err)
8 changes: 6 additions & 2 deletions docs/antrea-l7-network-policy.md
@@ -34,8 +34,12 @@ This guide demonstrates how to configure layer 7 NetworkPolicy.

Layer 7 NetworkPolicy was introduced in v1.10 as an alpha feature and is disabled by default. A feature gate,
`L7NetworkPolicy`, must be enabled in antrea-controller.conf and antrea-agent.conf in the `antrea-config` ConfigMap.
Additionally, due to the constraint of the application detection engine, TX checksum offloading must be disabled via the
`disableTXChecksumOffload` option in antrea-agent.conf for the feature to work. An example configuration is as below:
Additionally, to ensure proper functionality, TX checksum offloading must be disabled for container network interfaces
and the host gateway interface (default: antrea-gw0) due to the constraint of the application detection engine. This can
be configured using the `disableTXChecksumOffload` option in antrea-agent.conf. Disabling TX checksum offloading ensures
that TCP connections traverse these interfaces correctly, preventing connection failures and packet loss.

An example configuration is as below:

```yaml
apiVersion: v1
96 changes: 51 additions & 45 deletions pkg/agent/agent.go
@@ -109,27 +109,28 @@ var (

// Initializer knows how to setup host networking, OpenVSwitch, and Openflow.
type Initializer struct {
client clientset.Interface
crdClient versioned.Interface
ovsBridgeClient ovsconfig.OVSBridgeClient
ovsCtlClient ovsctl.OVSCtlClient
ofClient openflow.Client
routeClient route.Interface
wireGuardClient wireguard.Interface
ifaceStore interfacestore.InterfaceStore
ovsBridge string
hostGateway string // name of gateway port on the OVS bridge
mtu int
networkConfig *config.NetworkConfig
nodeConfig *config.NodeConfig
wireGuardConfig *config.WireGuardConfig
egressConfig *config.EgressConfig
serviceConfig *config.ServiceConfig
l7NetworkPolicyConfig *config.L7NetworkPolicyConfig
enableL7NetworkPolicy bool
enableL7FlowExporter bool
connectUplinkToBridge bool
enableAntreaProxy bool
client clientset.Interface
crdClient versioned.Interface
ovsBridgeClient ovsconfig.OVSBridgeClient
ovsCtlClient ovsctl.OVSCtlClient
ofClient openflow.Client
routeClient route.Interface
wireGuardClient wireguard.Interface
ifaceStore interfacestore.InterfaceStore
ovsBridge string
hostGateway string // name of gateway port on the OVS bridge
mtu int
networkConfig *config.NetworkConfig
nodeConfig *config.NodeConfig
wireGuardConfig *config.WireGuardConfig
egressConfig *config.EgressConfig
serviceConfig *config.ServiceConfig
l7NetworkPolicyConfig *config.L7NetworkPolicyConfig
enableL7NetworkPolicy bool
enableL7FlowExporter bool
connectUplinkToBridge bool
enableAntreaProxy bool
disableTXChecksumOffload bool
// podNetworkWait should be decremented once the Node's network is ready.
// The CNI server will wait for it before handling any CNI Add requests.
podNetworkWait *utilwait.Group
@@ -166,32 +167,34 @@ func NewInitializer(
enableAntreaProxy bool,
enableL7NetworkPolicy bool,
enableL7FlowExporter bool,
disableTXChecksumOffload bool,
) *Initializer {
return &Initializer{
ovsBridgeClient: ovsBridgeClient,
ovsCtlClient: ovsCtlClient,
client: k8sClient,
crdClient: crdClient,
ifaceStore: ifaceStore,
ofClient: ofClient,
routeClient: routeClient,
ovsBridge: ovsBridge,
hostGateway: hostGateway,
mtu: mtu,
networkConfig: networkConfig,
wireGuardConfig: wireGuardConfig,
egressConfig: egressConfig,
serviceConfig: serviceConfig,
l7NetworkPolicyConfig: &config.L7NetworkPolicyConfig{},
podNetworkWait: podNetworkWait,
flowRestoreCompleteWait: flowRestoreCompleteWait,
stopCh: stopCh,
nodeType: nodeType,
externalNodeNamespace: externalNodeNamespace,
connectUplinkToBridge: connectUplinkToBridge,
enableAntreaProxy: enableAntreaProxy,
enableL7NetworkPolicy: enableL7NetworkPolicy,
enableL7FlowExporter: enableL7FlowExporter,
ovsBridgeClient: ovsBridgeClient,
ovsCtlClient: ovsCtlClient,
client: k8sClient,
crdClient: crdClient,
ifaceStore: ifaceStore,
ofClient: ofClient,
routeClient: routeClient,
ovsBridge: ovsBridge,
hostGateway: hostGateway,
mtu: mtu,
networkConfig: networkConfig,
wireGuardConfig: wireGuardConfig,
egressConfig: egressConfig,
serviceConfig: serviceConfig,
l7NetworkPolicyConfig: &config.L7NetworkPolicyConfig{},
podNetworkWait: podNetworkWait,
flowRestoreCompleteWait: flowRestoreCompleteWait,
stopCh: stopCh,
nodeType: nodeType,
externalNodeNamespace: externalNodeNamespace,
connectUplinkToBridge: connectUplinkToBridge,
enableAntreaProxy: enableAntreaProxy,
enableL7NetworkPolicy: enableL7NetworkPolicy,
enableL7FlowExporter: enableL7FlowExporter,
disableTXChecksumOffload: disableTXChecksumOffload,
}
}

@@ -706,6 +709,9 @@ func (i *Initializer) setupGatewayInterface() error {
return err
}
}
if err := i.setTXChecksumOffloadOnGateway(); err != nil {
return err
}

return nil
}
11 changes: 11 additions & 0 deletions pkg/agent/agent_linux.go
@@ -29,6 +29,7 @@ import (
"antrea.io/antrea/pkg/agent/config"
"antrea.io/antrea/pkg/agent/interfacestore"
"antrea.io/antrea/pkg/agent/util"
"antrea.io/antrea/pkg/agent/util/ethtool"
"antrea.io/antrea/pkg/apis/crd/v1alpha1"
"antrea.io/antrea/pkg/ovs/ovsconfig"
utilip "antrea.io/antrea/pkg/util/ip"
@@ -262,3 +263,13 @@ func (i *Initializer) prepareL7EngineInterfaces() error {
}
return nil
}

func (i *Initializer) setTXChecksumOffloadOnGateway() error {
	if i.disableTXChecksumOffload {
		if err := ethtool.EthtoolTXHWCsumOff(i.hostGateway); err != nil {
			return fmt.Errorf("error when disabling TX checksum offload on %s: %v", i.hostGateway, err)
		}
		klog.InfoS("Disabled TX checksum offload on host gateway interface", "hostGateway", i.hostGateway)
	}
	return nil
}
4 changes: 4 additions & 0 deletions pkg/agent/agent_windows.go
@@ -512,3 +512,7 @@ func (i *Initializer) installVMInitialFlows() error {
func (i *Initializer) prepareL7EngineInterfaces() error {
return nil
}

func (i *Initializer) setTXChecksumOffloadOnGateway() error {
return nil
}
10 changes: 7 additions & 3 deletions pkg/config/agent/config.go
@@ -120,9 +120,13 @@ type AgentConfig struct {
// IPv4 and Linux Nodes, and can be enabled only when `ovsDatapathType` is `system`,
// `trafficEncapMode` is `noEncap`, and `noSNAT` is true.
EnableBridgingMode bool `yaml:"enableBridgingMode,omitempty"`
// Disable TX checksum offloading for container network interfaces. It's supposed to be set to true when the
// datapath doesn't support TX checksum offloading, which causes packets to be dropped due to bad checksum.
// It affects Pods running on Linux Nodes only.
// Disable TX checksum offloading for container network interfaces and the host gateway interface (default:
// antrea-gw0). It's supposed to be set to true when the datapath doesn't support TX checksum offloading,
// which causes packets to be dropped due to bad checksum.
// If this option is later set to false, Antrea does nothing to the affected container network interfaces
// and the host gateway interface. To restore the default TX checksum state of the affected interfaces,
// it is recommended to delete them and recreate.
// This option affects Linux Nodes only.
DisableTXChecksumOffload bool `yaml:"disableTXChecksumOffload,omitempty"`
// APIPort is the port for the antrea-agent APIServer to serve on.
// Defaults to 10350.