Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

listener: switch default for SO_REUSEPORT to true #15794

Closed
mattklein123 opened this issue Mar 31, 2021 · 2 comments · Fixed by #17259
Closed

listener: switch default for SO_REUSEPORT to true #15794

mattklein123 opened this issue Mar 31, 2021 · 2 comments · Fixed by #17259
Assignees

Comments

@mattklein123
Copy link
Member

Switch the default to SO_REUSEPORT to true as at this point functional kernel support is widespread, and it's an ongoing support burden for Envoy with people complaining about connection imbalance. Plan:

  1. Deprecate reuse_port
  2. Add WKT enable_reuse_port
  3. Default for enable_reuse_port will be true, with a runtime feature flag to flip the default to false (preserve current behavior).

cc @envoyproxy/api-shepherds @antoniovicente

@mattklein123
Copy link
Member Author

I'm realizing that flipping the default to true probably also has to be aware of hot restart. I don't think hot restarting onto a different default is going to go well.

mattklein123 added a commit that referenced this issue Apr 1, 2021
1) Deprecate existing reuse_port field
2) Add new enable_reuse_port field which uses a WKT
3) Make the new default hot restart aware so the default is
   not changed during hot restart.
4) Allow the default to be reverted using the
   "envoy.reloadable_features.listener_reuse_port_default_enabled"
   feature flag.

Fixes #15794

Signed-off-by: Matt Klein <[email protected]>
@daixiang0
Copy link
Member

Still available?

mattklein123 added a commit that referenced this issue Jul 7, 2021
1) Deprecate existing reuse_port field
2) Add new enable_reuse_port field which uses a WKT
3) Make the new default hot restart aware so the default is
   not changed during hot restart.
4) Allow the default to be reverted using the
   "envoy.reloadable_features.listener_reuse_port_default_enabled"
   feature flag.
5) Change listener init so that almost all error handling occurs on
   the main thread. This a) vastly simplifies error handling and
   b) makes it so that we pre-create all sockets on the main thread
   and can use them all during hot restart.
6) Change hot restart to pass reuse port sockets by socket/worker
   index. This works around a race condition in which a draining
   listener has a new connection on its accept queue, but it's
   never accepted by the old process worker. It will be dropped.
   By passing all sockets (even reuse port sockets) we make sure
   the accept queue is fully processed.

Fixes #15794

Risk Level: High, scary stuff involving hot restart and listener init
Testing: New and existing tests. It was very hard to get the tests to pass which gives me more confidence.
Docs Changes: N/A
Release Notes: Added
Platform Specific Features: N/A

Signed-off-by: Matt Klein <[email protected]>
mattklein123 added a commit that referenced this issue Jul 20, 2021
1) Deprecate existing reuse_port field
2) Add new enable_reuse_port field which uses a WKT
3) Make the new default hot restart aware so the default is
   not changed during hot restart.
4) Allow the default to be reverted using the
   "envoy.reloadable_features.listener_reuse_port_default_enabled"
   feature flag.
5) Change listener init so that almost all error handling occurs on
   the main thread. This a) vastly simplifies error handling and
   b) makes it so that we pre-create all sockets on the main thread
   and can use them all during hot restart.
6) Change hot restart to pass reuse port sockets by socket/worker
   index. This works around a race condition in which a draining
   listener has a new connection on its accept queue, but it's
   never accepted by the old process worker. It will be dropped.
   By passing all sockets (even reuse port sockets) we make sure
   the accept queue is fully processed.

Fixes #15794

Risk Level: High, scary stuff involving hot restart and listener init
Testing: New and existing tests. It was very hard to get the tests to pass which gives me more confidence.
Docs Changes: N/A
Release Notes: Added
Platform Specific Features: N/A

Signed-off-by: Matt Klein <[email protected]>
lizan pushed a commit to envoyproxy/data-plane-api that referenced this issue Jul 20, 2021
1) Deprecate existing reuse_port field
2) Add new enable_reuse_port field which uses a WKT
3) Make the new default hot restart aware so the default is
   not changed during hot restart.
4) Allow the default to be reverted using the
   "envoy.reloadable_features.listener_reuse_port_default_enabled"
   feature flag.
5) Change listener init so that almost all error handling occurs on
   the main thread. This a) vastly simplifies error handling and
   b) makes it so that we pre-create all sockets on the main thread
   and can use them all during hot restart.
6) Change hot restart to pass reuse port sockets by socket/worker
   index. This works around a race condition in which a draining
   listener has a new connection on its accept queue, but it's
   never accepted by the old process worker. It will be dropped.
   By passing all sockets (even reuse port sockets) we make sure
   the accept queue is fully processed.

Fixes envoyproxy/envoy#15794

Risk Level: High, scary stuff involving hot restart and listener init
Testing: New and existing tests. It was very hard to get the tests to pass which gives me more confidence.
Docs Changes: N/A
Release Notes: Added
Platform Specific Features: N/A

Signed-off-by: Matt Klein <[email protected]>

Mirrored from https://github.com/envoyproxy/envoy @ ba474ac375418d5529a30a9510a8ff5f0a5ada9a
leyao-daily pushed a commit to leyao-daily/envoy that referenced this issue Sep 30, 2021
1) Deprecate existing reuse_port field
2) Add new enable_reuse_port field which uses a WKT
3) Make the new default hot restart aware so the default is
   not changed during hot restart.
4) Allow the default to be reverted using the
   "envoy.reloadable_features.listener_reuse_port_default_enabled"
   feature flag.
5) Change listener init so that almost all error handling occurs on
   the main thread. This a) vastly simplifies error handling and
   b) makes it so that we pre-create all sockets on the main thread
   and can use them all during hot restart.
6) Change hot restart to pass reuse port sockets by socket/worker
   index. This works around a race condition in which a draining
   listener has a new connection on its accept queue, but it's
   never accepted by the old process worker. It will be dropped.
   By passing all sockets (even reuse port sockets) we make sure
   the accept queue is fully processed.

Fixes envoyproxy#15794

Risk Level: High, scary stuff involving hot restart and listener init
Testing: New and existing tests. It was very hard to get the tests to pass which gives me more confidence.
Docs Changes: N/A
Release Notes: Added
Platform Specific Features: N/A

Signed-off-by: Matt Klein <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants