Envoy public listener bound to incorrect port #11630
Labels
needs-investigation: The issue described is detailed and complex.
theme/envoy/xds: Related to Envoy support
theme/mesh-gw: Track mesh gateway work
type/bug: Feature does not function as expected
Overview of the Issue
Background on our install:
We run the consul k8s helm chart to deploy consul on our EKS clusters. In addition, we run consul agents on EC2 instances that join the mesh. We manage our consul service config on EC2 via consul and run our envoy sidecars via a systemd unit template. We've run into this issue repeatedly as we add more services to our EC2 instances.
We recently observed a few cases in consul service mesh where the envoy-sidecar process attempted to bind the public_listener but failed because the port was already in use. When inspecting what was bound to the requested port, we found another envoy sidecar using it. This "other" envoy sidecar was bound to a port different from the one consul has configured for it.
Most of these occurrences seem to relate to the addition of new services on a host.
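The mismatch described above can be checked mechanically. The following is a minimal sketch, not something from our tooling: it assumes listener data comes from `ss -Hltnp` (run as root so process info is shown), and that you already know which sidecar port consul has configured for each service and which PID belongs to which sidecar. The function names, port numbers, and both input mappings are hypothetical.

```python
import re
from typing import Dict, List, NamedTuple, Tuple


class Listener(NamedTuple):
    port: int
    process: str
    pid: int


# Matches the first process entry in ss output, e.g. users:(("envoy",pid=100,fd=20))
_PROC_RE = re.compile(r'\(\("(?P<name>[^"]+)",pid=(?P<pid>\d+)')


def parse_ss(output: str) -> List[Listener]:
    """Parse `ss -Hltnp` output into listeners; lines without process info are skipped."""
    listeners = []
    for line in output.splitlines():
        fields = line.split()
        if len(fields) < 5 or fields[0] != "LISTEN":
            continue
        match = _PROC_RE.search(line)
        if match is None:
            continue
        # The local address is the 4th column, e.g. 0.0.0.0:21001 or [::]:21001.
        port = int(fields[3].rsplit(":", 1)[1])
        listeners.append(Listener(port, match.group("name"), int(match.group("pid"))))
    return listeners


def find_misbound(
    expected: Dict[str, int],
    pid_to_service: Dict[int, str],
    listeners: List[Listener],
) -> List[Tuple[str, str, int]]:
    """Return (holder, owner, port) where one sidecar holds another service's port.

    expected:       service name -> sidecar port consul has configured (assumed known)
    pid_to_service: sidecar PID -> service name (assumed known, e.g. from systemd)
    """
    port_to_service = {port: svc for svc, port in expected.items()}
    conflicts = []
    for listener in listeners:
        holder = pid_to_service.get(listener.pid)
        owner = port_to_service.get(listener.port)
        if holder and owner and holder != owner:
            conflicts.append((holder, owner, listener.port))
    return conflicts
```

Ports that no service expects (for example an envoy admin listener) are ignored, so only a sidecar sitting on another service's configured port is reported.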
Reproduction Steps
One case where we saw this occur recently:
promtail
postgres_exporter service - by adding a json file to /opt/consul/services and calling consul reload
postgres_exporter by running systemctl start consul-sidecar@postgres_exporter.service
this starts an envoy sidecar with a command like consul-sidecar start postgres_exporter
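For anyone trying to reproduce this, the registration step might look roughly like the sketch below. It writes a service definition file of the kind that would be dropped into /opt/consul/services. The `service`, `connect`, and `sidecar_service` keys are consul's service definition fields; the helper names, the port values, and whether the sidecar port is set explicitly are assumptions for illustration and not necessarily what our real files contain.

```python
import json
from pathlib import Path
from typing import Optional


def build_service_definition(name: str, port: int, sidecar_port: Optional[int] = None) -> dict:
    """Build a consul service definition with a connect sidecar.

    If sidecar_port is None, the sidecar block is left empty and the agent
    picks the sidecar port itself.
    """
    sidecar = {}
    if sidecar_port is not None:
        sidecar["port"] = sidecar_port
    return {
        "service": {
            "name": name,
            "port": port,
            "connect": {"sidecar_service": sidecar},
        }
    }


def write_service_definition(directory: str, definition: dict) -> Path:
    """Write the definition as <name>.json in the given directory and return the path."""
    path = Path(directory) / f"{definition['service']['name']}.json"
    path.write_text(json.dumps(definition, indent=2) + "\n")
    return path
```

After writing the file, for example `write_service_definition("/opt/consul/services", build_service_definition("postgres_exporter", 9187))` with a hypothetical port, the agent still has to pick it up via consul reload before the sidecar unit is started.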
Consul info for both Client and Server
Client info
Server info
Operating system and Environment details
We're using EKS 1.18 for servers and k8s-workloads
Our EC2 instances run either CentOS 7 or AL2
I've built custom binaries based on 1.10.3 with a patch for #8283 and #11422 - nothing else was changed
Log Fragments
Include appropriate Client or Server log fragments. If the log is longer than a few dozen lines, please include the URL to the gist of the log instead of posting it in the issue. Use -log-level=TRACE on the client and server to capture the maximum log detail.