Priority of autoconfigured mirrors #765

thejoeejoee · 2025-03-03T17:03:04Z

Spegel version

v0.0.30

Kubernetes distribution

custom kubeadm

Kubernetes version

1.31.5

CNI

cilium

Describe the bug

Currently, the configuration init container configures the mirrors in the following order:

- --mirror-targets
- http://$(NODE_IP):{{ .Values.service.registry.hostPort }}
- http://$(NODE_IP):{{ .Values.service.registry.nodePort }}

which leads to the same order in hosts.toml (the order of hosts matters here):

server = 'internal'

[host.'http://10.245.0.49:30020']
capabilities = ['pull', 'resolve']

[host.'http://10.245.0.49:30021']
capabilities = ['pull', 'resolve']
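
To make the ordering concrete, here is a minimal sketch of how an ordered list of mirror targets could map onto host sections in the same order. It is not Spegel's actual configuration code: the function name `render_hosts_toml` is hypothetical, and the IP and ports are simply the ones from the example above.

```python
def render_hosts_toml(server, mirror_targets, capabilities=("pull", "resolve")):
    """Render a hosts.toml-style document, keeping mirror_targets in the given order.

    Hypothetical helper for illustration only; not part of Spegel or containerd.
    """
    lines = [f"server = '{server}'"]
    caps = ", ".join(f"'{c}'" for c in capabilities)
    for target in mirror_targets:
        # One [host.'...'] section per target, emitted in list order.
        lines.append("")
        lines.append(f"[host.'{target}']")
        lines.append(f"capabilities = [{caps}]")
    return "\n".join(lines) + "\n"


node_ip = "10.245.0.49"  # value taken from the example above
targets = [
    f"http://{node_ip}:30020",  # first --mirror-targets entry
    f"http://{node_ip}:30021",  # second --mirror-targets entry
]
hosts_toml = render_hosts_toml("internal", targets)
print(hosts_toml)
```

Swapping the two entries in `targets` swaps the two host sections, which is the change this issue asks about.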

What is the reasoning behind this order?

As I understand it, using the NodePort as the first mirror would make more sense,
since it's the responsibility of the local Spegel to do the lookup for an image/blob.

Using the HostPort service as the first mirror introduces an unnecessary network hop
even if the local Spegel is able to resolve the image via p2p.

Also, the infrastructure schema doesn't correspond to this order. The schema shows 30020 as the local Spegel instance, which is not true; it's the balanced HostPort.


phillebaba · 2025-03-04T11:06:22Z

The node port service was added a while back as a fallback in case the local Spegel instance was not functioning properly. About three years ago there was a KEP, sadly never implemented, that would have allowed "prefer local" to be a setting. This would basically create a load-balanced service which would always direct traffic locally and only fall back if the local Pod was unhealthy. My initial design idea was to add the node port service and then switch over to it completely once the KEP was implemented.

I think you may have mixed up the two ports when trying to understand the functionality. Port 30020 is the host port, and will only function if the local instance of Spegel is running. Port 30021 routes to the service, which will route traffic to any other node in the cluster. No matter where traffic is routed, the first request will have to try to look up another node which has the requested layer. There is no guarantee that the node the service routes to actually has the requested layer.

You actually want the first request to be to the local Spegel instance because the networking is local. It is then up to the local instance to forward the request to the correct Spegel instance on another node. If you are using the fallback you could in theory make two hops to two different nodes which would most certainly increase the latency.
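
The hop counting above can be sketched as a toy model. This is only an illustration of the argument, not a measurement and not Spegel code: the node names and the function `network_hops` are hypothetical, and the model assumes the requested layer lives on exactly one other node.

```python
def network_hops(client_node, entry_node, node_with_layer):
    """Count node-to-node hops for one pull request in this toy model."""
    hops = 0
    if entry_node != client_node:
        hops += 1  # client node -> node that received the request
    if node_with_layer != entry_node:
        hops += 1  # receiving node -> node that actually holds the layer
    return hops


nodes = ["node-a", "node-b", "node-c"]  # hypothetical cluster
client, holder = "node-a", "node-c"     # the layer lives on another node

# Host port first: the request enters the local Spegel instance.
local_first = network_hops(client, client, holder)

# Fallback service: the request may enter on any node in the cluster.
via_service = {n: network_hops(client, n, holder) for n in nodes}

print("local first:", local_first)
print("via service:", via_service)
```

In this model the local-first path never needs more than one hop, while the service path needs two whenever it lands on a node that is neither the client nor the holder of the layer.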

Does this make sense?

I have actually been considering removing the fallback service, or at least disabling it by default, because I see less value in having it. It increases complexity and may even make latency higher. I am unsure if it actually solves the problem I first had in mind when adding it.

thejoeejoee added the bug label Mar 3, 2025

thejoeejoee linked a pull request Mar 3, 2025 that will close this issue

fix(configuration): prioritize NodePort over HostPort #766

Open
