Fine-Grained Resource Allocation for Spegel DaemonSets #718
Comments
Hey there @ugur99! We had some pretty wild growth in our own deployment a while back and posted a similar issue (#546), but after some intense digging we discovered that you can safely set the request/limit to a fairly low number (such as 256MB) and Spegel should continue to operate without issue. It appears this is more of an OS/kernel-level memory utilization that will eventually take whatever it is given and keep using it, while the spegel binary itself doesn't truly need that memory to operate. We have been set up with this 256MB limit on a 50+ node cluster since September and have not really had any OOM kills or container crashes. Hope this helps!
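For reference, a minimal sketch of what that kind of cap could look like in the chart's values, assuming the Spegel Helm chart exposes a standard Kubernetes `resources` block (verify the exact key names against the chart's own values.yaml):

```yaml
# Illustrative values only -- check the chart's values.yaml for the actual key names.
resources:
  requests:
    memory: 256Mi
  limits:
    memory: 256Mi
```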
@ugur99 this is an issue that I have been trying to track down for a while now. Something consumes the memory when no limits have been set. Maybe it is time to set some defaults in the Helm chart. It would be really helpful if you could share profiler data from one of the pods consuming 12 GB of memory; that would really help me pinpoint what is consuming it. If you port-forward to port 9090 on one of the pods consuming the memory, you can run the following command to start the profiler.
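A sketch of what that profiling step can look like, assuming Spegel serves the standard Go `net/http/pprof` endpoints on its metrics port; the namespace and pod name below are placeholders:

```sh
# Forward the metrics port of one of the high-memory pods (namespace and pod name are placeholders)
kubectl port-forward --namespace spegel pod/<spegel-pod> 9090:9090 &

# Fetch a heap profile and open the interactive pprof shell
go tool pprof http://localhost:9090/debug/pprof/heap
```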
Thank you both for the answers! @phillebaba I sent it to you in a DM on the Kubernetes Slack :) Hope it helps.
I was hoping the pprof data would point me in the right direction, but it gave me nothing. I have a hunch about where the problem is. I am going to spin up an AKS cluster tomorrow and try to reproduce this by throwing a bunch of parallel requests at a large layer. If I can reproduce it and then show that #725 does not have this problem, we can close this issue with that PR.
I have tried a lot of different things today to put pressure on Spegel. While I have found some interesting behavior, I have not been able to trigger the memory leak. Looking at your metrics, there has to be something that drives memory usage from 100 MB to 12 GB, which I am guessing is the memory available on the node.
I finally managed to reproduce this in an AKS cluster. I still don't know what the source of this memory leak is because pprof is not giving me anything. What surprised me is that the consumption is not happening in the proxy but when serving blobs. I can clearly see that memory usage increases each time I pull an image and is never released. Now starts the difficult process of figuring out what is leaking memory.
It has been very educational for me to research this issue 😄 I now have an explanation for why this happens when no memory limit is set.

What is happening is that the page cache is counted as part of the container's memory usage. When we stream large blobs we populate the page cache with large amounts of data. Once the files have been served and closed, that cached data is technically no longer needed, but it is not cleaned up until memory pressure occurs. When no limit is set, that pressure threshold is effectively the node's available memory, so as more and more layers are served the reported memory usage just keeps growing until memory pressure occurs and the kernel kicks in to free the page cache. Here is an issue which discusses things in more detail.

The goal of Spegel should be to give a good UX right out of the box, and I do not think page cache accounting is something the majority of users will consider. Right now, unless I find some other solution, I think the best way forward is to set a default memory request and limit in the Helm chart.
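One way to check whether the reported usage is really page cache rather than the Go process itself is to read the container's cgroup accounting directly. This is only a sketch: it assumes cgroup v2 and an image that can run `cat` (if the image is shell-less, the same files can be read from the node under the pod's cgroup path); namespace and pod name are placeholders.

```sh
# Total memory charged to the container (what the metrics report)
kubectl exec --namespace spegel <spegel-pod> -- cat /sys/fs/cgroup/memory.current

# Breakdown: "anon" is the process's own memory, while "file"/"inactive_file" is
# page cache left behind by streamed blobs, which the kernel can reclaim under pressure.
kubectl exec --namespace spegel <spegel-pod> -- cat /sys/fs/cgroup/memory.stat
```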
Spegel version
v0.0.28
Kubernetes distribution
Kubeadm
Kubernetes version
v1.31.3
CNI
Cilium
Describe the bug
We are deploying Spegel across a Kubernetes cluster with multiple node groups, each with different resource availability and workload patterns. A single DaemonSet doesn’t work efficiently due to varying resource needs.
Additionally, Spegel’s memory usage differs significantly even within the same node group, making it hard to set uniform resource requests/limits without over- or under-provisioning.
You can see the differences in Spegel memory usage for one specific node group in this image:
Is there a way to manage the Spegel DaemonSet's resource requests/limits more efficiently for the same node group?