[Bug] Elastic Agent with "Agent monitoring" enabled uses excessive memory on Kubernetes deployments #6594

Open
BenB196 opened this issue Jan 24, 2025 · 3 comments
Labels
bug Something isn't working Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@BenB196
BenB196 commented Jan 24, 2025

  • Version: 8.17.1
  • Operating System: Kubernetes/AWS EKS (1.31)
  • Steps to Reproduce:
    1. Deploy a Fleet Managed Elastic Agent DaemonSet to Kubernetes (with "Agent monitoring" enabled)
    2. Observe memory utilization
    3. Disable "Agent monitoring"
    4. Observe memory utilization
    5. Observe that the memory utilization decreases by ~200-250MB (~25% of total agent pod/container memory utilization)

Image

The drop in memory in the graph above corresponds to when "Agent monitoring" was disabled. The right-hand legend shows the total (sum) of memory utilization across all Elastic Agents in the environment. The total goes from ~39.5GB (with "Agent monitoring" enabled) to ~30GB (with "Agent monitoring" disabled).

Image

Observed on a per-pod basis, the average memory utilization goes from ~980MB to ~740MB.

I believe this is a bug: it is unclear why simply enabling monitoring of the Elastic Agent itself should increase memory utilization by ~200-250MB, or ~25% of the total memory used by the agent.
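
As a quick check of the figures reported above, the following minimal sketch recomputes the absolute and relative drop from the per-pod averages (~980MB to ~740MB) and the environment totals (~39.5GB to ~30GB). The helper name `memory_drop` is hypothetical and the inputs are the approximate values read off the graphs, so the results are only as precise as those readings.

```python
def memory_drop(before: float, after: float) -> tuple[float, float]:
    """Return (absolute drop, drop as a percentage of `before`)."""
    drop = before - after
    return drop, 100.0 * drop / before


# Approximate values reported in this issue (read from the graphs).
per_pod_drop_mb, per_pod_pct = memory_drop(980.0, 740.0)  # MB per pod
total_drop_gb, total_pct = memory_drop(39.5, 30.0)        # GB across all agents

print(f"per pod: {per_pod_drop_mb:.0f} MB ({per_pod_pct:.1f}%)")
print(f"total:   {total_drop_gb:.1f} GB ({total_pct:.1f}%)")
```

Both views give a reduction of roughly 24%, which matches the ~25% figure quoted in the steps to reproduce.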

@BenB196 BenB196 added the bug Something isn't working label Jan 24, 2025
@jlind23 jlind23 added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Jan 24, 2025
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@cmacknz
Member

cmacknz commented Jan 24, 2025

When you turn on monitoring, it causes the agent to start 3 new Beat sub-processes, each of which increases memory usage by ~75 MB just to exist.
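
A rough back-of-the-envelope sketch of that explanation, using only the approximate numbers quoted in this thread (3 sub-processes, ~75 MB each, an observed drop of ~200-250MB, and a per-pod average of ~980MB with monitoring enabled). The variable names are illustrative, not taken from the Elastic Agent code base.

```python
# Approximate figures quoted in this thread; not measured values.
MONITORING_SUBPROCESSES = 3        # Beat sub-processes started for monitoring
MB_PER_SUBPROCESS = 75             # rough baseline cost of each sub-process
OBSERVED_DROP_MB = (200, 250)      # drop reported after disabling monitoring
PER_POD_WITH_MONITORING_MB = 980   # reported per-pod average with monitoring on

expected_mb = MONITORING_SUBPROCESSES * MB_PER_SUBPROCESS
within_observed = OBSERVED_DROP_MB[0] <= expected_mb <= OBSERVED_DROP_MB[1]
share_pct = 100 * expected_mb / PER_POD_WITH_MONITORING_MB

print(f"expected overhead: {expected_mb} MB ({share_pct:.1f}% of per-pod memory)")
print(f"within observed {OBSERVED_DROP_MB[0]}-{OBSERVED_DROP_MB[1]} MB range: {within_observed}")
```

Under these assumptions the baseline cost of the three sub-processes alone is about 225 MB, which falls inside the reported 200-250MB drop.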

We have a big architecture change in progress to move away from sub-processes where we can. Once it is all done, this should reduce steady-state memory usage quite a bit, beyond just the monitoring components.

We observe this same problem internally, as we use Elastic Agent for observability in our own cloud. We are just starting the work to make and deploy the change needed to fix this for the monitoring components there first, to prove it out before turning it on for external users.

@BenB196
Author

BenB196 commented Jan 27, 2025

Thanks @cmacknz for the information, looking forward to the changes.
