Possible memory-leak investigation #276
Comments
@cpoppema JFYI, I did some investigation on this over the weekend using https://github.com/bloomberg/memray, but I wasn't able to reproduce on my machine (macOS) the memory increase trend you observed. I need to schedule a similar test on a Linux machine, as it might be a target-specific issue.
If you tell me what flags to run, I can run memray, no problem. As a quick example, if I wrap my service in memray:
and then use
For 1.2.2 this prints 1.0 GiB
For 1.2.3 this prints only 5.x MiB
More stats:
but RES memory in htop did keep growing: starting at 134M and ending up at around 612M by the time siege finished after 5 minutes.
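(The exact invocation isn't reproduced above; as a rough sketch, wrapping a service in memray programmatically looks something like the following. The target `app.main:app`, the output file name, and the use of granian's programmatic `Granian(...).serve()` API are illustrative assumptions, not the exact setup used here.)

```python
# Sketch only: wrap the server startup in a memray Tracker so allocations
# made while serving requests end up in a capture file that memray's
# reporters (flamegraph, stats, ...) can inspect afterwards.
from memray import Tracker

from granian import Granian                # assumption: granian's programmatic API
from granian.constants import Interfaces   # assumption: the Interfaces enum

if __name__ == "__main__":
    with Tracker("service.bin"):
        # "app.main:app" is a placeholder for the actual ASGI application path.
        Granian("app.main:app", interface=Interfaces.ASGI).serve()
```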
@cpoppema the difference between 1.2.2 and 1.2.3 should be due to Running Thank you for the patience 🙏
It generates 3 .bin files per run (~3M, ~88M, 1.9G); here are the smallest of each run (1.2.3 and 1.3.0).
@cpoppema I'm afraid those files only contain data from the main process, not the actual workers. It would be nice to also have the other dumps (maybe compressed?).
@gi0baro Right. I've re-uploaded them as .tar.xz with just 30 seconds of siege. Looks like there are only two files without the
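(If the worker processes' allocations are what's missing from the dumps, one thing worth trying, assuming memray's documented `follow_fork` option applies to how granian spawns its workers, is enabling fork-following so each child process writes its own capture file. A minimal sketch, with the startup function as a placeholder:)

```python
from memray import Tracker

def start_multi_worker_server() -> None:
    """Placeholder for the actual multi-worker server startup."""
    ...

if __name__ == "__main__":
    # Assumption: with follow_fork=True, processes forked while tracking is
    # active write their own capture files, so worker allocations aren't lost.
    with Tracker("service.bin", follow_fork=True):
        start_multi_worker_server()
```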
@cpoppema so, given that we trust
my suspicion for this happening only in Granian
is that probably Python is not deallocating objects between requests. Which is super weird, but at least now I have a starting point to do some additional checks. Gonna post updates as soon as I make new discoveries.
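(One way to sanity-check that hypothesis, sketched below rather than actually run in this thread, is to log the live Python object count after each request with a tiny ASGI middleware: a count that keeps growing between requests supports the "objects not deallocated" theory, while a stable count points lower in the stack, at the allocator or the server runtime.)

```python
import gc

class ObjectCountMiddleware:
    """Wraps an ASGI app and logs the live Python object count per request."""

    def __init__(self, app):
        self.app = app

    async def __call__(self, scope, receive, send):
        await self.app(scope, receive, send)
        if scope["type"] == "http":
            gc.collect()  # collect first so only truly live objects are counted
            print(f"live objects: {len(gc.get_objects())}")

# Usage (illustrative): app = ObjectCountMiddleware(app)
```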
Interestingly gunicorn also has a
We also noticed memory leaks in our Django application (with both WSGI and ASGI), but I haven't had time to investigate it yet. Our Kubernetes pods will just restart from time to time. Restarting only granian instead would be faster and more efficient. Related to #34
Possibly related to #252 (comment).
@cpoppema this commit might be worth a test round, at least to see if @sciyoshi is right about the root cause.
That looks very promising indeed! I'm not sure about the implications of that change, but comparing observed memory usage it seems to have fixed it 💪, see the graphs below 🎉🎉🎉
To generate the graphs I used:
baseline granian 1.2.3:
in contrast with uvicorn:
with granian master:
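(For anyone reproducing this without an external monitoring stack: memory-over-time graphs like these can be produced by sampling RSS of the server master process and its workers, e.g. with psutil. A sketch; the PID, interval, and duration below are illustrative.)

```python
import time

import psutil  # third-party: pip install psutil

def sample_rss(pid: int, interval_s: float = 1.0, duration_s: float = 300.0) -> None:
    """Print total RSS (master + workers) once per interval, in MiB."""
    master = psutil.Process(pid)
    end = time.monotonic() + duration_s
    while time.monotonic() < end:
        rss = 0
        for p in [master] + master.children(recursive=True):
            try:
                rss += p.memory_info().rss
            except psutil.NoSuchProcess:
                pass  # a worker exited between listing and sampling
        print(f"{time.time():.0f}\t{rss / 2**20:.1f}")
        time.sleep(interval_s)

if __name__ == "__main__":
    sample_rss(pid=12345)  # replace with the real server master PID
```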
Closing this as per changes in 1.3.
Apologies in advance for commenting on an already closed issue, but I am wondering whether this should address the memory leak (Edit: updated to 1.3 and I still see the memory leak issue described above) that is being caused by memory not being released by Python itself, described here w.r.t. malloc:
Should I create a new issue for this in case there is something that can be done in granian?
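(For the "allocator not returning memory to the OS" angle specifically, a sketch under the assumption of a glibc-based Linux system, and not something granian itself does: two common checks are capping glibc's malloc arenas via the `MALLOC_ARENA_MAX` environment variable, and calling `malloc_trim` to push freed heap pages back to the OS.)

```python
import ctypes
import ctypes.util

def trim_heap() -> int:
    """Ask glibc to return freed heap pages to the OS (no-op elsewhere).

    Returns 1 if memory was actually released, 0 otherwise, per malloc_trim(3).
    """
    libc_name = ctypes.util.find_library("c")
    if not libc_name:
        return 0
    libc = ctypes.CDLL(libc_name)
    if not hasattr(libc, "malloc_trim"):
        return 0
    return libc.malloc_trim(0)

# Running the service with e.g. MALLOC_ARENA_MAX=2 in the environment, or calling
# trim_heap() periodically, helps distinguish a real Python-level leak from glibc
# simply holding on to freed pages.
```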
Both of those seem unrelated to this issue (I also participated in the uvicorn discussion you mentioned). I also frankly doubt the uvicorn discussion has anything to do with the CPython one, given that no SSL/TLS is used. But I definitely have no time to investigate those.
That seems a bit unrelated as well. Can you please open a new issue with numbers, tested alternatives and an MRE?
Currently suspected protocol: ASGI.
Verification on RSGI and WSGI should be done as well.
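(A minimal ASGI app of the kind that could be used for that verification; this is a sketch, and analogous RSGI and WSGI no-op apps would be needed for the cross-protocol comparison, so that any memory growth under load comes from the server/protocol layer rather than application code.)

```python
# Minimal ASGI app: always returns a fixed 200 text/plain response.
async def app(scope, receive, send):
    if scope["type"] != "http":
        return
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    await send({"type": "http.response.body", "body": b"ok"})
```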
Discussed in #268
Originally posted by cpoppema April 12, 2024
For two projects I replaced uvicorn (with uvloop) with granian, and my monitoring tools show an increase in memory usage. I am wondering if this is something I have to accept with granian (I don't know if it'll flatten out at some point) or if there is an underlying issue I might be able to help debug. The published benchmarks are all about request numbers and don't mention memory usage, so I suspect an issue.
Figures below are from 2 services running fastapi + pydantic + sqlalchemy.
Service 1, graphs cover 7 days:
CPU before (uvicorn) / after (granian)
Requests before (uvicorn) / after (granian)
Memory before (uvicorn) / after (granian)
Service 2, graphs cover 2 days:
CPU before (uvicorn) / after (granian)
Requests before (uvicorn) / after (granian)
Memory before (uvicorn) / after (granian)