Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1.2.0 doesn't shut down properly #252

Closed
tim-x-y-z opened this issue Mar 22, 2024 · 26 comments
Closed

1.2.0 doesn't shut down properly #252

tim-x-y-z opened this issue Mar 22, 2024 · 26 comments
Labels
asgi Issue related to ASGI protocol bug Something isn't working
Milestone

Comments

@tim-x-y-z
Copy link

When upgrading to v1.2.0 it seems the shutdown for asgi doesn't work properly - it seems to be hanging.

@gi0baro
Copy link
Member

gi0baro commented Mar 22, 2024

@tim-habitat can you please provide more details on this (eg: OS/container, Python version, libraries used, granian cmd used) and possibly logs/MRE validating your statement?

@tim-x-y-z
Copy link
Author

tim-x-y-z commented Mar 23, 2024

sorry yes,
it was on an arm on mac os using fastapi, python 3.11.
both granian reload and not were affected.

i'll try to get a reproducible example here in the next few days!

note: i did have a lifespan to create database tables

@gi0baro gi0baro added the asgi Issue related to ASGI protocol label Mar 25, 2024
@zarekr
Copy link

zarekr commented Apr 1, 2024

similar issue here with Django running granian as asgi server:

 [ERROR] Application callable raised an exception
 Traceback (most recent call last):
   File "/home/python/.local/lib/python3.12/site-packages/granian/_futures.py", line 4, in future_watcher
     await inner(watcher.scope, watcher.proto)
   File "/home/python/.local/lib/python3.12/site-packages/channels/routing.py", line 62, in __call__
     return await application(scope, receive, send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/sessions.py", line 47, in __call__
     return await self.inner(dict(scope, cookies=cookies), receive, send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/sessions.py", line 263, in __call__
     return await self.inner(wrapper.scope, receive, wrapper.send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/auth.py", line 185, in __call__
     return await super().__call__(scope, receive, send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/middleware.py", line 24, in __call__
     return await self.inner(scope, receive, send)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/channels/routing.py", line 116, in __call__
     return await application(
            ^^^^^^^^^^^^^^^^^^
   File "/home/python/.local/lib/python3.12/site-packages/django/core/handlers/asgi.py", line 170, in __call__
     await self.handle(scope, receive, send)
   File "/home/python/.local/lib/python3.12/site-packages/django/core/handlers/asgi.py", line 227, in handle
     await task
   File "/home/python/.local/lib/python3.12/site-packages/django/core/handlers/asgi.py", line 245, in listen_for_disconnect
     raise RequestAborted()
 RequestAborted: 

running in docker on mac m2

@gi0baro
Copy link
Member

gi0baro commented Apr 1, 2024

@zarekr no, it's not similar. Seems more like #186. This issue refers to shutdown.

@zarekr
Copy link

zarekr commented Apr 1, 2024

@zarekr no, it's not similar. Seems more like #186. This issue refers to shutdown.

Oops. Had the wrong issue open 🥱

@RafaelWO
Copy link
Contributor

RafaelWO commented Apr 18, 2024

I'm experiencing similar issues when trying to shutdown granian. It is stuck on stopping the worker:

[2024-04-18 11:29:02.088 | INFO  ] PUT "/items" HTTP 100 --- middleware
^C[INFO] Shutting down granian
[INFO] Stopping worker-1
[INFO] Stopping worker-1 runtime-1

To stop it I have to kill Granian via SIGKILL.


System:

Granian: 1.2.3
Python:  3.12.1  
Linux:   6.5.0-27-generic #28~22.04.1-Ubuntu x86_64 GNU/Linux

@gi0baro
Copy link
Member

gi0baro commented Apr 18, 2024

@RafaelWO which OS? Linux?

@RafaelWO
Copy link
Contributor

@gi0baro Yes! I updated my comment.

I downgraded to granian==1.1.2 and had no issues - so far 🙂

@gi0baro gi0baro added the bug Something isn't working label Apr 18, 2024
@RafaelWO
Copy link
Contributor

It is a weird issue and hard to reproduce...

What I've found so far is that if I disable my custom HTTP middleware I was not able to run into the shutdown bug. (I'm using FastAPI and adding a function via app.add_middleware() of type starlette.middleware.base.BaseHTTPMiddleware.)

@gi0baro
Copy link
Member

gi0baro commented Apr 18, 2024

It is a weird issue and hard to reproduce...

What I've found so far is that if I disable my custom HTTP middleware I was not able to run into the shutdown bug. (I'm using FastAPI and adding a function via app.add_middleware() of type starlette.middleware.base.BaseHTTPMiddleware.)

That sounds quite strange, I agree. Is that still only on 1.2 or also 1.1?

@RafaelWO
Copy link
Contributor

RafaelWO commented Apr 18, 2024

I might have found a somewhat minimal reproducible example:

# file: requirements.txt
fastapi
granian>=1.2
# file: main.py
from fastapi import FastAPI, Request

app = FastAPI()


# If I comment/remove this middleware, everything works fine
@app.middleware("http")
async def log_req(request: Request, call_next):
    response = await call_next(request)
    print("[MIDDLEWARE]", request.method, request.url, response.status_code)
    return response


data = [{"id": 1, "type": "foo"}, {"id": 2, "type": "bar"}]


@app.get("/items")
def read_items():
    print("Hello!")
    return data

To reproduce:

  1. Create a virtual environment using Python 3.12
  2. Install the requirements
  3. Run the app via granian --interface asgi --host "0.0.0.0" --port 8000 main:app
  4. Open the swagger UI at http://localhost:8000/docs
  5. Do at least 4 GET requests to /items via Try it out and Execute or curl (yes, it is weird)
  6. Try to shutdown your app via CTRL+C in your shell where you started granian
  7. You should get stuck (if not try again)

I can reproduce this issue at least 90% of the time 😆

@gi0baro
Copy link
Member

gi0baro commented Apr 19, 2024

  1. Do at least 4 GET requests to /items via Try it out and Execute or curl (yes, it is weird)
  2. You should get stuck (if not try again)

I can reproduce this issue at least 90% of the time 😆

LoL. Super weird.
Can you do a quick test for me and change the route to be an async function? I suspect this might be related to asyncio executors.

@RafaelWO
Copy link
Contributor

Can you do a quick test for me and change the route to be an async function? I suspect this might be related to asyncio executors.

You are right! If the route is an async def it works fine. Any CTA on this?

P.S. Could you also reproduce this issue with my code above?

@gi0baro
Copy link
Member

gi0baro commented Apr 19, 2024

You are right! If the route is an async def it works fine. Any CTA on this?

I'm not sure, it might be related to the way workers handle signals, but further investigations are needed.

P.S. Could you also reproduce this issue with my code above?

Not yet, I might have some time during the weekend to run tests.

@sciyoshi
Copy link

Bisecting shows that this issue is due to #232, specifically the changes to src/callbacks.rs and src/runtime.rs. Something is causing certain requests to have hanging tasks, which prevents threads from shutting down. Incidentally this may also be the cause of #276, because those hanging tasks are preventing some requests from being cleaned up.

@gi0baro
Copy link
Member

gi0baro commented Apr 25, 2024

Bisecting shows that this issue is due to #232, specifically the changes to src/callbacks.rs and src/runtime.rs. Something is causing certain requests to have hanging tasks, which prevents threads from shutting down. Incidentally this may also be the cause of #276, because those hanging tasks are preventing some requests from being cleaned up.

Can you elaborate?
When referring to tasks you mean asyncio tasks still running on the Python event loop? Or are you referring to the Rust runtime?
Also, which thread gets stuck? You mean the event loop in the Python main thread?

@sciyoshi
Copy link

Yes, asyncio tasks on the event loop, and the main thread running the Python event loop in the worker. The issue appears to be related to cancellations - #279 fixes it in some cases for me, although there seems to be something else that is still broken.

@sciyoshi
Copy link

FWIW, the shutdown issue I see when running on our larger codebase also happens on 1.1.2.

@gi0baro
Copy link
Member

gi0baro commented Apr 25, 2024

FWIW, the shutdown issue I see when running on our larger codebase also happens on 1.1.2.

Then it is not related to changes made in #232: 1.1.x still uses the standard PyO3 pyclass implementation.
Maybe running some of the code involved in the issue with asyncio debug enabled can give us more info.

@sciyoshi
Copy link

The changes from #232 do appear related, because the repro provided fails consistently, but it seems very tricky to figure out 😢

@gi0baro
Copy link
Member

gi0baro commented Apr 26, 2024

@sciyoshi @RafaelWO anybody able to test with current master (312a264) and see if cancellation theory is correct?

@sciyoshi
Copy link

Fixes it for me at least, thanks @gi0baro!

@gi0baro gi0baro added this to the 1.2 milestone Apr 26, 2024
@gi0baro
Copy link
Member

gi0baro commented Apr 27, 2024

Closing this due to changes in 1.3.
Feel free to open a new issue in case you experience the same with 1.3.x.

@gi0baro gi0baro closed this as completed Apr 27, 2024
@RafaelWO
Copy link
Contributor

I also tested it again with granian 1.3 and it's fixed! Thx @gi0baro 🙂

@Tragio
Copy link

Tragio commented Jun 3, 2024

Hi @gi0baro 👋

I have a project with Django Tenants and Django Channels. I tried granian, and made a very dummy frontend with WebSocket. Django Channels on connection sent this to frontend:

      await self.send(
            text_data=json.dumps(
                {"type": "connection_established", "message": "You are now connected!"}
            )
        )

It worked, but after some browser refreshes, I stopped seeing WebSocket information on the granian, and after a couple of minutes it suddenly picked a bunch of those previous requests but never picking the current one. Closing the server and opening again was not solving, only by restarting the computer. 😅

Not sure if it is not closing properly and then starts to throttle, no idea.

Let me know what can I do to help you 🙏

@gi0baro
Copy link
Member

gi0baro commented Jun 3, 2024

@Tragio the issue you reported seems to be unrelated to the OP.
Please open up a new issue with a MRE and details about your environment (OS, Python version, Granian version, involved packages, etc).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
asgi Issue related to ASGI protocol bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants