[Installation]: VLLM does not support TPU v5p-16 (Multi-Host) with Ray Cluster #10155
Comments
Ray doesn't detect the TPU. I have the same issue.
I see you are using
This will probably be fixed by #11257
@ruisearch42 why would #11257 fix it?
Is it because of the change from
@youkaichao actually #11257 won't fix it. @Bihan I think for some reason the TPU resource was not detected by Ray. Here is how the detection works: you can add some debug code into your Ray installation and see what these would print when you run
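For anyone debugging this, a minimal check of what Ray actually detected (ray.cluster_resources() is standard Ray API; run this on the head node after both nodes have joined):

# Look for a "TPU" entry in the printed dict; for a 2-host v5p-16 it should be
# around TPU: 8.0 (4 chips per host). If the key is missing, detection failed.
python -c 'import ray; ray.init(address="auto"); print(ray.cluster_resources())'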
@ruisearch42 Thank you. Also, should I use
@Bihan you can try that, since it was reported to be working.
Hi, I believe #10155 (comment) should fix this issue. Can this be closed?
@richardsliu it looks like your link is not correct?
Not sure why. I meant this fix:
@richardsliu Sounds good, thanks. I'm closing the issue. @Bihan feel free to reopen if it is not fixed.
Your current environment
How you are installing vllm
Create a TPU VM
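For reference, the create command has this shape (a sketch: the VM name and zone are placeholders, and the right --version runtime string for v5p depends on your project; check the Cloud TPU docs):

# Placeholders: my-v5p-16, us-east5-a, and the runtime version.
gcloud compute tpus tpu-vm create my-v5p-16 \
  --zone=us-east5-a \
  --accelerator-type=v5p-16 \
  --version=v2-alpha-tpuv5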
On head node
ray start --block --head --port=6379
On other node (Note: TPU v5p-16 has 2 nodes)
ray start --block --address=<head-node-address>:6379
# Below is the ray status
Ray status shows both nodes as active.
However, when I run vllm serve from the head node, it fails with the error "The number of required TPUs exceeds the total number of available TPUs in the placement group.", even though it is connected to the cluster.
I have tried --tensor-parallel-size 2, 4, 8, and 16, and the output is the same.
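For reference, the serve command is of this form (the model name is a placeholder; --tensor-parallel-size and --distributed-executor-backend are real vLLM flags, with ray as the multi-node backend):

# Placeholder model; the tensor-parallel size should match the number of TPU
# chips Ray reports for the whole cluster.
vllm serve meta-llama/Meta-Llama-3.1-8B \
  --tensor-parallel-size 8 \
  --distributed-executor-backend ray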