[Feature request] Support CPU+GPU mixed execution #1

Open

malfet opened this issue Mar 28, 2024 · 1 comment

Contributor

malfet commented Mar 28, 2024

The current assumption is that mixed execution is only needed when there is not enough GPU memory, but perhaps it is sometimes simply faster this way.

Right now we only do tokenization on CPU, and inference can run on either CPU or GPU.

Contributor

mikekgfb commented Apr 25, 2024

This is already supported in PyTorch, but it is on the user to orchestrate: decide which parts of the model live on which device and move tensors between them.
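For illustration, here is a minimal sketch of what that manual orchestration could look like in plain PyTorch. It is not code from this project: `SplitMLP`, its layer sizes, and the choice of where to split are all hypothetical, and it falls back to CPU for both stages when no CUDA device is available.

```python
import torch
import torch.nn as nn


class SplitMLP(nn.Module):
    """Hypothetical two-stage model whose stages may live on different devices."""

    def __init__(self, dim=16, first_device="cpu", second_device="cpu"):
        super().__init__()
        self.first_device = torch.device(first_device)
        self.second_device = torch.device(second_device)
        # Each stage's parameters are placed on its own device.
        self.stage1 = nn.Sequential(nn.Linear(dim, dim), nn.ReLU()).to(self.first_device)
        self.stage2 = nn.Linear(dim, dim).to(self.second_device)

    def forward(self, x):
        # Run the first stage where its weights live.
        x = self.stage1(x.to(self.first_device))
        # The user is responsible for moving activations across devices.
        x = x.to(self.second_device)
        return self.stage2(x)


# Use the GPU for the second stage if one is present, otherwise stay on CPU.
gpu = "cuda" if torch.cuda.is_available() else "cpu"
model = SplitMLP(first_device="cpu", second_device=gpu)

with torch.no_grad():
    # The input starts on CPU, much like token IDs produced by a CPU tokenizer.
    out = model(torch.randn(4, 16))
```

Every `.to(...)` call across devices is a copy, so whether a split like this helps beyond relieving GPU memory pressure depends on how much data crosses the boundary per step.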

mikekgfb changed the title ~~Support CPU+GPU mixed execution~~ [Feature request] Support CPU+GPU mixed execution

Olivia-liu added the enhancement label

This was referenced Oct 10, 2024

Can't install using Python 3.12.3 #1289

Open

export to AOTI using cuda doesn't work using WSL #1293

Open
