Dockerfile for Whisper.cpp with PyTorch, CUDA, and GPU Support (whisper-cli / whisper-stream) #2752

Open
naren200 opened this issue Jan 20, 2025 · 0 comments

Description:

Hi Community,

I'm trying to set up a Docker environment for using whisper.cpp, but I've encountered several issues with existing Dockerfiles online that don't seem to work out of the box. Specifically, I'm looking for a Dockerfile that meets the following requirements:

  • PyTorch with CUDA support: PyTorch must be installed with working CUDA support for GPU acceleration.
  • nvidia-smi: The container should be able to run nvidia-smi to check and manage GPU status.
  • NVCC toolkit: The NVIDIA CUDA toolkit (nvcc) should be installed to compile CUDA code.
  • Most importantly, GPU access through whisper-cli or whisper-stream: these tools should be able to use the GPU for inference inside the container.
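
The requirements above can be checked from a shell inside a running container. The following is a minimal sketch, assuming a POSIX shell and that PyTorch, if present, is importable from python3. `check_tool` is a hypothetical helper written for this illustration, not part of whisper.cpp, and the whisper binaries may live outside PATH in a given image (the command below calls /app/main directly).

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is on PATH.
# Prints "<name>: found" or "<name>: missing".
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Tools named in the requirements list.
check_tool nvidia-smi
check_tool nvcc
check_tool whisper-cli
check_tool whisper-stream

# PyTorch check, attempted only if python3 exists (an assumption about the image).
if command -v python3 >/dev/null 2>&1; then
    python3 -c 'import torch; print("torch CUDA available:", torch.cuda.is_available())' 2>/dev/null \
        || echo "torch: not importable"
else
    echo "python3: missing"
fi
```

A "missing" line only says the tool is not on PATH; it does not by itself show whether the GPU is visible to the container.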

Current outcome

One of the issues I used as a reference: #2032 (comment)
Many users have endorsed that solution, but for me it does not give whisper.cpp working GPU support.
Command used to run the Docker image:

docker run \
      --rm \
      --gpus all -e LD_LIBRARY_PATH="" \
      -v ./whisper_models:/app/models \
      -v ./wav_dir:/app/testdata \
      ghcr.io/ggerganov/whisper.cpp:main-cuda \
      "/app/main --file /app/testdata/harvard.wav --language en --output-txt true --model /app/models/ggml-large-v3-turbo.bin --output-file /app/testdata/harvard"

Outcome:

whisper_init_with_params_no_state: use gpu    = 1
whisper_init_with_params_no_state: flash attn = 0
whisper_init_with_params_no_state: gpu_device = 0

ggml_cuda_init: failed to initialize CUDA: unknown error
whisper_model_load:      CPU total size =  1623.92 MB
whisper_model_load: model size    = 1623.92 MB
whisper_backend_init_gpu: using CUDA backend
ggml_backend_cuda_init: invalid device 0
whisper_backend_init_gpu: ggml_backend_cuda_init() failed
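
Because the log still reports "use gpu = 1" even when CUDA initialization fails, it can help to scan the output for the failure lines explicitly. A minimal sketch, assuming the two message strings shown in the log above; `cuda_init_failed` is a hypothetical helper name, and other whisper.cpp versions may word these messages differently.

```shell
#!/bin/sh
# Hypothetical helper: read a whisper.cpp log on stdin and succeed (exit 0)
# if it contains either CUDA initialization failure message quoted above.
# -F treats the patterns as fixed strings, so "()" is matched literally.
cuda_init_failed() {
    grep -qF \
        -e 'ggml_cuda_init: failed to initialize CUDA' \
        -e 'ggml_backend_cuda_init() failed'
}
```

For example, after saving the container output with `2>&1 | tee run.log`, running `cuda_init_failed < run.log && echo "GPU backend did not initialize"` flags a run like the one above.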

Please let me know if anyone has a working Dockerfile that satisfies all of the above requirements, or can point me in the right direction to resolve this. Let's settle on a Dockerfile that works on any system.

Kind request: please don't close this issue without a proper solution, because there are several Dockerfiles out there that don't work out of the box. I have been trying for several days without finding a solution.
